
fix(pm-adapter): emit pending section break before TOC/docPartObj SDTs (SD-2557)#2872

Open
tupizz wants to merge 2 commits into main from tadeu/sd-2557-bug-next-page-section-break-before-tocdocpartobj-content-is

Conversation


@tupizz tupizz commented Apr 20, 2026

Summary

When a Word document stores a w:sectPr on a paragraph immediately before a TOC (or other docPartObj) SDT, the nextPage section break was dropped during pm-adapter conversion. The TOC rendered on the same page as the prior section's content instead of starting a new page as Word does.

Linear: SD-2557

Root cause

In packages/layout-engine/pm-adapter/:

  • findParagraphsWithSectPr (sections/analysis.ts) recurses into index, bibliography, and tableOfAuthorities to count their children as paragraphs, but NOT into documentPartObject / tableOfContents.
  • Section ranges therefore treat a TOC SDT as a single opaque unit — its children don't occupy paragraph indices.
  • handleParagraphNode, handleIndexNode, handleBibliographyNode, and handleTableOfAuthoritiesNode each emit a pending section break before the paragraph whose index matches nextSection.startParagraphIndex.
  • handleDocumentPartObjectNode and handleTableOfContentsNode did NOT run this check — so the deferred break only fired on the next body paragraph AFTER the SDT, leaving the SDT's content in the previous section with no page break before it.
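
To make the counting mismatch concrete, here is a toy model of the pre-fix analysis. The node shapes and the name `findSectPrIndicesSketch` are assumptions made for illustration; the real `findParagraphsWithSectPr` is not reproduced in this description and may differ.

```typescript
// Toy model only: these node shapes are assumptions, not the pm-adapter types.
type ToyNode =
  | { type: "paragraph"; hasSectPr?: boolean }
  | {
      type: "index" | "bibliography" | "tableOfAuthorities" | "documentPartObject" | "tableOfContents";
      content: ToyNode[];
    };

// Containers whose children were counted before the fix, per the list above.
const COUNTED_CONTAINERS = new Set<string>(["index", "bibliography", "tableOfAuthorities"]);

// Returns the paragraph indices that carry a sectPr, counting the way the
// pre-fix analysis is described: TOC / docPartObj children get no index.
function findSectPrIndicesSketch(nodes: ToyNode[]): number[] {
  const found: number[] = [];
  let paragraphIndex = 0;
  const visit = (node: ToyNode): void => {
    if (node.type === "paragraph") {
      if (node.hasSectPr) found.push(paragraphIndex);
      paragraphIndex += 1;
    } else if (COUNTED_CONTAINERS.has(node.type)) {
      node.content.forEach(visit);
    }
    // documentPartObject / tableOfContents fall through: treated as opaque.
  };
  nodes.forEach(visit);
  return found;
}

const toyDoc: ToyNode[] = [
  { type: "paragraph" }, // cover text, index 0
  { type: "paragraph", hasSectPr: true }, // index 1, ends the first section
  { type: "documentPartObject", content: [{ type: "paragraph" }, { type: "paragraph" }] },
  { type: "paragraph" }, // first body paragraph, index 2
];

console.log(findSectPrIndicesSketch(toyDoc)); // [1]
```

In this model the next section starts at paragraph index 2. The paragraph cursor already equals 2 when the SDT is reached, but only a paragraph-type handler would notice, so the break lands after the SDT.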

Fix

Add an emitPendingSectionBreakForParagraph(sectionState, nextBlockId, blocks, recordBlockKind) helper in sections/breaks.ts that centralizes the "check, emit, advance" pattern, then call it at the top of handleDocumentPartObjectNode and handleTableOfContentsNode — once per SDT. Since the SDT's children don't affect currentParagraphIndex (findParagraphsWithSectPr skips them), the check fires correctly at the SDT boundary: if the SDT sits at a section boundary, the nextPage break is emitted so the SDT renders on a new page.

No changes to section-range computation — counting stays consistent.
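
The "check, emit, advance" pattern can be sketched as follows. This is a minimal illustration under assumptions: the `SectionState`, `SectionRange`, and `Block` shapes, the `Sketch`-suffixed function names, and the positional argument order are hypothetical and only follow the wording of this summary; the real helper in sections/breaks.ts may take a different argument shape.

```typescript
// Hypothetical shapes for illustration; not the pm-adapter definitions.
interface SectionRange {
  startParagraphIndex: number;
  breakType: "nextPage" | "continuous";
}
interface SectionState {
  ranges: SectionRange[];
  currentSectionIndex: number;
  currentParagraphIndex: number;
}
interface Block {
  kind: "sectionBreak" | "sdt";
  id: string;
}

// Returns true when a break was emitted.
function emitPendingSectionBreakSketch(
  sectionState: SectionState | undefined,
  nextBlockId: () => string,
  blocks: Block[],
  recordBlockKind: (kind: Block["kind"]) => void,
): boolean {
  if (!sectionState) return false; // no section tracking: no-op
  const next = sectionState.ranges[sectionState.currentSectionIndex + 1];
  // Check: does the next section start at the current paragraph cursor?
  if (!next || next.startParagraphIndex !== sectionState.currentParagraphIndex) return false;
  // Emit: the break goes in before whatever block the caller adds next.
  blocks.push({ kind: "sectionBreak", id: nextBlockId() });
  recordBlockKind("sectionBreak");
  // Advance: move to the next section; the paragraph cursor is left alone.
  sectionState.currentSectionIndex += 1;
  return true;
}

// An SDT handler calls the helper once, at the top, before emitting the SDT.
function handleSdtSketch(
  sectionState: SectionState | undefined,
  nextBlockId: () => string,
  blocks: Block[],
  recordBlockKind: (kind: Block["kind"]) => void,
): void {
  emitPendingSectionBreakSketch(sectionState, nextBlockId, blocks, recordBlockKind);
  blocks.push({ kind: "sdt", id: nextBlockId() });
  recordBlockKind("sdt");
}

function makeIdGenerator(): () => string {
  let n = 0;
  return () => `block-${n++}`;
}

const demoState: SectionState = {
  ranges: [
    { startParagraphIndex: 0, breakType: "nextPage" },
    { startParagraphIndex: 2, breakType: "nextPage" },
  ],
  currentSectionIndex: 0,
  currentParagraphIndex: 2, // the SDT sits exactly at the section boundary
};
const demoBlocks: Block[] = [];
handleSdtSketch(demoState, makeIdGenerator(), demoBlocks, () => {});
console.log(demoBlocks.map((b) => b.kind)); // ["sectionBreak", "sdt"]
```

Leaving `currentParagraphIndex` untouched in the helper keeps it consistent with an analysis pass that does not count the SDT's children.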

Verification

Both fixtures from the Linear issue:

| Fixture | Before | After |
| --- | --- | --- |
| Highstreet Proposal Sample.docx | Cover + TOC on same page (p1) | Cover on p1, TOC on p2 (matches Word) |
| Heffernan Proposal Sample (2).docx | Cover + TOC on same page (p1) | Cover on p1, TOC on p2 (matches Word) |

Side-by-side page-by-page PDF available at /tmp/sd-2557-fixtures/SD-2557-comparison.pdf.

Test plan

  • 3 new unit tests in document-part-object.test.ts:
    • Emits section break at SDT boundary
    • No emission when SDT is not at a section boundary
    • No-op when sectionState is undefined
  • 1740 @superdoc/pm-adapter tests pass (up from 1737)
  • 604 @superdoc/layout-engine tests pass
  • Browser reproduction: highstreet fixture before/after shows TOC correctly starts on page 2
  • CI full suite
  • Upload fixtures to R2 corpus for visual regression coverage (follow-up)

Demo

[Two screenshots: CleanShot 2026-04-20 at 16 26 24@2x, CleanShot 2026-04-20 at 16 26 41@2x]

Related

  • SD-2508 (docPartObj SDT handler drops tables) — same SDT subsystem, different bug

…s (SD-2557)

When a Word document stores a `w:sectPr` on a paragraph immediately before a
TOC (or other `docPartObj`) SDT, the section break was dropped. The TOC
ended up rendering on the same page as the prior section's content instead
of starting a new page as Word does.

Root cause in pm-adapter:

  - `findParagraphsWithSectPr` (sections/analysis.ts) recurses into `index`,
    `bibliography`, and `tableOfAuthorities` to count their children as
    paragraphs, but NOT into `documentPartObject` / `tableOfContents`.
  - As a result, section ranges treat a TOC SDT as a single opaque unit —
    its children don't occupy paragraph indices.
  - When processing flow, `handleParagraphNode` / `handleIndexNode` /
    `handleBibliographyNode` / `handleTableOfAuthoritiesNode` each emit a
    pending section break before the paragraph whose index matches
    `nextSection.startParagraphIndex`.
  - `handleDocumentPartObjectNode` and `handleTableOfContentsNode` did NOT
    run this check, so the deferred break only fired on the next body
    paragraph AFTER the SDT. The SDT's content rendered in the PREVIOUS
    section, with no page break before it.

Fix:

  - Add `emitPendingSectionBreakForParagraph(sectionState, nextBlockId,
    blocks, recordBlockKind)` helper in sections/breaks.ts that centralizes
    the "check, emit, advance" pattern.
  - Call the helper at the top of `handleDocumentPartObjectNode` and
    `handleTableOfContentsNode` — once per SDT. Since the SDT's children
    don't affect `currentParagraphIndex` (`findParagraphsWithSectPr` skips
    them), the check fires correctly at the SDT boundary: if the SDT sits
    at a section boundary, the nextPage break is emitted so the SDT renders
    on a new page.
  - No changes to section-range computation — counting stays consistent.

Verified against both fixtures from the issue (Highstreet Proposal Sample,
Heffernan Proposal Sample): cover stays on its own page, TOC starts on a
new page, matching Word.

Tests:

  - 3 new unit tests in document-part-object.test.ts covering:
    - Section break emitted at SDT boundary
    - No emission when SDT is not at a section boundary
    - No-op when sectionState is undefined
  - 1740 pm-adapter tests pass (up from 1737), 604 layout-engine tests pass.

linear bot commented Apr 20, 2026

@codecov-commenter

Codecov Report

✅ All modified and coverable lines are covered by tests.


…ld emission (SD-2557)

The initial fix only handled the sectPr BEFORE a TOC docPartObj (on the
empty paragraph that precedes the SDT). It missed the SECOND sectPr that
Word commonly stores on the trailing empty paragraph INSIDE the TOC SDT —
which signals "TOC section ends here, next section starts on new page".

Because `findParagraphsWithSectPr` did not recurse into `documentPartObject`
or `tableOfContents`, that inner sectPr was never discovered, so no
section-range boundary was built between the TOC and the following body
content. SuperDoc rendered the TOC and the first body section stacked on
the same page (just one page later than before the first fix).

Complete fix:

1. `findParagraphsWithSectPr` (sections/analysis.ts) now recurses into
   `documentPartObject` and `tableOfContents` in addition to `index` /
   `bibliography` / `tableOfAuthorities`. This lets section-range analysis
   see sectPrs stored anywhere inside a TOC SDT.

2. `handleDocumentPartObjectNode` (non-TOC branch) emits the pending
   section break before each child paragraph and advances
   `currentParagraphIndex` — matching the pattern in `handleIndexNode`,
   `handleBibliographyNode`, and `handleTableOfAuthoritiesNode`.

3. `processTocChildren` (toc.ts) accepts `sectionState` via its context
   arg and performs the same per-child emit + increment dance as the
   paragraph handlers. This handles the TOC branch of
   `handleDocumentPartObjectNode` and the nested `tableOfContents`
   recursion path.
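
The combined effect of the three changes can be sketched with a toy model in which the analysis pass and the flow pass walk the same paragraphs, including those inside the SDT. Everything here (`DocNode`, `sectionStartsSketch`, `flowSketch`, the paragraph labels) is hypothetical and only illustrates the counting rule, not the actual pm-adapter code.

```typescript
// Toy node shapes; assumptions for illustration only.
type DocNode =
  | { type: "paragraph"; label: string; sectPr?: boolean }
  | { type: "documentPartObject"; content: DocNode[] };

// Analysis: recurse into the SDT so sectPrs stored inside it are discovered.
function sectionStartsSketch(nodes: DocNode[]): number[] {
  const starts = [0];
  let index = 0;
  const visit = (node: DocNode): void => {
    if (node.type !== "paragraph") {
      node.content.forEach(visit);
      return;
    }
    if (node.sectPr) starts.push(index + 1); // the next paragraph opens a new section
    index += 1;
  };
  nodes.forEach(visit);
  return starts;
}

// Flow: same traversal; check for a pending break before every paragraph,
// then advance the paragraph cursor, including for SDT children.
function flowSketch(nodes: DocNode[]): string[] {
  const starts = sectionStartsSketch(nodes);
  const out: string[] = [];
  let section = 0;
  let index = 0;
  const visit = (node: DocNode): void => {
    if (node.type !== "paragraph") {
      node.content.forEach(visit);
      return;
    }
    if (starts[section + 1] === index) {
      out.push("sectionBreak");
      section += 1;
    }
    out.push(node.label);
    index += 1;
  };
  nodes.forEach(visit);
  return out;
}

const sampleDoc: DocNode[] = [
  { type: "paragraph", label: "cover" },
  { type: "paragraph", label: "coverEnd", sectPr: true }, // sectPr before the TOC
  {
    type: "documentPartObject",
    content: [
      { type: "paragraph", label: "tocEntry" },
      { type: "paragraph", label: "tocEnd", sectPr: true }, // trailing sectPr inside the SDT
    ],
  },
  { type: "paragraph", label: "body" },
];

console.log(flowSketch(sampleDoc));
// ["cover", "coverEnd", "sectionBreak", "tocEntry", "tocEnd", "sectionBreak", "body"]
```

Both breaks appear only because the two passes agree on which paragraphs occupy an index; if either pass skipped the SDT's children, the second boundary would be missed or misplaced.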

With all three changes, the Highstreet fixture now renders exactly like
Word:

  - Page 1: cover
  - Page 2: TOC alone
  - Page 3: ABOUT US body
  - Page 4: ON BEHALF OF HIGHSTREET
  - Page 5: WORKERS COMPENSATION

Tests:
  - 4 new tests in document-part-object.test.ts (non-TOC emission,
    non-boundary no-op, undefined state, sectionState passthrough to
    processTocChildren)
  - 1741/1741 pm-adapter, 604/604 layout-engine, 11377/11377 super-editor
@tupizz tupizz marked this pull request as ready for review April 20, 2026 19:29

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 158dd29d64


} = context;

// See handleDocumentPartObjectNode for rationale (SD-2557).
emitPendingSectionBreakForParagraph({ sectionState, nextBlockId, blocks, recordBlockKind });

[P1] Keep TOC section state aligned with counted child paragraphs

findParagraphsWithSectPr now counts tableOfContents child paragraphs (sections/analysis.ts), but this handler only emits a pending break once at node entry and never advances sectionState.currentParagraphIndex per TOC paragraph. When a document has a tableOfContents node before later section boundaries, the section cursor falls behind and later sectionBreak blocks are emitted at the wrong point (often only by the end-of-document fallback), which changes pagination compared with Word.
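
The drift described in this comment can be reproduced with a small toy model. The function `layoutSketch`, its parameters, and the `sectionBreak(fallback)` label are hypothetical and stand in for the real handlers and the end-of-document fallback; the sketch only shows why a cursor that is not advanced per TOC child misses a later boundary.

```typescript
// Toy model: intro paragraph, a TOC with `tocChildren` paragraphs, then
// body1 (carrying a sectPr) and body2 (which should open the next section).
function layoutSketch(tocChildren: number, advancePerTocChild: boolean): string[] {
  // The analysis counts intro + TOC children + body1 before the new section.
  const nextSectionStart = 1 + tocChildren + 1;
  const out: string[] = [];
  let cursor = 0; // plays the role of sectionState.currentParagraphIndex
  let pending = true;

  const paragraph = (label: string): void => {
    if (pending && cursor === nextSectionStart) {
      out.push("sectionBreak");
      pending = false;
    }
    out.push(label);
    cursor += 1;
  };

  paragraph("intro");
  out.push("toc");
  if (advancePerTocChild) cursor += tocChildren; // keep the cursor in step with the analysis
  paragraph("body1");
  paragraph("body2");
  if (pending) out.push("sectionBreak(fallback)"); // stand-in for the end-of-document fallback
  return out;
}

console.log(layoutSketch(3, false));
// ["intro", "toc", "body1", "body2", "sectionBreak(fallback)"]
console.log(layoutSketch(3, true));
// ["intro", "toc", "body1", "sectionBreak", "body2"]
```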

