perf: Optimize page splitting and relayout, so both per-iteration costs drop from O(N^2) to O(N) by exoego · Pull Request #3390 · diegomura/react-pdf

exoego · 2026-04-16T13:20:54Z

This PR drastically optimizes pagination, the bottleneck mentioned in #3367 (review):

> The major performance bottleneck is pagination, but the real improvement won't be avoiding 1 or 2 layout steps but at this point I need to fully redesign the algorithm to be O(N). But I've been struggling to do so

I've added a Vitest benchmark for pagination
(`yarn vitest bench packages/layout/tests/steps/resolvePagination.bench.ts`)
and compared the p999 durations in milliseconds:

| Num of elements | Before | After | Speedup | Scaling (Before) | Scaling (After) |
|---|---|---|---|---|---|
| 100 (~10 pages) | 5.75 ms | 2.49 ms | 2.3x | 1x | 1x |
| 500 (~50 pages) | 127 ms | 12.9 ms | 9.8x | 22x ≒ 25 (5^2) x K | 5.18x |
| 1000 (~100 pages) | 635 ms | 34.2 ms | 17.9x | 110x ≒ 100 (10^2) x K | 13.7x |
| 2000 (~200 pages) | 3630 ms | 89.2 ms | 40.7x | 631x ≒ 400 (20^2) x 1.5 x K | 35.8x |

Scaling (Before) is ~O(N^2 * K).
It grows quadratically or worse: 1→22→110→631 roughly follows an N^2 pattern, but the last step is steeper, since K also scales with N.

Scaling (After) is ~O(N * K).
It grows roughly linearly, tracking close to the ideal 1→5→10→20 but slightly worse, seemingly because of the O(C) Yoga relayout per page.
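As a rough cross-check of the two scaling claims, the sketch below estimates an empirical exponent p from the first and last rows of the table, assuming the duration behaves like t ≈ c * N^p. The `Row` type and `empiricalExponent` helper are hypothetical names introduced only for this illustration, and a two-point estimate is coarse.

```typescript
// p999 durations (ms) copied from the benchmark table above.
type Row = { n: number; beforeMs: number; afterMs: number };

const rows: Row[] = [
  { n: 100, beforeMs: 5.75, afterMs: 2.49 },
  { n: 500, beforeMs: 127, afterMs: 12.9 },
  { n: 1000, beforeMs: 635, afterMs: 34.2 },
  { n: 2000, beforeMs: 3630, afterMs: 89.2 },
];

// Hypothetical helper: if t ≈ c * N^p, then p ≈ log(t2 / t1) / log(n2 / n1).
function empiricalExponent(n1: number, t1: number, n2: number, t2: number): number {
  return Math.log(t2 / t1) / Math.log(n2 / n1);
}

const first = rows[0];
const last = rows[rows.length - 1];

const exponentBefore = empiricalExponent(first.n, first.beforeMs, last.n, last.beforeMs);
const exponentAfter = empiricalExponent(first.n, first.afterMs, last.n, last.afterMs);

console.log(exponentBefore.toFixed(2), exponentAfter.toFixed(2));
```

On these numbers the estimate comes out at about 2.15 before and about 1.19 after, which matches the "slightly worse than quadratic" and "slightly worse than linear" readings.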

Added several test cases on pagination to ensure the behavior of pagination is preserved.

changeset-bot · 2026-04-16T13:21:00Z

🦋 Changeset detected

Latest commit: ffaa476

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 7 packages

| Name | Type |
|---|---|
| @react-pdf/layout | Minor |
| @react-pdf/renderer | Patch |
| @react-pdf/math | Patch |
| @react-pdf/mermaid | Patch |
| next-14 | Patch |
| next-15 | Patch |
| @react-pdf/vite-example | Patch |


diegomura · 2026-04-17T14:50:48Z

Thanks @exoego. Can you please explain the logic? I'm not sure I get what it does.

I think pagination needs to change more drastically though in order to support things like text columns. Right now it's hard to do because we first compute text in a big column and from there breaking is much harder

exoego · 2026-04-17T22:47:06Z

The table below explains my understanding of the "Before" logic and how the "After" logic optimizes it:

| Unit | Before: O(N^2) | After: O(N) |
|---|---|---|
| `splitNodes` | Each iteration performed O(N) operations, yielding O(N^2) total:<br>- `nodes.slice(i + 1)`: O(N - i) copy<br>- `futureNodes.filter(isFixed)`: O(N - i) scan<br>- `shouldNodeBreak(...)` internally re-filtered futureNodes O(N - i) + scanned previousElements O(i) | O(N) pre-computation reduces per-iteration cost to O(1):<br>- `computeSuffixFurthestEnd`: backward pass builds suffix-max array, replacing futureNodes filter+aggregation<br>- `collectFixedIndices`: pre-collects fixed node indices, replacing slice+filter<br>- `hasNonFixedPrevious`: boolean flag replaces previousElements scan |
| `splitPage` | Called `relayout(nextPage)` after each split: a full Yoga layout pass over all remaining children. Since nextPage was only used as input to the next split (never in final output), this redundant relayout compounded to O(N^2) across all pages. | ~~Removed `relayout` entirely.~~ Skipped relayout on `nextPage`. `splitNodes` already adjusts `box.top`, and nextPage is never in final output: only each currentPage gets properly relaid out. |
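To make the `splitNodes` row concrete, here is a minimal sketch contrasting the per-iteration slice + filter pattern with a one-time pre-computation. It assumes a simplified node shape (`SketchNode`); `collectFixedIndices` and `hasNonFixedPrevious` reuse the names from the table, but their bodies and the two `scan…` drivers are guesses written for illustration, not the code in this PR.

```typescript
// Hypothetical, simplified node; the real layout node carries much more.
type SketchNode = { id: string; fixed: boolean };
type Step = { fixedAhead: number; hasNonFixedPrevious: boolean };

// One pass that records where the fixed nodes are.
function collectFixedIndices(nodes: SketchNode[]): number[] {
  const indices: number[] = [];
  nodes.forEach((node, i) => {
    if (node.fixed) indices.push(i);
  });
  return indices;
}

// "Before" pattern: every iteration copies and rescans its neighbours, O(N) each.
function scanNaive(nodes: SketchNode[]): Step[] {
  return nodes.map((_, i) => ({
    fixedAhead: nodes.slice(i + 1).filter((n) => n.fixed).length,
    hasNonFixedPrevious: nodes.slice(0, i).some((n) => !n.fixed),
  }));
}

// "After" pattern: O(N) setup, then amortized O(1) work per iteration.
function scanPrecomputed(nodes: SketchNode[]): Step[] {
  const fixedIndices = collectFixedIndices(nodes);
  const steps: Step[] = [];
  let cursor = 0; // first entry of fixedIndices that lies after the current node
  let hasNonFixedPrevious = false; // running flag instead of a rescan
  for (let i = 0; i < nodes.length; i++) {
    while (cursor < fixedIndices.length && fixedIndices[cursor] <= i) cursor++;
    steps.push({ fixedAhead: fixedIndices.length - cursor, hasNonFixedPrevious });
    if (!nodes[i].fixed) hasNonFixedPrevious = true;
  }
  return steps;
}

const sample: SketchNode[] = [
  { id: "header", fixed: true },
  { id: "a", fixed: false },
  { id: "b", fixed: false },
  { id: "footer", fixed: true },
  { id: "c", fixed: false },
];

const naiveSteps = scanNaive(sample);
const fastSteps = scanPrecomputed(sample);
console.log(JSON.stringify(naiveSteps) === JSON.stringify(fastSteps)); // true
```

Both drivers return the same per-node answers; only the amount of work per iteration differs.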

> I think pagination needs to change more drastically though in order to support things like text columns. Right now it's hard to do because we first compute text in a big column and from there breaking is much harder

Don't worry.
I already have a WIP branch for text columns, which works perfectly with this optimized pagination 😉

diegomura · 2026-04-18T11:07:06Z

> backward pass builds suffix-max array, replacing futureNodes filter+aggregation

I'm not sure I fully understand how a suffix array is useful here.

> Removed relayout entirely.

Yoga (re-)layout should not add much overhead. Also, how are dynamic nodes handled if there's no relayout?

> I already have a WIP branch for text columns

Haha cool! I'd be curious to see how that looks. Still, I think the current pagination solution has become a bit too complex, and I'm hesitant to keep adding complexity to it for perf improvements and new features. At this point I think it has to be redesigned completely, as I feel there has to be a simpler solution. Most open issues are due to pagination errors, whether it's text node layout, dynamic nodes, breaks, etc. Everything I add to the current solution is an extra layer of complexity, or something that can break in a future migration.

Right now the algorithm works by rendering everything as if it were one big, long page, and then breaking it into pieces. This is mostly due to the nature of Yoga and flexbox. But some nodes are dynamic, and others have unknown geometries (like text), and these depend on the page size, not on the full-height layout. So I feel we should move towards a "streaming" solution where pages are filled one by one, rather than a break-nodes solution. I'm not yet sure if that's possible.

exoego · 2026-04-18T11:29:07Z

I understand that you are more inclined towards a full rewrite of the algorithm, so that it not only boosts performance but also simplifies introducing new features.
But I think such a complete rewrite would take weeks or months to mature.

It would be highly appreciated if this optimization gets reviewed and merged as a short-term solution 🙇, since there seem to be many users, including myself, who are facing performance issues.

diegomura · 2026-04-18T20:09:40Z

I understand, and I think it's a reasonable ask :) Can you confirm, though, that dynamic nodes work as expected? I'm a bit scared of removing the re-layout step. Sometimes even I don't get all the weird quirks of the current pagination algorithm :)

diegomura · 2026-04-18T20:15:54Z

@exoego I just checked out the branch and tested the examples repo for a quick visual regression test. There's something odd in the mermaid example (if you run `yarn dev` and select the vite project -> http://localhost:5173/#mermaid): it has completely blank pages.

…aling behavior of resolvePagination

Baseline results show worse-than-quadratic scaling:
- 100 children: 5.1ms
- 500 children: 126ms (25x)
- 1000 children: 628ms (123x, worse than 100x)
- 2000 children: 3,597ms (707x, worse than 400x)

- Remove per-iteration nodes.slice() and futureNodes.filter() calls
- Add shouldBreakOptimized() that accepts pre-computed scalar values instead of scanning arrays each call (O(N) → O(1))
- The original shouldBreak is now a thin wrapper that computes pre-computed values from arrays and delegates to shouldBreakOptimized.
- Pre-compute suffix max array for furthest end of non-fixed future nodes in a single right-to-left pass (O(N))
- Pre-collect fixed node entries once instead of filtering per iteration
- Track hasNonFixedPrevious as a running boolean instead of filtering previousElements array each iteration

Skip the expensive relayoutPage() call on nextPage since it's only used as input to the next splitPage iteration, never added to final output. The currentPage (which IS in the output) is still fully relaid out.

Also fix splitNode to always set box.height on the next half, even for auto-height nodes. Without relayout, an auto-height node would keep its original (too large) box.height, causing infinite splitting loops.

…t relayout

When a node triggers a split in splitNodes, the remaining siblings (via nodes.slice(i+1)) were pushed to nextChildren with their original box.top values. Previously relayout on nextPage corrected these positions, but after the relayout removal optimization, nodes like footers with marginTop:'auto' retained large top values and were incorrectly classified as "outside" the next page, causing them to appear alone on a separate page.

exoego · 2026-04-19T06:24:13Z

@diegomura

1. mermaid example ... has completely blank pages

Good catch. Fixed it in ffaa476 and performance is still very good.

Cause

When a node triggered a split in splitNodes, the remaining siblings were pushed to nextChildren via nodes.slice(i + 1) with their original box.top values.
Previously relayout on nextPage corrected these, but after the optimization, nodes like the footer (with marginTop: 'auto', positioned near the page bottom) retained their large top values.
On the next splitNodes pass, they were classified as isOutside (wrapArea <= top) and pushed to yet another page, producing pages with only a footer.

Fix

The fix adds adjustRemaining() which subtracts height from box.top for all non-fixed remaining siblings, consistent with how the breaking node itself and the isOutside path already adjust positions.
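The effect of the fix can be sketched as follows. This assumes a simplified node shape and that the subtracted `height` is the height of the page just filled; the `adjustRemaining` signature and the `isOutside` helper here are illustrative guesses based on the description above, not the actual code in ffaa476.

```typescript
// Hypothetical, simplified shapes for illustration only.
type Box = { top: number; height: number };
type SketchNode = { id: string; fixed: boolean; box: Box };

// Shift every non-fixed remaining sibling up by the consumed page height,
// so its top is expressed relative to the next page.
function adjustRemaining(nodes: SketchNode[], height: number): SketchNode[] {
  return nodes.map((node) =>
    node.fixed ? node : { ...node, box: { ...node.box, top: node.box.top - height } },
  );
}

// The check quoted above: a node starting at or below the wrap area is "outside".
const isOutside = (node: SketchNode, wrapArea: number): boolean => wrapArea <= node.box.top;

const wrapArea = 800; // assumed usable page height
const footer: SketchNode = { id: "footer", fixed: false, box: { top: 1540, height: 40 } };

// Without the adjustment the footer still carries its long-page coordinate.
const outsideBefore = isOutside(footer, wrapArea);

// With the adjustment it lands at top = 740, inside the next page.
const [adjustedFooter] = adjustRemaining([footer], wrapArea);
const outsideAfter = isOutside(adjustedFooter, wrapArea);

console.log(outsideBefore, adjustedFooter.box.top, outsideAfter); // true 740 false
```

The sketch returns new objects instead of mutating the nodes; whether the real helper mutates in place is not stated in this thread.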

2. I'm not sure I fully understand how a suffix array is useful here

Problem

The original shouldNodeBreak needed to know: what is the furthest bottom edge among all future non-fixed siblings?
Previously this was computed per iteration by futureNodes.filter(isFixed) + Math.max(…map(n => n.box.top + n.box.height)), which is an O(N) scan each time, yielding O(N^2) total.

Solution (how suffix array helps)

computeSuffixFurthestEnd replaces this with a single right-to-left pass: suffixFurthestEnd[i] stores the max (top + height) of all non-fixed nodes after index i.
Then shouldBreakOptimized can look up the pre-computed value in O(1) instead of scanning the array each time.
It's basically the same data, just pre-computed.
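A minimal sketch of the suffix-max idea, assuming a simplified node shape. `computeSuffixFurthestEnd` reuses the name from this PR, but the body, the `SketchNode` type, the `furthestEndNaive` comparison helper, and the choice of `-Infinity` for "no non-fixed node ahead" are assumptions made for illustration.

```typescript
// Hypothetical, simplified node shape.
type SketchNode = { fixed: boolean; box: { top: number; height: number } };

// Single right-to-left pass: suffix[i] is the max (top + height) over all
// non-fixed nodes strictly after index i, or -Infinity if there are none.
function computeSuffixFurthestEnd(nodes: SketchNode[]): number[] {
  const suffix = new Array<number>(nodes.length).fill(-Infinity);
  let furthest = -Infinity;
  for (let i = nodes.length - 1; i >= 0; i--) {
    suffix[i] = furthest;
    const node = nodes[i];
    if (!node.fixed) furthest = Math.max(furthest, node.box.top + node.box.height);
  }
  return suffix;
}

// The per-iteration scan being replaced: O(N - i) for each i.
function furthestEndNaive(nodes: SketchNode[], i: number): number {
  const ends = nodes
    .slice(i + 1)
    .filter((n) => !n.fixed)
    .map((n) => n.box.top + n.box.height);
  return Math.max(...ends); // Math.max() with no arguments is -Infinity
}

const nodes: SketchNode[] = [
  { fixed: false, box: { top: 0, height: 100 } }, // ends at 100
  { fixed: false, box: { top: 100, height: 900 } }, // ends at 1000
  { fixed: false, box: { top: 400, height: 150 } }, // ends at 550
  { fixed: false, box: { top: 550, height: 50 } }, // ends at 600
  { fixed: true, box: { top: 0, height: 2000 } }, // fixed, ignored
];

const suffixFurthestEnd = computeSuffixFurthestEnd(nodes);
console.log(suffixFurthestEnd); // [ 1000, 600, 600, -Infinity, -Infinity ]
```

Each lookup `suffixFurthestEnd[i]` returns the same value as `furthestEndNaive(nodes, i)`, so a caller in the loop can read it in O(1).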

3. how are dynamic nodes handled if there's no relayout?

Sorry that my explanation "Removed relayout entirely" confused you.
It meant "Skipped relayout on nextPage".

Dynamic nodes are still fully relaid out.
The optimization only skips relayout on the intermediate nextPage.

At the top of each splitPage call, resolveDynamicPage checks for dynamic nodes and, if found, executes their render props and calls relayoutPage. So the new flow is:

1. nextPage is created without relayout (the optimization)
2. On the next iteration, splitPage(nextPage, ...) is called
3. resolveDynamicPage triggers a full relayout if dynamic nodes exist
4. currentPage is always relaid out before being added to the output

So, dynamic nodes are never output without a fresh Yoga pass.
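The flow above can be modeled with a toy loop that only counts relayout calls. Everything here is a stand-in: `SketchPage`, the element counts, and the bodies of `relayoutPage`, `resolveDynamicPage`, `splitPage`, and `paginate` are hypothetical simplifications that borrow the function names from this thread, not the real implementation.

```typescript
// Toy page: just a number of elements and a dynamic-nodes marker.
type SketchPage = { id: number; elements: number; hasDynamicNodes: boolean };

let relayoutCalls = 0;

// Stand-in for the expensive Yoga pass.
function relayoutPage(page: SketchPage): SketchPage {
  relayoutCalls++;
  return page;
}

// Relayout only when the page contains dynamic nodes.
function resolveDynamicPage(page: SketchPage): SketchPage {
  return page.hasDynamicNodes ? relayoutPage(page) : page;
}

function splitPage(page: SketchPage, perPage: number): [SketchPage, SketchPage | null] {
  const resolved = resolveDynamicPage(page);
  // currentPage goes to the output, so it is always relaid out.
  const current = relayoutPage({ ...resolved, elements: Math.min(resolved.elements, perPage) });
  const left = resolved.elements - perPage;
  // nextPage is only input for the next iteration: no relayout here.
  const next = left > 0 ? { ...resolved, id: resolved.id + 1, elements: left } : null;
  return [current, next];
}

function paginate(page: SketchPage, perPage: number): SketchPage[] {
  const output: SketchPage[] = [];
  let next: SketchPage | null = page;
  while (next) {
    const [current, rest] = splitPage(next, perPage);
    output.push(current);
    next = rest;
  }
  return output;
}

relayoutCalls = 0;
const staticPages = paginate({ id: 1, elements: 25, hasDynamicNodes: false }, 10);
const staticCalls = relayoutCalls;

relayoutCalls = 0;
const dynamicPages = paginate({ id: 1, elements: 25, hasDynamicNodes: true }, 10);
const dynamicCalls = relayoutCalls;

console.log(staticPages.length, staticCalls, dynamicCalls); // 3 3 6
```

In this model a document without dynamic nodes pays one relayout per output page, while a document with dynamic nodes still gets an extra pass at the top of every `splitPage` call.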

exoego force-pushed the optimize-pagenation branch 2 times, most recently from 77efc9b to f25b95e (April 16, 2026 23:52)

exoego added 5 commits April 19, 2026 15:08

add changeset

23d9070

exoego force-pushed the optimize-pagenation branch from f25b95e to ffaa476 (April 19, 2026 06:08)

exoego wants to merge 5 commits into diegomura:master from exoego:optimize-pagenation
