perf: Optimize page splitting and relayout, so both per-iteration costs drop from O(N^2) to O(N)#3390
perf: Optimize page splitting and relayout, so both per-iteration costs drop from O(N^2) to O(N)#3390exoego wants to merge 5 commits into
Conversation
🦋 Changeset detectedLatest commit: ffaa476 The changes in this PR will be included in the next version bump. This PR includes changesets to release 7 packages
Not sure what this means? Click here to learn what changesets are. Click here if you're a maintainer who wants to add another changeset to this PR |
77efc9b to
f25b95e
Compare
|
Thanks @exoego . Can you please explain the logic? Not sure I get what it does. I think pagination needs to change more drastically though in order to support things like text columns. Right now it's hard to do because we first compute text in a big column and from there breaking is much harder |
I'm not sure I fully understand how a suffix array is useful here
Yoga (re)-layout should not add much overhead, also, how are dynamic nodes handled if there's no relayout?
Haha cool! I'd be curious to see how that looks, but still, I think the current pagination solution has become a bit too complex... I'm hesitant on keep adding complexity to it for perf improvements + new features. At this point I think it has to be redesigned completely as I feel there has to be a simpler solution. Most opened issues are due to pagination errors, wether it's text nodes layout, dynamic nodes, breaks, etc. Every thing I add at the current solution it's an extra layer of complexity or thing. that can break for a future migration. Right now the algorithm works by rendering everything as if it were a big-long page, and then start breaking things into pieces. This is mostly due to the nature of yoga and flexbox. But some nodes are dynamic, some other's geometries unknown (like text), and these depend on page size, not full-height layout. So I feel we shuold move more towards a "streaming" solution where pages are filled one by one somehow rather than a break-nodes solution. Not yet sure if it's possible |
|
I understand that you are more inclined to a full-rewrite of algorithm so it does not only boost performance, but also simplify introducing new features. It would be highly appreciated if this optimization get reviewed and merged as a short-term solution🙇 since there are seemingly many users, including myself, who are facing performance issues. |
|
I understand and I think it's a reasonable ask :) Can you confirm though that dynamic nodes work as expected? I'm a bit scared of removing the re-layout step. Not even myself sometimes get al the weird quirks of the current pagination algorithm :) |
|
@exoego just checked out the branch and tested the examples repo for a quick visual regression test. There's something odd in the mermaid example (if you run |
…aling behavior of resolvePagination Baseline results show worse-than-quadratic scaling: - 100 children: 5.1ms - 500 children: 126ms (25x) - 1000 children: 628ms (123x, worse than 100x) - 2000 children: 3,597ms (707x, worse than 400x)
- Remove per-iteration nodes.slice() and futureNodes.filter() calls - Add shouldBreakOptimized() that accepts pre-computed scalar values instead of scanning arrays each call (O(N) → O(1)) - The original shouldBreak is now a thin wrapper that computes¥ pre-computed values from arrays and delegates to shouldBreakOptimized. - Pre-compute suffix max array for furthest end of non-fixed future nodes in a single right-to-left pass (O(N)) - Pre-collect fixed node entries once instead of filtering per iteration - Track hasNonFixedPrevious as a running boolean instead of filtering previousElements array each iteration
Skip the expensive relayoutPage() call on nextPage since it's only used as input to the next splitPage iteration, never added to final output. The currentPage (which IS in the output) is still fully relaid out. Also fix splitNode to always set box.height on the next half, even for auto-height nodes. Without relayout, an auto-height node would keep its original (too large) box.height, causing infinite splitting loops.
…t relayout When a node triggers a split in splitNodes, the remaining siblings (via nodes.slice(i+1)) were pushed to nextChildren with their original box.top values. Previously relayout on nextPage corrected these positions, but after the relayout removal optimization, nodes like footers with marginTop:'auto' retained large top values and were incorrectly classified as "outside" the next page, causing them to appear alone on a separate page.
f25b95e to
ffaa476
Compare
1. mermaid example ... has completely blank pagesGood catch. Fixed it in ffaa476 and performance is still very good. CauseWhen a node triggered a split in FixThe fix adds 2. I'm not sure I fully understand how a suffix array is useful hereProblemThe original Solution (how suffix array helps)
3. how are dynamic nodes handled if there's no relayout?Sorry for the explanation "Removed relayout entirely" confused you. Dynamic nodes are still fully relaid out. At the top of each
So, dynamic nodes are never output without a fresh Yoga pass. |

This PR drastically optimizes the pagination mentioned in
I've added Vitest benchmark for pagination
(
yarn vitest bench packages/layout/tests/steps/resolvePagination.bench.ts)and compared p999 duration msec:
Scaling (Before) is
~O(N^2 * K).It grows quadratically (1→22→110→631 roughly N^2 pattern but worse, since K also scales with N).
Scaling (After( is
~O(N * K).It grows linearly with a constant overhead seemingly from per-page Yoga relayout, tracking close to the ideal 1→5→10→20 but slightlyWorse due to O(C) relayout per page.
Added several test cases on pagination to ensure the behavior of pagination is preserved.