
More correct caching logic#911

Draft
nicoburns wants to merge 7 commits into
DioxusLabs:mainfrom
nicoburns:correct-caching


@nicoburns
Member

Objective

Ensure that Taffy correctly computes layouts when re-layouting with a warm/populated cache.
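One way to state this objective as a checkable property is a differential test: lay out the same content once with a cache that persists across frames and once with a fresh cache, and require identical results. The sketch below is a toy model only. `TextNode`, `wrap_lines` and `warm_matches_cold` are hypothetical names invented for illustration and are not part of Taffy's API; the cache here is a plain map keyed by available width.

```rust
use std::collections::HashMap;

/// Toy stand-in for a layout engine: greedy wrapping of fixed-width words.
/// Returns the number of lines needed at the given available width.
fn wrap_lines(word_widths: &[u32], available_width: u32) -> u32 {
    let mut lines = 1;
    let mut used = 0;
    for &w in word_widths {
        // Start a new line when the next word does not fit on a non-empty line.
        if used > 0 && used + w > available_width {
            lines += 1;
            used = 0;
        }
        used += w;
    }
    lines
}

/// Hypothetical node with a per-node result cache keyed by available width.
struct TextNode {
    word_widths: Vec<u32>,
    cache: HashMap<u32, u32>,
}

impl TextNode {
    fn new(word_widths: Vec<u32>) -> Self {
        TextNode { word_widths, cache: HashMap::new() }
    }

    /// Returns the cached line count if present, otherwise computes and stores it.
    fn layout(&mut self, available_width: u32) -> u32 {
        if let Some(&hit) = self.cache.get(&available_width) {
            return hit;
        }
        let lines = wrap_lines(&self.word_widths, available_width);
        self.cache.insert(available_width, lines);
        lines
    }
}

/// Differential check: for each frame, the node whose cache persists across
/// frames must agree with a freshly created node that has an empty cache.
fn warm_matches_cold(word_widths: &[u32], frame_widths: &[u32]) -> bool {
    let mut warm = TextNode::new(word_widths.to_vec());
    frame_widths.iter().all(|&w| {
        let warm_result = warm.layout(w);
        let cold_result = TextNode::new(word_widths.to_vec()).layout(w);
        warm_result == cold_result
    })
}
```

Calling `warm_matches_cold(&[40, 30, 50], &[120, 80, 60, 120])` simulates four frames at different widths. A real regression test would replace the toy node with an actual layout tree and compare full layouts rather than line counts.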

Context

Blitz is seeing bugs in the layout that only occur when incremental mode is enabled (which corresponds to having a Taffy cache populated from a previous frame). There are reports of similar bugs from Floem.

In the screenshots below, note how in incremental mode the text in the top-right ("Create Account" and "Log In") wraps, whereas in non-incremental mode it does not. It is not supposed to wrap and does not do so in other browsers.

Non-incremental mode: Screenshot 2026-01-31 at 16 56 46
Incremental mode: Screenshot 2026-01-31 at 16 56 29

Benchmarks

This appears to have little effect on some benchmarks. But it is a 40-50% regression on the Flexbox "Deep tree (auto size)" benchmarks and the "mixed tree" benchmarks and a 15-25% regression on the CSS Grid "deep tree" benchmarks.

I'm also seeing perf regressions around the 50-80% mark when doing full (non-incremental) re-layouts of some websites in Blitz. https://en.wikipedia.org/wiki/Barack_Obama is ~36ms -> ~46ms. https://www.bbc.co.uk/news is ~9ms -> 16ms. Layouts with a populated cache are very fast (~100 microseconds). I don't have good numbers for the "partial cache" case.
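For reference, the relative slowdown implied by a before/after timing pair can be computed as below. This is plain arithmetic on the rounded timings quoted above, so the results are only indicative; `percent_change` is a helper name made up for this example.

```rust
/// Relative change from `before_ms` to `after_ms`, expressed in percent.
/// Positive values mean the "after" measurement is slower.
fn percent_change(before_ms: f64, after_ms: f64) -> f64 {
    (after_ms - before_ms) / before_ms * 100.0
}
```

With the quoted figures, `percent_change(36.0, 46.0)` is roughly 28 and `percent_change(9.0, 16.0)` is roughly 78.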

The good news is that it doesn't seem to affect scaling behaviour. It's a ~flat perf regression regardless of tree size.

cargo bench
   Compiling taffy v0.9.2 (/Users/nico/code/oss/taffy)
   Compiling taffy_benchmarks v0.1.0 (/Users/nico/code/oss/taffy/benches)
    Finished `bench` profile [optimized] target(s) in 6.30s
     Running unittests src/lib.rs (target/release/deps/taffy_benchmarks-4a3d205e86ae3bb9)

running 0 tests

test result: ok. 0 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

     Running benches/flexbox.rs (target/release/deps/flexbox-cbf27e9137a1920f)
Gnuplot not found, using plotters backend
yoga 'huge nested'/Taffy 0.7 /10000
                        time:   [5.5369 ms 5.5921 ms 5.6491 ms]
                        change: [−2.3938% −0.9506% +0.5954%] (p = 0.22 > 0.05)
                        No change in performance detected.

Wide tree/Taffy 0.7 (2-level hierarchy)/10000
                        time:   [7.0380 ms 7.1307 ms 7.3113 ms]
                        change: [−4.3709% +3.0942% +9.8721%] (p = 0.43 > 0.05)
                        No change in performance detected.

Deep tree (auto size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [5.0202 ms 5.1008 ms 5.1881 ms]
                        change: [+51.424% +54.215% +57.129%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Deep tree (auto size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [12.300 ms 12.421 ms 12.552 ms]
                        change: [+33.458% +38.480% +42.594%] (p = 0.00 < 0.05)
                        Performance has regressed.

Deep tree (random size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [2.4441 ms 2.4623 ms 2.4923 ms]
                        change: [−9.5346% −6.6524% −3.4755%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 10 measurements (40.00%)
  2 (20.00%) low mild
  2 (20.00%) high severe
Deep tree (random size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [6.3630 ms 6.4582 ms 6.6748 ms]
                        change: [−2.3044% +1.6328% +5.5245%] (p = 0.46 > 0.05)
                        No change in performance detected.

super deep/Taffy 0.7 /100
                        time:   [430.22 µs 447.87 µs 460.40 µs]
                        change: [+39.519% +44.728% +49.049%] (p = 0.00 < 0.05)
                        Performance has regressed.

     Running benches/grid.rs (target/release/deps/grid-fa3bd6dc95a03bd2)
Gnuplot not found, using plotters backend
grid/wide/31x31/961     time:   [1.0470 ms 1.1202 ms 1.2003 ms]
                        change: [−5.6796% +3.9343% +13.747%] (p = 0.45 > 0.05)
                        No change in performance detected.
grid/wide/100x100/10000 time:   [15.392 ms 16.370 ms 17.271 ms]
                        change: [−11.077% −7.1656% −3.3548%] (p = 0.00 < 0.05)
                        Performance has improved.
grid/wide/316x316/99856 time:   [217.80 ms 222.32 ms 227.62 ms]
                        change: [−9.5941% −7.3845% −5.1144%] (p = 0.00 < 0.05)
                        Performance has improved.

grid/deep/2x2/1024      time:   [2.9880 ms 3.0262 ms 3.0604 ms]
                        change: [+13.679% +17.257% +21.333%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/3x3/6561      time:   [15.620 ms 15.844 ms 16.097 ms]
                        change: [+21.902% +24.069% +26.205%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/2x2/16384     time:   [56.478 ms 56.713 ms 57.157 ms]
                        change: [+19.212% +21.136% +23.037%] (p = 0.00 < 0.05)
                        Performance has regressed.

Benchmarking grid/superdeep/1x1/100: Collecting 10 samples in estimated 5.0033 s (13k itera
grid/superdeep/1x1/100  time:   [357.68 µs 360.16 µs 362.37 µs]
                        change: [+41.724% +44.073% +46.613%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Benchmarking grid/superdeep/1x1/1000: Collecting 10 samples in estimated 5.1381 s (1320 ite
grid/superdeep/1x1/1000 time:   [3.6229 ms 3.6789 ms 3.7103 ms]
                        change: [+22.321% +25.045% +27.599%] (p = 0.00 < 0.05)
                        Performance has regressed.

     Running benches/mixed.rs (target/release/deps/mixed-879dff8856f2a28c)
Gnuplot not found, using plotters backend
Benchmarking mixed_flex_grid/mixed/depth_2_width_4: Collecting 100 samples in estimated 6.3
mixed_flex_grid/mixed/depth_2_width_4
                        time:   [2.5616 ms 2.5842 ms 2.6104 ms]
                        change: [+7.9924% +9.4217% +10.971%] (p = 0.00 < 0.05)
                        Performance has regressed.

@nicoburns added the bug (Something isn't working), performance (Layout go brr), and controversial (This work requires a heightened standard of review due to implementation or design complexity) labels on Jan 31, 2026
@nicoburns
Member Author

@jrmoulton Could you give this a go and see if it resolves the issues you're seeing? This should rebase cleanly on top of Taffy 0.9 if you don't want to upgrade to main just to test.

@jrmoulton

This does solve the case that I had shared with you but I still have another case that is broken that is fixed by aggressively clearing the cache on the text node on every frame.

Below is a case with your fix applied but without aggressively clearing the cache. If I aggressively clear the cache, it doesn't wrap.

bad-wrap.mp4

For more context:

In this case I am aggressively clearing the cache but without your fix applied, and the text doesn't get enough space. If I use your fix (even without manually clearing the cache) this case works.

did-not-wrap.mp4

@nicoburns
Member Author

@jrmoulton I've pushed another update that you may wish to test. Expect terrible performance with this one. But that ought to be fixable if it solves the correctness issues.

@jrmoulton

This change does fix all issues I was having without me doing any additional clearing of the cache ❤️

@nicoburns
Member Author

Hmm... I think this may only be working because it's thrashing the cache (effectively aggressively clearing it for us). When I try to get the performance back it breaks again on the layouts I'm testing.
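To illustrate what "thrashing" means here, consider a deliberately tiny model: a single-slot cache that only returns its entry on an exact key match. This is not Taffy's cache implementation; `OneSlotCache` and `count_hits_and_misses` are hypothetical names, and the doubling closure stands in for an expensive layout computation. When lookups alternate between two keys, every lookup evicts the previous entry, so nothing is ever reused. That behaves like clearing the cache on every lookup, which hides stale-entry bugs at the cost of recomputing everything.

```rust
/// Hypothetical single-slot cache: stores one (key, value) pair and only
/// returns it when the requested key is exactly equal to the stored key.
struct OneSlotCache {
    entry: Option<(u32, u32)>,
    hits: u32,
    misses: u32,
}

impl OneSlotCache {
    fn new() -> Self {
        OneSlotCache { entry: None, hits: 0, misses: 0 }
    }

    fn get_or_compute(&mut self, key: u32, compute: impl FnOnce(u32) -> u32) -> u32 {
        match self.entry {
            // Exact match: reuse the stored value.
            Some((k, v)) if k == key => {
                self.hits += 1;
                v
            }
            // Empty slot or different key: recompute and evict the old entry.
            _ => {
                self.misses += 1;
                let v = compute(key);
                self.entry = Some((key, v));
                v
            }
        }
    }
}

/// Runs a sequence of lookups and reports (hits, misses).
fn count_hits_and_misses(keys: &[u32]) -> (u32, u32) {
    let mut cache = OneSlotCache::new();
    for &k in keys {
        // Stand-in for an expensive layout computation.
        cache.get_or_compute(k, |k| k * 2);
    }
    (cache.hits, cache.misses)
}
```

In this model, `count_hits_and_misses(&[100, 200, 100, 200])` yields zero hits and four misses, while `count_hits_and_misses(&[100, 100, 100, 100])` yields three hits and one miss.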

@jrmoulton Your layout looks like a particularly simple example that still breaks, and I would be keen to turn it into a test case. Would you be able to post independently runnable code (even Floem code) that reproduces the issue?

@jrmoulton

Whoops, forgot to respond to this. Yeah, I'll get an example.

@nicoburns force-pushed the correct-caching branch 7 times, most recently from 44cbe8d to 5f26b59, on March 24, 2026 00:11
@nicoburns changed the title from "Use simple cache key equality check" to "More correct caching logic" on Mar 24, 2026
@nicoburns
Member Author

@jrmoulton Another iteration for you to test. This one should be potentially landable, although I'm seeing regressions of 30-60% in the "deep" benchmarks (flexbox and grid).

Benchmark results
yoga 'huge nested'/Taffy 0.7 /10000
                        time:   [5.5530 ms 5.6022 ms 5.6558 ms]
                        change: [−7.0555% −5.9233% −4.7558%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

Wide tree/Taffy 0.7 (2-level hierarchy)/10000
                        time:   [7.8183 ms 7.9693 ms 8.1215 ms]
                        change: [+3.2884% +6.7535% +10.627%] (p = 0.00 < 0.05)
                        Performance has regressed.

Deep tree (auto size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [5.2799 ms 5.3082 ms 5.3459 ms]
                        change: [+58.170% +62.770% +67.446%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
Deep tree (auto size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [13.323 ms 13.408 ms 13.499 ms]
                        change: [+54.681% +58.203% +61.760%] (p = 0.00 < 0.05)
                        Performance has regressed.

Deep tree (random size)/Taffy 0.7 (12-level hierarchy)/4000
                        time:   [2.7064 ms 2.7364 ms 2.7705 ms]
                        change: [+5.6131% +10.387% +16.122%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high severe
Deep tree (random size)/Taffy 0.7 (14-level hierarchy)/10000
                        time:   [6.9714 ms 7.0218 ms 7.1061 ms]
                        change: [−1.6758% +2.8995% +7.5394%] (p = 0.26 > 0.05)
                        No change in performance detected.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

super deep/Taffy 0.7 /100
                        time:   [431.20 µs 436.88 µs 440.37 µs]
                        change: [+38.650% +45.307% +51.170%] (p = 0.00 < 0.05)
                        Performance has regressed.

     Running benches/grid.rs (target/release/deps/grid-7d8402e1ff92539f)
Gnuplot not found, using plotters backend
grid/wide/31x31/961     time:   [1.3374 ms 1.3710 ms 1.4062 ms]
                        change: [+25.626% +35.091% +43.777%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
grid/wide/100x100/10000 time:   [17.200 ms 17.483 ms 18.106 ms]
                        change: [+2.3861% +6.9475% +11.751%] (p = 0.01 < 0.05)
                        Performance has regressed.
grid/wide/316x316/99856 time:   [223.51 ms 224.08 ms 224.68 ms]
                        change: [+2.5410% +2.8589% +3.1723%] (p = 0.00 < 0.05)
                        Performance has regressed.

grid/deep/2x2/1024      time:   [3.1721 ms 3.1994 ms 3.2444 ms]
                        change: [+38.378% +40.877% +43.927%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild
grid/deep/3x3/6561      time:   [16.232 ms 16.311 ms 16.463 ms]
                        change: [+33.071% +34.416% +35.845%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/deep/2x2/16384     time:   [60.478 ms 60.709 ms 61.088 ms]
                        change: [+34.477% +36.279% +38.449%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

grid/superdeep/1x1/100  time:   [369.98 µs 372.81 µs 375.74 µs]
                        change: [+45.946% +48.288% +50.575%] (p = 0.00 < 0.05)
                        Performance has regressed.
grid/superdeep/1x1/1000 time:   [3.8290 ms 3.8492 ms 3.8676 ms]
                        change: [+30.205% +31.825% +33.603%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 10 measurements (20.00%)
  1 (10.00%) low mild
  1 (10.00%) high mild

     Running benches/mixed.rs (target/release/deps/mixed-eed1010c2fd5b815)
Gnuplot not found, using plotters backend
mixed_flex_grid/mixed/depth_2_width_4
                        time:   [2.7138 ms 2.7253 ms 2.7368 ms]
                        change: [+16.233% +17.766% +19.299%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
Benchmarking mixed_flex_grid/mixed/depth_2_width_8: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.3s, or reduce sample count to 60.
mixed_flex_grid/mixed/depth_2_width_8
                        time:   [5.1472 ms 5.1827 ms 5.2193 ms]
                        change: [+4.9143% +5.8135% +6.6509%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild
Benchmarking mixed_flex_grid/mixed/depth_4_width_4: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 33.1s, or reduce sample count to 10.
mixed_flex_grid/mixed/depth_4_width_4
                        time:   [60.310 ms 60.470 ms 60.631 ms]
                        change: [+13.134% +14.130% +15.038%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

Signed-off-by: Nico Burns <nico@nicoburns.com>
nicoburns added 2 commits May 15, 2026 18:41
