Skip to content

memtable: Shrink sparse raft-engine memtable entry index buffers more proactively after compaction. #401

Merged
ti-chi-bot[bot] merged 3 commits into
tikv:masterfrom
LykxSassinator:lucasliang/shrink_in_time
Apr 29, 2026
Merged

memtable: Shrink sparse raft-engine memtable entry index buffers more proactively after compaction. #401
ti-chi-bot[bot] merged 3 commits into
tikv:masterfrom
LykxSassinator:lucasliang/shrink_in_time

Conversation

@LykxSassinator
Copy link
Copy Markdown
Contributor

@LykxSassinator LykxSassinator commented Apr 29, 2026

Problem

Close tikv/tikv#19544

raft-engine memtable memory can grow slowly over time even when workload and region count stay relatively stable.

A notable source is MemTable::entry_indexes: after entries are compacted away, the underlying deque may continue to retain a large amount of spare capacity. The previous shrink policy only triggered once capacity grew beyond 1023, and when triggered it used shrink_to_fit(), which was coarse and could still leave room for improvement in long-running stable clusters.

Solution

This PR makes entry index shrinking more proactive, but keeps it conservative enough to avoid unnecessary capacity churn.

The new policy still only runs after compact_to(), and shrinks entry_indexes only when all of the following are true:

  • the buffer has grown past a minimum capacity threshold;
  • live entries occupy at most 50% of the allocated slots;
  • shrinking would reclaim a meaningful amount of memory.

Instead of shrinking all the way down to len, the new logic keeps a small headroom window (len + CAPACITY_INIT) so a freshly compacted memtable does not immediately regrow on the next small append burst.

In short, this change targets long-lived sparse memtables while avoiding a simple “lower the threshold and always shrink harder” strategy.

Why this should not introduce a meaningful performance regression

The new logic does not run on the append hot path. It is only evaluated after compact_to(), and the fast path is just a few integer checks.

Actual reallocations become possible earlier than before, but only for clearly sparse memtables. The added utilization gate, minimum reclaim gate, and retained headroom together are intended to avoid frequent shrink/regrow oscillation.

This means the tradeoff is:

  • slightly more compact-side realloc opportunities in sparse cases;
  • noticeably better reclamation of long-lived unused entry index capacity.

Given that the logic stays off the append hot path and avoids shrink_to_fit() churn, the risk of a meaningful performance regression is low.

Validation

Added targeted regression tests in src/memtable.rs:

  1. test_memtable_compact_shrinks_sparse_entry_indexes
  • Verifies that a sparse memtable actually shrinks after compaction.
  • Also verifies the new policy keeps headroom instead of shrinking all the way to len.
  1. test_memtable_compact_keeps_busy_entry_indexes_capacity
  • Verifies that a still-busy memtable does not shrink just because compaction happened.
  1. test_memtable_compact_keeps_capacity_when_reclaim_is_too_small
  • Verifies that the deque is not shrunk when the reclaimable space is too small to justify a reallocation.
  1. test_memtable_append_after_compact_shrink_keeps_span_consistent
  • Verifies span, fetch, and live entry accounting remain correct after a shrink followed by new appends.

Build validation:

  • cargo +nightly-2025-04-03 fmt --all --check
  • cargo +nightly-2025-04-03 check --lib
  • cargo +nightly-2025-04-03 check --lib --no-default-features --features internals

Note: full cargo test is currently blocked in this environment by the existing Darwin/arm64 grpcio-sys toolchain issue rather than by this patch itself.

Summary by CodeRabbit

  • Refactor

    • Revised container shrinking to use utilization-aware capacity planning, making memory reclamation more aggressive while preserving a slack window to avoid frequent resizes.
  • Tests

    • Added tests for sparse-shrink, high-utilization (no shrink), minimal-reclaim (skip shrink), and post-shrink correctness with subsequent appends.
  • Style

    • Minor diagnostic message formatting tweaks for clearer error output.

… proactively after compaction.

Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
@ti-chi-bot ti-chi-bot Bot added the dco-signoff: yes Indicates the PR's author has signed the dco. label Apr 29, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 29, 2026

📝 Walkthrough

Walkthrough

Entry-index shrinking now uses utilization and capacity-planning gates: it computes len, capacity, and a target (from CAPACITY_INIT), applies utilization and minimum-reclaim constraints, lowers the shrink threshold, and calls shrink_to(target) (preserving slack). Tests validate sparse shrink, busy/no-shrink, too-small reclaim, and append-after-shrink consistency.

Changes

Cohort / File(s) Summary
Memtable shrink logic & tests
src/memtable.rs
Reworked maybe_shrink_entry_indexes to compute len, capacity, and a target (based on CAPACITY_INIT); added utilization (len * CAPACITY_SHRINK_UTILIZATION_FACTOR <= capacity) and minimum-reclaim checks (CAPACITY_SHRINK_MIN_RECLAIM) before shrinking; lowered shrink threshold and replaced shrink_to_fit() with shrink_to(target); added tests for sparse-shrink, high-utilization no-shrink, too-small-reclaim no-shrink, and append+fetch consistency after shrink.
Test assertion message
src/engine.rs
Adjusted panic message formatting in test_rewrite_atomic_group to use inline {data:?} capture; no behavioral change.
Log writer panic message
src/file_pipe_log/log_file.rs
Formatting tweak in panic after failed reseek to use {e} in message; no behavioral change.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 I nibble buffers, soft and neat,
I shrink with math, not hasty feet.
I leave a slack, a comfy span,
Reclaim a bit — but keep a plan. 🥕

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: improving the memtable entry index buffer shrinking policy to be more proactive after compaction.
Linked Issues check ✅ Passed The PR directly addresses the memory growth issue in #19544 by implementing a more proactive shrink policy with utilization-based heuristics and reduced thresholds.
Out of Scope Changes check ✅ Passed Minor formatting changes in src/engine.rs and src/file_pipe_log/log_file.rs are tangential but reasonable cleanup alongside the main memtable changes.
Docstring Coverage ✅ Passed Docstring coverage is 88.89% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share
Review rate limit: 7/8 reviews remaining, refill in 7 minutes and 30 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

@ti-chi-bot ti-chi-bot Bot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Apr 29, 2026
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/memtable.rs (1)

69-74: Document why these thresholds are written as N - 1.

128 - 1 and 64 - 1 look like accidental off-by-one values at first glance. A short note explaining the intent would make this harder to “clean up” incorrectly later.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memtable.rs` around lines 69 - 74, Explain in a short doc comment above
the three constants that the subtraction (e.g. `128 - 1`, `64 - 1`) is
intentional to produce an inclusive threshold (power-of-two minus one) rather
than an accidental off-by-one; mention that this form makes the intent explicit
for comparisons that use <= or bitmasking and prevents future accidental
simplification to just `128` or `64`. Update the comment above
CAPACITY_SHRINK_THRESHOLD, CAPACITY_SHRINK_UTILIZATION_FACTOR, and
CAPACITY_SHRINK_MIN_RECLAIM to state this rationale succinctly so maintainers
understand the deliberate `N - 1` choice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/memtable.rs`:
- Around line 69-74: Explain in a short doc comment above the three constants
that the subtraction (e.g. `128 - 1`, `64 - 1`) is intentional to produce an
inclusive threshold (power-of-two minus one) rather than an accidental
off-by-one; mention that this form makes the intent explicit for comparisons
that use <= or bitmasking and prevents future accidental simplification to just
`128` or `64`. Update the comment above CAPACITY_SHRINK_THRESHOLD,
CAPACITY_SHRINK_UTILIZATION_FACTOR, and CAPACITY_SHRINK_MIN_RECLAIM to state
this rationale succinctly so maintainers understand the deliberate `N - 1`
choice.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: e7f7c8bd-991e-40d3-b4d8-4a7a361d369b

📥 Commits

Reviewing files that changed from the base of the PR and between f0fcebe and ebcd962.

📒 Files selected for processing (1)
  • src/memtable.rs

Copy link
Copy Markdown
Contributor

@glorv glorv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ti-chi-bot ti-chi-bot Bot added needs-1-more-lgtm Indicates a PR needs 1 more LGTM. approved labels Apr 29, 2026
Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
src/memtable.rs (1)

627-640: Consider making reclaim math explicit for readability.

The logic looks correct; a small cleanup can make the “minimum reclaim” gate easier to scan and maintain.

♻️ Suggested refactor
     fn maybe_shrink_entry_indexes(&mut self) {
         let len = self.entry_indexes.len();
         let capacity = self.entry_indexes.capacity();
         let target = std::cmp::max(len.saturating_add(CAPACITY_INIT), CAPACITY_INIT);
+        let reclaimable = capacity.saturating_sub(target);
         if capacity < CAPACITY_SHRINK_THRESHOLD
             || len.saturating_mul(CAPACITY_SHRINK_UTILIZATION_FACTOR) > capacity
-            || capacity <= target.saturating_add(CAPACITY_SHRINK_MIN_RECLAIM)
+            || reclaimable <= CAPACITY_SHRINK_MIN_RECLAIM
         {
             return;
         }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/memtable.rs` around lines 627 - 640, The capacity-shrink gating is
correct but hard to read—refactor the condition in the method that manipulates
self.entry_indexes (using symbols: entry_indexes, CAPACITY_INIT,
CAPACITY_SHRINK_THRESHOLD, CAPACITY_SHRINK_UTILIZATION_FACTOR,
CAPACITY_SHRINK_MIN_RECLAIM, target, len, capacity, shrink_to) by computing
named intermediate values (e.g., target, reclaimable =
capacity.saturating_sub(target), utilization_limit = capacity /
CAPACITY_SHRINK_UTILIZATION_FACTOR or len.saturating_mul(...), and a boolean for
min_reclaim_ok) and then use those named booleans in a single if that returns
early; this makes the “minimum reclaim” gate explicit and improves readability
without changing behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@src/memtable.rs`:
- Around line 627-640: The capacity-shrink gating is correct but hard to
read—refactor the condition in the method that manipulates self.entry_indexes
(using symbols: entry_indexes, CAPACITY_INIT, CAPACITY_SHRINK_THRESHOLD,
CAPACITY_SHRINK_UTILIZATION_FACTOR, CAPACITY_SHRINK_MIN_RECLAIM, target, len,
capacity, shrink_to) by computing named intermediate values (e.g., target,
reclaimable = capacity.saturating_sub(target), utilization_limit = capacity /
CAPACITY_SHRINK_UTILIZATION_FACTOR or len.saturating_mul(...), and a boolean for
min_reclaim_ok) and then use those named booleans in a single if that returns
early; this makes the “minimum reclaim” gate explicit and improves readability
without changing behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 2dcdd967-8c5a-41eb-a93b-5968c5331b21

📥 Commits

Reviewing files that changed from the base of the PR and between 5380ff6 and 5db97bb.

📒 Files selected for processing (1)
  • src/memtable.rs

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 29, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 97.82%. Comparing base (f0fcebe) to head (5db97bb).
⚠️ Report is 1 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #401      +/-   ##
==========================================
+ Coverage   97.78%   97.82%   +0.04%     
==========================================
  Files          33       33              
  Lines       11315    11400      +85     
==========================================
+ Hits        11064    11152      +88     
+ Misses        251      248       -3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@ti-chi-bot ti-chi-bot Bot added the lgtm label Apr 29, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 29, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: glorv, overvenus

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot Bot removed the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Apr 29, 2026
@ti-chi-bot
Copy link
Copy Markdown

ti-chi-bot Bot commented Apr 29, 2026

[LGTM Timeline notifier]

Timeline:

  • 2026-04-29 08:35:25.348385727 +0000 UTC m=+2759730.553745784: ☑️ agreed by glorv.
  • 2026-04-29 14:22:49.415828597 +0000 UTC m=+2780574.621188654: ☑️ agreed by overvenus.

@ti-chi-bot ti-chi-bot Bot merged commit 55bccde into tikv:master Apr 29, 2026
9 checks passed
LykxSassinator added a commit that referenced this pull request Apr 30, 2026
… proactively after compaction. (#401) (#402)

close tikv/tikv#19544

Signed-off-by: lucasliang <nkcs_lykx@hotmail.com>
@ti-chi-bot ti-chi-bot Bot added the needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. label May 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved dco-signoff: yes Indicates the PR's author has signed the dco. lgtm needs-cherry-pick-release-8.5 Should cherry pick this PR to release-8.5 branch. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

raft-engine's memory usage increases slowly overtime which the workload is stable

3 participants