Skip to content

fix: enforce flat-phase contract via assert (no depth counter)#12

Merged
Barnadrot merged 1 commit into
mainfrom
fix/assert-flat-phase-contract
May 11, 2026
Merged

fix: enforce flat-phase contract via assert (no depth counter)#12
Barnadrot merged 1 commit into
mainfrom
fix/assert-flat-phase-contract

Conversation

@Barnadrot
Copy link
Copy Markdown
Owner

@Barnadrot Barnadrot commented May 11, 2026

Summary

Replace the PHASE_DEPTH counter with an unconditional assert in begin_phase() that panics when a phase is already active. The counter masked correctness bugs: a panic between paired begin_phase()/end_phase() orphaned the depth, and any subsequent begin_phase() then silently nested under the orphan instead of recycling the slab — turning a bug into a memory leak the user couldn't see.

Mirrors the pattern on commit 684a526c ("zk-alloc: assert begin_phase has flat lifetime"), which currently lives on leanEthereum/leanMultisig#215 ("zkalloc fix: safe arena routing", OPEN since 2026-05-06, not merged) and on a Barnadrot/leanMultisig experiment branch. Neither downstream main has the pattern yet. This PR brings it to standalone Barnadrot/zk-alloc independently; a separate sync to leanMultisig vendored crates/backend/zk-alloc/ remains a follow-up.

Behavior change (caller-visible)

begin_phase() now panics if called while another phase is active. The supported pairing model is exactly one begin_phase() per end_phase() (or use PhaseGuard / phase() for panic-safe pairing). Nested phases are no longer tolerated.

If you have code that relied on nested begin/end composing, it will now panic at runtime — that path was already silently wrong (the inner begin's GENERATION bump recycled the outer's slab on the next allocation); the assert just makes the failure loud.

Why this is the right shape

  1. The depth counter's correctness only held when every begin had a matching end on every path, including panic unwind. Without PhaseGuard, that invariant could not be statically enforced.
  2. When the invariant broke (orphaned depth), the failure mode was silent slab growth — the slab kept appending across "iterations" until the per-thread slab overflowed to System. The user observed memory bloat, not a crash, and the cause was three call sites removed.
  3. The assert variant fails at the bug, not three function calls later.

Test changes

  • tests/test_nested_phase.rs — rewritten as a #[should_panic] contract test pinning the assert message.
  • tests/test_panic_phase.rs — pinned to "next begin_phase trips the assert loudly" after an orphaned phase, instead of the old "depth counter preserves data" assertion.
  • tests/test_phase_guard.rs::nested_phase_guards_composenested_phase_guards_panic (#[should_panic]). Lock acquires use unwrap_or_else(|e| e.into_inner()) so the should_panic test doesn't poison the lock and cascade-fail the binary.
  • tests/correctness.rs, tests/test_crossbeam_epoch.rs — added a per-binary PHASE_LOCK around phase-touching tests so they don't race on ARENA_ACTIVE under parallel test execution. Added the missing end_phase() in phase_boundary_does_not_crash.
  • tests/test_realloc_overlap.rs — reflowed an ASCII-diagram doc comment that the new clippy 1.94 doc_overindented_list_items lint rejected (unrelated to the phase-contract change but blocked CI).

Verified locally

  • cargo fmt --check clean
  • cargo clippy --workspace --all-targets -- -D warnings clean
  • cargo test --workspace — all 21 tests pass across 8 test binaries

Test plan

  • CI green on this PR
  • Spot-check downstream: leanMultisig does not yet run this shape — PR #215 (Emile, OPEN) has it on an unmerged branch. A separate sync PR to leanMultisig vendored crates/backend/zk-alloc/ is queued as a follow-up after this lands.
  • Confirm any caller that relied on nested begin/end is migrated to PhaseGuard / phase() before merging.

🤖 Generated with Claude Code

Replace the PHASE_DEPTH counter with an unconditional assert in
begin_phase() that panics when a phase is already active. The counter
masked correctness bugs: a panic between paired calls orphaned the
depth and silently shifted subsequent phases into nested mode, where
the slab kept growing instead of recycling.

Mirrors the pattern Emile cherry-picked into leanMultisig as
'zk-alloc: assert begin_phase has flat lifetime' (684a526c). Bringing
it back to the standalone repo so downstream consumers see the same
contract.

- src/lib.rs: drop PHASE_DEPTH, swap-based assert in begin_phase,
  unconditional store in end_phase. Doc-comment now states the
  contract explicitly.
- tests/test_nested_phase.rs: rewritten as a #[should_panic] contract
  test pinning the assert message.
- tests/test_panic_phase.rs: pinned to 'next begin_phase trips the
  assert loudly' after an orphaned phase, instead of the old
  'depth-counter preserves data' assertion.
- tests/test_phase_guard.rs: nested_phase_guards_compose →
  nested_phase_guards_panic (#[should_panic]). Lock acquires ignore
  poisoning so the should_panic test does not cascade-fail the binary.
- tests/correctness.rs, tests/test_crossbeam_epoch.rs: add a
  per-binary PHASE_LOCK around phase-touching tests so they don't
  race on ARENA_ACTIVE under parallel test execution. Add the missing
  end_phase() in phase_boundary_does_not_crash.
- tests/test_realloc_overlap.rs: reflow a doc-comment ASCII diagram
  the new clippy 1.94 doc_overindented_list_items lint rejected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@Barnadrot Barnadrot merged commit 6165ce1 into main May 11, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant