Refactor/fiat shamir statement streamlined by MauroToscano · Pull Request #617 · yetanotherco/lambda_vm

MauroToscano · 2026-05-22T20:30:55Z

No description provided.

…script The VM prover and verifier built their transcript with `new(&[])` — the statement (program, public output, table layout) was never absorbed, so Fiat-Shamir challenges did not depend on it. Soundness was preserved only by out-of-band reconstruction checks in the verifier. Add a canonical, versioned, length-prefixed encoder (`prover/src/statement.rs`) that digests the full statement — ELF hash, public output, table counts, and page layout — into a 32-byte transcript seed. Seed the transcript with it on the prove path, the verify path, and the bus-balance replay (`replay_transcript_phase_a`), so all three derive identical challenges for an honest proof and diverge for any tampered statement. Closes the frozen-heart class of issue as defense-in-depth and is a prerequisite for recursion. Breaking proof-format change (challenges differ). `tiny-keccak` is promoted from a dev-dependency to a dependency for the statement digest.

Apply the simplifications to the statement-binding work: - Drop tiny-keccak as a prod dep; the DefaultTranscript is already a Keccak256 absorber, so the outer keccak256 in statement_seed() was redundant. ELF digest now uses sha3::Keccak256 (already a transitive prod dep via crypto). tiny-keccak stays as a dev-dep for the in-VM keccak precompile cross-check. - Replace encode_statement -> Vec<u8> + statement_seed -> [u8; 32] with a single absorb_statement(&mut transcript, ...) that streams bytes directly into the transcript. No intermediate allocation. - Drop FORMAT_VERSION constant — the version is in DOMAIN_TAG suffix (LAMBDAVM_STARK_STATEMENT_V1). Bump the suffix on any encoding change. - Drop the redundant 8-byte length prefix on the elf bytes: we hash to fixed 32-byte digest, so the prefix doesn't prevent any collision. - replay_transcript_phase_a / compute_expected_commit_bus_balance now take &mut DefaultTranscript<E> instead of an opaque transcript_seed: &[u8]. Callers decide what's in the transcript; internal tests just hand a fresh empty one (no &[] magic sentinel). - Fold the duplicated lde_trace_main_merkle_root append branches in replay_transcript_phase_a. - verify_with_options absorbs the statement ONCE, then clones the transcript so multi_verify and the bus-balance replay both start from the same post-absorb state — no closure, no field-list drift. - statement_tests.rs reframed around transcript.state(). 3 tests (state_is_deterministic, state_depends_on_every_field, public_output_length_prefix_prevents_collision) cover the same surface as the 8 byte-level encoder tests they replace, hitting the real absorb path. Drop the two-ELF e2e test that needed RISC-V artifacts and couldn't run in author env. - 6 internal test sites (decode_tests.rs, prove_elfs_tests.rs): pass a fresh DefaultTranscript instead of &[]. Soundness commitment unchanged: same five statement fields bound, same domain separation, same length prefixes where needed, same cross-site agreement (now explicitly tested via deterministic absorb). Follow-up: Phase A is still replayed twice on the verify path (once in compute_expected_commit_bus_balance, once in multi_verify). That's pre-existing, not from this PR; removing it would change multi_verify's public API to take a closure that produces the bus balance offset given (z, alpha). Separate scope.

github-actions · 2026-05-22T20:32:34Z

Codex Code Review

No issues found in the PR diff.

I reviewed the statement absorption changes, transcript replay path, prover/verifier transcript initialization, and updated tests. The changes appear consistent: prover, verifier, and commit bus replay all start from the same statement-bound transcript state, and tampered statement fields should cause verification failure.

Verification note: I attempted cargo test -p lambda-vm-prover statement_tests --no-default-features and cargo check -p lambda-vm-prover --no-default-features, but both were blocked by the environment because rustup cannot write to /home/runner/.rustup/tmp on this filesystem. git diff --check passed.

claude · 2026-05-22T20:36:08Z

+
+    t.append_bytes(&(num_private_input_pages as u64).to_le_bytes());
+
+    // runtime_page_ranges: count-prefixed; each entry fixed width.


Medium (soundness): The hardcoded field list silently excludes any field added to TableCounts in the future. There's no compile-time check that this array covers the full struct.

A new TableCounts field that isn't listed here won't be bound into the transcript, which would let a proof generated for one table configuration pass verification against a different one — breaking the statement-binding guarantee this PR introduces.

Consider adding a test that structurally validates field completeness, e.g.:

#[test] fn table_counts_all_fields_absorbed() { // If TableCounts gains a new field, this test should fail until // absorb_statement is updated to include it. let counts = TableCounts { cpu: 1, lt: 2, memw: 3, memw_aligned: 4, load: 5, mul: 6, dvrm: 7, shift: 8, branch: 9, memw_register: 10, // Adding a new field here causes a compile error, which is the // signal to update the loop in absorb_statement. }; // exhaustive struct literal above is the actual guard let _ = counts; }

Alternatively, implement a fn counts_as_array(&self) -> [usize; 10] method on TableCounts (or a to_le_bytes helper) so the encoding is co-located with the struct definition.

claude · 2026-05-22T20:36:21Z

Review: Fiat-Shamir statement binding

The security approach is sound. Binding the full statement into the transcript before Phase A prevents malleability attacks where a tampered ELF, public output, or table layout would otherwise produce valid-looking challenges.

Logic correctness:

The refactor of the phase-A replay is semantically equivalent to the original (the merkle root was always appended in both branches).
The transcript clone for the bus-balance replay is the correct pattern — both paths start from the same post-statement state.
Tests using the low-level prover directly are consistent (both prover and replay use empty transcripts at that level), and the security property is exercised end-to-end by the higher-level prove/verify tests.

One issue found:

Medium (soundness): The hardcoded TableCounts field array in absorb_statement silently excludes any field added to the struct in the future — see inline comment.

diegokingston and others added 3 commits May 22, 2026 11:42

Merge branch 'main' into harden/fiat-shamir-statement-binding

28f2776

claude Bot reviewed May 22, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor/fiat shamir statement streamlined#617

Refactor/fiat shamir statement streamlined#617
MauroToscano wants to merge 3 commits into
mainfrom
refactor/fiat-shamir-statement-streamlined

MauroToscano commented May 22, 2026

Uh oh!

github-actions Bot commented May 22, 2026

Uh oh!

claude Bot May 22, 2026

Uh oh!

claude Bot commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants


		t.append_bytes(&(num_private_input_pages as u64).to_le_bytes());

		// runtime_page_ranges: count-prefixed; each entry fixed width.

Conversation

MauroToscano commented May 22, 2026

Uh oh!

github-actions Bot commented May 22, 2026

Codex Code Review

Uh oh!

claude Bot May 22, 2026

Choose a reason for hiding this comment

Uh oh!

claude Bot commented May 22, 2026

Review: Fiat-Shamir statement binding

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants