Refactor/fiat shamir statement streamlined#617
Conversation
…script The VM prover and verifier built their transcript with `new(&[])` — the statement (program, public output, table layout) was never absorbed, so Fiat-Shamir challenges did not depend on it. Soundness was preserved only by out-of-band reconstruction checks in the verifier. Add a canonical, versioned, length-prefixed encoder (`prover/src/statement.rs`) that digests the full statement — ELF hash, public output, table counts, and page layout — into a 32-byte transcript seed. Seed the transcript with it on the prove path, the verify path, and the bus-balance replay (`replay_transcript_phase_a`), so all three derive identical challenges for an honest proof and diverge for any tampered statement. Closes the frozen-heart class of issue as defense-in-depth and is a prerequisite for recursion. Breaking proof-format change (challenges differ). `tiny-keccak` is promoted from a dev-dependency to a dependency for the statement digest.
Apply the simplifications to the statement-binding work: - Drop tiny-keccak as a prod dep; the DefaultTranscript is already a Keccak256 absorber, so the outer keccak256 in statement_seed() was redundant. ELF digest now uses sha3::Keccak256 (already a transitive prod dep via crypto). tiny-keccak stays as a dev-dep for the in-VM keccak precompile cross-check. - Replace encode_statement -> Vec<u8> + statement_seed -> [u8; 32] with a single absorb_statement(&mut transcript, ...) that streams bytes directly into the transcript. No intermediate allocation. - Drop FORMAT_VERSION constant — the version is in DOMAIN_TAG suffix (LAMBDAVM_STARK_STATEMENT_V1). Bump the suffix on any encoding change. - Drop the redundant 8-byte length prefix on the elf bytes: we hash to fixed 32-byte digest, so the prefix doesn't prevent any collision. - replay_transcript_phase_a / compute_expected_commit_bus_balance now take &mut DefaultTranscript<E> instead of an opaque transcript_seed: &[u8]. Callers decide what's in the transcript; internal tests just hand a fresh empty one (no &[] magic sentinel). - Fold the duplicated lde_trace_main_merkle_root append branches in replay_transcript_phase_a. - verify_with_options absorbs the statement ONCE, then clones the transcript so multi_verify and the bus-balance replay both start from the same post-absorb state — no closure, no field-list drift. - statement_tests.rs reframed around transcript.state(). 3 tests (state_is_deterministic, state_depends_on_every_field, public_output_length_prefix_prevents_collision) cover the same surface as the 8 byte-level encoder tests they replace, hitting the real absorb path. Drop the two-ELF e2e test that needed RISC-V artifacts and couldn't run in author env. - 6 internal test sites (decode_tests.rs, prove_elfs_tests.rs): pass a fresh DefaultTranscript instead of &[]. Soundness commitment unchanged: same five statement fields bound, same domain separation, same length prefixes where needed, same cross-site agreement (now explicitly tested via deterministic absorb). Follow-up: Phase A is still replayed twice on the verify path (once in compute_expected_commit_bus_balance, once in multi_verify). That's pre-existing, not from this PR; removing it would change multi_verify's public API to take a closure that produces the bus balance offset given (z, alpha). Separate scope.
Codex Code ReviewNo issues found in the PR diff. I reviewed the statement absorption changes, transcript replay path, prover/verifier transcript initialization, and updated tests. The changes appear consistent: prover, verifier, and commit bus replay all start from the same statement-bound transcript state, and tampered statement fields should cause verification failure. Verification note: I attempted |
|
|
||
| t.append_bytes(&(num_private_input_pages as u64).to_le_bytes()); | ||
|
|
||
| // runtime_page_ranges: count-prefixed; each entry fixed width. |
There was a problem hiding this comment.
Medium (soundness): The hardcoded field list silently excludes any field added to TableCounts in the future. There's no compile-time check that this array covers the full struct.
A new TableCounts field that isn't listed here won't be bound into the transcript, which would let a proof generated for one table configuration pass verification against a different one — breaking the statement-binding guarantee this PR introduces.
Consider adding a test that structurally validates field completeness, e.g.:
#[test]
fn table_counts_all_fields_absorbed() {
// If TableCounts gains a new field, this test should fail until
// absorb_statement is updated to include it.
let counts = TableCounts {
cpu: 1, lt: 2, memw: 3, memw_aligned: 4, load: 5,
mul: 6, dvrm: 7, shift: 8, branch: 9, memw_register: 10,
// Adding a new field here causes a compile error, which is the
// signal to update the loop in absorb_statement.
};
// exhaustive struct literal above is the actual guard
let _ = counts;
}Alternatively, implement a fn counts_as_array(&self) -> [usize; 10] method on TableCounts (or a to_le_bytes helper) so the encoding is co-located with the struct definition.
Review: Fiat-Shamir statement bindingThe security approach is sound. Binding the full statement into the transcript before Phase A prevents malleability attacks where a tampered ELF, public output, or table layout would otherwise produce valid-looking challenges. Logic correctness:
One issue found: Medium (soundness): The hardcoded |
No description provided.