perf(stark): migrate end_exemptions_poly to evaluation form by diegokingston · Pull Request #621 · yetanotherco/lambda_vm

diegokingston · 2026-05-22T22:14:18Z

Follow-up to #604. Removes transition.rs's last dependency on Polynomial arithmetic — combined with #604 (already merged) no production code in the prover uses Polynomial operators anymore.

Summary

Replaces the coefficient-form end_exemptions_poly: Polynomial<F> in crypto/stark/src/constraints/transition.rs with two eval-form helpers:

end_exemptions_roots(trace_primitive_root, trace_length) -> Vec<FieldElement<F>> — the roots rᵢ of ∏(x − rᵢ).
end_exemptions_lde_evaluations(domain) -> Vec<FieldElement<F>> — the product evaluated over the precomputed LDE coset directly.

The OOD path in evaluate_zerofier now computes ∏(z − rᵢ) directly instead of Polynomial::evaluate. The Polynomial and evaluate_polynomial_on_lde_domain imports are dropped from transition.rs.

Performance

VM path (end_exemptions == 0, the only case in every VM table): unchanged. The else-branch's existing fast-path early-return is preserved bit-for-bit, so the function still returns the short cyclic vector instead of expanding it to the full LDE domain.
k > 0 path (example AIRs only — fibonacci variants and read-only-memory): an O(N · end_exemptions) direct product replaces an O(N log N) FFT. Strictly faster for end_exemptions ≤ 2.

Test plan

cargo build --workspace — clean.
make lint — clean (all three clippy configs).
cargo test -p stark — 125/125 passing, including the example AIRs that exercise the eval-form k > 0 path (end_exemptions = 1, 2 in fibonacci variants and read-only-memory).
cargo test -p lambda-vm-prover --lib — 273 pass / 77 fail; identical counts on the pre-cleanup(stark/constraints): drop coefficient-form leftovers, dedupe eval #604 baseline (verified by reverting just transition.rs). All 77 failures are pre-existing on the base and unrelated to constraints (UnknownSyscall(5) etc.).

Replaces the coefficient-form `end_exemptions_poly: Polynomial<F>` with two eval-form helpers, removing transition.rs's last dependency on `Polynomial` arithmetic. Stacks on #604 (which removed boundary.rs's `Polynomial` dependency). With both landed, no production code in the prover uses `Polynomial` operators. - `end_exemptions_roots` returns the roots r_i of prod(x - r_i) (<= 2 in the example AIRs, 0 in every VM table). - `end_exemptions_lde_evaluations(domain)` evaluates the product over the precomputed LDE coset directly: an O(N * end_exemptions) loop replaces an O(N log N) FFT. - `evaluate_zerofier`'s OOD path computes prod(z - r_i) directly instead of `Polynomial::evaluate`. Performance: the VM's dominant path (end_exemptions == 0) is unchanged - the existing else-branch early-return is preserved bit-for-bit, so that case still returns the short cyclic vector instead of expanding it. The k > 0 path (example AIRs only) goes from FFT to direct product, strictly faster. Correctness verified by `cargo test -p stark` (125/125, includes the fibonacci and read-only-memory AIRs with end_exemptions = 1 and 2 - exactly the eval-form k > 0 path). VM prover lib test counts are identical to the #604 baseline (273 pass, 77 pre-existing failures unrelated to constraints).

github-actions · 2026-05-22T22:16:37Z

Codex Code Review

No issues found in the PR diff.

I was not able to run the Rust tests: cargo/rustup tried to write under read-only /home/runner/.rustup / /home/runner/.cargo, and offline mode then failed on the unresolved git dependency.

claude · 2026-05-22T22:17:44Z

Code Review

Clean refactoring. The eval-form replacement is correct and the logic is preserved faithfully. A few items worth noting:

Medium — Latent correctness bug: offset != 0 with end_exemptions > 0

The FIXME on line 122 is carried over from the original, but now has sharper consequences: the roots vector is the sole source of truth for which rows are exempt, so a wrong root silently produces a wrong proof for any future constraint with both offset != 0 and end_exemptions > 0. No existing AIR exercises this combination today, but it is easy to miss.

The correct roots should be the last k elements of the offset periodic domain — currently decrement = g^(-period) ignores the offset entirely. The correct value for the last root in the offset domain is g^(offset + (trace_length/period - 1)*period) = g^(trace_length - period + offset).

Low — exemptions_period branch does not skip LDE expansion when end_exemptions == 0

The else branch has a fast-path early return for end_exemptions == 0 (preserving the cyclic vector). The exemptions_period branch calls end_exemptions_lde_evaluations unconditionally: when roots are empty the fold produces a vector of N ones and multiplies through pointlessly. Not a regression (old code also evaluated one_poly over the full domain here), but worth a matching guard if this path ever gets a VM caller.

Nit — double-negation in evaluate_zerofier

The expression -(root.clone() - z.clone()) is a workaround for the subfield/superfield subtraction direction. The comment explains it, but if the math library exposes to_extension(), z.clone() - root.to_extension() reads more directly.

Overall the change is correct, the described performance improvement holds, and tests cover the exercised paths.

diegokingston · 2026-05-23T01:47:15Z

/bench 5

github-actions · 2026-05-23T01:49:53Z

Benchmark — fib_iterative_8M (median of 5)

_{Table parallelism: auto (cores / 3)}

Metric	main	PR	Δ
Peak heap	51992 MB	51823 MB	-169 MB (-0.3%) ⚪
Prove time	24.990s	25.078s	+0.088s (+0.4%) ⚪

✅ No significant change.

✅ Low variance (time: 1.2%, heap: 0.5%)

_{Commit: ccab5dc · Baseline: cached · Runner: self-hosted bench}

After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form) landed on main, no production code in the prover uses `Polynomial` operators anymore. The only remaining user was `Polynomial::interpolate` (test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a naive reference by the FFT cross-check tests). With both production references gone, the operators + the naive interpolant can go too. Removed: - All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement); the previous PR kept these four families because `transition.rs` and `boundary.rs` used them. They are now fully unreachable. - `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`. - Operator unit tests in `polynomial_tests.rs` (adding_, negating_, multiply_*) and their helpers (`polynomial_a`, `polynomial_b`, `polynomial_minus_a`, `polynomial_a_plus_b`). - `compose` test helper + `composition_works` test (depended on the removed `interpolate`). - Direct `interpolate_x_*_y_*` tests (tested the removed function). Rewrote: - FFT correctness tests (both `fft_tests.rs` and `fft_friendly_u64_goldilocks_tests.rs`): the `gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked `interpolate_fft` against naive Lagrange. Replaced with FFT round-trip via direct Horner evaluation — interpolate via FFT, then re-evaluate at every twiddle with `Polynomial::evaluate` (a different algorithm). Same independence property, no naive Lagrange needed. - `simple_interpolating_polynomial_by_hand_works`: swapped `*` for `mul_with_ref(&...)` (the only remaining call site that needed it). Verified: - `cargo build --workspace` clean. - `cargo test -p math` -> 178 passing. - `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing failures unrelated to constraints (same baseline as main). - `make lint` clean across all three clippy configs.

#618) After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form) landed on main, no production code in the prover uses `Polynomial` operators anymore. The only remaining user was `Polynomial::interpolate` (test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a naive reference by the FFT cross-check tests). With both production references gone, the operators + the naive interpolant can go too. Removed: - All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement); the previous PR kept these four families because `transition.rs` and `boundary.rs` used them. They are now fully unreachable. - `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`. - Operator unit tests in `polynomial_tests.rs` (adding_, negating_, multiply_*) and their helpers (`polynomial_a`, `polynomial_b`, `polynomial_minus_a`, `polynomial_a_plus_b`). - `compose` test helper + `composition_works` test (depended on the removed `interpolate`). - Direct `interpolate_x_*_y_*` tests (tested the removed function). Rewrote: - FFT correctness tests (both `fft_tests.rs` and `fft_friendly_u64_goldilocks_tests.rs`): the `gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked `interpolate_fft` against naive Lagrange. Replaced with FFT round-trip via direct Horner evaluation — interpolate via FFT, then re-evaluate at every twiddle with `Polynomial::evaluate` (a different algorithm). Same independence property, no naive Lagrange needed. - `simple_interpolating_polynomial_by_hand_works`: swapped `*` for `mul_with_ref(&...)` (the only remaining call site that needed it). Verified: - `cargo build --workspace` clean. - `cargo test -p math` -> 178 passing. - `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing failures unrelated to constraints (same baseline as main). - `make lint` clean across all three clippy configs.

* refactor(math): remove dead Deserializable trait and unreachable error variants `Deserializable` had a single impl (`Proof<T>`) whose only `deserialize` call site was inside that impl itself — a recursion with no external entry point. Production deserialization uses `bincode`/serde everywhere. Delete the trait and the `Proof<T>` impl. `merkle_tree/proof.rs` also carried a dead `Serializable` impl behind `#[cfg(feature = "alloc")]` — a feature the `crypto` crate does not even define, so the block never compiled. It referenced `math::traits::Serializable`, which does not exist. Remove it. Error cleanup (none constructed anywhere in the workspace): - `DeserializationError` enum + `From<ByteConversionError>` impl — only reachable through the now-deleted `Deserializable`. - `PairingError` enum — lambda_vm has no pairings. - `ByteConversionError::{InvalidValue, PointNotInSubgroup, ValueNotCompressed}` — the latter two are elliptic-curve concepts; there is no EC here. - `CreationError::InvalidDecString`. Pure dead-code removal; `cargo build`/`cargo test -p math` unchanged (216 tests pass). * refactor(math): remove vestigial helpers module `helpers::next_power_of_two(u64)` was bit-for-bit equivalent to stdlib `u64::next_power_of_two()`. Its only caller, `resize_to_next_power_of_two`, was used solely by the `fibonacci_rap` example — production trace sizing already calls stdlib `.next_power_of_two()` directly everywhere. Move `resize_to_next_power_of_two` into `examples/fibonacci_rap.rs` as a local helper (rewritten on `usize` with the stdlib call, dropping the old `try_into().unwrap()` TODO), and delete the `helpers` module. * refactor(math): gate evaluate_fft_bit_reversed behind cfg(test) `evaluate_fft_bit_reversed` was a non-gated `pub fn` but is only called from the FFT property tests — no production caller in stark or prover. Gate it `#[cfg(test)]` so it stops appearing in the production API surface. Restore the gate (or remove it) when a real caller appears. * refactor(math): gate test-only Polynomial::{truncate, reverse} Both methods are called only from `fast_division` / `invert_polynomial_mod`, which are themselves already `#[cfg(test)]`. No production caller in stark or prover. Gate them `#[cfg(test)]` so they leave the production API surface. `long_division_with_remainder` is left public — it is used by the debug-checks path (`check_boundary_polys_divisibility`) and is a legitimate general polynomial operation other code could want. * refactor(math): flatten fft/cpu/* up to fft/* The `fft/cpu/` directory implied a `fft/gpu/` sibling that never existed — GPU code lives in the separate `crypto/math-cuda` crate. The `cpu` nesting was a vestigial holdover. Move `bit_reversing`, `bowers_fft`, `roots_of_unity`, `fft`, and `bowers_fft_tests` from `fft/cpu/` up to `fft/`; delete `fft/cpu/mod.rs` and merge its declarations into `fft/mod.rs`. Update every `fft::cpu::...` / `super::cpu::...` path across math, stark, prover, and math-cuda to `fft::...`. Pure module move — no logic change. 216 math tests pass; stark, prover, math-cuda all build. * refactor(math): collapse polynomial/ folder into one polynomial.rs Two problems compounded: 1. `polynomial/` was a folder holding a single file (`mod.rs`) — the directory level served no purpose. 2. Two files were named `polynomial` (`polynomial/mod.rs` and `fft/polynomial.rs`), both defining `impl Polynomial<...>` blocks. Merge `fft/polynomial.rs` (the FFT-specific `impl Polynomial` blocks plus the `evaluate_fft_cpu` / `interpolate_fft_cpu` free functions) into `polynomial/mod.rs`, then rename it to a top-level `crypto/math/src/polynomial.rs` and delete the folder. Result: one `polynomial.rs` holding the `Polynomial` type and all of its methods (core ring ops + FFT). `fft/` is now purely the FFT *algorithm* layer (Bowers, twiddles, bit-reversal, roots of unity). Also delete the stale `polynomial/README.md` — inherited lambdaworks docs referencing multilinear-polynomial files that don't exist in lambda_vm. Updated the one external path (`fft::polynomial::compose_fft` -> `polynomial::compose_fft`). 216 math + 124 stark tests pass. * refactor(math): consolidate scattered tests into tests/ directory Three test conventions coexisted: the `tests/` directory, a sibling `bowers_fft_tests.rs` next to source, and an inline `#[cfg(test)] mod` in the polynomial module. Unify on the `tests/` directory. - Move `fft/bowers_fft_tests.rs` -> `tests/bowers_fft_tests.rs`, register it in `tests/mod.rs`, drop the `mod bowers_fft_tests;` from `fft/mod.rs`. - Move the inline coset-LDE `mod tests` out of `polynomial.rs` into a new `coset_lde_tests` module in `tests/fft_tests.rs`. - Rename `fft/fft.rs` -> `fft/reference_fft.rs`: after the Step 6 flatten it collided with its own parent module name (`fft::fft`), which clippy rejects. The new name also says what it is — the reference radix-2 FFT used only to cross-check the production Bowers FFT. Gate the module `#[cfg(test)] pub(crate)` since its whole content is test-only. 216 math tests pass (same count — no tests lost in the move). * refactor(math): drop redundant FFT cross-check code, relocate inline tests The reference radix-2 FFT existed only to cross-check the production Bowers FFT. Bowers is already verified directly against a naive O(n^2) DFT at the same sizes, so reference_fft.rs and its four comparison tests are dead weight. gate get_powers_of_primitive_root as test-only (production twiddles go through LayerTwiddles), collapse the evaluate_fft_cpu / interpolate_fft_cpu wrappers into their sole callers, and narrow coset_lde_full_into to pub(crate). Move the inline test modules from bit_reversing.rs, goldilocks.rs and extensions_goldilocks.rs into the dedicated tests/ directory. * refactor(math): remove dead polynomial-division cluster, fix zero-offset panic Addresses PR review feedback. Reviewers flagged that `#[cfg(test)]`-gating previously-public methods is a breaking change. Rather than keep them as `pub(crate)` dead code, the cluster is removed outright — none of it has a production caller: - Delete `Polynomial::{fast_division, invert_polynomial_mod, fast_fft_multiplication, truncate, reverse}` and `evaluate_fft_bit_reversed`, plus their 8 tests. The fast polynomial-division algorithm had no production use; it existed only to be tested. - `get_powers_of_primitive_root` stays `#[cfg(test)]` — it is a genuine test helper used to validate the production FFT, not dead code. Also fixes the pre-existing panic reviewers noted in `interpolate_offset_fft`: `offset.inv().unwrap()` panicked on a zero coset offset. Add `FFTError::InvalidCosetOffset` and propagate it instead. * refactor(math): remove dead Polynomial APIs and self-referential tests Continues the dead-code audit of polynomial.rs. Every item removed here has no production caller — each is a function reachable only from its own tests, or a test-file algorithm that tests itself. Removed from polynomial.rs: - to_extension (zero callers anywhere) - long_division_with_remainder, leading_coefficient - ruffini_division_inplace - interpolate_coset_eval_with_g_n_inv (base-field variant) Removed the matching self-referential tests/helpers from polynomial_tests.rs (division_works, division_by_zero_degree_polynomial_works, test_xgcd, ruffini_inplace_* and the div_with_ref / xgcd / ruffini_division helpers). The 3 barycentric tests that exercised the base-field interpolate_coset_eval_with_g_n_inv now call the production-used interpolate_coset_eval_ext_with_g_n_inv (instantiated E=F), so they keep covering the real code path. * refactor(math): keep Deserializable in math; leave crypto/merkle_tree/proof.rs untouched The earlier dead-code sweep removed `Deserializable` + `DeserializationError` from `math` because its only out-of-crate user was the `impl Deserializable` in `crypto/crypto/src/merkle_tree/proof.rs`, so the PR also dropped those impls. PR #611 (arity-4 FRI + Merkle caps) explicitly asks this PR to leave `proof.rs` alone, so revert that pair of deletions: - Restore `proof.rs` to main's contents (diff vs main: empty). - Restore `DeserializationError` enum and `From<ByteConversionError>` impl in `math/src/errors.rs`. - Restore `Deserializable` trait + the `DeserializationError` import in `math/src/traits.rs`. The `Serializable` half of the impl block in `proof.rs` is gated on `crypto/crypto`'s `alloc` feature, which no consumer in this repo enables — that code path was already non-compiling and is unchanged here. Tests: `cargo test -p math -p crypto` → 46 + 198 green. `make lint` clean. * refactor(math): drop all 14 Polynomial operator impls + naive Lagrange (#618) After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form) landed on main, no production code in the prover uses `Polynomial` operators anymore. The only remaining user was `Polynomial::interpolate` (test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a naive reference by the FFT cross-check tests). With both production references gone, the operators + the naive interpolant can go too. Removed: - All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement); the previous PR kept these four families because `transition.rs` and `boundary.rs` used them. They are now fully unreachable. - `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`. - Operator unit tests in `polynomial_tests.rs` (adding_, negating_, multiply_*) and their helpers (`polynomial_a`, `polynomial_b`, `polynomial_minus_a`, `polynomial_a_plus_b`). - `compose` test helper + `composition_works` test (depended on the removed `interpolate`). - Direct `interpolate_x_*_y_*` tests (tested the removed function). Rewrote: - FFT correctness tests (both `fft_tests.rs` and `fft_friendly_u64_goldilocks_tests.rs`): the `gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked `interpolate_fft` against naive Lagrange. Replaced with FFT round-trip via direct Horner evaluation — interpolate via FFT, then re-evaluate at every twiddle with `Polynomial::evaluate` (a different algorithm). Same independence property, no naive Lagrange needed. - `simple_interpolating_polynomial_by_hand_works`: swapped `*` for `mul_with_ref(&...)` (the only remaining call site that needed it). Verified: - `cargo build --workspace` clean. - `cargo test -p math` -> 178 passing. - `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing failures unrelated to constraints (same baseline as main). - `make lint` clean across all three clippy configs.

claude Bot reviewed May 22, 2026

View reviewed changes

Comment thread crypto/stark/src/constraints/transition.rs Outdated

claude Bot reviewed May 22, 2026

View reviewed changes

Comment thread crypto/stark/src/constraints/transition.rs

offset-aware end_exemptions roots

b02950f

MauroToscano approved these changes May 26, 2026

View reviewed changes

MauroToscano enabled auto-merge May 26, 2026 17:15

Merge branch 'main' into perf/end-exemptions-eval-form

d242924

MauroToscano added this pull request to the merge queue May 26, 2026

Merged via the queue into main with commit 23121a6 May 26, 2026
12 checks passed

MauroToscano deleted the perf/end-exemptions-eval-form branch May 26, 2026 18:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

perf(stark): migrate end_exemptions_poly to evaluation form#621

perf(stark): migrate end_exemptions_poly to evaluation form#621
MauroToscano merged 3 commits into
mainfrom
perf/end-exemptions-eval-form

diegokingston commented May 22, 2026

Uh oh!

github-actions Bot commented May 22, 2026

Uh oh!

claude Bot commented May 22, 2026

Uh oh!

Uh oh!

Uh oh!

diegokingston commented May 23, 2026

Uh oh!

github-actions Bot commented May 23, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

diegokingston commented May 22, 2026

Summary

Performance

Test plan

Uh oh!

github-actions Bot commented May 22, 2026

Codex Code Review

Uh oh!

claude Bot commented May 22, 2026

Code Review

Uh oh!

Uh oh!

Uh oh!

diegokingston commented May 23, 2026

Uh oh!

github-actions Bot commented May 23, 2026

Benchmark — fib_iterative_8M (median of 5)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants