perf(stark): migrate end_exemptions_poly to evaluation form#621
Conversation
Replaces the coefficient-form `end_exemptions_poly: Polynomial<F>` with two eval-form helpers, removing transition.rs's last dependency on `Polynomial` arithmetic. Stacks on #604 (which removed boundary.rs's `Polynomial` dependency). With both landed, no production code in the prover uses `Polynomial` operators. - `end_exemptions_roots` returns the roots r_i of prod(x - r_i) (<= 2 in the example AIRs, 0 in every VM table). - `end_exemptions_lde_evaluations(domain)` evaluates the product over the precomputed LDE coset directly: an O(N * end_exemptions) loop replaces an O(N log N) FFT. - `evaluate_zerofier`'s OOD path computes prod(z - r_i) directly instead of `Polynomial::evaluate`. Performance: the VM's dominant path (end_exemptions == 0) is unchanged - the existing else-branch early-return is preserved bit-for-bit, so that case still returns the short cyclic vector instead of expanding it. The k > 0 path (example AIRs only) goes from FFT to direct product, strictly faster. Correctness verified by `cargo test -p stark` (125/125, includes the fibonacci and read-only-memory AIRs with end_exemptions = 1 and 2 - exactly the eval-form k > 0 path). VM prover lib test counts are identical to the #604 baseline (273 pass, 77 pre-existing failures unrelated to constraints).
Codex Code ReviewNo issues found in the PR diff. I was not able to run the Rust tests: cargo/rustup tried to write under read-only |
Code ReviewClean refactoring. The eval-form replacement is correct and the logic is preserved faithfully. A few items worth noting: Medium — Latent correctness bug: offset != 0 with end_exemptions > 0 The FIXME on line 122 is carried over from the original, but now has sharper consequences: the roots vector is the sole source of truth for which rows are exempt, so a wrong root silently produces a wrong proof for any future constraint with both offset != 0 and end_exemptions > 0. No existing AIR exercises this combination today, but it is easy to miss. The correct roots should be the last k elements of the offset periodic domain — currently decrement = g^(-period) ignores the offset entirely. The correct value for the last root in the offset domain is g^(offset + (trace_length/period - 1)*period) = g^(trace_length - period + offset). Low — exemptions_period branch does not skip LDE expansion when end_exemptions == 0 The else branch has a fast-path early return for end_exemptions == 0 (preserving the cyclic vector). The exemptions_period branch calls end_exemptions_lde_evaluations unconditionally: when roots are empty the fold produces a vector of N ones and multiplies through pointlessly. Not a regression (old code also evaluated one_poly over the full domain here), but worth a matching guard if this path ever gets a VM caller. Nit — double-negation in evaluate_zerofier The expression -(root.clone() - z.clone()) is a workaround for the subfield/superfield subtraction direction. The comment explains it, but if the math library exposes to_extension(), z.clone() - root.to_extension() reads more directly. Overall the change is correct, the described performance improvement holds, and tests cover the exercised paths. |
|
/bench 5 |
Benchmark — fib_iterative_8M (median of 5)Table parallelism: auto (cores / 3)
Commit: ccab5dc · Baseline: cached · Runner: self-hosted bench |
After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form) landed on main, no production code in the prover uses `Polynomial` operators anymore. The only remaining user was `Polynomial::interpolate` (test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a naive reference by the FFT cross-check tests). With both production references gone, the operators + the naive interpolant can go too. Removed: - All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement); the previous PR kept these four families because `transition.rs` and `boundary.rs` used them. They are now fully unreachable. - `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`. - Operator unit tests in `polynomial_tests.rs` (adding_, negating_, multiply_*) and their helpers (`polynomial_a`, `polynomial_b`, `polynomial_minus_a`, `polynomial_a_plus_b`). - `compose` test helper + `composition_works` test (depended on the removed `interpolate`). - Direct `interpolate_x_*_y_*` tests (tested the removed function). Rewrote: - FFT correctness tests (both `fft_tests.rs` and `fft_friendly_u64_goldilocks_tests.rs`): the `gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked `interpolate_fft` against naive Lagrange. Replaced with FFT round-trip via direct Horner evaluation — interpolate via FFT, then re-evaluate at every twiddle with `Polynomial::evaluate` (a different algorithm). Same independence property, no naive Lagrange needed. - `simple_interpolating_polynomial_by_hand_works`: swapped `*` for `mul_with_ref(&...)` (the only remaining call site that needed it). Verified: - `cargo build --workspace` clean. - `cargo test -p math` -> 178 passing. - `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing failures unrelated to constraints (same baseline as main). - `make lint` clean across all three clippy configs.
After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form) landed on main, no production code in the prover uses `Polynomial` operators anymore. The only remaining user was `Polynomial::interpolate` (test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a naive reference by the FFT cross-check tests). With both production references gone, the operators + the naive interpolant can go too. Removed: - All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement); the previous PR kept these four families because `transition.rs` and `boundary.rs` used them. They are now fully unreachable. - `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`. - Operator unit tests in `polynomial_tests.rs` (adding_, negating_, multiply_*) and their helpers (`polynomial_a`, `polynomial_b`, `polynomial_minus_a`, `polynomial_a_plus_b`). - `compose` test helper + `composition_works` test (depended on the removed `interpolate`). - Direct `interpolate_x_*_y_*` tests (tested the removed function). Rewrote: - FFT correctness tests (both `fft_tests.rs` and `fft_friendly_u64_goldilocks_tests.rs`): the `gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked `interpolate_fft` against naive Lagrange. Replaced with FFT round-trip via direct Horner evaluation — interpolate via FFT, then re-evaluate at every twiddle with `Polynomial::evaluate` (a different algorithm). Same independence property, no naive Lagrange needed. - `simple_interpolating_polynomial_by_hand_works`: swapped `*` for `mul_with_ref(&...)` (the only remaining call site that needed it). Verified: - `cargo build --workspace` clean. - `cargo test -p math` -> 178 passing. - `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing failures unrelated to constraints (same baseline as main). - `make lint` clean across all three clippy configs.
#618) After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form) landed on main, no production code in the prover uses `Polynomial` operators anymore. The only remaining user was `Polynomial::interpolate` (test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a naive reference by the FFT cross-check tests). With both production references gone, the operators + the naive interpolant can go too. Removed: - All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement); the previous PR kept these four families because `transition.rs` and `boundary.rs` used them. They are now fully unreachable. - `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`. - Operator unit tests in `polynomial_tests.rs` (adding_, negating_, multiply_*) and their helpers (`polynomial_a`, `polynomial_b`, `polynomial_minus_a`, `polynomial_a_plus_b`). - `compose` test helper + `composition_works` test (depended on the removed `interpolate`). - Direct `interpolate_x_*_y_*` tests (tested the removed function). Rewrote: - FFT correctness tests (both `fft_tests.rs` and `fft_friendly_u64_goldilocks_tests.rs`): the `gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked `interpolate_fft` against naive Lagrange. Replaced with FFT round-trip via direct Horner evaluation — interpolate via FFT, then re-evaluate at every twiddle with `Polynomial::evaluate` (a different algorithm). Same independence property, no naive Lagrange needed. - `simple_interpolating_polynomial_by_hand_works`: swapped `*` for `mul_with_ref(&...)` (the only remaining call site that needed it). Verified: - `cargo build --workspace` clean. - `cargo test -p math` -> 178 passing. - `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing failures unrelated to constraints (same baseline as main). - `make lint` clean across all three clippy configs.
* refactor(math): remove dead Deserializable trait and unreachable error variants
`Deserializable` had a single impl (`Proof<T>`) whose only `deserialize`
call site was inside that impl itself — a recursion with no external
entry point. Production deserialization uses `bincode`/serde everywhere.
Delete the trait and the `Proof<T>` impl.
`merkle_tree/proof.rs` also carried a dead `Serializable` impl behind
`#[cfg(feature = "alloc")]` — a feature the `crypto` crate does not even
define, so the block never compiled. It referenced `math::traits::Serializable`,
which does not exist. Remove it.
Error cleanup (none constructed anywhere in the workspace):
- `DeserializationError` enum + `From<ByteConversionError>` impl — only
reachable through the now-deleted `Deserializable`.
- `PairingError` enum — lambda_vm has no pairings.
- `ByteConversionError::{InvalidValue, PointNotInSubgroup, ValueNotCompressed}`
— the latter two are elliptic-curve concepts; there is no EC here.
- `CreationError::InvalidDecString`.
Pure dead-code removal; `cargo build`/`cargo test -p math` unchanged
(216 tests pass).
* refactor(math): remove vestigial helpers module
`helpers::next_power_of_two(u64)` was bit-for-bit equivalent to stdlib
`u64::next_power_of_two()`. Its only caller, `resize_to_next_power_of_two`,
was used solely by the `fibonacci_rap` example — production trace sizing
already calls stdlib `.next_power_of_two()` directly everywhere.
Move `resize_to_next_power_of_two` into `examples/fibonacci_rap.rs` as a
local helper (rewritten on `usize` with the stdlib call, dropping the old
`try_into().unwrap()` TODO), and delete the `helpers` module.
* refactor(math): gate evaluate_fft_bit_reversed behind cfg(test)
`evaluate_fft_bit_reversed` was a non-gated `pub fn` but is only called
from the FFT property tests — no production caller in stark or prover.
Gate it `#[cfg(test)]` so it stops appearing in the production API
surface. Restore the gate (or remove it) when a real caller appears.
* refactor(math): gate test-only Polynomial::{truncate, reverse}
Both methods are called only from `fast_division` / `invert_polynomial_mod`,
which are themselves already `#[cfg(test)]`. No production caller in stark
or prover. Gate them `#[cfg(test)]` so they leave the production API surface.
`long_division_with_remainder` is left public — it is used by the
debug-checks path (`check_boundary_polys_divisibility`) and is a
legitimate general polynomial operation other code could want.
* refactor(math): flatten fft/cpu/* up to fft/*
The `fft/cpu/` directory implied a `fft/gpu/` sibling that never existed —
GPU code lives in the separate `crypto/math-cuda` crate. The `cpu` nesting
was a vestigial holdover.
Move `bit_reversing`, `bowers_fft`, `roots_of_unity`, `fft`, and
`bowers_fft_tests` from `fft/cpu/` up to `fft/`; delete `fft/cpu/mod.rs`
and merge its declarations into `fft/mod.rs`. Update every
`fft::cpu::...` / `super::cpu::...` path across math, stark, prover, and
math-cuda to `fft::...`.
Pure module move — no logic change. 216 math tests pass; stark, prover,
math-cuda all build.
* refactor(math): collapse polynomial/ folder into one polynomial.rs
Two problems compounded:
1. `polynomial/` was a folder holding a single file (`mod.rs`) — the
directory level served no purpose.
2. Two files were named `polynomial` (`polynomial/mod.rs` and
`fft/polynomial.rs`), both defining `impl Polynomial<...>` blocks.
Merge `fft/polynomial.rs` (the FFT-specific `impl Polynomial` blocks plus
the `evaluate_fft_cpu` / `interpolate_fft_cpu` free functions) into
`polynomial/mod.rs`, then rename it to a top-level `crypto/math/src/polynomial.rs`
and delete the folder.
Result: one `polynomial.rs` holding the `Polynomial` type and all of its
methods (core ring ops + FFT). `fft/` is now purely the FFT *algorithm*
layer (Bowers, twiddles, bit-reversal, roots of unity).
Also delete the stale `polynomial/README.md` — inherited lambdaworks
docs referencing multilinear-polynomial files that don't exist in
lambda_vm.
Updated the one external path (`fft::polynomial::compose_fft` ->
`polynomial::compose_fft`). 216 math + 124 stark tests pass.
* refactor(math): consolidate scattered tests into tests/ directory
Three test conventions coexisted: the `tests/` directory, a sibling
`bowers_fft_tests.rs` next to source, and an inline `#[cfg(test)] mod`
in the polynomial module. Unify on the `tests/` directory.
- Move `fft/bowers_fft_tests.rs` -> `tests/bowers_fft_tests.rs`, register
it in `tests/mod.rs`, drop the `mod bowers_fft_tests;` from `fft/mod.rs`.
- Move the inline coset-LDE `mod tests` out of `polynomial.rs` into a new
`coset_lde_tests` module in `tests/fft_tests.rs`.
- Rename `fft/fft.rs` -> `fft/reference_fft.rs`: after the Step 6 flatten
it collided with its own parent module name (`fft::fft`), which clippy
rejects. The new name also says what it is — the reference radix-2 FFT
used only to cross-check the production Bowers FFT. Gate the module
`#[cfg(test)] pub(crate)` since its whole content is test-only.
216 math tests pass (same count — no tests lost in the move).
* refactor(math): drop redundant FFT cross-check code, relocate inline tests
The reference radix-2 FFT existed only to cross-check the production Bowers
FFT. Bowers is already verified directly against a naive O(n^2) DFT at the
same sizes, so reference_fft.rs and its four comparison tests are dead
weight. gate get_powers_of_primitive_root as test-only (production twiddles
go through LayerTwiddles), collapse the evaluate_fft_cpu / interpolate_fft_cpu
wrappers into their sole callers, and narrow coset_lde_full_into to
pub(crate).
Move the inline test modules from bit_reversing.rs, goldilocks.rs and
extensions_goldilocks.rs into the dedicated tests/ directory.
* refactor(math): remove dead polynomial-division cluster, fix zero-offset panic
Addresses PR review feedback.
Reviewers flagged that `#[cfg(test)]`-gating previously-public methods is a
breaking change. Rather than keep them as `pub(crate)` dead code, the cluster
is removed outright — none of it has a production caller:
- Delete `Polynomial::{fast_division, invert_polynomial_mod,
fast_fft_multiplication, truncate, reverse}` and `evaluate_fft_bit_reversed`,
plus their 8 tests. The fast polynomial-division algorithm had no production
use; it existed only to be tested.
- `get_powers_of_primitive_root` stays `#[cfg(test)]` — it is a genuine test
helper used to validate the production FFT, not dead code.
Also fixes the pre-existing panic reviewers noted in `interpolate_offset_fft`:
`offset.inv().unwrap()` panicked on a zero coset offset. Add
`FFTError::InvalidCosetOffset` and propagate it instead.
* refactor(math): remove dead Polynomial APIs and self-referential tests
Continues the dead-code audit of polynomial.rs. Every item removed here
has no production caller — each is a function reachable only from its own
tests, or a test-file algorithm that tests itself.
Removed from polynomial.rs:
- to_extension (zero callers anywhere)
- long_division_with_remainder, leading_coefficient
- ruffini_division_inplace
- interpolate_coset_eval_with_g_n_inv (base-field variant)
Removed the matching self-referential tests/helpers from
polynomial_tests.rs (division_works, division_by_zero_degree_polynomial_works,
test_xgcd, ruffini_inplace_* and the div_with_ref / xgcd / ruffini_division
helpers).
The 3 barycentric tests that exercised the base-field
interpolate_coset_eval_with_g_n_inv now call the production-used
interpolate_coset_eval_ext_with_g_n_inv (instantiated E=F), so they keep
covering the real code path.
* refactor(math): keep Deserializable in math; leave crypto/merkle_tree/proof.rs untouched
The earlier dead-code sweep removed `Deserializable` + `DeserializationError`
from `math` because its only out-of-crate user was the `impl Deserializable`
in `crypto/crypto/src/merkle_tree/proof.rs`, so the PR also dropped those
impls. PR #611 (arity-4 FRI + Merkle caps) explicitly asks this PR to leave
`proof.rs` alone, so revert that pair of deletions:
- Restore `proof.rs` to main's contents (diff vs main: empty).
- Restore `DeserializationError` enum and `From<ByteConversionError>` impl
in `math/src/errors.rs`.
- Restore `Deserializable` trait + the `DeserializationError` import in
`math/src/traits.rs`.
The `Serializable` half of the impl block in `proof.rs` is gated on
`crypto/crypto`'s `alloc` feature, which no consumer in this repo
enables — that code path was already non-compiling and is unchanged here.
Tests: `cargo test -p math -p crypto` → 46 + 198 green. `make lint` clean.
* refactor(math): drop all 14 Polynomial operator impls + naive Lagrange (#618)
After #604 (boundary.rs eval-form) and #621 (end_exemptions_poly eval-form)
landed on main, no production code in the prover uses `Polynomial`
operators anymore. The only remaining user was `Polynomial::interpolate`
(test-only Lagrange interpolation, gated `#[cfg(test)]` and used as a
naive reference by the FFT cross-check tests). With both production
references gone, the operators + the naive interpolant can go too.
Removed:
- All 14 operator `impl`s (Add P+/-P, Neg P, Mul P*P, Sub P-FieldElement);
the previous PR kept these four families because `transition.rs` and
`boundary.rs` used them. They are now fully unreachable.
- `Polynomial::interpolate` and `InterpolateError` from `polynomial.rs`.
- Operator unit tests in `polynomial_tests.rs` (adding_, negating_,
multiply_*) and their helpers (`polynomial_a`, `polynomial_b`,
`polynomial_minus_a`, `polynomial_a_plus_b`).
- `compose` test helper + `composition_works` test (depended on the
removed `interpolate`).
- Direct `interpolate_x_*_y_*` tests (tested the removed function).
Rewrote:
- FFT correctness tests (both `fft_tests.rs` and
`fft_friendly_u64_goldilocks_tests.rs`): the
`gen_fft_and_naive_interpolate` / `_coset_` helpers cross-checked
`interpolate_fft` against naive Lagrange. Replaced with FFT
round-trip via direct Horner evaluation — interpolate via FFT, then
re-evaluate at every twiddle with `Polynomial::evaluate` (a different
algorithm). Same independence property, no naive Lagrange needed.
- `simple_interpolating_polynomial_by_hand_works`: swapped `*` for
`mul_with_ref(&...)` (the only remaining call site that needed it).
Verified:
- `cargo build --workspace` clean.
- `cargo test -p math` -> 178 passing.
- `cargo test -p lambda-vm-prover --lib` -> 281 pass / 77 pre-existing
failures unrelated to constraints (same baseline as main).
- `make lint` clean across all three clippy configs.
Follow-up to #604. Removes transition.rs's last dependency on
Polynomialarithmetic — combined with #604 (already merged) no production code in the prover usesPolynomialoperators anymore.Summary
Replaces the coefficient-form
end_exemptions_poly: Polynomial<F>incrypto/stark/src/constraints/transition.rswith two eval-form helpers:end_exemptions_roots(trace_primitive_root, trace_length) -> Vec<FieldElement<F>>— the rootsrᵢof∏(x − rᵢ).end_exemptions_lde_evaluations(domain) -> Vec<FieldElement<F>>— the product evaluated over the precomputed LDE coset directly.The OOD path in
evaluate_zerofiernow computes∏(z − rᵢ)directly instead ofPolynomial::evaluate. ThePolynomialandevaluate_polynomial_on_lde_domainimports are dropped fromtransition.rs.Performance
end_exemptions == 0, the only case in every VM table): unchanged. The else-branch's existing fast-path early-return is preserved bit-for-bit, so the function still returns the short cyclic vector instead of expanding it to the full LDE domain.O(N · end_exemptions)direct product replaces anO(N log N)FFT. Strictly faster forend_exemptions ≤ 2.Test plan
cargo build --workspace— clean.make lint— clean (all three clippy configs).cargo test -p stark— 125/125 passing, including the example AIRs that exercise the eval-formk > 0path (end_exemptions = 1, 2in fibonacci variants and read-only-memory).cargo test -p lambda-vm-prover --lib— 273 pass / 77 fail; identical counts on the pre-cleanup(stark/constraints): drop coefficient-form leftovers, dedupe eval #604 baseline (verified by reverting justtransition.rs). All 77 failures are pre-existing on the base and unrelated to constraints (UnknownSyscall(5)etc.).