test(auto-eval): determinism gate also asserts on ef_cqs_strict
Address review comment #2 (Important) on PR #764 Sub-project A. The
original determinism gate only tracked lenient ef_cqs deltas between
back-to-back runs. Since Sub-project A's whole purpose is to make
ef_cqs_strict the future decision gate, the determinism gate should
cover it now — not after the cut-over PR, when hidden nondeterminism
in the strict denominator path would surface for the first time.
Changes:
- Track both lenient and strict deltas per ticker in parallel.
- Compute max_delta as max(max_lenient, max_strict) and assert against
  the single DETERMINISM_THRESHOLD (both must be bit-identical).
- Log both columns separately so the CI run captures which field
  (if either) drifted, making root-causing faster.
- Failure message reports both maxes and per-ticker pairs.
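
A minimal sketch of the gate described by the list above, not the actual
test code. The function names, the shape of the per-ticker score dicts,
and the 0.0 threshold value are assumptions made for illustration; only
ef_cqs, ef_cqs_strict, max_delta, and DETERMINISM_THRESHOLD come from
this commit.

```python
from typing import Dict, List, Tuple

# Assumed value: 0.0 means back-to-back runs must be bit-identical.
DETERMINISM_THRESHOLD = 0.0

# Hypothetical shape: ticker -> {"ef_cqs": float, "ef_cqs_strict": float}
Scores = Dict[str, Dict[str, float]]


def determinism_deltas(run_a: Scores, run_b: Scores) -> List[Tuple[str, float, float]]:
    """Return (ticker, lenient_delta, strict_delta) for every ticker in run_a."""
    rows = []
    for ticker in sorted(run_a):  # sorted so the log order is stable
        a, b = run_a[ticker], run_b[ticker]
        lenient = abs(a["ef_cqs"] - b["ef_cqs"])
        strict = abs(a["ef_cqs_strict"] - b["ef_cqs_strict"])
        rows.append((ticker, lenient, strict))
    return rows


def check_determinism(run_a: Scores, run_b: Scores) -> float:
    """Assert that neither the lenient nor the strict score drifted between runs."""
    rows = determinism_deltas(run_a, run_b)
    max_lenient = max((r[1] for r in rows), default=0.0)
    max_strict = max((r[2] for r in rows), default=0.0)
    max_delta = max(max_lenient, max_strict)

    # Log both columns so a CI run shows which field (if either) drifted.
    for ticker, lenient, strict in rows:
        print(f"{ticker}: lenient_delta={lenient!r} strict_delta={strict!r}")

    assert max_delta <= DETERMINISM_THRESHOLD, (
        f"nondeterminism: max_lenient={max_lenient!r}, "
        f"max_strict={max_strict!r}, per-ticker={rows!r}"
    )
    return max_delta
```

Using a single threshold for both columns keeps the gate to one
assertion, while the per-ticker pairs in the message still show which
field moved.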
Lenient and strict share the ef_pass_count numerator, so today their
run-to-run deltas should move together exactly. Pinning both now still
catches any future floating-point reduction or iteration-order bug that
affects only the strict path's wider denominator
(total - disputed - unverified).
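
As a generic illustration of the floating-point reduction hazard named
above (not code from this repository), the sketch below shows that
adding the same floats in a different order can change the last bit of
the result. The helper name ordered_sum is hypothetical.

```python
from typing import Iterable


def ordered_sum(values: Iterable[float]) -> float:
    """Plain left-to-right float accumulation, so the order is explicit."""
    total = 0.0
    for v in values:
        total += v
    return total


values = [0.1, 0.2, 0.3]
forward = ordered_sum(values)             # (0.1 + 0.2) + 0.3
backward = ordered_sum(reversed(values))  # (0.3 + 0.2) + 0.1

# Same inputs, different order, different bits.
print(forward == backward)  # False
```

A gate that demands bit-identical scores would flag this kind of drift
if the accumulation order ever depended on, say, set or dict iteration
that differs between runs.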
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>