Skip to content

V1#75

Closed
JakeHartnell wants to merge 23 commits into
mainfrom
v1
Closed

V1#75
JakeHartnell wants to merge 23 commits into
mainfrom
v1

Conversation

@JakeHartnell

@JakeHartnell JakeHartnell commented May 9, 2026

Copy link
Copy Markdown

What changed at the architecture level

  • Mock family deleted. Audit C-4 + M-3. Replaced by real ECDSA and BLS implementations, so mock
    has no remaining purpose.
  • ECDSA and BLS standalone families implemented for real, mirroring poa-middleware's
    POAStakeRegistry.sol semantics: owner-managed operator registration, weight tracking, signing-key
    snapshots, secp256k1 / BLS12-381 verification, quorum check via Uint512.
  • Owner / Admin role split on ECDSA + BLS service-managers: OWNER controls the operator set +
    pause; ADMIN controls service URI + quorum threshold. Smaller blast radius if a key is compromised.
    Mirror keeps its existing two-contract OWNER/ADMIN split.
  • Snapshot-everything for historical correctness. Total weight, operator weights, signing-key
    mappings, and quorum threshold are all SnapshotItem/SnapshotMap-backed and read at
    signature_data.reference_block during validation. No fallback to "latest" in any new code;
    mirror's existing fallback (audit H-2) was removed.
  • Mirror sync-handler ownership handoff added (audit C-5): stake-registry::TransferOwnership
    and service-manager::SetAdmin, plus CLI commands so the deploy script can hand control of the
    mirror contracts to the operator-sync-handler / quorum-sync-handler post-deploy. This is the point
    of the whole mirror family; previously the documented deploy left the CLI signer as both OWNER and
    ADMIN, so every BatchSetOperatorDetails and WavsSetQuorumThreshold from the sync-handlers
    reverted with Unauthorized.
  • Pause mechanism on ECDSA + BLS service-managers (owner-controlled). Standard prod guard;
    auditors expect it.
  • DST locked in at b"BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_" (RFC 9380, byte-for-byte match
    with poa-middleware's HashToCurve.sol). EVM-side digest contract for ECDSA pinned to
    keccak256(envelope) → eip191_hash_message, also matching POA byte-for-byte.

Finding-by-finding

Status
C-1 ECDSA::WavsValidate stub
C-2 BLS::WavsValidate stub
C-3 Unauthenticated WavsSet*
C-4 Mock unauthenticated SetSigningKey ✅ (deleted)
C-5 Mirror sync-handler ownership gap
H-1 ECDSA/BLS service-handler no-ops
H-2 Mirror snapshot fallback
H-3 Mirror Uint256 underflow
H-4 Mirror duplicate trigger_id
H-5 trigger_id u64::MAX bricking
H-6 No owner transfer
M-1 Trigger public bus ✅ (optional allowlist; default unchanged)
M-2 Cross-chain digest format 🟡 Pinned in code; vector tests deferred
M-3 Mock api duplicates state ✅ (deleted)
M-4 Quorum not snapshotted
M-5 Untagged ExecuteMsg 🟡 Kept per upstream wavs-types protocol convention — see
Architectural Decision #8 in fixes.md

Commit layout

One logical commit per finding (or tightly-grouped fixes). Each commit message references the
finding ID it addresses, so revert-per-finding is clean.

What's NOT in this PR

Called out explicitly in audits/2026-05-09-fixes.md:

  • Vector tests for the digest contract (M-2) and the H-2 snapshot regression. The new ECDSA +
    BLS state shapes don't have multitest scaffolding in packages/tests/; building it out is the
    major remaining gap before external audit.
  • Deploy-flow E2E test for the C-5 fix.
  • Operator-set + quorum + pause edge-case suites for ECDSA + BLS.
  • Schema regen + cosmwasm-check on artifacts.
  • Memory note refresh at /workspace/memory/cw-middleware.md (still references the deleted
    Mock variant).
  • Canonical handler-deploys-downstream architecture for C-5. Functionally equivalent to the
    post-deploy TransferOwnership / SetMirrorAdmin model already implemented; would resolve a
    circular dependency between the two sync-handlers and the service-manager via Instantiate2 address
    prediction. Deferred as a future architectural cycle.
  • migrate entrypoints. Decision Create example ServiceHandler to count packets #7: explicit choice to defer until the first state-shape change
    forces a designed surface.

Verification

cargo check --workspace --exclude echo-with-id --exclude simple-aggregator clean for every
contract crate, sdk, utils, cli, on-chain-tests, off-chain-tests. echo-with-id and
simple-aggregator have pre-existing wit-bindgen requirements (task components:bindings-all)
unrelated to this work. shared-tests got a one-line #![recursion_limit = "512"] to clear a
pre-existing rustc compatibility issue blocking verification.

JakeHartnell and others added 23 commits May 9, 2026 19:08
Internal heartbeat-driven security audit (5C/6H/5M/5L/4I). Headlines: ECDSA/BLS service-managers are signature-verification stubs (C-1, C-2), unauthenticated config setters (C-3, C-4), mirror sync handlers can't authorize downstream calls under CLI deploy (C-5).
Companion to the internal audit. Commits the per-finding remediation
plan and the architectural decisions (real ECDSA + BLS, mirror kept,
mock deleted, both deploy-flow ownership + TransferOwnership).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
ECDSA/BLS digest + DST pinned from POA reference. Owner-only
registration, owner-pause, separate OWNER/ADMIN roles, no migrate
entrypoint, service-handler matches mirror's base. Open questions
trimmed to three tactical items.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per the 2026-05-09 fix plan, the mock family was a footgun: unauthenticated
SetSigningKey, unauthenticated WavsSetServiceUri/WavsSetQuorumThreshold, and a
publishable api crate that exposed those types under a name a third party
could mistake for production. With ECDSA and BLS standalone families being
implemented in Phases 2 and 3, mock has no remaining purpose.

Changes:
- Delete packages/contracts/mock/{api,service-handler,service-manager}.
- Drop mock from workspace members and dependency aliases in Cargo.toml.
- Remove cw-wavs-mock-api deps from cli, sdk, utils, off-chain-tests,
  on-chain-tests, simple-aggregator, trigger/simple (most were stale and
  unused).
- Inline MessageWithId (HexBinary) into echo-with-id as a private module so
  echo-with-id no longer pulls from a deleted crate.
- Strip InstantiateMock variants and Mock contract-kind enum members from the
  CLI command surface; matching match arms removed from main.rs.
- Delete sdk/contract_kinds/mock.rs and remove from mod.rs.
- Delete on-chain/off-chain mock-specific test scaffolding (contract clients,
  test files). Rename mock_sanity test fns to bls_sanity / ecdsa_sanity.
- Port concurrency tests from MockTestClient to EcdsaTestClient.
- Remove Mock variant from TestService enum + new_mock constructor.
- e2e: stub out the per-flavor response polling until Phase 2/3 wires up
  real persistence on ECDSA/BLS handlers.
- README: drop the legacy "not for production" disclaimer in favor of an
  honest 2026-05-09 fix-plan-anchored status note. Update contract list.
- taskfile/config.yml: drop "mock" from ALL_CONTRACT_KINDS.

cargo check clean for every contract crate, sdk, utils, cli, on-chain-tests,
off-chain-tests. shared-tests has a pre-existing recursion-limit error
unrelated to this change (see verification stash on the branch); to be
addressed separately.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…-6, H-1)

Replaces the stub Empty/untagged-only api with a full surface that
mirrors POA's StakeRegistry semantics, plus operational role split
matching the 2026-05-09 fix plan.

Service-manager api:
- InstantiateMsg now requires owner + admin (separate roles) and
  takes optional initial quorum threshold (defaults 2/3).
- ExecuteMsg adds owner-only operator-set messages (RegisterOperator,
  DeregisterOperator, UpdateOperatorWeight, UpdateOperatorSigningKey),
  owner-only Pause/Unpause, two-step TransferOwnership/AcceptOwnership,
  and admin-only SetAdmin/AcceptAdmin. WavsSetServiceUri and
  WavsSetQuorumThreshold remain part of the wavs-types untagged set
  per upstream protocol convention; auth gating is enforced in the
  service-manager impl (next commit).
- QueryMsg adds Owner/PendingOwner/Admin/PendingAdmin/Paused, plus
  TotalWeight/OperatorWeight/OperatorSigningKey with optional
  reference_block (snapshot reads). OperatorRegistered for current
  membership.

Service-handler api:
- Adds EcdsaServiceHandlerQueryMessages with TriggerValidated /
  TriggerMessage / SignatureData parallel to mirror's surface so the
  handler can persist envelopes and let downstream callers retrieve
  them by trigger_id.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… C-3, H-2, H-3, H-6, M-4)

Replaces the stub with a full operator-set + signature-validation
implementation, mirroring poa-middleware/contracts/src/ecdsa/POAStakeRegistry.sol
in semantics and porting mirror::stake_registry::is_valid_signature for the
secp256k1 verification logic.

State (cw-storage-plus snapshots throughout):
- OWNER + PENDING_OWNER (operator-set role).
- ADMIN + PENDING_ADMIN (quorum + URI role; separated from OWNER).
- PAUSED flag (owner-controlled freeze).
- QUORUM_NUMERATOR / QUORUM_DENOMINATOR as SnapshotItem so historical
  envelopes can be evaluated against the threshold in force at their
  reference_block (audit M-4 preventatively applied to the new family).
- TOTAL_WEIGHT as SnapshotItem.
- OPERATOR_WEIGHTS / OPERATOR_TO_SIGNING_KEY / SIGNING_KEY_TO_OPERATOR
  as SnapshotMap keyed on String.
- OPERATOR_REGISTERED current-membership flag.

Owner-only writes (audit C-3):
- RegisterOperator: rejects duplicate operator, zero signing key,
  signing-key collisions; bumps TOTAL_WEIGHT via checked_add.
- DeregisterOperator: removes maps + decrements TOTAL_WEIGHT via
  checked_sub.
- UpdateOperatorWeight: recomputes TOTAL_WEIGHT via checked_sub +
  checked_add (audit H-3 preventative).
- UpdateOperatorSigningKey: enforces uniqueness, snapshots both old
  removal and new add at current height.
- Pause / Unpause toggle the freeze.

Two-step role transfer (audit H-6):
- TransferOwnership / AcceptOwnership.
- SetAdmin / AcceptAdmin.

Admin-only WAVS-standard writes (audit C-3):
- WavsSetServiceUri.
- WavsSetQuorumThreshold (validates n > 0, d > 0, n <= d).

WavsValidate query:
- Reads total_weight + quorum threshold via may_load_at_height at
  signature_data.reference_block. NO fall-back to latest (audit H-2
  preventatively applied to new code).
- Resolves signing_key -> operator via SIGNING_KEY_TO_OPERATOR
  may_load_at_height only; missing => signer rejected.
- secp256k1 recovery against EIP-191 personal-sign of keccak256(envelope)
  -- ported byte-for-byte from mirror.
- Enforces sorted+unique signers, non-zero signers, non-zero operator
  weight at reference_block.
- Quorum check via Uint512 (signed * denominator >= total * numerator).
- Returns paused error when PAUSED is set.

Cargo deps add alloy-primitives (eip191_hash_message, keccak256) and
sha3 (Keccak256 for pubkey -> address derivation).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…id (audit H-1, H-4)

Replaces the no-op handler with envelope persistence parallel to mirror's
base service-handler, plus the duplicate-trigger_id guard (audit H-4
preventatively applied to the new ECDSA family).

Flow on WavsHandleSignedEnvelope:
1. Delegate validation to the ECDSA service-manager via WavsValidate
   query. Any failure short-circuits before persistence.
2. Decode the envelope, extract MessageWithId from the payload.
3. Reject if TRIGGER_MESSAGE already has an entry for the trigger_id.
4. Persist (trigger_id -> message bytes) and (trigger_id -> signature
   data).

api/message_with_id.rs introduces an ECDSA-family MessageWithId using
HexBinary for the message field (binary-safe; mirrors echo-with-id's
local definition rather than mirror's String version).

api/service_handler.rs (committed earlier) defines the matching
EcdsaServiceHandlerQueryMessages so downstream callers can query
TriggerValidated / TriggerMessage / SignatureData.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- CLI ServiceManagerCommand::InstantiateEcdsa now takes --owner, --admin,
  and optional --quorum-numerator / --quorum-denominator.
- on-chain + off-chain test clients pass owner = admin = signer (same
  semantic shape they had before, just satisfying the new struct).
- shared-tests: bump recursion_limit to 512 to clear the pre-existing
  query-depth overflow in run_sanity_tests (rustc compatibility, not
  introduced by this change). Workspace cargo check now clean.

BLS CLI surface left as Empty for now; Phase 3 will mirror this CLI
change once BLS gets its real InstantiateMsg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ty (audit C-1, C-2, C-3, H-1, H-2, H-3, H-4, H-6, M-4)

Replaces the BLS stub with a full operator-set + aggregate-signature
verification implementation. Same shape as the ECDSA family from
Phase 2; the BLS-specific differences are documented inline.

Cryptography:
- 48-byte compressed G1 pubkeys (cosmwasm-crypto convention).
- 96-byte compressed G2 aggregate signatures.
- DST locked to b"BLS_SIG_BLS12381G2_XMD:SHA-256_SSWU_RO_" --
  byte-for-byte match with poa-middleware's HashToCurve.sol so
  off-chain signers can be re-used across both stacks.
- Verification via cosmwasm-std's BLS12-381 host calls:
  bls12_381_aggregate_g1, bls12_381_hash_to_g2, and
  bls12_381_pairing_equality with G1::generator.
- Signers in WavsSignatureData carry 20-byte BLS-key ids
  (= keccak256(g1_pubkey)[..20]), since the upstream wavs-types
  schema constrains signers to Vec<EvmAddr>. The full G1 pubkey is
  stored on-chain at registration time and looked up by id at
  validation. signature_data.signatures must have len == 1 carrying
  the aggregate G2 signature.

State + execute messages parallel to ECDSA:
- OWNER + PENDING_OWNER (operator-set role).
- ADMIN + PENDING_ADMIN (quorum + URI role; separated).
- PAUSED.
- QUORUM_NUMERATOR / QUORUM_DENOMINATOR / TOTAL_WEIGHT as
  SnapshotItem.
- OPERATOR_WEIGHTS + OPERATOR_TO_BLS_PUBKEY +
  OPERATOR_TO_BLS_KEY_ID + BLS_KEY_ID_TO_OPERATOR as SnapshotMap.
- Owner-only RegisterOperator/Deregister/UpdateWeight/
  UpdateOperatorBlsPubkey, all under PAUSED guard.
- Admin-only WavsSetServiceUri / WavsSetQuorumThreshold.
- Two-step TransferOwnership/AcceptOwnership and SetAdmin/
  AcceptAdmin.

WavsValidate enforces:
- non-empty signers, len(sigs) == 1, sig length = 96.
- sorted-ascending unique key ids (matches POA's pattern).
- snapshot reads at reference_block (no fall-back to latest --
  audit H-2 preventatively for the new family).
- non-zero operator weights at reference_block.
- pairing equality.
- Uint512 quorum check.
- paused short-circuits.

Service-handler persists envelope keyed by trigger_id (parallel to
ECDSA + mirror); rejects duplicate trigger_id (audit H-4
preventatively for the new family). Adds a BLS-family MessageWithId
in the api crate using HexBinary for binary-safe payloads.

CLI ServiceManagerCommand::InstantiateBls now takes --owner,
--admin, and optional --quorum-numerator / --quorum-denominator
matching the ECDSA shape.

off-chain test client/trigger.rs cleanup: was using
cw_wavs_bls_api::service_manager::InstantiateMsg as a stand-in for
the trigger's empty InstantiateMsg (silent copy-paste); replaced
with the correct cw_wavs_trigger_api::simple::InstantiateMsg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ht arithmetic (H-3)

H-2: WavsValidate previously fell back to the latest mapping when
SIGNING_KEY_TO_OPERATOR.may_load_at_height(reference_block) returned
None, and applied the same fallback to OPERATOR_WEIGHTS. That broke
reference_block semantics — an operator registered (or with weight
updated) AFTER reference_block could contribute weight to a
historical signature. The fix removes both fallbacks: missing
snapshot at reference_block now rejects the signer with an explicit
error.

H-3: set_operator_details_at computed
  new_total_weight = total_weight - current_weight + weight
on Uint256, which panics on underflow (e.g. storage corruption,
double-decrement, reorder bug). Replaced with checked_sub +
checked_add returning ContractError::Std on failure, so the contract
no longer becomes wedged on bad arithmetic.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…(audit H-4)

The base service-handler used Map::save which silently overwrites.
Two different signed envelopes sharing the same trigger_id (operator
quorum signs both -- accidentally or maliciously) would silently
clobber each other; downstream consumers reading TRIGGER_MESSAGE
would see only the second.

The mirror operator-sync and quorum-sync handlers each enforce
monotonic LAST_TRIGGER_ID separately. The bare service-handler did
not, so any deployment using the base handler directly was exposed.
Now reject duplicates with an explicit error.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…_GAP (audit H-5)

The strict-greater check `last_trigger_id < triggerId` allowed a
malicious or buggy operator quorum to submit triggerId == u64::MAX,
permanently bricking all future updates (every legitimate triggerId
< MAX gets rejected forever).

Cap the gap at 2^48 (~280 trillion -- ample headroom for real
workloads while rejecting sentinel values like u64::MAX). Applied
identically to mirror-operator-sync-handler and
mirror-quorum-sync-handler. Also bound the initial value when no
LAST_TRIGGER_ID is set yet, so a fresh contract can't be primed
with a sentinel either.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…dit M-4, L-3)

M-4: QUORUM_NUMERATOR / QUORUM_DENOMINATOR were Item<Uint256>, read at
validation time and not snapshotted. In-flight envelopes signed under
threshold A would be re-validated against threshold B if the admin
flipped it before validation. Convert both to SnapshotItem with
EveryBlock strategy; instantiate writes the default 2/3 at
env.block.height; WavsSetQuorumThreshold writes at the current
height; validate_quorum reads via may_load_at_height(reference_block)
matching how operator weights are pulled. WavsQuorumThreshold query
keeps the current-value semantics via load().

L-3: ethereum_address_raw used .try_into().unwrap() to extract the
last 20 bytes of the keccak digest. The unwrap is currently
unreachable but a defensive ? guards future refactors.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… handoff (audit C-5, H-6)

Adds the messages the post-deploy script needs to hand control of
the mirror stake-registry and service-manager to the corresponding
sync-handler contracts, resolving audit C-5 -- the previous flow
left the CLI signer as both OWNER and ADMIN, so mirror-operator-
sync-handler::BatchSetOperatorDetails and mirror-quorum-sync-
handler::WavsSetQuorumThreshold both reverted with Unauthorized
under the documented deploy.

stake_registry api/entry:
- ExecuteMsg::TransferOwnership { new_owner } -- single-step
  transfer because the typical target is a contract address that
  can't sign an AcceptOwnership message on its own. Owner-gated.

service_manager api/entry:
- ExecuteMsg::SetAdmin { new_admin } -- same shape, admin-gated.

CLI gains:
- ServiceManagerCommand::SetMirrorAdmin --address --new-admin.
- RegistryCommand::TransferOwnership --address --new-owner.

Recommended deploy flow now:
1. Deploy stake-registry (CLI = OWNER) and chained service-manager
   (CLI = ADMIN).
2. Deploy mirror-operator-sync-handler and mirror-quorum-sync-
   handler with the existing service-manager address.
3. CLI: registry transfer-ownership --address <stake_registry>
        --new-owner <operator-sync-handler>.
4. CLI: service-manager set-mirror-admin --address <service_manager>
        --new-admin <quorum-sync-handler>.

After step 4, the documented sync-handler call paths succeed.
TransferOwnership/SetAdmin remain available for emergency rotation.

The "canonical" handler-deploys-downstream path the fix plan
sketched would resolve a circular dependency between the two
sync-handlers and the service-manager via Instantiate2 address
prediction; left as a future architectural cycle. Post-deploy
transfer is functionally equivalent and far simpler.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The simple trigger contract was a public message bus -- anyone could
Push and downstream relayers/indexers were silently inheriting that
trust. The legacy default behavior is preserved (allowed_pushers =
None means public bus), but instantiating with
allowed_pushers: Some([..]) now restricts Push to known senders.

api/simple.rs: replace `pub type InstantiateMsg = Empty` with a
struct carrying `allowed_pushers: Option<Vec<String>>` (default
None). Existing call sites that used `Empty {}` migrate to
`InstantiateMsg::default()`.

simple/state.rs: ALLOWED_PUSHERS Item<Option<Vec<Addr>>>.

simple/entry.rs: instantiate validates each address in the optional
list and stores it. execute(Push) loads the allowlist and rejects
senders not on it; legacy behavior when None preserved.

The two test scaffolding sites now use InstantiateMsg::default().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Companion to the audit and fix plan. Maps every Critical, High,
Medium, and Low finding to the commit SHA that addresses it; calls
out the deferred test work and the explicit acknowledged-and-
deferred items (M-5 untagged enum kept per upstream protocol
convention; L-4 migrate entrypoint deferred per Decision #7).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Mechanical reformatting across files touched by the audit-fix work.
Resolves the cargo fmt --check failure on PR #75 lint job.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…s for H-2

Two unrelated issues blocking the test job on PR #75:

1. Each `tests/*.rs` file is its own crate root for cargo, so the
   recursion_limit attribute on shared-tests/lib.rs doesn't apply.
   Default rustc recursion limit (128) is exceeded by the async test
   bodies' future depth (130). Add `#![recursion_limit = "512"]` to
   contracts_ecdsa.rs / contracts_bls.rs / contracts_mirror.rs.

2. The mirror stake-registry tests in shared-tests use
   `reference_block: 12345u32` (cw-multi-test's default starting
   block). With the H-2 fix removing the snapshot fall-back to
   latest, validation now correctly rejects signers whose
   registration snapshot doesn't exist at reference_block --
   registration writes to block 12346+, not 12345. Bump the
   constant to 999_999u32 (well past any cw-multi-test block
   height) so SnapshotMap::may_load_at_height returns the actual
   registered values.

After this all 11 off-chain tests pass: bls_sanity, ecdsa_sanity,
mirror_sanity, mirror_stake_registry_sanity,
mirror_multi_signer_validation, mirror_abi_binary_compatibility,
mirror_service_manager_admin_only, mirror_negative_scenarios,
mirror_ethereum_recovery_id_test,
mirror_service_manager_validate_query_is_result.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
fix: validate mirror quorum against snapshot weights
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants