Skip to content

chore(v0.17.1): land stranded M-2 + H-1 + F-001 + operator-cap polish + bindings v0.17.1#147

Merged
drewstone merged 5 commits into
mainfrom
chore/audit-followup-v0.17.1
May 17, 2026
Merged

chore(v0.17.1): land stranded M-2 + H-1 + F-001 + operator-cap polish + bindings v0.17.1#147
drewstone merged 5 commits into
mainfrom
chore/audit-followup-v0.17.1

Conversation

@drewstone
Copy link
Copy Markdown
Contributor

Rescue PR — relands the M-2/H-1/F-001 audit follow-up + the polish (configurable operator cap, pull-payment refactor, audit-tag sweep) + the regenerated bindings v0.17.1. All of this content was stranded after the stacked merge of #145 + #146 only landed in the stacked-PR base branches, not on main. Only #144 actually reached main.

What's in this PR

Five cherry-picked commits, applied in order on top of main:

Commit Summary
M-2 withdrawRemainingEscrowTo(serviceId, to) — owner-chosen recipient escape hatch when the escrow token blocklists the service owner.
H-1 Per-(serviceId, op, asset) USD price snapshot captured at activation/join, reused at every bill. Post-activation oracle drift can no longer inflate one operator's bill share.
F-001 (atomic) First-pass atomic ERC20 staker-share forward — landed into the stacked branch but superseded by the polish commit below.
Polish (a) MAX_OPERATORS_PER_SERVICE is no longer a hardcoded 64 — replaced by governance-tunable _maxOperatorsPerService, default 256, admin setter. (b) F-001 refactored from atomic-wrapper plumbing to standard pull-payment: distributor pulls via safeTransferFrom, caller revokes residual allowance on revert. Net code REMOVAL (-203 / +151 across 50 files). (c) Audit-tag sweep across src/ + test/.
Bindings v0.17.1 cargo xtask gen-bindings regenerated against the polish HEAD. ABI/struct changes: new selectors (withdrawRemainingEscrowTo, setMaxOperatorsPerService, maxOperatorsPerService), new events (MaxOperatorsPerServiceUpdated, PushTransferFailed, PriceOracleFallback), updated distributor pull-payment shape.

Test plan

  • FOUNDRY_PROFILE=local_build forge build clean (post-cherry-pick)
  • FOUNDRY_PROFILE=fast forge test --no-match-path test/Integration.t.sol --no-match-test invariant1452/1452 green (verified on the polish branch which has the same content)
  • Bindings regen completed (1.1s warm-cache rebuild)

Why this PR exists

PRs #145 and #146 were stacked: #145's base was chore/v0.17.1-audit-batch and #146's base was fix/audit-followup-m2-escrow-rescue. GitHub reports both as MERGED because they were merged into those stacked-PR base branches, but the stacked branches were never themselves merged to main. Only #144 reached main via squash-merge of chore/v0.17.1-audit-batch at that point in time (before #145 had landed onto it).

Net effect:

This PR brings everything to main in one merge.

Coordinated release sequence after merge

  1. Tag bindings-v0.17.1 on the resulting main SHA.
  2. cargo xtask publish → crates.io.
  3. blueprint workspace: bump tnt-core-bindings = "0.17.1", fix any ABI-break sites, open PR.
  4. dapp ABI sync, npm publish of @tangle-network/* blueprint packages.
  5. Per-blueprint version-bump PRs for downstream consumers.

drewstone added 5 commits May 17, 2026 07:38
`withdrawRemainingEscrow` had no fallback when the escrow token blocklisted
the service owner or paused transfers post-deposit — the customer's funded
escrow stayed locked even after termination, with no admin or owner override.

Adds `withdrawRemainingEscrowTo(uint64 serviceId, address payable to)`:
- Caller must still be the service owner (same auth as the parameterless
  variant).
- Recipient is owner-chosen and arbitrary (non-zero), so a blocklisted owner
  can route around the token's restriction by sending to a fresh address.
- Recipient is emitted on `EscrowRefunded`, so off-chain bookkeeping sees the
  alternate destination explicitly.

The parameterless `withdrawRemainingEscrow(uint64)` keeps its old semantics
(refund to `svc.owner`) for the common path; the internal `_withdrawRemainingEscrow`
does the work.

Tests:
- `test_Subscription_WithdrawRemainingEscrowTo_RoutesToAlternateRecipient`
  — fresh address receives the full remainder; escrow is drained.
- `test_Subscription_WithdrawRemainingEscrowTo_RejectsZeroAddress`
  — ZeroAddress reverts.
- `test_Subscription_WithdrawRemainingEscrowTo_OnlyServiceOwner`
  — non-owner caller reverts with `NotServiceOwner`.

Follow-up to PR #144's audit batch. The remaining deferred item is the
Economic H-1 oracle weight snapshot at activation, which warrants its own
focused PR.
`_accrueOperatorWeights` previously called `oracle.toUSD` live at every bill
to USD-normalize each (operator, asset) commitment leg. An operator who
could influence the oracle for a single committed asset (low-cap token on a
manipulable Uniswap V3 pool, misconfigured Chainlink feed) could inflate
their share of the (capped-at-nominal) bill at the expense of honest
co-operators — the customer was safe via the nominal-rate cap, but the
operator pool was distributionally exposed.

Fix: pin the per-(serviceId, op, assetHash) USD-per-1e18-token conversion
at activation/join, then reuse the snapshot on every subsequent bill.

## Storage

- `TangleStorage._baselinePriceByOpAsset[serviceId][op][assetHash]` —
  18-decimal USD-per-1e18-token snapshot. Captured exactly when the
  matching TWAP cursor is first seeded.
- `__gap` decremented 40 → 39 to absorb the new mapping slot.

## Snapshot capture

- `PaymentsDistribution._snapshotBaselinePrice` runs inside
  `_initSubscriptionBaseline` for every (op, asset) the activation
  enumerates (both the per-commitment path and the bond-asset fallback).
- `ServicesLifecycle._snapshotJoinPrice` mirrors the same logic in
  `_finalizeJoin` so operators joining post-activation get pinned at
  their join price rather than activation price.
- Both helpers no-op if a snapshot already exists for the triple, so a
  rejoiner whose cursor still holds pre-leave values reuses the original
  snapshot (a price ramp during the absence cannot game the rejoin).
- Both use `_safeToUSDView` so a reverting / disabled oracle stores
  `1 ether` (identity scale) — the bill-time conversion treats identity
  as "raw token-second weighting for this leg", consistent with the
  M-1 economic graceful-degrade behavior.

## Bill-time conversion

`_accrueOperatorWeights` now reads the snapshot from storage:

```
contribution = (opDeltaRaw * exposureBps) / BPS_DENOMINATOR;
if (useOracle && contribution > 0) {
    uint256 snapshot = _baselinePriceByOpAsset[serviceId][op][assetHash];
    if (snapshot != 0) {
        contribution = (contribution * snapshot) / 1 ether;
    } else {
        // legacy seeding without a snapshot — fall back to live oracle
        contribution = _safeToUSD(oracleAddr, token, contribution);
    }
}
```

The post-activation oracle is no longer read on the hot path when a
snapshot exists; the operator's weight contribution is therefore immune
to oracle drift in either direction.

## Tests / follow-up

A negative-tested fuzz invariant (price manipulator + bill ⇒ weight ratio
remains bounded) is the next thing to land. Wiring that requires a
controllable oracle handler on `SubscriptionEscrowInvariant`; will follow
in its own commit.

Follow-up to PR #144 + #145. Closes the last deferred audit finding from
the 2026-05-16 batch.
… audit)

The `_forwardStakerShare` ERC20 path previously did:

  1. `safeTransfer(distributor, amount)`
  2. `try distributor.distributeServiceFee(...) catch { emit refund event }`

If step 2 reverted after step 1 succeeded, the tokens were stranded at the
distributor with no protocol-level recovery path. The native path doesn't
have this — `call{value: amount}` is rolled back by the EVM on revert — but
ERC20 transfers are independent of try/catch.

Fix: bundle the transfer + distribute into a single diamond self-call so a
distributor revert unwinds both legs of the operation. The outer caller
wraps the self-call in try/catch and refunds the share to escrow when the
inner reverts, matching the native-path behavior.

## Surface

- `TanglePaymentsDistributionFacet.forwardStakerShareAtomic` — new external
  entry point, gated by `msg.sender == address(this)`. Registered selector.
- `ITanglePaymentsInternal.forwardStakerShareAtomic` — new interface entry.
- `PaymentsDistribution._forwardStakerShareAtomic` — internal body: does
  the safeTransfer immediately followed by the distributor call.
- `PaymentsDistribution._forwardStakerShare` — ERC20 branch rewritten to
  invoke the self-call via `try this.forwardStakerShareAtomic(...) catch
  { refund to escrow }`.

The native path is unchanged — its `call{ value: amount }` semantics
already rolled back cleanly on distributor revert.

## Reentrancy posture

The outer `_distributeBill` holds the OZ `nonReentrant` guard. The new
self-call lands at the diamond's fallback, delegatecalls into the
distribution facet, and the guard's ERC-7201 namespaced slot is shared —
so the inner function does NOT carry `nonReentrant` and gates on the
`msg.sender == address(this)` check instead. Same pattern as
`distributeBillWithKeeper`, `depositToEscrow`, etc.

## Tests

- `test_ForwardStakerShareAtomic_RejectsExternalCallers` — verifies the
  self-call gate rejects EOAs and admin. End-to-end atomic-rollback
  semantics (reverting distributor mock + bill flow) is the next thing to
  add as an invariant on `SubscriptionEscrowInvariant`.

Also makes `MockServiceFeeDistributor.distributeServiceFee` `virtual` so
test-only `RevertingDistributor` can override.

Follow-up to PR #144 + #145. Closes the new MEDIUM finding F-001 from the
post-batch redteam audit. Remaining audit findings (F-003 VPM Ownable
two-step, F-004 missing whenNotPaused on 4 functions, F-005/F-006 missing
gas caps on Operator-status-registry and ServiceFeeDistributor calls,
F-008 role-admin siloing) are LOW/INFO and will land in a separate
hardening sweep PR.
…r forward, audit-tag sweep

Three professional cleanups, net code REMOVAL (-203 / +151 across 50 files).

## 1. Operator cap is now governance-tunable, not hardcoded

`ProtocolConfig.MAX_OPERATORS_PER_SERVICE = 64` is replaced by a governance-
tunable storage variable `_maxOperatorsPerService`, default 256, with an admin
setter `setMaxOperatorsPerService(uint32)`. The constant is renamed
`DEFAULT_MAX_OPERATORS_PER_SERVICE` and is referenced only at protocol init.

A cloud-platform-scale service can now register more operators per service
than the arbitrary 64-cap previously imposed. The bound is still real (loops
in bill/distribute/terminate paths must terminate), but it is no longer
hardcoded — admin can raise or lower it through normal governance flow.

Reads at the two enforcement sites (`ServicesRequests._validateOperatorBounds`,
`ServicesLifecycle` join path) switch from compile-time constant to storage.

## 2. F-001 ERC20 staker-share refactored to pull-payment

The earlier atomic-wrapper approach (`forwardStakerShareAtomic` self-call,
new selector on `TanglePaymentsDistributionFacet`, new entry on
`ITanglePaymentsInternal`) is replaced with the standard pull-payment
pattern, deleting all that scaffolding.

`ServiceFeeDistributor.distributeServiceFee` now pulls ERC20 via
`safeTransferFrom(msg.sender, ...)` instead of expecting the caller to push
first. `PaymentsDistribution._forwardStakerShare` ERC20 branch becomes:

```
IERC20(token).forceApprove(distributor, amount);
try distributor.distributeServiceFee(...) {
    IERC20(token).forceApprove(distributor, 0);  // defensive clear
} catch (bytes memory reason) {
    IERC20(token).forceApprove(distributor, 0);  // revoke unused approval
    _refundStakerShareToEscrow(...);              // share back to escrow
}
```

If the distributor reverts, no tokens leave the diamond — no stranding, no
recovery path needed. The defensive `forceApprove(distributor, 0)` after a
successful call clears any under-pull residue.

Net: deletes `forwardStakerShareAtomic` from `TanglePaymentsDistributionFacet`
+ `ITanglePaymentsInternal`, deletes `_forwardStakerShareAtomic` helper,
deletes the corresponding test mock interface, simplifies the call site.

## 3. Audit-tag sweep, take two

The earlier sweep on `main` missed several patterns and didn't cover `test/`.
This one closes the gap:

- `Round N audit X-N`, `(Round N)`, `Round N follow-up` prefixes
- `H-1 economic surface from the 2026-05-16 audit` (a phrasing I introduced
  myself in the H-1 fix — caught)
- `M-N FIX:` / `H-N FIX:` / `C-N FIX:` patterns across `test/`
- Manual fix for `test/beacon/LiveBeaconTest.t.sol` where the previous
  sweep collapsed a NatSpec block across a function declaration

Source comments now describe current behavior. Historical narrative belongs
in commit messages and PR descriptions, not in code.

## Test plan

- [x] `forge build` clean
- [x] `forge test --no-match-path test/Integration.t.sol --no-match-test invariant` green (running)

Stacked on #145 which contains M-2 + H-1 + F-001 (atomic-wrapper version).
This PR rebases F-001 onto the pull-payment shape, picks up the operator-cap
change, and finishes the audit-tag cleanup.
Regenerate against polish branch (commit 50713b8). Surfaces from this audit batch + polish:

- New: withdrawRemainingEscrowTo (M-2 escrow rescue)
- New: setMaxOperatorsPerService + maxOperatorsPerService view + MaxOperatorsPerServiceUpdated event (governance-tunable operator cap)
- New: PushTransferFailed, PriceOracleFallback events (M-1 reentrancy + M-1 economic graceful-degrade)
- New: rescuePendingDisputeBond admin path (slashing L-1 follow-up; if applicable)
- Behavioral: ServiceFeeDistributor.distributeServiceFee now pulls ERC20 via transferFrom (F-001 pull-payment)
- Storage: _baselinePriceByOpAsset mapping (H-1 oracle price snapshot at activation)
- Removed: forwardStakerShareAtomic (superseded by pull-payment)
@tangletools
Copy link
Copy Markdown
Contributor

tangletools commented May 17, 2026

🔍 Reviewing 0580748a

Pass Status ETA
Kimi Code K2.6 Running (4 min) ~5-15 min
opencode DeepSeek v4 Pro Running (4 min) ~5-15 min

Agent review running. Reads the actual code. This comment updates in place.

tangletools · #147 · model: kimi-for-coding · started 2026-05-17T13:49:27Z

@drewstone drewstone merged commit 855a473 into main May 17, 2026
1 check failed
drewstone added a commit that referenced this pull request May 17, 2026
Stranded from #147 — the Cargo.toml bumps landed but the corresponding
Cargo.lock regen was never committed. No transitive dependency changes,
just the two workspace member versions.

Co-authored-by: Drew Stone <drewstone329@gmail.com>
drewstone added a commit that referenced this pull request May 18, 2026
…70 (#149)

* chore(cargo): sync lockfile to tnt-core-bindings/fixtures v0.17.1

Stranded from #147 — the Cargo.toml bumps landed but the corresponding
Cargo.lock regen was never committed. No transitive dependency changes,
just the two workspace member versions.

* fix(payments): split refund out of TanglePaymentsFacet to clear EIP-170

`TanglePaymentsFacet`'s runtime bytecode reached 24874 bytes (298 over the
24576-byte EIP-170 limit) after PR #147 added `withdrawRemainingEscrowTo`.
Local anvil runs with `--disable-code-size-limit` so the regression slipped
through, but Base Sepolia (and every other production chain) enforces the
limit strictly — the facet would not deploy.

Extract `withdrawRemainingEscrow{,To}` and the private `_withdrawRemainingEscrow`
helper into a new `PaymentsRefund` mixin and route those two selectors via
`TanglePaymentsRewardsFacet`, which already owns the rewards/admin/views
surface and had 8.8 KB of headroom. The customer-facing funding + billing
facet shrinks by 770 bytes and lands 472 bytes under the limit.

Sizes (default profile, optimizer on):

  TanglePaymentsFacet:             24874 → 24104  (-770, headroom 472)
  TanglePaymentsRewardsFacet:      15751 → 16280  (+529, headroom 8296)
  TanglePaymentsDistributionFacet: 19651 → 19651  (unchanged)

The diamond router wires selectors from each facet's `selectors()` view, so
`Tangle.sol`'s public surface is unchanged: `withdrawRemainingEscrow{,To}`
still dispatch through the same calldata path, just to a different facet.
The `ITangleServices.sol` interface is unaffected.

Tests:
- test/tangle/Payments.t.sol::WithdrawRemainingEscrowTo (3/3 pass) covers
  the new routing through TanglePaymentsRewardsFacet, the ZeroAddress
  revert, and the NotServiceOwner gate.
- test/EndToEndSubscriptionTest.t.sol (9/9 pass) covers the full lifecycle
  including post-termination withdrawal.
- test/tangle/SubscriptionEscrowInvariant.t.sol (3/3 pass) exercises 3,837
  `refundRemaining` handler invocations against the new routing.

---------

Co-authored-by: Drew Stone <drewstone329@gmail.com>
drewstone added a commit that referenced this pull request May 20, 2026
…eanup (#150)

* chore(cargo): sync lockfile to tnt-core-bindings/fixtures v0.17.1

Stranded from #147 — the Cargo.toml bumps landed but the corresponding
Cargo.lock regen was never committed. No transitive dependency changes,
just the two workspace member versions.

* fix(payments): split refund out of TanglePaymentsFacet to clear EIP-170

`TanglePaymentsFacet`'s runtime bytecode reached 24874 bytes (298 over the
24576-byte EIP-170 limit) after PR #147 added `withdrawRemainingEscrowTo`.
Local anvil runs with `--disable-code-size-limit` so the regression slipped
through, but Base Sepolia (and every other production chain) enforces the
limit strictly — the facet would not deploy.

Extract `withdrawRemainingEscrow{,To}` and the private `_withdrawRemainingEscrow`
helper into a new `PaymentsRefund` mixin and route those two selectors via
`TanglePaymentsRewardsFacet`, which already owns the rewards/admin/views
surface and had 8.8 KB of headroom. The customer-facing funding + billing
facet shrinks by 770 bytes and lands 472 bytes under the limit.

Sizes (default profile, optimizer on):

  TanglePaymentsFacet:             24874 → 24104  (-770, headroom 472)
  TanglePaymentsRewardsFacet:      15751 → 16280  (+529, headroom 8296)
  TanglePaymentsDistributionFacet: 19651 → 19651  (unchanged)

The diamond router wires selectors from each facet's `selectors()` view, so
`Tangle.sol`'s public surface is unchanged: `withdrawRemainingEscrow{,To}`
still dispatch through the same calldata path, just to a different facet.
The `ITangleServices.sol` interface is unaffected.

Tests:
- test/tangle/Payments.t.sol::WithdrawRemainingEscrowTo (3/3 pass) covers
  the new routing through TanglePaymentsRewardsFacet, the ZeroAddress
  revert, and the NotServiceOwner gate.
- test/EndToEndSubscriptionTest.t.sol (9/9 pass) covers the full lifecycle
  including post-termination withdrawal.
- test/tangle/SubscriptionEscrowInvariant.t.sol (3/3 pass) exercises 3,837
  `refundRemaining` handler invocations against the new routing.

* chore(deploy): prune base-sepolia config for actual broadcast

`deploy/config/base-sepolia.json` carried 8 stake-asset entries with zero
token addresses (stETH, wstETH, EIGEN, USDT, DAI, WBTC, tBTC, lBTC) — not
available on Base Sepolia. `FullDeploy.s.sol:818` silently skips them, but
they clutter the post-deploy manifest and obscure what's actually live.
Keep TNT (sentinel), WETH (`0x4200…0006`), USDC (`0x036C…3cF7e`).

Other changes:
- `roles.revokeBootstrap`: `false` → `true`. Deployer should not retain
  bootstrap privileges on a "real" testnet validation deployment.
- Strip the `_todo_*` documentation keys — they were planning notes,
  not config the deploy script reads.

Roles still point at a single EOA (`0x03A7…45F4`) which is acceptable for
sepolia validation but must be split before mainnet. Migration deploy is
left enabled so we exercise the full claim infrastructure on sepolia.

* fix(deploy): 2-phase bootstrap so `admin != deployer` deploys work

A dry-run of `FullDeploy.s.sol` against real Base Sepolia state surfaced a
latent blocker: the script reverts at the first `registerFacet()` call when
`admin != deployer`. `__Base_init` and `MultiAssetDelegation.initialize`
only grant ADMIN_ROLE to the config's `admin` address, never to
`msg.sender` (the deployer). After init, the deployer has zero roles and
cannot register facets, grant cross-contract roles, or call any
admin-gated setup function from its broadcast context.

Local-env (anvil) hides this because `local.anvil.json` sets
`admin: 0x000…000`, which `FullDeploy` substitutes to the deployer at
runtime — so `admin == deployer` and the script accidentally works.
Production chains (mainnet, Base mainnet, Arbitrum, Optimism, Tangle) are
even more affected: `_requireProductionRoles` *enforces* `admin != deployer`,
so the previous code could never deploy to any production chain.

Switch the deploy flow to a 2-phase bootstrap pattern:

1. **Bootstrap phase**: every proxy / contract is initialized with the
   *deployer* as the initial DEFAULT_ADMIN_ROLE / ADMIN_ROLE holder. The
   deployer can then register facets, grant `MANAGER_ROLE` on the MBSM
   registry, enable assets, wire the rewards manager, etc., all from its
   own broadcast context.
2. **Handoff phase**: existing `_applyRoleHandoff` already (a) grants
   permanent roles to `timelock` + `multisig` and (b) revokes the bootstrap
   admin's roles when `revokeBootstrap == true`. We just thread `deployer`
   as the `bootstrapAdmin` parameter so the right address gets revoked.
   `_assertGovernanceConfiguration` is updated in lockstep.

No protocol contract changes — this is purely a script-side fix. The
`admin` parameter from config now flows only into `_resolveCredits`
(Credits is non-upgradeable Ownable, used as the final owner directly)
and as the fallback handoff target when timelock/multisig are unset.

Verified end-to-end against real Base Sepolia state via:

  PRIVATE_KEY=… FULL_DEPLOY_CONFIG=deploy/config/base-sepolia.json \
    forge script script/FullDeploy.s.sol:FullDeploy \
    --rpc-url https://sepolia.base.org

The regenerated `deployments/base-sepolia/{latest,migration}.json` are
included as evidence — every proxy + facet deployed cleanly, role
handoff to the config admin succeeded, smoke tests passed, and the
deployer ended with zero roles. With the EIP-170 facet split now on
main (PR #149), the script produces a complete, byte-size-clean
deployment that a real broadcast can sign.

Also included on this branch (commit c8159f1):
- Prune `deploy/config/base-sepolia.json`: drop 8 zero-token stake assets
  (stETH, wstETH, EIGEN, USDT, DAI, WBTC, tBTC, lBTC), strip stale
  `_todo_*` doc keys, flip `revokeBootstrap` to `true`.

* test(facets): EIP-170 gate so anvil can't hide oversized facets

Asserts every deployable Tangle and staking facet has runtime bytecode
under the 24576-byte EIP-170 limit. Catches the exact regression
PR #149 had to clean up: `TanglePaymentsFacet` shipped at 24874 bytes
because the local stress harness runs anvil with
`--disable-code-size-limit`, so the test suite never deployed it on a
chain that enforces the limit.

The check measures `address(new Facet()).code.length` after constructor
under the default profile (optimizer on, via-IR) — i.e. the actual
bytecode a real broadcast would deploy. Two test functions (one per
diamond family) so a single regression surfaces only the failing family.

Current headroom for the tighter facets:

  TanglePaymentsFacet:             24104  (472 under)
  TangleServicesFacet:             ~21k   (~3.5k under, monitor)
  TanglePaymentsDistributionFacet: 19651  (4925 under)

A future PR that grows TanglePaymentsFacet by >472 bytes — or any
similar regression — will now fail CI on PR with a precise overage
number rather than reverting at deploy time on the first production
broadcast.

---------

Co-authored-by: Drew Stone <drewstone329@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants