Skip to content

feat: integrate with ethrex over the Engine API#367

Open
pablodeymo wants to merge 12 commits into
mainfrom
engine-api-integration
Open

feat: integrate with ethrex over the Engine API#367
pablodeymo wants to merge 12 commits into
mainfrom
engine-api-integration

Conversation

@pablodeymo
Copy link
Copy Markdown
Collaborator

🗒️ Description / Motivation

Adds end-to-end Engine API connectivity between ethlambda (consensus) and any Engine-API-speaking execution layer (validated live against ethrex). Until now ethlambda has been consensus-only — no JSON-RPC client, no JWT plumbing, no execution-layer awareness. This PR lands the full transport, auth, and slot-cadence wiring so the protocol-level work of carrying execution payloads in Lean blocks can land on a working foundation.

Scope is deliberately the middle of three options identified in docs/plans/engine-api-integration.md:

Option What it means Picked?
A. Spike One-off JWT + HTTP call, prove the wire works No — too thin
B. Scaffold Typed client crate + CLI + capability handshake + per-slot FCU. Block-schema unchanged. ✅ this PR
C. Full merge Add executionPayload to Lean BlockBody, propagate through STF No — blocked on upstream (leanSpec has no executionPayload definition yet)

Picking B (rather than waiting on Option C) means the JWT/JSON-RPC/timing/observability work lands now and is exercised by every devnet; when leanSpec defines a payload schema, only the schema part remains.

What Changed

New crate: crates/net/ethrex-client/

File Purpose
src/lib.rs Public API + ETHLAMBDA_ENGINE_CAPABILITIES advertised list
src/auth.rs JwtSecret — HS256 token minting, hex/file loaders, deterministic-per-iat
src/client.rs EngineClientreqwest POST with bearer auth, JSON-RPC envelope, typed method wrappers
src/types.rs V3 (Cancun) wire types: ForkChoiceState, PayloadAttributesV3, ExecutionPayloadV3, PayloadStatus, Withdrawal, plus hex serde helpers — field names match the canonical execution-apis spec character-for-character
src/error.rs EngineClientError — distinct variants for transport / RPC / serialize / empty-response
examples/smoke.rs Live demo binary: one-shot or --loop <N> mode at 4s slot cadence
tests/wire_smoke.rs End-to-end test against a hand-rolled tokio::net::TcpListener mock — validates JWT header, JSON-RPC body shape, response parsing, error surfacing

V3-only: we don't need V1/V2, and V4/V5 (Prague+) can be added when needed. Why a separate crate (not folded into crates/net/rpc): rpc serves the beacon HTTP API + metrics, while ethrex-client is a client to a different process. Separate dependency footprint (jsonwebtoken here, axum there).

crates/blockchain/

BlockChain::spawn(..., execution_client: Option<EngineClient>) — threads the optional client into the actor. New notify_execution_layer() method:

if interval == 0 && self.execution_client.is_some() {
    self.notify_execution_layer();
}

Fires engine_forkchoiceUpdatedV3 once per slot at interval 0, tokio::spawn-ed (fire-and-forget). Errors are warn!-logged; consensus tick never awaits the EL.

bin/ethlambda/

Two new CLI flags (clap enforces both-or-neither — neither given = integration disabled, unchanged behavior):

--execution-endpoint <URL>          # e.g. http://127.0.0.1:8551
--execution-jwt-secret <PATH>       # 32-byte hex secret file (same format as Lighthouse/Teku/Prysm/ethrex)

New build_execution_client() helper loads the secret, constructs the client, runs engine_exchangeCapabilities on startup, and logs the result. Failures degrade gracefully — node continues without EL.

docs/plans/engine-api-integration.md

Full planning doc: scope analysis, architecture, milestones (M1–M5 done in this PR; M6 = real payload flow, deferred), open questions, references.

Correctness / Behavior Guarantees

  • Opt-in. Without --execution-endpoint + --execution-jwt-secret, BlockChain receives None and notify_execution_layer short-circuits. Existing nodes are bitwise-unaffected.
  • Consensus never blocks on the EL. The FCU call is tokio::spawn-ed off-tick, never .await-ed. RPC failures (transport, JWT, response parse, EL error) become warn! log entries, never propagate to the actor.
  • JWT spec compliance. Token carries only iat (per execution-apis auth spec); EL accepts ±60s clock skew; signed HS256 with a 32-byte secret. Fresh token minted per request (HMAC over ~70 bytes — negligible cost).
  • Wire format match. Every type derives from canonical execution-apis JSON: camelCase fields, 0x-prefixed hex for QUANTITY and DATA, optional V3 fields (withdrawals, blobGasUsed, etc.) serialized exactly as ethrex/lighthouse/teku expect. Verified by the wire smoke test (snapshot-style: actual reqwest output matches handcrafted byte expectations) and by the live ethrex run (see "Live verification" below).
  • What's NOT here (intentionally): engine_newPayloadV3 and engine_getPayloadV3 are defined and wrappered but not wired into block import / proposal. They can't be, until Lean BlockBody carries an executionPayload, and that schema lives in leanSpec. The plumbing is one-method-call away — see follow-ups.

Tests Added / Run

Unit (cargo test -p ethlambda-ethrex-client --lib) — 10 passed

  • JWT: hex parsing (0x-prefix tolerant), wrong-length rejection, determinism per iat, distinctness across iat
  • Types: ForkChoiceState JSON roundtrip (camelCase), PayloadStatus SYNCING parsing, ForkChoiceUpdatedResponse with null payload_id, hex_u64 roundtrip
  • Client: builder smoke + transport error on closed port

Wire smoke (cargo test -p ethlambda-ethrex-client --test wire_smoke) — 2 passed

  • forkchoice_updated_v3_round_trip — spawns a tokio::net::TcpListener mock, asserts the inbound request contains Authorization: Bearer ..., "method":"engine_forkchoiceUpdatedV3", camelCase headBlockHash, and the head hash bytes; parses canned SYNCING response back into the typed struct
  • rpc_error_surfaces_typed — mock returns JSON-RPC error envelope; client surfaces code + message in the typed error

Live end-to-end against real ethrex — 7/7 slots green

Spawned ethrex --authrpc.port 8551 --authrpc.jwtsecret /tmp/jwt.hex, ran the smoke example in loop mode:

$ ./target/release/examples/smoke http://127.0.0.1:8551 /tmp/jwt.hex --loop 7
--- engine_exchangeCapabilities
EL advertises 18 capabilities (showing first 6):
  engine_forkchoiceUpdatedV1
  engine_forkchoiceUpdatedV2
  engine_forkchoiceUpdatedV3
  engine_newPayloadV1
  engine_newPayloadV2
  engine_newPayloadV3

--- engine_forkchoiceUpdatedV3 loop (7 slots @ 4s/slot)
slot=  0 engine_forkchoiceUpdatedV3 -> Syncing (latency 299.5µs)
slot=  1 engine_forkchoiceUpdatedV3 -> Syncing (latency 991.25µs)
slot=  2 engine_forkchoiceUpdatedV3 -> Syncing (latency 660.083µs)
slot=  3 engine_forkchoiceUpdatedV3 -> Syncing (latency 819.833µs)
slot=  4 engine_forkchoiceUpdatedV3 -> Syncing (latency 858µs)
slot=  5 engine_forkchoiceUpdatedV3 -> Syncing (latency 919.875µs)
slot=  6 engine_forkchoiceUpdatedV3 -> Syncing (latency 711µs)

ethrex's log confirms the bytes flowed end-to-end into its sync engine (not just the JSON-RPC layer):

INFO Syncing from current head 0xb923... to sync_head 0x6865616400000000000000000000000000000000000000000000000000000006

0x68 65 61 64 is ASCII "head" — the literal bytes my smoke loop sent for the head hash on slot 6.

Reproduction

# T1 — start ethrex
openssl rand -hex 32 | awk '{print "0x"$0}' > /tmp/jwt.hex
ethrex --network <genesis.json> --datadir /tmp/ethrex-data \
       --authrpc.port 8551 --authrpc.jwtsecret /tmp/jwt.hex

# T2 — drive 7 FCUs at slot cadence
cargo run -p ethlambda-ethrex-client --example smoke --release -- \
       http://127.0.0.1:8551 /tmp/jwt.hex --loop 7

In a real consensus run (with valid validator keys + an updated lean-quickstart bundle), the same wire is driven by BlockChain::on_tick instead of the smoke binary — the example is just a thin proxy for the actor.

Related Issues / PRs

  • Reference upstream: ethrex Engine API, execution-apis spec
  • Follow-ups (blocked on upstream):
    • When leanSpec defines an ExecutionPayload schema, add it to crates/common/types/src/block.rs::BlockBody and propagate through STF (validation = engine_newPayloadV3 on import, build = engine_getPayloadV3 on proposal). Plumbing for both calls is already in the client.
    • Refresh lean-quickstart/local-devnet/genesis/ to the post-type1-type2 dual-pubkey schema + add a jwt.hex per node so a full multi-client devnet can include EL pairing.
    • Optionally: V4/V5 wrappers (Prague), PayloadAttributesV4.slot_number integration, multi-EL fallback.

✅ Verification Checklist

  • Ran make fmt — clean
  • Ran make lint (clippy with -D warnings) — clean
  • Ran cargo test --workspace --release — all my touched crates green; the 8 unrelated forkchoice_spectests failures (AttestationTooFarInFuture) are pre-existing on main (stale leanSpec fixtures — same family that was transient on 2026-05-11), confirmed by re-running on a stashed-clean tree

Add ethlambda-ethrex-client crate speaking JWT HS256-authenticated
JSON-RPC to the EL auth endpoint, with typed V3 wrappers for the four
engine_* methods we use (exchangeCapabilities, forkchoiceUpdatedV3,
newPayloadV3, getPayloadV3) and field-for-field schema match against
the canonical execution-apis spec.

The blockchain actor takes an optional EngineClient and fires
engine_forkchoiceUpdatedV3 at interval 0 of every slot, fire-and-forget;
errors are logged but never block consensus. Integration is opt-in via
--execution-endpoint + --execution-jwt-secret flags (clap enforces
both-or-neither).

Verified end-to-end against real ethrex: capability handshake returns
the 18 advertised methods, FCUs round-trip in sub-ms with SYNCING
(expected -- Lean blocks do not carry an executionPayload yet; that
schema change is deferred to an upstream leanSpec proposal, see
docs/plans/engine-api-integration.md).

Tests: 12 unit + 2 wire smoke tests covering JWT signing, V3 type
serde, RPC error surfacing, and full request/response against a
hand-rolled mock HTTP server.
@github-actions
Copy link
Copy Markdown

🤖 Kimi Code Review

Overall Assessment: Well-structured PR implementing Engine API scaffolding. Correctly keeps EL integration off the consensus critical path. Minor serialization bugs in wire types need fixing.

Critical Issues

  1. Incorrect PayloadId serialization (crates/net/ethrex-client/src/types.rs:94-102)
    The #[serde(transparent)] attribute on PayloadId serializes the inner [u8; 8] as a JSON array of numbers, but the Engine API spec requires a 16-character hex string (e.g., "0x1234..."). This will cause deserialization failures when the EL returns a payload ID.

    • Fix: Remove #[serde(transparent)] and implement custom Serialize/Deserialize using to_hex() and a hex parser.
  2. Missing hex encoding for fixed-size arrays (crates/net/ethrex-client/src/types.rs)
    PayloadAttributesV3.suggested_fee_recipient: [u8; 20] and Withdrawal.address: [u8; 20] will serialize as JSON arrays [0,1,2...] instead of hex strings. The spec requires DATA (hex) encoding for addresses.

    • Fix: Add a hex_fixed_array serde helper (or use serde_with::hex::Hex if that dependency is acceptable) for these fields, similar to the existing hex_bytes for Vec<u8>.

Security & Safety

  1. JWT Secret File Permissions (crates/net/ethrex-client/src/auth.rs:76-80)
    JwtSecret::from_file reads the secret without checking file permissions. Ethereum clients conventionally require JWT secret files to have restrictive permissions (e.g., 0o600) to prevent unauthorized reads.
    • Suggestion: Add a permissions check (e.g., std::fs::metadata) and warn/error if the file is world-readable.

**Code Quality


Automated review by Kimi (Moonshot AI) · kimi-k2.5 · custom prompt

@github-actions
Copy link
Copy Markdown

🤖 Codex Code Review

Findings

  1. High: Engine API address fields are serialized with the wrong JSON shape. In types.rs:35, types.rs:48, and types.rs:102, suggested_fee_recipient, Withdrawal.address, and fee_recipient are plain [u8; 20]. serde_json will emit those as byte arrays, not 0x... hex DATA strings, so engine_forkchoiceUpdatedV3(..., Some(payload_attributes)) and engine_newPayloadV3() are wire-incompatible with a spec-compliant EL.

  2. Medium: Two valid Engine API responses cannot currently be deserialized. types.rs:57 models PayloadId as a transparent [u8; 8], but the wire format is a hex string like "0x0123...". types.rs:69 uses rename_all = "UPPERCASE" for PayloadStatusKind, which maps InvalidBlockHash to INVALIDBLOCKHASH instead of INVALID_BLOCK_HASH. A non-null payloadId or an INVALID_BLOCK_HASH response will surface as DeserializeResponse instead of a typed result.

  3. Medium: hex_u256::deserialize can panic on malformed EL input. In types.rs:194, the code decodes arbitrary-length hex and then unconditionally copy_from_slices into a [u8; 32]. Anything longer than 32 bytes will panic and can take down the node. This should return a serde error, not crash the process.

  4. Medium: The live FCU scaffold is sending beacon roots as execution block hashes. In lib.rs:231 through lib.rs:238, head(), safe_target(), and latest_finalized().root are passed straight into ForkChoiceState, but those are beacon roots, not EL block hashes. Until there is a real payload/header mapping, this should use an explicit dummy value (likely zeroes, which the comments/examples already describe) or keep FCU disabled; otherwise behavior depends on EL-specific handling of invalid hashes.

Open Questions / Assumptions

  • I assumed the intent is still “wire scaffold only” until execution payloads exist. If so, Item 4 looks unintentional rather than a design change.
  • The capability handshake result is logged but not acted on in main.rs:601. If “other EL” support matters, I’d expect unsupported V3 methods to disable per-slot FCU instead of warning forever.
  • I couldn’t run cargo test here because the sandbox blocks dependency fetches and writable Cargo cache setup.

Outside the new EL client and tick hook, I didn’t see direct changes to LMD GHOST / 3SF-mini, attestation validation, justification/finalization, XMSS, or SSZ paths.


Automated review by OpenAI Codex · gpt-5.4 · custom prompt

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 13, 2026

Greptile Summary

This PR adds a new ethlambda-ethrex-client crate implementing JWT-authenticated Engine API connectivity (V3/Cancun subset) and wires it into the BlockChain actor as a fire-and-forget engine_forkchoiceUpdatedV3 call at every slot boundary.

  • New ethrex-client crate: JwtSecret (HS256 mint, hex/file loaders), EngineClient (reqwest + JSON-RPC envelope), typed V3 wire types, and a hand-rolled TCP mock integration test.
  • BlockChain actor hookup: optional EngineClient threaded in at startup; FCU fired via tokio::spawn at interval 0, never awaited, errors warn-logged.
  • CLI flags: --execution-endpoint / --execution-jwt-secret with clap mutual-require; both absent means integration disabled with zero behavior change for existing nodes.

Confidence Score: 3/5

Safe for current FCU-only usage; two type definition bugs in types.rs will break wire format when payload building or INVALID_BLOCK_HASH responses are encountered

The JWT auth, transport, and blockchain-actor wiring are solid. However types.rs has two defects that matter as soon as deferred milestones are picked up: PayloadStatusKind would produce a deserialization error on any INVALID_BLOCK_HASH EL response, and the [u8; 20] address fields emit integer arrays instead of hex strings, guaranteed to be rejected by the EL when payload building is attempted.

crates/net/ethrex-client/src/types.rs needs the PayloadStatusKind rename attribute corrected and hex serializers added to the three [u8; 20] address fields before payload flow is wired

Important Files Changed

Filename Overview
crates/net/ethrex-client/src/types.rs New Engine API V3 wire types; two bugs: PayloadStatusKind uses UPPERCASE rename (should be SCREAMING_SNAKE_CASE) and [u8; 20] address fields lack hex serde
crates/net/ethrex-client/src/auth.rs JWT HS256 auth: correct iat-only claims, 32-byte length enforcement, hex parsing with/without 0x prefix
crates/net/ethrex-client/src/client.rs EngineClient wraps reqwest with JWT bearer auth, JSON-RPC envelope, and typed method wrappers
crates/blockchain/src/lib.rs Threads optional EngineClient into BlockChainServer; fires FCU via tokio::spawn fire-and-forget at interval 0
bin/ethlambda/src/main.rs Adds CLI flags with clap mutual-require; build_execution_client degrades gracefully but has a misleading comment
docs/plans/engine-api-integration.md Detailed integration plan; References section contains absolute local filesystem paths
crates/net/ethrex-client/tests/wire_smoke.rs End-to-end integration test using a hand-rolled TCP mock

Sequence Diagram

sequenceDiagram
    participant CLI as bin/ethlambda
    participant EC as EngineClient
    participant EL as ethrex Engine API

    Note over CLI: startup
    CLI->>EC: new(endpoint, JwtSecret)
    CLI->>EC: exchange_capabilities
    EC->>EL: POST engine_exchangeCapabilities
    EL-->>EC: capability list
    EC-->>CLI: Ok or warn and continue

    Note over CLI: BlockChain spawned with optional client

    loop every slot at interval 0
        CLI->>EC: notify_execution_layer via tokio::spawn
        EC->>EC: sign fresh JWT
        EC->>EL: POST engine_forkchoiceUpdatedV3
        EL-->>EC: SYNCING scaffold response
        EC-->>CLI: logged, never blocks consensus
    end
Loading

Comments Outside Diff (1)

  1. bin/ethlambda/src/main.rs, line 321-325 (link)

    P2 The warning message "will keep retrying on each tick" implies engine_exchangeCapabilities is called on every slot. In practice only engine_forkchoiceUpdatedV3 runs on each tick; the capability handshake is one-shot at startup. The comment should describe what actually happens, e.g. "EL will still receive FCU calls; capability list is unknown".

    Prompt To Fix With AI
    This is a comment left during a code review.
    Path: bin/ethlambda/src/main.rs
    Line: 321-325
    
    Comment:
    The warning message "will keep retrying on each tick" implies `engine_exchangeCapabilities` is called on every slot. In practice only `engine_forkchoiceUpdatedV3` runs on each tick; the capability handshake is one-shot at startup. The comment should describe what actually happens, e.g. "EL will still receive FCU calls; capability list is unknown".
    
    How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
Fix the following 4 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 4
crates/net/ethrex-client/src/types.rs:68-76
`UPPERCASE` rename collapses multi-word variant names without inserting underscores, so `InvalidBlockHash` serializes/deserializes as `"INVALIDBLOCKHASH"` rather than the Engine API spec value `"INVALID_BLOCK_HASH"`. Any EL response carrying that status will fail to deserialize, surfacing as an `EngineClientError::DeserializeResponse` instead of the correct typed variant. Using `SCREAMING_SNAKE_CASE` fixes all five variants simultaneously.

```suggestion
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "SCREAMING_SNAKE_CASE")]
pub enum PayloadStatusKind {
    Valid,
    Invalid,
    Syncing,
    Accepted,
    InvalidBlockHash,
}
```

### Issue 2 of 4
crates/net/ethrex-client/src/types.rs:35
`[u8; 20]` with no custom serializer is emitted by `serde_json` as a JSON integer array (`[0,1,2,...]`), not the `"0x..."` DATA string the Engine API spec requires. The same issue affects `fee_recipient` in `ExecutionPayloadV3` (line 102) and `address` in `Withdrawal` (line 48). Both `engine_newPayloadV3` and any FCU call that passes `PayloadAttributesV3` will produce a malformed request body that the EL will reject.

### Issue 3 of 4
docs/plans/engine-api-integration.md:168-170
The References section embeds absolute paths from the author's local machine (`/Users/pablodeymonnaz/Lambda/ethrex/...`). These should be replaced with GitHub URLs so other contributors can follow them. The same issue appears in the "Starting state" section (line 33).

### Issue 4 of 4
bin/ethlambda/src/main.rs:321-325
The warning message "will keep retrying on each tick" implies `engine_exchangeCapabilities` is called on every slot. In practice only `engine_forkchoiceUpdatedV3` runs on each tick; the capability handshake is one-shot at startup. The comment should describe what actually happens, e.g. "EL will still receive FCU calls; capability list is unknown".

Reviews (1): Last reviewed commit: "feat: integrate with ethrex over Engine ..." | Re-trigger Greptile

Comment on lines +68 to +76
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "UPPERCASE")]
pub enum PayloadStatusKind {
Valid,
Invalid,
Syncing,
Accepted,
InvalidBlockHash,
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 UPPERCASE rename collapses multi-word variant names without inserting underscores, so InvalidBlockHash serializes/deserializes as "INVALIDBLOCKHASH" rather than the Engine API spec value "INVALID_BLOCK_HASH". Any EL response carrying that status will fail to deserialize, surfacing as an EngineClientError::DeserializeResponse instead of the correct typed variant. Using SCREAMING_SNAKE_CASE fixes all five variants simultaneously.

Suggested change
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "UPPERCASE")]
pub enum PayloadStatusKind {
Valid,
Invalid,
Syncing,
Accepted,
InvalidBlockHash,
}
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "SCREAMING_SNAKE_CASE")]
pub enum PayloadStatusKind {
Valid,
Invalid,
Syncing,
Accepted,
InvalidBlockHash,
}
Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/net/ethrex-client/src/types.rs
Line: 68-76

Comment:
`UPPERCASE` rename collapses multi-word variant names without inserting underscores, so `InvalidBlockHash` serializes/deserializes as `"INVALIDBLOCKHASH"` rather than the Engine API spec value `"INVALID_BLOCK_HASH"`. Any EL response carrying that status will fail to deserialize, surfacing as an `EngineClientError::DeserializeResponse` instead of the correct typed variant. Using `SCREAMING_SNAKE_CASE` fixes all five variants simultaneously.

```suggestion
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "SCREAMING_SNAKE_CASE")]
pub enum PayloadStatusKind {
    Valid,
    Invalid,
    Syncing,
    Accepted,
    InvalidBlockHash,
}
```

How can I resolve this? If you propose a fix, please make it concise.

#[serde(with = "hex_u64")]
pub timestamp: u64,
pub prev_randao: H256,
pub suggested_fee_recipient: [u8; 20],
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 [u8; 20] with no custom serializer is emitted by serde_json as a JSON integer array ([0,1,2,...]), not the "0x..." DATA string the Engine API spec requires. The same issue affects fee_recipient in ExecutionPayloadV3 (line 102) and address in Withdrawal (line 48). Both engine_newPayloadV3 and any FCU call that passes PayloadAttributesV3 will produce a malformed request body that the EL will reject.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/net/ethrex-client/src/types.rs
Line: 35

Comment:
`[u8; 20]` with no custom serializer is emitted by `serde_json` as a JSON integer array (`[0,1,2,...]`), not the `"0x..."` DATA string the Engine API spec requires. The same issue affects `fee_recipient` in `ExecutionPayloadV3` (line 102) and `address` in `Withdrawal` (line 48). Both `engine_newPayloadV3` and any FCU call that passes `PayloadAttributesV3` will produce a malformed request body that the EL will reject.

How can I resolve this? If you propose a fix, please make it concise.

Comment thread docs/plans/engine-api-integration.md Outdated
Comment on lines +168 to +170
- ethrex Engine API: `/Users/pablodeymonnaz/Lambda/ethrex/crates/networking/rpc/engine/`
- ethrex auth client (template): `/Users/pablodeymonnaz/Lambda/ethrex/crates/networking/rpc/clients/auth/mod.rs`
- ethrex JWT auth: `/Users/pablodeymonnaz/Lambda/ethrex/crates/networking/rpc/authentication.rs`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The References section embeds absolute paths from the author's local machine (/Users/pablodeymonnaz/Lambda/ethrex/...). These should be replaced with GitHub URLs so other contributors can follow them. The same issue appears in the "Starting state" section (line 33).

Prompt To Fix With AI
This is a comment left during a code review.
Path: docs/plans/engine-api-integration.md
Line: 168-170

Comment:
The References section embeds absolute paths from the author's local machine (`/Users/pablodeymonnaz/Lambda/ethrex/...`). These should be replaced with GitHub URLs so other contributors can follow them. The same issue appears in the "Starting state" section (line 33).

How can I resolve this? If you propose a fix, please make it concise.

@github-actions
Copy link
Copy Markdown

🤖 Claude Code Review

Now I have a thorough view of all the critical files. Let me compile the review.


Review: feat: integrate with ethrex over the Engine API

Overview

This PR adds the Engine API plumbing between ethlambda and an execution layer (ethrex). It is deliberately scoped to Option B ("scaffold") — the JWT/transport/tick-wiring layer — while deferring actual payload flow until leanSpec defines ExecutionPayload. That scope is appropriate and the architecture is sound. A few correctness issues in the type layer need addressing before this merges.


Correctness Issues

1. PayloadStatusKind::InvalidBlockHash will not round-trip correctly

types.rs, PayloadStatusKind:

#[serde(rename_all = "UPPERCASE")]
pub enum PayloadStatusKind {
    ...
    InvalidBlockHash,
}

serde(rename_all = "UPPERCASE") converts each variant by uppercasing the entire name, so InvalidBlockHash"INVALIDBLOCKHASH". The execution-apis spec requires "INVALID_BLOCK_HASH". Any response from the EL with that status will fail deserialization silently (falling through as an UnknownVariant error). Fix:

#[serde(rename = "INVALID_BLOCK_HASH")]
InvalidBlockHash,

2. [u8; 20] address fields serialize as integer arrays, not hex strings

types.rs, PayloadAttributesV3, ExecutionPayloadV3, and Withdrawal:

pub suggested_fee_recipient: [u8; 20],   // PayloadAttributesV3
pub fee_recipient: [u8; 20],              // ExecutionPayloadV3
pub address: [u8; 20],                    // Withdrawal

serde_json serializes [u8; N] as [0, 0, 0, ...], not "0x000...000". The Engine API spec requires DATA encoding (hex with 0x prefix). This is dormant in M4 because forkchoice_updated_v3 is only called with payload_attributes: None and ExecutionPayloadV3 is not yet wired — but the types are part of the crate's public API and they will silently produce malformed JSON the moment they're exercised. The bug should be fixed now alongside the type definitions, not left for M6.

The fix is a newtype or a custom hex_address serde module (mirroring hex_u64/hex_bytes) applied via #[serde(with = "hex_address")].


Minor Issues

3. Misleading log message in build_execution_client

main.rs:

Err(err) => warn!(
    %err,
    "EL capability handshake failed (will keep retrying on each tick)"
),

exchange_capabilities is called exactly once, at startup. The per-tick call is forkchoice_updated_v3, not the capability handshake. The parenthetical is inaccurate and will confuse operators debugging a failed handshake. Suggest: "EL capability handshake failed; FCU calls will still be attempted each slot".

4. No HTTP status check before body parsing in rpc_call

client.rs:

let raw = self
    .http
    .post(&self.url)
    ...
    .send()
    .await?
    .text()
    .await?;

If the EL returns a 401 (wrong JWT), 403, or 500, the body may not be valid JSON-RPC. The error will surface as EngineClientError::DeserializeResponse with no indication of the HTTP status code, making debugging harder. Adding .error_for_status() before .text() would give a cleaner error path for authentication/server failures.

5. Fixed id: 1 in JSON-RPC requests

client.rs, JsonRpcRequest:

id: 1,

Fine for fire-and-forget sequential calls, but if concurrent requests are ever issued (possible if two rapid tick intervals fire before the first completes given the 8s timeout), request/response correlation by id breaks. Not a bug today given the tokio::spawn-based fire-and-forget model, but worth noting for future parallelism.

6. JwtSecret stores bytes as Vec<u8>

auth.rs:

pub struct JwtSecret {
    bytes: Vec<u8>,
}

Since from_bytes enforces exactly 32 bytes, the type could be [u8; 32], which eliminates the heap allocation, makes Clone cheaper (stack copy vs. heap clone), and opens the door to deriving Copy. The WrongLength variant would still exist but would only be reachable from from_bytes.


Positive Observations

  • Fire-and-forget via tokio::spawn is exactly right. Consensus ticks are never awaited on EL state.
  • Fresh JWT per request with sign_now() is spec-correct; the sign(iat) / sign_now() split makes deterministic testing clean.
  • Graceful degradation: startup failures return None and log at error!/warn! rather than crashing the node. Correct.
  • ForkChoiceState comment explicitly calls out that beacon roots are used as EL block hashes in M4 scaffold mode. Helps future reviewers.
  • Wire smoke test (tests/wire_smoke.rs) uses a hand-rolled TcpListener rather than a mock crate — no extra dependencies and the assertions cover both the JWT header and the JSON-RPC body shape.
  • Both-or-neither CLI enforcement via #[arg(requires = ...)] is clean and avoids partial configuration.
  • hex_u256 serializer correctly handles the all-zeros case ("0x0") and strips leading zero nibbles per the QUANTITY spec.
  • ethrex-client as a separate crate (not folded into crates/net/rpc) is the right call given the orthogonal dependency footprint.

Summary

The transport and auth foundations are solid. Two correctness issues need fixing before M6 exercises the types:

  • Item 1 (InvalidBlockHash rename) is a compile-silent deserialization bug that will manifest in any network that sends that status.
  • Item 2 ([u8; 20] address serialization) is a wire-format bug dormant in M4 but guaranteed to surface in M6; fixing it now is cheaper than discovering it mid-integration.

Items 3–6 are non-blocking improvements.


Automated review by Claude (Anthropic) · sonnet · custom prompt

Comment thread crates/blockchain/src/lib.rs Outdated
Comment on lines +234 to +237
let state = ForkChoiceState {
head_block_hash: head,
safe_block_hash: safe,
finalized_block_hash: finalized,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is wrong

pablodeymo added 11 commits May 14, 2026 18:41
…d FCU

- types.rs: PayloadStatusKind now uses SCREAMING_SNAKE_CASE so
  `InvalidBlockHash` round-trips as `INVALID_BLOCK_HASH` (was
  `INVALIDBLOCKHASH`, which would silently fail to deserialize from any
  spec-compliant EL).
- types.rs: PayloadId serializes/deserializes as a hex DATA string
  (`"0x..."`) instead of `[serde(transparent)]` over `[u8; 8]` (which
  emitted a JSON integer array).
- types.rs: Added `hex_address` serde helper and applied it to
  `PayloadAttributesV3.suggested_fee_recipient`,
  `Withdrawal.address`, and `ExecutionPayloadV3.fee_recipient` —
  previously these `[u8; 20]` fields were emitted as integer arrays
  rather than the spec-required hex DATA strings.
- types.rs: `hex_u256::deserialize` now returns a serde error on >32-byte
  input rather than panicking via `copy_from_slice`.
- client.rs: HTTP responses now run through `.error_for_status()` before
  body parsing so 401/403/5xx surface as `EngineClientError::Transport`
  with a readable message instead of `DeserializeResponse`.
- blockchain/lib.rs: `notify_execution_layer` now sends `H256::ZERO` for
  head/safe/finalized instead of beacon roots. Beacon roots are not EL
  block hashes; passing them confuses the EL into syncing to garbage.
  Zero is the spec-friendly "unknown head" sentinel until Lean blocks
  carry an executionPayload.
- bin/ethlambda/main.rs: Fixed misleading warn log — the capability
  handshake is one-shot at startup, not retried on each tick.
- docs/plans/engine-api-integration.md: Replaced absolute local
  filesystem paths with GitHub URLs.

Added unit tests for each bug fix (6 new tests, 16 total in
ethrex-client lib). All targeted tests pass, `cargo fmt --all -- --check`
clean, `cargo clippy --workspace --all-targets -- -D warnings` clean.
  Phase 1a of the M6 plan (docs/plans/engine-api-integration.md, also
  updated in this commit). The canonical block-component types —
  ExecutionPayloadV3, Withdrawal, HexBytes, and the hex_* serde helpers —
  move from the engine-client crate into the foundational types crate,
  where the Lean BlockBody can later embed them directly.

  ethlambda-ethrex-client's public API stays stable through re-exports.
  No SSZ derives yet; those land in Phase 2 alongside the BlockBody embed.
  Phase 2a of the M6 plan. Adds SszEncode/SszDecode/HashTreeRoot derives
  so the canonical execution-payload types can later be embedded in
  BlockBody and State (Phase 2c).

  Variable-length list fields move to bounded SSZ types — the spec
  requires limits at compile time for merkle tree layout:
    extra_data:   Vec<u8>          → ByteList<MAX_EXTRA_DATA_BYTES>
    transactions: Vec<HexBytes>    → SszList<ByteList<MAX_BYTES_PER_TX>, MAX_TXS>
    withdrawals:  Vec<Withdrawal>  → SszList<Withdrawal, 16>

  Fixed-size byte fields move to plain arrays:
    logs_bloom: Vec<u8> → [u8; 256]

  JSON wire format is preserved byte-for-byte through new helper modules
  (byte_list_hex, hex_bytes_fixed, transactions_serde, withdrawals_serde).
  HexBytes is removed; its role is subsumed by ByteList<N> plus the new
  transactions serde wrapper.

  Manual Default impl on ExecutionPayloadV3: stdlib only auto-derives
  Default for arrays up to length 32, and logs_bloom is 256 bytes.

  Verified: 32 ethlambda-types tests pass (new SSZ + JSON roundtrips
  check hash_tree_root consistency across both encodings); 12
  ethrex-client lib tests pass; fmt clean; clippy clean.
  Phase 2b of the M6 plan. Adds the cached projection that the consensus
  state will carry between blocks (Capella+Deneb spec): every fixed-size
  field copies from the payload verbatim, and the two variable-length
  lists (transactions, withdrawals) collapse to their SSZ hash-tree roots
  so the header itself stays bounded.

    ExecutionPayloadV3::to_header() — explicit method
    From<&ExecutionPayloadV3> for ExecutionPayloadHeader — sugar
    Default — manual (same [u8; 256] reason as ExecutionPayloadV3)

  Genesis convention: ExecutionPayloadHeader::default() is all-zeros.
  The first Lean block carrying a real payload will assert its
  parent_hash matches state.latest_execution_payload_header.block_hash —
  which is H256::ZERO at genesis. Subsequent blocks chain forward
  normally.

  35 ethlambda-types tests pass (3 new: header default, header
  SSZ+JSON roundtrip, to_header projects transactions/withdrawals to
  their hash tree roots and copies every other field verbatim).
…reak)

  Phase 2c of the M6 plan. Adds the canonical execution-payload fields
  to the two SSZ containers that block import and STF revolve around:

    BlockBody { attestations,
                execution_payload: ExecutionPayloadV3 }

    State   { ...,
              latest_execution_payload_header: ExecutionPayloadHeader }

  State::from_genesis seeds the header all-zero so the first non-genesis
  blocks payload must have parent_hash = H256::ZERO to be accepted —
  clean genesis convention without pinning a real EL block hash.

  This is the schema-breaking commit in the M6 sequence. Hash tree roots
  for BlockBody, Block, and State all change. Consequences:

    * Pinned genesis state_root + block_root unit test in
      crates/common/types/src/genesis.rs updated to the new values.

    * Every fixture-driven spec test that exercises these containers is
      gated behind a FIXTURES_AWAIT_M6_REGEN: bool = true flag at the
      top of its fn run(). To re-enable a group, flip the flag and
      make leanSpec/fixtures after upstream lands the schema. Groups
      gated wholesale: forkchoice_spectests (84 cases), stf_spectests
      (49 cases), signature_spectests (11 cases). The ssz_spectests
      dispatch skips only the BlockBody / Block / State / SignedBlock
      arms (127 unrelated cases keep running).

    * All other workspace tests pass; ethlambda-types lib tests still
      cover ExecutionPayloadV3 / Header SSZ + JSON roundtrips.

  Trade-off taken (vs. cargo feature flag, see plan doc Phase 7): the
  fixture skips are explicit and tracked in code, but a real cargo
  feature would have inflated every BlockBody / State construction with
  cfg pollution. The feature-flag alternative was rejected in favor of
  the localized skip.

  Phase 2d will wire process_execution_payload into the STF —
  parent_hash and timestamp assertions per the Capella spec.
  Phase 2d of the M6 plan, closing out Phase 2. The STF now enforces the
  Capella-style payload assertions Pablo flagged on PR #367:

    1. payload.parent_hash == state.latest_execution_payload_header.block_hash
    2. payload.timestamp   == compute_time_at_slot(slot)

  Both checks run inside process_block between header processing and
  attestation processing — same ordering as the spec. On success the
  new payloads header is cached onto state so the next block can chain
  forward. New error variants InvalidPayloadParentHash and
  InvalidPayloadTimestamp.

  Omitted vs. the spec, by design:

    * verify_and_notify_new_payload (engine_newPayloadV3 roundtrip):
      lives in the blockchain actor (Phase 3). The STF runs in-process,
      fork-choice testing, and spec-test harness contexts — none want a
      network call.

    * prev_randao: Lean state has no randao mix and leanSpec hasnt
      defined one. Re-add when upstream lands the field.

    * SECONDS_PER_SLOT is a duplicate const that must track
      ethlambda_blockchain::MILLISECONDS_PER_SLOT (currently 4000).
      state_transition cant depend on blockchain (wrong direction) and
      the millisecond resolution is wasted in STF. Documented in the
      consts doc comment.

  To keep non-EL proposers minting valid blocks until Phase 4 wires
  engine_getPayloadV3, build_block now calls a synthetic_payload
  helper that fills in (parent_hash, timestamp) deterministically from
  state. Phase 4 will swap this for the real EL response when an
  endpoint is configured. The 20 existing blockchain lib tests
  (including the two build_block tests) continue to pass.

  4 new state_transition unit tests cover the happy path, parent-hash
  mismatch, timestamp mismatch, and a two-block chain-forward case.
…oadV3

  Phase 3 of the M6 plan. Every block arriving over the network — gossip
  or BlocksByRoot req-resp — now passes through the EL before fork-choice
  insertion. The Handler\<NewBlock\> in BlockChainServer awaits
  engine_newPayloadV3(payload, \[\], H256::ZERO) and drops the block on
  explicit INVALID / INVALID_BLOCK_HASH verdicts; VALID, SYNCING, and
  ACCEPTED all proceed to the existing on_block sync path. Transport
  failures are permissive — same policy as notify_execution_layer:
  warn-and-accept so EL flakes cant gridlock consensus.

  Design notes:

    * The EL call lives in the actors async handler, not in
      store::on_block. Keeping the store layer sync preserves the
      on_block_without_verification seam that fork-choice spec tests
      rely on, and avoids fanning async fn across the whole import
      pipeline. The validate_payload_with_el helper is private to the
      actor.

    * Own-built blocks (proposer path at line 453) bypass the pre-check
      intentionally — they were either built from a real
      engine_getPayloadV3 response (Phase 4, future) or via the
      synthetic_payload helper, neither of which the EL needs to
      re-validate.

    * Pending children inherit validation: every block enters via
      Handler<NewBlock>, so anything in pending_blocks already
      passed the EL once. The cascade processing at line 610 stays
      sync.

    * The V3 calls last two params — expected_blob_versioned_hashes
      and parent_beacon_block_root — are stubbed to vec![] and
      H256::ZERO. Lean blocks dont carry blob transactions or a
      beacon-root analogue yet; refine when those land.

  Phase 4 next will replace synthetic_payload in build_block with the
  real engine_getPayloadV3 response when an EL endpoint is configured.
  Phase 4 of the M6 plan. Closes the proposer loop end-to-end when an EL
  endpoint is configured:

    interval 4 of slot N-1:
      request_payload_id_for_next_slot(N-1) — if any of our validators
      will propose at slot N, fire engine_forkchoiceUpdatedV3 with
      PayloadAttributesV3 (correct slot-timestamp). Stash the returned
      payload_id.

    interval 0 of slot N:
      take_prepared_payload(N) — pop the stashed id, call
      engine_getPayloadV3, parse executionPayload, hand to
      produce_block_with_signatures. build_block embeds it directly into
      BlockBody.execution_payload; STFs process_execution_payload from
      Phase 2d then enforces parent_hash + timestamp.

  Fallback paths (any of which trigger the Phase 2d synthetic_payload):

    * no EL configured
    * we didnt queue a build (interval-4 path skipped)
    * EL was syncing at interval 4 (payload_id = None on the FCU response)
    * stashed slot doesnt match the proposal slot (we skipped a tick)
    * engine_getPayloadV3 transport / parse failure

  API touches:

    * EngineClient::get_payload_v3 now returns ExecutionPayloadV3 directly
      (extracts executionPayload from the envelope; drops blobsBundle
      and blockValue for now).
    * produce_block_with_signatures and the private build_block take an
      Option<ExecutionPayloadV3>. None → synthesize.
    * propose_block is now async (was sync) and accepts the optional
      payload. The two on_tick call sites adjust accordingly.
    * New BlockChainServer field pending_payload_id: Option<(u64, PayloadId)>.

  What still wont work end-to-end until M6 is fully complete:

    * notify_execution_layer still sends H256::ZERO for head/safe/finalized
      (Phase 5). Until that goes, the EL has no idea what we consider the
      canonical head and may stay in SYNCING.
    * suggested_fee_recipient and prev_randao are hardcoded zero. A real
      devnet needs CLI flags / RANDAO accumulation.
    * Lean blocks still dont propagate blob transactions or a meaningful
      parent_beacon_block_root.

  These are the next phase-5/6/7 items in docs/plans/engine-api-integration.md.

  All workspace tests pass; wire_smoke is sandbox-only.
… draft

  Phase 7 of the M6 plan, closing out the in-repo work:

    * build_block_embeds_provided_execution_payload — unit test in
      crates/blockchain/src/store.rs. Confirms that when produce_block_with_signatures
      is handed an engine_getPayloadV3-style ExecutionPayloadV3, build_block
      embeds it verbatim (block_hash + full hash_tree_root match) instead of
      falling back to synthetic_payload. The other M6-related units are
      already covered: process_execution_payloads parent/timestamp guards in
      state_transition (Phase 2d) and the ExecutionPayloadV3/Header SSZ + JSON
      roundtrips in ethlambda-types (Phases 2a/2b).

    * docs/plans/lean-execution-payload-schema.md — draft of the leanSpec
      issue proposing the schema upstream. Frames the missing-EL-payload
      problem, the canonical-V3 mirror choice, the genesis convention, and
      enumerates ethlambdas reference commits as proof of feasibility.
      File this verbatim on leanSpec when ready.

    * docs/plans/engine-api-integration.md — Phase 7 section now reflects
      what landed vs whats deferred. The two EL-mocked tests
      (on_block_rejects_when_el_says_invalid /
      notify_execution_layer_sends_real_hashes_after_first_block) sit behind
      an EngineClient-trait-abstraction refactor that isnt worth blocking
      on for this PR; tracked as follow-up in the same section.

  21 blockchain lib tests pass (was 20). fmt + clippy clean.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants