feat(local-review): add fixed-claude CLI engine via --engine#309
feat(local-review): add fixed-claude CLI engine via --engine#309anagnorisis2peripeteia wants to merge 5 commits into
Conversation
Generalizes runCodexProcess with an optional command param so the same bounded runner drives a fixed `claude` CLI; adds --engine codex|claude (codex default). claude engine: fixed binary (no operator override), codexEnv() scrub + read-only tools, subscription auth via CLAUDE_CODE_OAUTH_TOKEN. Provider-neutral, zero new deps. Inherits openclaw#298's committed-range/clean-checkout/unique-dir/reject-unsupported. Addresses the openclaw#274 close: fixed boundary, shared runner, scrubbed env, identical schema, no SDK deps, no configurable binary.
|
@clawsweeper review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
Codex review: needs maintainer review before merge. Reviewed June 17, 2026, 3:15 AM ET / 07:15 UTC. Summary Reproducibility: not applicable. this is a new local-review provider feature, not a broken existing behavior. The verification path is an authenticated Review metrics: 2 noteworthy metrics.
Root-cause cluster Members:
Proposal only: this assessment does not dispatch repair, suppress jobs, mutate sibling items, close, or merge anything. Merge readiness Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch. Rank-up moves:
Risk before merge
Maintainer options:
Next step before merge
Security Review detailsBest possible solution: Land the fixed-binary Claude engine only if maintainers accept it as a supported local-review provider while preserving Codex as the default and keeping the offline local-review credential boundaries intact. Do we have a high-confidence way to reproduce the issue? Not applicable: this is a new local-review provider feature, not a broken existing behavior. The verification path is an authenticated Is this the best way to solve the issue? Yes for the technical shape, pending product acceptance: the fixed executable and shared runner are narrower than the closed earlier Claude PR. The remaining question is whether maintainers want to support this second local provider boundary in core. AGENTS.md: found and applied where relevant. Codex review notes: model internal, reasoning high; reviewed against c88270b81889. Label changesLabel changes:
Label justifications:
Evidence reviewedSecurity concerns:
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics. How this review workflow works
|
Move the --engine codex|claude validation to the top of localReviewCommand so an invalid engine fails fast before any run state (run dir, GH_CONFIG_DIR) is created. Addresses the P3 review finding on openclaw#309.
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
…full stdout, error fallback) Found by local-review --engine claude reviewing its own PR. (1) Embed the full committed-range diff in the claude prompt — the read-only reviewer can't run git to see the change (codex's sandbox can). (2) Read the full captured stdout file instead of the 64 KiB tail, so a large report's YAML front matter isn't truncated. (3) Fall back to stdout in failure reports when stderr is empty (mirrors codex).
"Read Grep Glob" was passed as a single space-joined token (one bogus rule → no tools enabled in -p mode). Comma-separate so Read/Grep/Glob parse as three rules and the reviewer can actually open repo files. Found by the claude meta self-review.
…d branches run() buffers the whole diff host-side; a >maxBuffer whole-branch diff threw before the 256KiB cap applied, aborting the process with no report. Wrap in try/catch -> failureReport, matching the lane's graceful-degradation pattern. Found by the claude meta self-review.
|
@clawsweeper re-review |
|
🦞🧹 I asked ClawSweeper to review this item again. Re-review progress:
|
|
@clawsweeper re-review |
|
Closing for now because we are not adding a Claude execution provider yet. The implementation was thoughtful and can be revisited if provider expansion becomes explicit product direction. Thank you for the work and validation. |
What
local-review --engine claude— drive a fixedclaudeCLI through the same bounded runner as codex.runCodexProcessgains an optionalcommandparam (the worker already takes one internally), so a second fixed review CLI reuses the existing process/retry/capture path. codex stays the default. Provider-neutral, zero new dependencies (package.jsonunchanged).Addresses the #274 close, point by point
claude/codex; no--claude-bin/ no operator-configurable binarycodexEnv()(scrubs GitHub/OpenAI/app creds, keeps the OS env + the engine's own auth)runCodexProcessworker, now command-agnosticfailureReport/stripMarkdownFence/ timestamp path as codexAuth
The claude engine uses the caller's existing Claude auth. For a subscription, set
CLAUDE_CODE_OAUTH_TOKEN(fromclaude setup-token) and leaveANTHROPIC_API_KEYunset.Verification
pnpm run build, lint (type-aware),oxfmt, and the local-review test suite (8 tests, incl. a new unknown---engineguard) all pass.local-review --engine claudeon this very branch (subscription auth,ANTHROPIC_API_KEYunset). It executed end-to-end through the shared runner, read the changed files in full, and produced a schema-valid report each iteration. The loop also caught — and I fixed — four real bugs in the engine itself before converging clean (see below), which doubles as evidence the lane actually reviews rather than rubber-stamps.Proof — claude engine reviewing its own PR
local-review --engine claude --base origin/mainrun repeatedly against this branch as it hardened. Each run produced a valid report (front matter +result/confidence/highest_severity); the findings drove the fix commits:b4ea2701(diff in prompt, full-stdout capture, error fallback)--allowedTools "Read Grep Glob"was one space-joined token → no tools enabled in-pmode →d0d62765(comma-separateRead,Grep,Glob)git diffbuffered host-side could exceedmaxBufferand crash before any report →36239522(try/catch →failureReport)The final clean run (
36239522,result: nothing_found,confidence: high) verifies the fixed binary, the env scrub, full-stdout capture, and engine-validation ordering by name. Run via ambient subscription credentials; no API key, no token minted.Not included