feat(local-review): add fixed-claude CLI engine via --engine by anagnorisis2peripeteia · Pull Request #309 · openclaw/clawsweeper

anagnorisis2peripeteia · 2026-06-16T21:37:29Z

What

local-review --engine claude — drive a fixed claude CLI through the same bounded runner as codex. runCodexProcess gains an optional command param (the worker already takes one internally), so a second fixed review CLI reuses the existing process/retry/capture path. codex stays the default. Provider-neutral, zero new dependencies (package.json unchanged).

pnpm local-review -- --base main                 # codex (default)
pnpm local-review -- --base main --engine claude # fixed `claude` CLI

Addresses the #274 close, point by point

fixed trusted executable boundary → fixed claude/codex; no --claude-bin / no operator-configurable binary
minimal allowlisted environment → codexEnv() (scrubs GitHub/OpenAI/app creds, keeps the OS env + the engine's own auth)
shared process capture/retry handling → the existing runCodexProcess worker, now command-agnostic
identical final schema validation → same failureReport / stripMarkdownFence / timestamp path as codex
provider-neutral committed-range review + unique temporary state → inherited from the merged local-review foundation
no SDK dependency → CLI subprocess, not the Agent SDK

Auth

The claude engine uses the caller's existing Claude auth. For a subscription, set CLAUDE_CODE_OAUTH_TOKEN (from claude setup-token) and leave ANTHROPIC_API_KEY unset.

Verification

pnpm run build, lint (type-aware), oxfmt, and the local-review test suite (8 tests, incl. a new unknown---engine guard) all pass.
codex path is fully covered by the existing tests — the runner change is backward-compatible (default still resolves codex).
The claude execution path is now run-verified via a meta self-review: I ran local-review --engine claude on this very branch (subscription auth, ANTHROPIC_API_KEY unset). It executed end-to-end through the shared runner, read the changed files in full, and produced a schema-valid report each iteration. The loop also caught — and I fixed — four real bugs in the engine itself before converging clean (see below), which doubles as evidence the lane actually reviews rather than rubber-stamps.

Proof — claude engine reviewing its own PR

local-review --engine claude --base origin/main run repeatedly against this branch as it hardened. Each run produced a valid report (front matter + result/confidence/highest_severity); the findings drove the fix commits:

run	result	severity	finding → fix
1	findings	(3)	claude lane couldn't see the diff / 64 KiB stdout tail truncated front matter / spawn-error gave no detail → `b4ea2701` (diff in prompt, full-stdout capture, error fallback)
2	findings	medium	`--allowedTools "Read Grep Glob"` was one space-joined token → no tools enabled in `-p` mode → `d0d62765` (comma-separate `Read,Grep,Glob`)
3	findings	low	whole-branch `git diff` buffered host-side could exceed `maxBuffer` and crash before any report → `36239522` (try/catch → `failureReport`)
4	nothing_found	none	clean, high confidence — reviewer read 9 files in full and traced every spawn path

The final clean run (36239522, result: nothing_found, confidence: high) verifies the fixed binary, the env scrub, full-stdout capture, and engine-validation ordering by name. Run via ambient subscription credentials; no API key, no token minted.

Not included

The Agent-SDK approach (it pulls many deps — wrong fit for this zero-dep tool).
Any configurable binary or second-provider scope beyond the engine switch.

Generalizes runCodexProcess with an optional command param so the same bounded runner drives a fixed `claude` CLI; adds --engine codex|claude (codex default). claude engine: fixed binary (no operator override), codexEnv() scrub + read-only tools, subscription auth via CLAUDE_CODE_OAUTH_TOKEN. Provider-neutral, zero new deps. Inherits openclaw#298's committed-range/clean-checkout/unique-dir/reject-unsupported. Addresses the openclaw#274 close: fixed boundary, shared runner, scrubbed env, identical schema, no SDK deps, no configurable binary.

anagnorisis2peripeteia · 2026-06-16T21:53:42Z

@clawsweeper review

clawsweeper · 2026-06-16T21:53:45Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/27650688627
Updated: 2026-06-16T22:21:08.794Z

clawsweeper · 2026-06-16T22:16:04Z

Codex review: needs maintainer review before merge. Reviewed June 17, 2026, 3:15 AM ET / 07:15 UTC.

Summary
The PR adds local-review --engine codex|claude, routes --engine claude through the shared bounded subprocess runner with a fixed claude command, documents Claude auth, and adds an invalid-engine test.

Reproducibility: not applicable. this is a new local-review provider feature, not a broken existing behavior. The verification path is an authenticated local-review --engine claude run, and the PR body now reports that path running end to end on this branch.

Review metrics: 2 noteworthy metrics.

Changed surface: 4 files, +162/-6. The diff is compact but spans runtime subprocess selection, CLI behavior, docs, and tests.
New engine path: 1 local-review engine added. The provider executable path is the main maintainer decision surface, not the raw line count.

Root-cause cluster
Relationship: canonical
Canonical: #309
Summary: This PR is the current open canonical candidate for the Claude local-review engine; the older Claude attempt is closed unmerged and the merged local-review foundation is adjacent prerequisite work.

Members:

superseded: feat(local-review): add claude CLI (-p) review engine (alternative to codex) #274 - The older Claude-engine PR is closed unmerged and this PR narrows that design to a fixed executable with the shared runner.
adjacent_distinct: feat(local-review): offline committed-range branch review on Commit Sweeper #298 - The merged PR added the local-review foundation that this PR extends, but it did not add a Claude engine.

Proposal only: this assessment does not dispatch repair, suppress jobs, mutate sibling items, close, or merge anything.

Merge readiness
Overall: 🐚 platinum hermit
Proof: 🐚 platinum hermit
Patch quality: 🐚 platinum hermit
Result: ready for maintainer review.

Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.

Rank-up moves:

[P2] Maintainers should explicitly accept or reject supporting claude as a fixed local-review provider before merge.

Risk before merge

[P1] Merging intentionally adds a second trusted local review provider executable that receives repository review context and the caller's local Claude authentication.
[P1] The new engine path depends on an installed, authenticated claude CLI, and green repository CI does not exercise that external provider runtime path.
[P1] Maintainers still need to decide whether supporting Claude as a first-class local-review engine fits the ClawSweeper automation boundary.

Maintainer options:

Accept The Fixed Claude Boundary
Maintainers can merge after explicitly accepting claude as a supported local-review provider that runs through the shared bounded subprocess path.
Keep Codex-Only For Now
If maintainers are not ready to own a second local provider, pause or close this PR while preserving the proof and design notes for a future provider decision.

Next step before merge

[P2] Manual review is appropriate because the remaining blocker is maintainer acceptance of a new provider CLI and security boundary, not a narrow automated code repair.

Security
Needs attention: The diff is intentionally security-sensitive because it adds a new trusted local provider executable that receives repository context and local Claude auth.

Review details

Best possible solution:

Land the fixed-binary Claude engine only if maintainers accept it as a supported local-review provider while preserving Codex as the default and keeping the offline local-review credential boundaries intact.

Do we have a high-confidence way to reproduce the issue?

Not applicable: this is a new local-review provider feature, not a broken existing behavior. The verification path is an authenticated local-review --engine claude run, and the PR body now reports that path running end to end on this branch.

Is this the best way to solve the issue?

Yes for the technical shape, pending product acceptance: the fixed executable and shared runner are narrower than the closed earlier Claude PR. The remaining question is whether maintainers want to support this second local provider boundary in core.

AGENTS.md: found and applied where relevant.

Codex review notes: model internal, reasoning high; reviewed against c88270b81889.

Label changes

Label changes:

add proof: sufficient: Contributor real behavior proof is sufficient. The PR body now includes after-fix real run proof for local-review --engine claude, including repeated self-review runs and a final schema-valid nothing_found result at the current head.
add rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
add status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (live_output): The PR body now includes after-fix real run proof for local-review --engine claude, including repeated self-review runs and a final schema-valid nothing_found result at the current head.
remove rating: 🧂 unranked krab: Current PR rating is rating: 🐚 platinum hermit, so this older rating label is no longer current.
remove status: 📣 needs proof: Current PR status label is status: ⏳ waiting on author.

Label justifications:

P2: This is a normal-priority local-review feature with bounded user impact but meaningful automation and provider-boundary review needs.
merge-risk: 🚨 automation: The PR changes local-review subprocess selection and adds an external provider runtime path that CI cannot fully prove.
merge-risk: 🚨 security-boundary: The fixed claude executable would receive repository review context and local Claude authentication as a newly supported trusted provider boundary.
rating: 🐚 platinum hermit: Overall readiness is 🐚 platinum hermit; proof is 🐚 platinum hermit and patch quality is 🐚 platinum hermit.
status: ⏳ waiting on author: ClawSweeper has contributor-facing work open and is waiting for author action. Sufficient (live_output): The PR body now includes after-fix real run proof for local-review --engine claude, including repeated self-review runs and a final schema-valid nothing_found result at the current head.
proof: sufficient: Contributor real behavior proof is sufficient. The PR body now includes after-fix real run proof for local-review --engine claude, including repeated self-review runs and a final schema-valid nothing_found result at the current head.

Evidence reviewed

Security concerns:

[medium] Accept the trusted Claude CLI boundary — src/commit-sweeper.ts:420
runClaudeReview invokes a fixed claude executable in the target repository with the full review prompt and caller Claude auth; codexEnv() and read-only tool args reduce risk, but maintainers still need to accept this new provider trust boundary.
Confidence: 0.86

What I checked:

AGENTS.md policy read: Repository policy was read fully and applied because this PR changes ClawSweeper automation and subprocess behavior. (AGENTS.md:1, c88270b81889)
Current main lacks the requested engine: Current main's local-review path still calls the Codex runner directly and has no --engine or claude matches in src/commit-sweeper.ts. (src/commit-sweeper.ts:487, c88270b81889)
Latest release lacks the requested engine: The v0.3.0 release tag has the local-review command but no --engine or claude implementation, so the requested capability has not shipped. (src/commit-sweeper.ts:487, dc824915bb6c)
PR implementation: PR head adds runClaudeReview, embeds the range diff for the read-only Claude lane, invokes fixed command claude, passes Read,Grep,Glob, and uses codexEnv(). (src/commit-sweeper.ts:360, 36239522d6a1)
Shared runner change: PR head adds an optional command override to runCodexProcess while preserving Codex as the default command. (src/codex-process.ts:82, 36239522d6a1)
Docs and validation coverage: PR head documents the fixed Claude engine and adds a guard test for unknown --engine values. (docs/commit-sweeper.md:276, 36239522d6a1)

Likely related people:

steipete: Merged the local-review foundation in feat(local-review): offline committed-range branch review on Commit Sweeper #298 and authored the final local-review hardening commits before v0.3.0. (role: recent merger and adjacent owner; confidence: high; commits: d877d4e7e095, 2ac70249fb45, 2506d2124bfa; files: src/commit-sweeper.ts, src/codex-process.ts, docs/commit-sweeper.md)
anagnorisis2peripeteia: Authored the merged local-review foundation commits in feat(local-review): offline committed-range branch review on Commit Sweeper #298 and the earlier Claude-engine exploration in feat(local-review): add claude CLI (-p) review engine (alternative to codex) #274, so this is prior area work rather than only this proposal. (role: recent feature contributor; confidence: high; commits: b37fefda8017, bd33d7164428, c22dd1ea81cc; files: src/commit-sweeper.ts, test/local-review.test.ts, docs/commit-sweeper.md)

What the crustacean ranks mean

🦀 challenger crab: rare, exceptional readiness with strong proof, clean implementation, and convincing validation.
🦞 diamond lobster: very strong readiness with only minor maintainer review expected.
🐚 platinum hermit: good normal PR, likely mergeable with ordinary maintainer review.
🦐 gold shrimp: useful signal, but proof or patch confidence is still limited.
🦪 silver shellfish: thin signal; proof, validation, or implementation needs work.
🧂 unranked krab: not merge-ready because proof is missing/unusable or there are serious correctness or safety concerns.
🌊 off-meta tidepool: rating does not apply to this item.

Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.

How this review workflow works

ClawSweeper keeps one durable marker-backed review comment per issue or PR.
Re-runs edit this comment so the latest verdict, findings, and automation markers stay together instead of adding duplicate bot comments.
A fresh review can be triggered by eligible @clawsweeper re-review comments, exact-item GitHub events, scheduled/background review runs, or manual workflow dispatch.
PR/issue authors and users with repository write access can comment @clawsweeper re-review or @clawsweeper re-run on an open PR or issue to request a fresh review only.
Maintainers can also comment @clawsweeper review to request a fresh review only.
Fresh-review commands do not start repair, autofix, rebase, CI repair, or automerge.
Maintainer-only repair and merge flows require explicit commands such as @clawsweeper autofix, @clawsweeper automerge, @clawsweeper fix ci, or @clawsweeper address review.
Maintainers can comment @clawsweeper explain to ask for more context, or @clawsweeper stop to stop active automation.

Move the --engine codex|claude validation to the top of localReviewCommand so an invalid engine fails fast before any run state (run dir, GH_CONFIG_DIR) is created. Addresses the P3 review finding on openclaw#309.

anagnorisis2peripeteia · 2026-06-16T22:24:53Z

@clawsweeper re-review

clawsweeper · 2026-06-16T22:24:57Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/27652207890
Updated: 2026-06-16T22:48:38.559Z

…full stdout, error fallback) Found by local-review --engine claude reviewing its own PR. (1) Embed the full committed-range diff in the claude prompt — the read-only reviewer can't run git to see the change (codex's sandbox can). (2) Read the full captured stdout file instead of the 64 KiB tail, so a large report's YAML front matter isn't truncated. (3) Fall back to stdout in failure reports when stderr is empty (mirrors codex).

"Read Grep Glob" was passed as a single space-joined token (one bogus rule → no tools enabled in -p mode). Comma-separate so Read/Grep/Glob parse as three rules and the reviewer can actually open repo files. Found by the claude meta self-review.

…d branches run() buffers the whole diff host-side; a >maxBuffer whole-branch diff threw before the 256KiB cap applied, aborting the process with no report. Wrap in try/catch -> failureReport, matching the lane's graceful-degradation pattern. Found by the claude meta self-review.

anagnorisis2peripeteia · 2026-06-17T06:09:07Z

@clawsweeper re-review

clawsweeper · 2026-06-17T06:09:11Z

🦞🧹
ClawSweeper re-review requested.

I asked ClawSweeper to review this item again.
Action: item re-review queued (workflow sweep.yml, event repository_dispatch).
Result: the existing ClawSweeper review comment will be edited in place when the review finishes.

Re-review progress:

State: Complete
Detail: The targeted re-review finished, the durable review comment was updated, and the synced verdict was routed.
Run: https://github.com/openclaw/clawsweeper/actions/runs/27671147549
Updated: 2026-06-17T07:16:14.514Z

anagnorisis2peripeteia · 2026-06-17T06:48:28Z

@clawsweeper re-review

steipete · 2026-06-19T06:59:27Z

Closing for now because we are not adding a Claude execution provider yet. The implementation was thoughtful and can be revisited if provider expansion becomes explicit product direction. Thank you for the work and validation.

fix(local-review): validate --engine before creating run state

97f579d

Move the --engine codex|claude validation to the top of localReviewCommand so an invalid engine fails fast before any run state (run dir, GH_CONFIG_DIR) is created. Addresses the P3 review finding on openclaw#309.

anagnorisis2peripeteia added 3 commits June 17, 2026 06:49

anagnorisis2peripeteia marked this pull request as ready for review June 17, 2026 06:48

steipete closed this Jun 19, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(local-review): add fixed-claude CLI engine via --engine#309

feat(local-review): add fixed-claude CLI engine via --engine#309
anagnorisis2peripeteia wants to merge 5 commits into
openclaw:mainfrom
anagnorisis2peripeteia:feat/local-review-engines

anagnorisis2peripeteia commented Jun 16, 2026 •

edited

Loading

Uh oh!

anagnorisis2peripeteia commented Jun 16, 2026

Uh oh!

clawsweeper Bot commented Jun 16, 2026 •

edited

Loading

Uh oh!

clawsweeper Bot commented Jun 16, 2026 •

edited

Loading

Uh oh!

anagnorisis2peripeteia commented Jun 16, 2026

Uh oh!

clawsweeper Bot commented Jun 16, 2026 •

edited

Loading

Uh oh!

anagnorisis2peripeteia commented Jun 17, 2026

Uh oh!

clawsweeper Bot commented Jun 17, 2026 •

edited

Loading

Uh oh!

anagnorisis2peripeteia commented Jun 17, 2026

Uh oh!

steipete commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

anagnorisis2peripeteia commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

Addresses the #274 close, point by point

Auth

Verification

Proof — claude engine reviewing its own PR

Not included

Uh oh!

anagnorisis2peripeteia commented Jun 16, 2026

Uh oh!

clawsweeper Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

clawsweeper Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

anagnorisis2peripeteia commented Jun 16, 2026

Uh oh!

clawsweeper Bot commented Jun 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

anagnorisis2peripeteia commented Jun 17, 2026

Uh oh!

clawsweeper Bot commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

anagnorisis2peripeteia commented Jun 17, 2026

Uh oh!

steipete commented Jun 19, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

anagnorisis2peripeteia commented Jun 16, 2026 •

edited

Loading

clawsweeper Bot commented Jun 16, 2026 •

edited

Loading

clawsweeper Bot commented Jun 16, 2026 •

edited

Loading

clawsweeper Bot commented Jun 16, 2026 •

edited

Loading

clawsweeper Bot commented Jun 17, 2026 •

edited

Loading