Fix executor QR: split broken combined modules into decompose+verify#20
Fix executor QR: split broken combined modules into decompose+verify#20Sapd wants to merge 4 commits into
Conversation
The executor referenced non-existent quality_reviewer/impl-code-qr.py and impl-docs-qr.py modules. These were split into separate _decompose.py and _verify.py files (matching planner's pattern) but the executor was never updated. Changes: - Rewrite executor from 9 to 10 steps (matching EXECUTOR_TOTAL_STEPS and phases.py step numbering) - Split step 4 (Code QR) into: decompose (3) + verify (4) + gate (5) - Split step 7 (Doc QR) into: decompose (7) + verify (8) + gate (9) - Add --state-dir CLI arg (created in step 1, passed through all steps) - Use phases.py config for script paths instead of hardcoded mode_script - Reuse planner's QR patterns: qr_file_exists skip, parallel verify dispatch with template_dispatch, build_gate_output for gates - Fold reconciliation into step 1 (--reconciliation-check flag)
Review Summary by QodoFix executor QR: split broken combined modules into decompose+verify pattern
WalkthroughsDescription• Rewrite executor from 9 to 10 steps, splitting QR modules into decompose/verify/gate pattern • Replace broken combined QR modules with separate decompose and verify scripts • Add --state-dir CLI argument created in step 1, passed through all steps • Use get_phase_config() to resolve script paths from phases.py instead of hardcoded references • Reuse planner's QR patterns: skip logic, parallel verify dispatch, gate routing • Fold reconciliation into step 1 via --reconciliation-check flag • Derive fix mode from QR state files instead of relying solely on --qr-fail flag Diagramflowchart LR
A["Step 1: Planning<br/>+ state_dir setup"] --> B["Step 2: Implementation<br/>wave-aware dispatch"]
B --> C["Step 3: Code QR<br/>Decompose"]
C --> D["Step 4: Code QR<br/>Verify parallel"]
D --> E["Step 5: Code QR<br/>Gate route"]
E -->|pass| F["Step 6: Documentation<br/>TW dispatch"]
E -->|fail| B
F --> G["Step 7: Doc QR<br/>Decompose"]
G --> H["Step 8: Doc QR<br/>Verify parallel"]
H --> I["Step 9: Doc QR<br/>Gate route"]
I -->|pass| J["Step 10: Retrospective"]
I -->|fail| F
File Changes1. skills/scripts/skills/planner/orchestrator/executor.py
|
Code Review by Qodo
✅ 1.
|
| qr_state = load_qr_state(state_dir, phase) | ||
| if not qr_state or "items" not in qr_state: | ||
| body = f"Error: qr-{phase}.json not found or malformed in {state_dir}" | ||
| return format_step(body, title=title) |
There was a problem hiding this comment.
1. state_dir exposed in errors 📘 Rule violation ⛨ Security
The QR verify step returns an error message that includes the internal state_dir filesystem path, which can leak implementation details to the user. This violates secure error handling expectations for user-facing errors.
Agent Prompt
## Issue description
User-facing error output includes the internal filesystem path (`state_dir`), leaking implementation details.
## Issue Context
This orchestrator prints `format_step(...)` content to stdout, which is user-visible. Secure error handling requires keeping internal details (paths, stack traces) out of user-facing messages.
## Fix Focus Areas
- skills/scripts/skills/planner/orchestrator/executor.py[266-269]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| next_cmd = f"python3 -m {MODULE_PATH} --step 2 --state-dir {state_dir}" | ||
|
|
||
| # Determine next command | ||
| if qr.passed and gate.pass_step is not None: | ||
| next_cmd = f"python3 -m {MODULE_PATH} --step {gate.pass_step}" | ||
| else: | ||
| next_iteration = qr.iteration + 1 | ||
| next_cmd = f"python3 -m {MODULE_PATH} --step {gate.work_step} --qr-fail --qr-iteration {next_iteration}" | ||
| return format_step(body, next_cmd, title="Execution Planning") |
There was a problem hiding this comment.
2. Unquoted state_dir in commands 📘 Rule violation ⛨ Security
The code interpolates state_dir directly into command strings without quoting/escaping, so paths with spaces can break commands and user-controlled values can enable shell-injection if copy/pasted or executed via a shell. This violates security-first input handling expectations for external inputs.
Agent Prompt
## Issue description
`state_dir` is treated as an external input and is interpolated into command strings without quoting/escaping, risking command breakage and potential injection.
## Issue Context
Commands are produced as strings (for next-step commands and agent dispatch templates). If these strings are executed via a shell or copy/pasted, unquoted values can be interpreted as additional flags/commands.
## Fix Focus Areas
- skills/scripts/skills/planner/orchestrator/executor.py[130-132]
- skills/scripts/skills/planner/orchestrator/executor.py[204-205]
- skills/scripts/skills/planner/orchestrator/executor.py[238-238]
- skills/scripts/skills/planner/orchestrator/executor.py[306-307]
- skills/scripts/skills/planner/orchestrator/executor.py[341-345]
- skills/scripts/skills/planner/orchestrator/executor.py[396-409]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| mode_script = get_mode_script_path("developer/exec_implement.py") | ||
| invoke_cmd = f"python3 -m {mode_script} --step 1 --state-dir {state_dir} --qr-fail --qr-iteration {qr.iteration}" | ||
|
|
There was a problem hiding this comment.
3. Bad --qr-fail dispatch 🐞 Bug ✓ Correctness
In fix mode, executor dispatches developer/TW router scripts with a bare --qr-fail flag, but those scripts use mode_main, which defines --qr-fail as a string option requiring a value. This will raise argparse errors and break the fix loop at runtime.
Agent Prompt
### Issue description
Executor fix-mode dispatch commands include a bare `--qr-fail`, but the invoked subagent scripts use `mode_main` which defines `--qr-fail` as a string option requiring a value. This causes argparse failures and breaks the fix loop.
### Issue Context
- Executor fix-mode dispatch uses `developer/exec_implement.py` and `technical_writer/exec_docs.py`.
- Those scripts are entered via `mode_main` which defines `--qr-fail` as `type=str`.
- The router scripts already decide fix vs execute via QR state inspection (`has_qr_failures`), so `--qr-fail` is unnecessary.
### Fix Focus Areas
- skills/scripts/skills/planner/orchestrator/executor.py[152-154]
- skills/scripts/skills/planner/orchestrator/executor.py[396-400]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| # Derive fix mode from QR state files (like planner does) | ||
| phase_for_step = {2: "impl-code", 6: "impl-docs"} | ||
| if step in phase_for_step and state_dir and not qr_fail: | ||
| from skills.planner.shared.qr.utils import has_qr_failures | ||
| phase = phase_for_step[step] | ||
| if has_qr_failures(state_dir, phase): | ||
| qr = QRState(iteration=qr_iteration, state=LoopState.RETRY, status=status) | ||
|
|
There was a problem hiding this comment.
4. Retry state not applied 🐞 Bug ✓ Correctness
Executor derives LoopState.RETRY from QR state files only for steps 2 and 6, not for QR steps (3–5, 7–9). As a result, QR verify steps typically run as INITIAL even when failures exist, so iteration increments and progressive de-escalation logic won’t execute as intended and workflows can get stuck blocking on low-severity items.
Agent Prompt
### Issue description
Executor only derives retry state from QR files for steps 2 and 6, so QR verify steps do not enter RETRY mode even when failures exist. This prevents iteration increments and breaks progressive severity de-escalation.
### Issue Context
Planner orchestrator derives phase/iteration/retry state from QR state files for every step handler with a phase. Executor should follow the same pattern for impl-code (steps 2–5) and impl-docs (steps 6–9).
### Fix Focus Areas
- skills/scripts/skills/planner/orchestrator/executor.py[450-492]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
| qr_state = load_qr_state(state_dir, phase) | ||
| if not qr_state or "items" not in qr_state: | ||
| body = f"Error: qr-{phase}.json not found or malformed in {state_dir}" | ||
| return format_step(body, title=title) |
There was a problem hiding this comment.
5. Qr error marks complete 🐞 Bug ⛯ Reliability
If qr-{phase}.json is missing/malformed, format_qr_verify calls format_step without a next command,
which causes the step renderer to output WORKFLOW COMPLETE. This can silently stop execution instead
of routing back to decompose or surfacing a recoverable error path.
Agent Prompt
### Issue description
When qr state is missing/malformed, executor returns a formatted step without a next command, and the step formatter treats this as workflow completion.
### Issue Context
This should be a recoverable error: users should be routed back to the decompose step (or given a command to regenerate qr-{phase}.json).
### Fix Focus Areas
- skills/scripts/skills/planner/orchestrator/executor.py[266-269]
- skills/scripts/skills/lib/workflow/prompts/step.py[57-58]
ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools
get_step_guidance() was missing module_path and **kwargs parameters that mode_main() passes to all step guidance functions. This caused a TypeError when the QR fix workflow was dispatched.
Sub-agents dispatched via Task tool don't always cd to the working
directory before running Python module commands, causing intermittent
"module not found" failures. Fix by inlining `cd {dir} &&` into the
command itself instead of relying on a separate "Working directory"
instruction that agents may ignore.
Fixes for 4 review bugsAddressed the 4 substantive bugs from the Qodo review. Here's the diff: 1. Remove bare
|
…on, skeleton fields - Remove bare --qr-fail/--qr-iteration flags from subagent dispatch commands; sub-agents detect fix mode from QR state files via has_qr_failures() - Use return value of increment_qr_iteration() instead of reading stale in-memory qr_state after disk update - Route back to decompose step when QR state file is missing/malformed instead of silently marking workflow complete - Add planning_context and invisible_knowledge to executor plan.json skeleton to match planner's v2 schema Co-authored-by: Arty <24428965+ArtyT@users.noreply.github.com>
|
Thanks, the planner skill is currently going through a bit of an overhaul, and is indeed broken -- let me see what I can do. |
|
I also noticed that even with my changes, sometimes claude/the planner declares victory while QR steps are still open. A simple "check if you really did all steps" is enough to trigger it continuing. Not sure how this can be fixed. |
|
Yes, I think it's Claude that wants to stop early; it's effectively one of the limitations of Claude Code not being extensible enough to "force" a workflow. I'm thinking of ways to avoid this; context also gets larger and larger with every QR iteration run. |
|
Do you have an opinion on whether it's better to use an earlier version of the workflow (e.g., one from mid-January) while you iron out the bugs in the planner skill, or if it's still worth using the current version despite the issues? |
Port the executor QR decompose/verify/gate rewrite from PR solatis#20 and go beyond the merged branch by closing the three Qodo items ArtyT skipped (unquoted state_dir, RETRY state for QR steps, plan.json write error handling). Also repair broken tests that predated this change: Workflow._params, ChoiceSet/Constant domain types, dead WORKFLOW strings in QR verify modules, and a NameError in incoherence.py. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
(Ive written this PR using claude. I noticed that when executing a plan, it does not go through Quality review. Thats silent so easy to miss).
Problem
The executor orchestrator (
skills/planner/orchestrator/executor.py) references QR modules that don't exist:Neither
impl_code_qr.pynorimpl_docs_qr.pyexist. These were split into separate decompose/verify files at some point, matching the planner's pattern:impl_code_qr_decompose.py+impl_code_qr_verify.pyimpl_docs_qr_decompose.py+impl_docs_qr_verify.pyThe planner orchestrator was updated to dispatch decompose and verify as separate steps (steps 4+5, 8+9, 12+13), but the executor was never updated. This means executor steps 4 and 7 always fail with
ModuleNotFoundError.The supporting infrastructure was already updated for the split:
constants.py:EXECUTOR_TOTAL_STEPS = 10,EXECUTOR_GATE_STEPS = {5, 9}(expects 10 steps, not 9)phases.py:impl-codemaps decompose=3, verify=4, gate=5;impl-docsmaps decompose=7, verify=8, gate=9qr/constants.py: routing entries exist for("executor", "impl-code")→ gate step 5But executor.py still had the old 9-step layout with combined QR modules.
Fix
Rewrites executor.py from 9 to 10 steps, matching the existing
phases.py/constants.pydefinitions:impl_code_qr_decomposeimpl_code_qr_verify(parallel)impl_docs_qr_decomposeimpl_docs_qr_verify(parallel)Key changes:
--state-dirCLI arg (created in step 1, passed through all steps)get_phase_config()to resolve decompose/verify script paths fromphases.pyqr_file_existsskip,template_dispatchfor parallel verify,build_gate_outputfor gates--qr-failflagTest plan
--step 1through--step 10)--state-dirqr-{phase}.jsonalready exists