fix(change): enforce Step 9 scope guard against Step 5/6 allowlist (#1123)#1133
Open
Serhan-Asad wants to merge 14 commits into
Open
fix(change): enforce Step 9 scope guard against Step 5/6 allowlist (#1123)#1133Serhan-Asad wants to merge 14 commits into
Serhan-Asad wants to merge 14 commits into
Conversation
Add tests for Step 9 scope guard before implementing. Tests fail at import: _parse_step6_devunit_prompts, _parse_step6_dependency_prompts, _parse_step6_integration_prompts, _parse_step6_frontend_prompts, _parse_step5_doc_paths, _build_step9_allowlist, and _enforce_step9_scope don't exist yet. Covers: - Parser tests: prompt-only column extraction from Step 6 tables, placeholder-row skipping, dependency/integration/frontend selectivity, Step 5 section selectivity (Update + Create + Associated, no Conflicts / No Changes Needed). - Allowlist builder tests: union of Step 5/6 paths + direct edits, exclusion of Step 6 code/example/test columns, preflight-healed prompts + companion .pdd/meta/<name>.json files, Step 5 Conflicts exclusion, architecture.json exclusion. - Scope-guard tests over a real git worktree: reverts untracked examples/, tests/, unrelated prompts; reverts tracked architecture.json modifications; preserves in-scope prompts, direct edit candidates, associated docs, healed prompt + meta; no-op on clean worktree; tolerates missing worktree. - Orchestrator integration: scope violation stops workflow with SCOPE_VIOLATION state, last_completed_step not advanced; clean Step 9 proceeds to Step 10.
Step 9 of `pdd change` is supposed to modify only prompts, docs, and
explicitly-listed Direct Edit Candidates. The LLM does not reliably
honor the scope from the prompt alone; in pdd_cloud#1600 it wrote ~30
unrelated files (CRM/hackathon prompts, new examples and tests,
extensions/* code) despite a precise Allowed write set in the issue
body.
This commit adds a deterministic post-Step-9 scope guard:
* New parsers (regex-based, in the same shape as the existing
`_parse_direct_edit_candidates`):
- `_parse_step6_devunit_prompts` — first column of MODIFY/CREATE
tables, prompts only (code/example/test columns never leak in);
placeholder rows containing `{...}` are skipped.
- `_parse_step6_dependency_prompts` — `.prompt` bullets under
Dependencies.
- `_parse_step6_integration_prompts` — 3rd column of Integration
Points; skips empty cells and `manual update needed`.
- `_parse_step6_frontend_prompts` — 2nd column of Frontend Dev
Units; skips `None`.
- `_parse_step5_doc_paths` — `#### path` headers under Files to
Update + Associated Documents, and `- path - purpose` bullets
under Files to Create. Excludes Conflicts and No Changes Needed.
* `_build_step9_allowlist` unions the Step 5/6 paths, Direct Edit
Candidates, preflight-healed prompts, and the `.pdd/meta/<basename>.json`
companion files only for healed prompts. `architecture.json` (root
and nested) is explicitly excluded — Step 10 owns architecture.
No directory prefixes — exact paths only.
* `_enforce_step9_scope` is the orchestrator hook. It builds the
allowlist from `context["step5_output"]`, `context["step6_output"]`,
and `state["preflight_healed_prompt_paths"]`, delegates to the
existing `revert_out_of_scope_changes_with_dirs` worktree helper
(issue #1080's structured porcelain parser), and returns the list
of reverted relative paths. Returns `[]` on missing worktree or
non-`.git` paths so the guard is a no-op in non-worktree contexts.
* Orchestrator wiring (inside `if step_num == 9:`, before
`_parse_changed_files`): when the guard returns a non-empty list,
we append a `SCOPE_VIOLATION:` block to `step_output`, print a
red console warning with the reverted paths, post a GitHub comment,
mark `state["step_outputs"]["9"] = FAILED: SCOPE_VIOLATION...`,
do NOT advance `last_completed_step` (matches the "no file changes"
fallback at the same site), save state, and return False with
`Stopped at step 9: Scope violation — ...`.
This is `pdd change` only; other agentic_* orchestrators are not
modified.
The Step 6 prompt template emits the Direct Edit Candidates table after
an italic description line plus a blank line:
### Direct Edit Candidates (No Prompt)
*Files that need scoped direct edits because no corresponding prompt exists*
| File | Edit Type | Markers Found |
The previous `_parse_direct_edit_candidates` regex required `|` to follow
the heading on the very next line, so it silently returned `[]` on real
outputs. Before #1123 that was a harmless no-op (Step 9's LLM also emits
DIRECT_EDITS markers separately), but with the new scope guard a parser
miss means Direct Edit Candidates are NOT in the allowlist and get
reverted — breaking the guard's own contract.
Loosen the regex to skip optional non-pipe, non-heading lines between
the section heading and the table. Add a regression test using the
real template format, and update the Step 5/6 fixtures to mirror what
the prompt templates actually emit.
…ement Address codex round-2 review: - Snapshot worktree status before Step 9 LLM; only enforce allowlist against paths Step 9 actually changed (status differs from pre-snapshot). Fixes false SCOPE_VIOLATION when Step 8.5 drift-heal mutates .pdd/meta/*_run.json and architecture.json. - Add `.rst`/`.txt` to the worktree-fallback allowed extensions so Step 5 Associated Docs of those types don't break the "no changes" path. - Guard now propagates enforcement errors; orchestrator catches and treats them as a workflow-stopping scope violation (no more silent fail-open). - Case-insensitive section heading match; strip markdown emphasis in path parsers; cleanup test import positioning. #1123 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…rict mode Address codex round-3 review: - Snapshot now captures (porcelain_status, sha256_content_hex). Skip rule in revert_out_of_scope_changes_with_dirs requires BOTH to match — a status-only match isn't enough when Step 8.5 left a tracked file modified and Step 9 modified it again. - Detect deletions of pre-snapshot files: iterate pre paths not in post status. Tracked deletions restored via `git checkout HEAD --`; untracked deletions can't auto-restore (Step 8.5 is idempotent) and surface as unrecoverable violations. - Add `strict=True` mode to revert_out_of_scope_changes_with_dirs. The Step 9 caller uses it; other callers (e2e_fix) keep the silent-fail default to preserve their semantics. #1123
…on violation - Pre-snapshot deletion pass now also flags files that exist on disk but are absent from post-Step-9 git status (state went dirty -> clean). Step 9 reverting Step 8.5's tracked-modification or removing an untracked meta file by overwriting it to HEAD content used to be invisible; we now mark these as unrecoverable scope violations. - Scope-violation save path clears state["preflight_drift_healed"] / preflight_healed_worktree so resume re-runs Step 8.5. Without this, resume after a Step-9 revert of Step-8.5's mutations would leave the worktree permanently missing the heal. - state["preflight_healed_prompt_paths"] is intentionally preserved so Step 10's doc-discovery sweep on retry still sees the originally healed prompts; the re-heal will regenerate them identically. - Skipped the optional OSError double-log cleanup. The pattern only fires in strict mode and changing the raise/log flow would risk altering observable behavior tests already assert on; the duplicate is cosmetic per round-3 review notes. #1123
There was a problem hiding this comment.
Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.
Collaborator
Author
Step 7/8: Verification & Final Report (Iteration 1)Test Results After Fixes
Build StatusPass — all PR-scope modules import cleanly ( Overall StatusAll clear. Issue Alignmentissue_aligned: true — the branch's round 1–4 commits add an orchestrator-enforced post-Step 9 allowlist (snapshot-based scope guard, Step 5/6 dev-unit + Direct Edit Candidate parsing, deletion + content-revert detection, fail-closed enforcement) exactly as #1123 requests. Issues Summary
Checkup complete. All Issues Fixed {
"success": true,
"message": "PR #1133 implements the post-Step 9 write-scope guard requested by issue #1123: snapshot-based diff against a Step 5/6-derived allowlist with deletion and content-revert detection and fail-closed enforcement. All 258 targeted scope-guard/allowlist/direct-edit/snapshot/step9/porcelain tests pass; package imports clean. No critical findings.",
"tech_stack": ["python", "click", "pydantic-v2", "litellm", "pytest"],
"issue_aligned": true,
"issues": [],
"changed_files": []
}PR Push StatusPushed fixes to PR branch. |
…1123) - Rewrite _parse_direct_edit_candidates without regex backtracking (CodeQL py/redos alert at pdd/agentic_change_orchestrator.py:741). The previous regex had overlapping alternatives that caused exponential backtracking on N blank lines after the heading. - _parse_step5_doc_paths now reconciles Associated Documents against Step 6's confirmed dev-unit prompt set: only docs whose "Discovered via" originating prompt is in Step 6 are allowlisted. Matches the reconciliation rule the Step 9 prompt already enforces; closes the defense-in-depth gap where a misbehaving Step 9 LLM could edit a stale doc and slip past the guard. - Update prompts/agentic_common_worktree_python.prompt to describe the new revert_out_of_scope_changes_with_dirs signature (+ pre_snapshot, + strict) and the new _hash_file_content helper, so `pdd sync` regeneration doesn't quietly lose the fix. - Delete stray empty new_file.txt the checkup commit introduced.
5 tasks
…t, README - Parse the documented include-graph format (`prompts/foo.prompt` → `<include>docs/bar.md</include>`) in Step 5 Discovered via lines. Previous regex anchored to $ and rejected backticks, so on real Step 5 output it never matched and the reconciliation gate became a no-op. Two-step extraction now: pull the Discovered via clause, then match the first .prompt path within it. - Update prompts/agentic_change_orchestrator_python.prompt with a Step 9 Scope Guard section: pre-snapshot capture, _enforce_step9_scope call site, SCOPE_VIOLATION stop path, preflight-clear-on-violation, architecture.json exclusion. Closes the regression risk where `pdd sync` / auto-heal could regenerate the orchestrator and silently drop the new guard. - README Workflow Resumption now lists Step 9 as the third pause point alongside Steps 4 and 7, with the recovery action (refine contract, rerun pdd change; preflight re-runs automatically). #1123
…ment body - agentic_change_orchestrator imports pdd.agentic_common_worktree; add agentic_common_worktree_python.prompt to the prompt's <pdd-dependency> block and to architecture.json's dependencies array so `pdd sync` / auto-heal regeneration sees the dependency. - SCOPE_VIOLATION GitHub comment now passes an explicit `body` argument with the reverted-paths list at the top, bypassing the post_step_comment fallback that truncates output to the first 1000 chars. README claim that the comment lists reverted paths was a false promise for long Step 9 outputs; now it holds. - Apply the same explicit-body pattern to the scope_guard_error branch so the strict-mode failure surface is consistent. Out of scope: auto-heal Cloud Build still fails on backend/generated (prompts/generated_python.prompt set but missing on disk) — this is a pre-existing CI infra issue unrelated to PR #1133 (no such prompt or arch entry exists in this repo). #1123 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
git checkout HEAD -- <path> fails for paths not in HEAD, so staged additions (status X=A: A , AM) survived the scope guard and could land in the PR via Step 13's git add -A. Handle them like copy destinations: git reset HEAD -- <path> then unlink, same pattern already used for C-status entries. strict=True propagates failures. Regression tests use a real git repo: stage a new file (A ) and a staged+modified file (AM); assert the guard removes both and leaves no trace in post-status. #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ns and copies git reset HEAD -- <path> could fail silently (non-zero exit, ignored return code), leaving the staged blob in the index. The code then unlinked the file and reported it as reverted. Step 13's git add -A would then reference a missing or still-staged path. Fix: capture the CompletedProcess from reset; on non-zero return, log a warning and raise under strict=True WITHOUT proceeding to unlink. Applied to both the copy-destination branch and the new staged-addition branch (status X=A). Matching change to pdd/prompts/agentic_common_worktree_python.prompt so the fix survives pdd sync regeneration. Tests: staged_addition_reset_failure_strict asserts OSError is raised and unlink is not attempted; copy_reset_failure_strict covers the copy branch. Both use a fake subprocess.run that returns rc=1 for reset. #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The rename branch called git reset HEAD -- <new> <old> without capturing the return code, then unconditionally proceeded to git checkout HEAD -- <old> and unlink. A failed reset left the staged rename in the index while the code reported it as reverted; Step 13's git add -A could then include the leaked paths. Fix: capture the CompletedProcess; on non-zero return, log a warning and raise under strict=True WITHOUT proceeding to checkout/unlink. Consistent with the same fix applied to copy destinations and staged additions in round-9. Test: rename_reset_failure_strict uses a fake subprocess.run that returns rc=1 for reset; asserts OSError is raised. #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…t sync Three issues found in review: 1. Symlink bypass: _path_in_scope called .resolve() on symlinks, so a symlink at an out-of-scope path whose target was allowlisted was treated as in-scope. Fix: check is_symlink() first; symlinks are always out-of-scope regardless of target (the guard enforces scope on the path Step 9 created, not the content it points at). 2. Checkout-before-unlink ordering in rename revert: unlink(new) ran before checking checkout(old) return code, leaving both sides of the rename damaged on checkout failure. Fix: only unlink AFTER checkout return code is confirmed 0. 3. Source prompt (agentic_common_worktree_python.prompt) did not document the rename reset return-code invariant, checkout-before- unlink ordering, or symlink scope rule. Updated so pdd sync regeneration preserves all three behaviors. Nit: fix two mock tests that used diff-style C100/R100 porcelain records instead of the real git status --porcelain=v1 -z format (C src\0dst\0 and R old\0new\0). New tests: symlink_not_bypassed_by_target (real git repo + symlink), rename_checkout_failure_does_not_unlink (tracks unlink() calls via patch.object, asserts none occur when checkout fails). #1123 Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
pdd changeStep 9 that reverts any worktree edits outside an allowlist derived from Step 5 (docs to update / create / associated documents) and Step 6 (dev-unit prompt paths, dependencies, integration points, frontend prompts, direct-edit candidates). Code, example, and test files from Step 6 tables are intentionally not allowed — Step 9's contract is prompts + docs + scoped direct edits..pdd/meta/<basename>_<lang>.json,_run.json,architecture.json) are recognized via(porcelain_status, sha256(content))and excluded from the Step 9 delta. Step 9 modifying a Step 8.5 file on top, or reverting Step 8.5's mod back to HEAD, is detected and flagged.SCOPE_VIOLATION:block (file list + reason) posted to the issue and saved into state. Preflight gates are cleared so resume re-runs Step 8.5 before re-attempting Step 9.Closes #1123.
Why the existing helper wasn't enough
The existing
_revert_out_of_scope_changesinagentic_common.pyruns with-unoand never sees untracked files — exactly the shape of the drift in pdd_cloud#1602 (newexamples/,tests/, unrelated prompts). This PR usesrevert_out_of_scope_changes_with_dirsfromagentic_common_worktree.py(which runs with-u) and extends it with two new optional parameters:pre_snapshot: Mapping[str, Tuple[str, Optional[str]]]— skip entries that match the snapshot's(status, content_hash); existing two-arg callers (agentic_e2e_fix_orchestrator) are unchanged.strict: bool = False— Step 9 caller passesstrict=Trueso internalgit/OSErrorfailures propagate and stop the workflow rather than fail open. Other callers keep the silent-fail default.Review loop
This change went through four iterations with codex as the reviewer:
.rst/.txtin fallback detection, fail-closed orchestrator handler. Addressed codex's first three blockers (under-allowlisted preflight metadata, fallback missing doc extensions, silent fail-open).sha256, deletion detection,strictmode. Addressed codex's next three blockers (status-only delta was content-blind; deletions were invisible; helper still failed open internally).preflight_drift_healedon violation so resume re-runs Step 8.5. Addressed codex's final two blockers.Final codex verdict: merge-ready.
Test plan
LC_ALL=C pytest tests/test_agentic_change_orchestrator.py tests/test_agentic_common_worktree.py tests/test_issue_1080_porcelain_scope_guards.py -m "not integration and not e2e and not real" --timeout=60— 286 passedpylint pdd/agentic_change_orchestrator.py pdd/agentic_common_worktree.py --errors-only— cleangit diff --check— cleanexamples/gemini_triage_example.py,test_gemini_triage.py,extensions/.../gemini_triage.py, unrelatedprompts/backend/{crm,hackathon}_*— all reverted (untracked → removed; tracked →git checkout HEAD --).architecture.jsonStep 9 edits — reverted (Step 10 owns).pdd changerun (will land after merge; can't test against production without merging).🤖 Generated with Claude Code