
feat: enforce split-contract allowed write sets in pdd sync (#1013) #1014

Open
prompt-driven-github[bot] wants to merge 44 commits into main from change/issue-1013

@prompt-driven-github
Contributor

Summary

Adds scope-guard plumbing to the agentic sync prompts so pdd sync honors the issue's split contract / allowed write set, preventing stray generated artifacts from leaking into a PR (the root cause observed on PR #1010 / issue #1005).

Closes #1013

Changes Made

Prompts Modified

  • pdd/prompts/agentic_common_python.prompt — add parse_issue_contract helper, IssueContract dataclass, and DEFAULT_SYNC_COMPANION_ALLOWLIST constant; document existing _revert_out_of_scope_changes in <pdd-interface>.
  • pdd/prompts/agentic_sync_runner_python.prompt — add allowed_write_set, companion_allowlist, and scope_guard_enabled kwargs to AsyncSyncRunner; enforce scope guard after each per-module subprocess via new _enforce_scope_guard helper with a hard-fail diagnostic.
  • pdd/prompts/agentic_sync_python.prompt — add scope_guard: bool = True kwarg to run_agentic_sync and run_global_sync; parse contract from fetched issue body/comments; plumb three kwargs into both AsyncSyncRunner and DurableSyncRunner dispatches.

Documentation Updated

  • README.md — document the new scope-guard behavior and --no-scope-guard opt-out.
  • CHANGELOG.md — note the contract enforcement.
  • architecture.json — record the new parse_issue_contract entry point and scope-guard wiring.

Design Notes

  • Additive only — all new parameters have safe defaults preserving current behavior.
  • Reuses existing code — the scope guard delegates to _revert_out_of_scope_changes already used in production for update / fix / crash / e2e-fix.
  • Permissive fallback — when no contract is present in the issue, current behavior is unchanged.
  • Opt-out: the --no-scope-guard flag is available for cases where reviewers explicitly want broad churn.

Review Checklist

  • Prompt syntax is valid
  • PDD conventions followed
  • Documentation is up to date
  • Scope-guard fallback behavior is acceptable when no contract is found

Next Steps After Merge

  1. Regenerate code from modified prompts in dependency order:
    ./sync_order.sh
    Or manually:
    pdd sync agentic_common
    pdd sync agentic_test_generate
    pdd sync ci_validation
    pdd sync agentic_crash
    pdd sync agentic_test_orchestrator
    pdd sync agentic_architecture_orchestrator
    pdd sync agentic_verify
    pdd sync cli
    pdd sync agentic_common_worktree
    pdd sync agentic_update
    pdd sync architecture
    pdd sync agentic_fix
    pdd sync duplicate_cli_guard
    pdd sync agentic_split_orchestrator
    pdd sync executor
    pdd sync git_update
    pdd sync fix_verification_errors_loop
    pdd sync fix_error_loop
    pdd sync fix_code_loop
    pdd sync agentic_split
    pdd sync sync_order
    pdd sync auto_deps_architecture
    pdd sync agentic_change
    pdd sync agentic_e2e_fix
    pdd sync agentic_e2e_fix_orchestrator
    pdd sync agentic_sync_runner
    pdd sync agentic_test
    pdd sync auth
    pdd sync commands
    pdd sync crash_main
    pdd sync fix
    pdd sync fix_verification_main
    pdd sync one_session_sync
    pdd sync utility
    pdd sync agentic_checkup_orchestrator
    pdd sync bug_main
    pdd sync durable_sync_runner
    pdd sync sync_main
    pdd sync __init__
    pdd sync agentic_sync
    pdd sync checkup
    pdd sync checkup_review_loop
    pdd sync ci_drift_heal
    pdd sync connect
    pdd sync generate
    pdd sync maintenance
    pdd sync update_main
    pdd sync agentic_checkup
    pdd sync modify
    pdd sync analysis
    pdd sync pin_example_hack
    pdd sync sync_orchestration
    pdd sync agentic_bug
    pdd sync agentic_bug_orchestrator
    
  2. Run tests to verify functionality.
  3. Deploy if applicable.

Created by pdd change workflow

Add scope-guard plumbing to agentic sync prompts so generated artifacts
stay within the issue's allowed write set:

- agentic_common_python.prompt: add parse_issue_contract helper,
  IssueContract dataclass, and DEFAULT_SYNC_COMPANION_ALLOWLIST constant
- agentic_sync_runner_python.prompt: add allowed_write_set,
  companion_allowlist, scope_guard_enabled kwargs to AsyncSyncRunner;
  enforce scope guard after each per-module subprocess
- agentic_sync_python.prompt: add scope_guard kwarg to run_agentic_sync
  and run_global_sync; parse contract from issue body/comments and plumb
  through to AsyncSyncRunner / DurableSyncRunner

Reuses the existing _revert_out_of_scope_changes helper already in
production for update/fix/crash/e2e-fix. Additive only — safe defaults
preserve current behavior; --no-scope-guard provides opt-out.

Closes #1013

Co-Authored-By: Claude Opus 4 <noreply@anthropic.com>

Serhan-Asad and others added 27 commits May 14, 2026 17:00
…red contract formats

The deprecated wrapper now delegates to parse_issue_contract, which only
recognizes HTML-comment JSON or heading+fenced-block formats. Update the
tests to cover both supported formats and confirm the loose-markdown-bullet
format from earlier iterations is no longer accepted.
… dot/slash

Replaces .lstrip('./') in two scope-guard normalizers. lstrip strips any
combination of '.' and '/' characters from the left, so '.pdd/meta/foo.json'
was being rewritten to 'pdd/meta/foo.json' and missing the '.pdd/meta/*.json'
companion glob — the auto-allow that issue #1013 documents. Now strips a
single './' prefix only, preserving paths whose first segment begins with a
dot.
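
The lstrip pitfall described above can be reproduced in a few lines. This is a minimal sketch; `strip_dot_slash` is a hypothetical name used for illustration, not the normalizer in the PR.

```python
def strip_dot_slash(path: str) -> str:
    """Remove a single leading './' prefix only (hypothetical helper)."""
    return path[2:] if path.startswith("./") else path


# str.lstrip treats its argument as a set of characters, not as a prefix,
# so the leading dot of '.pdd' is stripped as well.
old = ".pdd/meta/foo.json".lstrip("./")
new = strip_dot_slash(".pdd/meta/foo.json")

print(old)  # pdd/meta/foo.json
print(new)  # .pdd/meta/foo.json
```

With the prefix-only form, a path whose first segment starts with a dot still matches the `.pdd/meta/*.json` companion glob.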
…d-block parser

F1: parse_issue_contract used to return None when allowed_paths parsed to an
empty list (either declared as [] or reduced to [] after dropping invalid
entries). Per Issue #1013, a syntactically valid empty contract means
"reject every change as out-of-scope" — return IssueContract(allowed_paths=())
instead so the runner can enforce reject-all. Update the docstring to match.

F2: tighten the fenced-block regex so it requires the fence to IMMEDIATELY
follow the heading (anchored with \A, only whitespace between) and restrict
the info string to empty/text/json. Prevents the parser from picking up a
random ```python``` or ```bash``` block somewhere later in the issue body.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
fnmatch.fnmatch treats ``*`` as matching any character including ``/``, so a
companion pattern of ``.pdd/meta/*.json`` was inadvertently allowing nested
paths like ``.pdd/meta/nested/foo.json``. Issue #1013 specifies pathlib-style
glob semantics. Switch both AsyncSyncRunner._matches_companion_allowlist and
DurableSyncRunner._out_of_scope_staged_paths to PurePosixPath(rel).match(pat).
Drop the now-unused fnmatch / DEFAULT_SYNC_COMPANION_ALLOWLIST imports.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
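
The behavioral difference between the two matchers can be checked directly with the standard library. The snippet below is only an illustration of the semantics described in this commit; the variable names are made up for the example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

PATTERN = ".pdd/meta/*.json"
nested = ".pdd/meta/nested/foo.json"
top_level = ".pdd/meta/foo.json"

# fnmatch: '*' also matches '/', so the nested path slips through.
fnmatch_nested = fnmatch(nested, PATTERN)

# PurePosixPath.match: '*' stays within a single path segment.
pathlib_nested = PurePosixPath(nested).match(PATTERN)
pathlib_top = PurePosixPath(top_level).match(PATTERN)

print(fnmatch_nested, pathlib_nested, pathlib_top)  # True False True
```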
…opt-out, single log line

F4: AsyncSyncRunner now ALWAYS unions the caller-supplied companion_allowlist
with DEFAULT_SYNC_COMPANION_ALLOWLIST (caller patterns first, defaults
appended, deterministic dedup). Previously the runner stored the tuple as-is
and only fell back to the default when empty, so a caller passing
[".github/*.yml"] would silently lose .pdd/meta/*.json coverage. The dead
``or DEFAULT_SYNC_COMPANION_ALLOWLIST`` fallback at the enforcement call site
is removed now that __init__ guarantees a non-empty allowlist.

F6: parse_issue_contract is now called UNCONDITIONALLY in run_agentic_sync,
and the parsed contract is plumbed through to the runner regardless of the
``--no-scope-guard`` flag. The runner records ``allowed_write_paths`` and
``contract_source`` for diagnostics even when scope_guard_enabled=False;
_enforce_scope_guard short-circuits on the disabled flag so behaviour is
unchanged at enforcement time. The max_workers serialisation gate is also
updated to require ``scope_guard_enabled AND contract`` — opt-out runs no
longer needlessly drop to single-threaded execution.

F7: scope-guard status logging now lives in exactly one layer
(``agentic_sync.run_agentic_sync``, closer to the user). The runner's
duplicate ``run()``-entry log block is removed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
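
The F4 union rule (caller patterns first, defaults appended, deterministic dedup) could look roughly like the sketch below. The constant value is the one quoted later in this PR; `merge_companion_allowlist` is a hypothetical name, and the real logic lives in `AsyncSyncRunner.__init__`.

```python
from typing import Iterable, Tuple

DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...] = (".pdd/meta/*.json",)


def merge_companion_allowlist(caller_patterns: Iterable[str] = ()) -> Tuple[str, ...]:
    """Union caller patterns with the defaults, keeping first-seen order."""
    combined = list(caller_patterns) + list(DEFAULT_SYNC_COMPANION_ALLOWLIST)
    # dict.fromkeys preserves insertion order, which gives a stable dedup.
    return tuple(dict.fromkeys(combined))


print(merge_companion_allowlist([".github/*.yml"]))
# ('.github/*.yml', '.pdd/meta/*.json')
```

Because the defaults are always appended, the result is never empty, which is why the fallback at the enforcement call site becomes dead code.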
…revert helper

revert_out_of_scope_changes_with_dirs invoked ``git status --porcelain -u``,
which is ambiguous: in some git versions/configs the result can collapse an
untracked directory into a single ``?? subdir/`` entry instead of listing the
files within. ``os.remove`` then fails on the directory and the contained
files are left behind, defeating the scope-guard's "remove out-of-scope
untracked files" promise that Issue #1013 relies on.

Fix:
- Use the explicit ``--untracked-files=all`` form so individual files are
  always listed.
- Defensively detect directory targets (path ending in ``/`` or filesystem
  shows is_dir) and use ``shutil.rmtree`` instead of ``os.remove`` so any
  remaining nested files are still removed.

Update the corresponding prompt text in
pdd/prompts/agentic_common_worktree_python.prompt:46 to match the corrected
behavior (the requirement is strengthened, not weakened).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
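
A minimal sketch of the defensive removal described in this commit, assuming a repo-relative path taken from `git status` output. `remove_untracked` and `STATUS_CMD` are hypothetical names for illustration; the real helper is `revert_out_of_scope_changes_with_dirs`.

```python
import os
import shutil
from pathlib import Path

# The explicit form lists individual untracked files instead of collapsing
# an untracked directory into a single '?? subdir/' entry.
STATUS_CMD = ["git", "status", "--porcelain", "--untracked-files=all"]


def remove_untracked(repo_root: Path, rel_path: str) -> None:
    """Delete an out-of-scope untracked path, whether file or directory."""
    target = repo_root / rel_path
    if rel_path.endswith("/") or target.is_dir():
        # Directory entry: remove it together with any nested files.
        shutil.rmtree(target, ignore_errors=True)
    elif target.exists():
        os.remove(target)
```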
The spec (pdd/prompts/agentic_sync_runner_python.prompt:69) says the
scope-guard diagnostic is printed to stderr. Until now the diagnostic only
appeared inside the assembled module-failure error string surfaced later by
maintenance.py — operators tailing stderr in real time never saw it.

Add ``print(diagnostic, file=sys.stderr)`` immediately after the revert
operations inside ``_enforce_scope_guard``. The deferred stdout echo in
maintenance.py is a separate event and remains.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds TestParseIssueContract with 9 cases covering Issue #1013 prompt
requirements (item 21): HTML-comment happy path, empty allowed_paths as
reject-all contract (F1), malformed JSON returns None, body-marker wins
over comment-marker, path traversal dropped silently, fenced block accepts
only text/json/bare fence (F2), fenced block must immediately follow
heading (F2), empty fenced body returns empty contract.
TestEnforceScopeGuard in test_agentic_sync_runner.py covers:
- permissive mode returns None
- scope_guard_enabled=False returns None
- pathlib companion match (F3): .pdd/meta/*.json matches only top-level
- companion_allowlist always unions DEFAULT_SYNC_COMPANION_ALLOWLIST (F4)
- diagnostic prefix "Scope guard reverted N out-of-scope file(s) for module"

test_commands_maintenance.py covers --no-scope-guard CLI flag (F10):
- --no-scope-guard propagates scope_guard=False to run_agentic_sync
- omitted flag defaults to scope_guard=True
…tore runner-entry log, reject bare fences

Iter-3 codex review surfaced three remaining gaps against the prompt spec:

- F1 (MAJOR): _enforce_scope_guard auto-allowed any file matching the
  companion glob anywhere under repo_root. The prompt says "every path
  under module_cwd", so sibling-module .pdd/meta/*.json could leak
  through in a shared-worktree run. Scan rglob from module_cwd instead.

- F2 (MAJOR): iter-2 deduplicated logging in favor of the sync layer only,
  but the spec at agentic_sync_runner_python.prompt item 22 requires the
  runner to ALSO log at run() entry. The two logs report different events:
  the sync-layer line records *contract detection*; the runner-entry line
  records *runtime enforcement state* ("permissive mode" or "disabled via
  --no-scope-guard"). Both are required.

- F3 (MINOR): the fenced-block regex accepted bare ``` fences with the
  ``(?:text|json)?`` optional group. The spec at agentic_common_python.prompt
  item 21 says only ``text`` or ``json`` info strings are legal. Make the
  language required and update the test.
…og coverage

F1: test_companion_glob_scoped_to_module_cwd_not_sibling locks in the
module_cwd scoping; a sibling module's .pdd/meta/*.json is no longer
auto-allowed when scanning from the current module's cwd.

F2: test_run_entry_logs_permissive_mode and test_run_entry_logs_opt_out_warning
verify the runner-entry INFO/WARNING required by
agentic_sync_runner_python.prompt item 22. The entry log moved above the
empty-basenames short-circuit so state is visible even on no-op runs.
Iter-4 codex flagged that the companion-allowlist build pass used
``cwd_path.rglob("*")``, which only sees files that still exist on disk.
When sync legitimately deletes a ``.pdd/meta/<module>.json`` companion
(module renamed/removed), the deletion appears in ``git status`` as a
tracked ``D `` but the file is gone, so it was missing from the
allowed-files set. The subsequent revert helper would resurrect the
deletion and hard-fail the module on a legitimate operation.

Now also pulls paths from ``_git_changed_paths`` and adds those matching
the companion allowlist to the allowed set, while still respecting the
module_cwd scoping from iter-3 F1.
…uard

DATA-LOSS BUG (reported by external review): the scope guard removed
any untracked file that wasn't in the contract or companion allowlist,
including pre-existing user files like scratch.txt or unrelated WIP
that existed before pdd sync started.

Fix: capture self._baseline_changed_paths at runner __init__ (set of
paths that git status reported BEFORE any module ran), and during
_enforce_scope_guard add each baseline path to allowed_files so the
revert helpers never touch them.

Regression test uses a real tmp_path git repo with scratch.txt as the
pre-existing untracked file — mock-based tests would not have caught
this because the revert helpers were stubbed out in iter-1..5.
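
The baseline carve-out amounts to a set difference. The sketch below assumes all three inputs are collections of repo-relative paths; `offending_paths` is a hypothetical name, not a function in the PR.

```python
from typing import Iterable, List


def offending_paths(
    changed: Iterable[str],
    allowed: Iterable[str],
    baseline: Iterable[str],
) -> List[str]:
    """Paths changed during sync that are neither allowed nor pre-existing."""
    protected = set(allowed) | set(baseline)
    return sorted(p for p in changed if p not in protected)


# scratch.txt was already dirty before sync started, so it is left alone.
print(offending_paths(
    changed=["pdd/foo.py", "scratch.txt", "stray_output.json"],
    allowed=["pdd/foo.py"],
    baseline=["scratch.txt"],
))  # ['stray_output.json']
```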
REVERT-CLAIMED-BUT-NOT-DONE BUG (reported by external review): for a
staged rename ``R  old -> new``, the helper read the whole ``old -> new``
payload as a single path. The subsequent ``git checkout HEAD --
"old -> new"`` silently failed (pathspec didn't match) AND the return
code was never checked, so the helper reported the rename as reverted
when in fact ``git status`` still showed it.

Fix: split rename payloads so source and destination are independently
membership-checked, switch from ``git checkout HEAD --`` to
``git restore --staged --worktree --source=HEAD --`` so rename
destinations not present in HEAD are correctly removed, and check the
return code — clearing the reverted list on failure so callers see
real-vs-claimed revert state. Falls back to legacy ``git checkout``
on pre-2.23 git.

Affects every caller of _revert_out_of_scope_changes (agentic_update,
agentic_fix, agentic_crash, agentic_verify, agentic_e2e_fix_orchestrator,
agentic_sync_runner) — all of them were previously silently no-op on
rename out-of-scope cases.
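
Splitting a porcelain rename entry into its two paths could be sketched as follows. This assumes the short `XY path` / `XY old -> new` layout of `git status --porcelain` and ignores quoted paths; `split_status_entry` and `RESTORE_CMD` are hypothetical names.

```python
from typing import List, Tuple

# Paths are appended after '--'; callers should check the return code.
RESTORE_CMD = ["git", "restore", "--staged", "--worktree", "--source=HEAD", "--"]


def split_status_entry(line: str) -> Tuple[str, List[str]]:
    """Return (status, paths) for one porcelain line.

    Rename and copy entries yield both the source and the destination so
    each side can be membership-checked against the allowed set.
    """
    status, payload = line[:2], line[3:]
    if ("R" in status or "C" in status) and " -> " in payload:
        old, new = payload.split(" -> ", 1)
        return status, [old, new]
    return status, [payload]


print(split_status_entry("R  old.py -> new.py"))  # ('R ', ['old.py', 'new.py'])
```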
OUT-OF-SCOPE-DELETION-MISSED BUG (reported by external review): the
durable runner staged-paths inspection used ``git diff --cached
--name-only``, which for a staged ``git mv old new`` emits only the
destination ``new``. A contract that allowed ``new`` but not ``old``
passed validation while the rename silently deleted the out-of-scope
``old``.

Fix: switch to ``git diff --cached --name-status -M``. Rename and copy
lines now emit ``R<score>\told\tnew`` / ``C<score>\told\tnew``; both
columns past the status are treated as scope-checked paths.

Regression test uses a real durable runner against a real git repo
with a staged rename — the bug only reproduces against real git output.
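
Parsing the `--name-status -M` output so that both rename columns are scope-checked might look like this. A minimal sketch, assuming tab-separated lines as described above; `staged_paths` is a hypothetical name.

```python
from typing import List


def staged_paths(name_status_output: str) -> List[str]:
    """Collect every path column from `git diff --cached --name-status -M`."""
    paths: List[str] = []
    for line in name_status_output.splitlines():
        if not line.strip():
            continue
        # First column is the status (M, A, D, R<score>, C<score>); rename
        # and copy lines carry two path columns, all others carry one.
        _status, *columns = line.split("\t")
        paths.extend(columns)
    return paths


output = "R100\told.py\tnew.py\nM\tpdd/a.py\n"
allowed = {"new.py", "pdd/a.py"}
out_of_scope = [p for p in staged_paths(output) if p not in allowed]
print(out_of_scope)  # ['old.py']
```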
…er is out of scope

PARTIAL-RENAME BUG (reported by external review): a rename is one
atomic git operation. The iter-6 B2 fix correctly split rename payloads
so both sides were membership-checked, but then independently restored
only the disallowed side — leaving the working tree in a half-renamed
state.

Concretely:
  contract = {pdd/old.py}; sync runs `git mv pdd/old.py pdd/new.py`
  iter-6: restores only pdd/new.py → ``D pdd/old.py`` left staged
  iter-7: detects rename, restores BOTH sides → clean working tree

Inverse case (allowed=new.py) is symmetric. When BOTH sides are
in-scope the rename is left in place.
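
The atomic rule reduces to a small decision function. This is an illustrative sketch only; `rename_paths_to_restore` is a hypothetical name.

```python
from typing import Collection, List


def rename_paths_to_restore(old: str, new: str, allowed: Collection[str]) -> List[str]:
    """Treat a rename as one unit: keep it only if both sides are in scope."""
    if old in allowed and new in allowed:
        return []
    # Either side out of scope: restore both so no half-rename remains.
    return [old, new]


print(rename_paths_to_restore("pdd/old.py", "pdd/new.py", {"pdd/old.py"}))
# ['pdd/old.py', 'pdd/new.py']
```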
…ith restore-based scope guard

B5a (empty-contract early-exit): _revert_out_of_scope_changes used to
return [] when allowed_paths was empty — the historical "scope guard
for a different module" optimization. With Issue #1013's degenerate-
empty contract (allowed_write_set=[] meaning reject-all), this early
exit silently bypassed enforcement. Now the check applies only when
allowed_paths is non-empty.

B5b (worktree-helper partial-rename bug): revert_out_of_scope_changes_with_dirs
kept only the rename destination from "R  old -> new" entries; the
out-of-scope source side was silently deleted. Treat renames atomically
(both sides scope-checked together) and revert via `git restore
--staged --worktree --source=HEAD -- <old> <new>` so the rename is
fully undone instead of half-undone.

B6 (prompt drift prevention): pdd/prompts/agentic_common_python.prompt
item 23 documented the OLD `git checkout HEAD --` revert and said
"behavior MUST remain". Future pdd sync regeneration would have drifted
the code back to the buggy version. Updated prompt 23 and the worktree-
helper prompt to describe restore-based revert + atomic rename
treatment + empty-contract reject-all semantics. The signature contract
is unchanged.

New regression tests use real tmp_path git repos to lock in:
- empty contract reverts a rename
- partial rename revert atomic in worktree helper
- mock-based rename test updated to assert both paths
…t re-scan

Codex iter-9 review (Major): _enforce_scope_guard treated empty lists from
the revert helpers as "nothing out of scope" — but those helpers fail-open
on git timeout, permission error, or restore failure (log a warning and
return []). The module could be marked successful while the contract
violation remained on disk.

Add _remaining_out_of_scope_paths() that re-scans the worktree via
`git status --porcelain --untracked-files=all` after both revert helpers
run. Filter duplicates against the already-reported offending set. Surface
remaining paths under a new "Unrecovered (revert failed, manual cleanup
required):" diagnostic section and hard-fail the module when either set
is non-empty.

On git failure, the helper returns the sentinel ["<git-status-failed>"]
so the orchestrator still hard-fails rather than silently treating an
unobservable working tree as clean. When offending is empty but remaining
is non-empty, the diagnostic header switches to "Scope guard detected
out-of-scope artifacts ... but the revert helpers reported no successful
reverts." instead of the misleading "reverted 0 out-of-scope file(s)".

Tests: 4 new regression cases in tests/test_agentic_sync_runner.py cover
fail-open, clean tree, mixed reverted+unrecovered, and the git-status
sentinel. Prompt updated to document the fail-closed contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
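
The fail-closed re-scan can be sketched as below. The sentinel string comes from this commit; the function names, the injectable `run_status` parameter, and the 30-second timeout are assumptions made so the example is self-contained.

```python
import subprocess
from typing import Callable, Iterable, List

GIT_STATUS_FAILED = "<git-status-failed>"


def _run_git_status(cwd: str) -> str:
    result = subprocess.run(
        ["git", "status", "--porcelain", "--untracked-files=all"],
        cwd=cwd, capture_output=True, text=True, timeout=30, check=True,
    )
    return result.stdout


def remaining_out_of_scope(
    cwd: str,
    allowed: Iterable[str],
    already_reported: Iterable[str],
    run_status: Callable[[str], str] = _run_git_status,
) -> List[str]:
    """Re-scan the worktree after the revert helpers have run."""
    try:
        output = run_status(cwd)
    except (subprocess.SubprocessError, OSError):
        # An unobservable tree must not be treated as clean.
        return [GIT_STATUS_FAILED]
    skip = set(allowed) | set(already_reported)
    remaining: List[str] = []
    for line in output.splitlines():
        if len(line) < 4:
            continue
        # Rename entries ('old -> new') contribute both sides.
        for path in line[3:].split(" -> "):
            if path not in skip and path not in remaining:
                remaining.append(path)
    return remaining
```

A caller would hard-fail the module whenever this list or the reverted list is non-empty.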
… guard bypass

Codex iter-10 review (Major): a contract that declared
``companion_allowlist: ["*"]`` (or ``**``, ``**/*``, ``?``) would let
``_matches_companion_allowlist`` treat arbitrary repo-wide writes as
auto-allowed companion artifacts, effectively neutralizing the scope
guard.

Add ``_is_valid_companion_pattern`` in ``agentic_common`` that requires
at least one path segment with a literal character (anything outside
``*?``) and rejects absolute, traversal, Windows-separator, and empty
patterns. Apply it at parse time in ``_parse_html_comment_contract``
(silent drop, matching the existing ``allowed_paths`` style) AND at
match time in both ``AsyncSyncRunner._matches_companion_allowlist``
and ``DurableSyncRunner._out_of_scope_staged_paths`` (defense-in-depth
in case a wildcard-only pattern reaches a runner via direct
construction bypassing the parser).

The default ``DEFAULT_SYNC_COMPANION_ALLOWLIST = (".pdd/meta/*.json",)``
passes the validator — regression-tested.

Tests: parse-time validator (4 cases on wildcard-only, anchored,
absolute/traversal, and default), runner-side defense (1 case per
runner). Prompt Req 21 cites the iter-10 finding and defines what
"invalid" means for companion patterns.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
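
A validator following the rules listed in this commit might be sketched as follows. It is an approximation written from the description above, not the PR's `_is_valid_companion_pattern`; the unprefixed name marks it as hypothetical.

```python
def is_valid_companion_pattern(pattern: str) -> bool:
    """Reject empty, absolute, traversal, backslash, and wildcard-only patterns."""
    if not pattern or "\\" in pattern or pattern.startswith("/"):
        return False
    segments = pattern.split("/")
    if ".." in segments:
        return False
    # Require at least one segment containing a character other than '*' or '?'.
    return any(any(ch not in "*?" for ch in segment) for segment in segments)


for candidate in ("*", "**/*", "?", ".pdd/meta/*.json"):
    print(candidate, is_valid_companion_pattern(candidate))
```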
…r B-2

Codex iter-11 review:

B-1 (Blocker, fixed): _parse_fenced_block_contract treated both `text` and
`json` fences as line-separated path lists, so a JSON fence body of
`["pdd/foo.py", "tests/test_foo.py"]` parsed as the single literal path
`'["pdd/foo.py", "tests/test_foo.py"]'`, and `[]` parsed as the literal
`('[]',)` instead of the empty (reject-all) contract. Capture the fence
language in the regex's `lang` group and branch `_parse_fenced_block_contract`
so the `json` body goes through `json.loads()` (array → paths, `[]` →
empty contract, non-array/malformed → None for permissive fallback) while
the `text` body keeps its existing line-by-line semantics. Both still emit
`source="fenced-block"`. Spec sentence in agentic_common_python.prompt §21
updated so prompt regeneration won't silently reintroduce the bug.

M-1 (Major, fixed): the AsyncSyncRunner `<pdd-interface>` JSON in
agentic_sync_runner_python.prompt omitted the `contract_source: Optional[str] = None`
constructor kwarg that exists in the code (agentic_sync_runner.py:878).
Append it to the prompt's signature and to the Req 1 constructor bullet
list with a one-line description of its diagnostic purpose. PDD prompts
are source of truth; a regeneration would otherwise drop the kwarg.

B-2 (deferred): the post-revert re-scan does not include gitignored
files. This is a theoretical attack surface — sync writes prompts, tests,
and code, none of which are typically gitignored — and no concrete
trigger has been observed. To be filed as a follow-up issue; no code
change here.

Tests: 5 new fenced-block parse tests (JSON array, empty JSON array,
malformed JSON, JSON object rejection, text-fence regression). One
existing test (`test_fenced_block_accepts_only_text_or_json`) updated
because it locked in the B-1 bug by feeding a bare path into a JSON
fence.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
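
The B-1 branch on fence language can be illustrated with a short sketch. It assumes the input is the text that immediately follows the contract heading, and it returns a plain tuple of paths rather than an `IssueContract`; `parse_fenced_block` and `FENCE` are hypothetical names, and path validation is omitted.

```python
import json
import re
from typing import Optional, Tuple

FENCE = "`" * 3  # built at runtime to keep this example's own fence intact

_FENCED = re.compile(
    r"\A\s*" + FENCE + r"(?P<lang>text|json)[ \t]*\n(?P<body>.*?)\n?[ \t]*" + FENCE,
    re.DOTALL,
)


def parse_fenced_block(after_heading: str) -> Optional[Tuple[str, ...]]:
    """Return allowed paths, () for a reject-all contract, or None if absent."""
    match = _FENCED.match(after_heading)
    if match is None:
        return None  # bare fence, other language, or fence not adjacent
    body = match.group("body")
    if match.group("lang") == "json":
        try:
            data = json.loads(body)
        except json.JSONDecodeError:
            return None  # malformed JSON: permissive fallback
        if not isinstance(data, list):
            return None  # JSON objects and scalars are not contracts
        return tuple(item for item in data if isinstance(item, str))
    # 'text' fences keep the line-by-line semantics.
    return tuple(line.strip() for line in body.splitlines() if line.strip())


print(parse_fenced_block(FENCE + 'json\n["pdd/foo.py", "tests/test_foo.py"]\n' + FENCE))
# ('pdd/foo.py', 'tests/test_foo.py')
```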
Codex iter-13 review:

M-1+M-2 (Major, fixed): pathlib.PurePosixPath.match is suffix-based when
the pattern is relative, so the default .pdd/meta/*.json companion glob
falsely matched subdir/.pdd/meta/foo.json and a/b/c/.pdd/meta/foo.json.
A contract violator could bypass the scope guard by writing
fingerprint-shaped files under any nested directory. Empirically
verified before fixing.

Add _matches_companion_pattern_anchored() in agentic_common.py:
segment-aware, anchored at the start of the path, equal segment count
required, per-segment fnmatch.fnmatchcase. Replace the inline
PurePosixPath.match call in both AsyncSyncRunner._matches_companion_allowlist
and DurableSyncRunner._out_of_scope_staged_paths so the bug stays fixed
in one place.

Tighten _is_valid_companion_pattern to also reject ** doublestar
segments. The anchored matcher does not implement recursive globbing;
allowing ** would reintroduce the same depth-bypass foot-gun via the
backdoor. Contracts that genuinely need depth-wildcard companions
should enumerate directories explicitly.

Two-part fix (not just the matcher swap): _enforce_scope_guard now
normalizes companion candidates MODULE-relative (was repo-relative).
The pattern .pdd/meta/*.json describes fingerprint metadata at the top
of each module's working directory; in a multi-module repo (module_cwd
is a subdirectory), the file lives at mod_a/.pdd/meta/x.json relative
to the repo root but at .pdd/meta/x.json relative to the module. The
old suffix-matcher obscured this by accidentally auto-allowing the
repo-relative form — that same accident was the M-1 bug surface. The
durable runner doesn't need this normalization (its staged paths are
already worktree-rooted).

M-3 (deferred): symlink-resolved out-of-scope path in
_revert_out_of_scope_changes. This helper is used by agentic_update,
agentic_fix, agentic_crash, agentic_e2e_fix_orchestrator, and the
sync scope guard — blast radius is beyond PR #1014. No concrete
trigger (LLMs do not write symlinks). To be filed as a follow-up
issue together with iter-11 B-2 (ignored-file rescan).

Tests: 3 anchored-matcher cases, 1 doublestar-validator case, 1
async-runner nested-meta-rejection case, 1 durable-runner equivalent.
Updated iter-10 test that previously accepted **/foo.json now expects
it dropped.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
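
The suffix-matching behavior and an anchored alternative can be compared in a few lines. The matcher below is a sketch written from the description in this commit (start-anchored, equal segment count, per-segment `fnmatch.fnmatchcase`), not the PR's `_matches_companion_pattern_anchored`; `matches_anchored` is a hypothetical name.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

PATTERN = ".pdd/meta/*.json"


def matches_anchored(rel_path: str, pattern: str) -> bool:
    """Match segment by segment from the start of the path."""
    path_segments = rel_path.split("/")
    pattern_segments = pattern.split("/")
    if "**" in pattern_segments:
        return False  # recursive globbing is deliberately unsupported
    if len(path_segments) != len(pattern_segments):
        return False
    return all(
        fnmatchcase(segment, pat)
        for segment, pat in zip(path_segments, pattern_segments)
    )


# Relative patterns in PurePosixPath.match are matched from the right,
# so a nested copy of the metadata directory also matches.
suffix_match = PurePosixPath("subdir/.pdd/meta/foo.json").match(PATTERN)
anchored_nested = matches_anchored("subdir/.pdd/meta/foo.json", PATTERN)
anchored_top = matches_anchored(".pdd/meta/foo.json", PATTERN)

print(suffix_match, anchored_nested, anchored_top)  # True False True
```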
Serhan-Asad and others added 16 commits May 15, 2026 12:12
Codex iter-15 review (Major): iter-14's anchored matcher fix landed in
the durable runner, but with the assumption "staged paths are already
worktree-rooted." That assumption held for single-module durable sync
but BROKE multi-module sync: when module_cwd is a subdirectory of the
worktree (e.g. `worktree/pkg` for module `pkg_mod`), staged paths
surface as `pkg/.pdd/meta/foo.json` relative to the worktree git root.
The default companion pattern `.pdd/meta/*.json` describes
MODULE-RELATIVE artifacts and does not match the worktree-rooted form,
so legitimate fingerprint metadata gets rejected — blocking valid
checkpoint commits under an otherwise correct split contract.

Thread `basename` and `module_worktree` through `_out_of_scope_staged_paths`
and resolve `module_cwd_rel` via `_module_cwd_for_worktree(...).relative_to(module_worktree)`.
When `module_cwd_rel` is non-empty:
- Staged paths NOT starting with `module_cwd_rel + "/"` are sibling-module
  artifacts and never auto-allow (preserves F1 iter-3 sibling rule).
- Staged paths within the module strip the prefix before the anchored
  companion matcher runs, so `pkg/.pdd/meta/foo.json` becomes `.pdd/meta/foo.json`
  for the match decision.

When `module_cwd_rel` is `""` or `"."` (single-module worktree), the
match stays repo-relative — preserves iter-14 single-module semantics
(`.pdd/meta/*.json` matches `.pdd/meta/foo.json`, NOT `subdir/.pdd/meta/foo.json`).

The allowed_write_paths check stays repo-relative: the contract is
declared with repo-rooted paths and is unchanged.

Tests: multi-module companion auto-allow, sibling-module rejection,
single-module regression (companion still matches), single-module
nested-meta regression (iter-14 fix still in effect). Five existing
call sites updated to pass new kwargs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
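
The prefix handling described above can be sketched as a small normalizer. `companion_candidate` is a hypothetical name; it returns the module-relative path to feed into the companion matcher, or `None` for sibling-module paths that must never be auto-allowed.

```python
from typing import Optional


def companion_candidate(staged_path: str, module_cwd_rel: str) -> Optional[str]:
    """Convert a worktree-rooted staged path to its module-relative form."""
    if module_cwd_rel in ("", "."):
        # Single-module worktree: the match stays repo-relative.
        return staged_path
    prefix = module_cwd_rel.rstrip("/") + "/"
    if not staged_path.startswith(prefix):
        return None  # sibling-module artifact
    return staged_path[len(prefix):]


print(companion_candidate("pkg/.pdd/meta/foo.json", "pkg"))    # .pdd/meta/foo.json
print(companion_candidate("other/.pdd/meta/foo.json", "pkg"))  # None
```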
Resolves merge conflicts between PR #1014 (split-contract scope guard) and
the v0.0.238 release on main (PR #1008 nested-architecture global sync +
metadata-sync fixes).

Three text conflicts (the rest auto-merged):

1. CHANGELOG.md — kept both the Unreleased `#1013 sync` entry and the
   v0.0.238 release block. No content lost.

2. pdd/prompts/agentic_sync_python.prompt — `run_global_sync` requirement
   bullet 1: kept main's nested-architecture restructure (scoped
   architecture modules via `find_architecture_for_project`,
   `_resolve_module_cwd`, display keys for multi-arch basenames) AND
   added `scope_guard: bool = True` to the signature with the kwarg-parity
   text. Bullets 2-4 are main's nested-architecture wording.

3. pdd/prompts/agentic_sync_runner_python.prompt — `AsyncSyncRunner`
   constructor signature in both the `<pdd-interface>` JSON and Req 1:
   merged main's `module_targets` positional with our scope-guard kwargs
   (`allowed_write_set`, `companion_allowlist`, `scope_guard_enabled`,
   `contract_source`). Param-bullet list already auto-merged correctly.

Python code auto-merged in agentic_sync.py / agentic_sync_runner.py: the
constructor now exposes both `module_targets` (main) and the four
scope-guard kwargs (#1014). All callers use keyword arguments so the
positional/keyword split is invisible.

Tests: 701 passing across test_agentic_common, test_agentic_sync,
test_agentic_sync_runner, test_durable_sync_runner,
test_sync_determine_operation, test_duplicate_cli_guard,
test_update_main, test_metadata_sync, test_commands_maintenance.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…missive log

Three verified findings from external review:

Blocker — the parser only supported HTML-comment and fenced-code-block
markers, but the real-world split-contract format used in issue #1005
(the original motivation for #1013) is neither. #1005's body declares
the contract under `## Split Contract` with a `**Allowed write set:**`
label and a markdown bullet list of paths. Our parser returned None
for this and pdd sync fell back to permissive mode — the exact
regression #1013 was meant to prevent. Verified end-to-end: fetching
#1005's body and feeding it into parse_issue_contract returned
allowed_paths=('pdd/update_main.py', 'pdd/prompts/update_main_python.prompt',
'tests/test_update_main.py'), source='bullet-list' after the fix.

Add a third parser branch `_parse_bullet_list_contract` in
agentic_common.py that anchors on `## Split Contract` / `## Allowed
Write Set` headings + `**Allowed write set:**` label + a `-`/`*`/`+`
bullet list. The list terminates at the next `**Label:**`, `---`
horizontal rule, next heading, blank-then-non-bullet, or EOF. Strips
surrounding backticks. Drops invalid entries via `_is_valid_contract_path`.
Returns the iter-8 B5 empty-contract reject-all when nothing valid
remains. Priority order: HTML-comment > fenced-block > bullet-list.

Major — DurableSyncRunner called super().__init__() BEFORE pinning
self.project_root = self.git_root, so AsyncSyncRunner snapshotted
_baseline_changed_paths from the caller's Path.cwd() rather than the
durable worktree's root. Those baseline paths then got auto-allowed
during enforcement, so a dirty out.py in the caller's checkout caused
out.py generated by sync in a separate worktree to pass scope guard
with diag=None.

Add a keyword-only project_root: Optional[Path] = None to
AsyncSyncRunner.__init__; when provided it overrides Path.cwd() BEFORE
the baseline snapshot. DurableSyncRunner now passes
project_root=self.git_root in its super() call; the redundant post-super
self.project_root assignment is removed.

Minor — permissive-mode runs were logging the "no contract" message
twice (once in agentic_sync.py before dispatch, once in
AsyncSyncRunner.run() entry). Drop the runner-side log; keep the
dispatch-site log which already has richer context.

Tests: 7 new bullet-list parser cases (incl. #1005 body verbatim,
section-terminator variants, backtick stripping, HTML-comment priority);
3 new project_root tests (durable pinning, async kwarg, baseline content);
2 existing iter-3 runner tests inverted to assert the duplicate log is
absent. 664 passed across the scope-guard suite. README §"Split-Contract
Scope Guard" documents the third format; prompt § 21 updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
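
A simplified sketch of the bullet-list branch is shown below. It is written from the description in this commit, stops at the first non-blank line that is not a bullet (which covers the label, rule, and heading terminators), and uses a reduced path check in place of `_is_valid_contract_path`. The function name and the sample issue body are hypothetical.

```python
import re
from typing import List, Optional, Tuple

_HEADING = re.compile(r"^##\s+(Split Contract|Allowed Write Set)\s*$", re.IGNORECASE)
_LABEL = re.compile(r"^\*\*Allowed write set:\*\*\s*$", re.IGNORECASE)
_BULLET = re.compile(r"^\s*[-*+]\s+(.*\S)\s*$")


def _looks_valid(path: str) -> bool:
    # Reduced stand-in for the real validator: no empty, absolute, or '..' paths.
    return bool(path) and not path.startswith("/") and ".." not in path.split("/")


def parse_bullet_list_contract(body: str) -> Optional[Tuple[str, ...]]:
    lines = body.splitlines()
    in_section = False
    start = None
    for index, line in enumerate(lines):
        stripped = line.strip()
        if _HEADING.match(stripped):
            in_section = True
        elif stripped.startswith("#"):
            in_section = False  # a different heading ends the section
        elif in_section and _LABEL.match(stripped):
            start = index + 1
            break
    if start is None:
        return None  # no contract found: permissive fallback
    paths: List[str] = []
    for line in lines[start:]:
        if not line.strip():
            continue
        bullet = _BULLET.match(line)
        if bullet is None:
            break  # next label, horizontal rule, heading, or prose
        candidate = bullet.group(1).strip("`").strip()
        if _looks_valid(candidate):
            paths.append(candidate)
    # An empty tuple is a valid reject-all contract.
    return tuple(paths)


SAMPLE_BODY = """## Split Contract

**Allowed write set:**
- `pdd/update_main.py`
- `pdd/prompts/update_main_python.prompt`
- tests/test_update_main.py
- ../escape.py

**Out of scope:**
- pdd/other.py
"""

print(parse_bullet_list_contract(SAMPLE_BODY))
```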
Codex flagged this gap three times (iter-11 B-2, iter-17 M-1, iter-19
M-1). Earlier passes deferred it as theoretical, but the project itself
gitignores .pdd/ (main commit a7ce5f0 chore: ignore untracked pdd
metadata artifacts), so the trigger is no longer hypothetical:
- User has `build/` (or `.pdd/`) in .gitignore.
- Sync writes a file there.
- The post-revert re-scan uses `git status --untracked-files=all`
  which OMITS gitignored files.
- Scope guard returns None and the contract violation lands in the PR.

Add `_git_ignored_paths()` using
`git ls-files --others --ignored --exclude-standard` (enumerates every
ignored file, not just dirs). Snapshot `_baseline_ignored_paths` at
runner init, gated on `scope_guard_enabled AND allowed_write_paths
is not None` so non-contract runs skip the cost (potentially slow on
repos with large ignored trees like node_modules/, build/).

In `_remaining_out_of_scope_paths`, run a second scan over ignored
files after the existing `git status` scan. Skip paths in the baseline
(pre-existing — not sync's fault) and paths in `allowed_files`
(companion-allowlisted, e.g. `.pdd/meta/*.json` for users who gitignore
`.pdd/`). The remainder is treated as out-of-scope and surfaces under
the existing `Unrecovered (revert failed, manual cleanup required):`
section. The `<git-status-failed>` sentinel still applies — failure of
the ignored scan also forces a hard-fail rather than silent pass.

DurableSyncRunner left untouched: it does `git add -A` before scope
checking, so ignored files surface as staged regardless.

Tests: gitignored-out-of-scope detection, baseline-preserved-on-existing,
gitignored-companion-allowlisted-still-allowed (verifies the .pdd/
workflow the project actually uses), ignored-scan-failure-sentinel.
162 passed in test_agentic_sync_runner.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…WIP doesn't leak into worktrees

Codex iter-21 caught a baseline leakage that iter-18's project_root
threading did NOT cover. iter-18 pinned the AsyncSyncRunner baseline
snapshot to self.git_root, but in production self.git_root IS the
user's main checkout — where dirty WIP lives. Per-module durable sync
then runs in .pdd/worktrees/sync-issue-<N>-<basename>/ (a separate
directory). Inheriting the main-checkout baseline meant dirty out.py
in the main checkout was treated as allowed inside fresh worktrees,
silently bypassing the split contract. External reviewer's repro
confirmed: dirty out.py in main → sync writes out.py to worktree →
scope guard returned None.

Fix: after super().__init__(), clear both _baseline_changed_paths
(iter-6 B1) and _baseline_ignored_paths (iter-20 M-1) in
DurableSyncRunner. Per-module worktrees are freshly created via
git worktree add; they have no pre-existing user WIP by construction,
so the "preserve untracked WIP" carve-out has no analog in durable
mode. Anything that surfaces in a worktree at scope-guard time was
put there by this sync run.

AsyncSyncRunner unchanged — async still runs in-place in the user's
checkout, and the baseline preserves pre-existing WIP correctly there.

Tests: empty-baseline invariant (even when git_root is dirty), and
the reviewer's repro (dirty out.py in caller cwd + out.py in separate
worktree → diagnostic NOT None). iter-18 test
test_durable_baseline_paths_use_git_root_not_caller_cwd inverted to
the new invariant; iter-18 test for dirty-files-in-durable-root
replaced by the iter-22 empty-baseline invariant.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex iter-23 demonstrated: dirty out-of-scope outside.py + sync
overwrite → scope guard returned None, leaving "sync clobbered"
content in place of user WIP. The iter-6 B1 baseline-preservation
rule (don't delete pre-existing untracked files) was implemented as
a name-based pass: any path in _baseline_changed_paths was added to
allowed_files unconditionally, so an LLM write to an out-of-scope
path that happened to coincide with the user's dirty WIP bypassed
the contract check silently.

Make baseline preservation content-aware. _baseline_changed_paths and
_baseline_ignored_paths become Dict[path, Optional[str]] mapping
repo-relative path → init-time SHA-1. At scope-guard time, re-hash each
baseline path; only auto-allow when current SHA matches init-time SHA.
Clobbered (different SHA) and deleted (None) entries fall through to
the normal contract check — sync's out-of-scope writes now surface
in the diagnostic even when they happen to coincide with a baseline
filename. Unreadable-at-init paths (init SHA was None) fall back to
the legacy name-based preserve to avoid false-positives on
permission-flaky paths.
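
The content-aware rule can be sketched roughly as follows. The function names here (`hash_file`, `hash_baseline_paths`, `preserved_baseline_paths`) are illustrative and not necessarily the ones in the runner; the sketch only assumes the Dict[path, Optional[str]] shape described above.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, Optional, Set


def hash_file(path: Path) -> Optional[str]:
    """SHA-1 of the file contents, or None if it cannot be read."""
    try:
        return hashlib.sha1(path.read_bytes()).hexdigest()
    except OSError:
        return None


def hash_baseline_paths(
    root: Path, rel_paths: Iterable[str]
) -> Dict[str, Optional[str]]:
    """Fingerprint each baseline path at init time."""
    return {rel: hash_file(root / rel) for rel in rel_paths}


def preserved_baseline_paths(
    root: Path, baseline: Dict[str, Optional[str]]
) -> Set[str]:
    """Baseline paths that may still be auto-allowed at scope-guard time."""
    allowed: Set[str] = set()
    for rel, init_sha in baseline.items():
        if init_sha is None:
            # Unreadable at init: fall back to name-based preservation.
            allowed.add(rel)
        elif hash_file(root / rel) == init_sha:
            # Content unchanged since init: still the user's WIP.
            allowed.add(rel)
        # Clobbered or deleted entries fall through to the contract check.
    return allowed
```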

Cost: hashlib.sha1 over each baseline path, only when
scope_guard_enabled AND allowed_write_paths is not None. Baseline
sets are typically small (user's local dirty WIP, not whole tree).

DurableSyncRunner's iter-22 baseline-clearing now uses empty dicts
({}) instead of empty sets — the iteration is a no-op either way,
but the type swap matches the new invariant. Verified the existing
durable test_baseline_remains_empty_after_init continues to pass.

Verified the codex repro is closed: before this fix `diag: None /
outside: sync clobbered`; after, diagnostic surfaces outside.py in
the Unrecovered block.

Tests: clobber-detected, unchanged-still-allowed (iter-6 B1
regression), deleted-baseline-drops-from-allowed, durable-baseline
empty-dict invariant. 673 passed across the scope-guard suite.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rrections

Reviewer caught a real contract bypass: pdd/agentic_sync.py:1903 calls
_apply_architecture_corrections() BEFORE any runner exists, modifying
architecture.json at the orchestrator level. The per-module scope guard
never sees this write. Plus the "All modules are already synced — nothing
to do." early return at line 1973 short-circuits the dispatch entirely
without running enforcement. Reviewer reproduced: contract allowed only
pdd/foo.py, LLM dependency-correction modified architecture.json, sync
returned success with M architecture.json in git status.

Add an orchestrator-level scope gate around the deps-correction call.
architecture.json may be modified iff:
- No contract parsed (issue_contract is None → permissive mode), OR
- The caller passed --no-scope-guard (scope_guard is False → opt-out), OR
- architecture.json is explicitly in the contract's allowed_paths.

Otherwise, emit an explicit skip warning telling the operator to add
architecture.json to the contract or rerun with --no-scope-guard, and
do not invoke _apply_architecture_corrections. This closes the bypass
WITHOUT needing to revert post-hoc — the orchestrator simply refuses
the only out-of-contract write it can perform. The "already synced"
early return now has nothing to enforce because the orchestrator never
performs the write in the first place.

Plus docs drift: CHANGELOG.md:5 listed only HTML-comment and fenced-block
contract formats; the iter-18 bullet-list format ("## Split Contract"
+ "**Allowed write set:**" + bullets) was missing. _extract_allowed_write_paths
docstring at agentic_sync.py:1543 had the same omission.

Tests: 4 codepath cases (skipped/applied/no-contract/opt-out) plus a
defensive "already-synced early-return does not leak arch changes"
test that uses a real `git init` tmp repo and asserts `git status` is
clean for architecture.json. 678 passed across the scope-guard surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex iter-27 found two BLOCKERS in the pre-runner orchestrator path
that bypass scope-guard enforcement entirely.

B-1: pdd/agentic_sync.py:1284 executes an LLM-suggested ``pdd sync``
shell command to validate the cwd after a dry-run failure. The cmd is
passed through to ``subprocess.run(shell=True)`` verbatim. If the LLM
returns the command without ``--dry-run``, the validation step
performs a real sync and writes files BEFORE the scope-guarded runner
is constructed.

Add ``_inject_dry_run_flag()`` that injects ``--dry-run`` into every
``pdd sync`` invocation in the suggested command using a regex with
a positional lookahead so we don't match ``pdd sync-architecture`` or
``pdd synchronize``. Idempotent when the flag is already present.
Paranoia check refuses to execute if injection didn't land. LLM prompt
template updated to require ``--dry-run`` and note auto-injection.

B-2: the iter-26 ``arch_in_scope`` gate compared the literal string
``"architecture.json"`` against the contract. But ``arch_path`` can be
a nested file like ``frontend/architecture.json``. When the contract
names only the root ``architecture.json``, the gate passes a write to
the nested file that the contract never allowed; in the reverse case
it blocks a legitimate nested-arch write when the contract correctly
names ``frontend/architecture.json``.

Add ``_arch_path_in_scope(arch_path, project_root, issue_contract,
scope_guard)`` that resolves the ACTUAL ``arch_path`` to a repo-
relative POSIX string and compares against ``issue_contract.allowed_paths``.
Returns False when arch_path resolves outside project_root.
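
A rough sketch of the path comparison, assuming the contract is reduced to a set of repo-relative POSIX strings (or None when no contract was parsed). The ordering of the permissive check ahead of the outside-root check is an assumption of this sketch, not something the commit states.

```python
from pathlib import Path
from typing import Optional, Set


def arch_path_in_scope(
    arch_path: Path,
    project_root: Path,
    allowed_paths: Optional[Set[str]],
    scope_guard: bool,
) -> bool:
    """True when a write to arch_path is permitted by the contract."""
    if not scope_guard or allowed_paths is None:
        # Opt-out or permissive mode: nothing to enforce.
        return True
    try:
        rel = arch_path.resolve().relative_to(project_root.resolve())
    except ValueError:
        # arch_path resolves outside project_root.
        return False
    # Compare the actual repo-relative path, not the bare file name.
    return rel.as_posix() in allowed_paths
```

With this shape, a contract naming `frontend/architecture.json` admits only that file, and a contract naming the root `architecture.json` no longer admits nested ones.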

Tests: B-1 — injection cases (no-flag, has-flag, cd-chained, must-
not-match sync-architecture) plus 7 helper-level cases. B-2 — nested
arch in/out-of-contract, arch outside project root, plus 7 helper-
level cases. 699 passed across the scope-guard surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes the entire class of pre-dispatch contract bypasses that
iter-26/-28/-29 surfaced one site at a time. The previous fixes
each gated a single orchestrator write site; codex kept finding new
ones (architecture corrections → LLM-suggested shell cmd → nested
arch path → write-capable identify-modules → free-form shell). The
structural issue: run_agentic_sync does write-capable work BEFORE
the runner exists, so AsyncSyncRunner._enforce_scope_guard cannot
see those writes.

PART 1 — Replace LLM shell execution with safe argv.

iter-28's --dry-run injection was the wrong shape: shell=True with
an LLM-provided string is the actual wound, and injection doesn't
stop `rm`, redirects, or chained writes (codex iter-29 B-2).

Rewrite the prompt template (agentic_sync_fix_dry_run_LLM.prompt)
to ask for SYNC_CWD: <path> only — the LLM identifies the directory,
nothing else. Parse and validate (no shell metachars; resolves under
project_root). Build the argv ourselves: [pdd_exe, --force, sync,
basename, --dry-run, --agentic, --no-steer]. Run with shell=False.
LLM-controlled shell execution is gone by construction. Legacy
SYNC_CMD: format produces an explicit migration error so stale
cached responses surface a clear retry hint. _inject_dry_run_flag
helper deleted — it has no surface anymore.
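
The parse-validate-build flow could look roughly like this. The argv order is taken from the description above; the function names, the exact metacharacter set, and the error messages are assumptions for illustration only.

```python
from pathlib import Path
from typing import List

# Illustrative set; the real validator may reject more or fewer characters.
_SHELL_METACHARS = frozenset(";&|<>`$(){}[]*?!\\\"'")


def parse_sync_cwd(response: str, project_root: Path) -> Path:
    """Extract and validate the directory from a `SYNC_CWD: <path>` reply."""
    lines = [line.strip() for line in response.splitlines()]
    if any(line.startswith("SYNC_CMD:") for line in lines):
        raise ValueError("legacy SYNC_CMD: format; retry for a SYNC_CWD: reply")
    values = [line[len("SYNC_CWD:"):].strip()
              for line in lines if line.startswith("SYNC_CWD:")]
    if not values or not values[0]:
        raise ValueError("no SYNC_CWD: line in response")
    raw = values[0]
    if any(ch in _SHELL_METACHARS for ch in raw):
        raise ValueError("shell metacharacters are not allowed in SYNC_CWD")
    root = project_root.resolve()
    candidate = (root / raw).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError("SYNC_CWD resolves outside project_root")
    return candidate


def build_dry_run_argv(pdd_exe: str, basename: str) -> List[str]:
    """The argv is assembled locally; the LLM never supplies a command."""
    return [pdd_exe, "--force", "sync", basename,
            "--dry-run", "--agentic", "--no-steer"]
```

The resulting list would then be passed to `subprocess.run(argv, cwd=cwd, shell=False)`, so redirects, `rm`, and chained commands have no way in.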

PART 2 — Orchestrator-level scope guard.

Add _enforce_orchestrator_scope in agentic_sync.py. Snapshot the
working tree at run_agentic_sync entry — _git_changed_paths +
_git_ignored_paths with SHA fingerprints (iter-24 shape, shared via
new _hash_baseline_paths helper in agentic_sync_runner.py). At every
early-return BEFORE runner dispatch, revert anything outside the
contract + companion allowlist + baseline. Same primitives the
per-module guard uses: _revert_out_of_scope_changes (tracked),
revert_out_of_scope_changes_with_dirs (untracked), iter-9 fail-
closed re-scan for unrecovered paths, iter-14 anchored companion
matcher, iter-24 SHA-aware baseline preservation. 9 early-return
sites wrapped via _orch_scope_check_return helper. Dispatch sites
(2164/2166) intentionally not wrapped — the runner's own guard
handles enforcement once it exists.

Gated on (scope_guard AND issue_contract is not None) so non-
contract runs and --no-scope-guard opt-out pay zero cost.

iter-26's _arch_path_in_scope gate retained as defense-in-depth —
the orchestrator guard would also catch out-of-contract arch.json
writes, but the gate prevents them in the first place, which reads
better in operator output.

Tests: 6 unit cases on _enforce_orchestrator_scope (revert/preserve/
clobber/permissive/opt-out/companion auto-allow); 5 integration
cases on run_agentic_sync with the new wrap; 4 cases for the new
safe argv path. iter-28's TestInjectDryRunFlag and
TestLlmFixDryRunInjection deleted (the helper they exercised is
gone). 703 passed across the scope-guard suite.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex iter-31 caught the gap iter-30 left: the unified orchestrator
guard wrapped 9 early-return sites but NOT the successful-dispatch
path. Any pre-dispatch write that survived past every early-return
check would reach the runner, where AsyncSyncRunner.__init__
snapshots it as `_baseline_changed_paths` and preserves it (auto-
allows) for the rest of the sync session.

Insert one `_enforce_orchestrator_scope` call immediately before the
`if durable:` runner-class branch. If diagnostic is non-None, post
the standard error comment to the issue (when github state is
enabled) and return False with the diagnostic. Aborts dispatch
cleanly before any runner is constructed.

Verified there is exactly one intervening block (the durable/async
branch itself) between the iter-30 entry baseline and the runner
constructions, so a single check above the branch covers both
DurableSyncRunner and AsyncSyncRunner paths.

Tests: dispatch blocked when pre-dispatch writes out-of-contract;
dispatch allowed when clean; no-op with no contract; no-op with
--no-scope-guard. 707 passed. Five pre-existing
TestDependencyCorrectionsScopeGuard tests adjusted to init a clean
tmp_path git repo so they don't trip on the new dispatch-boundary
sweep when run against the real dirty worktree.

The dispatch-boundary check coexists with iter-30's early-return
wraps — both required because early returns never reach the dispatch
boundary.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pulls in main's `e77769262 fix(ci-heal): scope auto-heal staging` which
addresses codex iter-33 M-1 (`commit_and_push` regressed to `git add -A`
for ci-drift-heal). M-2 (review-loop fixer staging) may also be addressed
in adjacent commits.

Auto-merge clean — no conflicts. 713 passed across the scope-guard surface.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…line files

Codex iter-33 caught a data-loss blind spot in the iter-24 hash-aware
baseline preservation: when `_hash_file` returns None at scope-guard
time, iter-24 just `continue`'d (dropped the baseline entry), assuming
either git status would surface a deleted-tracked file as `D` or the
file genuinely became out-of-scope. For UNTRACKED and IGNORED baseline
paths git has no record at all — a deletion left zero trail, the
scope guard returned None, and the user's pre-existing WIP was
silently lost.

Trigger (verifiable): user has dirty untracked `userwip.py`. Sync
deletes it (LLM moves/removes during refactor). Scope guard saw
`current_hash is None` → dropped baseline → returned None → module
passed clean. `userwip.py` is gone, no diagnostic.

Fix: collect deleted baseline paths into a `baseline_deleted` set
during both baseline iterations (changed AND ignored). At diagnostic-
build time, union with the existing `remaining_raw` re-scan set
(dedup is fine — overlap covers tracked-deletion cases where git
status would also surface the `D`). The new symmetric pass over
`_baseline_ignored_paths` is mandatory because `git ls-files
--ignored` only lists files that still exist; without this pass a
deleted ignored baseline would never appear in the ignored-rescan
loop.

Tests: untracked baseline deletion flagged; ignored baseline
deletion flagged; iter-24 unchanged-file preservation regression
still passes. 715 passed. Two pre-existing tests
(test_clean_working_tree_returns_none,
test_deleted_companion_in_git_status_is_preserved) explicitly clear
the baseline dicts to match their stated "clean working tree"
intent — previously they relied on the iter-24 silent-drop branch
that this fix removes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex iter-35 found three Blockers where the iter-30/32/34 orchestrator
guard didn't fully mirror the per-module runner-side safety features.

B-1: PDD's own `run_agentic_task` writes `.pdd/agentic-logs/session_*.jsonl`
audit logs during EVERY pre-dispatch LLM call. The orchestrator scope
guard treated them as out-of-contract and hard-failed otherwise clean
contracted runs.

B-2: `AsyncSyncRunner._record_result` writes `.pdd/agentic_sync_state.json`
after each per-module scope guard. In a multi-module sync, the state
file is on disk when the NEXT module's guard runs and the next module
would hard-fail on the previous module's state file.

B-3: `_enforce_orchestrator_scope` iterated baseline dicts but on
`current_hash is None` it silently `continue`'d, dropping the deleted
entry. iter-34 closed this for the runner guard via a `baseline_deleted`
set unioned into the unrecovered diagnostic — the orchestrator guard
didn't mirror that pattern. Pre-dispatch LLM/subprocess deletion of
pre-existing user WIP would pass the orchestrator guard silently.

Fix: add `PDD_INTERNAL_PATH_ALLOWLIST` constant in agentic_common as a
SEPARATE concept from the user-facing `DEFAULT_SYNC_COMPANION_ALLOWLIST`.
Internal allowlist is fixed (not user-extensible) and represents tool
infrastructure (`.pdd/agentic-logs/*`, `.pdd/agentic_sync_state.json`,
`.pdd/bug-state/*`, `.pdd/checkup-review-loop/*`). Wire it through both
guards via the existing anchored matcher so `subdir/.pdd/agentic-logs/`
still does NOT match (anchoring preserved). Per-module guard scans
repo-rooted (not module-rooted) so a top-level audit log matches even
when module_cwd is a subdirectory.

DurableSyncRunner inherits `_enforce_scope_guard` from AsyncSyncRunner
so B-1/B-2 fix applies transitively — no separate durable change.

Mirror iter-34's `baseline_deleted` set into the orchestrator guard
for both `_orch_baseline_changed` and `_orch_baseline_ignored`. Union
into the existing "Unrecovered (revert failed, manual cleanup required)"
diagnostic section — parity with runner-side wording.

Tests: audit log auto-allowed at orchestrator level; audit log
auto-allowed at runner level (including the multi-module anchoring
asymmetry); state file auto-allowed at runner level; deleted untracked
baseline flagged; deleted ignored baseline flagged. 721 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Codex iter-37 (last remaining major after iter-36 closed parity gaps):
when ``_git_changed_paths`` / ``_git_ignored_paths`` fails at runner /
orchestrator ``__init__`` (transient git timeout, missing binary,
OSError, non-zero return), the helpers returned ``set()`` —
indistinguishable from "scan succeeded, worktree clean." The empty
baseline was stored. Later enforcement-time probes typically succeed
(transient passed), and the scope guard treated the empty baseline as
"user had nothing dirty," so any pre-existing user file was falsely
flagged as out-of-scope and reverted/deleted. Same fail-open pattern
as iter-9 sentinel but at init-time rather than enforcement-time.

Change helper return type from ``set[str]`` to ``Optional[set[str]]``.
``None`` signals failure; empty set still signals "clean worktree."
Runner/orchestrator init records ``_baseline_acquisition_failed``
flag when either helper returned ``None`` (only when scope_guard is
enabled with a contract — permissive mode skips the scan and the gate).
``AsyncSyncRunner.run()`` short-circuits with a fail-closed message
before any write-capable work. ``run_agentic_sync`` mirrors the gate
before any pre-dispatch LLM/shell work AND posts the GitHub comment.
DurableSyncRunner inherits via super().__init__()/run().
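
The distinction between "scan failed" and "worktree clean" can be sketched as below. `acquire` and `snapshot_baseline` are hypothetical names, and the probes stand in for the real git helpers; only the None-versus-empty-set convention comes from the change itself.

```python
import subprocess
from typing import Callable, Optional, Set, Tuple

Probe = Callable[[], Set[str]]


def acquire(probe: Probe) -> Optional[Set[str]]:
    """Run a git probe; None means the scan failed, set() means clean."""
    try:
        return probe()
    except (OSError, subprocess.SubprocessError):
        # Missing binary, timeout, or non-zero exit: do not pretend
        # the worktree was clean.
        return None


def snapshot_baseline(
    changed_probe: Probe, ignored_probe: Probe
) -> Tuple[Optional[Set[str]], Optional[Set[str]], bool]:
    """Return both baselines plus a flag telling the caller to fail closed."""
    changed = acquire(changed_probe)
    ignored = acquire(ignored_probe)
    failed = changed is None or ignored is None
    return changed, ignored, failed
```

A caller that sees the flag set would abort before any write-capable work, rather than later reverting user files against an empty baseline.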

Enforcement-time call sites (iter-20 ignored re-scan, iter-9 fail-
closed boundary) coerce ``None`` to an empty set (``or set()``), since
their separate ``<git-status-failed>`` sentinel handles those policy
decisions.

Prompt drift fix in agentic_sync_runner_python.prompt §44-46
documents the new helper signature and ``_baseline_acquisition_failed``
contract.

Tests: 6 runner-level cases (changed-None, ignored-None, OSError,
permissive skip, scope-guard-disabled skip, success-empty regression)
plus 4 orchestrator-level cases (changed-None abort, ignored-None
abort, success regression, permissive skip). 731 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…le init ordering

Codex iter-39 (2 Majors, no Blockers — strong convergence signal):

M-1: _hash_file() returned None for BOTH "file gone" and "file
unreadable" (permission error). The iter-34 deletion-detection code
treated any None as deletion, so a baseline file that became
unreadable mid-sync (permission flip, locked file) got misclassified
as deleted and the diagnostic falsely claimed the file was removed.

Add sibling helper _classify_baseline_path() returning a discriminated
_BaselinePathStatus(sha, missing) NamedTuple. Route the 4 iter-34
baseline-iteration sites (per-module changed + ignored loops;
orchestrator changed + ignored loops) through it. Unreadable files
now preserve-by-name (same as iter-24's unreadable-at-init carve-out)
instead of being misclassified as deletions. Three other _hash_file
callers (_hash_baseline_paths and two collapsed-None re-scan loops)
intentionally keep _hash_file since their fall-through semantics
are already safe.
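
One way to separate "gone" from "unreadable" is sketched here. The field names follow the `_BaselinePathStatus(sha, missing)` shape named above; mapping FileNotFoundError to "missing" and every other OSError to "unreadable" is an assumption of this sketch.

```python
import hashlib
from pathlib import Path
from typing import NamedTuple, Optional


class BaselinePathStatus(NamedTuple):
    sha: Optional[str]   # SHA-1 of the contents, or None if unavailable
    missing: bool        # True only when the path no longer exists


def classify_baseline_path(path: Path) -> BaselinePathStatus:
    """Hash a baseline path, distinguishing deletion from read failure."""
    try:
        data = path.read_bytes()
    except FileNotFoundError:
        # Deleted since the baseline snapshot: report it.
        return BaselinePathStatus(sha=None, missing=True)
    except OSError:
        # Exists but unreadable (permissions, lock): preserve by name.
        return BaselinePathStatus(sha=None, missing=False)
    return BaselinePathStatus(sha=hashlib.sha1(data).hexdigest(), missing=False)
```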

M-2: DurableSyncRunner.run() called _prepare_durable_branch() BEFORE
checking _baseline_acquisition_failed. A baseline-scan failure left
durable side effects (worktree creation, branch checkout, possibly
remote pushes) before the inherited fail-closed abort ran.

Hoist the _baseline_acquisition_failed check above
_prepare_durable_branch() in DurableSyncRunner.run() so the abort
happens BEFORE any durable setup. Iter-22's {} baseline clear
preserves the flag, so it correctly reflects the main-checkout
scan failure (where the orchestrator scope guard operates).

Tests: unreadable-not-misclassified (with chmod, skipped on
Windows); missing-still-flagged-as-deleted regression for iter-34;
durable-aborts-before-worktree-setup. 734 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…eckpoint validation

Codex iter-41 — same "async fix needs durable mirror" pattern as
iter-14→iter-16 and iter-22 baseline-clear. iter-36 added
PDD_INTERNAL_PATH_ALLOWLIST (`.pdd/agentic-logs/*`,
`.pdd/agentic_sync_state.json`, etc.) to the async per-module
scope guard so PDD's own infrastructure writes don't trip the
guard. The durable runner's _out_of_scope_staged_paths and
_unsafe_staged_paths didn't honor that allowlist — contracted
durable sync hard-failed at checkpoint on PDD's own audit logs
and state file.

Import the allowlist in durable_sync_runner. In both
_out_of_scope_staged_paths and _unsafe_staged_paths, check
internal-allowlist patterns BEFORE the contract/unsafe rules and
treat matches as in-scope. Patterns are repo-root-anchored so the
check runs against `normalized` directly (no module_cwd prefix
stripping needed). Anchored matcher enforces equal segment count
so a nested `packages/app/.pdd/agentic_sync_state.json` does NOT
match the root `.pdd/agentic_sync_state.json` pattern — existing
nested-rejection test continues to pass unchanged.
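
A minimal sketch of a root-anchored matcher with the equal-segment-count rule, using `fnmatch` per path segment. The pattern list copies the entries quoted in the iter-36 commit; the function names and the per-segment matching strategy are assumptions, and the project's matcher may be implemented differently.

```python
from fnmatch import fnmatchcase
from typing import Iterable

PDD_INTERNAL_PATH_ALLOWLIST = (
    ".pdd/agentic-logs/*",
    ".pdd/agentic_sync_state.json",
    ".pdd/bug-state/*",
    ".pdd/checkup-review-loop/*",
)


def anchored_match(path: str, pattern: str) -> bool:
    """Match a repo-relative POSIX path against a root-anchored pattern."""
    path_parts = path.strip("/").split("/")
    pattern_parts = pattern.strip("/").split("/")
    if len(path_parts) != len(pattern_parts):
        # Nested copies have extra leading segments and never match.
        return False
    return all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


def is_internal_path(
    path: str, patterns: Iterable[str] = PDD_INTERNAL_PATH_ALLOWLIST
) -> bool:
    return any(anchored_match(path, pattern) for pattern in patterns)
```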

Confirmed via grep that durable runner only force-adds
`.pdd/meta/<safe>_*.json` via _force_add_module_metadata; under
normal `git add -A` the gitignored `.pdd/` tree is skipped, so
the validation-skip approach is sufficient (no separate staging
exclusion needed).

Tests: audit-log not flagged, state-file not flagged, unrelated
.pdd/random/junk.txt still flagged, unsafe-rules skip internal
allowlist. One pre-existing
test_unsafe_staged_paths_rejects_sensitive_artifacts updated to
reflect the new behavior. 738 passed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>