diff --git a/CHANGELOG.md b/CHANGELOG.md index aeb61b1b6..52058512a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -1,3 +1,9 @@ +## Unreleased + +### Fix + +- **#1013 sync**: enforce split-contract allowed write sets. When the linked GitHub issue declares an allowed write set (HTML comment ``, a fenced "Allowed Write Set" / "Split Contract" block, or a `## Split Contract` heading with an `**Allowed write set:**` label followed by a bullet list), `pdd sync` now reverts tracked changes and removes untracked new files that fall outside the contract after each per-module subprocess, hard-fails the module on out-of-scope artifacts, and surfaces the contract source plus offending paths in checkup/review-loop reports. Companion artifacts under `.pdd/meta/*.json` are auto-allowed; additional companions can be opted in via the contract's `companion_allowlist` field. Use `--no-scope-guard` to opt out for a single run. Issues without a contract marker remain in permissive mode (no enforcement). + ## v0.0.238 (2026-05-14) ### Feat diff --git a/README.md b/README.md index 14d3fd234..572c80eac 100644 --- a/README.md +++ b/README.md @@ -872,6 +872,7 @@ Options: - `--durable-branch TEXT`: Durable mode only. Override the durable checkpoint branch name. Default is `sync/issue-` derived from the GitHub issue. Refused if it resolves to `main`, `master`, or the repository default branch. - `--no-resume`: Durable mode only. Ignore existing `PDD-Sync-Checkpoint-V1` commit trailers on the durable branch and re-run every selected module. By default, durable sync reads checkpoint trailers (`PDD-Sync-Checkpoint-V1: issue= module=`) and skips modules already checkpointed for the same issue, which is what makes a cloud rerun safely resume completed work after a partial failure. - `--durable-max-parallel INT`: Durable mode only. Cap how many module worktrees run concurrently. Defaults to the standard runner concurrency. A total budget still forces sequential execution. 
+- `--no-scope-guard`: Issue-sync only. Disable the split-contract scope guard for this run. By default, when the linked GitHub issue declares an allowed write set (split contract), `pdd sync` enforces it and rejects out-of-scope generated artifacts. Pass this flag only when intentionally overriding contract enforcement (e.g. recovering from a stale contract). See "Split-Contract Scope Guard" below. **Durable Issue Sync** (`--durable`): @@ -1060,6 +1061,61 @@ Options (agentic mode): **Cross-Machine Resume**: Workflow state is stored in a hidden GitHub comment, enabling resume from any machine. Use `--no-github-state` to disable. +**Split-Contract Scope Guard** (Issue #1013): + +When the linked GitHub issue declares an allowed write set (a "split contract"), `pdd sync` enforces it: each per-module subprocess is followed by a scope check that reverts tracked changes and removes untracked new files that fall outside the contract. Companion artifacts under `.pdd/meta/*.json` are auto-allowed because they are sync's own fingerprint bookkeeping; issues may opt additional companions (e.g. examples or architecture entries) into the allowlist explicitly. + +The contract is read from the issue body or any of its comments in one of three forms (tried in priority order — the first match wins): + +1. An HTML-comment block (preferred — invisible in rendered Markdown): + ```html + + ``` +2. A fenced code block under a heading like `### Allowed Write Set` or `### Split Contract`: + ```text + pdd/update_main.py + pdd/prompts/update_main_python.prompt + tests/test_update_main.py + ``` +3. A bullet list under an inline `**Allowed write set:**` label (the + real-world shape used by sub-issues such as #1005): + + ```markdown + ## Split Contract + **Command sequence:** change → sync + **Allowed write set:** + - `pdd/update_main.py` + - `pdd/prompts/update_main_python.prompt` + - `tests/test_update_main.py` + **Acceptance criteria:** + - ... 
+ ``` + + The heading regex is the same as form 2; the inline `**Allowed write set:**` label discriminates the bullet list so unrelated bullets earlier in the body (e.g. a `## Files` section) are NOT captured. Each bullet is one repo-relative POSIX path with optional surrounding backticks. The list terminates at the next `**Label:**` (such as `**Acceptance criteria:**`), a `---` rule, another heading, a non-blank non-bullet line, or end of body. + +When an out-of-scope change is detected, the run records a hard failure for that module with a diagnostic of the form: + +``` +Scope guard reverted N out-of-scope file(s) for module '' (contract source: ): + - path/relative/to/repo + - another/path +Allowed write set: + - path/from/contract +Companion allowlist: + - .pdd/meta/*.json +``` + +This blocks the per-module success record so dependent modules do not schedule on top of an out-of-scope sync, and checkup/review-loop reports surface the failure instead of letting unrelated artifacts land in the PR. When no contract marker is present, the scope guard falls back to permissive mode — no enforcement, no reverts — preserving existing behavior for issues that have not opted in. Use `--no-scope-guard` to disable enforcement for a single run when you intentionally need to override the contract. + ### 1a. sync-architecture Sync `architecture.json` from prompt metadata tags (``, ``, and ``). This is useful after editing prompt metadata directly, or after backfilling prompt tags, so the architecture graph and command metadata stay aligned with the prompts. 
diff --git a/architecture.json b/architecture.json index 27696670d..20fc652f0 100644 --- a/architecture.json +++ b/architecture.json @@ -128,6 +128,16 @@ "name": "clear_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> None", "returns": "None" + }, + { + "name": "parse_issue_contract", + "signature": "(issue_body: Optional[str], issue_comments: Optional[List[str]] = None) -> Optional[IssueContract]", + "returns": "Optional[IssueContract]" + }, + { + "name": "_revert_out_of_scope_changes", + "signature": "(cwd: Path, allowed_paths: set[Path]) -> List[Path]", + "returns": "List[Path]" } ] } @@ -7264,7 +7274,7 @@ "functions": [ { "name": "run_agentic_sync", - "signature": "(issue_url: str, *, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, dry_run: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], timeout_adder: float, use_github_state: bool, one_session: bool, reasoning_time: Optional[float], durable: bool, durable_branch: Optional[str], no_resume: bool, durable_max_parallel: Optional[int]) -> Tuple[bool, str, float, str]", + "signature": "(issue_url: str, *, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, dry_run: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], timeout_adder: float, use_github_state: bool, one_session: bool, reasoning_time: Optional[float], durable: bool, durable_branch: Optional[str], no_resume: bool, durable_max_parallel: Optional[int], scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]", "sideEffects": [ "None" @@ -7272,7 +7282,7 @@ }, { "name": "run_global_sync", - "signature": "(*, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], dry_run: bool, 
target_coverage: Optional[float], one_session: bool, local: bool, timeout_adder: float) -> Tuple[bool, str, float, str]", + "signature": "(*, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], dry_run: bool, target_coverage: Optional[float], one_session: bool, local: bool, timeout_adder: float, scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]", "sideEffects": [ "Runs AsyncSyncRunner for stale modules unless dry_run=True; timeout_adder is forwarded via sync_options so --timeout-adder stretches the per-module wall-clock cap on the global-sync path the same way it does on run_agentic_sync" @@ -7324,7 +7334,7 @@ "functions": [ { "name": "AsyncSyncRunner", - "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0)", + "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0, *, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True)", "returns": "AsyncSyncRunner", "sideEffects": [ "Initializes runner state; total_budget in sync_options forces sequential scheduling and per-child/per-retry remaining-budget caps" diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index f3bb88b9f..b0707905a 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -1,5 +1,6 @@ from __future__ import annotations +import fnmatch import functools import os import signal @@ -15,7 +16,7 @@ 
from datetime import datetime from pathlib import Path from typing import List, Optional, Tuple, Dict, Any, Union -from dataclasses import dataclass +from dataclasses import dataclass, field from rich.console import Console @@ -44,6 +45,36 @@ def _load_model_data(*args, **kwargs): # when LLMs quote/discuss a status without declaring it (Issue #865). _SEMANTIC_TAIL_LINES = 30 +# Issue #1013 — sync scope guard: glob patterns for companion artifacts that +# ``pdd sync`` MAY touch as legitimate metadata even when an issue split +# contract narrows the primary write set. Only fingerprint metadata under +# ``.pdd/meta/`` is auto-allowed; everything else (architecture.json, examples, +# unrelated prompts, README/CHANGELOG, etc.) must be opted-in by the contract's +# own ``companion_allowlist`` field. +DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...] = (".pdd/meta/*.json",) + +# Issue #1013 iter-36 B-1/B-2: PDD's own infrastructure writes during a +# guarded sync run (audit logs, runner state, etc.). These are NEVER part +# of a contract — they're internal artifacts the tool produces as side +# effects of running. The scope guard (both orchestrator-level and +# per-module) MUST auto-allow them or it would hard-fail every contracted +# run. +# +# Distinct from DEFAULT_SYNC_COMPANION_ALLOWLIST: the user-facing default +# may be widened by an issue's ``companion_allowlist`` field; this set is +# fixed tool infrastructure and is NOT user-extensible. Patterns here are +# always interpreted as REPO-ROOT-anchored (matched against a path +# computed relative to the repo root), not module-relative, because the +# infrastructure writes happen at the top of the project regardless of +# which module is being synced. +PDD_INTERNAL_PATH_ALLOWLIST: Tuple[str, ...] 
= ( + ".pdd/agentic-logs/*", # session audit logs (run_agentic_task) + ".pdd/agentic-logs/*/*", # nested per-task subdirs if any + ".pdd/agentic_sync_state.json", # runner state file + ".pdd/bug-state/*", # bug command state + ".pdd/checkup-review-loop/*", # checkup state +) + # Semantic fallback patterns for when LLMs paraphrase instead of emitting exact tokens. # Each token maps to a list of regex patterns that capture common paraphrases. # Patterns are checked only after exact and case-insensitive matching fail, @@ -2239,7 +2270,13 @@ def _revert_out_of_scope_changes( List of paths that were reverted. """ cwd_str = str(cwd.resolve()) - if not any(str(p).startswith(cwd_str) for p in allowed_paths): + # Iter-8 B5a (empty contract reject-all): when ``allowed_paths`` is + # non-empty, skip when none of the entries fall under *cwd* — that is + # the historical "scope guard for a different module" optimization. + # When ``allowed_paths`` is EMPTY, however, the caller is asking for a + # reject-all sweep (Issue #1013 degenerate-empty contract). Don't + # short-circuit; proceed with revert. + if allowed_paths and not any(str(p).startswith(cwd_str) for p in allowed_paths): return [] try: result = subprocess.run( @@ -2260,32 +2297,586 @@ def _revert_out_of_scope_changes( for line in result.stdout.splitlines(): if len(line) < 4: continue - rel_path = line[3:].strip() - full_path = (cwd / rel_path).resolve() - if full_path not in allowed_paths: - to_restore.append(rel_path) - reverted.append(full_path) + payload = line[3:].strip() + # Iter-6 B2 (rename revert bug): ``git status --porcelain`` reports + # renames as ``R old -> new``. Treating the whole payload as one + # path caused ``git checkout HEAD --`` to be called with a literal + # ``"old -> new"`` arg, which silently failed and left the rename + # in place. Split renames so both source and destination surface. 
+ if " -> " in payload: + old_raw, new_raw = payload.split(" -> ", 1) + entry_paths = [old_raw.strip().strip('"'), new_raw.strip().strip('"')] + is_rename = True + else: + entry_paths = [payload.strip('"')] + is_rename = False + entry_paths = [p for p in entry_paths if p] + if not entry_paths: + continue + # Iter-7 B4 (partial-rename revert): a rename is an atomic git + # operation. If EITHER side of the rename is out of scope, the + # rename as a whole has to be undone — restoring only one side + # leaves the working tree in a half-renamed state (``D old`` or + # ``A new`` depending on which side was in scope). + full_paths = [(cwd / rel).resolve() for rel in entry_paths] + out_of_scope = any(fp not in allowed_paths for fp in full_paths) + if is_rename: + if out_of_scope: + for rel, fp in zip(entry_paths, full_paths): + to_restore.append(rel) + reverted.append(fp) + else: + for rel, fp in zip(entry_paths, full_paths): + if fp not in allowed_paths: + to_restore.append(rel) + reverted.append(fp) if to_restore: + # Iter-6 B2: use ``git restore --staged --worktree --source=HEAD`` + # so staged renames are correctly undone (``git checkout HEAD --`` + # cannot remove a rename destination unknown to HEAD). Falls back + # to the legacy command for git < 2.23. 
+ checkout_failed = False try: - subprocess.run( - ["git", "-C", str(cwd), "checkout", "HEAD", "--"] + to_restore, + restore_result = subprocess.run( + ["git", "-C", str(cwd), "restore", + "--staged", "--worktree", "--source=HEAD", "--"] + to_restore, capture_output=True, timeout=30, ) + if restore_result.returncode != 0: + stderr = (restore_result.stderr or b"").decode(errors="replace") + if "'restore' is not a git command" in stderr: + legacy = subprocess.run( + ["git", "-C", str(cwd), "checkout", "HEAD", "--"] + + to_restore, + capture_output=True, timeout=30, + ) + if legacy.returncode != 0: + checkout_failed = True + _scope_guard_logger.warning( + "Scope guard: legacy git checkout returned %d " + "for %d file(s): %s", + legacy.returncode, len(to_restore), + (legacy.stderr or b"").decode(errors="replace").strip(), + ) + else: + checkout_failed = True + _scope_guard_logger.warning( + "Scope guard: git restore returned %d for %d file(s): %s", + restore_result.returncode, len(to_restore), stderr.strip(), + ) except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as exc: _scope_guard_logger.warning( - "Scope guard: git checkout failed for %d file(s): %s", + "Scope guard: git restore failed for %d file(s): %s", len(to_restore), exc, ) + checkout_failed = True + if checkout_failed: reverted.clear() - else: - if reverted: - _scope_guard_logger.info( - "Scope guard reverted %d out-of-scope file(s): %s", - len(reverted), - ", ".join(str(p.name) for p in reverted[:10]), - ) + elif reverted: + _scope_guard_logger.info( + "Scope guard reverted %d out-of-scope file(s): %s", + len(reverted), + ", ".join(str(p.name) for p in reverted[:10]), + ) return reverted + +# --------------------------------------------------------------------------- +# Issue #1013 — split-contract scope guard: issue body / comment parser +# --------------------------------------------------------------------------- + +@dataclass(frozen=True) +class IssueContract: + """ + Parsed split-contract 
declaration extracted from a GitHub issue body or + comment. + + Attributes: + allowed_paths: Repo-relative POSIX path strings the linked sync run is + permitted to modify as its primary write set. Resolved against the + module's repo root by the caller (this dataclass does NOT resolve + to absolute filesystem paths). + companion_allowlist: Glob patterns (e.g. ``".pdd/meta/*.json"``) for + companion artifacts the run MAY touch outside the primary write + set. The caller unions this with + :data:`DEFAULT_SYNC_COMPANION_ALLOWLIST` to produce the effective + allowlist. + source: Diagnostic label describing where the contract was detected + (currently ``"html-comment"``, ``"fenced-block"``, or + ``"bullet-list"``). + """ + + allowed_paths: Tuple[str, ...] + companion_allowlist: Tuple[str, ...] + source: str + + +# Matches the heading line introducing a fenced-block contract. Case-insensitive +# multiline match so the heading can be ``## Allowed Write Set``, +# ``# split-contract``, etc. +_FENCED_BLOCK_HEADER_RE = re.compile( + r"^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$", + re.IGNORECASE | re.MULTILINE, +) + +# Matches the HTML-comment JSON block, e.g.:: +# +# +_HTML_COMMENT_CONTRACT_RE = re.compile( + r"", + re.DOTALL, +) + +# Matches a fenced code block (```text``` or ```json```) that IMMEDIATELY +# follows the heading. Only whitespace/newlines may precede the fence +# (anchored via ``\A`` once we slice the text after the heading); the info +# string MUST be ``text`` or ``json`` per spec (Issue #1013, iter-3 F3 — bare +# fences are rejected). Captures the language (``text`` / ``json``) into the +# named ``lang`` group so the parser can branch on the body format +# (Issue #1013 iter-12 B-1: a ``json`` body must be parsed as a JSON array, +# not as a line-separated path list). 
+_FENCED_BLOCK_RE = re.compile(
+    r"\A\s*```(?P<lang>text|json)[ \t]*\n(?P<body>.*?)```",
+    re.DOTALL,
+)
+
+# Issue #1013 iter-18 B-1: third declaration format — bullet-list contract.
+# Matches the bold inline label ``**Allowed write set:**`` that introduces the
+# bullet list. Anchored to its own line; surrounding whitespace tolerated;
+# optional whitespace inside the bold delimiters.
+_BULLET_LIST_LABEL_RE = re.compile(
+    r"^\s*\*\*\s*allowed\s+write\s+set\s*:\s*\*\*\s*$",
+    re.IGNORECASE | re.MULTILINE,
+)
+
+# Matches one ``-``/``*``/``+`` bullet line whose content is captured.
+_BULLET_LINE_RE = re.compile(
+    r"^\s*[-*+]\s+(?P<entry>.+?)\s*$"
+)
+
+# Matches another bold inline label (e.g. ``**Acceptance criteria:**``).
+# Used as a stop terminator while scanning bullets.
+_NEXT_LABEL_RE = re.compile(
+    r"^\s*\*\*[^*\n]+\*\*\s*$"
+)
+
+
+def _is_valid_contract_path(raw: object) -> bool:
+    """
+    Return True iff *raw* is a non-empty repo-relative POSIX path string with
+    no traversal segments and no Windows separators.
+
+    Validation runs inside the parser so a malformed entry never reaches the
+    runner. Per Issue #1013 spec, syntactically invalid entries are dropped
+    silently; the *contract* itself remains valid even if the resulting
+    ``allowed_paths`` ends up empty (empty contract → reject-all enforcement,
+    see :func:`parse_issue_contract`).
+    """
+    if not isinstance(raw, str):
+        return False
+    candidate = raw.strip()
+    if not candidate:
+        return False
+    if "\\" in candidate:
+        return False
+    if candidate.startswith("/"):
+        return False
+    # Reject parent-traversal segments at any position
+    parts = candidate.split("/")
+    if any(part == ".." for part in parts):
+        return False
+    return True
+
+
+def _is_valid_companion_pattern(raw: object) -> bool:
+    """
+    Return True iff *raw* is a repo-relative companion glob pattern with at
+    least one literal-character anchor and no ``**`` doublestar segment.
+
+    Issue #1013 iter-10 M-1: ``companion_allowlist`` accepts arbitrary glob
+    patterns, so a contract that declares ``*``, ``**``, or ``**/*`` would
+    let ``_matches_companion_allowlist`` auto-allow repo-wide changes and
+    bypass the split-contract write set. Reject patterns whose every
+    segment is wildcard-only (no character outside ``*?``), as well as
+    absolute, Windows-separator, traversal, and empty patterns.
+
+    Issue #1013 iter-14 M-1/M-2: also reject patterns whose any segment is
+    exactly ``**``. The segment-aware matcher (``fnmatch.fnmatchcase`` per
+    segment) treats ``**`` as just another wildcard segment, which would
+    let a contract like ``**/foo.json`` auto-allow ``foo.json`` at any
+    depth — exactly the suffix-match foot-gun ``PurePosixPath.match``
+    exhibited. Contracts that genuinely need a depth-wildcard companion
+    artifact should enumerate the directories explicitly.
+    """
+    if not isinstance(raw, str):
+        return False
+    candidate = raw.strip()
+    if not candidate:
+        return False
+    if "\\" in candidate:
+        return False
+    if candidate.startswith("/"):
+        return False
+    parts = candidate.split("/")
+    if any(part == ".." for part in parts):
+        return False
+    # Iter-14: reject doublestar segments. ``**`` only has well-defined
+    # semantics for recursive matching, which the anchored segment-aware
+    # matcher does NOT implement (it requires equal segment counts).
+    if any(part == "**" for part in parts):
+        return False
+    # At least one segment MUST contain a literal character (anything
+    # outside ``*?``); otherwise the pattern is wildcard-only and would
+    # match arbitrary repo paths, defeating the scope guard.
+    for segment in parts:
+        if not segment:
+            continue
+        if any(ch not in "*?" for ch in segment):
+            return True
+    return False
+
+
+def _matches_companion_pattern_anchored(rel_posix: str, pattern: str) -> bool:
+    """
+    Issue #1013 iter-14 M-1/M-2: anchored, segment-aware glob match for
+    companion allowlist patterns.
+
+    Unlike :meth:`pathlib.PurePosixPath.match` (which matches from the
+    right and lets ``.pdd/meta/*.json`` falsely match
+    ``subdir/.pdd/meta/foo.json``), this matcher requires path and
+    pattern to align segment-by-segment from the START of the path with
+    equal segment count. Each segment is matched via
+    :func:`fnmatch.fnmatchcase` for ``*`` / ``?`` semantics.
+
+    Returns False on invalid patterns (already filtered by
+    :func:`_is_valid_companion_pattern`); callers should validate first
+    so an invalid pattern can never auto-allow a path.
+    """
+    if not pattern or not rel_posix:
+        return False
+    path_parts = rel_posix.replace("\\", "/").strip("/").split("/")
+    pattern_parts = pattern.replace("\\", "/").strip("/").split("/")
+    if len(path_parts) != len(pattern_parts):
+        return False
+    return all(
+        fnmatch.fnmatchcase(pp, patp)
+        for pp, patp in zip(path_parts, pattern_parts)
+    )
+
+
+def _parse_html_comment_contract(text: str) -> Optional[IssueContract]:
+    """Return a contract parsed from a ```` block, else None."""
+    match = _HTML_COMMENT_CONTRACT_RE.search(text)
+    if not match:
+        return None
+    raw_json = match.group("json").strip()
+    if not raw_json:
+        return None
+    try:
+        parsed = json.loads(raw_json)
+    except (ValueError, TypeError):
+        return None
+    if not isinstance(parsed, dict):
+        return None
+    raw_allowed = parsed.get("allowed_paths")
+    if not isinstance(raw_allowed, list):
+        return None
+    # Drop syntactically invalid entries silently. Per Issue #1013, a
+    # syntactically valid contract with an empty ``allowed_paths`` is a
+    # legal degenerate contract meaning "reject every change" — keep it.
+    allowed = tuple(p.strip() for p in raw_allowed if _is_valid_contract_path(p))
+    raw_companion = parsed.get("companion_allowlist", [])
+    if not isinstance(raw_companion, list):
+        raw_companion = []
+    # Issue #1013 iter-10 M-1: drop wildcard-only / absolute / traversal /
+    # Windows-separator patterns silently so a contract cannot declare
+    # ``*``/``**``/``**/*`` and bypass the split-contract write set.
+    companion = tuple(
+        p.strip() for p in raw_companion if _is_valid_companion_pattern(p)
+    )
+    return IssueContract(
+        allowed_paths=allowed,
+        companion_allowlist=companion,
+        source="html-comment",
+    )
+
+
+def _parse_fenced_block_contract(text: str) -> Optional[IssueContract]:
+    """Return a contract parsed from a heading + immediately-following fenced
+    code block, else None. The fence MUST carry a ``text`` or ``json`` info
+    string (bare fences are rejected per Issue #1013 iter-3 F3) and MUST
+    appear immediately after the heading (only whitespace permitted between).
+
+    Body format depends on the fence language (Issue #1013 iter-12 B-1):
+
+    - ``text``: one repo-relative POSIX path per line; blank lines and lines
+      starting with ``#`` are ignored. Surrounding backticks on a path line
+      are stripped before validation.
+    - ``json``: a JSON array of repo-relative POSIX path strings, e.g.
+      ``["pdd/foo.py", "tests/test_foo.py"]``. Anything else (object,
+      number, string, malformed JSON) returns ``None`` so the caller falls
+      back to permissive mode.
+
+    When the fence body parses syntactically but contains no valid paths
+    (every entry dropped by ``_is_valid_contract_path``), the contract is
+    still returned with ``allowed_paths=()`` — a degenerate but legal
+    reject-all contract per the iter-8 B5 semantics.
+ """ + header_match = _FENCED_BLOCK_HEADER_RE.search(text) + if not header_match: + return None + after_header = text[header_match.end():] + # ``\A``-anchored regex: the fence must IMMEDIATELY follow the heading + # (only whitespace/newlines between heading end and the opening fence). + block_match = _FENCED_BLOCK_RE.match(after_header) + if not block_match: + return None + lang = block_match.group("lang") + body = block_match.group("body") + paths: List[str] = [] + seen: set = set() + if lang == "json": + # Issue #1013 iter-12 B-1: a JSON fence holds a JSON array of path + # strings, NOT a line-separated list. Parsing failures and any + # non-array payload (object, number, string) signal a malformed + # contract; return ``None`` so the caller falls back to permissive + # mode (matching the HTML-comment branch's tolerance). + try: + parsed = json.loads(body) + except (ValueError, TypeError): + return None + if not isinstance(parsed, list): + return None + for entry in parsed: + if not _is_valid_contract_path(entry): + continue + candidate = entry.strip() + if candidate not in seen: + paths.append(candidate) + seen.add(candidate) + else: + # ``text`` fence: one repo-relative path per line; ignore blank + # lines and ``#`` comments. Strip surrounding backticks if a user + # wrapped the path. + for raw_line in body.splitlines(): + line = raw_line.strip() + if not line or line.startswith("#"): + continue + line = line.strip("`").strip() + if not _is_valid_contract_path(line): + continue + if line not in seen: + paths.append(line) + seen.add(line) + # Empty fenced block is a legal degenerate contract (reject all). + return IssueContract( + allowed_paths=tuple(paths), + companion_allowlist=(), + source="fenced-block", + ) + + +def _parse_bullet_list_contract(text: str) -> Optional[IssueContract]: + """Return a contract parsed from a heading + ``**Allowed write set:**`` + inline label + bullet list, else ``None``. 
This is the third supported + format (Issue #1013 iter-18 B-1), motivated by the real-world shape used + in #1005 where the contract is written as Markdown bullets under a bold + inline label rather than inside a fenced code block or HTML comment. + + Algorithm: + + 1. Find a heading line matching :data:`_FENCED_BLOCK_HEADER_RE` + (``## Split Contract`` / ``## Allowed Write Set`` / etc.). + 2. After the heading, scan forward for the inline label line + ``**Allowed write set:**`` (case-insensitive). The label is the + discriminator — bullets BEFORE the label (e.g. under a ``## Files`` + section earlier in the body) MUST NOT be picked up. + 3. Collect contiguous ``-`` / ``*`` / ``+`` bullets that follow the + label, skipping blank lines. Stop the list at the first of: + + - another ``**Label:**`` line (e.g. ``**Acceptance criteria:**``) + - a ``---`` horizontal rule + - another ``#``-prefixed heading + - a non-blank line that is not a bullet + - end of body. + + 4. Each captured bullet entry has its surrounding backticks stripped + (users write ``- `pdd/foo.py``` because they're showing code paths), + then is validated by :func:`_is_valid_contract_path`. Invalid entries + are dropped silently — same policy as the other branches. + 5. An empty bullet list (no valid paths captured) is still returned as a + degenerate reject-all contract per the iter-8 B5 semantics. + """ + header_match = _FENCED_BLOCK_HEADER_RE.search(text) + if not header_match: + return None + after_header = text[header_match.end():] + label_match = _BULLET_LIST_LABEL_RE.search(after_header) + if not label_match: + return None + + body = after_header[label_match.end():] + paths: List[str] = [] + seen: set = set() + for raw_line in body.splitlines(): + stripped = raw_line.strip() + if not stripped: + # Blank line: do not terminate; bullets can be separated by + # blank lines in some Markdown styles. Continue scanning. 
+            continue
+        if stripped.startswith("---"):
+            break
+        if stripped.startswith("#"):
+            break
+        # Another bold inline label (e.g. ``**Acceptance criteria:**``)
+        # terminates the bullet list.
+        if _NEXT_LABEL_RE.match(raw_line):
+            break
+        bullet_match = _BULLET_LINE_RE.match(raw_line)
+        if not bullet_match:
+            # First non-blank, non-terminator, non-bullet line ends the list.
+            break
+        entry = bullet_match.group("entry").strip()
+        # Users wrap code-shaped paths in backticks; strip a single layer of
+        # surrounding backticks before validation.
+        entry = entry.strip("`").strip()
+        if not _is_valid_contract_path(entry):
+            continue
+        if entry not in seen:
+            paths.append(entry)
+            seen.add(entry)
+
+    return IssueContract(
+        allowed_paths=tuple(paths),
+        companion_allowlist=(),
+        source="bullet-list",
+    )
+
+
+def _parse_contract_from_text(text: Optional[str]) -> Optional[IssueContract]:
+    """Try HTML-comment first, then fenced-block, then bullet-list. Returns
+    ``None`` when no branch matches.
+
+    Priority order (Issue #1013 iter-18 B-1): HTML comment is authoritative
+    (spec-preferred form, invisible in rendered Markdown), then fenced block
+    (the existing iter-3 form), then bullet list (the real-world shape used
+    in #1005 and similar issues). The first branch that returns a
+    non-``None`` :class:`IssueContract` wins.
+ """ + if not text: + return None + try: + html_contract = _parse_html_comment_contract(text) + except Exception: # noqa: BLE001 — parser MUST NOT raise on any input + html_contract = None + if html_contract is not None: + return html_contract + try: + fenced_contract = _parse_fenced_block_contract(text) + except Exception: # noqa: BLE001 — parser MUST NOT raise on any input + fenced_contract = None + if fenced_contract is not None: + return fenced_contract + try: + return _parse_bullet_list_contract(text) + except Exception: # noqa: BLE001 — parser MUST NOT raise on any input + return None + + +def parse_issue_contract( + issue_body: Optional[str], + issue_comments: Optional[List[str]] = None, +) -> Optional[IssueContract]: + """ + Parse an issue split-contract from an issue body or its comments. + + Three declaration formats are supported (Issue #1013): + + 1. HTML-comment block (authoritative, spec-preferred):: + + + + JSON ``allowed_paths`` is required (list of repo-relative path + strings); ``companion_allowlist`` is optional (list of glob patterns). + + 2. Fenced-code-block following a heading line matching + ``allowed write set`` or ``split contract``:: + + ## Allowed Write Set + ```text + pdd/foo.py + tests/test_foo.py + ``` + + One repo-relative path per line; blank lines and ``#``-prefixed + comments are ignored. ``json`` info-string fences carry a JSON array + of repo-relative path strings instead. + + 3. Bullet-list under a ``**Allowed write set:**`` inline label + (Issue #1013 iter-18 B-1 — the real-world shape used in #1005):: + + ## Split Contract + **Command sequence:** change → sync + **Allowed write set:** + - `pdd/foo.py` + - `tests/test_foo.py` + **Acceptance criteria:** + - ... + + The heading discriminator is the same regex as branch 2; the inline + ``**Allowed write set:**`` label tells the parser where the bullets + start so unrelated bullet lists earlier in the body (e.g. a + ``## Files`` section) are NOT captured. 
Each bullet is one + repo-relative POSIX path with optional surrounding backticks. The + list terminates at the next ``**Label:**``, a ``---`` rule, another + heading, a non-blank non-bullet line, or end of body. + + Branches are tried in this priority order; the first match wins. + + The body is scanned first; if no contract is found there, each comment is + scanned in order. When both sources declare a contract, the body wins + (issues are edited authoritatively; comments are append-only and may carry + stale snapshots from earlier workflow steps). + + Path entries are validated as repo-relative POSIX paths: syntactically + invalid entries (absolute, containing ``..``, empty, or using Windows + separators ``\\``) are dropped silently. A syntactically valid contract + whose ``allowed_paths`` ends up empty (either declared as ``[]`` or + reduced to ``[]`` after dropping invalid entries) is still returned as + an :class:`IssueContract` with ``allowed_paths=()``; the caller treats + that as a degenerate "reject every change" contract. The parser returns + ``None`` only when there is no parseable marker at all or when the + marker payload is syntactically malformed. Resolution to absolute + filesystem paths is the caller's job once it knows the repo root. + + The parser MUST NOT raise on any input: malformed JSON, missing fields, + unexpected types, or absent markers all return ``None``. + + Args: + issue_body: Raw issue body markdown. + issue_comments: Optional list of raw issue comment markdown bodies + (oldest first or newest first is fine; the parser picks the first + comment with a valid contract). + + Returns: + Parsed :class:`IssueContract` or ``None`` when no valid contract is + present. 
+ """ + body_contract = _parse_contract_from_text(issue_body) + if body_contract is not None: + return body_contract + for comment in issue_comments or []: + contract = _parse_contract_from_text(comment) + if contract is not None: + return contract + return None + + _CLAUDE_OAUTH_PROBE_TIMEOUT_SECONDS = 10 _ANTHROPIC_KEY_STRIP_NOTICE_LOGGED: Dict[str, bool] = {} diff --git a/pdd/agentic_common_worktree.py b/pdd/agentic_common_worktree.py index ad08b66b8..f2f17cff2 100644 --- a/pdd/agentic_common_worktree.py +++ b/pdd/agentic_common_worktree.py @@ -389,8 +389,14 @@ def revert_out_of_scope_changes_with_dirs( reverted: List[Path] = [] try: + # ``--untracked-files=all`` (a.k.a. ``-uall``) forces git to list every + # individual untracked file even when ``status.showUntrackedFiles`` is + # ``normal`` or when bare ``-u`` would be interpreted as "default mode" + # by older git releases. Without this, an untracked directory would be + # reported as a single ``?? path/`` entry and the os.remove below would + # leave the contained files behind. (Issue #1013, F5.) result = subprocess.run( - ["git", "status", "--porcelain", "-u"], + ["git", "status", "--porcelain", "--untracked-files=all"], cwd=str(cwd), capture_output=True, text=True, @@ -410,6 +416,16 @@ def revert_out_of_scope_changes_with_dirs( logger.warning("OS error running git status: %s", exc) return reverted + def _scope_check_path(filepath_str: str) -> bool: + """Return True if *filepath_str* is in-scope (allowed).""" + for prefix in allowed_dirs: + if filepath_str.startswith(prefix): + return True + abs_path = (cwd / filepath_str).resolve() + if abs_path in allowed_files: + return True + return False + for line in result.stdout.splitlines(): if len(line) < 4: continue @@ -417,28 +433,51 @@ def revert_out_of_scope_changes_with_dirs( status = line[:2] filepath_raw = line[3:] - # Handle renames: "R old_name -> new_name" + # Iter-8 B5b (worktree-helper rename bug): renames are reported as + # ``R old -> new``. 
Previously this helper kept only the destination, + # so a partial-rename out-of-scope situation (allowed=new, old + # disallowed) silently deleted ``old`` without reverting. Treat + # renames atomically: if EITHER side is out of scope, restore both. if " -> " in filepath_raw: - filepath_raw = filepath_raw.split(" -> ")[-1] + old_raw, new_raw = filepath_raw.split(" -> ", 1) + old_path = old_raw.strip().strip('"') + new_path = new_raw.strip().strip('"') + if _scope_check_path(old_path) and _scope_check_path(new_path): + continue + # Out-of-scope rename: undo via ``git restore --staged --worktree + # --source=HEAD`` so the destination unknown-to-HEAD is removed. + try: + restore = subprocess.run( + ["git", "restore", "--staged", "--worktree", + "--source=HEAD", "--", old_path, new_path], + cwd=str(cwd), + capture_output=True, + text=True, + timeout=30, + ) + if restore.returncode == 0: + logger.info( + "Reverted out-of-scope rename: %s -> %s", + old_path, new_path, + ) + reverted.append(Path(old_path)) + reverted.append(Path(new_path)) + else: + logger.warning( + "Failed to revert rename %s -> %s: %s", + old_path, new_path, + (restore.stderr or "").strip(), + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as exc: + logger.warning("git restore failed for rename: %s", exc) + continue filepath_str = filepath_raw.strip().strip('"') # ------------------------------------------------------------------ # Scope check # ------------------------------------------------------------------ - in_scope = False - - for prefix in allowed_dirs: - if filepath_str.startswith(prefix): - in_scope = True - break - - if not in_scope: - abs_path = (cwd / filepath_str).resolve() - if abs_path in allowed_files: - in_scope = True - - if in_scope: + if _scope_check_path(filepath_str): continue # ------------------------------------------------------------------ @@ -448,9 +487,27 @@ def revert_out_of_scope_changes_with_dirs( rel_path = Path(filepath_str) if is_untracked: + # 
Defensive: even with ``--untracked-files=all`` above, exotic git + # configs / submodule edge cases could conceivably hand us a path + # ending in ``/`` (an untracked directory). Detect and use + # ``shutil.rmtree`` so contained files don't get left behind. + # (Issue #1013, F5.) + target = cwd / filepath_str try: - os.remove(str(cwd / filepath_str)) - logger.info("Removed untracked out-of-scope file: %s", filepath_str) + if filepath_str.endswith("/") or ( + target.exists() and target.is_dir() and not target.is_symlink() + ): + shutil.rmtree(str(target)) + logger.info( + "Removed untracked out-of-scope directory: %s", + filepath_str, + ) + else: + os.remove(str(target)) + logger.info( + "Removed untracked out-of-scope file: %s", + filepath_str, + ) reverted.append(rel_path) except OSError as exc: logger.warning("Failed to remove %s: %s", filepath_str, exc) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index ae3608b0a..186968097 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -16,17 +16,32 @@ import subprocess import sys from pathlib import Path -from typing import Any, Dict, List, NamedTuple, Optional, Tuple +from typing import Any, Dict, Iterable, List, NamedTuple, Optional, Tuple from rich.console import Console from .agentic_change import _check_gh_cli, _escape_format_braces, _parse_issue_url, _run_gh_command -from .agentic_common import run_agentic_task +from .agentic_common import ( + DEFAULT_SYNC_COMPANION_ALLOWLIST, + PDD_INTERNAL_PATH_ALLOWLIST, + IssueContract, + _is_valid_companion_pattern, + _matches_companion_pattern_anchored, + _revert_out_of_scope_changes, + parse_issue_contract, + run_agentic_task, +) +from .agentic_common_worktree import revert_out_of_scope_changes_with_dirs from .agentic_sync_runner import ( AsyncSyncRunner, _architecture_entry_aliases, _basename_from_architecture_filename, + _classify_baseline_path, _find_pdd_executable, + _git_changed_paths, + _git_ignored_paths, + _hash_baseline_paths, + _hash_file, 
build_dep_graph_from_architecture_data, ) from .durable_sync_runner import DurableSyncRunner @@ -963,8 +978,16 @@ def run_global_sync( one_session: bool = False, local: bool = False, timeout_adder: float = 0.0, + scope_guard: bool = True, ) -> Tuple[bool, str, float, str]: """Run project-wide Tier 1 global sync from architecture.json.""" + # Per ``agentic_sync_python.prompt`` § Global Sync 1: the ``scope_guard`` + # kwarg is accepted for CLI signature parity with ``run_agentic_sync`` but + # has no effect in global mode. Global sync has no issue body to parse, so + # the runner is always constructed in permissive fallback mode regardless + # of ``scope_guard``. The kwarg exists so ``pdd sync --no-scope-guard`` + # does not raise ``TypeError`` when dispatched into global mode. + _ = scope_guard project_root = _find_project_root(Path.cwd()) all_modules, architecture, arch_path = _architecture_sync_modules(project_root) if not architecture: @@ -1254,6 +1277,17 @@ def _run_single_dry_run( return False, str(e) +# Iter-30 B-2: shell metacharacters disallowed in the LLM-supplied +# ``SYNC_CWD`` value. We build the argv ourselves and pass ``shell=False`` +# so injection via the cwd string is impossible by construction, but reject +# these characters anyway as defense-in-depth — a path containing them is +# almost certainly the LLM ignoring the new prompt shape and trying to send +# a shell fragment. +_SYNC_CWD_FORBIDDEN_CHARS: Tuple[str, ...] = ( + ";", "&", "|", "<", ">", "`", "$", "(", ")", "\n", "\r", +) + + def _llm_fix_dry_run_failure( basename: str, project_root: Path, @@ -1262,7 +1296,16 @@ def _llm_fix_dry_run_failure( verbose: bool = False, reasoning_time: Optional[float] = None, ) -> Tuple[bool, Optional[Path], float, str]: - """Ask the LLM to suggest the correct cwd/command when dry-run fails. + """Ask the LLM to suggest the correct cwd when dry-run fails. 
+ + Iter-30 B-2 replaces the prior ``shell=True`` exec of an LLM-supplied + string with a hardened approach: the LLM only identifies a working + directory (``SYNC_CWD: ``) and the orchestrator + builds the ``pdd --force sync --dry-run --agentic --no-steer`` + argv itself, then executes with ``shell=False``. This closes iter-29 B-2 + (LLM shell injection at the orchestrator level) — ``shell=True`` with an + LLM-provided string allowed ``rm``/redirects/chained writes that the + iter-28 ``--dry-run`` flag injection could not block. Returns: Tuple of (success, suggested_cwd_or_None, llm_cost, error_msg). @@ -1325,69 +1368,140 @@ def _llm_fix_dry_run_failure( if not llm_success: return False, None, llm_cost, f"LLM failed to suggest fix: {llm_output}" - # Parse SYNC_CMD from response - cmd_match = re.search(r"SYNC_CMD:\s*(.+)", llm_output) - if not cmd_match: - return False, None, llm_cost, "LLM response did not contain SYNC_CMD marker" + # Iter-30: explicitly reject the legacy ``SYNC_CMD:`` shape so a stale + # cached LLM response surfaces a clear migration error rather than a + # vague "no SYNC_CWD marker" message. The new prompt only asks for the + # cwd; if the response carries the old shell-command shape the LLM is + # acting on a prior cached version of the prompt and must be re-asked + # with the iter-30 wording. + if "SYNC_CWD:" not in llm_output and "SYNC_CMD:" in llm_output: + return ( + False, + None, + llm_cost, + ( + "LLM returned legacy ``SYNC_CMD:`` format. The orchestrator now " + "builds the sync argv itself and only expects ``SYNC_CWD: `` " + "from the LLM. Re-run; this usually clears after one retry." 
+ ), + ) - suggested_cmd = cmd_match.group(1).strip() + cwd_match = re.search(r"SYNC_CWD:\s*(.+)", llm_output) + if not cwd_match: + return False, None, llm_cost, "LLM response did not contain SYNC_CWD marker" + + raw_cwd = cwd_match.group(1).strip() + if not raw_cwd: + return False, None, llm_cost, "LLM returned an empty SYNC_CWD value" + + # Strip surrounding quotes the LLM may emit. + if (raw_cwd.startswith('"') and raw_cwd.endswith('"')) or ( + raw_cwd.startswith("'") and raw_cwd.endswith("'") + ): + raw_cwd = raw_cwd[1:-1].strip() + if not raw_cwd: + return False, None, llm_cost, "LLM returned an empty SYNC_CWD value" + + # Defense-in-depth: reject any shell metacharacter. We pass ``shell=False`` + # downstream so injection through the cwd is structurally impossible, but + # a path containing these characters is almost certainly the LLM trying + # to smuggle a shell fragment past the new prompt shape. + for ch in _SYNC_CWD_FORBIDDEN_CHARS: + if ch in raw_cwd: + return ( + False, + None, + llm_cost, + ( + f"LLM SYNC_CWD value contains forbidden character " + f"{ch!r}; refusing to execute: {raw_cwd!r}" + ), + ) - # Safety: reject commands that don't look like a pdd sync invocation - if "pdd" not in suggested_cmd or "sync" not in suggested_cmd: - return False, None, llm_cost, f"LLM suggested unexpected command: {suggested_cmd}" + # Resolve relative-to-project-root or absolute paths. Both are accepted + # so long as the resolved location lives under ``project_root``. + candidate = Path(raw_cwd) + if not candidate.is_absolute(): + candidate = project_root / candidate + try: + resolved_cwd = candidate.resolve() + except (OSError, RuntimeError) as exc: + return ( + False, + None, + llm_cost, + f"Failed to resolve SYNC_CWD path {raw_cwd!r}: {exc}", + ) - # Append a pwd marker after the command so we can extract the effective cwd. - # This avoids fragile regex parsing of cd segments from the command string. 
- pwd_marker = "__PDD_EFFECTIVE_CWD__" - augmented_cmd = f"{suggested_cmd} && echo {pwd_marker} && pwd" + project_root_resolved = project_root.resolve() + try: + resolved_cwd.relative_to(project_root_resolved) + except ValueError: + return ( + False, + None, + llm_cost, + ( + f"SYNC_CWD resolves outside project root " + f"({resolved_cwd} not under {project_root_resolved}); " + "refusing to execute." + ), + ) + + if not resolved_cwd.is_dir(): + return ( + False, + None, + llm_cost, + f"SYNC_CWD does not resolve to a directory: {resolved_cwd}", + ) + + # Build the argv ourselves — the LLM never sees or supplies a shell line. + pdd_exe = _find_pdd_executable() + if pdd_exe: + cmd: List[str] = [pdd_exe] + else: + cmd = [sys.executable, "-m", "pdd"] + cmd.extend( + ["--force", "sync", basename, "--dry-run", "--agentic", "--no-steer"] + ) - # Run the suggested command directly via shell from project root. - # This handles relative cd paths, chained cd's, etc. naturally. try: result = subprocess.run( - augmented_cmd, - shell=True, - cwd=str(project_root), + cmd, + cwd=str(resolved_cwd), + shell=False, capture_output=True, text=True, timeout=60, env={**os.environ, "PDD_FORCE": "1", "CI": "1"}, ) except subprocess.TimeoutExpired: - return False, None, llm_cost, f"LLM suggested command timed out: {suggested_cmd}" - except Exception as e: - return False, None, llm_cost, f"Failed to run LLM suggested command: {e}" - - if result.returncode == 0: - # Extract effective cwd from the pwd output after our marker - stdout_lines = result.stdout.strip().splitlines() - effective_cwd = project_root.resolve() - for i, line in enumerate(stdout_lines): - if line.strip() == pwd_marker and i + 1 < len(stdout_lines): - effective_cwd = Path(stdout_lines[i + 1].strip()).resolve() - break - - # Validate resolved cwd is within project root - try: - effective_cwd.relative_to(project_root.resolve()) - except ValueError: - return ( - False, - None, - llm_cost, - f"LLM command resolves outside project 
root: {suggested_cmd}", - ) - - return True, effective_cwd, llm_cost, "" - else: - err_output = result.stderr or result.stdout or f"Exit code {result.returncode}" return ( False, None, llm_cost, - f"LLM suggested command failed: {err_output[:500]}", + f"LLM-suggested cwd dry-run timed out at {resolved_cwd}", + ) + except Exception as e: # pragma: no cover — defensive + return ( + False, + None, + llm_cost, + f"Failed to run dry-run probe at {resolved_cwd}: {e}", ) + if result.returncode == 0: + return True, resolved_cwd, llm_cost, "" + + err_output = result.stderr or result.stdout or f"Exit code {result.returncode}" + return ( + False, + None, + llm_cost, + f"LLM-suggested cwd dry-run failed at {resolved_cwd}: {err_output[:500]}", + ) + def _run_dry_run_validation( modules: List[str], @@ -1593,6 +1707,444 @@ def _parse_llm_response(response: str) -> Tuple[List[str], bool, List[Dict[str, return modules_to_sync, deps_valid, deps_corrections +def _extract_allowed_write_paths(issue_text: str) -> List[str]: + """ + Deprecated thin wrapper around :func:`parse_issue_contract` (Issue #1013, F3). + + This helper used to do its own loose markdown scan for allowed-write + paths. It now delegates to the structured contract parser in + :mod:`pdd.agentic_common` so the public contract API (HTML-comment JSON, + fenced-block, and bullet-list formats) is the single source of truth. + The wrapper is kept for one release so any external caller that imported + the private name does not crash at import time; it returns an empty list + whenever :func:`parse_issue_contract` cannot find a valid contract. 
+ """ + contract = parse_issue_contract(issue_text) + return list(contract.allowed_paths) if contract is not None else [] + + +def _enforce_orchestrator_scope( + project_root: Path, + issue_contract: Optional[IssueContract], + scope_guard: bool, + baseline_changed: Dict[str, Optional[str]], + baseline_ignored: Dict[str, Optional[str]], + *, + quiet: bool = False, +) -> Optional[str]: + """Iter-30: scope-guard the orchestrator's pre-dispatch write surface. + + Reverts any working-tree changes that fall outside the issue contract's + allowed write set + companion allowlist + the baseline of pre-existing + files captured at orchestrator entry. Returns ``None`` when the working + tree is clean (within scope), else returns a multi-line diagnostic + string describing what was reverted and what (if anything) could not be + reverted. + + Permissive when ``issue_contract is None`` or ``scope_guard is False`` — + returns ``None`` unconditionally, matching the spec for the per-module + scope guard in :class:`AsyncSyncRunner._enforce_scope_guard`. + + Why this exists (iter-29 → iter-30 promotion): + The per-module scope guard in :class:`AsyncSyncRunner` only runs + AFTER the runner is constructed and only for per-module sync + subprocesses. The orchestrator does write-capable work BEFORE the + runner exists — LLM module identification (write-capable + :func:`run_agentic_task`), LLM dry-run-fix subprocesses, + :func:`_apply_architecture_corrections`, etc. Each pre-dispatch + write site is unguarded by the per-module scope guard, so iter-26, + iter-28 and iter-29 each kept finding new orchestrator-level + bypasses. This unified guard runs at every orchestrator return site + between the baseline snapshot and the runner dispatch, replacing the + per-site patches. + + Args: + project_root: Repo root the orchestrator is operating against + (always the user's main checkout, in both async and durable + mode — the orchestrator does not branch to worktrees). 
+ issue_contract: Parsed contract for the issue, or ``None`` when no + structured contract was found (permissive mode). + scope_guard: ``True`` when the runner-level scope guard is enabled + (CLI default). ``False`` when the user passed ``--no-scope-guard``. + ``False`` short-circuits to a no-op for the same reason the + per-module guard does. + baseline_changed: Snapshot of working-tree changes (``git status + --porcelain --untracked-files=all``) at orchestrator entry, + mapping repo-relative POSIX paths to init-time SHA-1. Only + byte-identical content is auto-allowed (iter-24); divergent + SHAs fall through to the contract check so a clobber surfaces. + baseline_ignored: Snapshot of gitignored paths (``git ls-files + --others --ignored --exclude-standard``) at orchestrator entry, + same SHA-aware preservation rule as ``baseline_changed`` + (iter-20 + iter-24). + quiet: When True, suppress the stderr echo of the diagnostic. + + Returns: + ``None`` when working tree is clean within the contract OR when + enforcement is disabled (permissive mode, or ``--no-scope-guard``). + Diagnostic string otherwise — callers prepend it to the return + message so the user sees what was reverted. + """ + if not scope_guard or issue_contract is None: + return None + + # Build the allowed-files set. Same shape as + # ``AsyncSyncRunner._enforce_scope_guard`` (iter-14 anchored matcher + + # iter-24 SHA-aware baseline preservation), specialised for the + # orchestrator's project_root cwd: there is no per-module ``module_cwd`` + # to anchor the rglob/companion matcher against, so the orchestrator + # uses the repo root as both ``repo_root`` and ``module_cwd`` — every + # pre-dispatch write the orchestrator can produce is repo-rooted. 
+ repo_root = project_root.resolve() + + allowed_paths_iter = tuple(issue_contract.allowed_paths or ()) + allowed_files: set[Path] = set() + for rel in allowed_paths_iter: + if not rel: + continue + allowed_files.add((repo_root / rel).resolve()) + + # Companion allowlist union with DEFAULT — mirrors the iter-26 gate + # in :func:`_arch_path_in_scope`'s neighbour and the runner's own + # ``__init__`` union. Order preserved for log determinism. + allowlist: Tuple[str, ...] = tuple( + dict.fromkeys( + tuple(issue_contract.companion_allowlist or ()) + + tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST) + ) + ) + + # rglob for currently-on-disk companion files. The orchestrator scope + # guard cannot rely on a single module_cwd, so the scan is repo-wide + # — same union as the per-module guard, just with a wider net. + # + # Iter-36 B-1/B-2: ``PDD_INTERNAL_PATH_ALLOWLIST`` is a SEPARATE pass + # (not merged into ``allowlist``) because internal patterns are + # interpreted as REPO-ROOT-anchored while user-facing companion + # patterns are anchored against the iteration root for the loop they + # appear in. The orchestrator happens to iterate repo-rooted anyway, + # but keeping the passes separate preserves parity with the per-module + # guard (which iterates module-rooted) and matches the documented + # semantics of ``PDD_INTERNAL_PATH_ALLOWLIST``. + for path in repo_root.rglob("*"): + if not path.is_file(): + continue + try: + rel_posix = path.resolve().relative_to(repo_root).as_posix() + except ValueError: + continue + if _matches_companion(rel_posix, allowlist): + allowed_files.add(path.resolve()) + elif _matches_internal(rel_posix, PDD_INTERNAL_PATH_ALLOWLIST): + allowed_files.add(path.resolve()) + + # Also pick up companion-shaped tracked deletions (sync legitimately + # removes ``.pdd/meta/foo_python.json`` when a module is renamed; the + # revert helper would otherwise resurrect it). 
+ # + # Iter-36 B-1/B-2: same internal-allowlist parallel pass — a tracked + # deletion of ``.pdd/agentic_sync_state.json`` (e.g. between sync + # invocations) must NOT be resurrected by the revert helper. + # + # Iter-38 M-1: ``_git_changed_paths`` now returns ``None`` on scan + # failure (was empty set). Enforcement-time scan failures are already + # handled separately downstream; here we treat ``None`` as the empty + # set so iteration is a no-op rather than crashing. + for rel_posix in (_git_changed_paths(repo_root) or set()): + absolute = (repo_root / rel_posix).resolve() + if _matches_companion(rel_posix, allowlist): + allowed_files.add(absolute) + elif _matches_internal(rel_posix, PDD_INTERNAL_PATH_ALLOWLIST): + allowed_files.add(absolute) + + # Iter-24 SHA-aware preservation of pre-existing changed baseline. + # + # Iter-36 B-3: mirror :meth:`AsyncSyncRunner._enforce_scope_guard`'s + # iter-34 ``baseline_deleted`` set — when ``current_hash is None``, + # the pre-existing file is gone from disk. For TRACKED baselines git + # status surfaces this as ``D `` and the re-scan picks it up, but + # UNTRACKED baselines leave no trail; without this collection the + # orchestrator silently passes a sync-side deletion of user WIP. + # + # Iter-40 M-1 (unreadable vs missing): use + # :func:`_classify_baseline_path` to distinguish "file deleted" from + # "file exists but unreadable" (permission flip, locked file). The + # latter case must NOT be flagged as deleted — preserve by name to + # avoid the false-deletion diagnostic and prevent downstream revert + # helpers from attempting to remove a still-present path. 
+ baseline_deleted: set[str] = set() + for rel_posix, baseline_hash in baseline_changed.items(): + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: + # File was deleted after baseline — surface it via the + # ``remaining`` set below regardless of whether it was tracked + # or untracked (we can't distinguish from the snapshot, and + # even the tracked-deletion case warrants a hard-fail). + baseline_deleted.add(rel_posix) + continue + if status.sha is None: + # Iter-40 M-1: present but unreadable. Preserve by name — + # same conservative carve-out as the unreadable-at-snapshot + # branch below — so a permission-flaky baseline is not + # misreported as deleted. + allowed_files.add((repo_root / rel_posix).resolve()) + continue + if baseline_hash is None or status.sha == baseline_hash: + # Unreadable at snapshot (preserve by name) or unchanged content + # → preserve. + allowed_files.add((repo_root / rel_posix).resolve()) + + # Iter-36 B-3: symmetric pass for ignored baseline. The orchestrator + # re-scan only sees files that ``git ls-files --ignored`` currently + # lists; deleted ignored baselines (e.g. user-side ``cache.bin`` erased + # before runner dispatch) leave no trail there. Iterate the ignored + # baseline directly to catch the deletion. Present-but-changed + # ignored baselines are already surfaced by + # :func:`_orchestrator_remaining_out_of_scope_paths`'s ignored loop. + # + # Iter-40 M-1: same unreadable-vs-missing discrimination. + for rel_posix, baseline_hash in baseline_ignored.items(): + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: + baseline_deleted.add(rel_posix) + elif status.sha is None: + # Present but unreadable — preserve by name. 
+ allowed_files.add((repo_root / rel_posix).resolve()) + + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) + untracked_reverted = revert_out_of_scope_changes_with_dirs( + repo_root, allowed_dirs=set(), allowed_files=allowed_files + ) + + seen: set[str] = set() + offending: List[str] = [] + for path in list(tracked_reverted) + list(untracked_reverted): + try: + rel = Path(path).resolve().relative_to(repo_root).as_posix() + except ValueError: + rel = str(path) + if rel in seen: + continue + if (repo_root / rel).resolve() in allowed_files: + continue + seen.add(rel) + offending.append(rel) + + # Iter-9 fail-closed re-scan: either revert helper can fail silently + # and return []. We re-scan the working tree after revert to be sure + # the contract is now satisfied; anything still on disk that is not + # allowed becomes the "unrecovered" set. + # + # Iter-36 B-3: union with ``baseline_deleted`` so an orchestrator-side + # deletion of a pre-existing untracked/ignored baseline path + # hard-fails the run. Mirrors :meth:`AsyncSyncRunner._enforce_scope_guard`'s + # iter-34 union for the per-module guard. 
+ remaining_raw = _orchestrator_remaining_out_of_scope_paths( + repo_root, allowed_files, baseline_ignored + ) + offending_set = set(offending) + remaining = sorted( + (set(remaining_raw) | baseline_deleted) - offending_set + ) + + if not offending and not remaining: + return None + + source = issue_contract.source or "" + allowed_lines = ( + "\n".join(f" - {p}" for p in sorted(allowed_paths_iter)) + or " - " + ) + companion_lines = ( + "\n".join(f" - {p}" for p in allowlist) or " - " + ) + if offending: + offending_lines = "\n".join(f" - {p}" for p in offending) + header = ( + f"Orchestrator scope guard reverted {len(offending)} " + f"out-of-scope file(s) before runner dispatch " + f"(contract source: {source}):\n{offending_lines}" + ) + else: + header = ( + "Orchestrator scope guard detected out-of-scope artifacts " + f"before runner dispatch (contract source: {source}) but the " + "revert helpers reported no successful reverts." + ) + + parts = [header] + if remaining: + unrecovered_lines = "\n".join(f" - {p}" for p in remaining) + parts.append( + "Unrecovered (revert failed, manual cleanup required):\n" + f"{unrecovered_lines}" + ) + parts.append(f"Allowed write set:\n{allowed_lines}") + parts.append(f"Companion allowlist:\n{companion_lines}") + diagnostic = "\n".join(parts) + + if not quiet: + # F8 parity with the per-module scope guard: echo diagnostic to + # stderr immediately so the user sees what was reverted before + # the orchestrator's combined return message appears. + print(diagnostic, file=sys.stderr) + return diagnostic + + +def _matches_companion(rel_posix: str, allowlist: Iterable[str]) -> bool: + """Anchored companion-allowlist match for orchestrator scope guard. + + Iter-30: thin wrapper around the iter-14 module-relative anchored + matcher used by :class:`AsyncSyncRunner._enforce_scope_guard`. 
Kept + local to the orchestrator so the orchestrator and the per-module + guard share semantics without the orchestrator importing the runner's + instance method. + """ + for pattern in allowlist: + if not pattern: + continue + if not _is_valid_companion_pattern(pattern): + continue + if _matches_companion_pattern_anchored(rel_posix, pattern): + return True + return False + + +def _matches_internal(rel_posix: str, internal_allowlist: Iterable[str]) -> bool: + """Iter-36 B-1/B-2: anchored match for PDD-internal infrastructure paths. + + Distinct from :func:`_matches_companion` so internal allowlist patterns + bypass the user-facing :func:`_is_valid_companion_pattern` gate (the + internal list is curated, not parsed from user input) but still go + through the iter-14 segment-aware anchored matcher so e.g. + ``subdir/.pdd/agentic-logs/x.jsonl`` does NOT match — only top-level + ``.pdd/agentic-logs/x.jsonl`` does. + """ + for pattern in internal_allowlist: + if not pattern: + continue + if _matches_companion_pattern_anchored(rel_posix, pattern): + return True + return False + + +def _orchestrator_remaining_out_of_scope_paths( + repo_root: Path, + allowed_files: set[Path], + baseline_ignored: Dict[str, Optional[str]], +) -> List[str]: + """Iter-30 fail-closed re-scan for the orchestrator scope guard. + + Re-runs the iter-9 + iter-20 + iter-24 re-scan logic from + :meth:`AsyncSyncRunner._remaining_out_of_scope_paths` but parameterised + on the orchestrator's snapshots (the runner uses its own instance + state). Returns the sentinel ``[""]`` when either + of the underlying git probes fails so the orchestrator surfaces the + failure rather than silently passing. 
+ """ + try: + result = subprocess.run( + ["git", "-C", str(repo_root), "status", + "--porcelain", "--untracked-files=all"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if result.returncode != 0: + return [""] + + remaining: set[str] = set() + for line in result.stdout.splitlines(): + if len(line) < 4: + continue + payload = line[3:].strip() + if not payload: + continue + if " -> " in payload: + old_raw, new_raw = payload.split(" -> ", 1) + entry_paths = [old_raw.strip().strip('"'), + new_raw.strip().strip('"')] + else: + entry_paths = [payload.strip('"')] + for rel in entry_paths: + rel = rel.strip() + if rel.startswith("./"): + rel = rel[2:] + if not rel: + continue + absolute = (repo_root / rel).resolve() + if absolute in allowed_files: + continue + remaining.add(rel) + + try: + ignored_result = subprocess.run( + ["git", "-C", str(repo_root), "ls-files", + "--others", "--ignored", "--exclude-standard"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if ignored_result.returncode != 0: + return [""] + + for line in ignored_result.stdout.splitlines(): + rel = line.strip().strip('"') + if rel.startswith("./"): + rel = rel[2:] + if not rel: + continue + if rel in baseline_ignored: + baseline_hash = baseline_ignored[rel] + current_hash = _hash_file(repo_root, rel) + if current_hash is not None and ( + baseline_hash is None or current_hash == baseline_hash + ): + continue + absolute = (repo_root / rel).resolve() + if absolute in allowed_files: + continue + remaining.add(rel) + + return sorted(remaining) + + +def _arch_path_in_scope( + arch_path: Path, + project_root: Path, + issue_contract: Optional[IssueContract], + scope_guard: bool, +) -> bool: + """Return True if *arch_path* is in the issue contract's allowed write set. 
+ + Iter-28 B-2: the iter-26 gate compared the literal string + ``"architecture.json"`` against the contract. That bypassed the guard when + the project uses a nested architecture (e.g. ``frontend/architecture.json``) + — the literal check would either falsely allow a write the contract + forbids, or falsely block a write the contract permits. The check now + resolves the ACTUAL ``arch_path`` to a repo-relative POSIX path and + compares that against the contract. + + The gate is bypassed (returns True) when: + - no contract was parsed (``issue_contract`` is None → permissive mode), or + - ``--no-scope-guard`` was passed (``scope_guard`` is False). + + Returns False when ``arch_path`` resolves outside ``project_root`` — that + is treated as out-of-scope by definition (the contract is repo-relative, + so a path outside the repo cannot be allowed by it). + """ + if issue_contract is None or scope_guard is False: + return True + try: + arch_rel = ( + arch_path.resolve().relative_to(project_root.resolve()).as_posix() + ) + except ValueError: + # arch_path resolves outside project_root — by definition not in scope. + return False + return arch_rel in tuple(issue_contract.allowed_paths or ()) + + def _apply_architecture_corrections( arch_path: Path, architecture: List[Dict[str, Any]], @@ -1666,6 +2218,7 @@ def run_agentic_sync( durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, + scope_guard: bool = True, ) -> Tuple[bool, str, float, str]: """ Run agentic sync workflow: identify modules from a GitHub issue and sync in parallel. @@ -1737,6 +2290,7 @@ def run_agentic_sync( # 5. 
Build issue content issue_content = f"Title: {title}\n\nDescription:\n{body}\n" + comment_bodies: List[str] = [] if comments_data and isinstance(comments_data, list): issue_content += "\nComments:\n" for comment in comments_data: @@ -1744,6 +2298,55 @@ def run_agentic_sync( c_user = comment.get("user", {}).get("login", "unknown") c_body = comment.get("body", "") issue_content += f"\n--- Comment by {c_user} ---\n{c_body}\n" + if isinstance(c_body, str) and c_body: + comment_bodies.append(c_body) + + # Issue #1013 — split-contract scope guard (F3, F4, F6, F11): + # Parse the structured contract from the issue body first, then comments, + # *regardless* of whether scope-guard enforcement is enabled — the + # ``--no-scope-guard`` opt-out should still record the parsed contract + # for diagnostics. The runner short-circuits enforcement when + # ``scope_guard_enabled=False`` (see AsyncSyncRunner._enforce_scope_guard). + issue_contract: Optional[IssueContract] = parse_issue_contract( + body, comment_bodies + ) + if not quiet: + if not scope_guard: + # F7: this is the single user-facing log line for the + # ``--no-scope-guard`` opt-out. The runner no longer logs the + # same state on entry — kept here so it's closer to the user. + console.print( + "[yellow]Sync scope guard: disabled via --no-scope-guard[/yellow]" + ) + elif issue_contract is not None: + console.print( + f"[dim]Sync scope guard: contract loaded from " + f"{issue_contract.source} " + f"({len(issue_contract.allowed_paths)} allowed paths)[/dim]" + ) + else: + console.print( + "[dim]Sync scope guard: no contract on issue — " + "running in permissive mode[/dim]" + ) + + # Resolve effective allow set / companion allowlist for the runner. + # ``None`` (permissive) is preserved when no contract was parsed so the + # runner can distinguish "no contract" from "explicit empty contract". 
+ # The runner unions the companion allowlist with + # DEFAULT_SYNC_COMPANION_ALLOWLIST in its __init__ (F4); we still pre-union + # here so the durable runner's parent __init__ does the same dedup pass. + if issue_contract is not None: + allowed_write_paths: Optional[List[str]] = list(issue_contract.allowed_paths) + effective_companion_allowlist: Tuple[str, ...] = tuple( + dict.fromkeys( + tuple(issue_contract.companion_allowlist) + + tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST) + ) + ) + else: + allowed_write_paths = None + effective_companion_allowlist = tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST) issue_content = _escape_format_braces(issue_content) @@ -1755,6 +2358,81 @@ def run_agentic_sync( if not quiet: console.print("[yellow]No architecture.json found, falling back to include-based dependency graph[/yellow]") + # Iter-30: orchestrator-level scope guard baseline snapshot. The + # per-module scope guard in :class:`AsyncSyncRunner` only enforces AFTER + # the runner is constructed, so any pre-dispatch LLM call or shell + # command in this orchestrator can produce out-of-contract writes that + # the per-module guard never sees. We snapshot the changed + ignored + # working tree now, run the orchestrator's pre-dispatch work, then + # invoke :func:`_enforce_orchestrator_scope` at every early return so + # any orchestrator-level violation is reverted before the function + # exits. Gated on ``scope_guard AND issue_contract is not None`` so + # non-contract runs and explicit ``--no-scope-guard`` opt-outs skip the + # baseline cost (mirrors the per-module guard's gate in + # :class:`AsyncSyncRunner.__init__`). The orchestrator's cwd is the + # user's main checkout in BOTH async and durable mode — durable mode + # only branches to worktrees inside :class:`DurableSyncRunner.run`, + # which runs after this guard is no longer relevant. 
+ if scope_guard and issue_contract is not None: + # Iter-38 M-1 (fail-closed baseline acquisition): the helpers now + # return ``None`` on transient git failure (lock contention, missing + # binary, OSError) instead of an empty set. Without this + # discrimination an init-time scan failure here would silently + # produce an empty baseline that the orchestrator scope guard + # later treats as "no pre-existing files," so any pre-existing + # user WIP could be reverted/deleted by the pre-dispatch check. + # When EITHER scan fails we abort BEFORE any LLM call or shell + # command runs in this orchestrator. The runner has its own + # symmetric guard in :meth:`AsyncSyncRunner.run`. + _raw_orch_changed = _git_changed_paths(project_root) + _raw_orch_ignored = _git_ignored_paths(project_root) + if _raw_orch_changed is None or _raw_orch_ignored is None: + msg = ( + "Scope guard fail-closed: could not snapshot working-tree " + "baseline at orchestrator init (git scan failed). Aborting " + "before any pre-dispatch LLM/shell work to prevent " + "false-positive reverts of pre-existing user files." + ) + if use_github_state: + _post_error_comment(owner, repo, issue_number, msg) + return False, msg, 0.0, "" + _orch_baseline_changed: Dict[str, Optional[str]] = _hash_baseline_paths( + project_root, _raw_orch_changed + ) + _orch_baseline_ignored: Dict[str, Optional[str]] = _hash_baseline_paths( + project_root, _raw_orch_ignored + ) + else: + _orch_baseline_changed = {} + _orch_baseline_ignored = {} + + def _orch_scope_check_return( + msg: str, cost: float, prov: str, success: bool + ) -> Tuple[bool, str, float, str]: + """Iter-30: wrap an early return with orchestrator scope-guard enforcement. + + Closes the entire class of orchestrator-level scope bypasses that + iter-26, iter-28, and iter-29 each found a new instance of. 
When the + scope guard reverts anything, the return is forced to ``success=False`` + and the diagnostic is appended to the caller's message so the user + sees what was reverted before the orchestrator returned. + """ + scope_diagnostic = _enforce_orchestrator_scope( + project_root, + issue_contract, + scope_guard, + _orch_baseline_changed, + _orch_baseline_ignored, + quiet=quiet, + ) + if scope_diagnostic is None: + return success, msg, cost, prov + combined = ( + f"{msg}\n\nOrchestrator scope guard: out-of-contract artifacts " + f"detected before dispatch:\n{scope_diagnostic}" + ) + return False, combined, cost, prov + # 7. Try git diff-based module detection first (deterministic, free) branch_modules = _detect_modules_from_branch_diff(project_root) llm_cost = 0.0 @@ -1780,7 +2458,7 @@ def run_agentic_sync( "skipping LLM identification.[/green]" ) console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) else: # 7b. Fall back to LLM-based module identification prompt_template = load_prompt_template("agentic_sync_identify_modules_LLM") @@ -1814,7 +2492,7 @@ def run_agentic_sync( msg = f"LLM failed to identify modules: {llm_output}" if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) # 9. Parse LLM response modules_to_sync, deps_valid, deps_corrections = _parse_llm_response(llm_output) @@ -1831,11 +2509,11 @@ def run_agentic_sync( msg = "All modules are already synced — nothing to do." 
if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) msg = "LLM identified no modules to sync" if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) # LLM returns basenames from architecture.json filenames (e.g., "crm_models_Python"). # pdd sync expects basenames without the language suffix (e.g., "crm_models"). @@ -1867,7 +2545,7 @@ def run_agentic_sync( msg = "All modules are already synced — nothing to do." if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) # 9.4 Augment architecture with entries from the PR branch (new modules created by pdd-change) architecture = _augment_architecture_from_pr_branch(architecture, project_root, issue_number) @@ -1881,12 +2559,34 @@ def run_agentic_sync( msg = f"No valid modules to sync (all basenames were invalid: {invalid_basenames})" if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) if not quiet: console.print(f"[green]Modules to sync: {modules_to_sync}[/green]") # 10. Apply dependency corrections if needed + # + # Iter-26: scope-guard the LLM dependency-correction step. This runs + # at the ORCHESTRATOR level — before the runner exists — so the per- + # module scope guard cannot catch it. If the issue contract does not + # include the ACTUAL ``arch_path`` (as a repo-relative POSIX path) in + # its allowed write set, skip the correction so the contract is not + # silently violated. 
Pre-existing behavior is preserved when no contract + # was parsed (``issue_contract`` is None → permissive mode) or when the + # arch path IS in the contract (legitimate architecture-touching PRs). + # ``--no-scope-guard`` (``scope_guard is False``) is treated as an + # explicit opt-out and also bypasses the gate. + # + # Iter-28 B-2: use the resolved ``arch_path`` rather than the literal + # string ``"architecture.json"``. A nested arch path such as + # ``frontend/architecture.json`` would otherwise either bypass the gate + # (when the contract allows ``architecture.json`` literally but the + # actual file is nested) or be incorrectly skipped (when the contract + # allows the nested path explicitly). The actual file being written is + # the source of truth, delegated to :func:`_arch_path_in_scope`. + arch_in_scope = _arch_path_in_scope( + arch_path, project_root, issue_contract, scope_guard + ) if not deps_valid and deps_corrections and architecture is not None: if dry_run: if not quiet: @@ -1894,9 +2594,18 @@ def run_agentic_sync( "[yellow]Dry run: dependency corrections were suggested; " "architecture.json was not modified.[/yellow]" ) + elif not arch_in_scope: + if not quiet: + console.print( + "[yellow]Sync scope guard: skipping LLM dependency " + "corrections — architecture.json is outside the issue " + "split-contract allowed write set. Add architecture.json " + "to the contract or rerun with --no-scope-guard to apply " + "corrections.[/yellow]" + ) elif not quiet: console.print("[yellow]LLM flagged dependency corrections, updating architecture.json...[/yellow]") - if not dry_run: + if not dry_run and arch_in_scope: architecture = _apply_architecture_corrections(arch_path, architecture, deps_corrections, quiet) # 11. 
Build dependency graph @@ -1948,7 +2657,7 @@ def run_agentic_sync( console.print(f"[red]{msg}[/red]") if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) if not quiet: for bn, cwd in module_cwds.items(): @@ -1971,14 +2680,14 @@ def run_agentic_sync( msg = "All modules are already synced — nothing to do." if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) if dry_run: module_list = ", ".join(modules_to_sync) msg = f"Dry run complete: {len(modules_to_sync)} module(s) would sync: {module_list}" if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) # 12. Run parallel sync sync_options = { @@ -1999,6 +2708,47 @@ def run_agentic_sync( "cwd": project_root, } if use_github_state else None + contract_source: Optional[str] = ( + issue_contract.source if issue_contract is not None else None + ) + + # Iter-32 B-1: orchestrator scope guard at the dispatch boundary. + # iter-30 wrapped every early-return site with + # :func:`_orch_scope_check_return`, but the SUCCESSFUL DISPATCH path — + # where the orchestrator constructs ``AsyncSyncRunner`` or + # ``DurableSyncRunner`` and calls ``.run()`` — was intentionally left + # unwrapped (the runner has its own per-module guard). The gap: any + # pre-dispatch write from LLM module identification, dry-run validation, + # ``_apply_architecture_corrections``, etc. that does NOT trigger an + # early return reaches the runner, where the very first thing + # :class:`AsyncSyncRunner.__init__` does is snapshot the working tree + # as ``_baseline_changed_paths``. That baseline AUTO-ALLOWS those + # writes for the entire sync session (per-module guard preserves + # baseline paths). 
Run the orchestrator guard one last time here so + # out-of-contract pre-dispatch writes are reverted and dispatch fails + # with a clear diagnostic BEFORE the runner snapshots them. + # + # Single check before the ``if durable: ... else: ...`` branch covers + # both async and durable construction sites — the only intervening + # logic is the runner-class selection, which has no write side + # effects. + scope_diagnostic = _enforce_orchestrator_scope( + project_root, + issue_contract, + scope_guard, + _orch_baseline_changed, + _orch_baseline_ignored, + quiet=quiet, + ) + if scope_diagnostic is not None: + combined = ( + f"Orchestrator scope guard hard-fail before dispatch: " + f"out-of-contract artifacts detected.\n{scope_diagnostic}" + ) + if use_github_state: + _post_error_comment(owner, repo, issue_number, combined) + return False, combined, llm_cost, provider + if durable: runner = DurableSyncRunner( basenames=modules_to_sync, @@ -2015,6 +2765,10 @@ def run_agentic_sync( issue_url=issue_url, module_cwds=module_cwds, initial_cost=llm_cost, + allowed_write_set=allowed_write_paths, + companion_allowlist=effective_companion_allowlist, + scope_guard_enabled=scope_guard, + contract_source=contract_source, ) else: runner = AsyncSyncRunner( @@ -2027,6 +2781,10 @@ def run_agentic_sync( issue_url=issue_url, module_cwds=module_cwds, initial_cost=llm_cost, + allowed_write_set=allowed_write_paths, + companion_allowlist=effective_companion_allowlist, + scope_guard_enabled=scope_guard, + contract_source=contract_source, ) runner_success, runner_msg, total_cost = runner.run() diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 6ca7606c9..3ac468990 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -8,6 +8,7 @@ import csv as _csv import datetime +import hashlib import json import os import re @@ -18,13 +19,22 @@ import tempfile import threading import time +from collections import defaultdict from concurrent.futures import 
FIRST_COMPLETED, ThreadPoolExecutor, wait from dataclasses import dataclass, field from pathlib import Path -from typing import Any, Dict, List, NamedTuple, Optional, Tuple +from typing import Any, Dict, Iterable, List, NamedTuple, Optional, Set, Tuple from rich.console import Console +from .agentic_common import ( + DEFAULT_SYNC_COMPANION_ALLOWLIST, + PDD_INTERNAL_PATH_ALLOWLIST, + _is_valid_companion_pattern, + _matches_companion_pattern_anchored, + _revert_out_of_scope_changes, +) +from .agentic_common_worktree import revert_out_of_scope_changes_with_dirs from .construct_paths import _is_known_language console = Console() @@ -133,6 +143,244 @@ class DepGraphFromArchitectureResult(NamedTuple): warnings: List[str] +def _normalize_repo_path(path: str) -> str: + """Normalize a repository-relative path for contract comparisons. + + Strip a single leading ``./`` segment only. Do NOT use ``str.lstrip("./")`` + which strips arbitrary leading ``.`` and ``/`` characters and would mangle + legitimate paths whose first segment starts with a dot (e.g. + ``.pdd/meta/foo.json`` would become ``pdd/meta/foo.json`` and miss the + ``.pdd/meta/*.json`` companion glob — Issue #1013 F5 regression). + """ + cleaned = str(path or "").replace("\\", "/").strip() + if cleaned.startswith("./"): + cleaned = cleaned[2:] + return cleaned + + +def _git_changed_paths(project_root: Path) -> Optional[set[str]]: + """Return changed paths from git status, or ``None`` on scan failure. + + Iter-38 M-1 (fail-closed baseline acquisition): previously returned an + empty set on any subprocess failure or non-zero return, indistinguishable + from "scan succeeded but worktree was clean." That ambiguity let a + transient git failure at runner/orchestrator init time produce an empty + baseline that the scope guard later treats as "user had nothing dirty," + so any pre-existing user file is falsely flagged as out-of-scope and + reverted/deleted. 
+ + A successful scan that finds no changes returns an empty set; only + failures (OSError, ``subprocess.SubprocessError``, non-zero return code) + return ``None``. Init-time callers MUST treat ``None`` as a fail-closed + abort signal (see :class:`AsyncSyncRunner.__init__` and + :func:`pdd.agentic_sync.run_agentic_sync`). Enforcement-time callers + that already have a separate ```` policy (see + :meth:`_remaining_out_of_scope_paths`) treat ``None`` as the empty set + via ``or set()``. + """ + try: + result = subprocess.run( + ["git", "status", "--porcelain", "--untracked-files=all"], + cwd=project_root, + capture_output=True, + text=True, + check=False, + ) + except (OSError, subprocess.SubprocessError): + return None + if result.returncode != 0: + return None + + paths: set[str] = set() + for line in result.stdout.splitlines(): + if len(line) < 4: + continue + payload = line[3:].strip() + if not payload: + continue + if " -> " in payload: + old_path, new_path = payload.split(" -> ", 1) + paths.add(_normalize_repo_path(old_path.strip('"'))) + paths.add(_normalize_repo_path(new_path.strip('"'))) + else: + paths.add(_normalize_repo_path(payload.strip('"'))) + return {p for p in paths if p} + + +def _git_ignored_paths(project_root: Path) -> Optional[set[str]]: + """Return repo-relative POSIX paths of git-ignored files (Issue #1013 iter-20). + + Uses ``git ls-files --others --ignored --exclude-standard`` to enumerate + every individual ignored file (no directory entries, no status prefix). + May be slow on repos with large ignored trees (``node_modules/``, + ``build/``, etc.) — callers MUST gate the call on + ``scope_guard_enabled AND allowed_write_paths is not None`` so non-contract + runs do not pay the cost. + + Iter-38 M-1 (fail-closed baseline acquisition): returns ``None`` on any + subprocess failure or non-zero return, NOT an empty set. 
The init-time + callers that snapshot the baseline (see :class:`AsyncSyncRunner.__init__` + and :func:`pdd.agentic_sync.run_agentic_sync`) MUST treat ``None`` as a + fail-closed abort signal — otherwise a transient git failure at init + silently produces an empty baseline that the scope guard later treats as + "no pre-existing files," so any pre-existing user WIP is falsely flagged + as out-of-scope and reverted/deleted. + + Enforcement-time callers (post-revert re-scan in + :meth:`_remaining_out_of_scope_paths`) handle ignored-scan failures with + the separate ```` sentinel; those sites treat ``None`` + as the empty set via ``or set()``. + """ + try: + result = subprocess.run( + ["git", "ls-files", "--others", "--ignored", "--exclude-standard"], + cwd=project_root, + capture_output=True, + text=True, + check=False, + ) + except (OSError, subprocess.SubprocessError): + return None + if result.returncode != 0: + return None + + paths: set[str] = set() + for line in result.stdout.splitlines(): + rel = _normalize_repo_path(line.strip().strip('"')) + if rel: + paths.add(rel) + return paths + + +def _hash_file(project_root: Path, rel_posix: str) -> Optional[str]: + """Return the SHA-1 of *rel_posix* under *project_root*, or None. + + Issue #1013 iter-24 (M-1) baseline-clobber fix: the scope guard preserves + pre-existing dirty/untracked paths (iter-6 B1) so the sync run does not + delete unrelated user WIP. The original implementation matched paths by + NAME only, which let a buggy LLM SILENTLY OVERWRITE an out-of-scope + baseline file with different content — the post-revert re-scan saw the + name in the baseline and skipped the contract check. + + This hash is captured per baseline path at runner init and re-computed + at scope-guard time; only an unchanged SHA (same bytes on disk) is + treated as pre-existing user WIP and auto-allowed. A divergent SHA + falls through to the contract check, surfacing the clobber. 
+ + SHA-1 is sufficient — this is clobber detection, not adversarial + collision resistance. Returns ``None`` when the file cannot be read + (missing, permission denied, etc.); callers MUST treat ``None`` as + "no fingerprint available" and decide policy explicitly. + + Iter-40 M-1: callers that need to DISCRIMINATE between missing and + unreadable (the iter-34 deletion-detection paths in the scope guards) + must use :func:`_classify_baseline_path` instead — this helper collapses + both cases to ``None`` and is kept for the snapshot-time + re-scan + sites where the fall-through "preserve by name" / "surface as + out-of-scope" semantics are already correct. + """ + try: + path = (project_root / rel_posix).resolve() + with open(path, "rb") as handle: + data = handle.read() + except (OSError, FileNotFoundError): + return None + return hashlib.sha1(data).hexdigest() + + +class _BaselinePathStatus(NamedTuple): + """Result of re-classifying a baseline file at scope-guard time. + + Iter-40 M-1: the iter-34 deletion-detection branches in the per-module + and orchestrator scope guards previously collapsed "file missing" and + "file unreadable" to the same ``current_hash is None`` signal, then + treated both as deletions. A pre-existing baseline file that became + UNREADABLE mid-sync (permission flip, locked file) was falsely flagged + as deleted — and downstream revert helpers were asked to remove a + path that still exists on disk. + + Fields: + sha: SHA-1 hex digest when the file was successfully hashed, else + ``None``. + missing: True when the file no longer exists on disk (the iter-34 + deletion case), False otherwise. When False AND ``sha is None`` + the file exists but could not be read (permission, OSError). + """ + + sha: Optional[str] + missing: bool + + +def _classify_baseline_path( + project_root: Path, rel_posix: str +) -> _BaselinePathStatus: + """Discriminated re-hash for baseline preservation at enforcement time. 
+ + Iter-40 M-1 fix for the deletion blind spot (iter-34) which previously + treated unreadable files as deleted. Returns: + + - ``_BaselinePathStatus(hex_sha, False)`` — file exists and was hashed + - ``_BaselinePathStatus(None, True)`` — file is gone (iter-34 deletion) + - ``_BaselinePathStatus(None, False)`` — file exists but unreadable + (permission flip / OSError) → callers SHOULD preserve by name to + avoid the false-deletion diagnostic + the downstream revert-helper + attempt to remove a still-present path. + + This helper is intentionally NOT a replacement for :func:`_hash_file`. + The snapshot-time callsite (:func:`_hash_baseline_paths`) and the + re-scan loops in :meth:`AsyncSyncRunner._remaining_out_of_scope_paths` + + :func:`_orchestrator_remaining_out_of_scope_paths` already do the + right thing on a collapsed ``None`` — they either record ``None`` as + "no fingerprint available" (snapshot) or fall through to surfacing + the path as out-of-scope (re-scan, where git already listed the file). + Only the iter-34 baseline-iteration sites in + :meth:`AsyncSyncRunner._enforce_scope_guard` and + :func:`pdd.agentic_sync._enforce_orchestrator_scope` need the + discriminated answer. + """ + try: + path = (project_root / rel_posix).resolve() + except OSError: + # Path resolution itself failed — treat as unreadable, preserve. + return _BaselinePathStatus(None, False) + # ``exists()`` is the primary "missing" probe. ``open()`` below still + # has a defensive ``FileNotFoundError`` catch in case the file is + # raced out from under us between the two syscalls. + if not path.exists(): + return _BaselinePathStatus(None, True) + try: + with open(path, "rb") as handle: + data = handle.read() + except FileNotFoundError: + # Raced removal between exists() and open() — treat as missing. + return _BaselinePathStatus(None, True) + except OSError: + # PermissionError / locked file / generic IO — file is present + # but cannot be read. Distinct from missing. 
+ return _BaselinePathStatus(None, False) + return _BaselinePathStatus(hashlib.sha1(data).hexdigest(), False) + + +def _hash_baseline_paths( + project_root: Path, paths: Iterable[str] +) -> Dict[str, Optional[str]]: + """Map each repo-relative path under *project_root* to its SHA-1. + + Iter-30: extracted helper. Previously this was an inline dict + comprehension in :class:`AsyncSyncRunner.__init__` (mirrored twice for + the changed-paths and ignored-paths baselines). The orchestrator scope + guard now reuses it to snapshot baseline before any pre-dispatch LLM + call or shell command runs in :func:`pdd.agentic_sync.run_agentic_sync`. + + Returns: + Dict from repo-relative POSIX path to SHA-1 hex string. ``None`` is + recorded when the file cannot be read at snapshot time — callers + (the scope guard) must treat ``None`` as "no fingerprint available" + and decide preservation policy explicitly. See :func:`_hash_file`. + """ + return {rel: _hash_file(project_root, rel) for rel in paths} + + # --------------------------------------------------------------------------- # Helper functions # --------------------------------------------------------------------------- @@ -820,6 +1068,11 @@ def __init__( module_cwds: Optional[Dict[str, Any]] = None, module_targets: Optional[Dict[str, str]] = None, initial_cost: float = 0.0, + allowed_write_set: Optional[Iterable[str]] = None, + companion_allowlist: Optional[Iterable[str]] = None, + scope_guard_enabled: bool = True, + contract_source: Optional[str] = None, + project_root: Optional[Path] = None, ): self.basenames: List[str] = list(basenames) self.dep_graph: Dict[str, List[str]] = { @@ -830,13 +1083,130 @@ def __init__( self.quiet = quiet self.verbose = verbose self.issue_url = issue_url - self.project_root: Path = Path.cwd() + # Issue #1013 iter-18 M-1 (durable baseline-paths bug): allow callers + # (notably ``DurableSyncRunner``) to pin ``project_root`` to a known + # repo root BEFORE the baseline-changed-paths snapshot is 
taken + # below. Without this kwarg the snapshot would always read + # ``git status`` from the caller's current working directory, so a + # dirty file in the user's main checkout would be auto-allowed in + # the durable worktree's scope guard. + self.project_root: Path = ( + Path(project_root).resolve() if project_root is not None else Path.cwd() + ) self.module_cwds: Dict[str, Any] = dict(module_cwds or {}) self.module_targets: Dict[str, str] = dict(module_targets or {}) self.initial_cost = float(initial_cost or 0.0) + # Issue #1013 — split-contract scope guard (F5, F9, F14, F4, F6): + # Track contract presence separately from set truthiness. ``None`` + # means "no contract → permissive fallback"; an explicit empty + # iterable means "contract present but empty → reject everything" + # (degenerate but legal). The single accepted kwarg name is + # ``allowed_write_set``; the legacy ``allowed_write_paths`` alias is + # gone per F14. + # + # F6: the parsed contract is recorded for diagnostics *regardless* of + # whether scope-guard enforcement is enabled. ``_enforce_scope_guard`` + # short-circuits on ``scope_guard_enabled=False``, so storing the + # contract in opt-out mode is safe and matches the spec requirement + # that disabled runners still see the parsed contract. + self.scope_guard_enabled: bool = bool(scope_guard_enabled) + self.contract_source: Optional[str] = contract_source + if allowed_write_set is not None: + self.allowed_write_paths: Optional[Set[str]] = { + _normalize_repo_path(path) + for path in allowed_write_set + if isinstance(path, str) and path.strip() + } + else: + self.allowed_write_paths = None + # F4: the effective companion allowlist is *always* the caller-provided + # patterns unioned with DEFAULT_SYNC_COMPANION_ALLOWLIST. Order is + # preserved (caller patterns first, defaults appended) and duplicates + # are removed deterministically. 
Passing an empty iterable still + # produces at least the default; passing ``None`` is identical to + # passing an empty iterable. + provided: Tuple[str, ...] = tuple( + p for p in (companion_allowlist or ()) + if isinstance(p, str) and p + ) + self.companion_allowlist: Tuple[str, ...] = tuple( + dict.fromkeys(provided + tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST)) + ) + + # Per-`git toplevel` locks for scope-guard git operations (F12). + # Modules may share a repo root (shared-worktree non-durable sync) so + # the lock key MUST resolve to the actual git toplevel, not the raw + # module_cwd path — otherwise two modules in the same repo get + # separate locks and the race we're trying to prevent reappears. + self._scope_guard_locks: Dict[str, threading.Lock] = defaultdict(threading.Lock) + self._scope_guard_locks_lock = threading.Lock() + + # Iter-6 B1 (data-loss bug): snapshot the working-tree changed/untracked + # set BEFORE any module sync runs. Pre-existing untracked files + # (e.g. user's ``scratch.txt``) are not the sync run's responsibility + # and MUST be preserved by the scope guard. + # + # Iter-24 M-1 (baseline-clobber bug): the snapshot is now a DICT + # mapping repo-relative POSIX path → init-time SHA-1 instead of a + # bare set. ``_enforce_scope_guard`` re-hashes each baseline path at + # check time; ONLY paths whose content is byte-identical to the init + # snapshot are auto-allowed. Name-based preservation let a buggy LLM + # silently OVERWRITE pre-existing dirty files outside the contract + # (the iter-23 codex repro). Empty dict — never bare set — when the + # gate (``scope_guard_enabled AND allowed_write_paths is not None``) + # is off, so the dict.items() loops in the enforcement path are + # safe no-ops. + # + # Iter-20 M-1 (gitignored fail-open): also snapshot pre-existing + # gitignored files (e.g. user-side ``build/cache.bin`` under a + # repo-wide ``.gitignore: build/``). 
``git status`` does not show + # ignored files by default, so without this baseline a sync that + # writes to a gitignored path outside the allowed write set would be + # invisible to the post-revert re-scan and the module would be + # marked successful with the contract violated on disk. + # + # Iter-24 M-1: same dict-with-SHA shape as ``_baseline_changed_paths``. + # + # Iter-38 M-1 (fail-closed baseline acquisition): the helpers now + # return ``None`` on scan failure (transient git lock contention, + # missing binary, OSError) instead of an empty set. Without this + # discrimination an init-time scan failure would silently produce + # an empty baseline that the scope guard later treats as "no + # pre-existing files," so any pre-existing user WIP is falsely + # flagged as out-of-scope and reverted/deleted. When EITHER scan + # returns ``None`` we record ``_baseline_acquisition_failed = True`` + # and :meth:`run` aborts before any write-capable work. The flag + # is internal — public signatures are unchanged. 
+ if self.scope_guard_enabled and self.allowed_write_paths is not None: + _raw_changed = _git_changed_paths(self.project_root) + _raw_ignored = _git_ignored_paths(self.project_root) + if _raw_changed is None or _raw_ignored is None: + self._baseline_acquisition_failed: bool = True + self._baseline_changed_paths: Dict[str, Optional[str]] = {} + self._baseline_ignored_paths: Dict[str, Optional[str]] = {} + else: + self._baseline_acquisition_failed = False + self._baseline_changed_paths = _hash_baseline_paths( + self.project_root, _raw_changed + ) + self._baseline_ignored_paths = _hash_baseline_paths( + self.project_root, _raw_ignored + ) + else: + self._baseline_acquisition_failed = False + self._baseline_changed_paths = {} + self._baseline_ignored_paths = {} + self.total_budget = self.sync_options.get("total_budget") self.max_workers = 1 if self.total_budget is not None else MAX_WORKERS + # When a contract narrows writes AND scope-guard enforcement is + # active, serialise across modules so the per-cwd lock isn't fighting + # parallel git status / git checkout calls. With ``--no-scope-guard`` + # the contract is recorded for diagnostics only — no enforcement runs + # — so parallelism is preserved (F6). + if self.scope_guard_enabled and self.allowed_write_paths is not None: + self.max_workers = 1 self.module_states: Dict[str, ModuleState] = { b: ModuleState() for b in self.basenames @@ -1419,6 +1789,34 @@ def _record_result( # ------------------------------------------------------------------ def run(self) -> Tuple[bool, str, float]: """Run all syncs respecting dependencies.""" + # Issue #1013 iter-18 m-1: scope-guard run-entry logging is owned by + # the sync layer (`pdd/agentic_sync.py` ``run_agentic_sync``), which + # emits a single user-facing line per invocation covering all three + # states (disabled / contract loaded / no contract). 
The runner used + # to log the permissive-fallback and opt-out states a second time + # here, which produced a duplicate line for every sync. Removed so + # the operator sees one authoritative status line. + + # Iter-38 M-1 (fail-closed baseline acquisition): the __init__ + # baseline-scan helpers (``_git_changed_paths`` / ``_git_ignored_paths``) + # now return ``None`` on transient git failure (lock contention, + # missing binary, OSError) rather than an empty set. An empty + # baseline indistinguishable from "scan succeeded, worktree clean" + # would later cause the scope guard to flag pre-existing user WIP + # as out-of-scope and revert/delete it. When the init recorded a + # failed acquisition, abort BEFORE any write-capable work runs. + if getattr(self, "_baseline_acquisition_failed", False): + return ( + False, + ( + "Scope guard fail-closed: could not snapshot working-tree " + "baseline at runner init (git scan failed). Aborting " + "before any write-capable work to prevent false-positive " + "reverts of pre-existing user files." + ), + self.initial_cost, + ) + if not self.basenames: return True, "No modules to sync", self.initial_cost @@ -1651,6 +2049,25 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: last_stdout = "" last_stderr = "" repair_directive: Optional[str] = None + module_cwd = Path(self.module_cwds.get(basename, self.project_root)) + + def _apply_scope_guard( + success: bool, total_cost: float, error: str + ) -> Tuple[bool, float, str]: + """ + Wrap the result of a per-module attempt with scope-guard + enforcement (Issue #1013, F6, F7, F8). Applied on every return + path, success OR failure, so out-of-scope artifacts + are reverted even when ``pdd sync`` itself failed.
+ """ + diagnostic = self._enforce_scope_guard(basename, module_cwd) + if diagnostic is None: + return success, total_cost, error + scope_failure = ( + "Scope guard hard-fail: out-of-scope artifacts detected\n" + + diagnostic + ) + return False, total_cost, scope_failure for attempt in range(MAX_CONFORMANCE_ATTEMPTS): success, cost, error, stdout, stderr = self._run_attempt( @@ -1664,12 +2081,12 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: last_stderr = stderr if success: - return True, total_cost, "" + return _apply_scope_guard(True, total_cost, "") conformance = _parse_conformance_failure(stdout, stderr) if conformance is None: # Not a conformance failure: do not retry - return False, total_cost, error + return _apply_scope_guard(False, total_cost, error) new_directive, new_missing = conformance if last_missing is not None and new_missing == last_missing: @@ -1688,11 +2105,513 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: ) break - # Hard-failure path: include structured conformance block + # Hard-failure path: include structured conformance block, then run + # the scope guard so a failing conformance loop still cleans up + # out-of-scope writes the LLM made on the way to the failure. hard_block = self._build_conformance_hard_failure( basename, last_error, last_stdout, last_stderr ) - return False, total_cost, hard_block + return _apply_scope_guard(False, total_cost, hard_block) + + # ------------------------------------------------------------------ + # Issue #1013 — split-contract scope guard + # ------------------------------------------------------------------ + + def _resolve_repo_root(self, module_cwd: Path) -> Path: + """ + Return the git toplevel for *module_cwd*, falling back to *module_cwd* + when git is unavailable or the directory is not in a repo. 
+ """ + try: + result = subprocess.run( + ["git", "-C", str(module_cwd), "rev-parse", "--show-toplevel"], + capture_output=True, + text=True, + timeout=10, + ) + except (OSError, subprocess.SubprocessError): + return module_cwd + if result.returncode != 0: + return module_cwd + toplevel = result.stdout.strip() + if not toplevel: + return module_cwd + return Path(toplevel) + + def _scope_guard_lock(self, repo_root: Path) -> threading.Lock: + """Return a per-repo-root :class:`threading.Lock` (F12).""" + key = str(repo_root.resolve()) + with self._scope_guard_locks_lock: + return self._scope_guard_locks[key] + + def _matches_companion_allowlist( + self, rel_posix_path: str, allowlist: Iterable[str] + ) -> bool: + """Return True if *rel_posix_path* matches any companion glob. + + Issue #1013 iter-14 M-1: uses + :func:`_matches_companion_pattern_anchored` (segment-aware, + anchored at the START of the path) rather than + :meth:`pathlib.PurePosixPath.match`. The pathlib matcher is + suffix-based when the pattern is relative, so + ``.pdd/meta/*.json`` falsely matches ``subdir/.pdd/meta/foo.json`` + — letting a contract violator bypass the guard by writing + fingerprint-shaped files nested under any directory. + """ + for pattern in allowlist: + if not pattern: + continue + # Issue #1013 iter-10 M-1 (defense-in-depth): even if a + # wildcard-only / doublestar pattern slipped past the parser, + # refuse to treat it as auto-allowing repo-wide writes. + if not _is_valid_companion_pattern(pattern): + continue + if _matches_companion_pattern_anchored(rel_posix_path, pattern): + return True + return False + + def _remaining_out_of_scope_paths( + self, repo_root: Path, allowed_files: Set[Path] + ) -> List[str]: + """ + Iter-9 M-1 (fail-closed boundary): re-scan the worktree after the + revert helpers have run and return any repo-relative paths still NOT + in *allowed_files*. 
+ + This guards against silent fail-open when either revert helper + cannot inspect / revert / remove an out-of-scope path (git timeout, + permission error, restore failure). Those helpers log a warning and + return ``[]``; without this re-scan ``_enforce_scope_guard`` would + treat the empty list as "nothing was out of scope" and let the + module succeed with the contract still violated on disk. + + Iter-20 M-1 (gitignored fail-open): the standard ``git status`` scan + does NOT report gitignored files. A sync that writes outside the + contract into a gitignored path (e.g. ``build/junk.txt`` under a + repo-wide ``.gitignore: build/``) would be invisible. A SECOND scan + via ``git ls-files --others --ignored --exclude-standard`` enumerates + every individual ignored file; results not in ``allowed_files`` and + not in ``self._baseline_ignored_paths`` (pre-existing ignored files + the user owned BEFORE sync ran) are added to the ``remaining`` set. + + Returns: + Sorted list of POSIX repo-relative paths still out of scope, OR + the sentinel ``[""]`` when EITHER the + ``git status`` scan OR the ``git ls-files --ignored`` scan + cannot be executed (timeout / missing git / non-zero return). + The sentinel is consistent with the warning-log + empty-list + style used elsewhere in the scope guard, but still forces + ``_enforce_scope_guard`` to hard-fail rather than treat the + unobservable working tree as clean. + """ + try: + result = subprocess.run( + ["git", "-C", str(repo_root), "status", + "--porcelain", "--untracked-files=all"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if result.returncode != 0: + return [""] + + remaining: Set[str] = set() + for line in result.stdout.splitlines(): + if len(line) < 4: + continue + payload = line[3:].strip() + if not payload: + continue + # Renames: ``R old -> new``. Both sides count. 
+ if " -> " in payload: + old_raw, new_raw = payload.split(" -> ", 1) + entry_paths = [old_raw.strip().strip('"'), + new_raw.strip().strip('"')] + else: + entry_paths = [payload.strip('"')] + for rel in entry_paths: + rel = _normalize_repo_path(rel) + if not rel: + continue + absolute = (repo_root / rel).resolve() + if absolute in allowed_files: + continue + remaining.add(rel) + + # Iter-20 M-1: scan for gitignored files that the standard + # ``git status`` pass above cannot see. ``git ls-files + # --others --ignored --exclude-standard`` lists every individual + # ignored file (no directory rollup, no status prefix). + try: + ignored_result = subprocess.run( + ["git", "-C", str(repo_root), "ls-files", + "--others", "--ignored", "--exclude-standard"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if ignored_result.returncode != 0: + return [""] + + # Iter-24 M-1: ``_baseline_ignored_paths`` is now a Dict[path, SHA]. + # ``getattr`` fallback keeps the runtime robust against subclasses + # that bypass __init__ (none in-tree today, but the iter-20 fallback + # used ``set()`` for the same reason); we still expect a mapping. + baseline_ignored = getattr(self, "_baseline_ignored_paths", {}) + for line in ignored_result.stdout.splitlines(): + rel = _normalize_repo_path(line.strip().strip('"')) + if not rel: + continue + # Iter-6 B1 / iter-24 M-1: pre-existing ignored files snapshotted + # at runner init are NOT the sync run's responsibility — but ONLY + # if their content is byte-identical to the init snapshot. A + # clobbered ignored baseline path must surface as out-of-scope. + if rel in baseline_ignored: + baseline_hash = baseline_ignored[rel] + current_hash = _hash_file(repo_root, rel) + # ``current_hash is None`` means the file disappeared from + # disk between the init snapshot and now — but the ignored + # scan still listed it, which is contradictory. 
Fall through + # to surface it; the diagnostic is the safer direction. + if (current_hash is not None + and (baseline_hash is None + or current_hash == baseline_hash)): + # Unchanged baseline (or unreadable at init → preserve + # by name, same conservative carve-out as the changed + # baseline branch in ``_enforce_scope_guard``). + continue + absolute = (repo_root / rel).resolve() + # Companion-allowlisted files (e.g. ``.pdd/meta/*.json`` when + # the user has ``.pdd/`` in ``.gitignore``) are in + # ``allowed_files`` via the rglob pass in ``_enforce_scope_guard``. + if absolute in allowed_files: + continue + remaining.add(rel) + + return sorted(remaining) + + def _enforce_scope_guard( + self, basename: str, module_cwd: Path + ) -> Optional[str]: + """ + Issue #1013 split-contract enforcement after each per-module sync. + + Returns: + ``None`` when the module is in scope (or enforcement is disabled); + a multi-line diagnostic string when out-of-scope artifacts were + detected and reverted/removed. + + This is a no-op when ``self.scope_guard_enabled`` is False or + ``self.allowed_write_paths is None`` (no parseable contract). + """ + if not self.scope_guard_enabled: + return None + if self.allowed_write_paths is None: + return None + + repo_root = self._resolve_repo_root(Path(module_cwd)) + lock = self._scope_guard_lock(repo_root) + with lock: + # Resolve contract paths to absolute paths under the repo root. + allowed_files: Set[Path] = set() + for rel in self.allowed_write_paths: + if not rel: + continue + allowed_files.add((repo_root / rel).resolve()) + + # Auto-allow companion artifacts (e.g. ``.pdd/meta/*.json``) that + # currently exist or are about to be created under the repo + # root. We add them to the allowed-files set so the helpers in + # ``agentic_common`` / ``agentic_common_worktree`` skip them. + # ``self.companion_allowlist`` already includes DEFAULT_* + # (unioned in __init__ per F4); no fallback needed here. 
+ # F1 (Issue #1013 iter-3): only files UNDER ``module_cwd`` count + # as companion artifacts — never auto-allow a sibling module's + # ``.pdd/meta/*.json`` just because it lives in the same repo. + # + # Iter-14 M-1: companion patterns are matched MODULE-RELATIVE, + # not repo-relative. The pattern ``.pdd/meta/*.json`` describes + # fingerprint metadata at the top of each module's working + # directory; in a multi-module repo where ``module_cwd`` is a + # subdirectory (e.g. ``mod_a/``), the file lives at + # ``mod_a/.pdd/meta/x.json`` relative to the repo root but at + # ``.pdd/meta/x.json`` relative to the module — and the latter + # is what the segment-aware anchored matcher must see. (The old + # ``PurePosixPath.match`` suffix-matching obscured this by + # accidentally auto-allowing the repo-relative form, which also + # auto-allowed any ``subdir/.pdd/meta/foo.json`` — the bug.) + allowlist = tuple(self.companion_allowlist) + cwd_path = Path(module_cwd).resolve() + for path in cwd_path.rglob("*"): + if not path.is_file(): + continue + try: + rel_posix = path.resolve().relative_to(cwd_path).as_posix() + except ValueError: + continue + if self._matches_companion_allowlist(rel_posix, allowlist): + allowed_files.add(path.resolve()) + + # Iter-36 B-1/B-2: PDD-internal infrastructure paths + # (``.pdd/agentic-logs/*``, ``.pdd/agentic_sync_state.json``, + # etc.) are written by the tool itself during a guarded run + # (audit logs from ``run_agentic_task``; runner state file + # from ``_record_result`` after each module). They are + # NEVER part of a contract. This pass is SEPARATE from the + # user-facing companion pass above because internal patterns + # are REPO-ROOT-anchored (the writes happen at the top of + # the project regardless of which module is being synced) — + # in the multi-module case ``module_cwd`` is a subdirectory + # and the audit log under ``/.pdd/agentic-logs/`` + # would NOT match a module-rooted match pass. 
+ for path in repo_root.rglob("*"): + if not path.is_file(): + continue + try: + repo_rel_posix = ( + path.resolve().relative_to(repo_root).as_posix() + ) + except ValueError: + continue + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored( + repo_rel_posix, pattern + ): + allowed_files.add(path.resolve()) + break + + # Iter-4 F1: rglob only sees files that still exist on disk. Sync + # legitimately DELETES companion artifacts (e.g. ``.pdd/meta/foo_python.json`` + # when a module is renamed/removed); those deletions appear in + # ``git status`` as tracked ``D ``. Without this pass the revert + # helper would resurrect the deleted companion and hard-fail. + # + # Iter-14 M-1: ``_git_changed_paths`` returns repo-relative paths; + # scope to ``cwd_path`` FIRST, then match the module-relative form + # against the companion pattern (same semantics as the rglob loop + # above). + # + # Iter-38 M-1: ``_git_changed_paths`` now returns ``None`` on + # scan failure (was empty set). Enforcement-time scan failures + # are already handled by the scan-failure sentinel returned by + # :meth:`_remaining_out_of_scope_paths`; here we just treat + # ``None`` as the empty set so the iteration is a no-op rather + # than crashing. + for rel_posix in (_git_changed_paths(repo_root) or set()): + absolute = (repo_root / rel_posix).resolve() + # Iter-36 B-1/B-2: tracked deletion of a PDD-internal + # artifact (e.g. ``.pdd/agentic_sync_state.json`` between + # runs) must not be resurrected by the revert helper. + # Match against the REPO-relative form before the + # module-cwd scoping below. + matched_internal = False + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored(rel_posix, pattern): + allowed_files.add(absolute) + matched_internal = True + break + if matched_internal: + continue + try: + module_rel_posix = absolute.relative_to(cwd_path).as_posix() + except ValueError: + # Outside the module's cwd — scoped out by F1 iter-3.
+ continue + if not self._matches_companion_allowlist(module_rel_posix, allowlist): + continue + allowed_files.add(absolute) + + # Iter-6 B1 (data-loss bug): pre-existing untracked files + # captured at runner __init__ are NEVER out-of-scope. Without + # this pass, a user's ``scratch.txt`` or unrelated WIP under + # the repo root would be removed by the revert helper. + # + # Iter-24 M-1 (baseline-clobber bug): preservation is now + # CONTENT-AWARE. Re-hash each baseline path against the + # init-time SHA. Only byte-identical content is auto-allowed; + # divergent SHAs fall through to the contract check so a + # sync-side clobber is surfaced. Note: the init hash uses + # ``self.project_root`` and the enforcement hash uses + # ``repo_root`` — in the non-durable async case those resolve + # to the same path; the durable runner clears the baseline + # entirely (iter-22) so this loop is a no-op there. + # + # Iter-34 M-3 (baseline-deletion blind spot, codex iter-33): + # ``current_hash is None`` means the baseline file is GONE + # from disk. For TRACKED baseline paths, ``git status`` will + # surface this as ``D `` and ``_remaining_out_of_scope_paths`` + # picks it up; but for UNTRACKED baseline paths (the user's + # local WIP captured at init) git has no record — the + # deletion is invisible to every subsequent scan and the + # module would succeed with the WIP silently lost. Collect + # the deletions here and union them into the diagnostic's + # ``remaining`` set below. + # + # Iter-40 M-1 (unreadable vs missing): the previous code + # collapsed "file missing" and "file unreadable" (permission + # flip, locked file) into the same ``current_hash is None`` + # signal, then treated both as deletions. A pre-existing + # baseline file that became UNREADABLE mid-sync would be + # falsely flagged as deleted, the diagnostic would lie about + # the file being removed, and downstream revert helpers + # would attempt to remove a still-present path. 
The + # :func:`_classify_baseline_path` helper distinguishes the + # two; unreadable falls through to the legacy iter-6 B1 + # preserve-by-name carve-out (same as the unreadable-at-init + # branch), while genuinely-missing flows the iter-34 path. + baseline_deleted: Set[str] = set() + for rel_posix, baseline_hash in self._baseline_changed_paths.items(): + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: + # Iter-34 M-3: baseline file is GONE. Surface it as + # unrecovered regardless of whether it was tracked + # or untracked at init — we can't distinguish the + # two from the baseline snapshot, and even the + # tracked-deletion case warrants a hard-fail (the + # user didn't expect their pre-existing file to be + # removed by sync). + baseline_deleted.add(rel_posix) + continue + if status.sha is None: + # Iter-40 M-1: file exists but unreadable now + # (permission flip, locked file, transient OSError). + # Preserve by name — same conservative carve-out as + # the unreadable-at-init branch below — so a + # permission-flaky baseline path is not misreported + # as deleted. + allowed_files.add((repo_root / rel_posix).resolve()) + continue + if baseline_hash is None: + # Couldn't hash at init (the file was unreadable + # then). Be conservative and preserve by name, the + # legacy iter-6 B1 behaviour — avoids false-positives + # on permission-flaky paths that pre-date the run. + allowed_files.add((repo_root / rel_posix).resolve()) + continue + if status.sha == baseline_hash: + # Unchanged user WIP — preserve. + allowed_files.add((repo_root / rel_posix).resolve()) + # else: sync (or some other writer) clobbered the file. + # Do NOT add to allowed_files — let the contract check + # flag it as out-of-scope. + + # Iter-34 M-3: symmetric pass for ignored baseline paths. + # ``_remaining_out_of_scope_paths`` only sees files that + # ``git ls-files --ignored`` currently lists, so a deleted + # gitignored baseline file (e.g. 
user-side ``cache.bin`` + # erased by sync) leaves no trail in either scan. Iterate + # the ignored baseline directly to catch the deletion. + # + # Iter-40 M-1: same unreadable-vs-missing discrimination — + # an unreadable ignored baseline must NOT be flagged as + # deleted. Preserve by name so the diagnostic does not + # falsely claim the file was removed. + for rel_posix, baseline_hash in self._baseline_ignored_paths.items(): + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: + baseline_deleted.add(rel_posix) + elif status.sha is None: + # File exists but unreadable — preserve by name. + allowed_files.add((repo_root / rel_posix).resolve()) + # Present-but-changed ignored baselines are already + # surfaced by ``_remaining_out_of_scope_paths``'s + # ignored loop (the hash comparison there falls through + # to ``remaining.add`` on divergence). Don't duplicate + # that work here. + + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) + untracked_reverted = revert_out_of_scope_changes_with_dirs( + repo_root, allowed_dirs=set(), allowed_files=allowed_files + ) + + # Combine while preserving order and uniqueness for the diagnostic. + seen: Set[str] = set() + offending: List[str] = [] + for path in list(tracked_reverted) + list(untracked_reverted): + try: + rel = Path(path).resolve().relative_to(repo_root).as_posix() + except ValueError: + rel = str(path) + if rel in seen: + continue + # Filter out anything that ended up in the allowed set — + # e.g. companion artifacts that the helpers do not revert + # but that we still surface as no-ops. + if (repo_root / rel).resolve() in allowed_files: + continue + seen.add(rel) + offending.append(rel) + + # Iter-9 M-1 (fail-closed boundary): re-scan the worktree after + # the revert helpers have run. Either helper can fail silently + # (git timeout, permission error, restore failure) and return + # ``[]``. 
Without this re-scan we would conclude "nothing was + # out of scope" and let the module succeed with the contract + # still violated on disk. + remaining_raw = self._remaining_out_of_scope_paths( + repo_root, allowed_files + ) + # Filter out paths already surfaced as ``offending`` so the + # re-scan does not double-list. In practice when helpers succeed + # the path is gone from ``git status``; when helpers fail with + # ``reverted.clear()`` ``offending`` is empty. Defensive filter. + # + # Iter-34 M-3: union with ``baseline_deleted`` so a sync-side + # deletion of a pre-existing untracked/ignored baseline path + # hard-fails the module. For tracked baselines ``git status`` + # already surfaces the deletion as ``D ``, so the union just + # dedups via set semantics. + offending_set = set(offending) + remaining = sorted( + (set(remaining_raw) | baseline_deleted) - offending_set + ) + + if not offending and not remaining: + return None + + source = self.contract_source or "" + allowed_lines = "\n".join( + f" - {p}" for p in sorted(self.allowed_write_paths) + ) or " - " + companion_lines = "\n".join( + f" - {p}" for p in allowlist + ) or " - " + + # Header line shape depends on whether anything was actually + # reverted. When ``offending`` is empty but ``remaining`` is + # non-empty, emitting "reverted 0 out-of-scope file(s)" plus an + # empty bullet list reads incorrectly; use a distinct header. + if offending: + offending_lines = "\n".join(f" - {p}" for p in offending) + header = ( + f"Scope guard reverted {len(offending)} out-of-scope " + f"file(s) for module '{basename}' " + f"(contract source: {source}):\n" + f"{offending_lines}" + ) + else: + header = ( + f"Scope guard detected out-of-scope artifacts for " + f"module '{basename}' (contract source: {source}) " + f"but the revert helpers reported no successful reverts." 
+ ) + + parts = [header] + if remaining: + unrecovered_lines = "\n".join(f" - {p}" for p in remaining) + parts.append( + "Unrecovered (revert failed, manual cleanup required):\n" + f"{unrecovered_lines}" + ) + parts.append(f"Allowed write set:\n{allowed_lines}") + parts.append(f"Companion allowlist:\n{companion_lines}") + diagnostic = "\n".join(parts) + # F8 (Issue #1013): print the diagnostic to stderr immediately + # after reverting. ``maintenance.py`` separately echoes the + # assembled module-failure error at the end of the run — two + # distinct events, so keep both. + print(diagnostic, file=sys.stderr) + return diagnostic def _build_conformance_hard_failure( self, diff --git a/pdd/commands/maintenance.py b/pdd/commands/maintenance.py index 6b35ae717..714a1acdd 100644 --- a/pdd/commands/maintenance.py +++ b/pdd/commands/maintenance.py @@ -121,6 +121,17 @@ default=None, help="Maximum parallel module worktrees in durable mode. Default: current runner concurrency.", ) +@click.option( + "--no-scope-guard", + "no_scope_guard", + is_flag=True, + default=False, + help="Issue-sync only. Disable the split-contract scope guard for this run. " + "By default, when the linked GitHub issue declares an allowed write set " + "(split contract), `pdd sync` enforces it and rejects out-of-scope generated " + "artifacts. Pass this flag only when intentionally overriding contract " + "enforcement (e.g. recovering from a stale contract).", +) @click.pass_context @track_cost def sync( @@ -143,6 +154,7 @@ def sync( durable_branch: Optional[str], no_resume: bool, durable_max_parallel: Optional[int], + no_scope_guard: bool, ) -> Optional[Tuple[str, float, str]]: """ Synchronize prompts with code and tests. 
@@ -179,6 +191,7 @@ def sync( max_attempts=max_attempts, one_session=effective_one_session, timeout_adder=timeout_adder, + scope_guard=not no_scope_guard, ) # Detect GitHub issue URL -> dispatch to agentic sync @@ -208,6 +221,7 @@ def sync( durable_branch=durable_branch, no_resume=no_resume, durable_max_parallel=durable_max_parallel, + scope_guard=not no_scope_guard, ) if durable or durable_branch or no_resume or durable_max_parallel is not None: @@ -255,6 +269,7 @@ def _run_agentic_sync_dispatch( durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, + scope_guard: bool = True, ) -> Optional[Tuple[str, float, str]]: """Dispatch to agentic sync runner for GitHub issue URLs.""" ctx.ensure_object(dict) @@ -281,6 +296,7 @@ def _run_agentic_sync_dispatch( durable_branch=durable_branch, no_resume=no_resume, durable_max_parallel=durable_max_parallel, + scope_guard=scope_guard, ) if not quiet: @@ -314,6 +330,7 @@ def _run_global_sync_dispatch( max_attempts: Optional[int], one_session: bool = False, timeout_adder: float = 0.0, + scope_guard: bool = True, ) -> Optional[Tuple[str, float, str]]: """Dispatch to global sync runner for no-argument `pdd sync`.""" ctx.ensure_object(dict) @@ -337,6 +354,7 @@ def _run_global_sync_dispatch( one_session=one_session, local=ctx.obj.get("local", False), timeout_adder=timeout_adder, + scope_guard=scope_guard, ) if not quiet: diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 1ec1c5f69..efbdb662e 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -20,6 +20,11 @@ from pathlib import Path from typing import Dict, List, Optional, Set, Tuple +from .agentic_common import ( + PDD_INTERNAL_PATH_ALLOWLIST, + _is_valid_companion_pattern, + _matches_companion_pattern_anchored, +) from .agentic_sync_runner import AsyncSyncRunner, MAX_WORKERS CHECKPOINT_TRAILER = "PDD-Sync-Checkpoint-V1" @@ -47,6 +52,10 @@ def __init__( issue_url: Optional[str] = None, 
module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0, + allowed_write_set: Optional[List[str]] = None, + companion_allowlist: Optional[List[str]] = None, + scope_guard_enabled: bool = True, + contract_source: Optional[str] = None, ) -> None: self.issue_number = issue_number self.git_root = project_root.resolve() @@ -62,6 +71,14 @@ def __init__( self._run_id = uuid.uuid4().hex[:8] self._prepared = False + # Issue #1013 iter-18 M-1 (durable baseline-paths bug): forward the + # resolved ``git_root`` into ``AsyncSyncRunner.__init__`` via the new + # ``project_root`` kwarg so the baseline-changed-paths snapshot is + # taken against the durable worktree's repo root rather than against + # the caller's current working directory. The previous post-super() + # assignment to ``self.project_root`` ran *after* the snapshot, which + # captured dirty files from the caller's main checkout and let them + # bypass the scope guard inside the durable worktree. super().__init__( basenames=basenames, dep_graph=dep_graph, @@ -72,8 +89,37 @@ def __init__( issue_url=issue_url, module_cwds={}, initial_cost=initial_cost, + allowed_write_set=allowed_write_set, + companion_allowlist=companion_allowlist, + scope_guard_enabled=scope_guard_enabled, + contract_source=contract_source, + project_root=self.git_root, ) - self.project_root = self.git_root + + # Issue #1013 iter-22 M-1 (durable baseline-leakage bug): per-module + # durable worktrees are freshly created via ``git worktree add`` and + # have no pre-existing user WIP by construction. Whatever surfaces in + # a worktree at scope-guard time was put there BY this sync run, so + # the iter-6 B1 "preserve user's pre-existing untracked files" + # carve-out — which exists for the in-place async case where sync + # runs inside the user's main checkout — has no analog in durable + # mode. 
iter-18 pinned the baseline snapshot to ``self.git_root`` + # (the main checkout), but in production ``git_root`` IS the user's + # main checkout where dirty WIP lives; the actual per-module sync + # then runs inside ``.pdd/worktrees/sync-issue--/``, a + # DIFFERENT directory. Inheriting the main checkout's baseline LEAKS + # the caller's dirty paths into the worktree's allow set (see + # ``_enforce_scope_guard``: each baseline ``rel_posix`` is resolved + # against the scope-guard-time ``repo_root``, which is the per-module + # worktree root), bypassing the split contract. Clear the baseline + # so each fresh module worktree starts clean. + # + # Iter-24 M-1: baseline snapshots are now ``Dict[str, Optional[str]]`` + # (path → SHA-1) for content-aware preservation; clear to empty dicts + # so iteration in ``_enforce_scope_guard`` and + # ``_remaining_out_of_scope_paths`` remains a no-op. + self._baseline_changed_paths = {} + self._baseline_ignored_paths = {} if self.total_budget is not None: self.max_workers = 1 elif durable_max_parallel is not None: @@ -91,6 +137,36 @@ def _delete_state(self) -> None: """Durable mode leaves local runner state untouched.""" def run(self) -> Tuple[bool, str, float]: + # Iter-40 M-2 (durable init ordering): iter-38 added a + # fail-closed abort to :meth:`AsyncSyncRunner.run` when the + # init-time baseline scan returned ``None``, but the durable + # subclass calls :meth:`_prepare_durable_branch` BEFORE + # delegating to ``super().run()`` — so a baseline-acquisition + # failure on the main checkout would leave durable side effects + # (worktree creation, branch checkout, remote pushes) in place + # before the fail-closed check ran. Hoist the check above + # ``_prepare_durable_branch`` so no durable infrastructure is + # touched when the baseline scan failed. 
+ # + # The flag reflects the MAIN CHECKOUT's git scan (see iter-22: + # the durable runner intentionally inherits the main-checkout + # baseline via ``super().__init__(project_root=self.git_root)`` + # and only clears the *paths* afterward — the flag is + # preserved). That is precisely the scan we want to abort on: + # the main checkout is where the orchestrator scope guard + # operates. + if getattr(self, "_baseline_acquisition_failed", False): + return ( + False, + ( + "Scope guard fail-closed: could not snapshot working-tree " + "baseline at runner init (git scan failed). Aborting " + "before any write-capable work to prevent false-positive " + "reverts of pre-existing user files." + ), + self.initial_cost, + ) + ok, message = self._prepare_durable_branch() if not ok: return False, message, self.initial_cost @@ -353,14 +429,39 @@ def _stage_module_changes( self._force_add_module_metadata(basename, module_worktree) + # Iter-6 B3 (rename detection bug): ``--name-only`` for a staged + # rename ``git mv old new`` emits ONLY ``new``. A contract that + # allows ``new`` but not ``old`` would pass validation while the + # rename silently deletes the out-of-scope ``old``. Use + # ``--name-status -M`` so rename lines surface as + # ``R\told\tnew`` and BOTH paths count for scope checking. 
names = self._git( - ["diff", "--cached", "--name-only", "--diff-filter=ACMRTD"], + ["diff", "--cached", "--name-status", "-M", "--diff-filter=ACMRTD"], cwd=module_worktree, ) if names.returncode != 0: return False, f"Failed to inspect staged changes: {_combined_output(names)}", False - changed_paths = [line.strip() for line in names.stdout.splitlines() if line.strip()] + changed_paths: List[str] = [] + for raw in names.stdout.splitlines(): + line = raw.rstrip("\n") + if not line.strip(): + continue + parts = line.split("\t") + # Whether the entry is a rename/copy (R/C with similarity score) + # or a single-path status (A/M/D/T), every column past the + # status code is a path that should be scope-checked. + changed_paths.extend(p.strip() for p in parts[1:] if p.strip()) + out_of_scope = self._out_of_scope_staged_paths( + changed_paths, basename, module_worktree + ) + if out_of_scope: + return ( + False, + "Durable sync refuses to checkpoint path(s) outside the issue " + "split-contract allowed write set: " + ", ".join(out_of_scope), + False, + ) unsafe = self._unsafe_staged_paths(basename, changed_paths) if unsafe: return ( @@ -373,6 +474,120 @@ def _stage_module_changes( empty = not changed_paths return True, "", empty + def _out_of_scope_staged_paths( + self, + paths: List[str], + basename: str, + module_worktree: Path, + ) -> List[str]: + """ + Return staged paths that violate the issue split-contract. + + Issue #1013 (F5, F13): when no contract is parsed, + ``self.allowed_write_paths is None`` and durable sync runs in + permissive mode — never reject. When a contract is present, accept + both contract paths AND companion-allowlist matches (e.g. + ``.pdd/meta/*.json``) so fingerprint metadata can still be + checkpointed alongside the primary write set. + + Issue #1013 iter-16 M-1: in a multi-module repo where + ``module_cwd`` is a SUBDIRECTORY of the worktree (e.g. + ``worktree/pkg``), staged paths surface relative to the worktree + git root (e.g. 
``pkg/.pdd/meta/foo.json``). Companion-allowlist + patterns describe MODULE-RELATIVE artifacts (e.g. + ``.pdd/meta/*.json``), so the staged path must be stripped of the + module cwd prefix before the anchored matcher runs. Mirrors the + async runner's iter-14 M-1 part-2 fix. Paths that sit OUTSIDE the + module's cwd (sibling-module artifacts) never auto-allow — see + F1 iter-3. + """ + # Permissive mode: scope_guard disabled or no contract parsed. + if not self.scope_guard_enabled or self.allowed_write_paths is None: + return [] + allowlist = tuple(self.companion_allowlist) + # Resolve the module's cwd relative to its worktree once. For a + # single-module sync where ``module_cwd == module_worktree``, this + # is ``"."`` (or ``""`` if the helper ever returned the worktree + # path itself) and we treat it as "no prefix to strip". + module_cwd = self._module_cwd_for_worktree(basename, module_worktree) + try: + module_cwd_rel = module_cwd.resolve().relative_to( + module_worktree.resolve() + ).as_posix() + except ValueError: + module_cwd_rel = "" + if module_cwd_rel in ("", "."): + module_cwd_rel = "" + + offending: Set[str] = set() + for raw in paths: + normalized = raw.replace(os.sep, "/").strip() + if normalized.startswith("./"): + normalized = normalized[2:] + # Allowed-write-set match is REPO-RELATIVE by contract (the + # split contract is declared with repo-rooted paths) and stays + # unchanged. + if normalized in self.allowed_write_paths: + continue + # Issue #1013 iter-42 M-1 (durable PDD_INTERNAL parity): PDD's + # own infrastructure writes (audit logs, runner state, etc.) + # are NEVER part of a contract — they're internal artifacts + # the tool produces as side effects of running. Mirror the + # async per-module guard (iter-36 B-1/B-2 at + # ``agentic_sync_runner.py`` line ~2376/~2408) so the durable + # checkpoint-staging validation honors the same allowlist; + # otherwise contracted durable runs hard-fail on PDD's own + # audit logs / state file. 
The internal allowlist patterns + # are REPO-ROOT-anchored (the writes happen at the top of + # the project regardless of module_cwd), so this check runs + # BEFORE the module_cwd prefix stripping below and matches + # against ``normalized`` (the raw repo-relative form). + internal_matched = False + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored(normalized, pattern): + internal_matched = True + break + if internal_matched: + continue + # F3 (Issue #1013): companion glob matching uses anchored, + # segment-aware semantics so ``.pdd/meta/*.json`` does NOT + # match nested paths like ``.pdd/meta/nested/foo.json`` or + # ``subdir/.pdd/meta/foo.json``. Iter-14 M-2: replaced + # ``PurePosixPath.match`` (suffix-based, falsely matched + # ``subdir/.pdd/meta/foo.json``) with the centralized + # anchored matcher in ``agentic_common``. + # + # Iter-16 M-1: for multi-module sync, strip the module_cwd + # prefix before matching so the module-relative pattern works + # against the repo-relative staged path. Paths outside the + # module's cwd are sibling artifacts and never auto-allow. + if module_cwd_rel: + prefix = module_cwd_rel + "/" + if not normalized.startswith(prefix): + offending.add(normalized) + continue + candidate_rel = normalized[len(prefix):] + else: + candidate_rel = normalized + + matched = False + for pattern in allowlist: + if not pattern: + continue + # Issue #1013 iter-10 M-1 (defense-in-depth): a + # wildcard-only / doublestar / absolute / traversal + # pattern that slipped past the parser must NOT + # auto-allow repo-wide writes. 
+ if not _is_valid_companion_pattern(pattern): + continue + if _matches_companion_pattern_anchored(candidate_rel, pattern): + matched = True + break + if matched: + continue + offending.add(normalized) + return sorted(offending) + def _force_add_module_metadata(self, basename: str, module_worktree: Path) -> None: safe = basename.replace("/", "_") meta_dirs = [ @@ -399,6 +614,26 @@ def _unsafe_staged_paths(self, basename: str, paths: List[str]) -> List[str]: for path in paths: normalized = path.replace(os.sep, "/") lower = normalized.lower() + # Issue #1013 iter-42 M-1 (durable PDD_INTERNAL parity): PDD's + # own infrastructure writes (e.g. ``.pdd/agentic-logs/*``, + # ``.pdd/agentic_sync_state.json``) match neither a contract's + # allowed_write_set nor the user-facing companion allowlist; + # they're tool internals. The unsafe-path rules below would + # otherwise classify ``.pdd/agentic-logs/foo.jsonl`` as + # unsafe via the ``_pdd_path_index`` branch (because it sits + # under ``.pdd/`` but is NOT a recognized meta artifact). The + # async per-module guard already exempts these patterns; + # mirror it here so a contracted durable run does not + # hard-fail at checkpoint on its own audit logs / state file. + # Patterns are REPO-ROOT-anchored — match the raw normalized + # path without stripping any module prefix. 
+ internal_matched = False + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored(normalized, pattern): + internal_matched = True + break + if internal_matched: + continue pdd_index = _pdd_path_index(normalized) if pdd_index is not None: matching_meta_prefix = next( diff --git a/pdd/prompts/agentic_common_python.prompt b/pdd/prompts/agentic_common_python.prompt index 263db3635..adfbd3439 100644 --- a/pdd/prompts/agentic_common_python.prompt +++ b/pdd/prompts/agentic_common_python.prompt @@ -26,7 +26,9 @@ {"name": "github_clear_state", "signature": "(repo_owner: str, repo_name: str, issue_number: int, workflow_type: str, cwd: Path) -> bool", "returns": "bool"}, {"name": "load_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> Tuple[Optional[Dict], Optional[int]]", "returns": "Tuple[Optional[Dict], Optional[int]]"}, {"name": "save_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state: Dict, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True, github_comment_id: Optional[int] = None) -> Optional[int]", "returns": "Optional[int]"}, - {"name": "clear_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> None", "returns": "None"} + {"name": "clear_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> None", "returns": "None"}, + {"name": "parse_issue_contract", "signature": "(issue_body: Optional[str], issue_comments: Optional[List[str]] = None) -> Optional[IssueContract]", "returns": "Optional[IssueContract]"}, + {"name": "_revert_out_of_scope_changes", "signature": "(cwd: Path, allowed_paths: set[Path]) -> List[Path]", "returns": "List[Path]"} 
] } } @@ -105,6 +107,15 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O 18. **Post Final Comment**: `post_final_comment(repo_owner, repo_name, issue_number, reason, total_cost, steps_completed, total_steps, cwd) -> bool`: Post a generated workflow summary comment to the GitHub issue when the workflow stops early. The function builds the comment body from the stop reason, cumulative cost, and completed/total step counts; callers do not pass a preformatted body. 19. **OpenCode Model Resolution**: Resolve the OpenCode model in this order: (1) `OPENCODE_MODEL` env var, kept verbatim including nested slashes like `openrouter/openai/gpt-5.3-codex`; (2) derive a candidate from `llm_model.csv` using PDD's existing model-strength semantics, then translate LiteLLM-oriented IDs via `_translate_to_opencode_model()`. The CSV fallback MUST be auth-aware: build the configured OpenCode provider set from parsed provider credentials in `~/.local/share/opencode/auth.json`, parsed usable OpenCode config provider/model entries (`~/.config/opencode/opencode.json`, nearest project `opencode.json`, `OPENCODE_CONFIG`, `OPENCODE_CONFIG_CONTENT`), and every provider credential env var represented in `llm_model.csv`; filter candidate rows to providers that are configured before selecting a model. OpenCode config sources contribute a configured provider only when they declare a provider/model path with resolvable auth or explicit local/no-key provider semantics; bare config existence is diagnostic-only. OpenCode agentic runs use `OPENCODE_MODEL` or the auth-aware CSV fallback, not generic direct-prompt model defaults. Required translations include `github_copilot/X -> github-copilot/X`, `gemini/X -> google/X`, bare Anthropic rows like `claude-sonnet-... -> anthropic/claude-sonnet-...`, and bare OpenAI rows like `gpt-5 -> openai/gpt-5`; IDs already in OpenCode `provider/model` form pass through unchanged. 
If no configured provider can serve the selected model, fail fast with an actionable error telling the user to set `OPENCODE_MODEL=provider/model`, configure the matching provider, or run `opencode models` after authentication. Do not rely on OpenCode default model resolution. 20. **OpenCode Optional Knobs**: Honor `OPENCODE_AGENT` by passing `--agent ` and `OPENCODE_VARIANT` by passing `--variant ` when set. Omit both flags when unset. `PDD_OPENCODE_MODE` is out of scope for this module version; use `opencode run` only. +21. **Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. The parser scans the issue body first, then each comment (newest last is fine), looking for one of THREE declaration formats: (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns); (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block with info string ```text``` (one repo-relative path per line; blank lines and `#`-prefixed comments ignored) or ```json``` (body is a JSON array of repo-relative path strings, e.g. 
`["pdd/foo.py", "tests/test_foo.py"]`; non-array payloads such as objects/numbers/strings and malformed JSON yield `None` so the caller falls back to permissive mode — Issue #1013 iter-12 B-1); or (c) **Bullet-list under an inline label (Issue #1013 iter-18 B-1)**: the same heading regex as (b), followed somewhere later by an inline bold label line `(?im)^\s*\*\*\s*allowed\s+write\s+set\s*:\s*\*\*\s*$`, followed by `-`/`*`/`+` bullets each carrying one repo-relative POSIX path (optional surrounding backticks are stripped). The label is the discriminator: bullets that appear BEFORE the label (e.g. under an earlier `## Files` section) MUST NOT be captured. The bullet list terminates at the first of (i) another `**Label:**` line such as `**Acceptance criteria:**`, (ii) a `---` horizontal rule, (iii) a `#`-prefixed heading, (iv) a non-blank, non-bullet line, or (v) end of body; blank lines do NOT terminate. Each bullet's surrounding backticks are stripped before validation. Branches MUST be tried in priority order (a) → (b) → (c); the first non-`None` match wins. Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or `"bullet-list"` matching the branch that produced the contract, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). 
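
As a rough illustration of branch (c) described above, the following is a minimal sketch, not the project's actual parser. The function name `parse_bullet_allowed_paths` and the `_ANY_LABEL_RE` / `_BULLET_RE` helpers are hypothetical; only the heading and label regexes are taken from the requirement text. It assumes the heading must appear before the label and that path validation happens elsewhere.

```python
import re
from typing import List, Optional

# Heading and label regexes copied from the requirement text above.
_HEADING_RE = re.compile(
    r"(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$"
)
_LABEL_RE = re.compile(r"(?im)^\s*\*\*\s*allowed\s+write\s+set\s*:\s*\*\*\s*$")
# Hypothetical helpers (assumptions, not from the source):
_ANY_LABEL_RE = re.compile(r"^\s*\*\*[^*]+:\s*\*\*\s*$")  # e.g. **Acceptance criteria:**
_BULLET_RE = re.compile(r"^\s*[-*+]\s+(.*\S)\s*$")


def parse_bullet_allowed_paths(body: str) -> Optional[List[str]]:
    """Return bullet paths listed under the inline label, or None if absent."""
    heading = _HEADING_RE.search(body)
    if heading is None:
        return None
    # The label is the discriminator: only bullets AFTER it are captured.
    label = _LABEL_RE.search(body, heading.end())
    if label is None:
        return None
    paths: List[str] = []
    for line in body[label.end():].splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # blank lines do not terminate the list
        if (
            _ANY_LABEL_RE.match(line)
            or stripped.startswith("#")
            or stripped == "---"
        ):
            break  # next label, heading, or horizontal rule ends the list
        bullet = _BULLET_RE.match(line)
        if bullet is None:
            break  # any other non-blank, non-bullet line ends the list
        # Strip optional surrounding backticks from the path.
        paths.append(bullet.group(1).strip().strip("`").strip())
    return paths or None
```

Returning `None` rather than raising keeps the sketch in line with the tolerant, permissive-fallback behavior the requirement asks for; a real implementation would also wrap the result in an `IssueContract` with `source="bullet-list"`.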
Per Issue #1013 iter-10 M-1 (tightened in iter-14 M-1/M-2 paired with iter-10 M-1), the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`), OR contains any segment that is exactly `**` (the doublestar — `fnmatch` does not implement recursive glob semantics, and the anchored segment-aware matcher requires equal segment counts, so a `**`-bearing pattern would be ambiguous and is rejected at parse time). Patterns with at least one segment containing a non-wildcard character and no `**` segment (`.pdd/meta/*.json`, `architecture.json`, `tests/test_*.py`) remain valid. +22. **Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. The default value MUST be `(".pdd/meta/*.json",)` — only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline. Also expose `_matches_companion_pattern_anchored(rel_posix: str, pattern: str) -> bool` (Issue #1013 iter-14 M-1/M-2) — anchored, segment-aware glob matching used by both runner-side companion-allowlist checks. 
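
The validity rules and the anchored, equal-segment-count matching described above can be sketched as follows. This is an illustrative approximation under stated assumptions, not the code in `agentic_common`: the names `is_valid_companion_pattern_sketch` and `matches_anchored_sketch` are hypothetical stand-ins for `_is_valid_companion_pattern` and `_matches_companion_pattern_anchored`, and having the matcher reject invalid patterns itself is a choice made for this sketch.

```python
from fnmatch import fnmatchcase


def is_valid_companion_pattern_sketch(pattern: str) -> bool:
    """Hypothetical validity check mirroring the rules listed above."""
    p = pattern.strip()
    # Reject empty, absolute, or Windows-separator patterns.
    if not p or p.startswith("/") or "\\" in p:
        return False
    segments = p.split("/")
    # Reject traversal segments and the doublestar segment.
    if any(seg in ("..", "**") for seg in segments):
        return False
    # Require at least one segment with a literal (non-wildcard) character.
    return any(any(ch not in "*?" for ch in seg) for seg in segments if seg)


def matches_anchored_sketch(rel_posix: str, pattern: str) -> bool:
    """Hypothetical anchored match: same segment count, segment-wise fnmatch."""
    if not is_valid_companion_pattern_sketch(pattern):
        return False  # an invalid pattern never auto-allows a path
    path_segs = rel_posix.split("/")
    pat_segs = pattern.strip().split("/")
    if len(path_segs) != len(pat_segs):
        return False
    # Compare from the START of the path, one segment at a time.
    return all(fnmatchcase(s, p) for s, p in zip(path_segs, pat_segs))
```

Because segments are compared positionally from the first one, a pattern such as `.pdd/meta/*.json` in this sketch cannot match a path that merely ends with the same suffix, and a nested path with an extra segment is rejected by the length check.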
Unlike `pathlib.PurePosixPath.match` (which matches from the right and falsely treats `subdir/.pdd/meta/foo.json` as matching `.pdd/meta/*.json`), this helper requires the path and pattern to align segment-by-segment from the START of the path with equal segment count, and matches each segment via `fnmatch.fnmatchcase` for `*`/`?` semantics. Returns False on invalid patterns; callers should validate with `_is_valid_companion_pattern` first so an invalid pattern can never auto-allow a path. +23. **Scope Guard Helper (Issue #1013)**: `_revert_out_of_scope_changes(cwd, allowed_paths) -> List[Path]` is the shared revert helper used by `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_verify`, `agentic_e2e_fix_orchestrator`, and the sync scope guard. Signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and return the list of resolved paths that were reverted. Behavior contract: + - Skip silently when `cwd` is not a git repo, when `git status` is unavailable, or when `allowed_paths` is NON-EMPTY and none of its entries fall under `cwd` (the "scope guard meant for a different module" optimization). An EMPTY `allowed_paths` is a legal reject-all contract (Issue #1013 degenerate-empty case) — proceed with revert. + - Detect tracked changes via `git status --porcelain -uno`. + - Parse each status line. For rename entries (`R old -> new`), surface BOTH source and destination as separate paths; treat the rename atomically when deciding to revert — if EITHER side is out of scope, revert BOTH sides (otherwise the working tree is left in a half-renamed state). + - Restore out-of-scope tracked files via `git restore --staged --worktree --source=HEAD -- ` (not `git checkout HEAD --`, which cannot remove rename destinations unknown to HEAD). Fall back to `git checkout HEAD --` only when the local `git` is too old to support `git restore` (pre-2.23). 
+ - Check the restore subprocess return code; on non-zero, log a WARNING and clear the reverted list so callers see real-vs-claimed revert state. + - Log INFO with the count and first ~10 reverted file names on success. % Function Signatures `get_agent_provider_preference() -> List[str]` @@ -142,6 +153,7 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O - `MIN_ATTEMPT_TIMEOUT_SECONDS: 60.0` - `MAX_ERROR_SNIPPET_LENGTH: 2000` - `MAX_ERROR_RESPONSE_NEWLINES: 3` (Issue #1232: newline-count gate for leading-`Error:` false-positive heuristic) +- `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...] = (".pdd/meta/*.json",)` (Issue #1013: glob patterns for sync companion artifacts that bypass the issue contract's primary write set) % Token Pricing `Pricing(input_per_million, output_per_million, cached_input_multiplier)` diff --git a/pdd/prompts/agentic_common_worktree_python.prompt b/pdd/prompts/agentic_common_worktree_python.prompt index b7afe1e44..239c808ef 100644 --- a/pdd/prompts/agentic_common_worktree_python.prompt +++ b/pdd/prompts/agentic_common_worktree_python.prompt @@ -43,7 +43,7 @@ All functions are public (no leading underscore) so orchestrators can import the 7. **`setup_worktree(cwd, issue_number, quiet, *, resume_existing=False, branch_prefix="fix", worktree_prefix="fix") -> Tuple[Optional[Path], Optional[str]]`**: Create an isolated git worktree at `.pdd/worktrees/{worktree_prefix}-issue-{issue_number}/` on branch `{branch_prefix}/issue-{issue_number}`. Clean up existing worktree/directory and branch before creating. If `resume_existing` is True and branch exists, reuse it (attach with `--force`). Otherwise delete the old branch first. When reusing an undeletable branch, reset to main ref after attaching. Print worktree path unless `quiet`. The `branch_prefix` and `worktree_prefix` kwargs let callers customize naming (e.g. `change` prefix for change workflows, `fix` for bug workflows). 
Return `(worktree_path, None)` on success, `(None, error_msg)` on failure. 8. **`get_modified_and_untracked(cwd: Path) -> List[str]`**: Return modified tracked files (`git diff --name-only HEAD`) plus untracked files (`git ls-files --others --exclude-standard`). 9. **`check_target_file_unchanged(cwd: Path, target_file: str, baseline_sha: Optional[str] = None) -> Tuple[bool, Optional[str]]`**: Detect concurrent edits. Run `git fetch origin` then `git rev-parse origin/main:{target_file}`. If `baseline_sha` is provided, compare current SHA against it — return `(True, current_sha)` if unchanged, `(False, current_sha)` if changed. If `baseline_sha` is None, just return `(True, current_sha)` to establish the baseline. Return `(True, None)` on git failures (fail-open to avoid blocking workflows). -10. **`revert_out_of_scope_changes_with_dirs(cwd: Path, allowed_dirs: set[str], allowed_files: set[Path]) -> List[Path]`**: Scope guard that detects both tracked changes AND new untracked files via `git status --porcelain -u`. For each changed/new file, check if its path starts with any prefix in `allowed_dirs` OR its resolved absolute path is in `allowed_files`. Revert tracked out-of-scope changes via `git checkout HEAD --`. Remove untracked out-of-scope files via `os.remove`. Return list of reverted/removed paths. Log actions via module logger. Handle timeout and OS errors gracefully. +10. **`revert_out_of_scope_changes_with_dirs(cwd: Path, allowed_dirs: set[str], allowed_files: set[Path]) -> List[Path]`**: Scope guard that detects both tracked changes AND new untracked files via `git status --porcelain --untracked-files=all` (a.k.a. `-uall`). The explicit `--untracked-files=all` is required so untracked content nested under a brand-new directory is expanded into individual `?? path/to/file` entries — bare `-u` is ambiguous across git versions/configs and may collapse the directory into a single `?? subdir/` entry that `os.remove` cannot delete. 
For each changed/new file, check if its path starts with any prefix in `allowed_dirs` OR its resolved absolute path is in `allowed_files`. Rename entries (`R old -> new`) are treated ATOMICALLY: surface both source and destination, scope-check each side, and if EITHER side is out of scope, revert BOTH via `git restore --staged --worktree --source=HEAD -- ` (so the rename destination unknown to HEAD is removed and the source restored — `git checkout HEAD --` alone cannot undo this). For non-rename tracked out-of-scope changes, revert via `git restore --staged --worktree --source=HEAD --` (or `git checkout HEAD --` on git < 2.23). Remove untracked out-of-scope files via `os.remove`; if (defensively) git ever reports an untracked directory (path ending in `/` or whose target resolves to a directory), use `shutil.rmtree` instead so contained files don't get left behind. Return list of reverted/removed paths. Log actions via module logger. Handle timeout and OS errors gracefully. 11. **`extract_block_marker(output: str, name: str) -> str`**: Parse a multi-line block delimited by `BEGIN_{name}` and `END_{name}` markers from agent output. Return the content between markers (stripped), or empty string if markers not found. Case-insensitive marker matching. % Dependencies diff --git a/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt b/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt index 39ac1dbfb..5a597aa61 100644 --- a/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt +++ b/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt @@ -13,17 +13,23 @@ You are debugging a failed `pdd sync` dry-run for module "{basename}". The dry-run failed when running `pdd sync {basename} --dry-run` from `{attempted_cwd}`. This usually means the prompt file was not found, or the working directory is wrong. -Determine the correct working directory and output the full command to run. +Determine the correct working directory from which `pdd sync {basename}` should be invoked. 
Look at where .pddrc files are located and which subdirectory contains the relevant prompt files. ## Output Format -Output the full shell command on a single line, prefixed with `SYNC_CMD:`. -The command MUST use `cd && pdd --force sync {basename} --dry-run --agentic --no-steer`. +Output the working directory ONLY as a single line, prefixed with `SYNC_CWD:`. +The value MUST be a path relative to the project root (use `.` for the project root itself) +or an absolute path that resolves under the project root. Do NOT emit a shell command. +Do NOT include any shell metacharacters (`;`, `&`, `|`, `<`, `>`, `` ` ``, `$`, `(`, `)`, newline). -SYNC_CMD: cd && pdd --force sync {basename} --dry-run --agentic --no-steer +The orchestrator will build and execute the `pdd sync {basename} --dry-run` argv itself +from this working directory. You only need to identify the directory. + +SYNC_CWD: Examples: -- SYNC_CMD: cd examples/hello && pdd --force sync {basename} --dry-run --agentic --no-steer -- SYNC_CMD: cd . && pdd --force sync {basename} --dry-run --agentic --no-steer +- SYNC_CWD: examples/hello +- SYNC_CWD: . +- SYNC_CWD: frontend -Only output ONE SYNC_CMD line. Do not output SYNC_CWD. +Only output ONE SYNC_CWD line. Do not output SYNC_CMD. diff --git a/pdd/prompts/agentic_sync_python.prompt b/pdd/prompts/agentic_sync_python.prompt index 0586064e9..1724d4c21 100644 --- a/pdd/prompts/agentic_sync_python.prompt +++ b/pdd/prompts/agentic_sync_python.prompt @@ -1,3 +1,19 @@ +Entry point for pdd sync orchestration over GitHub issues with split-contract scope-guard enforcement. 
+ + +{ + "type": "module", + "module": { + "functions": [ + {"name": "run_agentic_sync", "signature": "(issue_url: str, *, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, dry_run: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, timeout_adder: float = 0.0, use_github_state: bool = True, one_session: bool = False, reasoning_time: Optional[float] = None, durable: bool = False, durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]"}, + {"name": "run_global_sync", "signature": "(*, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, dry_run: bool = False, target_coverage: Optional[float] = None, one_session: bool = False, local: bool = False, timeout_adder: float = 0.0, scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]"}, + {"name": "_is_github_issue_url", "signature": "(s: str) -> bool", "returns": "bool"}, + {"name": "_parse_llm_response", "signature": "(response: str) -> Tuple[List[str], bool, List[Dict]]", "returns": "Tuple[List[str], bool, List[Dict]]"} + ] + } +} + + context/python_preamble.prompt Global and GitHub issue-driven module identification plus parallel sync orchestration. @@ -19,6 +35,7 @@ auto_deps_main_python.prompt agentic_sync_runner_python.prompt durable_sync_runner_python.prompt +agentic_common_python.prompt % Goal Write the `pdd/agentic_sync.py` module. @@ -29,7 +46,7 @@ Entry point for sync orchestration. Supports two public workflows: 2. `run_global_sync(...)`: Tier 1 no-argument global sync. 
Triggered by the CLI when `pdd sync` is invoked with no BASENAME. Do not use PRD files, PRD fingerprinting, architecture schema migration, or agentic PRD analysis in v1. % Requirements -1. Function: `run_agentic_sync(issue_url: str, *, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, dry_run: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, timeout_adder: float = 0.0, use_github_state: bool = True, one_session: bool = False, reasoning_time: Optional[float] = None, durable: bool = False, durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None) -> Tuple[bool, str, float, str]` +1. Function: `run_agentic_sync(issue_url: str, *, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, dry_run: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, timeout_adder: float = 0.0, use_github_state: bool = True, one_session: bool = False, reasoning_time: Optional[float] = None, durable: bool = False, durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, scope_guard: bool = True) -> Tuple[bool, str, float, str]` 2. Return 4-tuple: (success, message, total_cost, model_used) 3. Parse GitHub issue URL to extract: owner, repo, issue_number (reuse `_parse_issue_url` from `agentic_change.py`) 4. Fetch issue content and comments via `gh api` (reuse `_run_gh_command` from `agentic_change.py`) @@ -52,9 +69,10 @@ Entry point for sync orchestration. Supports two public workflows: 18. If `durable=True`, dispatch to `DurableSyncRunner` instead of `AsyncSyncRunner`. Pass the issue URL, durable branch override, `no_resume` flag, and durable max-parallel setting through unchanged. 
Durable mode must still use the same module identification, dependency graph, dry-run validation, fingerprint filtering, sync options, and initial cost accounting as standard issue sync. 19. Aggregate costs from LLM identification + dry-run LLM fallback + runner execution 20. Include `one_session`, `local`, and `target_coverage` in `sync_options` dict passed to the selected runner +21. **Split-Contract Scope Guard (Issue #1013)**: After fetching the issue body and comments and BEFORE dispatching to the runner, call `parse_issue_contract(issue_body, issue_comments)` from `pdd.agentic_common`. When the returned `IssueContract` is non-None, plumb its `allowed_paths` (as `allowed_write_set`) and the union of its `companion_allowlist` with `DEFAULT_SYNC_COMPANION_ALLOWLIST` (as `companion_allowlist`) into BOTH the `AsyncSyncRunner` and `DurableSyncRunner` constructors. Always pass `scope_guard_enabled=scope_guard` so the CLI opt-out flows through. When the contract is None (no marker in the issue), pass `allowed_write_set=None` so the runner falls back to permissive mode without enforcement. Emit one INFO line (suppressed under `quiet`) reporting which contract source was detected — for example `Sync scope guard: contract loaded from (N allowed paths)` — so operators can see at a glance that an issue contract is active. When `scope_guard=False`, emit one WARNING line `Sync scope guard: disabled via --no-scope-guard` instead. % Global Sync (Tier 1) -1. Function: `run_global_sync(*, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, dry_run: bool = False, target_coverage: Optional[float] = None, one_session: bool = False, local: bool = False, timeout_adder: float = 0.0) -> Tuple[bool, str, float, str]`. Forward `timeout_adder` to the runner via `sync_options['timeout_adder']`. +1. 
Function: `run_global_sync(*, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, dry_run: bool = False, target_coverage: Optional[float] = None, one_session: bool = False, local: bool = False, timeout_adder: float = 0.0, scope_guard: bool = True) -> Tuple[bool, str, float, str]`. Forward `timeout_adder` to the runner via `sync_options['timeout_adder']`. The `scope_guard` kwarg is accepted for CLI signature parity with `run_agentic_sync`; in global mode there is no issue body to parse, so the runner is always constructed with `allowed_write_set=None` (permissive fallback) regardless of `scope_guard`. The kwarg exists so `pdd sync --no-scope-guard` does not raise a `TypeError` in global mode. 2. Load scoped architecture modules by discovering architecture.json files with `find_architecture_for_project(project_root)`, preserving each entry's architecture path and cwd. For each architecture entry, derive its cwd by calling `_resolve_module_cwd(basename, arch_path.parent)` so that any nested `.pddrc` whose context claims that basename wins over the arch file's own directory; when no nested `.pddrc` claims the basename, `_resolve_module_cwd` falls back to `arch_path.parent` so nested-architecture modules retain arch-dir isolation. Fail clearly if no architecture data exists. 3. Extract syncable basenames from architecture `filename` values. Preserve subdirectory prefixes, e.g. `commands/maintenance_python.prompt` becomes `commands/maintenance`; skip non-syncable filenames such as `_LLM.prompt`. If the same basename appears in more than one architecture scope, use a display key of `:` based on `arch_path.parent` for planning and dependency graph keys while preserving the plain basename as the executable sync target. 
Do not key duplicates by resolved cwd, because different architecture files may legitimately resolve to the same nested `.pddrc` cwd. 4. For each scoped architecture module, resolve context/prompts_dir from that module's resolved cwd (the cwd returned by `_resolve_module_cwd`, which may be a nested `.pddrc` directory rather than `arch_path.parent`), then call `sync_determine_operation(..., log_mode=True, read_only=True, ...)` from that cwd for each detected language. Suppress info-level logs from `pdd.sync_determine_operation` around the read-only calls so global dry-run output stays readable and metadata files are never changed during analysis. diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index aa06c8132..c5f64e9eb 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -1,7 +1,24 @@ +Parallel pdd sync engine enforcing dep ordering, per-module budgets, and split-contract scope guard. 
+ + +{ + "type": "module", + "module": { + "functions": [ + {"name": "AsyncSyncRunner", "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, *, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, module_targets: Optional[Dict[str, str]] = None, initial_cost: float = 0.0, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True, contract_source: Optional[str] = None, project_root: Optional[Path] = None)", "returns": "AsyncSyncRunner"}, + {"name": "AsyncSyncRunner.run", "signature": "() -> Tuple[bool, str, float]", "returns": "Tuple[bool, str, float]"}, + {"name": "build_dep_graph_from_architecture", "signature": "(arch_path: Path, target_basenames: List[str]) -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"}, + {"name": "build_dep_graph_from_architecture_data", "signature": "(architecture: Any, target_basenames: List[str], *, source_name: str = 'architecture.json') -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"} + ] + } +} + + context/python_preamble.prompt architecture_sync_python.prompt agentic_langtest_python.prompt agentic_test_orchestrator_python.prompt +agentic_common_python.prompt % Goal Write the `pdd/agentic_sync_runner.py` module. @@ -10,7 +27,7 @@ Write the `pdd/agentic_sync_runner.py` module. Parallel sync engine that runs `pdd sync` for multiple modules concurrently using a ThreadPoolExecutor, respecting dependency ordering. Posts live progress updates to a GitHub issue comment. Supports state persistence for resumability across runs, phase tracking, and graceful interrupt handling. % Requirements -1. Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, verbose, issue_url, module_cwds, module_targets, initial_cost=0.0)` +1. 
Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, *, verbose, issue_url, module_cwds, module_targets, initial_cost=0.0, allowed_write_set=None, companion_allowlist=None, scope_guard_enabled=True, contract_source=None, project_root=None)` - `basenames: List[str]` — modules to sync - `dep_graph: Dict[str, List[str]]` — basename -> [dependency basenames] - `sync_options: Dict` — budget, total_budget, target_coverage, skip_verify, skip_tests, agentic, no_steer, max_attempts, one_session, local, timeout_adder @@ -19,8 +36,16 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `module_cwds: Optional[Dict[str, Path]]` — per-module working directories (defaults to project_root) - `module_targets: Optional[Dict[str, str]]` — optional display-key -> actual sync basename map. This allows callers such as global sync to schedule scoped keys like `examples/prompts_linter:report` while executing `pdd sync report` in that module's cwd. - `initial_cost: float` — pre-runner cost (LLM module identification and dry-run fallback) to include in total cost display and return value (default 0.0) + - `allowed_write_set: Optional[Iterable[str]]` — repo-relative path strings from the issue split contract that this sync run is permitted to modify. `None` means "no contract was parseable from the issue → run in permissive mode (no enforcement)". An explicit empty iterable means "contract present but empty → reject every change as out-of-scope" (a degenerate but legal contract). Resolved against each module's `cwd`/repo root inside the runner. + - `companion_allowlist: Optional[Iterable[str]]` — additional glob patterns (e.g. `".pdd/meta/*.json"`) describing companion artifacts that MAY be modified outside the primary `allowed_write_set`. Defaults to `DEFAULT_SYNC_COMPANION_ALLOWLIST` from `agentic_common` (currently `(".pdd/meta/*.json",)`) when `None`. Issue contracts MAY widen the companion allowlist by passing a superset. 
+ - `scope_guard_enabled: bool` — master switch (default `True`). When `False`, the runner records the parsed contract for diagnostics but performs no enforcement, no revert, and no hard-fail. Maps to the CLI `--no-scope-guard` opt-out. + - `contract_source: Optional[str]` — diagnostic label carrying the parse source of the issue contract (`"html-comment"`, `"fenced-block"`, or `"bullet-list"`, matching `IssueContract.source`) so scope-guard diagnostics and downstream review-loop reporters can surface where the contract was detected. `None` when no contract was parsed (permissive fallback). + - `project_root: Optional[Path]` — when non-`None`, overrides the default `Path.cwd()` used to seed `self.project_root` and to take the baseline-changed-paths snapshot (Issue #1013 iter-18 M-1). Subclasses such as `DurableSyncRunner` pin this to the durable worktree's git root so the baseline reflects the worktree where syncs will actually run, not the caller's current working directory. Resolved with `Path(project_root).resolve()` when provided. + - `_baseline_changed_paths: Dict[str, Optional[str]]` (Issue #1013 iter-6 B1 + iter-24 M-1 + iter-38 M-1) — snapshot of pre-existing dirty/untracked working-tree paths captured from `_git_changed_paths(project_root)` at runner init, mapping each repo-relative POSIX path to its init-time SHA-1 (`_hash_file(project_root, rel)`) or `None` when the file was unreadable at init. Iter-6 B1 introduced the "preserve user's pre-existing untracked files" carve-out so that the scope guard does not delete unrelated user WIP. **Iter-24 M-1 (baseline-clobber bug)** upgraded preservation from name-based to content-aware: the old `Set[str]` snapshot let a buggy LLM silently OVERWRITE an out-of-scope baseline file (the iter-23 codex repro: `outside: sync clobbered`). 
The dict + SHA invariant: a baseline path is auto-allowed (added to `allowed_files`) by `_enforce_scope_guard` ONLY IF its current SHA-1 matches the init-time SHA-1; a divergent SHA falls through to the contract check and surfaces the clobber. Gated on `scope_guard_enabled AND allowed_write_paths is not None`; when the gate is off the snapshot is an empty dict so the `.items()` iteration in the enforcement path is a no-op. The `_hash_file` helper uses SHA-1 because this is clobber detection, not adversarial collision resistance. **Iter-38 M-1 (fail-closed baseline acquisition)** upgraded the helper signature from `set[str]` to `Optional[set[str]]`: a transient init-time git failure (lock contention, missing binary, `OSError`) now returns `None` (distinguishable from "scan succeeded, clean worktree" which returns the empty set). When EITHER `_git_changed_paths` or `_git_ignored_paths` returns `None` at init, the runner records `_baseline_acquisition_failed = True` and `run()` aborts before any write-capable work runs — otherwise the empty baseline would later cause the scope guard to flag pre-existing user WIP as out-of-scope and revert/delete it. + - `_baseline_ignored_paths: Dict[str, Optional[str]]` (Issue #1013 iter-20 M-1 + iter-24 M-1 + iter-38 M-1) — sibling snapshot to `_baseline_changed_paths`, populated from `git ls-files --others --ignored --exclude-standard` at init via the helper `_git_ignored_paths(project_root)`. Records repo-relative POSIX paths of pre-existing gitignored files (e.g. user-side `build/cache.bin` under a repo-wide `.gitignore: build/`) so the post-revert re-scan does not flag them as the sync run's out-of-scope writes. Gated identically to `_baseline_changed_paths` (`scope_guard_enabled AND allowed_write_paths is not None`) so non-contract runs do not pay the `git ls-files` cost on repos with large ignored trees (`node_modules/`, `build/`, etc.). 
**Iter-24 M-1** same dict-with-SHA shape as `_baseline_changed_paths`: pre-existing ignored files are skipped from the gitignored re-scan ONLY when their content is byte-identical to the init snapshot; a clobbered ignored baseline path surfaces via the re-scan as out-of-scope. **Iter-38 M-1** same `Optional[set[str]]` helper signature: a `None` return from `_git_ignored_paths` at init also triggers the fail-closed `_baseline_acquisition_failed` flag. + - `_baseline_acquisition_failed: bool` (Issue #1013 iter-38 M-1) — internal flag set during `__init__` when either init-time baseline scan helper returns `None` (transient git failure). When `True`, `run()` MUST return `(False, fail-closed-message, initial_cost)` before any write-capable work (subprocess dispatch, executor submission) runs. Default `False`; not part of the public API. The flag is also evaluated in the durable runner's inherited `super().run()` call so a fail-closed init aborts durable mode at the same boundary. - Tracks per-module state: pending -> running -> success | failed -2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs +2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs. **Iter-38 M-1 (fail-closed baseline acquisition):** before any other work, check `self._baseline_acquisition_failed` (set during `__init__` when either init-time baseline scan helper returned `None`) and short-circuit with `(False, "Scope guard fail-closed: could not snapshot working-tree baseline at runner init …", self.initial_cost)` so a transient init-time git failure cannot let the scope guard later treat an empty baseline as "no pre-existing files" and revert/delete user WIP. 3. 
Use `concurrent.futures.ThreadPoolExecutor` with `MAX_WORKERS = 4`; when `sync_options["total_budget"]` is set, run sequentially and pass only the remaining total budget to each child process so the total budget is not multiplied per module. 4. Dependency-aware scheduling: a module starts only when all its dependencies (within target set) have status "success" 5. Dependencies outside `basenames` are omitted from `dep_graph` (partial sync). Callers should rely on `build_dep_graph_from_architecture` warnings for visibility when an architecture edge points outside the target set; the runner still assumes those deps are out of scope for this run. @@ -44,6 +69,26 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin 19. Subprocess env: include `PYTHONUNBUFFERED=1` for real-time output 20. Forward "Successfully submitted example" messages from child stdout to parent console 21. Heartbeat logging: during long-running syncs, print progress updates every 60s. Prefer parsed `PDD_PHASE` state — `f" — phase: {current_phase} ({len(completed_phases)} done)"` — so operators see real progress through the generate/test/fix phases instead of a stale `Preprocessing complete` line. Fall back to the last non-box-drawing stdout line only when no phase has been reported yet. +22. **Split-Contract Scope Guard (Issue #1013)**: After each per-module `pdd sync` subprocess completes (success or failure), and **before** the runner declares that module successful or persists state, the runner MUST invoke `_enforce_scope_guard(basename, module_cwd)` when `self.scope_guard_enabled` is True AND `self.allowed_write_set is not None`. 
The helper: + - Builds the effective allow set for the module: every path in `self.allowed_write_set` resolved against the module's repo root (the git toplevel of `module_cwd`, falling back to `module_cwd` itself), plus every path under `module_cwd` that matches any glob in the effective companion allowlist (`self.companion_allowlist` ∪ `DEFAULT_SYNC_COMPANION_ALLOWLIST`). Companion-allowlist matching MUST use `_matches_companion_pattern_anchored` from `pdd.agentic_common` (Issue #1013 iter-14 M-1) — anchored, segment-aware glob matching — NOT `pathlib.PurePosixPath.match`, whose suffix-based semantics would falsely auto-allow nested paths like `subdir/.pdd/meta/foo.json` against the default `.pdd/meta/*.json` pattern. Candidate paths are normalized as POSIX relative to `module_cwd`, NOT relative to the repo root, so a multi-module repo with per-module cwds correctly auto-allows `/.pdd/meta/*.json` under the default top-level companion pattern. For the `git status`-driven branch (companion-shaped paths that are TRACKED-DELETED and therefore invisible to `rglob`, iter-4 F1), scope each repo-relative path to `module_cwd` first, then strip the cwd prefix before invoking the matcher. + - Calls `_revert_out_of_scope_changes(repo_root, allowed_paths)` from `pdd.agentic_common` to revert tracked out-of-scope modifications, AND calls `revert_out_of_scope_changes_with_dirs(repo_root, allowed_dirs=set(), allowed_files=allowed_paths)` from `pdd.agentic_common_worktree` to additionally remove untracked out-of-scope new files. The combination matches the existing scope-guard pattern used by `agentic_update`/`agentic_fix`/`agentic_crash`/`agentic_e2e_fix_orchestrator`. 
+ - **Post-revert re-scan (Issue #1013 iter-9, M-1 fail-closed boundary)**: after both helpers return, the runner MUST call `_remaining_out_of_scope_paths(repo_root, allowed_files)` to detect anything the helpers could not revert/remove (git timeout, permission error, restore failure — those helpers log a warning and return `[]`, which the orchestrator otherwise mistakes for "clean"). Paths returned by the re-scan and not already in the helper-returned offending list go into an `Unrecovered (revert failed, manual cleanup required):` section in the diagnostic. A non-empty Unrecovered set MUST cause `_enforce_scope_guard` to return a diagnostic string (hard-fail the module) even when the revert helpers themselves returned empty lists. When `offending` is empty but `Unrecovered` is non-empty, emit the alternate header `Scope guard detected out-of-scope artifacts for module '' (contract source: ) but the revert helpers reported no successful reverts.` instead of the misleading `Scope guard reverted 0 out-of-scope file(s)...`. The `Unrecovered` section is OMITTED entirely when empty (no empty headers). + - Diagnostic format (printed to stderr; structured for downstream parsers — checkup, review-loop reports): + ``` + Scope guard reverted N out-of-scope file(s) for module '' (contract source: ): + - path/relative/to/repo + - another/path + Unrecovered (revert failed, manual cleanup required): + - path/the/guard/could/not/revert + Allowed write set: + - path/from/contract + Companion allowlist: + - .pdd/meta/*.json + ``` + - **Hard-fail policy (Issue #1013 acceptance criteria 3 and 4)**: if any out-of-scope path was detected, the module MUST be recorded as failed with `error="Scope guard hard-fail: out-of-scope artifacts detected"` followed by the diagnostic body. This blocks the per-module success record, blocks dependent modules from scheduling, and ensures checkup/review-loop reports surface the failure rather than burying it under an apparently-successful sync. 
Hard-fail applies even when the underlying `pdd sync` subprocess succeeded — the contract violation is the failure mode the scope guard exists to catch. + - **Permissive fallback**: when `self.allowed_write_set is None` (no parseable contract on the issue), `_enforce_scope_guard` returns immediately without enforcement. The user-facing INFO log for this state is owned by the sync layer (`run_agentic_sync` in `pdd/agentic_sync.py`) — the runner MUST NOT emit a duplicate line on `run()` entry (Issue #1013 iter-18 m-1: previously logged twice per invocation; one source of truth now). + - **Opt-out**: when `self.scope_guard_enabled is False`, skip enforcement entirely. Even an explicit `allowed_write_set` is recorded only for diagnostics in this mode. The user-facing WARNING for this state is likewise owned by the sync layer — the runner MUST NOT log it again on `run()` entry (Issue #1013 iter-18 m-1). + - The scope-guard step MUST run with a `threading.Lock` held around git operations on a per-`module_cwd` basis to avoid `git status` / `git checkout` races when modules share a repo root (the common shared-worktree case for non-durable issue sync). % Dataclass: `ModuleState` - `status: str` — "pending", "running", "success", "failed" @@ -103,6 +148,8 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `_find_pdd_executable() -> Optional[str]`: find pdd binary (same pattern as `server/jobs.py`) - `_parse_cost_from_csv(csv_path: str) -> float`: sum cost column from PDD_OUTPUT_COST_PATH CSV - `_format_duration(start, end) -> str`: format seconds as "Xs" or "Xm Ys" +- `_enforce_scope_guard(self, basename: str, module_cwd: Path) -> Optional[str]`: Issue #1013 scope guard. Returns `None` when the module is in scope; returns a multi-line diagnostic string (see Req 22) when out-of-scope artifacts were detected. Callers (the per-future completion handler) treat a non-None return as a module failure and replace any prior success record with it. 
No-ops when `self.scope_guard_enabled is False` or `self.allowed_write_set is None`. Reuses `pdd.agentic_common._revert_out_of_scope_changes` and `pdd.agentic_common_worktree.revert_out_of_scope_changes_with_dirs` rather than reimplementing git scanning. **Fail-closed safety (Issue #1013 iter-9, M-1)**: after invoking both revert helpers, the runner MUST re-scan the worktree via `_remaining_out_of_scope_paths(repo_root, allowed_files)` and hard-fail when ANY out-of-scope artifacts remain, even if the revert helpers returned empty lists. The helpers fail-open on git timeout / permission error / restore failure (they log a warning and return `[]`); without the re-scan the orchestrator would treat that as "nothing was out of scope" and let the module succeed with the contract still violated on disk. Unrecovered paths surface in the diagnostic under a distinct `Unrecovered (revert failed, manual cleanup required):` section. +- `_remaining_out_of_scope_paths(self, repo_root: Path, allowed_files: Set[Path]) -> List[str]`: Issue #1013 iter-9 (M-1) re-scan helper. Runs `git status --porcelain --untracked-files=all` in *repo_root* with a 30s timeout, parses each line (handling the `R old -> new` rename format and the `_normalize_repo_path` cleanup), resolves each path against *allowed_files*, and returns a sorted list of POSIX repo-relative paths still NOT in the allow set. **Iter-20 M-1 (gitignored fail-open):** the standard `git status` scan does NOT report gitignored files, so a sync that writes outside the contract into a gitignored path (e.g. `build/junk.txt` under a repo-wide `.gitignore: build/`) would otherwise be invisible to the re-scan. 
A SECOND scan via `git ls-files --others --ignored --exclude-standard` (also with a 30s timeout) enumerates every individual ignored file; entries are skipped when present in `self._baseline_ignored_paths` (pre-existing user-owned ignored files snapshotted at runner init) or in `allowed_files` (companion artifacts like `.pdd/meta/*.json` when the user has `.pdd/` in `.gitignore`); the rest are merged into the returned list. If EITHER git scan fails (timeout, missing binary, non-zero return), the function returns the single-element sentinel `[""]` so `_enforce_scope_guard` hard-fails rather than silently treating an unobservable worktree as clean. The warning log follows the pattern used by `_revert_out_of_scope_changes` and `revert_out_of_scope_changes_with_dirs`, but the return value differs: the sentinel, not an empty list, is what forces the hard-fail. - `_parse_conformance_failure(stdout: str, stderr: str) -> Optional[Tuple[str, Tuple[str, ...]]]`: scan combined stdout+stderr for the line prefix `Architecture conformance error for ` and, when matched, return `(repair_directive, missing_symbols)` where `missing_symbols` is a sorted tuple of the symbols listed after any of the following inline shapes (route each into its own directive bucket — they MUST NOT be merged): - (a) `declared symbols missing from generated code:` — default `ArchitectureConformanceError` shape (architecture.json symbol-existence check). - (b) `Python code uses camelCase names (...)` parenthesised list — camelCase guard. diff --git a/pdd/prompts/durable_sync_runner_python.prompt b/pdd/prompts/durable_sync_runner_python.prompt index 452f8e98b..492cdf489 100644 --- a/pdd/prompts/durable_sync_runner_python.prompt +++ b/pdd/prompts/durable_sync_runner_python.prompt @@ -11,10 +11,10 @@ Durable execution engine for `pdd sync --durable`. It must pr 1. Class: `DurableSyncRunner`, compatible with the runner contract used by `run_agentic_sync()`: `run() -> Tuple[bool, str, float]`. 2. 
Prepare a durable branch named `sync/issue-` by default, unless a safe `durable_branch` override is supplied. 3. Reject unsafe branches: `main`, `master`, the repository default branch, missing/broken `origin`, non-git directories, and durable branches checked out in another worktree. -4. Use `.pdd/worktrees/durable-issue-` as the main durable worktree and `.pdd/worktrees/sync-issue--` for per-module worktrees. +4. Use `.pdd/worktrees/durable-issue-` as the main durable worktree and `.pdd/worktrees/sync-issue--` for per-module worktrees. Issue #1013 iter-22 M-1: after invoking `AsyncSyncRunner.__init__`, explicitly reset the inherited `_baseline_changed_paths` and `_baseline_ignored_paths` mappings to empty dicts (they are `Dict[str, Optional[str]]` snapshots in `AsyncSyncRunner`). Per-module durable worktrees are freshly created via `git worktree add` and contain no pre-existing user WIP by construction, so the iter-6 B1 "preserve pre-existing untracked files" carve-out (which protects the user's main checkout in the in-place async case) has no analog in durable mode. Inheriting a non-empty baseline from `git_root` leaks the caller's dirty paths into each worktree's scope-guard allow set (`_enforce_scope_guard` resolves baseline `rel_posix` entries against the per-module worktree root) and silently bypasses the split contract. 5. Resume by scanning pushed checkpoint commits on the durable branch for trailers formatted as `PDD-Sync-Checkpoint-V1: issue= module=`. Ignore trailers for other issues. 6. Do not rely on `.pdd/agentic_sync_state.json` for durable resume. Corrupt or missing local state must not prevent resuming from remote checkpoint trailers. -7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. Push the checkpoint before printing `PDD_CHECKPOINT:`. +7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. 
If the parent issue supplied an allowed write set, reject any staged path outside that exact repo-relative set before creating the checkpoint. Companion-allowlist matching for staged paths MUST use `_matches_companion_pattern_anchored` from `pdd.agentic_common` (Issue #1013 iter-14 M-2) — anchored, segment-aware glob matching — NOT `pathlib.PurePosixPath.match`, whose suffix-based semantics would let a nested path like `subdir/.pdd/meta/foo.json` falsely match the default `.pdd/meta/*.json` companion pattern and bypass the split contract. Issue #1013 iter-16 M-1: companion patterns are matched MODULE-RELATIVE, so when `module_cwd != module_worktree` (multi-module sync where the module lives in a subdirectory like `worktree/pkg`), strip the module_cwd prefix from each staged path before invoking the anchored matcher — otherwise legitimate metadata such as `pkg/.pdd/meta/foo.json` would be rejected. Staged paths that fall outside the module's cwd (sibling-module artifacts) MUST NOT auto-allow under any companion pattern (F1 iter-3 sibling rule). Push the checkpoint before printing `PDD_CHECKPOINT:`. 8. If a module succeeds with no file diff, create an empty checkpoint commit so resume can still skip it later. 9. Never checkpoint unsafe files: `.env`, `.env.local`, `cost.csv`, `crash.log`, `fix_errors.log`, `.pem`, `.key`, token/secret paths, `.pdd/worktrees`, `.pdd/agentic_sync_state.json`, or unrelated `.pdd` files. 10. On patch conflict or failed module output, exit non-zero for the durable run, abort any in-progress `git am`, preserve prior checkpoints, and do not create later checkpoints. @@ -39,4 +39,4 @@ Durable execution engine for `pdd sync --durable`. It must pr - Preserve already-pushed checkpoints as the source of truth. 
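The module-relative, anchored companion matching described in requirement 7 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the body of `_matches_companion_pattern_anchored` (which is not shown in this diff): `matches_anchored_sketch` and `strip_module_prefix` are hypothetical names, and the sketch assumes each pattern segment corresponds to exactly one path segment (no `**` support).

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional


def matches_anchored_sketch(rel_path: str, pattern: str) -> bool:
    """Hypothetical anchored, segment-aware glob match (illustration only)."""
    path_parts = PurePosixPath(rel_path).parts
    pattern_parts = PurePosixPath(pattern).parts
    # Anchoring: the pattern must account for every segment of the path,
    # so extra leading directories cannot slip through.
    if len(path_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(seg, pat) for seg, pat in zip(path_parts, pattern_parts))


def strip_module_prefix(staged_path: str, module_rel: str) -> Optional[str]:
    """Return the path relative to the module directory, or None if outside it."""
    try:
        return PurePosixPath(staged_path).relative_to(module_rel).as_posix()
    except ValueError:
        # Sibling-module artifact: never eligible for companion auto-allow.
        return None


# PurePosixPath.match compares relative patterns from the right,
# which is the suffix behaviour the requirement warns about.
suffix_match = PurePosixPath("subdir/.pdd/meta/foo.json").match(".pdd/meta/*.json")
anchored_match = matches_anchored_sketch("subdir/.pdd/meta/foo.json", ".pdd/meta/*.json")
```

In this sketch a staged path such as `pkg/.pdd/meta/foo.json` would first be passed through `strip_module_prefix(..., "pkg")` and only then matched, while a path outside the module directory yields `None` and is never auto-allowed.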
% Deliverables -- Code: `pdd/durable_sync_runner.py` \ No newline at end of file +- Code: `pdd/durable_sync_runner.py` diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 42b64f368..da818e576 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -4689,6 +4689,170 @@ def _init_test_git_repo(path): class TestRevertOutOfScopeChanges: """Tests for _revert_out_of_scope_changes scope guard utility.""" + def test_reverts_out_of_scope_staged_rename(self, tmp_path): + """Iter-6 B2 (rename revert bug): ``git status --porcelain`` reports + renames as ``R old -> new``. The helper used to treat the whole + payload as one path, so the subsequent ``git checkout HEAD --`` + was passed a literal ``"old -> new"`` and silently failed. + + After the fix both source and destination are restored and + ``git status`` is clean. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "old.py").write_text("contents\n") + (proj / "in_scope.py").write_text("in_scope\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", "old.py", "new.py"], + check=True, capture_output=True) + + allowed = {(proj / "in_scope.py").resolve()} + reverted = _revert_out_of_scope_changes(proj, allowed) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"git status should be clean after rename revert; got: {status!r}" + ) + assert (proj / "old.py").exists() + assert not (proj / "new.py").exists() + assert len(reverted) >= 1 + + def test_partial_rename_restores_both_sides_when_source_allowed(self, tmp_path): + """Iter-7 B4 (partial-rename bug): a rename is atomic. If the + contract allows the SOURCE (``old``) but NOT the destination + (``new``), restoring only ``new`` leaves ``D old`` staged. 
Fix: + when either side of a rename is out of scope, revert BOTH so the + rename is fully undone. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + # Contract allows source but not destination. + allowed = {(proj / "pdd" / "old.py").resolve()} + _revert_out_of_scope_changes(proj, allowed) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"git status should be clean — rename must be fully undone; " + f"got: {status!r}" + ) + assert (proj / "pdd" / "old.py").exists() + assert not (proj / "pdd" / "new.py").exists() + + def test_partial_rename_restores_both_sides_when_destination_allowed( + self, tmp_path + ): + """Iter-7 B4 (partial-rename bug): inverse of the above. If the + contract allows the DESTINATION but NOT the source, restoring only + ``old`` leaves ``A new`` staged. The whole rename must be reverted. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + # Contract allows destination but not source. 
+ allowed = {(proj / "pdd" / "new.py").resolve()} + _revert_out_of_scope_changes(proj, allowed) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"git status should be clean — rename must be fully undone; " + f"got: {status!r}" + ) + assert (proj / "pdd" / "old.py").exists() + assert not (proj / "pdd" / "new.py").exists() + + def test_rename_left_in_place_when_both_sides_allowed(self, tmp_path): + """Iter-7 B4 negative case: when BOTH sides of the rename are in + scope, the rename must NOT be reverted — only out-of-scope changes + get touched. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + allowed = { + (proj / "pdd" / "old.py").resolve(), + (proj / "pdd" / "new.py").resolve(), + } + _revert_out_of_scope_changes(proj, allowed) + + # Rename should remain staged. + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert "R" in status and "old.py" in status and "new.py" in status, ( + f"In-scope rename must remain staged; got: {status!r}" + ) + + def test_empty_contract_reverts_rename_fully(self, tmp_path): + """Iter-8 B5a (empty-contract early-exit) + B5b: a reject-all + empty contract (``allowed_paths=set()``) used to short-circuit + the helper. After the fix, the helper proceeds with revert; the + rename is fully undone. 
+ """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + # Empty contract: nothing is allowed → revert everything. + _revert_out_of_scope_changes(proj, set()) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"Empty contract must revert all changes including renames; " + f"got: {status!r}" + ) + def test_reverts_deleted_files(self, tmp_path): """Deleted files outside allowed set must be restored.""" from pdd.agentic_common import _revert_out_of_scope_changes @@ -6986,3 +7150,491 @@ def test_anthropic_is_error_json_envelope_skips_retries( ) # 3. No backoff sleep — permanent errors must NOT delay the fallback sleep_mock.assert_not_called() + + +# --------------------------------------------------------------------------- +# Issue #1013 — IssueContract / parse_issue_contract regression coverage (F9) +# --------------------------------------------------------------------------- + +class TestParseIssueContract: + """Regression coverage for ``pdd.agentic_common.parse_issue_contract``. + + Exercises the prompt-level requirements at + ``pdd/prompts/agentic_common_python.prompt:21`` (item 21 — issue contract + parsing) and the F1+F2 hardening done in Issue #1013 review iteration 2. 
+ """ + + def test_html_comment_happy_path_returns_contract(self): + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.companion_allowlist == (".pdd/meta/*.json",) + assert c.source == "html-comment" + + def test_empty_allowed_paths_returns_reject_all_contract(self): + """F1: an explicit empty contract is a valid 'reject every change' + contract, NOT permissive fallback.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = '' + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () + assert c.source == "html-comment" + + def test_malformed_json_returns_none(self): + from pdd.agentic_common import parse_issue_contract + + body = "" + assert parse_issue_contract(body) is None + + def test_body_marker_wins_over_comment_marker(self): + from pdd.agentic_common import parse_issue_contract + + body = '' + comment = ( + '' + ) + c = parse_issue_contract(body, [comment]) + assert c is not None + assert c.allowed_paths == ("from_body.py",) + + def test_path_traversal_entries_are_dropped_but_contract_kept(self): + """F1: syntactically invalid entries are dropped silently; the + contract itself remains valid even if filtering leaves it empty.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/ok.py",) + + def test_fenced_block_only_text_or_json_languages_are_accepted(self): + """F2: the parser must reject arbitrary fence info strings such as + ``python`` or ``bash``; only ``text`` and ``json`` are accepted (a + bare fence with no language is rejected too).""" + from pdd.agentic_common import parse_issue_contract + + for lang in ("python", "bash", "yaml", "shell"): + body = f"## Allowed Write 
Set\n```{lang}\npdd/foo.py\n```\n" + assert parse_issue_contract(body) is None, ( + f"fence language {lang!r} must be rejected" + ) + + def test_fenced_block_must_immediately_follow_heading(self): + """F2: a fence that appears later in the body (after intervening + prose) must NOT be picked up — only whitespace is allowed between + the heading and the fence.""" + from pdd.agentic_common import parse_issue_contract + + body = ( + "## Allowed Write Set\n\n" + "Some discussion paragraph here.\n\n" + "```text\npdd/foo.py\n```\n" + ) + assert parse_issue_contract(body) is None + + def test_fenced_block_accepts_only_text_or_json(self): + """Iter-3 F3: the spec at agentic_common_python.prompt:110 requires + ``text`` or ``json`` info strings; bare ``` (no language) is NOT + accepted as a split-contract fence. + + Iter-12 B-1: each fence language has its own body format — ``text`` + is line-separated paths, ``json`` is a JSON array of path strings.""" + from pdd.agentic_common import parse_issue_contract + + cases = ( + ("```text", "pdd/foo.py"), + ("```json", '["pdd/foo.py"]'), + ) + for fence, payload in cases: + body = f"## Allowed Write Set\n{fence}\n{payload}\n```\n" + c = parse_issue_contract(body) + assert c is not None and c.allowed_paths == ("pdd/foo.py",), fence + assert c.source == "fenced-block" + + def test_fenced_block_rejects_bare_fence(self): + """Iter-3 F3: bare ``` fence (no language) must be rejected.""" + from pdd.agentic_common import parse_issue_contract + + body = "## Allowed Write Set\n```\npdd/foo.py\n```\n" + assert parse_issue_contract(body) is None + + def test_fenced_block_empty_body_returns_empty_contract(self): + """F1: an empty fenced block is a degenerate but legal contract.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = "## Allowed Write Set\n```text\n```\n" + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () + + def 
test_fenced_json_array_of_paths_parses_correctly(self): + """Iter-12 B-1: a ``json`` fence whose body is a JSON array of path + strings must parse to those paths (NOT to a single literal path + equal to the raw JSON text).""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Allowed Write Set\n" + "```json\n" + '["pdd/foo.py", "tests/test_foo.py"]\n' + "```\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "fenced-block" + + def test_fenced_json_empty_array_returns_empty_contract(self): + """Iter-12 B-1: ``[]`` in a ``json`` fence is a syntactically valid + degenerate reject-all contract — the parser MUST return an + ``IssueContract`` with ``allowed_paths=()``, NOT a single-element + tuple containing the literal string ``'[]'``.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = "## Allowed Write Set\n```json\n[]\n```\n" + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () + assert c.source == "fenced-block" + + def test_fenced_json_malformed_returns_none(self): + """Iter-12 B-1: malformed JSON in a ``json`` fence MUST cause the + parser to return ``None`` (permissive fallback), not raise.""" + from pdd.agentic_common import parse_issue_contract + + body = "## Allowed Write Set\n```json\n{not valid json\n```\n" + assert parse_issue_contract(body) is None + + def test_fenced_json_object_returns_none(self): + """Iter-12 B-1: a JSON *object* in a ``json`` fence is the + HTML-comment format leaking into a fence — the fenced-block + ``json`` format is documented as an array of paths only, so the + parser MUST return ``None`` for objects.""" + from pdd.agentic_common import parse_issue_contract + + body = ( + "## Allowed Write Set\n" + "```json\n" + '{"allowed_paths": ["pdd/foo.py"]}\n' + "```\n" + ) + assert parse_issue_contract(body) is None 
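Reviewer note: the four `json`-fence tests above pin down one small decision table (array of strings parses, `[]` is a reject-all contract, malformed JSON and JSON objects fall back to `None`). A minimal sketch of that table follows, assuming a hypothetical helper named `parse_json_fence_body`; this is not the real `pdd.agentic_common` code, and the handling of non-string array entries (silently skipped here) is an assumption the tests do not cover.

```python
import json
from typing import Optional, Tuple


def parse_json_fence_body(raw: str) -> Optional[Tuple[str, ...]]:
    """Hypothetical sketch of the ``json`` fence branch (not pdd's code).

    Returns a tuple of path strings for a JSON array, or None when the
    body is malformed or is not an array (permissive fallback).
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Malformed JSON: signal "no contract" instead of raising.
        return None
    if not isinstance(data, list):
        # A JSON object is the HTML-comment shape leaking into a fence.
        return None
    # Assumption: non-string entries are dropped; real path validation
    # (traversal, absolute paths, separators) is out of scope here.
    return tuple(item for item in data if isinstance(item, str))
```

Under this sketch an empty array yields an empty tuple rather than `None`, which mirrors the distinction the tests draw between a reject-all contract and permissive mode.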
+ + def test_fenced_text_still_parses_line_by_line(self): + """Iter-12 B-1 regression: the ``text`` fence branch MUST keep its + original line-by-line semantics after the parser branched on + language.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Allowed Write Set\n" + "```text\n" + "pdd/foo.py\n" + "tests/test_foo.py\n" + "```\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "fenced-block" + + def test_companion_allowlist_rejects_wildcard_only_patterns(self): + """Iter-10 M-1: wildcard-only patterns (``*``, ``**``, ``**/*``, ``?``) + would let a contract auto-allow repo-wide changes; the parser MUST + drop them silently.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.companion_allowlist == () + assert c.allowed_paths == ("pdd/foo.py",) + + def test_companion_allowlist_keeps_anchored_patterns(self): + """Iter-10 M-1: patterns with at least one literal-character segment + anchor remain valid. Iter-14 M-1/M-2: ``**``-bearing patterns are + ALSO dropped now (segment-aware matcher requires equal segment + counts, so a doublestar segment would be ambiguous).""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + # ``**/foo.json`` is dropped by the iter-14 doublestar rule; + # the three remaining anchored patterns are kept. 
+ assert c.companion_allowlist == ( + ".pdd/meta/*.json", + "architecture.json", + "tests/test_*.py", + ) + + def test_companion_allowlist_rejects_traversal_and_absolute(self): + """Iter-10 M-1: absolute paths, parent-traversal, and Windows + separators in companion patterns must be dropped silently.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.companion_allowlist == () + + def test_default_companion_allowlist_passes_validation(self): + """Iter-10 M-1: the shipped default allowlist MUST itself pass the + validator — otherwise the runner's defense-in-depth filter would + strip every entry and the scope guard would have no companions.""" + from pdd.agentic_common import ( + DEFAULT_SYNC_COMPANION_ALLOWLIST, + _is_valid_companion_pattern, + ) + + assert DEFAULT_SYNC_COMPANION_ALLOWLIST + for pattern in DEFAULT_SYNC_COMPANION_ALLOWLIST: + assert _is_valid_companion_pattern(pattern), pattern + + def test_anchored_matcher_rejects_nested_default_pattern(self): + """Iter-14 M-1/M-2: the anchored, segment-aware matcher MUST treat + ``.pdd/meta/*.json`` as a TOP-LEVEL pattern — paths nested under + any other directory (``subdir/.pdd/meta/foo.json``) MUST NOT auto- + allow, because ``PurePosixPath.match`` is suffix-based and would + let a contract violator bypass the guard by writing fingerprint- + shaped files under any prefix. + """ + from pdd.agentic_common import _matches_companion_pattern_anchored + + # Intended: top-level match auto-allows. + assert _matches_companion_pattern_anchored( + ".pdd/meta/foo.json", ".pdd/meta/*.json" + ) is True + # Bug repro: nested-prefix path must NOT match. + assert _matches_companion_pattern_anchored( + "subdir/.pdd/meta/foo.json", ".pdd/meta/*.json" + ) is False + # Deeper-prefix path must NOT match. 
+ assert _matches_companion_pattern_anchored( + "a/b/c/.pdd/meta/foo.json", ".pdd/meta/*.json" + ) is False + # Path nested UNDER the meta dir (different segment count) must + # NOT match — preserves the iter-3 F3 strict-pathlib semantics. + assert _matches_companion_pattern_anchored( + ".pdd/meta/sub/foo.json", ".pdd/meta/*.json" + ) is False + + def test_anchored_matcher_handles_segment_wildcards(self): + """Iter-14 M-1/M-2: ``*`` matches a single segment only. The + matcher MUST NOT collapse multiple segments into one wildcard. + """ + from pdd.agentic_common import _matches_companion_pattern_anchored + + assert _matches_companion_pattern_anchored( + "tests/foo.txt", "tests/*.txt" + ) is True + # Segment count mismatch — ``*`` does not span ``sub/foo.txt``. + assert _matches_companion_pattern_anchored( + "tests/sub/foo.txt", "tests/*.txt" + ) is False + + def test_is_valid_companion_pattern_rejects_doublestar(self): + """Iter-14 M-1/M-2: ``**`` segments are rejected at parse time so + the segment-aware matcher never sees them. The validator MUST + reject both pure ``**`` and any pattern with a ``**`` segment. + """ + from pdd.agentic_common import _is_valid_companion_pattern + + # Pure wildcard-only patterns (already iter-10 territory). + assert _is_valid_companion_pattern("**") is False + # Iter-14: ``**`` SEGMENTS rejected even when paired with literals. + assert _is_valid_companion_pattern("**/foo.json") is False + assert _is_valid_companion_pattern("foo/**") is False + assert _is_valid_companion_pattern("foo/**/bar.json") is False + # Regression: the shipped default and other anchored patterns + # without ``**`` segments remain valid. 
+ assert _is_valid_companion_pattern(".pdd/meta/*.json") is True + assert _is_valid_companion_pattern("architecture.json") is True + assert _is_valid_companion_pattern("tests/test_*.py") is True + + # ------------------------------------------------------------------ + # Issue #1013 iter-18 B-1: bullet-list contract format + # ------------------------------------------------------------------ + + def test_bullet_list_contract_from_issue_1005(self): + """Iter-18 B-1: the real-world #1005 issue body (verbatim) MUST + parse to the three paths under ``**Allowed write set:**`` and + NOT the six unrelated paths under the earlier ``## Files`` + section. + """ + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Problem\n" + "Single-file `pdd update ` clears stale `_run.json` reports but does not reliably save a fingerprint on success. This leaves `.pdd/meta/` looking partially synced.\n" + "\n" + "## Files\n" + "- `pdd/update_main.py`\n" + "- `pdd/prompts/update_main_python.prompt`\n" + "- `pdd/prompts/agentic_update_python.prompt`\n" + "- `.pdd/meta/update_main_python.json`\n" + "- `.pdd/meta/update_main_python_run.json`\n" + "- `tests/test_update_main.py` (regression test)\n" + "\n" + "## Desired Behavior\n" + "...\n" + "\n" + "---\n" + "## Split Contract\n" + "**Command sequence:** change → sync\n" + "**Allowed write set:**\n" + "- `pdd/update_main.py`\n" + "- `pdd/prompts/update_main_python.prompt`\n" + "- `tests/test_update_main.py`\n" + "**Acceptance criteria:**\n" + "- Successful `pdd update ` writes a current fingerprint to `.pdd/meta/.json`.\n" + "- Finalization failure produces an explicit user-visible warning (no silent stale metadata).\n" + "- Regression test in `tests/test_update_main.py` covers fingerprint save on success and the warning path on finalization failure.\n" + "**Independently mergeable:** True\n" + "**Scope rule:** Do not expand beyond this contract or implement sibling sub-issue work. 
If the contract is insufficient, report the gap instead.\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ( + "pdd/update_main.py", + "pdd/prompts/update_main_python.prompt", + "tests/test_update_main.py", + ) + assert c.source == "bullet-list" + + def test_bullet_list_stops_at_next_label(self): + """Iter-18 B-1: the ``**Acceptance criteria:**`` label terminates + the bullet list — bullets under it MUST NOT join the write set. + """ + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + "- tests/test_foo.py\n" + "**Acceptance criteria:**\n" + "- a thing\n" + "- another thing\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "bullet-list" + + def test_bullet_list_stops_at_horizontal_rule(self): + """Iter-18 B-1: ``---`` terminates the bullet list.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + "- tests/test_foo.py\n" + "---\n" + "Other section\n" + "- not_in_contract.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + + def test_bullet_list_strips_backticks_on_paths(self): + """Iter-18 B-1: backtick-wrapped paths in bullets are accepted; the + backticks are stripped before validation.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- `pdd/foo.py`\n" + "- `tests/test_foo.py`\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "bullet-list" + + def 
test_bullet_list_under_allowed_write_set_heading(self): + """Iter-18 B-1: ``## Allowed Write Set`` is an accepted heading + (matches the same regex as ``## Split Contract``).""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Allowed Write Set\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + "- tests/test_foo.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "bullet-list" + + def test_bullet_list_with_no_valid_paths(self): + """Iter-18 B-1: bullets that are all invalid (parent-traversal, + absolute, etc.) reduce to a degenerate reject-all contract per + the iter-8 B5 semantics — the contract is still returned with + ``allowed_paths=()``, NOT ``None``.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- ../escape\n" + "- /absolute/path\n" + "- pdd\\windows_sep.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () + assert c.source == "bullet-list" + + def test_html_comment_wins_over_bullet_list(self): + """Iter-18 B-1: when BOTH formats appear, the HTML-comment branch + wins (spec-preferred priority order is preserved).""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + '\n' + "\n" + "## Split Contract\n" + "**Allowed write set:**\n" + "- from_bullets.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("from_html.py",) + assert c.source == "html-comment" diff --git a/tests/test_agentic_common_worktree.py b/tests/test_agentic_common_worktree.py index 6a3868605..ff660b8a9 100644 --- a/tests/test_agentic_common_worktree.py +++ b/tests/test_agentic_common_worktree.py @@ -519,16 +519,60 @@ def test_keeps_in_scope_by_allowed_files(self, tmp_path): assert result == [] def 
test_handles_renames(self): + """Iter-8 B5b: renames are reverted atomically — both old and new + sides appear in the result list, and the helper invokes + ``git restore --staged --worktree --source=HEAD`` (not the old + ``git checkout HEAD --``) so the rename destination is properly + removed from the working tree.""" porcelain = "R old.py -> new.py\n" with patch(f"{MODULE}.subprocess.run") as mock_run: mock_run.side_effect = [ _cp(stdout=porcelain), - _cp(), # checkout + _cp(), # restore ] result = revert_out_of_scope_changes_with_dirs( Path("/repo"), allowed_dirs=set(), allowed_files=set() ) - assert Path("new.py") in result + assert Path("old.py") in result and Path("new.py") in result + # Verify the second subprocess call used ``git restore`` with + # BOTH paths (atomic rename revert). + restore_call = mock_run.call_args_list[1] + args = restore_call.args[0] + assert "restore" in args + assert "old.py" in args and "new.py" in args + + def test_partial_rename_atomic_revert(self, tmp_path): + """Iter-8 B5b (worktree helper): when one side of a rename is + allowed and the other is not, the rename must be reverted as a + unit. Uses a real ``tmp_path`` git repo because mocks would not + catch the actual half-staged state. + """ + import subprocess as _sp + env = {**os.environ, "GIT_AUTHOR_NAME": "T", "GIT_AUTHOR_EMAIL": "t@t", + "GIT_COMMITTER_NAME": "T", "GIT_COMMITTER_EMAIL": "t@t"} + _sp.run(["git", "init", "-b", "main", str(tmp_path)], check=True, + capture_output=True, env=env) + (tmp_path / "old.py").write_text("c\n") + _sp.run(["git", "-C", str(tmp_path), "add", "-A"], check=True, + capture_output=True, env=env) + _sp.run(["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, env=env) + _sp.run(["git", "-C", str(tmp_path), "mv", "old.py", "new.py"], + check=True, capture_output=True, env=env) + + # Allow only one side of the rename. 
+ allowed = {(tmp_path / "old.py").resolve()} + revert_out_of_scope_changes_with_dirs( + tmp_path, allowed_dirs=set(), allowed_files=allowed + ) + + status = _sp.run( + ["git", "-C", str(tmp_path), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"Partial rename must be fully undone; got: {status!r}" + ) def test_handles_git_status_failure(self): with patch(f"{MODULE}.subprocess.run", return_value=_cp(returncode=1)): diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index cc0665ed7..77d73d576 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -18,12 +18,14 @@ from pdd.agentic_sync import ( _apply_architecture_corrections, _analyze_global_sync_modules, + _arch_path_in_scope, _architecture_module_basenames, _architecture_sync_modules, _augment_architecture_from_pr_branch, _build_scoped_global_dep_graph, _branch_diff_is_runtime_llm_only, _detect_modules_from_branch_diff, + _enforce_orchestrator_scope, _filter_already_synced, _find_project_root, _is_catchall_match, @@ -31,6 +33,7 @@ _is_runtime_llm_template, _llm_fix_dry_run_failure, _load_architecture_json, + _extract_allowed_write_paths, _parse_llm_response, _print_global_sync_plan, _resolve_module_cwd, @@ -41,8 +44,10 @@ run_agentic_sync, run_global_sync, ) +from pdd.agentic_common import IssueContract from pdd.agentic_sync_runner import ( DepGraphFromArchitectureResult, + _hash_baseline_paths, build_dep_graph_from_architecture, ) @@ -192,6 +197,64 @@ def test_deps_valid_case_insensitive(self): assert valid2 is False +class TestExtractAllowedWritePaths: + """ + Issue #1013 (F1, F3, F16): the deprecated ``_extract_allowed_write_paths`` + wrapper now delegates to :func:`pdd.agentic_common.parse_issue_contract`, + which only recognizes two structured contract formats: HTML-comment + blocks and heading+fenced-block. 
The legacy loose-markdown parsing tested + here previously is intentionally NOT supported by the new contract API — + deeper coverage lives in ``tests/test_agentic_common.py``. + """ + + def test_extracts_split_contract_allowed_paths_from_fenced_block(self): + issue = """ +## Allowed Write Set +```text +pdd/update_main.py +pdd/prompts/update_main_python.prompt +tests/test_update_main.py +``` + +But sync wrote other files. +""" + assert _extract_allowed_write_paths(issue) == [ + "pdd/update_main.py", + "pdd/prompts/update_main_python.prompt", + "tests/test_update_main.py", + ] + + def test_extracts_split_contract_allowed_paths_from_html_comment(self): + issue = """ +Some discussion. + + + +More discussion. +""" + assert _extract_allowed_write_paths(issue) == [ + "pdd/update_main.py", + "tests/test_update_main.py", + ] + + def test_returns_empty_without_contract_marker(self): + assert _extract_allowed_write_paths("Touch `pdd/foo.py` if needed.") == [] + + def test_ignores_loose_markdown_bullets_without_structured_block(self): + # The legacy markdown-bullet format is no longer supported; the new + # contract API requires either an HTML-comment or a fenced block. + issue = """ +## Split Contract +Allowed write set: + + * `pdd/update_main.py` + * `tests/test_update_main.py` +""" + assert _extract_allowed_write_paths(issue) == [] + + # --------------------------------------------------------------------------- # _apply_architecture_corrections # --------------------------------------------------------------------------- @@ -1651,157 +1714,1946 @@ def test_durable_mode_uses_durable_runner( # --------------------------------------------------------------------------- -# _resolve_module_cwd +# Iter-26: orchestrator-level scope guard for the LLM dependency-correction +# step. The per-module scope guard runs INSIDE the runner; the dependency- +# correction step writes architecture.json BEFORE any runner exists. 
If the +# split-contract allowed write set does not include ``architecture.json``, +# the orchestrator must skip the correction so the contract is not silently +# violated. These tests cover the gate decision plus the already-synced +# early-return path which dispatches no runner at all. # --------------------------------------------------------------------------- -class TestResolveModuleCwd: - def _write_pddrc(self, path: Path, contexts: Dict[str, Any]) -> None: - """Helper to write a .pddrc file.""" - import yaml - config = {"contexts": contexts} - path.write_text(yaml.dump(config)) - def test_module_found_in_root_pddrc(self, tmp_path): - """Module matched by root .pddrc returns project root.""" - self._write_pddrc(tmp_path / ".pddrc", { - "myctx": { - "defaults": {"prompts_dir": "prompts/mymod"}, - "paths": ["src/mymod/**"], - }, - }) - result = _resolve_module_cwd("mymod/widget", tmp_path) - assert result == tmp_path +# A bullet-list contract that the parser in pdd.agentic_common recognizes. +# The ``**Allowed write set:**`` inline label is the discriminator; ``## Split +# Contract`` is just the surrounding heading. NOTE: architecture.json is NOT +# in this allow set, so the orchestrator must skip the deps-correction step. 
+_CONTRACT_BODY_ARCH_OUT_OF_SCOPE = ( + "Fix foo.\n" + "\n" + "## Split Contract\n" + "\n" + "**Allowed write set:**\n" + "\n" + "- `pdd/foo.py`\n" +) - def test_module_found_in_subdirectory_pddrc(self, tmp_path): - """Module found in subdirectory .pddrc returns that subdirectory.""" - # No root .pddrc — so subdirectory scanning is used - # Subdirectory has a matching context - sub = tmp_path / "examples" / "hello" - sub.mkdir(parents=True) - self._write_pddrc(sub / ".pddrc", { - "hello_ctx": { - "defaults": {"prompts_dir": "prompts/greeting"}, - "paths": ["src/**"], - }, - }) - result = _resolve_module_cwd("greeting/hi", tmp_path) - assert result == sub +# Same shape but architecture.json IS in the allow set — the correction should +# be applied. +_CONTRACT_BODY_ARCH_IN_SCOPE = ( + "Fix foo.\n" + "\n" + "## Split Contract\n" + "\n" + "**Allowed write set:**\n" + "\n" + "- `pdd/foo.py`\n" + "- `architecture.json`\n" +) - def test_module_not_found_falls_back_to_root(self, tmp_path): - """Module not in any .pddrc falls back to project root.""" - self._write_pddrc(tmp_path / ".pddrc", { - "other": { - "defaults": {"prompts_dir": "prompts/other"}, - "paths": ["src/other/**"], - }, - }) - result = _resolve_module_cwd("nonexistent_mod", tmp_path) - assert result == tmp_path +# Iter-28 B-2: contract allows a NESTED architecture path. Used by the +# nested-arch B-2 tests to assert the gate compares the real ``arch_path`` +# (resolved repo-relative) rather than the literal string ``architecture.json``. 
+_CONTRACT_BODY_NESTED_ARCH_IN_SCOPE = ( + "Fix foo.\n" + "\n" + "## Split Contract\n" + "\n" + "**Allowed write set:**\n" + "\n" + "- `pdd/foo.py`\n" + "- `frontend/architecture.json`\n" +) - def test_no_pddrc_falls_back_to_root(self, tmp_path): - """No .pddrc files at all returns project root.""" - result = _resolve_module_cwd("anything", tmp_path) - assert result == tmp_path - def test_deepest_match_wins(self, tmp_path): - """When multiple subdirs match, the deepest one wins.""" - # Depth 1 match - sub1 = tmp_path / "level1" - sub1.mkdir() - self._write_pddrc(sub1 / ".pddrc", { - "ctx1": { - "defaults": {"prompts_dir": "prompts/shared"}, - "paths": ["src/**"], - }, - }) - # Depth 2 match (deeper) - sub2 = sub1 / "level2" - sub2.mkdir() - self._write_pddrc(sub2 / ".pddrc", { - "ctx2": { - "defaults": {"prompts_dir": "prompts/shared"}, - "paths": ["src/**"], - }, - }) - result = _resolve_module_cwd("shared/mod", tmp_path) - assert result == sub2 +class TestDependencyCorrectionsScopeGuard: + """Verify the orchestrator-level scope gate on + ``_apply_architecture_corrections``. 
The gate runs BEFORE any runner is + dispatched, so per-module scope enforcement cannot catch this write.""" - def test_catchall_subdirectory_skipped(self, tmp_path): - """Subdirectory with catch-all '**' pattern should NOT match unrelated modules.""" - # Subdirectory with catch-all pattern - sub = tmp_path / "test_debug2" - sub.mkdir() - self._write_pddrc(sub / ".pddrc", { - "test_ctx": { - "paths": ["**"], - }, - }) - # Module that doesn't belong to test_debug2 - result = _resolve_module_cwd("bug_main", tmp_path) - # Should fall back to project root, not test_debug2 - assert result == tmp_path + @patch("pdd.agentic_sync._find_project_root") + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_skipped_when_arch_outside_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + mock_find_root, + tmp_path, + capsys, + ): + """Contract excludes architecture.json → corrections must NOT run.""" + # Iter-32 B-1: pin project root to tmp_path so the dispatch-boundary + # orchestrator scope guard sweeps a clean tmp tree (the real repo + # has dirty worktrees that would 
trip the guard). + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_OUT_OF_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + [{"filename": "foo_python.prompt", "dependencies": []}], + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) - def test_catchall_star_subdirectory_skipped(self, tmp_path): - """Subdirectory with catch-all '*' pattern should NOT match unrelated modules.""" - sub = tmp_path / "some_subdir" - sub.mkdir() - self._write_pddrc(sub / ".pddrc", { - "catch_all": { - "paths": ["*"], - }, - }) - result = _resolve_module_cwd("any_module", tmp_path) - assert result == tmp_path + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner - def test_specific_subdirectory_match_still_works(self, tmp_path): - """Subdirectory with specific path pattern should still match correctly.""" - sub = tmp_path / "frontend" - sub.mkdir() - self._write_pddrc(sub / ".pddrc", { - "components": { - "paths": ["components/**"], - }, - }) - result = _resolve_module_cwd("components/button", tmp_path) - assert result == sub + success, msg, cost, model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=False, # capture the skip-warning text + ) - def test_nested_pddrc_match_requires_matching_prompt_file(self, tmp_path): - """Broad nested contexts must not hijack similarly named root modules. + assert success is True + mock_apply_corrections.assert_not_called() + captured = capsys.readouterr() + # Warning text from the orchestrator skip branch. 
Rich console may + # line-wrap the message at any width, so we collapse whitespace + # before substring-matching for the discriminating phrase. + flat = " ".join(captured.out.split()) + assert "Sync scope guard: skipping LLM dependency corrections" in flat + assert "architecture.json is outside" in flat - The prompts-linter example has patterns like ``*llm*`` and a local - ``llm_python.prompt``. That must not claim the root ``llm_model`` - module, whose prompt exists only at the project root. - """ - (tmp_path / "prompts").mkdir() - (tmp_path / "prompts" / "llm_model_python.prompt").write_text("% root prompt") - self._write_pddrc(tmp_path / ".pddrc", { - "default": { - "defaults": {"prompts_dir": "prompts"}, - }, - }) + @patch("pdd.agentic_sync._find_project_root") + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_when_arch_in_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + mock_find_root, + tmp_path, + ): + """Contract includes architecture.json → corrections must run.""" + # Iter-32 B-1: init git in tmp_path so the dispatch-boundary + # 
orchestrator scope guard's working-tree probes succeed. + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + arch_data, + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) - nested = tmp_path / "examples" / "prompts_linter" - (nested / "prompts").mkdir(parents=True) - (nested / "prompts" / "llm_python.prompt").write_text("% nested prompt") - self._write_pddrc(nested / ".pddrc", { - "utils": { - "paths": ["*llm*"], - "defaults": {"prompts_dir": "prompts"}, - }, - }) + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner - assert _resolve_module_cwd("llm_model", tmp_path) == tmp_path - assert _resolve_module_cwd("llm", tmp_path) == nested + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) - def test_root_prompt_wins_over_nested_broad_glob(self, tmp_path): - """A root exact prompt should not be claimed by nested basename globs.""" - (tmp_path / "prompts").mkdir() - (tmp_path / "prompts" / "cli_python.prompt").write_text("% root prompt") - self._write_pddrc(tmp_path / ".pddrc", { - "default": { - "defaults": {"prompts_dir": "prompts"}, - }, - }) + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") 
+ @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_when_no_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """No contract markers → permissive mode preserves pre-iter-26 behavior.""" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + # No HTML comment, no fenced block, no ``**Allowed write set:**`` label. 
+ issue_data = { + "title": "Fix foo", + "body": "Just fix foo, no contract here.", + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + arch_data, + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_when_scope_guard_disabled( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """``--no-scope-guard`` bypasses the gate 
even when arch is out of scope.""" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_OUT_OF_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + arch_data, + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + scope_guard=False, + ) + + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.DurableSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=[]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_already_synced_early_return_does_not_leak_arch_changes( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + 
mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_durable_runner_cls, + mock_async_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """Defensive: even if every module is already synced and the runner is + never dispatched, the orchestrator must NOT write architecture.json + out-of-scope. Verifies both the mock-level assertion AND that no + ``M architecture.json`` shows up in a real git repo's ``git status`` + after the orchestrator returns its early "already synced" success. + """ + # Build a tiny git repo with a committed architecture.json so any + # subsequent write would show as a tracked modification. + repo = tmp_path / "repo" + repo.mkdir() + subprocess.run(["git", "init", "--quiet"], cwd=repo, check=True) + subprocess.run( + ["git", "config", "user.email", "test@example.com"], + cwd=repo, + check=True, + ) + subprocess.run( + ["git", "config", "user.name", "Test"], + cwd=repo, + check=True, + ) + arch_file = repo / "architecture.json" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + arch_file.write_text(json.dumps(arch_data, indent=2)) + subprocess.run( + ["git", "add", "architecture.json"], cwd=repo, check=True + ) + subprocess.run( + ["git", "commit", "--quiet", "-m", "init arch"], + cwd=repo, + check=True, + ) + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_OUT_OF_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = (arch_data, arch_file) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": repo}, [], 0.0) + + old_cwd = Path.cwd() + try: + os.chdir(repo) + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True 
+ ) + finally: + os.chdir(old_cwd) + + # Orchestrator returns the "already synced" early-success path. + assert success is True + assert "already synced" in msg.lower() + + # The gate must have refused the only out-of-contract write the + # orchestrator can perform. + mock_apply_corrections.assert_not_called() + # No runner is dispatched on the already-synced path. + mock_async_runner_cls.assert_not_called() + mock_durable_runner_cls.assert_not_called() + + # Defense-in-depth: a real git status check confirms the on-disk + # architecture.json is untouched. + status = subprocess.run( + ["git", "status", "--porcelain"], + cwd=repo, + check=True, + capture_output=True, + text=True, + ) + assert "architecture.json" not in status.stdout + + # ------------------------------------------------------------------ + # Iter-28 B-2: nested arch_path bypass + # ------------------------------------------------------------------ + + @patch("pdd.agentic_sync._find_project_root") + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_skipped_for_nested_arch_outside_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + 
mock_runner_cls, + mock_apply_corrections, + mock_find_root, + tmp_path, + ): + """Contract allows the literal string ``architecture.json`` but the + REAL arch path is ``frontend/architecture.json``. Iter-28 B-2: the + gate must compare the resolved arch path, not the bare string, so + the nested arch write is rejected.""" + # Iter-32 B-1: init git + pin root so the dispatch-boundary scope + # guard sweeps a clean tmp tree. + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + # Contract allows root architecture.json only — NOT the nested path. + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + # arch_path resolves nested: frontend/architecture.json under + # tmp_path. The literal-string gate would have matched the contract's + # ``architecture.json`` entry and let the write through; the + # resolved-path gate must NOT. 
+ nested_arch = tmp_path / "frontend" / "architecture.json" + (tmp_path / "frontend").mkdir() + mock_load_arch.return_value = (arch_data, nested_arch) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_not_called() + + @patch("pdd.agentic_sync._find_project_root") + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_for_nested_arch_in_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + mock_find_root, + tmp_path, + ): + """Contract explicitly allows ``frontend/architecture.json`` and the + arch path 
matches → gate must permit the write.""" + # Iter-32 B-1: init git so the dispatch-boundary scope guard's + # working-tree probes succeed. + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_NESTED_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + nested_arch = tmp_path / "frontend" / "architecture.json" + (tmp_path / "frontend").mkdir() + mock_load_arch.return_value = (arch_data, nested_arch) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._find_project_root") + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + 
@patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_skipped_for_arch_outside_project_root( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + mock_find_root, + tmp_path, + ): + """``arch_path`` resolves outside ``project_root`` → never in scope. + + Defense-in-depth: even if some upstream bug threads an arch path + outside the repo root into the orchestrator, ``_arch_path_in_scope`` + catches the ``ValueError`` from ``relative_to`` and returns False so + the write is refused. + """ + # Iter-32 B-1: init git + pin root so the dispatch-boundary scope + # guard sweeps a clean tmp tree. + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + # Force an arch path that resolves OUTSIDE project_root. 
+ outside_arch = (tmp_path.parent / "outside_root" / "architecture.json").resolve() + outside_arch.parent.mkdir(parents=True, exist_ok=True) + mock_load_arch.return_value = (arch_data, outside_arch) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_not_called() + + +# --------------------------------------------------------------------------- +# _arch_path_in_scope (iter-28 B-2 helper, unit-level) +# --------------------------------------------------------------------------- + + +class TestArchPathInScope: + """Unit-level coverage of the resolved-path scope check used by the + iter-26 orchestrator gate, post-iter-28 B-2.""" + + @staticmethod + def _contract(*allowed: str) -> IssueContract: + return IssueContract( + allowed_paths=tuple(allowed), + companion_allowlist=(), + source="test", + ) + + def test_no_contract_permissive(self, tmp_path): + """No contract → always in scope (pre-iter-26 behavior preserved).""" + assert _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=None, + scope_guard=True, + ) + + def test_scope_guard_disabled_bypasses_check(self, tmp_path): + """``--no-scope-guard`` → always in scope, contract ignored.""" + contract = self._contract("pdd/foo.py") # arch NOT in contract + assert _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=False, + ) + + def test_literal_arch_in_contract_match(self, tmp_path): + 
"""Root arch + contract allows ``architecture.json`` → in scope.""" + contract = self._contract("pdd/foo.py", "architecture.json") + assert _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_nested_arch_with_literal_contract_rejected(self, tmp_path): + """Nested arch + contract only allows literal ``architecture.json`` + → out of scope (the iter-28 B-2 fix).""" + contract = self._contract("pdd/foo.py", "architecture.json") + assert not _arch_path_in_scope( + tmp_path / "frontend" / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_nested_arch_with_nested_contract_match(self, tmp_path): + """Nested arch + contract names the same nested path → in scope.""" + contract = self._contract("pdd/foo.py", "frontend/architecture.json") + assert _arch_path_in_scope( + tmp_path / "frontend" / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_arch_outside_project_root_rejected(self, tmp_path): + """``arch_path`` outside the repo → out of scope (ValueError → False).""" + contract = self._contract("architecture.json") + outside = (tmp_path.parent / "outside_root" / "architecture.json").resolve() + outside.parent.mkdir(parents=True, exist_ok=True) + assert not _arch_path_in_scope( + outside, + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_empty_contract_allowed_paths_rejected(self, tmp_path): + """Empty ``allowed_paths`` tuple → no path is in scope.""" + contract = self._contract() # ``allowed_paths=()`` + assert not _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + +# --------------------------------------------------------------------------- +# _enforce_orchestrator_scope (iter-30 unified orchestrator scope guard) +# --------------------------------------------------------------------------- + + +def 
_init_git_repo(tmp_path: Path) -> None: + """Create a minimal git repo at *tmp_path* for orchestrator scope tests.""" + subprocess.run( + ["git", "init", "--quiet", "--initial-branch=main", str(tmp_path)], + check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "test@example.com"], + check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "Test"], + check=True, + ) + # Seed with a committed file so HEAD exists. + seed = tmp_path / ".pddrc" + seed.write_text("# pddrc\n", encoding="utf-8") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "-A"], + check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "--quiet", "-m", "init"], + check=True, + ) + + +def _hash_baseline_single(project_root: Path, rel: str) -> str: + """Tiny helper: SHA-1 of the file at *project_root / rel* (for baseline maps).""" + import hashlib + return hashlib.sha1((project_root / rel).read_bytes()).hexdigest() + + +class TestEnforceOrchestratorScope: + """Iter-30: unit-level coverage of the orchestrator scope guard helper. + + These tests exercise :func:`_enforce_orchestrator_scope` directly. Higher- + level integration tests that drive :func:`run_agentic_sync` end-to-end are + in :class:`TestOrchestratorScopeGuardIntegration`. + """ + + @staticmethod + def _contract(*allowed: str) -> IssueContract: + return IssueContract( + allowed_paths=tuple(allowed), + companion_allowlist=(), + source="test", + ) + + def test_no_op_when_no_contract(self, tmp_path): + """``issue_contract is None`` → permissive, returns None unconditionally.""" + _init_git_repo(tmp_path) + out_of_scope = tmp_path / "wild.py" + out_of_scope.write_text("unsanctioned\n", encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=None, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is None + # Permissive mode preserves the file on disk. 
+ assert out_of_scope.exists() + + def test_no_op_when_scope_guard_disabled(self, tmp_path): + """``scope_guard=False`` → no-op even with a contract.""" + _init_git_repo(tmp_path) + out_of_scope = tmp_path / "wild.py" + out_of_scope.write_text("unsanctioned\n", encoding="utf-8") + + contract = self._contract("pdd/foo.py") + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=False, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is None + assert out_of_scope.exists() + + def test_reverts_untracked_out_of_contract_writes(self, tmp_path): + """Untracked out-of-contract file → reverted, diagnostic returned.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + out_of_scope = tmp_path / "outside.py" + out_of_scope.write_text("oops\n", encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is not None + assert "outside.py" in result + assert "Orchestrator scope guard" in result + assert not out_of_scope.exists() + + def test_preserves_pre_existing_baseline_path(self, tmp_path): + """Pre-existing untracked file in baseline (unchanged SHA) → preserved.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + user_wip = tmp_path / "userwip.py" + user_wip.write_text("user code\n", encoding="utf-8") + + # Snapshot the baseline (matches what run_agentic_sync does). + baseline = {"userwip.py": _hash_baseline_single(tmp_path, "userwip.py")} + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed=baseline, + baseline_ignored={}, + quiet=True, + ) + # Unchanged baseline → no revert needed. 
+ assert result is None + assert user_wip.exists() + + def test_detects_baseline_clobber(self, tmp_path): + """Baseline path overwritten with different content → flagged & reverted.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + user_wip = tmp_path / "userwip.py" + user_wip.write_text("original\n", encoding="utf-8") + baseline = {"userwip.py": _hash_baseline_single(tmp_path, "userwip.py")} + + # Now clobber the baseline. + user_wip.write_text("CLOBBERED by LLM\n", encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed=baseline, + baseline_ignored={}, + quiet=True, + ) + assert result is not None + assert "userwip.py" in result + # The file is gone after the revert helper sweeps it (untracked + + # not allowed → removed). + assert not user_wip.exists() + + def test_companion_allowlist_default_auto_allows_pdd_meta(self, tmp_path): + """``.pdd/meta/*.json`` is auto-allowed by DEFAULT_SYNC_COMPANION_ALLOWLIST.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + meta_dir = tmp_path / ".pdd" / "meta" + meta_dir.mkdir(parents=True) + meta_file = meta_dir / "foo_python.json" + meta_file.write_text('{"fingerprint": "x"}', encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + # Companion-allowlisted → no revert, no diagnostic. + assert result is None + assert meta_file.exists() + + def test_pdd_audit_logs_do_not_trip_orchestrator_guard(self, tmp_path): + """Iter-36 B-1: PDD's own audit logs at ``.pdd/agentic-logs/`` written + by :func:`run_agentic_task` during the orchestrator's pre-dispatch + LLM calls MUST NOT hard-fail a contracted sync run. The audit log is + tool infrastructure (NEVER part of a contract) and the internal + allowlist auto-allows it without the contract needing to opt in. 
+ + Baseline snapshot is empty (the log appears AFTER snapshot, mid-run); + the guard MUST still return None purely on the internal allowlist + match. + """ + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + + # Audit log appears AFTER baseline snapshot — this is the realistic + # scenario: ``run_agentic_task`` writes a session record during the + # LLM call that itself happens between snapshot and guard. + log_dir = tmp_path / ".pdd" / "agentic-logs" + log_dir.mkdir(parents=True) + log_file = log_dir / "session_20251215_120000.jsonl" + log_file.write_text('{"label": "step1"}\n', encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is None, ( + f"iter-36 B-1: PDD audit log under .pdd/agentic-logs/ must be " + f"auto-allowed by the internal allowlist; got diagnostic: " + f"{result!r}" + ) + # The log must still exist — internal-allowlisted, not reverted. + assert log_file.exists() + + def test_orchestrator_guard_flags_deleted_untracked_baseline(self, tmp_path): + """Iter-36 B-3: untracked baseline files that disappear between + snapshot and guard MUST be surfaced as ``remaining`` (hard-fail) by + the orchestrator guard. Prior to iter-36 the orchestrator silently + ``continue``d on ``current_hash is None`` and lost user WIP without + a trace. Mirrors the per-module guard's iter-34 fix. + """ + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + user_wip = tmp_path / "userwip.py" + user_wip.write_text("user code\n", encoding="utf-8") + + # Snapshot the baseline (untracked WIP captured at orchestrator entry). + baseline = {"userwip.py": _hash_baseline_single(tmp_path, "userwip.py")} + + # Orchestrator deletes the WIP before runner dispatch — simulate by + # deleting the file after snapshot. 
+ user_wip.unlink() + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed=baseline, + baseline_ignored={}, + quiet=True, + ) + assert result is not None, ( + "iter-36 B-3: deletion of untracked baseline WIP must hard-fail " + "the orchestrator — silent data loss otherwise" + ) + assert "userwip.py" in result, ( + f"iter-36 B-3: deleted baseline path must appear in diagnostic, " + f"got: {result!r}" + ) + + def test_orchestrator_guard_flags_deleted_ignored_baseline(self, tmp_path): + """Iter-36 B-3 (symmetric): pre-existing gitignored baseline files + that disappear between snapshot and guard MUST also surface in the + orchestrator's ``remaining`` set. ``git ls-files --ignored`` only + lists files that currently exist, so a deleted ignored baseline + leaves no trail in the ignored-rescan loop. + """ + _init_git_repo(tmp_path) + # gitignore must be committed before the cache file is created so + # the cache is treated as a tracked-ignore at baseline time. + gi = tmp_path / ".gitignore" + gi.write_text("cache.bin\n", encoding="utf-8") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "--quiet", "-m", "gi"], + check=True, + ) + + contract = self._contract("pdd/foo.py") + cache = tmp_path / "cache.bin" + cache.write_text("user cache\n", encoding="utf-8") + + # Baseline snapshot of the ignored file. + baseline_ignored = { + "cache.bin": _hash_baseline_single(tmp_path, "cache.bin") + } + + # Orchestrator deletes the cache before runner dispatch. 
+ cache.unlink() + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored=baseline_ignored, + quiet=True, + ) + assert result is not None, ( + "iter-36 B-3: deletion of pre-existing ignored baseline must " + "hard-fail the orchestrator — git ls-files --ignored cannot see it" + ) + assert "cache.bin" in result, ( + f"iter-36 B-3: deleted ignored baseline must appear in diagnostic, " + f"got: {result!r}" + ) + + +# --------------------------------------------------------------------------- +# Orchestrator scope guard integration (iter-30) +# --------------------------------------------------------------------------- + + +class TestOrchestratorScopeGuardIntegration: + """Iter-30: integration tests that drive :func:`run_agentic_sync` and + verify the orchestrator scope guard reverts/preserves correctly at the + early-return boundary.""" + + _ISSUE_BODY_WITH_BULLET_CONTRACT = ( + "Title: feat: foo\n\n" + "## Allowed Write Set\n\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + ) + _ISSUE_BODY_NO_CONTRACT = "Title: feat: foo\n\nNo structured contract here." + + def _issue_payload(self, body: str) -> str: + return json.dumps({"title": "Test", "body": body, "comments_url": ""}) + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_reverts_out_of_contract_llm_writes( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """LLM writes an out-of-contract file during identify-modules → + orchestrator scope guard reverts before the orchestrator returns. 
+ + The mock for ``run_agentic_task`` writes ``outside.py`` to disk and + returns an empty module list, so the orchestrator hits the + "LLM identified no modules to sync" early return (line ~1925). The + scope guard wrap on that return must observe and revert the write. + """ + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + # Simulate LLM writing an out-of-contract file mid-call. + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + # Return an empty module list so we land on the + # "no modules to sync" early return inside the scope guard. + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is False + assert "Orchestrator scope guard" in msg + assert "outside.py" in msg + assert not (tmp_path / "outside.py").exists(), ( + "scope guard must revert the out-of-contract write" + ) + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_preserves_pre_existing_baseline( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """User WIP 
that pre-exists at orchestrator entry → preserved.""" + _init_git_repo(tmp_path) + # Pre-existing dirty user WIP. + (tmp_path / "userwip.py").write_text("user work in progress\n", encoding="utf-8") + + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + # LLM does NOT touch userwip.py; just returns no modules. + mock_agentic_task.return_value = ( + True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + ) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + # Pre-existing untracked file MUST be preserved by the baseline rule. + assert (tmp_path / "userwip.py").exists() + assert (tmp_path / "userwip.py").read_text(encoding="utf-8") == ( + "user work in progress\n" + ) + # Scope guard MUST NOT mention userwip.py in the diagnostic. 
+ assert "userwip.py" not in msg + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_detects_baseline_clobber( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """LLM overwrites a baseline path with new content → flagged.""" + _init_git_repo(tmp_path) + (tmp_path / "userwip.py").write_text("original\n", encoding="utf-8") + + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + # Clobber pre-existing baseline path. 
+ (tmp_path / "userwip.py").write_text( + "CLOBBERED by malicious LLM\n", encoding="utf-8" + ) + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is False + assert "Orchestrator scope guard" in msg + assert "userwip.py" in msg + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_no_op_when_no_contract( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """Permissive mode: no contract on issue → no revert, existing + behavior preserved.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_NO_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + # No contract → permissive mode → no revert, no diagnostic. 
+ assert "Orchestrator scope guard" not in msg + assert (tmp_path / "outside.py").exists() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_no_op_with_no_scope_guard_flag( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """``scope_guard=False`` → explicit opt-out, no revert, existing + behavior preserved (matches iter-26 / iter-28 gate semantics).""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + scope_guard=False, + ) + + # Explicit opt-out → no revert, no diagnostic. + assert "Orchestrator scope guard" not in msg + assert (tmp_path / "outside.py").exists() + + # --------------------------------------------------------------------- + # Iter-32 B-1: dispatch-boundary scope guard + # --------------------------------------------------------------------- + # iter-30 wrapped every EARLY-RETURN site with + # ``_orch_scope_check_return``. 
The natural completion (iter-32) is to + # also gate the SUCCESSFUL DISPATCH path so pre-dispatch out-of-contract + # writes are not snapshotted as ``_baseline_changed_paths`` by the + # runner and silently preserved for the entire sync session. + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_blocks_dispatch_when_predispatch_writes_out_of_contract( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Pre-dispatch out-of-contract write that survives past all + early-return sites → orchestrator scope guard blocks dispatch and + reverts the write before the runner is constructed.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + # Simulate the LLM identify-modules call writing an + # out-of-contract file mid-call AND returning a valid module + # list so the flow proceeds toward dispatch (skipping the + # iter-30 early-return wrap). 
+ (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + # Skip dry-run early-return: report success with a cwd for "foo". + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + # Skip "all already synced" early-return: keep "foo" in the list. + mock_filter_synced.return_value = ["foo"] + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is False + assert "before dispatch" in msg + assert "outside.py" in msg + assert not (tmp_path / "outside.py").exists(), ( + "dispatch-boundary scope guard must revert the out-of-contract write" + ) + # Runner was NEVER constructed because dispatch was aborted. + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_allows_dispatch_when_clean( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Clean working tree at dispatch boundary → runner constructed + and dispatched normally.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + 
mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + # LLM does NOT write anything out-of-contract; returns a valid + # module list. + mock_agentic_task.return_value = ( + True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + # Provide a runnable runner mock so the dispatch can complete. + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + assert msg == "ok" + assert "before dispatch" not in msg + # Runner WAS constructed and .run() WAS called. + mock_runner_cls.assert_called_once() + mock_runner_cls.return_value.run.assert_called_once() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_dispatch_check_is_no_op_with_no_contract( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Permissive mode (no contract markers) → dispatch-boundary check + is a no-op even when the LLM wrote out-of-contract files.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, 
self._issue_payload(self._ISSUE_BODY_NO_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + # Permissive mode → dispatch proceeds, no revert, file preserved. + assert success is True + assert "before dispatch" not in msg + assert (tmp_path / "outside.py").exists() + mock_runner_cls.assert_called_once() + mock_runner_cls.return_value.run.assert_called_once() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_dispatch_check_is_no_op_with_scope_guard_disabled( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """``scope_guard=False`` (opt-out) with a contract present → + dispatch-boundary check is a no-op.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + 
mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + scope_guard=False, + ) + + # Opt-out → dispatch proceeds, no revert, file preserved. + assert success is True + assert "before dispatch" not in msg + assert (tmp_path / "outside.py").exists() + mock_runner_cls.assert_called_once() + mock_runner_cls.return_value.run.assert_called_once() + + # --------------------------------------------------------------------- + # Iter-38 M-1: fail-closed baseline acquisition at orchestrator init. + # When ``_git_changed_paths`` / ``_git_ignored_paths`` return ``None`` + # (transient git lock contention, missing binary, OSError) the + # orchestrator MUST abort BEFORE any pre-dispatch LLM call or shell + # command. An empty baseline produced by a silent scan failure would + # later let the pre-dispatch scope guard revert pre-existing user WIP. 
+ + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_aborts_when_baseline_changed_scan_fails( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """Init-time ``_git_changed_paths`` returns ``None`` → orchestrator + fail-closes before any LLM call or runner construction.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + # Patch the helpers on the agentic_sync module (where they're + # imported by name) — the orchestrator looks them up here. + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", lambda _root: None) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", lambda _root: set()) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + use_github_state=False, + ) + + assert success is False + assert "fail-closed" in msg + assert "baseline" in msg + # Downstream LLM / runner construction MUST NOT have run. 
+ mock_agentic_task.assert_not_called() + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_aborts_when_baseline_ignored_scan_fails( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """Init-time ``_git_ignored_paths`` returns ``None`` → orchestrator + fail-closes before any LLM call or runner construction.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", lambda _root: set()) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", lambda _root: None) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + use_github_state=False, + ) + + assert success is False + assert "fail-closed" in msg + mock_agentic_task.assert_not_called() + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_proceeds_when_baseline_scans_succeed( + 
self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Regression: both baseline scans succeed (empty set is a valid + success result) → orchestrator proceeds normally.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + # Successful scans returning empty sets (clean worktree). + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", lambda _root: set()) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", lambda _root: set()) + + mock_agentic_task.return_value = ( + True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + assert "fail-closed" not in msg + mock_runner_cls.assert_called_once() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_does_not_fail_closed_in_permissive_mode( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + 
mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Permissive mode (no contract on issue) → baseline scan is never + invoked, so a hypothetical ``None`` return from the helpers does + NOT trigger the fail-closed abort. Run proceeds normally.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_NO_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + called = {"changed": 0, "ignored": 0} + + def fake_changed(_root): + called["changed"] += 1 + return None # Would fail-close if the gate let us reach here. + + def fake_ignored(_root): + called["ignored"] += 1 + return None + + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", fake_changed) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", fake_ignored) + + mock_agentic_task.return_value = ( + True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + # Init-time helpers are gated on (scope_guard AND issue_contract is + # not None). In permissive mode they MUST NOT be called for the + # baseline acquisition, so the fail-closed abort cannot trigger. 
+ assert called["changed"] == 0, ( + "permissive mode must not invoke the init-time changed scan" + ) + assert called["ignored"] == 0, ( + "permissive mode must not invoke the init-time ignored scan" + ) + assert success is True + assert "fail-closed" not in msg + mock_runner_cls.assert_called_once() + + +# --------------------------------------------------------------------------- +# _resolve_module_cwd +# --------------------------------------------------------------------------- + +class TestResolveModuleCwd: + def _write_pddrc(self, path: Path, contexts: Dict[str, Any]) -> None: + """Helper to write a .pddrc file.""" + import yaml + config = {"contexts": contexts} + path.write_text(yaml.dump(config)) + + def test_module_found_in_root_pddrc(self, tmp_path): + """Module matched by root .pddrc returns project root.""" + self._write_pddrc(tmp_path / ".pddrc", { + "myctx": { + "defaults": {"prompts_dir": "prompts/mymod"}, + "paths": ["src/mymod/**"], + }, + }) + result = _resolve_module_cwd("mymod/widget", tmp_path) + assert result == tmp_path + + def test_module_found_in_subdirectory_pddrc(self, tmp_path): + """Module found in subdirectory .pddrc returns that subdirectory.""" + # No root .pddrc — so subdirectory scanning is used + # Subdirectory has a matching context + sub = tmp_path / "examples" / "hello" + sub.mkdir(parents=True) + self._write_pddrc(sub / ".pddrc", { + "hello_ctx": { + "defaults": {"prompts_dir": "prompts/greeting"}, + "paths": ["src/**"], + }, + }) + result = _resolve_module_cwd("greeting/hi", tmp_path) + assert result == sub + + def test_module_not_found_falls_back_to_root(self, tmp_path): + """Module not in any .pddrc falls back to project root.""" + self._write_pddrc(tmp_path / ".pddrc", { + "other": { + "defaults": {"prompts_dir": "prompts/other"}, + "paths": ["src/other/**"], + }, + }) + result = _resolve_module_cwd("nonexistent_mod", tmp_path) + assert result == tmp_path + + def test_no_pddrc_falls_back_to_root(self, tmp_path): + """No 
.pddrc files at all returns project root.""" + result = _resolve_module_cwd("anything", tmp_path) + assert result == tmp_path + + def test_deepest_match_wins(self, tmp_path): + """When multiple subdirs match, the deepest one wins.""" + # Depth 1 match + sub1 = tmp_path / "level1" + sub1.mkdir() + self._write_pddrc(sub1 / ".pddrc", { + "ctx1": { + "defaults": {"prompts_dir": "prompts/shared"}, + "paths": ["src/**"], + }, + }) + # Depth 2 match (deeper) + sub2 = sub1 / "level2" + sub2.mkdir() + self._write_pddrc(sub2 / ".pddrc", { + "ctx2": { + "defaults": {"prompts_dir": "prompts/shared"}, + "paths": ["src/**"], + }, + }) + result = _resolve_module_cwd("shared/mod", tmp_path) + assert result == sub2 + + def test_catchall_subdirectory_skipped(self, tmp_path): + """Subdirectory with catch-all '**' pattern should NOT match unrelated modules.""" + # Subdirectory with catch-all pattern + sub = tmp_path / "test_debug2" + sub.mkdir() + self._write_pddrc(sub / ".pddrc", { + "test_ctx": { + "paths": ["**"], + }, + }) + # Module that doesn't belong to test_debug2 + result = _resolve_module_cwd("bug_main", tmp_path) + # Should fall back to project root, not test_debug2 + assert result == tmp_path + + def test_catchall_star_subdirectory_skipped(self, tmp_path): + """Subdirectory with catch-all '*' pattern should NOT match unrelated modules.""" + sub = tmp_path / "some_subdir" + sub.mkdir() + self._write_pddrc(sub / ".pddrc", { + "catch_all": { + "paths": ["*"], + }, + }) + result = _resolve_module_cwd("any_module", tmp_path) + assert result == tmp_path + + def test_specific_subdirectory_match_still_works(self, tmp_path): + """Subdirectory with specific path pattern should still match correctly.""" + sub = tmp_path / "frontend" + sub.mkdir() + self._write_pddrc(sub / ".pddrc", { + "components": { + "paths": ["components/**"], + }, + }) + result = _resolve_module_cwd("components/button", tmp_path) + assert result == sub + + def 
test_nested_pddrc_match_requires_matching_prompt_file(self, tmp_path): + """Broad nested contexts must not hijack similarly named root modules. + + The prompts-linter example has patterns like ``*llm*`` and a local + ``llm_python.prompt``. That must not claim the root ``llm_model`` + module, whose prompt exists only at the project root. + """ + (tmp_path / "prompts").mkdir() + (tmp_path / "prompts" / "llm_model_python.prompt").write_text("% root prompt") + self._write_pddrc(tmp_path / ".pddrc", { + "default": { + "defaults": {"prompts_dir": "prompts"}, + }, + }) + + nested = tmp_path / "examples" / "prompts_linter" + (nested / "prompts").mkdir(parents=True) + (nested / "prompts" / "llm_python.prompt").write_text("% nested prompt") + self._write_pddrc(nested / ".pddrc", { + "utils": { + "paths": ["*llm*"], + "defaults": {"prompts_dir": "prompts"}, + }, + }) + + assert _resolve_module_cwd("llm_model", tmp_path) == tmp_path + assert _resolve_module_cwd("llm", tmp_path) == nested + + def test_root_prompt_wins_over_nested_broad_glob(self, tmp_path): + """A root exact prompt should not be claimed by nested basename globs.""" + (tmp_path / "prompts").mkdir() + (tmp_path / "prompts" / "cli_python.prompt").write_text("% root prompt") + self._write_pddrc(tmp_path / ".pddrc", { + "default": { + "defaults": {"prompts_dir": "prompts"}, + }, + }) nested = tmp_path / "examples" / "prompts_linter" (nested / "prompts").mkdir(parents=True) @@ -2217,6 +4069,172 @@ def test_dry_run_success_rejects_changed_no_self_include_prompt_contract( assert "includes no existing module source context" in errors[0] +# --------------------------------------------------------------------------- +# _llm_fix_dry_run_failure safe-argv (iter-30 B-2 replacement) +# --------------------------------------------------------------------------- + + +class TestLlmFixDryRunSafeArgv: + """Iter-30: the orchestrator no longer executes an LLM-supplied shell + string. 
The LLM only returns ``SYNC_CWD: ``; the orchestrator + builds the argv itself and runs with ``shell=False``. Closes iter-29 B-2 + (shell injection at the orchestrator level).""" + + @staticmethod + def _llm_response(cwd_value: str) -> str: + return f"SYNC_CWD: {cwd_value}\n" + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def test_llm_fix_dry_run_uses_safe_argv_not_shell( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """LLM returns ``SYNC_CWD: subdir`` → argv is a list, shell=False.""" + subdir = tmp_path / "subdir" + subdir.mkdir() + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + self._llm_response("subdir"), + 0.01, + "anthropic", + ) + mock_subprocess.return_value = MagicMock( + returncode=0, + stdout="", + stderr="", + ) + + ok, cwd, cost, err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + assert ok is True + assert cwd == subdir.resolve() + assert err == "" + + # argv must be a list (not a string), shell must be False. + call_args, call_kwargs = mock_subprocess.call_args + argv = call_args[0] + assert isinstance(argv, list), "argv must be a list — shell=False shape" + assert call_kwargs.get("shell", False) is False + assert "--dry-run" in argv + assert "sync" in argv + assert "foo" in argv + # cwd is the resolved SYNC_CWD path, not the project root. 
+ assert call_kwargs.get("cwd") == str(subdir.resolve()) + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def test_llm_fix_dry_run_rejects_path_outside_project_root( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """LLM returns ``SYNC_CWD: /etc`` (outside project root) → reject.""" + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + self._llm_response("/etc"), + 0.01, + "anthropic", + ) + + ok, cwd, cost, err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + assert ok is False + assert cwd is None + assert "outside project root" in err + mock_subprocess.assert_not_called() + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def test_llm_fix_dry_run_rejects_legacy_sync_cmd_format( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """Stale-cache ``SYNC_CMD: pdd sync foo`` → reject with migration error.""" + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + "SYNC_CMD: pdd --force sync foo --dry-run --agentic --no-steer\n", + 0.01, + "anthropic", + ) + + ok, cwd, cost, err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + assert ok is False + assert cwd is None + assert "SYNC_CWD" in err + assert "legacy" in err.lower() + mock_subprocess.assert_not_called() + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def 
test_llm_fix_dry_run_rejects_shell_metachars_in_cwd( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """SYNC_CWD containing shell metacharacters is rejected defensively.""" + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + self._llm_response("subdir; rm -rf /"), + 0.01, + "anthropic", + ) + + ok, cwd, cost, err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + assert ok is False + assert cwd is None + assert "forbidden character" in err + mock_subprocess.assert_not_called() + + # --------------------------------------------------------------------------- # _filter_already_synced # --------------------------------------------------------------------------- diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 143ec54d5..411955384 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2550,6 +2550,1780 @@ def test_module_targets_map_display_key_to_sync_basename( assert popen_kwargs["cwd"] == str(custom_cwd) +class TestAllowedWriteSet: + """ + Issue #1013 (F14): the legacy ``allowed_write_paths`` kwarg was removed. + Only ``allowed_write_set`` is accepted by ``AsyncSyncRunner``. Deeper + behavioural coverage for the new ``_enforce_scope_guard`` helper lives + in ``TestEnforceScopeGuard`` below. 
+ """ + + def test_allowed_write_set_forces_sequential_execution(self): + runner = AsyncSyncRunner( + basenames=["a", "b"], + dep_graph={"a": [], "b": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner.max_workers == 1 + + def test_permissive_mode_when_no_contract(self): + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=None, + ) + + assert runner.allowed_write_paths is None + assert runner.scope_guard_enabled is True + # No contract → no forced sequential execution + assert runner.max_workers == 4 + + def test_explicit_empty_contract_rejects_all_changes(self): + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=[], + ) + + # Empty-but-present contract is still "contract present"; max_workers + # is forced to 1 and the allow set is the empty set (NOT None). + assert runner.allowed_write_paths == set() + assert runner.max_workers == 1 + + def test_async_runner_project_root_kwarg_overrides_cwd( + self, tmp_path, monkeypatch + ): + """Iter-18 M-1: the new keyword-only ``project_root`` kwarg MUST + override the default ``Path.cwd()`` and MUST be applied BEFORE the + baseline-changed-paths snapshot is taken — otherwise subclasses + (e.g. ``DurableSyncRunner``) cannot pin the baseline to a known + repo root. + """ + import subprocess + + # Initialise a real git repo at ``durable_root`` so the baseline + # snapshot's ``git status`` invocation actually runs. 
+ durable_root = tmp_path / "durable_root" + durable_root.mkdir() + subprocess.run( + ["git", "init", "-b", "main", str(durable_root)], + check=True, + capture_output=True, + ) + subprocess.run( + ["git", "-C", str(durable_root), "config", "user.email", "t@t.invalid"], + check=True, + capture_output=True, + ) + subprocess.run( + ["git", "-C", str(durable_root), "config", "user.name", "T"], + check=True, + capture_output=True, + ) + (durable_root / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(durable_root), "add", "README.md"], + check=True, + capture_output=True, + ) + subprocess.run( + ["git", "-C", str(durable_root), "commit", "-m", "init"], + check=True, + capture_output=True, + ) + + # Dirty file inside durable_root (should appear in baseline). + (durable_root / "dirty.py").write_text("user wip") + + # The CALLER's cwd is a different directory entirely. A dirty file + # there MUST NOT leak into the runner's baseline. + caller_cwd = tmp_path / "caller_cwd" + caller_cwd.mkdir() + (caller_cwd / "out.py").write_text("dirty file in caller cwd") + monkeypatch.chdir(caller_cwd) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + project_root=durable_root, + ) + + assert runner.project_root == durable_root.resolve() + assert "dirty.py" in runner._baseline_changed_paths + assert "out.py" not in runner._baseline_changed_paths + + +class TestBaselineFailClosed: + """Issue #1013 iter-38 M-1: when the init-time baseline scan fails + (transient git lock contention, missing binary, OSError), the runner + MUST record ``_baseline_acquisition_failed=True`` and abort + :meth:`run` before any write-capable work runs. An empty baseline + indistinguishable from "scan succeeded, worktree clean" would cause + the scope guard to later flag pre-existing user WIP as out-of-scope + and revert/delete it. 
+ """ + + def test_async_runner_aborts_when_baseline_changed_scan_fails( + self, monkeypatch + ): + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: None) + monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: set()) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is True + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + + success, msg, cost = runner.run() + assert success is False + assert "fail-closed" in msg + assert "baseline" in msg + assert cost == 0.0 + + def test_async_runner_aborts_when_baseline_ignored_scan_fails( + self, monkeypatch + ): + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: set()) + monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: None) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is True + success, msg, _cost = runner.run() + assert success is False + assert "fail-closed" in msg + + def test_async_runner_aborts_when_baseline_scan_raises_oserror( + self, monkeypatch + ): + """Verify the actual subprocess exception path: ``_git_changed_paths`` + catches ``OSError`` and returns ``None``, which must propagate to + the fail-closed flag.""" + from pdd import agentic_sync_runner as mod + + def boom(*_args, **_kwargs): + raise OSError("git binary missing") + + monkeypatch.setattr(mod.subprocess, "run", boom) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is True + success, msg, _cost = 
runner.run() + assert success is False + assert "fail-closed" in msg + + def test_async_runner_no_flag_when_baseline_scan_fails_in_permissive_mode( + self, monkeypatch + ): + """When ``allowed_write_set`` is ``None`` (permissive), the helpers + are never called and no failure can be recorded — the run proceeds.""" + from pdd import agentic_sync_runner as mod + + called = {"changed": 0, "ignored": 0} + + def fake_changed(_root): + called["changed"] += 1 + return None + + def fake_ignored(_root): + called["ignored"] += 1 + return None + + monkeypatch.setattr(mod, "_git_changed_paths", fake_changed) + monkeypatch.setattr(mod, "_git_ignored_paths", fake_ignored) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=None, + ) + + # Gate is off → helpers MUST NOT be called and the flag MUST be False. + assert called["changed"] == 0 + assert called["ignored"] == 0 + assert runner._baseline_acquisition_failed is False + + def test_async_runner_no_flag_when_scope_guard_disabled(self, monkeypatch): + """``scope_guard_enabled=False`` skips the baseline scan entirely — + even an OSError from ``subprocess.run`` cannot trigger fail-closed.""" + from pdd import agentic_sync_runner as mod + + def boom(*_args, **_kwargs): + raise OSError("would explode if reached") + + monkeypatch.setattr(mod.subprocess, "run", boom) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + scope_guard_enabled=False, + ) + + assert runner._baseline_acquisition_failed is False + + def test_async_runner_no_flag_when_scan_succeeds(self, monkeypatch): + """Regression: a successful scan returning an EMPTY set (clean + worktree) must NOT trigger fail-closed — only ``None`` does.""" + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: set()) + 
monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: set()) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is False + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + + +class TestEnforceScopeGuard: + """Issue #1013 (F9): direct behavioural coverage for ``_enforce_scope_guard`` + and ``_matches_companion_allowlist``. The constructor-state checks above + establish baseline; these tests exercise the methods themselves. + """ + + def _make_runner(self, **kwargs): + defaults = { + "basenames": ["mod"], + "dep_graph": {"mod": []}, + "sync_options": {}, + "github_info": None, + "quiet": True, + } + defaults.update(kwargs) + return AsyncSyncRunner(**defaults) + + def test_returns_none_when_permissive_mode(self, tmp_path): + runner = self._make_runner(allowed_write_set=None) + assert runner._enforce_scope_guard("mod", tmp_path) is None + + def test_returns_none_when_scope_guard_disabled(self, tmp_path): + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + scope_guard_enabled=False, + ) + assert runner._enforce_scope_guard("mod", tmp_path) is None + + def test_companion_allowlist_strict_pathlib_match(self): + """F3: ``.pdd/meta/*.json`` must NOT match nested directories.""" + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + # Top-level companion match → allowed + assert runner._matches_companion_allowlist( + ".pdd/meta/foo_python.json", runner.companion_allowlist + ) is True + # Nested under companion dir → rejected (pathlib semantics) + assert runner._matches_companion_allowlist( + ".pdd/meta/nested/foo_python.json", runner.companion_allowlist + ) is False + # Unrelated path → rejected + assert runner._matches_companion_allowlist( + "pdd/unrelated.py", runner.companion_allowlist + ) 
is False + + def test_companion_allowlist_unions_default(self): + """F4: caller-supplied allowlist is always unioned with the default.""" + from pdd.agentic_common import DEFAULT_SYNC_COMPANION_ALLOWLIST + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=["docs/*.md"], + ) + # Both caller pattern AND default appear in effective allowlist + assert "docs/*.md" in runner.companion_allowlist + for default in DEFAULT_SYNC_COMPANION_ALLOWLIST: + assert default in runner.companion_allowlist + + def test_diagnostic_format_has_scope_guard_reverted_prefix( + self, tmp_path, monkeypatch + ): + """Verify the spec-required diagnostic prefix is emitted.""" + # Stub the revert helpers so the test does not require a real git repo: + # _enforce_scope_guard composes the diagnostic from their return values. + from pdd import agentic_sync_runner as mod + + offending = tmp_path / "out_of_scope.txt" + offending.write_text("oops") + monkeypatch.setattr( + mod, + "_revert_out_of_scope_changes", + lambda _root, _allowed: [offending], + ) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.contract_source = "html-comment" + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + assert diagnostic is not None + assert diagnostic.startswith( + "Scope guard reverted 1 out-of-scope file(s) for module 'mod' " + "(contract source: html-comment):" + ) + assert "Allowed write set:" in diagnostic + assert "Companion allowlist:" in diagnostic + + def test_companion_glob_scoped_to_module_cwd_not_sibling( + self, tmp_path, monkeypatch + ): + """Iter-3 F1: a sibling module's companion artifact (under a different + ``module_cwd``) must NOT be auto-allowed. The rglob must scope to + the current module's cwd only. 
+ """ + from pdd import agentic_sync_runner as mod + + # Build a fake repo with two module dirs; place ``.pdd/meta/foo.json`` + # under EACH so the companion glob would match both if scanned at + # repo level. + repo = tmp_path + module_a = repo / "mod_a" + module_b = repo / "mod_b" + for m in (module_a, module_b): + (m / ".pdd" / "meta").mkdir(parents=True) + (m / ".pdd" / "meta" / "x.json").write_text("{}") + + captured_allowed = {} + + def fake_revert(repo_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + # Skip git toplevel resolution; pretend repo root == tmp_path. + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: repo.resolve() + ) + + runner._enforce_scope_guard("mod_a", module_a) + + files = captured_allowed["files"] + assert (module_a / ".pdd" / "meta" / "x.json").resolve() in files + # Sibling module's companion artifact must NOT be auto-allowed. + assert (module_b / ".pdd" / "meta" / "x.json").resolve() not in files + + def test_run_entry_does_not_log_permissive_mode_again(self, capsys): + """Iter-18 m-1: ``run_agentic_sync`` already emits one user-facing + line per invocation covering all three states (disabled / contract + loaded / no contract). The runner used to emit a second duplicate + line on ``run()`` entry — removed in iter-18 so the operator sees + a single authoritative status line. + """ + runner = self._make_runner( + allowed_write_set=None, + quiet=False, + ) + runner.basenames = [] + runner.run() + out = capsys.readouterr().out + # The runner-side duplicate is gone. 
+ assert "permissive mode" not in out + + def test_run_entry_does_not_log_opt_out_warning_again(self, capsys): + """Iter-18 m-1: caller-side log owns the opt-out warning; the + runner-side duplicate was removed. + """ + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + scope_guard_enabled=False, + quiet=False, + ) + runner.basenames = [] + runner.run() + out = capsys.readouterr().out + assert "--no-scope-guard" not in out + + def test_pre_existing_untracked_files_are_preserved(self, tmp_path): + """Iter-6 B1 (data-loss bug): a user's pre-existing untracked file + (``scratch.txt``) must NOT be removed by the scope guard. Uses a + real ``git init`` repo because the bug only reproduces when the + revert helpers actually touch the filesystem. + """ + subprocess.run(["git", "init", "-b", "main", str(tmp_path)], check=True, + capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.email", + "t@t.invalid"], check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.name", + "T"], check=True, capture_output=True) + (tmp_path / "README.md").write_text("initial") + subprocess.run(["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True) + + scratch = tmp_path / "scratch.txt" + scratch.write_text("user work-in-progress — do not delete") + assert scratch.exists() + + from unittest.mock import patch + with patch("pdd.agentic_sync_runner.Path.cwd", return_value=tmp_path): + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path + + assert "scratch.txt" in runner._baseline_changed_paths + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert scratch.exists(), ( + "scope guard incorrectly deleted user's pre-existing scratch.txt" + ) + assert diagnostic is None or 
"scratch.txt" not in diagnostic + + # --------------------------------------------------------------------- + # Iter-24 M-1: hash-aware baseline preservation + # --------------------------------------------------------------------- + + def test_baseline_preservation_clobber_is_detected( + self, tmp_path, monkeypatch + ): + """Iter-24 M-1 (baseline-clobber bug, codex iter-23 repro): a + pre-existing dirty file outside the contract MUST be flagged when + sync overwrites it with different content. Name-based preservation + (iter-6 B1) silently auto-allowed any same-named write.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing dirty file outside the contract — codex repro. + outside = tmp_path / "outside.py" + outside.write_text("user wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "outside.py" in runner._baseline_changed_paths + baseline_hash = runner._baseline_changed_paths["outside.py"] + assert baseline_hash is not None, ( + "iter-24: baseline SHA must be captured for readable files" + ) + + # Simulate sync (buggy LLM) clobbering the file with different + # content. 
+ outside.write_text("sync clobbered") + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-24: clobbered baseline file must hard-fail the module" + ) + assert "outside.py" in diagnostic, ( + "iter-24: clobbered path must appear in the diagnostic" + ) + + def test_baseline_preservation_unchanged_file_still_allowed( + self, tmp_path, monkeypatch + ): + """Iter-24 M-1 (iter-6 B1 regression): an unchanged pre-existing + dirty file MUST still be auto-allowed (preserved). The iter-24 fix + adds content-awareness but does not break the iter-6 B1 carve-out + for unchanged user WIP.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + outside = tmp_path / "outside.py" + outside.write_text("user wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # Sync did NOT touch outside.py — content unchanged. 
+ assert outside.read_text() == "user wip" + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is None, ( + f"iter-24: unchanged baseline file must be preserved, got: " + f"{diagnostic!r}" + ) + + def test_baseline_preservation_deleted_file_drops_from_allowed( + self, tmp_path, monkeypatch + ): + """Iter-24 M-1: when a baseline path is deleted between init and + scope-guard time, the iter-24 logic skips it (``current_hash is + None``). The deleted path MUST NOT appear in ``allowed_files`` — + use a TRACKED-AND-MODIFIED baseline path so ``git status`` shows + the deletion (advisor #3 fix). Capture the revert helpers' allowed + set to assert directly.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + # Commit ``outside.py`` so it's tracked; then dirty it. ``git + # status`` will list the subsequent deletion as ``D ``. 
+ (tmp_path / "outside.py").write_text("tracked content") + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "outside.py", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + outside = tmp_path / "outside.py" + outside.write_text("dirty content") # now tracked + modified + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "outside.py" in runner._baseline_changed_paths + + # Simulate the file being deleted between baseline snapshot and + # scope-guard run. + outside.unlink() + + captured_allowed: Dict[str, set] = {} + + def fake_revert(_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + runner._enforce_scope_guard("mod", tmp_path) + + # The KEY assertion: the deleted baseline path was NOT auto-allowed. 
+ deleted_abs = (tmp_path / "outside.py").resolve() + assert deleted_abs not in captured_allowed["files"], ( + "iter-24: deleted baseline path must not be in allowed_files; " + f"got: {captured_allowed['files']}" + ) + + # --------------------------------------------------------------------- + # Iter-34 M-3: baseline-deletion blind spot + # --------------------------------------------------------------------- + + def test_baseline_deletion_of_untracked_file_is_flagged( + self, tmp_path, monkeypatch + ): + """Iter-34 M-3 (baseline-deletion blind spot, codex iter-33): a + user's pre-existing UNTRACKED dirty file that gets deleted during + sync MUST hard-fail the module. Untracked baselines have no git + record, so ``git status`` leaves no trail after deletion — the + iter-24 logic dropped the baseline entry on ``current_hash is + None`` and silently lost the WIP.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing UNTRACKED WIP outside the contract. 
+ userwip = tmp_path / "userwip.py" + userwip.write_text("wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "userwip.py" in runner._baseline_changed_paths, ( + "iter-34: untracked WIP must be captured in baseline" + ) + assert runner._baseline_changed_paths["userwip.py"] is not None, ( + "iter-34: baseline SHA must be captured for readable WIP" + ) + + # Simulate sync deleting the file (e.g. refactor removed it). + userwip.unlink() + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-34: deletion of untracked baseline WIP must hard-fail " + "the module — silent data loss otherwise" + ) + assert "userwip.py" in diagnostic, ( + f"iter-34: deleted baseline path must appear in diagnostic, " + f"got: {diagnostic!r}" + ) + + def test_baseline_deletion_of_ignored_file_is_flagged( + self, tmp_path, monkeypatch + ): + """Iter-34 M-3 (symmetric): a pre-existing gitignored baseline + file that gets deleted during sync MUST hard-fail the module. 
+ ``git ls-files --ignored`` only lists files that currently exist, + so a deletion is invisible to the ignored-rescan loop — a + dedicated baseline iteration is required.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / ".gitignore").write_text("cache.bin\n") + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing gitignored file BEFORE the runner is constructed. + cache = tmp_path / "cache.bin" + cache.write_text("user cache") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "cache.bin" in runner._baseline_ignored_paths, ( + "iter-34: pre-existing ignored file must be captured in baseline" + ) + assert runner._baseline_ignored_paths["cache.bin"] is not None, ( + "iter-34: baseline SHA must be captured for readable ignored file" + ) + + # Simulate sync deleting the gitignored cache. 
+ cache.unlink() + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-34: deletion of ignored baseline file must hard-fail " + "the module — git ls-files --ignored leaves no trail" + ) + assert "cache.bin" in diagnostic, ( + f"iter-34: deleted ignored baseline must appear in diagnostic, " + f"got: {diagnostic!r}" + ) + + # --------------------------------------------------------------------- + # Iter-40 M-1: unreadable vs missing baseline discrimination + # --------------------------------------------------------------------- + + def test_baseline_unreadable_file_is_not_misclassified_as_deleted( + self, tmp_path, monkeypatch + ): + """Iter-40 M-1: an unreadable (but still on disk) baseline file + MUST NOT be flagged as deleted. The iter-34 deletion-detection + branch previously collapsed "file gone" and "file unreadable" + (permission flip, locked file) to the same ``current_hash is None`` + signal and treated both as deletions. That falsely claimed the + file was removed AND asked downstream revert helpers to remove + a still-present path. 
The :func:`_classify_baseline_path` helper + distinguishes the two; the unreadable case falls through to the + legacy preserve-by-name carve-out.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing UNTRACKED WIP — the same shape as iter-34's test. + userwip = tmp_path / "userwip.py" + userwip.write_text("wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "userwip.py" in runner._baseline_changed_paths, ( + "iter-40: untracked WIP must be captured in baseline" + ) + assert runner._baseline_changed_paths["userwip.py"] is not None, ( + "iter-40: baseline SHA must be captured for readable WIP" + ) + + # Iter-40: simulate permission flip / locked file by injecting a + # ``PermissionError`` for the baseline path while leaving the file + # itself on disk. Patching the module's ``open`` attribute is + # more reliable than ``os.chmod(path, 0)`` — macOS root, + # filesystem ACLs, and Spotlight can all defeat the latter. The + # module's function bodies use the unqualified name ``open``, + # which Python's LEGB lookup resolves via the module namespace + # before falling back to the builtin, so a ``setattr(mod, "open", + # ...)`` does intercept the call. 
+ import builtins + wip_resolved = userwip.resolve() + builtin_open = builtins.open + + def _open_with_block(path, *args, **kwargs): + try: + resolved = Path(path).resolve() + except (OSError, TypeError): + return builtin_open(path, *args, **kwargs) + if resolved == wip_resolved: + raise PermissionError("simulated permission flip") + return builtin_open(path, *args, **kwargs) + + monkeypatch.setattr(mod, "open", _open_with_block, raising=False) + + captured_allowed: Dict[str, set] = {} + + def fake_revert(_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + # The file is still on disk, so it cannot have been "deleted": + # the diagnostic must NOT flag it as out-of-scope. + assert userwip.exists(), ( + "iter-40 precondition: the unreadable file must still exist" + ) + assert diagnostic is None or "userwip.py" not in diagnostic, ( + "iter-40: unreadable baseline file must be preserved by name " + "(not flagged as deleted). " + f"diagnostic={diagnostic!r}" + ) + # The path must end up in the allowed set so downstream revert + # helpers do not try to remove it. + assert wip_resolved in captured_allowed["files"], ( + "iter-40: unreadable baseline path must be auto-allowed " + "(preserve by name) so revert helpers do not remove it; " + f"got allowed_files={captured_allowed['files']}" + ) + + def test_baseline_missing_file_still_flagged_as_deleted( + self, tmp_path, monkeypatch + ): + """Iter-40 M-1 regression for iter-34: actual deletion of a + pre-existing untracked baseline file MUST still hard-fail the + module. 
The iter-40 discrimination helper preserves the iter-34 + behaviour for the genuinely-missing case.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + userwip = tmp_path / "userwip.py" + userwip.write_text("wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # GENUINE deletion — iter-34 path must still fire. + userwip.unlink() + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-40: genuinely-deleted untracked baseline must still " + "hard-fail the module — iter-34 regression" + ) + assert "userwip.py" in diagnostic, ( + f"iter-40: deleted baseline path must still appear in diagnostic, " + f"got: {diagnostic!r}" + ) + + def test_wildcard_only_companion_pattern_does_not_auto_allow( + self, tmp_path, monkeypatch + ): + """Iter-10 M-1: a wildcard-only pattern (``**/*``) that bypassed the + parser (e.g. 
constructed directly, injected through a non-issue + code path) MUST NOT cause ``_matches_companion_allowlist`` to + auto-allow repo-wide writes. The defense-in-depth filter inside + the runner rejects wildcard-only patterns the same way the parser + does.""" + from pdd import agentic_sync_runner as mod + + repo = tmp_path + # Create an out-of-scope file under the module's cwd. + unrelated = repo / "unrelated" + unrelated.mkdir() + offending = unrelated / "file.py" + offending.write_text("out of scope") + + captured_allowed = {} + + def fake_revert(repo_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + # Pretend we reverted the out-of-scope file so the diagnostic + # path returns a non-None message; the assertion below is on + # the auto-allow decision, not on the revert mechanics. + return [offending] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + # Inject the dangerous wildcard-only pattern directly, + # bypassing the parser. + companion_allowlist=["**/*"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: repo.resolve() + ) + # iter-9 M-1 re-scan needs git; stub it out — this test only + # cares about the auto-allow decision feeding ``fake_revert``. + monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [], + ) + + # The defense-in-depth filter must reject ``**/*`` directly. + assert runner._matches_companion_allowlist( + "unrelated/file.py", ("**/*",) + ) is False + + diagnostic = runner._enforce_scope_guard("mod", repo) + # The offending file must NOT be in the auto-allowed set despite + # the wildcard-only pattern living in self.companion_allowlist. 
+ assert offending.resolve() not in captured_allowed.get("files", set()), ( + "wildcard-only companion pattern must NOT auto-allow " + "repo-wide writes (iter-10 M-1)" + ) + # Because fake_revert returned the offending file, the diagnostic + # is non-None — confirming the scope guard did flag it. + assert diagnostic is not None + + def test_nested_meta_path_is_not_auto_allowed( + self, tmp_path, monkeypatch + ): + """Iter-14 M-1: the default companion pattern ``.pdd/meta/*.json`` + was previously matched with ``PurePosixPath.match`` (suffix-based), + which falsely matched any path ending in ``.pdd/meta/.json`` + — including ``subdir/.pdd/meta/bar.json``. The anchored matcher + MUST treat the default pattern as TOP-LEVEL, so a nested path is + out of scope even though it carries the right basename and dir + suffix. + """ + from pdd import agentic_sync_runner as mod + + repo = tmp_path + # Create an out-of-scope file at a NESTED .pdd/meta path — the + # exact bug shape: a fingerprint-shaped file under a subdir. + nested = repo / "subdir" / ".pdd" / "meta" + nested.mkdir(parents=True) + offending = nested / "bar.json" + offending.write_text("{}", encoding="utf-8") + + captured_allowed = {} + + def fake_revert(repo_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [offending] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: repo.resolve() + ) + monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [], + ) + + # Direct matcher assertion: anchored, segment-aware match must + # REJECT a nested .pdd/meta/*.json path against the top-level + # pattern (the iter-14 M-1 bug shape). 
+ assert runner._matches_companion_allowlist( + "subdir/.pdd/meta/bar.json", (".pdd/meta/*.json",) + ) is False + + diagnostic = runner._enforce_scope_guard("mod", repo) + # Nested file must NOT be auto-allowed even though it is shaped + # like the default companion pattern. + assert offending.resolve() not in captured_allowed.get("files", set()), ( + "nested .pdd/meta path must NOT be auto-allowed by the " + "default top-level companion pattern (iter-14 M-1)" + ) + assert diagnostic is not None + + def test_deleted_companion_in_git_status_is_preserved( + self, tmp_path, monkeypatch + ): + """Iter-4 F1: when sync legitimately deletes ``.pdd/meta/foo.json``, + the file no longer exists on disk so ``rglob`` misses it. The + deletion appears in ``git status`` as tracked ``D ``; the runner + MUST still treat it as auto-allowed so the revert helper does not + resurrect it and hard-fail the module.""" + from pdd import agentic_sync_runner as mod + + # No file is created on disk — simulates the post-delete state. + deleted_rel = ".pdd/meta/old_module_python.json" + + monkeypatch.setattr( + mod, "_git_changed_paths", lambda _root: {deleted_rel} + ) + captured = {} + + def fake_revert(repo_root, allowed_files): + captured["allowed"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + # Iter-34 M-3: clear the baseline snapshot taken from the real + # working tree so this test isolates the rglob/_git_changed_paths + # companion-allowlist behavior under inspection. 
+ runner._baseline_changed_paths = {} + runner._baseline_ignored_paths = {} + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + # The iter-9 M-1 post-revert re-scan calls ``git status``; this test + # uses a tmp_path without a real ``git init`` and only cares about + # the rglob/_git_changed_paths companion-allowlist behavior, so stub + # the re-scan to return [] (matching the helpers' mocked behavior). + monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [], + ) + + diagnostic = runner._enforce_scope_guard("old_module", tmp_path) + assert diagnostic is None, ( + "Deleted companion artifact should be auto-allowed, not flagged" + ) + assert (tmp_path / deleted_rel).resolve() in captured["allowed"] + + # ------------------------------------------------------------------ + # Iter-9 M-1: fail-closed boundary — re-scan after revert helpers + # ------------------------------------------------------------------ + def _init_repo(self, tmp_path): + """Create a minimal git repo so re-scan ``git status`` succeeds.""" + subprocess.run(["git", "init", "-b", "main", str(tmp_path)], check=True, + capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.email", + "t@t.invalid"], check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.name", + "T"], check=True, capture_output=True) + (tmp_path / "README.md").write_text("initial") + subprocess.run(["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True) + + def test_fail_open_regression_unrecovered_path_hard_fails( + self, tmp_path, monkeypatch + ): + """Iter-9 M-1: when both revert helpers return ``[]`` (simulating + helper failure) AND an out-of-scope file remains on disk, the scope + guard MUST hard-fail by returning a diagnostic that surfaces the + 
unrecovered path under ``Unrecovered``.""" + from pdd import agentic_sync_runner as mod + + self._init_repo(tmp_path) + # Out-of-scope untracked file the revert helper "failed" to remove. + stray = tmp_path / "stray.txt" + stray.write_text("contract violation") + + # Simulate both helpers fail-open (returning []). + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "Fail-open regression: helpers returned [] but stray.txt still " + "violates the contract; guard must hard-fail." + ) + assert "Unrecovered" in diagnostic + assert "stray.txt" in diagnostic + + def test_clean_working_tree_returns_none(self, tmp_path, monkeypatch): + """No out-of-scope files on disk AND both helpers return [] → + guard MUST return ``None`` (the in-scope path).""" + from pdd import agentic_sync_runner as mod + + self._init_repo(tmp_path) + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + # Iter-34 M-3: ``_make_runner`` snapshots the baseline from the + # current working directory (the real ``pdd`` repo when the test + # didn't chdir). Reset the snapshot to ``{}`` so the post-fix + # baseline-deletion scan does not flag pre-existing dirty paths + # from the developer's working tree as silently deleted. 
+ runner._baseline_changed_paths = {} + runner._baseline_ignored_paths = {} + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + assert runner._enforce_scope_guard("mod", tmp_path) is None + + def test_mixed_success_reverted_and_unrecovered_both_surface( + self, tmp_path, monkeypatch + ): + """Helpers report some reverted paths AND additional out-of-scope + paths remain. Diagnostic MUST contain both 'reverted' and + 'Unrecovered' sections. Companion-allowlisted files (e.g. + ``.pdd/meta/.json`` under ``module_cwd``) must NOT appear + under Unrecovered — they are auto-allowed.""" + from pdd import agentic_sync_runner as mod + + self._init_repo(tmp_path) + + # Companion-allowlisted file under module_cwd: must be auto-allowed. + meta_dir = tmp_path / ".pdd" / "meta" + meta_dir.mkdir(parents=True) + companion = meta_dir / "mod_python.json" + companion.write_text("{}") + # Out-of-scope untracked file the helpers "failed" to remove. + stray = tmp_path / "stray.txt" + stray.write_text("contract violation") + + reverted_path = tmp_path / "pdd" / "reverted_already.py" + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: [reverted_path], + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None + # Reverted-side: existing diagnostic prefix surfaces the revert count. + assert "Scope guard reverted 1 out-of-scope file(s)" in diagnostic + assert "pdd/reverted_already.py" in diagnostic + # Unrecovered-side: the stray file shows up under the new section. 
+ assert "Unrecovered" in diagnostic + assert "stray.txt" in diagnostic + # Companion artifact must NOT be flagged. + assert ".pdd/meta/mod_python.json" not in diagnostic + + def test_git_status_failed_sentinel_surfaces_in_diagnostic( + self, tmp_path, monkeypatch + ): + """When the post-revert ``git status`` itself fails (timeout, missing + binary, non-zero return), ``_remaining_out_of_scope_paths`` returns + the sentinel ``['']`` and the diagnostic MUST + clearly indicate the failure under ``Unrecovered`` so operators do + not silently trust an unobservable working tree.""" + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + # Force the re-scan to report the sentinel directly. + monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [""], + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "Sentinel [''] must hard-fail the module." + ) + assert "Unrecovered" in diagnostic + # Allow the sentinel wording to evolve — just check the failure + # indicator is present. 
+ assert "git-status-failed" in diagnostic + + # --------------------------------------------------------------------- + # Iter-20 M-1: gitignored out-of-scope detection + # --------------------------------------------------------------------- + + @staticmethod + def _init_git_repo(repo_root: Path) -> None: + """Initialise a minimal committed git repo for ignored-scan tests.""" + subprocess.run( + ["git", "init", "-b", "main", str(repo_root)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(repo_root), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(repo_root), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (repo_root / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(repo_root), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(repo_root), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + def test_gitignored_out_of_scope_file_is_detected( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1: a sync that writes to a gitignored path outside the + contract (e.g. ``build/junk.txt`` under ``.gitignore: build/``) is + invisible to ``git status --untracked-files=all`` — but the second + ``git ls-files --ignored`` scan MUST catch it and surface it via the + ```` set so ``_enforce_scope_guard`` hard-fails.""" + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + (tmp_path / ".gitignore").write_text("build/\n") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "ignore build"], + check=True, capture_output=True, + ) + + # Construct the runner FIRST (empty ignored baseline), then create + # the gitignored stray AFTER — simulates sync writing it. 
+ monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + # Iter-24 M-1: baseline snapshots are now Dict[str, Optional[str]]. + assert runner._baseline_ignored_paths == {}, ( + "no ignored files exist yet — baseline must be empty" + ) + + # Sync writes a gitignored file outside the contract. + (tmp_path / "build").mkdir() + (tmp_path / "build" / "junk.txt").write_text("bad") + + # Revert helpers can't see gitignored files — simulate them + # returning empty as Codex's repro describes. + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "gitignored out-of-scope file must hard-fail the module" + ) + assert "build/junk.txt" in diagnostic + assert "Unrecovered" in diagnostic + + def test_gitignored_baseline_file_is_not_falsely_flagged( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1: pre-existing gitignored files (snapshotted at runner + init) MUST NOT be flagged as the sync run's out-of-scope writes.""" + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + (tmp_path / ".gitignore").write_text("cache.bin\n") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "ignore cache"], + check=True, capture_output=True, + ) + + # Create the gitignored file BEFORE constructing the runner so it + # lands in the baseline ignored set. 
+ (tmp_path / "cache.bin").write_text("user cache") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "cache.bin" in runner._baseline_ignored_paths, ( + "pre-existing ignored file must be captured in baseline" + ) + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is None, ( + "pre-existing ignored file must not be flagged: " + f"got {diagnostic!r}" + ) + + def test_gitignored_companion_artifact_is_allowed( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1 integration: when the user has ``.pdd/`` in + ``.gitignore`` (this very project does — see commit a7ce5f0ee), the + fingerprint metadata file ``.pdd/meta/mod_python.json`` is gitignored + — but it's also in the default companion allowlist, so the existing + ``rglob`` + companion-match path in ``_enforce_scope_guard`` adds it + to ``allowed_files``. The new ignored-files scan MUST skip it. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + (tmp_path / ".gitignore").write_text(".pdd/\n") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "ignore .pdd"], + check=True, capture_output=True, + ) + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # Sync writes a companion-allowed fingerprint file — also gitignored. 
+ (tmp_path / ".pdd" / "meta").mkdir(parents=True) + (tmp_path / ".pdd" / "meta" / "mod_python.json").write_text("{}") + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is None, ( + "gitignored companion artifact must remain auto-allowed: " + f"got {diagnostic!r}" + ) + + def test_ignored_scan_failure_returns_sentinel( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1 fail-closed: when the ``git ls-files --ignored`` scan + itself fails (here: forced FileNotFoundError), the remaining-paths + helper MUST return the existing ```` sentinel so + ``_enforce_scope_guard`` hard-fails rather than treating an + unobservable worktree as clean. + + The status scan succeeds (returning an empty set); only the ignored + scan fails. The sentinel symmetry across both scans is what the + iter-19 review flagged as required. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + real_run = subprocess.run + + def fake_run(cmd, *args, **kwargs): + # The status scan keeps working; the ignored scan blows up. 
+ if ( + isinstance(cmd, list) + and "ls-files" in cmd + and "--ignored" in cmd + ): + raise FileNotFoundError("git ls-files unavailable") + return real_run(cmd, *args, **kwargs) + + monkeypatch.setattr( + mod.subprocess, "run", fake_run + ) + + result = runner._remaining_out_of_scope_paths( + tmp_path.resolve(), allowed_files=set() + ) + assert result == [""], ( + "ignored-scan failure must surface the sentinel" + ) + + # --------------------------------------------------------------------- + # Iter-36 B-1/B-2: PDD-internal-path allowlist + # --------------------------------------------------------------------- + + def test_pdd_audit_logs_do_not_trip_runner_guard( + self, tmp_path, monkeypatch + ): + """Iter-36 B-1: PDD's own audit logs at ``.pdd/agentic-logs/`` written + by :func:`run_agentic_task` during a per-module sync MUST NOT + hard-fail the per-module scope guard. The audit log is tool + infrastructure (NEVER part of a contract) and the internal allowlist + auto-allows it. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + monkeypatch.chdir(tmp_path) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # Audit log appears AFTER runner init — simulates run_agentic_task + # writing a session record during the per-module subprocess. + log_dir = tmp_path / ".pdd" / "agentic-logs" + log_dir.mkdir(parents=True) + log_file = log_dir / "session_20251215_120000.jsonl" + log_file.write_text('{"label": "step1"}\n', encoding="utf-8") + + # Real revert helpers — internal allowlist must keep the log alive. 
+ monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + assert diagnostic is None, ( + f"iter-36 B-1: PDD audit log under .pdd/agentic-logs/ must be " + f"auto-allowed by the internal allowlist in the per-module " + f"guard; got diagnostic: {diagnostic!r}" + ) + assert log_file.exists(), ( + "iter-36 B-1: internal-allowlisted audit log must not be removed" + ) + + def test_pdd_audit_logs_do_not_trip_runner_guard_multi_module( + self, tmp_path, monkeypatch + ): + """Iter-36 B-1/B-2 multi-module variant: when ``module_cwd`` is a + subdirectory (multi-module sync), the audit log under + ``/.pdd/agentic-logs/`` is REPO-rooted, not module-rooted. + The internal allowlist pass must scan repo-rooted so it still matches + — a module-rooted-only pass would miss it. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + monkeypatch.chdir(tmp_path) + + # Module lives at /mod_a/; audit log lives at + # /.pdd/agentic-logs/. + module_cwd = tmp_path / "mod_a" + module_cwd.mkdir() + log_dir = tmp_path / ".pdd" / "agentic-logs" + log_dir.mkdir(parents=True) + log_file = log_dir / "session_20251215_120000.jsonl" + log_file.write_text('{"label": "step1"}\n', encoding="utf-8") + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod_a", module_cwd) + assert diagnostic is None, ( + f"iter-36 B-1/B-2: in multi-module sync the audit log lives at " + f"repo-rooted .pdd/agentic-logs/, NOT module-rooted; a " + f"module-rooted-only allowlist pass would miss it. 
" + f"Got diagnostic: {diagnostic!r}" + ) + assert log_file.exists() + + def test_runner_state_file_does_not_trip_per_module_guard( + self, tmp_path, monkeypatch + ): + """Iter-36 B-2: ``.pdd/agentic_sync_state.json`` written by + :meth:`AsyncSyncRunner._record_result` after the previous module's + scope guard runs is on disk when the NEXT module's guard runs in a + multi-module sync. Without the internal allowlist, the next module's + guard hard-fails on the previous module's state file. Verify the + state file is auto-allowed. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + monkeypatch.chdir(tmp_path) + + # Simulate previous-module state file present on disk. + state_dir = tmp_path / ".pdd" + state_dir.mkdir(parents=True, exist_ok=True) + state_file = state_dir / "agentic_sync_state.json" + state_file.write_text('{"version": 1, "modules": {}}', encoding="utf-8") + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + assert diagnostic is None, ( + f"iter-36 B-2: runner state file at .pdd/agentic_sync_state.json " + f"must be auto-allowed by the internal allowlist; " + f"got diagnostic: {diagnostic!r}" + ) + assert state_file.exists(), ( + "iter-36 B-2: internal-allowlisted state file must not be removed" + ) + + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking # --------------------------------------------------------------------------- diff --git a/tests/test_commands_maintenance.py b/tests/test_commands_maintenance.py index 15b0d657c..33f416fdc 100644 --- a/tests/test_commands_maintenance.py +++ b/tests/test_commands_maintenance.py @@ -684,6 +684,48 @@ def 
test_sync_architecture_calls_handle_error_on_exception(mock_sync_prompts, mo assert call_args[1] == "sync-architecture" +@patch('pdd.core.cli.auto_update') +@patch('pdd.commands.maintenance.run_agentic_sync') +def test_sync_no_scope_guard_flag_propagates_to_run_agentic_sync( + mock_agentic_sync, + mock_auto_update, + runner, +): + """Issue #1013 (F9, F10): ``pdd sync --no-scope-guard`` must + propagate ``scope_guard=False`` to ``run_agentic_sync``.""" + mock_agentic_sync.return_value = (True, "ok", 0.0, "model") + + result = runner.invoke( + cli.cli, + ["sync", "https://github.com/owner/repo/issues/42", "--no-scope-guard"], + ) + + assert result.exit_code == 0, result.output + mock_agentic_sync.assert_called_once() + assert mock_agentic_sync.call_args.kwargs["scope_guard"] is False + + +@patch('pdd.core.cli.auto_update') +@patch('pdd.commands.maintenance.run_agentic_sync') +def test_sync_without_no_scope_guard_defaults_to_enforcement( + mock_agentic_sync, + mock_auto_update, + runner, +): + """Issue #1013 (F10): the default for ``--no-scope-guard`` is False, so + ``scope_guard=True`` should flow through when the flag is omitted.""" + mock_agentic_sync.return_value = (True, "ok", 0.0, "model") + + result = runner.invoke( + cli.cli, + ["sync", "https://github.com/owner/repo/issues/42"], + ) + + assert result.exit_code == 0, result.output + mock_agentic_sync.assert_called_once() + assert mock_agentic_sync.call_args.kwargs["scope_guard"] is True + + @patch('pdd.core.cli.auto_update') def test_sync_architecture_uses_nearest_cwd_project(mock_auto_update, runner, tmp_path, monkeypatch): """CLI should target the nearest ancestor project, not always the repo root.""" diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 4dcc578f4..5509d8f03 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -298,6 +298,13 @@ def test_metadata_allowlist_rejects_nested_pdd_state_and_wrong_meta_scope(tmp_pa def 
test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): + """Iter-42 M-1: PDD's own infrastructure writes (``.pdd/agentic-logs/*``, + ``.pdd/agentic_sync_state.json``) match the internal allowlist and must + be treated as safe at checkpoint validation time — mirrors the async + per-module guard. Paths that sit under ``.pdd/`` but are NOT in the + internal allowlist (e.g. ``.pdd/worktrees/...``, ``.pdd/cache/...``) + remain unsafe. + """ repo = _init_repo_with_remote(tmp_path) runner = _runner(repo) @@ -312,12 +319,14 @@ def test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): "config/token.txt", "config/secrets/api.txt", ".pdd/worktrees/sync-issue-1328-foo", - ".pdd/agentic_sync_state.json", ".pdd/cache/unrelated.json", ] safe_paths = [ "src/app.py", ".pdd/meta/foo_python.json", + # Internal allowlist: tool infrastructure, never user-contracted. + ".pdd/agentic_sync_state.json", + ".pdd/agentic-logs/session_test.jsonl", ] result = runner._unsafe_staged_paths("foo", [*unsafe_paths, *safe_paths]) @@ -325,6 +334,353 @@ def test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): assert result == sorted(unsafe_paths) +def test_durable_does_not_flag_pdd_audit_logs_at_checkpoint(tmp_path: Path): + """Iter-42 M-1: PDD's own audit logs under ``.pdd/agentic-logs/`` are + tool-infrastructure side effects of running, never user-contracted. + The durable checkpoint-staging validation must mirror the async + per-module guard (iter-36 B-1/B-2) and skip ``PDD_INTERNAL_PATH_ALLOWLIST`` + matches; otherwise contracted durable runs hard-fail at checkpoint on + PDD's own audit logs. 
+ """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pdd/foo.py", ".pdd/agentic-logs/session_test.jsonl"], + "foo", + repo, + ) + assert ".pdd/agentic-logs/session_test.jsonl" not in result, ( + "PDD audit logs must NOT be flagged as out-of-contract by the " + "durable checkpoint validation (iter-42 M-1)" + ) + assert result == [] + + +def test_durable_does_not_flag_pdd_state_file_at_checkpoint(tmp_path: Path): + """Iter-42 M-1: ``.pdd/agentic_sync_state.json`` is the runner state + file — internal PDD infrastructure, NOT a contract artifact. The + durable checkpoint validation must auto-allow it via the internal + allowlist (mirrors async per-module guard). + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pdd/foo.py", ".pdd/agentic_sync_state.json"], + "foo", + repo, + ) + assert ".pdd/agentic_sync_state.json" not in result, ( + "PDD runner state file must NOT be flagged as out-of-contract " + "by the durable checkpoint validation (iter-42 M-1)" + ) + assert result == [] + + +def test_durable_still_flags_unrelated_pdd_artifacts(tmp_path: Path): + """Iter-42 M-1 (negative): the internal allowlist must NOT widen into + a generic ``.pdd/**`` bypass. A path like ``.pdd/random/junk.txt`` + that does NOT match any ``PDD_INTERNAL_PATH_ALLOWLIST`` pattern must + still be flagged as out-of-contract. 
+ """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pdd/foo.py", ".pdd/random/junk.txt"], + "foo", + repo, + ) + assert result == [".pdd/random/junk.txt"], ( + "unrelated .pdd artifacts must still be flagged as out-of-" + "contract — the internal allowlist is fixed, not a generic " + ".pdd/** bypass (iter-42 M-1 negative)" + ) + + +def test_durable_unsafe_skips_pdd_internal_allowlist(tmp_path: Path): + """Iter-42 M-1 (unsafe parity): the per-path unsafe-classification + rules in ``_unsafe_staged_paths`` would otherwise reject + ``.pdd/agentic-logs/foo.jsonl`` via the ``_pdd_path_index`` branch + (under ``.pdd/`` but NOT a recognized meta artifact). Internal + allowlist patterns must take precedence so PDD's own infrastructure + writes are not classified as unsafe at checkpoint time. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo) + + result = runner._unsafe_staged_paths( + "foo", + [ + ".pdd/agentic-logs/session_test.jsonl", + ".pdd/agentic_sync_state.json", + ".pdd/bug-state/foo.json", + # Negative control: not in internal allowlist, must still + # land in unsafe via _pdd_path_index branch. + ".pdd/random/junk.txt", + ], + ) + assert result == [".pdd/random/junk.txt"], ( + "internal allowlist matches must be skipped before unsafe-" + "classification rules run; only paths NOT in the allowlist " + "should surface as unsafe (iter-42 M-1)" + ) + + +def test_allowed_write_set_rejects_out_of_scope_checkpoint_paths(tmp_path: Path): + """ + Issue #1013 (F5, F13, F14): kwarg is now ``allowed_write_set`` (the + ``allowed_write_paths`` alias was removed) and ``.pdd/meta/*.json`` is + auto-allowed via ``DEFAULT_SYNC_COMPANION_ALLOWLIST`` — only paths + outside both the contract AND the companion allowlist are rejected. 
+ """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo, allowed_write_set=["src/app.py"]) + + assert runner._out_of_scope_staged_paths( + ["src/app.py", "architecture.json", ".pdd/meta/foo_python.json"], + "foo", + repo, + ) == ["architecture.json"] + + +def test_allowed_write_set_none_means_permissive_for_durable_runner(tmp_path: Path): + """ + Issue #1013 (F9): when no contract is parsed (``allowed_write_set=None``), + durable sync runs in permissive mode — out-of-scope rejection is a no-op. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo, allowed_write_set=None) + + assert runner._out_of_scope_staged_paths( + ["src/app.py", "architecture.json", "anything/else.txt"], + "foo", + repo, + ) == [] + + +def test_allowed_write_set_empty_rejects_everything_for_durable_runner(tmp_path: Path): + """ + Issue #1013 (F9): explicit empty contract means "reject every primary + write" — though companion artifacts still pass via the default allowlist. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo, allowed_write_set=[]) + + result = runner._out_of_scope_staged_paths( + ["src/app.py", ".pdd/meta/foo_python.json"], + "foo", + repo, + ) + assert result == ["src/app.py"] + + +def test_wildcard_only_companion_pattern_is_ignored_by_durable_runner( + tmp_path: Path, +): + """Iter-10 M-1: even if a wildcard-only pattern (``**/*``) bypasses the + parser and lands in ``self.companion_allowlist``, the durable runner's + defense-in-depth filter MUST refuse to treat it as auto-allowing + repo-wide writes.""" + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=["**/*"], + ) + + # ``**/*`` is wildcard-only, so it must NOT auto-allow ``unrelated/file.py``. 
+ assert runner._out_of_scope_staged_paths( + ["unrelated/file.py"], + "foo", + repo, + ) == ["unrelated/file.py"] + + +def test_durable_nested_meta_path_is_not_in_companion_allowlist( + tmp_path: Path, +): + """Iter-14 M-2: durable checkpoint scope checking previously used + ``PurePosixPath.match`` (suffix-based), which falsely matched + ``subdir/.pdd/meta/foo.json`` against the default top-level + ``.pdd/meta/*.json`` companion pattern. The anchored matcher MUST + refuse to auto-allow nested fingerprint-shaped paths, so staged + nested ``.pdd/meta/*.json`` files surface as out-of-scope and the + checkpoint is rejected. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # The nested path is shaped like a fingerprint-meta artifact but + # sits under ``subdir/`` — the iter-14 M-2 bug shape. + result = runner._out_of_scope_staged_paths( + ["subdir/.pdd/meta/bar.json"], + "foo", + repo, + ) + assert result == ["subdir/.pdd/meta/bar.json"], ( + "nested .pdd/meta path must NOT be auto-allowed by the " + "default top-level companion pattern (iter-14 M-2)" + ) + + +def test_multi_module_durable_companion_matched_module_relative( + tmp_path: Path, +): + """Iter-16 M-1: in a multi-module repo where ``module_cwd`` is a + SUBDIRECTORY of the worktree (``worktree/pkg``), staged paths + surface relative to the worktree git root (``pkg/.pdd/meta/foo.json``). + The companion pattern ``.pdd/meta/*.json`` is module-relative, so the + durable scope check MUST strip the module_cwd prefix before matching; + otherwise legitimate fingerprint metadata is rejected and the + checkpoint commit fails. Mirrors the async-side iter-14 M-1 part-2 + fix. 
+ """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + basenames=["pkg_mod"], + module_cwds={"pkg_mod": repo / "pkg"}, + allowed_write_set=["pkg/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # Staged path is repo-relative (``pkg/.pdd/meta/foo.json``) but the + # companion pattern describes module-relative metadata. With the + # iter-16 fix the prefix is stripped and the anchored matcher sees + # ``.pdd/meta/foo.json`` — a clean match. + result = runner._out_of_scope_staged_paths( + ["pkg/.pdd/meta/foo.json"], + "pkg_mod", + repo, + ) + assert result == [], ( + "multi-module durable runner must strip module_cwd prefix before " + "companion-pattern matching (iter-16 M-1)" + ) + + +def test_durable_sibling_module_metadata_rejected(tmp_path: Path): + """Iter-16 M-1 (sibling-module regression for F1 iter-3): when + ``module_cwd = worktree/pkg``, a sibling module's metadata path like + ``pkg_other/.pdd/meta/foo.json`` sits OUTSIDE the active module's + cwd. The companion allowlist must NOT auto-allow it; only files + UNDER the module's own cwd qualify as companion artifacts. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + basenames=["pkg_mod"], + module_cwds={"pkg_mod": repo / "pkg"}, + allowed_write_set=["pkg/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pkg_other/.pdd/meta/foo.json"], + "pkg_mod", + repo, + ) + assert result == ["pkg_other/.pdd/meta/foo.json"], ( + "sibling-module metadata must NOT be auto-allowed by the " + "companion allowlist (F1 iter-3 sibling rule, iter-16 M-1)" + ) + + +def test_single_module_durable_companion_still_matches(tmp_path: Path): + """Iter-16 M-1 (single-module regression): when ``module_cwd == + module_worktree`` (no submodule prefix), top-level + ``.pdd/meta/foo.json`` must still match the default companion + pattern. The iter-16 prefix-stripping must be a no-op in this case. 
+ """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["src/app.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + [".pdd/meta/foo.json"], + "foo", + repo, + ) + assert result == [], ( + "single-module durable runner must still auto-allow top-level " + ".pdd/meta artifacts (iter-16 M-1 single-module regression)" + ) + + +def test_single_module_durable_nested_meta_not_allowed(tmp_path: Path): + """Iter-14 M-2 (regression): single-module durable runner with + ``module_cwd == module_worktree`` must still reject a NESTED + ``subdir/.pdd/meta/foo.json`` — the anchored matcher refuses + suffix-style matches, and iter-16's prefix-stripping must not + weaken that. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["src/app.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["subdir/.pdd/meta/foo.json"], + "foo", + repo, + ) + assert result == ["subdir/.pdd/meta/foo.json"], ( + "nested .pdd/meta path must remain out-of-scope under single-" + "module mode (iter-14 M-2 regression preserved by iter-16)" + ) + + +def test_staged_rename_source_side_is_scope_checked(tmp_path: Path): + """Iter-6 B3 (rename detection bug): ``git diff --cached --name-only`` + for a staged ``git mv old new`` emits ONLY ``new``. A contract that + allows ``new`` but not ``old`` would pass validation while the rename + silently DELETES the out-of-scope ``old``. + + After the fix the durable runner uses ``--name-status -M`` so both + sides of the rename surface and the out-of-scope deletion is rejected. 
+ """ + repo = _init_repo_with_remote(tmp_path) + (repo / "src").mkdir(exist_ok=True) + (repo / "src" / "old.py").write_text("contents\n", encoding="utf-8") + _git(repo, "add", "src/old.py") + _git(repo, "commit", "-m", "add src/old.py") + _git(repo, "mv", "src/old.py", "src/new.py") + + runner = _runner(repo, allowed_write_set=["src/new.py"]) + success, message, _empty = runner._stage_module_changes("foo", repo) + + assert not success, ( + "Durable sync must reject a checkpoint that deletes src/old.py " + "even when the contract permits src/new.py." + ) + assert "src/old.py" in message, ( + f"Diagnostic must call out the out-of-scope source path; got: {message!r}" + ) + + def test_push_failure_preserves_local_checkpoint_and_next_run_pushes_it(tmp_path: Path): repo = _init_repo_with_remote(tmp_path) first = _runner(repo, runner_cls=PushFailingMetadataRunner) @@ -505,3 +861,251 @@ def test_total_budget_keeps_durable_runner_single_worker(tmp_path: Path): ) assert runner.max_workers == 1 + + +def test_durable_baseline_paths_use_git_root_not_caller_cwd( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Issue #1013 iter-18 M-1 + iter-22 M-1: ``DurableSyncRunner`` MUST NOT + inherit baseline-changed-paths from the caller's cwd. Iter-18 first + pinned the snapshot to the durable ``git_root`` (so caller-cwd dirty + files would not leak); iter-22 then made the durable baseline EMPTY by + construction (per-module worktrees are freshly-created and have no + pre-existing user WIP), so this assertion is now vacuously true but is + kept as an explicit guard against regressions. + """ + caller_cwd = tmp_path / "caller_cwd" + caller_cwd.mkdir() + # Dirty file under the caller's cwd; must NOT leak into baseline. 
+ (caller_cwd / "out.py").write_text("dirty file in caller's cwd") + + durable_root = _init_repo_with_remote(tmp_path) + monkeypatch.chdir(caller_cwd) + + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + assert runner.project_root == durable_root.resolve() + # Iter-22 M-1 invariant: durable baseline is always empty. + # Iter-24 M-1: empty dict (was empty set before the type swap). + assert "out.py" not in runner._baseline_changed_paths + assert runner._baseline_changed_paths == {} + + +def test_durable_baseline_is_empty_even_when_git_root_has_dirty_files( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Issue #1013 iter-22 M-1 (durable baseline-leakage bug): in production + ``git_root`` IS the user's main checkout where dirty WIP lives, but the + per-module sync runs in a SEPARATE ``.pdd/worktrees/sync-issue-N-mod/`` + worktree. If the durable runner inherits the main-checkout baseline, + ``_enforce_scope_guard`` resolves each baseline ``rel_posix`` against the + per-module worktree root and silently auto-allows any same-named file + written there by sync, bypassing the split contract. + + Iter-18 fixed the iter-17 regression where the snapshot was taken from + ``Path.cwd()`` BEFORE the durable runner reassigned ``project_root``; + iter-22 closes the residual leak by making the durable baseline empty + by construction. Per-module worktrees are freshly created via + ``git worktree add`` and have no pre-existing user WIP — so the + iter-6 B1 "preserve pre-existing untracked files" carve-out (which + exists for the in-place async case) has no analog here. + """ + durable_root = _init_repo_with_remote(tmp_path) + + # Dirty (untracked) file inside the durable repo root. In production this + # stands in for the user's WIP in their main checkout. 
+ (durable_root / "dirty.py").write_text("user work-in-progress") + # Also make a tracked modification so both flavours of "dirty" are + # represented in what ``git status`` would otherwise report. + (durable_root / "README.md").write_text("locally modified\n") + + monkeypatch.chdir(durable_root) + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # Iter-22 M-1: durable baseline is empty regardless of the git_root's + # state. The dirty paths from the main checkout must NOT bleed into the + # per-module worktree's allow set. + # Iter-24 M-1: empty dict (was empty set before the type swap). + assert runner.project_root == durable_root.resolve() + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + + +def test_durable_baseline_remains_empty_dict_after_init( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Issue #1013 iter-24 M-1: baseline snapshots changed from ``Set[str]`` + to ``Dict[str, Optional[str]]`` (path → SHA-1) for content-aware + preservation. The durable runner's iter-22 "clear baseline" invariant + still holds — but the cleared value is now an empty dict, not an empty + set. Iterating a dict with no entries yields nothing, so all the + ``.items()`` loops in ``_enforce_scope_guard`` and + ``_remaining_out_of_scope_paths`` remain safe no-ops in durable mode. + """ + durable_root = _init_repo_with_remote(tmp_path) + # Dirty paths in the git_root that would have populated the baseline + # under the inherited AsyncSyncRunner init path.
+ (durable_root / "dirty.py").write_text("user wip") + (durable_root / "build.log").write_text("ignored junk") + + monkeypatch.chdir(durable_root) + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + # Iter-24 invariant: the cleared baseline is a Mapping (so .items() is + # safe). Iteration yields no entries — confirms downstream loops are + # no-ops. + assert list(runner._baseline_changed_paths.items()) == [] + assert list(runner._baseline_ignored_paths.items()) == [] + + +def test_durable_scope_guard_does_not_whitelist_main_checkout_dirty_files( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Iter-22 M-1 reviewer repro: a dirty ``out.py`` in the main checkout + must NOT silently whitelist an ``out.py`` written by sync in a separate + per-module worktree. Before iter-22, the durable runner snapshotted the + main checkout's ``out.py`` into ``_baseline_changed_paths``; the scope + guard then resolved that path against the per-module worktree root and + added it to ``allowed_files``, so the contract-violating worktree + ``out.py`` slid through. + """ + from pdd import agentic_sync_runner as mod + + # Main checkout (becomes the durable runner's ``git_root``) with a dirty + # ``out.py`` standing in for the user's WIP. + main_checkout = _init_repo_with_remote(tmp_path) + (main_checkout / "out.py").write_text("user WIP in main checkout") + + # Separate worktree directory, where sync actually runs. Initialize it + # as its own git repo so ``_resolve_repo_root`` and ``git status`` + # operate locally there. 
+ worktree_path = tmp_path / "sync-worktree" + worktree_path.mkdir() + _git(worktree_path, "init", "-b", "main", ".") + _git(worktree_path, "config", "user.name", "Test User") + _git(worktree_path, "config", "user.email", "test@example.invalid") + (worktree_path / ".gitignore").write_text(".pdd/\n", encoding="utf-8") + _git(worktree_path, "add", ".gitignore") + _git(worktree_path, "commit", "-m", "initial") + + # Sync wrote ``out.py`` inside the worktree — this is the contract + # violation that must be detected. + (worktree_path / "out.py").write_text("written by sync, NOT in contract") + + monkeypatch.chdir(main_checkout) + runner = _runner( + main_checkout, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # Mock the revert helpers to return [] so the diagnostic depends purely + # on the re-scan + baseline interaction, not the helpers' behaviour. + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + # Pretend ``module_cwd`` resolves to the separate worktree. + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: worktree_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", worktree_path) + + # Without the iter-22 fix the diagnostic would be ``None`` because + # ``out.py`` from the (leaked) baseline resolves to + # ``/out.py`` and lands in ``allowed_files``. With the + # fix the baseline is empty, so ``out.py`` is correctly out of scope. 
+ assert diagnostic is not None + assert "out.py" in diagnostic + + +def test_durable_runner_aborts_before_worktree_setup_when_baseline_failed( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Iter-40 M-2 (durable init ordering): when the inherited + ``AsyncSyncRunner.__init__`` records ``_baseline_acquisition_failed=True`` + (the iter-38 fail-closed signal), the durable runner's + :meth:`~DurableSyncRunner.run` MUST abort BEFORE + :meth:`_prepare_durable_branch` runs. The iter-38 fix was added in + ``AsyncSyncRunner.run()``, but ``DurableSyncRunner.run()`` calls + ``_prepare_durable_branch()`` first — without this iter-40 hoist a + transient git scan failure would leave durable side effects + (worktree creation, branch checkout, remote pushes) in place before + the inherited fail-closed check ran.""" + from pdd import agentic_sync_runner as mod + + durable_root = _init_repo_with_remote(tmp_path) + + # Patch the baseline scan to fail. ``DurableSyncRunner.__init__`` + # forwards to ``AsyncSyncRunner.__init__`` which reads the scan and + # records ``_baseline_acquisition_failed`` on ``None``. + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: None) + monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: set()) + + monkeypatch.chdir(durable_root) + + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # The inherited init must have flagged the baseline acquisition as + # failed. (Iter-22 clears the baseline *paths* to {} but does NOT + # clear this flag — exactly so the iter-40 hoist can see it.) 
+ assert runner._baseline_acquisition_failed is True, ( + "iter-40: inherited fail-closed flag must reach the durable runner" + ) + + prepare_calls: list[bool] = [] + + def fake_prepare() -> tuple[bool, str]: + prepare_calls.append(True) + return True, "" + + monkeypatch.setattr(runner, "_prepare_durable_branch", fake_prepare) + + success, message, cost = runner.run() + + assert success is False, ( + "iter-40: durable runner must fail-closed when baseline scan failed" + ) + assert prepare_calls == [], ( + "iter-40: _prepare_durable_branch MUST NOT run when baseline " + "acquisition failed — otherwise worktree creation and branch " + f"checkout happen before the abort. Calls: {prepare_calls}" + ) + assert "fail-closed" in message, ( + f"iter-40: abort message must mention fail-closed; got: {message!r}" + ) + assert "baseline" in message, ( + f"iter-40: abort message must mention baseline; got: {message!r}" + ) + # The abort path returns ``self.initial_cost`` (0.0 for the default + # runner) — mirrors the inherited AsyncSyncRunner.run abort. + assert cost == 0.0
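Review note on the iter-14 M-2 and iter-16 M-1 tests in `tests/test_durable_sync_runner.py`: the behaviour they pin down comes from the difference between suffix-based and anchored pattern matching. The sketch below is only an illustration under stated assumptions. `anchored_match` and `module_relative` are hypothetical helpers written for this note, not the project's actual matcher, and the real implementation may differ. It shows why `PurePosixPath.match` (which matches relative patterns from the right) would auto-allow a nested `.pdd/meta` path, and how stripping the module prefix before an anchored match could admit a module's own metadata while rejecting a sibling's.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath
from typing import Optional


def anchored_match(path: str, pattern: str) -> bool:
    """Hypothetical helper (not the project's matcher): compare segment by
    segment from the left, so the pattern must cover the whole path."""
    path_parts = PurePosixPath(path).parts
    pattern_parts = PurePosixPath(pattern).parts
    if len(path_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(p, q) for p, q in zip(path_parts, pattern_parts))


def module_relative(path: str, module_prefix: str) -> Optional[str]:
    """Hypothetical helper: strip the module directory from a repo-relative
    path, or return None when the path lies outside that module."""
    parts = PurePosixPath(path).parts
    prefix = PurePosixPath(module_prefix).parts
    if parts[: len(prefix)] != prefix:
        return None
    return "/".join(parts[len(prefix):])


pattern = ".pdd/meta/*.json"

# Suffix-based matching accepts the nested path (the iter-14 M-2 shape).
suffix_nested = PurePosixPath("subdir/.pdd/meta/bar.json").match(pattern)
# Anchored matching rejects it but still accepts the top-level path.
anchored_nested = anchored_match("subdir/.pdd/meta/bar.json", pattern)
anchored_top = anchored_match(".pdd/meta/foo.json", pattern)

# Multi-module case: strip "pkg" first, then match module-relative.
own = module_relative("pkg/.pdd/meta/foo.json", "pkg")
sibling = module_relative("pkg_other/.pdd/meta/foo.json", "pkg")
own_allowed = own is not None and anchored_match(own, pattern)
sibling_allowed = sibling is not None and anchored_match(sibling, pattern)
```

Comparing whole path segments rather than raw string prefixes is what keeps `pkg_other/` from being mistaken for a path under `pkg/`, which is the sibling-module case the durable tests guard against.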