From 0047317e92cc9bbc9242badec77979f7901de9a4 Mon Sep 17 00:00:00 2001
From: PDD Bot
Date: Thu, 14 May 2026 20:43:25 +0000
Subject: [PATCH 01/42] feat: enforce split-contract allowed write sets in pdd sync (#1013)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Add scope-guard plumbing to agentic sync prompts so generated artifacts stay within the issue's allowed write set:

- agentic_common_python.prompt: add parse_issue_contract helper, IssueContract dataclass, and DEFAULT_SYNC_COMPANION_ALLOWLIST constant
- agentic_sync_runner_python.prompt: add allowed_write_set, companion_allowlist, scope_guard_enabled kwargs to AsyncSyncRunner; enforce scope guard after each per-module subprocess
- agentic_sync_python.prompt: add scope_guard kwarg to run_agentic_sync and run_global_sync; parse contract from issue body/comments and plumb through to AsyncSyncRunner / DurableSyncRunner

Reuses the existing _revert_out_of_scope_changes helper already in production for update/fix/crash/e2e-fix. Additive only: safe defaults preserve current behavior; --no-scope-guard provides opt-out.

Closes #1013

Co-Authored-By: Claude Opus 4
---
 CHANGELOG.md | 6 +++
 README.md | 41 +++++++++++++++++++
 architecture.json | 16 ++++++--
 pdd/prompts/agentic_common_python.prompt | 8 +++-
 pdd/prompts/agentic_sync_python.prompt | 22 +++++++++-
 pdd/prompts/agentic_sync_runner_python.prompt | 40 +++++++++++++++++-
 6 files changed, 126 insertions(+), 7 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index a449c73c0..0a51b1f66 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -1,3 +1,9 @@
+## Unreleased
+
+### Fix
+
+- **#1013 sync**: enforce split-contract allowed write sets.
When the linked GitHub issue declares an allowed write set (HTML comment `` or a fenced "Allowed Write Set" / "Split Contract" block), `pdd sync` now reverts tracked changes and removes untracked new files that fall outside the contract after each per-module subprocess, hard-fails the module on out-of-scope artifacts, and surfaces the contract source plus offending paths in checkup/review-loop reports. Companion artifacts under `.pdd/meta/*.json` are auto-allowed; additional companions can be opted in via the contract's `companion_allowlist` field. Use `--no-scope-guard` to opt out for a single run. Issues without a contract marker remain in permissive mode (no enforcement).
+
 ## v0.0.237 (2026-05-13)

 ### Fix
diff --git a/README.md b/README.md
index 0109335bb..b7bc452e7 100644
--- a/README.md
+++ b/README.md
@@ -872,6 +872,7 @@ Options:
 - `--durable-branch TEXT`: Durable mode only. Override the durable checkpoint branch name. Default is `sync/issue-` derived from the GitHub issue. Refused if it resolves to `main`, `master`, or the repository default branch.
 - `--no-resume`: Durable mode only. Ignore existing `PDD-Sync-Checkpoint-V1` commit trailers on the durable branch and re-run every selected module. By default, durable sync reads checkpoint trailers (`PDD-Sync-Checkpoint-V1: issue= module=`) and skips modules already checkpointed for the same issue, which is what makes a cloud rerun safely resume completed work after a partial failure.
 - `--durable-max-parallel INT`: Durable mode only. Cap how many module worktrees run concurrently. Defaults to the standard runner concurrency. A total budget still forces sequential execution.
+- `--no-scope-guard`: Issue-sync only. Disable the split-contract scope guard for this run. By default, when the linked GitHub issue declares an allowed write set (split contract), `pdd sync` enforces it and rejects out-of-scope generated artifacts. Pass this flag only when intentionally overriding contract enforcement (e.g.
recovering from a stale contract). See "Split-Contract Scope Guard" below.

 **Durable Issue Sync** (`--durable`):

@@ -1060,6 +1061,46 @@ Options (agentic mode):

 **Cross-Machine Resume**: Workflow state is stored in a hidden GitHub comment, enabling resume from any machine. Use `--no-github-state` to disable.

+**Split-Contract Scope Guard** (Issue #1013):
+
+When the linked GitHub issue declares an allowed write set (a "split contract"), `pdd sync` enforces it: each per-module subprocess is followed by a scope check that reverts tracked changes and removes untracked new files that fall outside the contract. Companion artifacts under `.pdd/meta/*.json` are auto-allowed because they are sync's own fingerprint bookkeeping; issues may opt additional companions (e.g. examples or architecture entries) into the allowlist explicitly.
+
+The contract is read from the issue body or any of its comments in one of two forms:
+
+1. An HTML-comment block (preferred; invisible in rendered Markdown):
+   ```html
+
+   ```
+2. A fenced code block under a heading like `### Allowed Write Set` or `### Split Contract`:
+   ```text
+   pdd/update_main.py
+   pdd/prompts/update_main_python.prompt
+   tests/test_update_main.py
+   ```
+
+When an out-of-scope change is detected, the run records a hard failure for that module with a diagnostic of the form:
+
+```
+Scope guard reverted N out-of-scope file(s) for module '' (contract source: ):
+  - path/relative/to/repo
+  - another/path
+Allowed write set:
+  - path/from/contract
+Companion allowlist:
+  - .pdd/meta/*.json
+```
+
+This blocks the per-module success record so dependent modules do not schedule on top of an out-of-scope sync, and checkup/review-loop reports surface the failure instead of letting unrelated artifacts land in the PR. When no contract marker is present, the scope guard falls back to permissive mode (no enforcement, no reverts), preserving existing behavior for issues that have not opted in.
Use `--no-scope-guard` to disable enforcement for a single run when you intentionally need to override the contract. + ### 1a. sync-architecture Sync `architecture.json` from prompt metadata tags (``, ``, and ``). This is useful after editing prompt metadata directly, or after backfilling prompt tags, so the architecture graph and command metadata stay aligned with the prompts. diff --git a/architecture.json b/architecture.json index 93e764b41..0528bd716 100644 --- a/architecture.json +++ b/architecture.json @@ -128,6 +128,16 @@ "name": "clear_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> None", "returns": "None" + }, + { + "name": "parse_issue_contract", + "signature": "(issue_body: Optional[str], issue_comments: Optional[List[str]] = None) -> Optional[IssueContract]", + "returns": "Optional[IssueContract]" + }, + { + "name": "_revert_out_of_scope_changes", + "signature": "(cwd: Path, allowed_paths: set[Path]) -> List[Path]", + "returns": "List[Path]" } ] } @@ -7257,7 +7267,7 @@ "functions": [ { "name": "run_agentic_sync", - "signature": "(issue_url: str, *, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, dry_run: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], timeout_adder: float, use_github_state: bool, one_session: bool, reasoning_time: Optional[float], durable: bool, durable_branch: Optional[str], no_resume: bool, durable_max_parallel: Optional[int]) -> Tuple[bool, str, float, str]", + "signature": "(issue_url: str, *, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, dry_run: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], timeout_adder: float, use_github_state: bool, one_session: bool, reasoning_time: Optional[float], durable: bool, durable_branch: Optional[str], no_resume: bool, durable_max_parallel: 
Optional[int], scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]", "sideEffects": [ "None" @@ -7265,7 +7275,7 @@ }, { "name": "run_global_sync", - "signature": "(*, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], dry_run: bool, target_coverage: Optional[float], one_session: bool, local: bool, timeout_adder: float) -> Tuple[bool, str, float, str]", + "signature": "(*, verbose: bool, quiet: bool, budget: Optional[float], skip_verify: bool, skip_tests: bool, agentic_mode: bool, no_steer: bool, max_attempts: Optional[int], dry_run: bool, target_coverage: Optional[float], one_session: bool, local: bool, timeout_adder: float, scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]", "sideEffects": [ "Runs AsyncSyncRunner for stale modules unless dry_run=True; timeout_adder is forwarded via sync_options so --timeout-adder stretches the per-module wall-clock cap on the global-sync path the same way it does on run_agentic_sync" @@ -7317,7 +7327,7 @@ "functions": [ { "name": "AsyncSyncRunner", - "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0)", + "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0, *, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True)", "returns": "AsyncSyncRunner", "sideEffects": [ "Initializes runner 
state; total_budget in sync_options forces sequential scheduling and per-child/per-retry remaining-budget caps" diff --git a/pdd/prompts/agentic_common_python.prompt b/pdd/prompts/agentic_common_python.prompt index 263db3635..15e9c177f 100644 --- a/pdd/prompts/agentic_common_python.prompt +++ b/pdd/prompts/agentic_common_python.prompt @@ -26,7 +26,9 @@ {"name": "github_clear_state", "signature": "(repo_owner: str, repo_name: str, issue_number: int, workflow_type: str, cwd: Path) -> bool", "returns": "bool"}, {"name": "load_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> Tuple[Optional[Dict], Optional[int]]", "returns": "Tuple[Optional[Dict], Optional[int]]"}, {"name": "save_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state: Dict, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True, github_comment_id: Optional[int] = None) -> Optional[int]", "returns": "Optional[int]"}, - {"name": "clear_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> None", "returns": "None"} + {"name": "clear_workflow_state", "signature": "(cwd: Path, issue_number: int, workflow_type: str, state_dir: Path, repo_owner: str, repo_name: str, use_github_state: bool = True) -> None", "returns": "None"}, + {"name": "parse_issue_contract", "signature": "(issue_body: Optional[str], issue_comments: Optional[List[str]] = None) -> Optional[IssueContract]", "returns": "Optional[IssueContract]"}, + {"name": "_revert_out_of_scope_changes", "signature": "(cwd: Path, allowed_paths: set[Path]) -> List[Path]", "returns": "List[Path]"} ] } } @@ -105,6 +107,9 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O 18. 
**Post Final Comment**: `post_final_comment(repo_owner, repo_name, issue_number, reason, total_cost, steps_completed, total_steps, cwd) -> bool`: Post a generated workflow summary comment to the GitHub issue when the workflow stops early. The function builds the comment body from the stop reason, cumulative cost, and completed/total step counts; callers do not pass a preformatted body. 19. **OpenCode Model Resolution**: Resolve the OpenCode model in this order: (1) `OPENCODE_MODEL` env var, kept verbatim including nested slashes like `openrouter/openai/gpt-5.3-codex`; (2) derive a candidate from `llm_model.csv` using PDD's existing model-strength semantics, then translate LiteLLM-oriented IDs via `_translate_to_opencode_model()`. The CSV fallback MUST be auth-aware: build the configured OpenCode provider set from parsed provider credentials in `~/.local/share/opencode/auth.json`, parsed usable OpenCode config provider/model entries (`~/.config/opencode/opencode.json`, nearest project `opencode.json`, `OPENCODE_CONFIG`, `OPENCODE_CONFIG_CONTENT`), and every provider credential env var represented in `llm_model.csv`; filter candidate rows to providers that are configured before selecting a model. OpenCode config sources contribute a configured provider only when they declare a provider/model path with resolvable auth or explicit local/no-key provider semantics; bare config existence is diagnostic-only. OpenCode agentic runs use `OPENCODE_MODEL` or the auth-aware CSV fallback, not generic direct-prompt model defaults. Required translations include `github_copilot/X -> github-copilot/X`, `gemini/X -> google/X`, bare Anthropic rows like `claude-sonnet-... -> anthropic/claude-sonnet-...`, and bare OpenAI rows like `gpt-5 -> openai/gpt-5`; IDs already in OpenCode `provider/model` form pass through unchanged. 
If no configured provider can serve the selected model, fail fast with an actionable error telling the user to set `OPENCODE_MODEL=provider/model`, configure the matching provider, or run `opencode models` after authentication. Do not rely on OpenCode default model resolution. 20. **OpenCode Optional Knobs**: Honor `OPENCODE_AGENT` by passing `--agent ` and `OPENCODE_VARIANT` by passing `--variant ` when set. Omit both flags when unset. `PDD_OPENCODE_MODE` is out of scope for this module version; use `opencode run` only. +21. **Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block (```text``` or ```json```) whose lines list one repo-relative path per line (blank lines and `#`-prefixed comments ignored). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. 
When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps).
+22. **Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. The default value MUST be `(".pdd/meta/*.json",)`: only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline.
+23. **Scope Guard Helper Re-export**: `_revert_out_of_scope_changes(cwd, allowed_paths)` already exists in this module and is reused by sync scope enforcement (Issue #1013). The signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and the behavior MUST remain: skip silently when `cwd` is not a git repo or when no allowed path lies under `cwd`; detect tracked changes via `git status --porcelain -uno`; restore out-of-scope tracked files via `git checkout HEAD --`; return the list of resolved paths that were reverted. This requirement is documentation-only: do not change the existing behavior, because callers from `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_e2e_fix_orchestrator`, and the new sync caller all depend on the current contract.
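Reviewer sketch (not part of the patch): the fenced-block branch of requirement 21 could look roughly like the following. This is a minimal illustration under stated assumptions, not the generated `pdd/agentic_common.py`. The heading regex and the `IssueContract` fields are taken from the requirement text; the function names `parse_fenced_contract` and `parse_contract_sketch` are hypothetical, any fence info string is accepted, and the HTML-comment form is left out because its marker text is not visible in this patch.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Heading pattern quoted from requirement 21.
_HEADING_RE = re.compile(
    r"(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$"
)


@dataclass(frozen=True)
class IssueContract:
    allowed_paths: Tuple[str, ...]
    companion_allowlist: Tuple[str, ...]
    source: str


def parse_fenced_contract(text: Optional[str]) -> Optional[IssueContract]:
    """Hypothetical helper: parse only the heading + fenced-block form."""
    if not text:
        return None
    lines = text.splitlines()
    for index, line in enumerate(lines):
        if not _HEADING_RE.match(line):
            continue
        rest = [candidate.strip() for candidate in lines[index + 1:]]
        while rest and not rest[0]:
            rest.pop(0)  # tolerate blank lines between heading and fence
        if not rest or not rest[0].startswith("```"):
            continue
        body = rest[1:]
        if "```" not in body:
            continue  # unterminated fence: treat as "no contract"
        body = body[: body.index("```")]
        # Blank lines and '#'-prefixed comments are ignored.
        paths = tuple(p for p in body if p and not p.startswith("#"))
        if paths:
            return IssueContract(paths, (), "fenced-block")
    return None


def parse_contract_sketch(
    issue_body: Optional[str], issue_comments: Optional[List[str]] = None
) -> Optional[IssueContract]:
    """Body first, then comments, so a body marker wins over comment markers."""
    for text in [issue_body, *(issue_comments or [])]:
        contract = parse_fenced_contract(text)
        if contract is not None:
            return contract
    return None
```

Returning `None` for every malformed input keeps the caller's permissive fallback as the only failure path, which is the tolerance the requirement asks for.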
% Function Signatures `get_agent_provider_preference() -> List[str]` @@ -142,6 +147,7 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O - `MIN_ATTEMPT_TIMEOUT_SECONDS: 60.0` - `MAX_ERROR_SNIPPET_LENGTH: 2000` - `MAX_ERROR_RESPONSE_NEWLINES: 3` (Issue #1232: newline-count gate for leading-`Error:` false-positive heuristic) +- `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...] = (".pdd/meta/*.json",)` (Issue #1013: glob patterns for sync companion artifacts that bypass the issue contract's primary write set) % Token Pricing `Pricing(input_per_million, output_per_million, cached_input_multiplier)` diff --git a/pdd/prompts/agentic_sync_python.prompt b/pdd/prompts/agentic_sync_python.prompt index bf22761cf..dd4c3cfd2 100644 --- a/pdd/prompts/agentic_sync_python.prompt +++ b/pdd/prompts/agentic_sync_python.prompt @@ -1,8 +1,25 @@ +Entry point for pdd sync orchestration over GitHub issues with split-contract scope-guard enforcement. + + +{ + "type": "module", + "module": { + "functions": [ + {"name": "run_agentic_sync", "signature": "(issue_url: str, *, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, dry_run: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, timeout_adder: float = 0.0, use_github_state: bool = True, one_session: bool = False, reasoning_time: Optional[float] = None, durable: bool = False, durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]"}, + {"name": "run_global_sync", "signature": "(*, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, dry_run: bool = False, target_coverage: 
Optional[float] = None, one_session: bool = False, local: bool = False, timeout_adder: float = 0.0, scope_guard: bool = True) -> Tuple[bool, str, float, str]", "returns": "Tuple[bool, str, float, str]"}, + {"name": "_is_github_issue_url", "signature": "(s: str) -> bool", "returns": "bool"}, + {"name": "_parse_llm_response", "signature": "(response: str) -> Tuple[List[str], bool, List[Dict]]", "returns": "Tuple[List[str], bool, List[Dict]]"} + ] + } +} + + context/python_preamble.prompt architecture_sync_python.prompt auto_deps_main_python.prompt agentic_sync_runner_python.prompt durable_sync_runner_python.prompt +agentic_common_python.prompt % Goal Write the `pdd/agentic_sync.py` module. @@ -13,7 +30,7 @@ Entry point for sync orchestration. Supports two public workflows: 2. `run_global_sync(...)`: Tier 1 no-argument global sync. Triggered by the CLI when `pdd sync` is invoked with no BASENAME. Do not use PRD files, PRD fingerprinting, architecture schema migration, or agentic PRD analysis in v1. % Requirements -1. Function: `run_agentic_sync(issue_url: str, *, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, dry_run: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, timeout_adder: float = 0.0, use_github_state: bool = True, one_session: bool = False, reasoning_time: Optional[float] = None, durable: bool = False, durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None) -> Tuple[bool, str, float, str]` +1. 
Function: `run_agentic_sync(issue_url: str, *, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, dry_run: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, timeout_adder: float = 0.0, use_github_state: bool = True, one_session: bool = False, reasoning_time: Optional[float] = None, durable: bool = False, durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, scope_guard: bool = True) -> Tuple[bool, str, float, str]` 2. Return 4-tuple: (success, message, total_cost, model_used) 3. Parse GitHub issue URL to extract: owner, repo, issue_number (reuse `_parse_issue_url` from `agentic_change.py`) 4. Fetch issue content and comments via `gh api` (reuse `_run_gh_command` from `agentic_change.py`) @@ -36,9 +53,10 @@ Entry point for sync orchestration. Supports two public workflows: 18. If `durable=True`, dispatch to `DurableSyncRunner` instead of `AsyncSyncRunner`. Pass the issue URL, durable branch override, `no_resume` flag, and durable max-parallel setting through unchanged. Durable mode must still use the same module identification, dependency graph, dry-run validation, fingerprint filtering, sync options, and initial cost accounting as standard issue sync. 19. Aggregate costs from LLM identification + dry-run LLM fallback + runner execution 20. Include `one_session`, `local`, and `target_coverage` in `sync_options` dict passed to the selected runner +21. **Split-Contract Scope Guard (Issue #1013)**: After fetching the issue body and comments and BEFORE dispatching to the runner, call `parse_issue_contract(issue_body, issue_comments)` from `pdd.agentic_common`. 
When the returned `IssueContract` is non-None, plumb its `allowed_paths` (as `allowed_write_set`) and the union of its `companion_allowlist` with `DEFAULT_SYNC_COMPANION_ALLOWLIST` (as `companion_allowlist`) into BOTH the `AsyncSyncRunner` and `DurableSyncRunner` constructors. Always pass `scope_guard_enabled=scope_guard` so the CLI opt-out flows through. When the contract is None (no marker in the issue), pass `allowed_write_set=None` so the runner falls back to permissive mode without enforcement. Emit one INFO line (suppressed under `quiet`) reporting which contract source was detected — for example `Sync scope guard: contract loaded from (N allowed paths)` — so operators can see at a glance that an issue contract is active. When `scope_guard=False`, emit one WARNING line `Sync scope guard: disabled via --no-scope-guard` instead. % Global Sync (Tier 1) -1. Function: `run_global_sync(*, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, dry_run: bool = False, target_coverage: Optional[float] = None, one_session: bool = False, local: bool = False, timeout_adder: float = 0.0) -> Tuple[bool, str, float, str]`. Forward `timeout_adder` to the runner via `sync_options['timeout_adder']`. +1. Function: `run_global_sync(*, verbose: bool = False, quiet: bool = False, budget: Optional[float] = None, skip_verify: bool = False, skip_tests: bool = False, agentic_mode: bool = True, no_steer: bool = True, max_attempts: Optional[int] = None, dry_run: bool = False, target_coverage: Optional[float] = None, one_session: bool = False, local: bool = False, timeout_adder: float = 0.0, scope_guard: bool = True) -> Tuple[bool, str, float, str]`. Forward `timeout_adder` to the runner via `sync_options['timeout_adder']`. 
The `scope_guard` kwarg is accepted for CLI signature parity with `run_agentic_sync`; in global mode there is no issue body to parse, so the runner is always constructed with `allowed_write_set=None` (permissive fallback) regardless of `scope_guard`. The kwarg exists so `pdd sync --no-scope-guard` does not raise a `TypeError` in global mode. 2. Load combined architecture data using existing `_load_architecture_json(project_root)` / `load_combined_architecture_data`. Fail clearly if no architecture data exists. 3. Extract syncable basenames from architecture `filename` values. Preserve subdirectory prefixes, e.g. `commands/maintenance_python.prompt` becomes `commands/maintenance`; skip non-syncable filenames such as `_LLM.prompt`. 4. For each architecture module, resolve cwd/context/prompts_dir with the same nearest-config behavior used by issue sync, then call `sync_determine_operation(..., log_mode=True, read_only=True, ...)` from the module's resolved cwd for each detected language. Suppress info-level logs from `pdd.sync_determine_operation` around the read-only calls so global dry-run output stays readable and metadata files are never changed during analysis. diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index d01e56428..e34dab40b 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -1,7 +1,24 @@ +Parallel pdd sync engine enforcing dep ordering, per-module budgets, and split-contract scope guard. 
+ + +{ + "type": "module", + "module": { + "functions": [ + {"name": "AsyncSyncRunner", "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0, *, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True)", "returns": "AsyncSyncRunner"}, + {"name": "AsyncSyncRunner.run", "signature": "() -> Tuple[bool, str, float]", "returns": "Tuple[bool, str, float]"}, + {"name": "build_dep_graph_from_architecture", "signature": "(arch_path: Path, target_basenames: List[str]) -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"}, + {"name": "build_dep_graph_from_architecture_data", "signature": "(architecture: Any, target_basenames: List[str], *, source_name: str = 'architecture.json') -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"} + ] + } +} + + context/python_preamble.prompt architecture_sync_python.prompt agentic_langtest_python.prompt agentic_test_orchestrator_python.prompt +agentic_common_python.prompt % Goal Write the `pdd/agentic_sync_runner.py` module. @@ -10,7 +27,7 @@ Write the `pdd/agentic_sync_runner.py` module. Parallel sync engine that runs `pdd sync` for multiple modules concurrently using a ThreadPoolExecutor, respecting dependency ordering. Posts live progress updates to a GitHub issue comment. Supports state persistence for resumability across runs, phase tracking, and graceful interrupt handling. % Requirements -1. Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, verbose, issue_url, module_cwds, initial_cost=0.0)` +1. 
Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, verbose, issue_url, module_cwds, initial_cost=0.0, *, allowed_write_set=None, companion_allowlist=None, scope_guard_enabled=True)` - `basenames: List[str]` — modules to sync - `dep_graph: Dict[str, List[str]]` — basename -> [dependency basenames] - `sync_options: Dict` — budget, total_budget, target_coverage, skip_verify, skip_tests, agentic, no_steer, max_attempts, one_session, local, timeout_adder @@ -18,6 +35,9 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `issue_url: Optional[str]` — GitHub issue URL for state persistence (None disables resumability) - `module_cwds: Optional[Dict[str, Path]]` — per-module working directories (defaults to project_root) - `initial_cost: float` — pre-runner cost (LLM module identification and dry-run fallback) to include in total cost display and return value (default 0.0) + - `allowed_write_set: Optional[Iterable[str]]` — repo-relative path strings from the issue split contract that this sync run is permitted to modify. `None` means "no contract was parseable from the issue → run in permissive mode (no enforcement)". An explicit empty iterable means "contract present but empty → reject every change as out-of-scope" (a degenerate but legal contract). Resolved against each module's `cwd`/repo root inside the runner. + - `companion_allowlist: Optional[Iterable[str]]` — additional glob patterns (e.g. `".pdd/meta/*.json"`) describing companion artifacts that MAY be modified outside the primary `allowed_write_set`. Defaults to `DEFAULT_SYNC_COMPANION_ALLOWLIST` from `agentic_common` (currently `(".pdd/meta/*.json",)`) when `None`. Issue contracts MAY widen the companion allowlist by passing a superset. + - `scope_guard_enabled: bool` — master switch (default `True`). When `False`, the runner records the parsed contract for diagnostics but performs no enforcement, no revert, and no hard-fail. 
Maps to the CLI `--no-scope-guard` opt-out. - Tracks per-module state: pending -> running -> success | failed 2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs 3. Use `concurrent.futures.ThreadPoolExecutor` with `MAX_WORKERS = 4`; when `sync_options["total_budget"]` is set, run sequentially and pass only the remaining total budget to each child process so the total budget is not multiplied per module. @@ -43,6 +63,23 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin 19. Subprocess env: include `PYTHONUNBUFFERED=1` for real-time output 20. Forward "Successfully submitted example" messages from child stdout to parent console 21. Heartbeat logging: during long-running syncs, print progress updates every 60s. Prefer parsed `PDD_PHASE` state — `f" — phase: {current_phase} ({len(completed_phases)} done)"` — so operators see real progress through the generate/test/fix phases instead of a stale `Preprocessing complete` line. Fall back to the last non-box-drawing stdout line only when no phase has been reported yet. +22. **Split-Contract Scope Guard (Issue #1013)**: After each per-module `pdd sync` subprocess completes (success or failure), and **before** the runner declares that module successful or persists state, the runner MUST invoke `_enforce_scope_guard(basename, module_cwd)` when `self.scope_guard_enabled` is True AND `self.allowed_write_set is not None`. The helper: + - Builds the effective allow set for the module: every path in `self.allowed_write_set` resolved against the module's repo root (the git toplevel of `module_cwd`, falling back to `module_cwd` itself), plus every path under `module_cwd` that matches any glob in the effective companion allowlist (`self.companion_allowlist` ∪ `DEFAULT_SYNC_COMPANION_ALLOWLIST`). 
+ - Calls `_revert_out_of_scope_changes(repo_root, allowed_paths)` from `pdd.agentic_common` to revert tracked out-of-scope modifications, AND calls `revert_out_of_scope_changes_with_dirs(repo_root, allowed_dirs=set(), allowed_files=allowed_paths)` from `pdd.agentic_common_worktree` to additionally remove untracked out-of-scope new files. The combination matches the existing scope-guard pattern used by `agentic_update`/`agentic_fix`/`agentic_crash`/`agentic_e2e_fix_orchestrator`. + - Diagnostic format (printed to stderr; structured for downstream parsers — checkup, review-loop reports): + ``` + Scope guard reverted N out-of-scope file(s) for module '' (contract source: ): + - path/relative/to/repo + - another/path + Allowed write set: + - path/from/contract + Companion allowlist: + - .pdd/meta/*.json + ``` + - **Hard-fail policy (Issue #1013 acceptance criteria 3 and 4)**: if any out-of-scope path was detected, the module MUST be recorded as failed with `error="Scope guard hard-fail: out-of-scope artifacts detected"` followed by the diagnostic body. This blocks the per-module success record, blocks dependent modules from scheduling, and ensures checkup/review-loop reports surface the failure rather than burying it under an apparently-successful sync. Hard-fail applies even when the underlying `pdd sync` subprocess succeeded — the contract violation is the failure mode the scope guard exists to catch. + - **Permissive fallback**: when `self.allowed_write_set is None` (no parseable contract on the issue), `_enforce_scope_guard` returns immediately without enforcement. Document this in a one-line dim INFO log on `run()` entry so operators understand why enforcement is off for that run. + - **Opt-out**: when `self.scope_guard_enabled is False`, log a one-line dim WARNING on `run()` entry ("Scope guard disabled via --no-scope-guard") and skip enforcement entirely. Even an explicit `allowed_write_set` is recorded only for diagnostics in this mode. 
+ - The scope-guard step MUST run with a `threading.Lock` held around git operations on a per-`module_cwd` basis to avoid `git status` / `git checkout` races when modules share a repo root (the common shared-worktree case for non-durable issue sync). % Dataclass: `ModuleState` - `status: str` — "pending", "running", "success", "failed" @@ -102,6 +139,7 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `_find_pdd_executable() -> Optional[str]`: find pdd binary (same pattern as `server/jobs.py`) - `_parse_cost_from_csv(csv_path: str) -> float`: sum cost column from PDD_OUTPUT_COST_PATH CSV - `_format_duration(start, end) -> str`: format seconds as "Xs" or "Xm Ys" +- `_enforce_scope_guard(self, basename: str, module_cwd: Path) -> Optional[str]`: Issue #1013 scope guard. Returns `None` when the module is in scope; returns a multi-line diagnostic string (see Req 22) when out-of-scope artifacts were detected. Callers (the per-future completion handler) treat a non-None return as a module failure and replace any prior success record with it. No-ops when `self.scope_guard_enabled is False` or `self.allowed_write_set is None`. Reuses `pdd.agentic_common._revert_out_of_scope_changes` and `pdd.agentic_common_worktree.revert_out_of_scope_changes_with_dirs` rather than reimplementing git scanning. - `_parse_conformance_failure(stdout: str, stderr: str) -> Optional[Tuple[str, Tuple[str, ...]]]`: scan combined stdout+stderr for the line prefix `Architecture conformance error for ` and, when matched, return `(repair_directive, missing_symbols)` where `missing_symbols` is a sorted tuple of the symbols listed after any of the following inline shapes (route each into its own directive bucket — they MUST NOT be merged): - (a) `declared symbols missing from generated code:` — default `ArchitectureConformanceError` shape (architecture.json symbol-existence check). - (b) `Python code uses camelCase names (...)` parenthesised list — camelCase guard. 
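Reviewer note on Req 22 above: the effective allow set and the permissive/opt-out short-circuits can be sketched roughly as follows. This is a minimal illustration, not the runner's actual code. `find_out_of_scope` and its parameters are hypothetical names, the constant is redefined locally so the block is self-contained, and paths are assumed to be repo-relative POSIX strings already collected from `git status`. It only classifies paths; the revert step delegated to `_revert_out_of_scope_changes` is not modelled.

```python
import fnmatch
from typing import Iterable, List, Optional, Tuple

# Local copy of the constant named in the prompt, so the sketch runs on its own.
DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...] = (".pdd/meta/*.json",)


def find_out_of_scope(
    changed_paths: Iterable[str],
    allowed_write_set: Optional[Iterable[str]],
    companion_allowlist: Iterable[str] = (),
    scope_guard_enabled: bool = True,
) -> List[str]:
    """Hypothetical helper: return changed paths that fall outside the contract."""
    # Opt-out (--no-scope-guard) and permissive fallback (no contract parsed):
    # nothing is enforced, so nothing is reported.
    if not scope_guard_enabled or allowed_write_set is None:
        return []

    allowed = set(allowed_write_set)
    # Effective companion allowlist = contract globs plus the default, de-duplicated
    # while keeping order.
    globs = tuple(
        dict.fromkeys(tuple(companion_allowlist) + DEFAULT_SYNC_COMPANION_ALLOWLIST)
    )

    violations = []
    for path in changed_paths:
        if path in allowed:
            continue
        # Note: fnmatch's "*" also matches "/", so ".pdd/meta/*.json" would
        # accept nested files too; a stricter matcher may be preferable.
        if any(fnmatch.fnmatchcase(path, pattern) for pattern in globs):
            continue
        violations.append(path)
    return sorted(violations)
```

Under the hard-fail policy, a non-empty result would mark the module as failed even when the `pdd sync` subprocess itself succeeded.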
From 8ca3b18117f9873a16413384d9e284d0c0ccffb6 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:00:17 -0700 Subject: [PATCH 02/42] fix: enforce sync allowed write sets --- pdd/agentic_sync.py | 60 +++++++++++++ pdd/agentic_sync_runner.py | 84 +++++++++++++++++++ pdd/durable_sync_runner.py | 27 ++++++ pdd/prompts/durable_sync_runner_python.prompt | 4 +- tests/test_agentic_sync.py | 32 +++++++ tests/test_agentic_sync_runner.py | 48 +++++++++++ tests/test_durable_sync_runner.py | 9 ++ 7 files changed, 262 insertions(+), 2 deletions(-) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index af4742fa2..bb87696f9 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -733,8 +733,10 @@ def run_global_sync( one_session: bool = False, local: bool = False, timeout_adder: float = 0.0, + scope_guard: bool = True, ) -> Tuple[bool, str, float, str]: """Run project-wide Tier 1 global sync from architecture.json.""" + del scope_guard project_root = _find_project_root(Path.cwd()) architecture, arch_path = _load_architecture_json(project_root) if architecture is None: @@ -1353,6 +1355,53 @@ def _parse_llm_response(response: str) -> Tuple[List[str], bool, List[Dict[str, return modules_to_sync, deps_valid, deps_corrections +def _extract_allowed_write_paths(issue_text: str) -> List[str]: + """Extract a split-contract allowed write set from issue text.""" + if not issue_text: + return [] + + allowed: List[str] = [] + seen: set[str] = set() + capture = False + path_re = re.compile(r"`([^`]+)`") + + for raw_line in issue_text.splitlines(): + line = raw_line.strip() + lower = line.lower() + marker_line = re.sub(r"^(?:#+\s*|[-*]\s*)", "", lower).strip() + if ( + marker_line.startswith(("allowed write set", "allowed write-set")) + or marker_line.startswith(("allowed files", "allowed paths")) + or marker_line.startswith("allowed only") + or ( + marker_line.startswith(("split contract", "issue contract")) + and "allowed" in marker_line + ) + ): + capture = True + elif 
capture and line.startswith("#"): + break + + if not capture: + continue + + matches = path_re.findall(line) + if not matches: + if allowed and not line: + break + continue + + for match in matches: + path = match.strip().replace("\\", "/").lstrip("./") + if not path or " " in path or path.startswith("#"): + continue + if path not in seen: + allowed.append(path) + seen.add(path) + + return allowed + + def _apply_architecture_corrections( arch_path: Path, architecture: List[Dict[str, Any]], @@ -1426,6 +1475,7 @@ def run_agentic_sync( durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, + scope_guard: bool = True, ) -> Tuple[bool, str, float, str]: """ Run agentic sync workflow: identify modules from a GitHub issue and sync in parallel. @@ -1497,6 +1547,7 @@ def run_agentic_sync( # 5. Build issue content issue_content = f"Title: {title}\n\nDescription:\n{body}\n" + raw_contract_text = body if comments_data and isinstance(comments_data, list): issue_content += "\nComments:\n" for comment in comments_data: @@ -1504,6 +1555,11 @@ def run_agentic_sync( c_user = comment.get("user", {}).get("login", "unknown") c_body = comment.get("body", "") issue_content += f"\n--- Comment by {c_user} ---\n{c_body}\n" + raw_contract_text += f"\n{c_body}\n" + + allowed_write_paths = ( + _extract_allowed_write_paths(raw_contract_text) if scope_guard else [] + ) issue_content = _escape_format_braces(issue_content) @@ -1775,6 +1831,8 @@ def run_agentic_sync( issue_url=issue_url, module_cwds=module_cwds, initial_cost=llm_cost, + allowed_write_set=allowed_write_paths, + scope_guard_enabled=scope_guard, ) else: runner = AsyncSyncRunner( @@ -1787,6 +1845,8 @@ def run_agentic_sync( issue_url=issue_url, module_cwds=module_cwds, initial_cost=llm_cost, + allowed_write_set=allowed_write_paths, + scope_guard_enabled=scope_guard, ) runner_success, runner_msg, total_cost = runner.run() diff --git a/pdd/agentic_sync_runner.py 
b/pdd/agentic_sync_runner.py index b70761776..762465666 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -133,6 +133,42 @@ class DepGraphFromArchitectureResult(NamedTuple): warnings: List[str] +def _normalize_repo_path(path: str) -> str: + """Normalize a repository-relative path for contract comparisons.""" + return str(path or "").replace("\\", "/").strip().lstrip("./") + + +def _git_changed_paths(project_root: Path) -> set[str]: + """Return changed paths from git status, including untracked files.""" + try: + result = subprocess.run( + ["git", "status", "--porcelain", "--untracked-files=all"], + cwd=project_root, + capture_output=True, + text=True, + check=False, + ) + except (OSError, subprocess.SubprocessError): + return set() + if result.returncode != 0: + return set() + + paths: set[str] = set() + for line in result.stdout.splitlines(): + if len(line) < 4: + continue + payload = line[3:].strip() + if not payload: + continue + if " -> " in payload: + old_path, new_path = payload.split(" -> ", 1) + paths.add(_normalize_repo_path(old_path.strip('"'))) + paths.add(_normalize_repo_path(new_path.strip('"'))) + else: + paths.add(_normalize_repo_path(payload.strip('"'))) + return {p for p in paths if p} + + # --------------------------------------------------------------------------- # Helper functions # --------------------------------------------------------------------------- @@ -819,6 +855,10 @@ def __init__( issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Any]] = None, initial_cost: float = 0.0, + allowed_write_paths: Optional[List[str]] = None, + allowed_write_set: Optional[List[str]] = None, + companion_allowlist: Optional[List[str]] = None, + scope_guard_enabled: bool = True, ): self.basenames: List[str] = list(basenames) self.dep_graph: Dict[str, List[str]] = { @@ -832,9 +872,25 @@ def __init__( self.project_root: Path = Path.cwd() self.module_cwds: Dict[str, Any] = dict(module_cwds or {}) self.initial_cost = 
float(initial_cost or 0.0) + del companion_allowlist # accepted for prompt/CLI compatibility + active_allowed_paths = ( + allowed_write_paths + if allowed_write_paths is not None + else allowed_write_set + ) + if not scope_guard_enabled: + active_allowed_paths = None + self.allowed_write_paths: set[str] = { + _normalize_repo_path(path) for path in (active_allowed_paths or []) if path + } + self._baseline_changed_paths: set[str] = ( + _git_changed_paths(self.project_root) if self.allowed_write_paths else set() + ) self.total_budget = self.sync_options.get("total_budget") self.max_workers = 1 if self.total_budget is not None else MAX_WORKERS + if self.allowed_write_paths: + self.max_workers = 1 self.module_states: Dict[str, ModuleState] = { b: ModuleState() for b in self.basenames @@ -1662,6 +1718,13 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: last_stderr = stderr if success: + violations = self._allowed_write_set_violations() + if violations: + return ( + False, + total_cost, + self._format_allowed_write_set_error(violations), + ) return True, total_cost, "" conformance = _parse_conformance_failure(stdout, stderr) @@ -1692,6 +1755,27 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: ) return False, total_cost, hard_block + def _allowed_write_set_violations(self) -> List[str]: + """Return newly changed paths outside the issue's allowed write set.""" + if not self.allowed_write_paths: + return [] + current = _git_changed_paths(self.project_root) + newly_changed = current - self._baseline_changed_paths + violations = [ + path for path in newly_changed if path not in self.allowed_write_paths + ] + return sorted(violations) + + def _format_allowed_write_set_error(self, violations: List[str]) -> str: + allowed = ", ".join(sorted(self.allowed_write_paths)) or "" + blocked = ", ".join(violations) + return ( + "Issue split-contract allowed write set violation. " + f"Out-of-scope changed path(s): {blocked}. 
" + f"Allowed path(s): {allowed}. " + "Revert or explicitly justify companion artifacts before rerunning pdd sync." + ) + def _build_conformance_hard_failure( self, basename: str, diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 1ec1c5f69..794b51a0d 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -47,6 +47,10 @@ def __init__( issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0, + allowed_write_paths: Optional[List[str]] = None, + allowed_write_set: Optional[List[str]] = None, + companion_allowlist: Optional[List[str]] = None, + scope_guard_enabled: bool = True, ) -> None: self.issue_number = issue_number self.git_root = project_root.resolve() @@ -72,6 +76,10 @@ def __init__( issue_url=issue_url, module_cwds={}, initial_cost=initial_cost, + allowed_write_paths=allowed_write_paths, + allowed_write_set=allowed_write_set, + companion_allowlist=companion_allowlist, + scope_guard_enabled=scope_guard_enabled, ) self.project_root = self.git_root if self.total_budget is not None: @@ -361,6 +369,14 @@ def _stage_module_changes( return False, f"Failed to inspect staged changes: {_combined_output(names)}", False changed_paths = [line.strip() for line in names.stdout.splitlines() if line.strip()] + out_of_scope = self._out_of_scope_staged_paths(changed_paths) + if out_of_scope: + return ( + False, + "Durable sync refuses to checkpoint path(s) outside the issue " + "split-contract allowed write set: " + ", ".join(out_of_scope), + False, + ) unsafe = self._unsafe_staged_paths(basename, changed_paths) if unsafe: return ( @@ -373,6 +389,17 @@ def _stage_module_changes( empty = not changed_paths return True, "", empty + def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: + if not self.allowed_write_paths: + return [] + return sorted( + { + path.replace(os.sep, "/").lstrip("./") + for path in paths + if path.replace(os.sep, "/").lstrip("./") not in 
self.allowed_write_paths + } + ) + def _force_add_module_metadata(self, basename: str, module_worktree: Path) -> None: safe = basename.replace("/", "_") meta_dirs = [ diff --git a/pdd/prompts/durable_sync_runner_python.prompt b/pdd/prompts/durable_sync_runner_python.prompt index 452f8e98b..e4366f3fd 100644 --- a/pdd/prompts/durable_sync_runner_python.prompt +++ b/pdd/prompts/durable_sync_runner_python.prompt @@ -14,7 +14,7 @@ Durable execution engine for `pdd sync --durable`. It must pr 4. Use `.pdd/worktrees/durable-issue-` as the main durable worktree and `.pdd/worktrees/sync-issue--` for per-module worktrees. 5. Resume by scanning pushed checkpoint commits on the durable branch for trailers formatted as `PDD-Sync-Checkpoint-V1: issue= module=`. Ignore trailers for other issues. 6. Do not rely on `.pdd/agentic_sync_state.json` for durable resume. Corrupt or missing local state must not prevent resuming from remote checkpoint trailers. -7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. Push the checkpoint before printing `PDD_CHECKPOINT:`. +7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. If the parent issue supplied an allowed write set, reject any staged path outside that exact repo-relative set before creating the checkpoint. Push the checkpoint before printing `PDD_CHECKPOINT:`. 8. If a module succeeds with no file diff, create an empty checkpoint commit so resume can still skip it later. 9. Never checkpoint unsafe files: `.env`, `.env.local`, `cost.csv`, `crash.log`, `fix_errors.log`, `.pem`, `.key`, token/secret paths, `.pdd/worktrees`, `.pdd/agentic_sync_state.json`, or unrelated `.pdd` files. 10. On patch conflict or failed module output, exit non-zero for the durable run, abort any in-progress `git am`, preserve prior checkpoints, and do not create later checkpoints. 
@@ -39,4 +39,4 @@ Durable execution engine for `pdd sync --durable`. It must pr - Preserve already-pushed checkpoints as the source of truth. % Deliverables -- Code: `pdd/durable_sync_runner.py` \ No newline at end of file +- Code: `pdd/durable_sync_runner.py` diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index 7e9b49d94..f03e18d86 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -29,6 +29,7 @@ _is_runtime_llm_template, _llm_fix_dry_run_failure, _load_architecture_json, + _extract_allowed_write_paths, _parse_llm_response, _resolve_module_cwd, _run_dry_run_validation, @@ -169,6 +170,37 @@ def test_deps_valid_case_insensitive(self): assert valid2 is False +class TestExtractAllowedWritePaths: + def test_extracts_split_contract_allowed_paths(self): + issue = """ +## Split Contract +Allowed write set: + + * `pdd/update_main.py` + * `pdd/prompts/update_main_python.prompt` + * `tests/test_update_main.py` + +But sync wrote other files. +""" + assert _extract_allowed_write_paths(issue) == [ + "pdd/update_main.py", + "pdd/prompts/update_main_python.prompt", + "tests/test_update_main.py", + ] + + def test_returns_empty_without_contract_marker(self): + assert _extract_allowed_write_paths("Touch `pdd/foo.py` if needed.") == [] + + def test_ignores_historical_allowed_only_examples(self): + issue = """ +In PR #1010 for issue #1005, the issue contract allowed only: + + * `pdd/update_main.py` + * `tests/test_update_main.py` +""" + assert _extract_allowed_write_paths(issue) == [] + + # --------------------------------------------------------------------------- # _apply_architecture_corrections # --------------------------------------------------------------------------- diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 497a938a7..da4df2c61 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2518,6 +2518,54 @@ def 
test_missing_module_falls_back_to_project_root(self, mock_find, mock_popen, assert popen_kwargs["cwd"] == str(runner.project_root) +class TestAllowedWriteSet: + def test_successful_module_fails_when_new_path_outside_allowed_set(self, monkeypatch): + import pdd.agentic_sync_runner as mod + + snapshots = iter([ + {"README.md"}, + {"README.md", "pdd/allowed.py", "architecture.json"}, + ]) + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: next(snapshots)) + + runner = AsyncSyncRunner( + basenames=["allowed"], + dep_graph={"allowed": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_paths=["pdd/allowed.py"], + ) + monkeypatch.setattr( + runner, + "_run_attempt", + lambda *_args, **_kwargs: (True, 0.0, "", "ok", ""), + ) + + success, cost, error = runner._sync_one_module("allowed") + + assert success is False + assert cost == 0.0 + assert "allowed write set violation" in error + assert "architecture.json" in error + assert "README.md" not in error + + def test_allowed_write_set_forces_sequential_execution(self, monkeypatch): + import pdd.agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: set()) + runner = AsyncSyncRunner( + basenames=["a", "b"], + dep_graph={"a": [], "b": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_paths=["pdd/a.py"], + ) + + assert runner.max_workers == 1 + + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking # --------------------------------------------------------------------------- diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 4dcc578f4..e6cb1176e 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -325,6 +325,15 @@ def test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): assert result == sorted(unsafe_paths) +def 
test_allowed_write_set_rejects_out_of_scope_checkpoint_paths(tmp_path: Path): + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo, allowed_write_paths=["src/app.py"]) + + assert runner._out_of_scope_staged_paths( + ["src/app.py", "architecture.json", ".pdd/meta/foo_python.json"] + ) == [".pdd/meta/foo_python.json", "architecture.json"] + + def test_push_failure_preserves_local_checkpoint_and_next_run_pushes_it(tmp_path: Path): repo = _init_repo_with_remote(tmp_path) first = _runner(repo, runner_cls=PushFailingMetadataRunner) From 7345d013f6f93c106d9737901ab4738b4c17f4e2 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:15:37 -0700 Subject: [PATCH 03/42] fix(sync): F1+F2 add IssueContract parser and DEFAULT_SYNC_COMPANION_ALLOWLIST --- pdd/agentic_common.py | 240 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 239 insertions(+), 1 deletion(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index f3bb88b9f..625a27816 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -15,7 +15,7 @@ from datetime import datetime from pathlib import Path from typing import List, Optional, Tuple, Dict, Any, Union -from dataclasses import dataclass +from dataclasses import dataclass, field from rich.console import Console @@ -44,6 +44,14 @@ def _load_model_data(*args, **kwargs): # when LLMs quote/discuss a status without declaring it (Issue #865). _SEMANTIC_TAIL_LINES = 30 +# Issue #1013 — sync scope guard: glob patterns for companion artifacts that +# ``pdd sync`` MAY touch as legitimate metadata even when an issue split +# contract narrows the primary write set. Only fingerprint metadata under +# ``.pdd/meta/`` is auto-allowed; everything else (architecture.json, examples, +# unrelated prompts, README/CHANGELOG, etc.) must be opted-in by the contract's +# own ``companion_allowlist`` field. +DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...] 
= (".pdd/meta/*.json",) + # Semantic fallback patterns for when LLMs paraphrase instead of emitting exact tokens. # Each token maps to a list of regex patterns that capture common paraphrases. # Patterns are checked only after exact and case-insensitive matching fail, @@ -2286,6 +2294,236 @@ def _revert_out_of_scope_changes( ) return reverted + +# --------------------------------------------------------------------------- +# Issue #1013 — split-contract scope guard: issue body / comment parser +# --------------------------------------------------------------------------- + +@dataclass(frozen=True) +class IssueContract: + """ + Parsed split-contract declaration extracted from a GitHub issue body or + comment. + + Attributes: + allowed_paths: Repo-relative POSIX path strings the linked sync run is + permitted to modify as its primary write set. Resolved against the + module's repo root by the caller (this dataclass does NOT resolve + to absolute filesystem paths). + companion_allowlist: Glob patterns (e.g. ``".pdd/meta/*.json"``) for + companion artifacts the run MAY touch outside the primary write + set. The caller unions this with + :data:`DEFAULT_SYNC_COMPANION_ALLOWLIST` to produce the effective + allowlist. + source: Diagnostic label describing where the contract was detected + (currently ``"html-comment"`` or ``"fenced-block"``). + """ + + allowed_paths: Tuple[str, ...] + companion_allowlist: Tuple[str, ...] + source: str + + +# Matches the heading line introducing a fenced-block contract. Case-insensitive +# multiline match so the heading can be ``## Allowed Write Set``, +# ``# split-contract``, etc. 
+_FENCED_BLOCK_HEADER_RE = re.compile( + r"^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$", + re.IGNORECASE | re.MULTILINE, +) + +# Matches the HTML-comment JSON block, e.g.:: +# +# +_HTML_COMMENT_CONTRACT_RE = re.compile( + r"", + re.DOTALL, +) + +# Matches a fenced code block (```text``` or ```json```) optionally preceded by +# whitespace/newlines. Captures the inner body. +_FENCED_BLOCK_RE = re.compile( + r"```(?:[a-zA-Z0-9_-]+)?\s*\n(?P<body>.*?)```", + re.DOTALL, +) + + +def _is_valid_contract_path(raw: object) -> bool: + """ + Return True iff *raw* is a non-empty repo-relative POSIX path string with + no traversal segments and no Windows separators. + + Validation runs inside the parser so a malformed entry never reaches the + runner. Per the docstring on :func:`parse_issue_contract`, invalid entries + are dropped silently; if all entries drop, the parser returns None. + """ + if not isinstance(raw, str): + return False + candidate = raw.strip() + if not candidate: + return False + if "\\" in candidate: + return False + if candidate.startswith("/"): + return False + # Reject parent-traversal segments at any position + parts = candidate.split("/") + if any(part == ".." for part in parts): + return False + return True + + +def _parse_html_comment_contract(text: str) -> Optional[IssueContract]: + """Return a contract parsed from a ```` block, else None.""" + match = _HTML_COMMENT_CONTRACT_RE.search(text) + if not match: + return None + raw_json = match.group("json").strip() + if not raw_json: + return None + try: + parsed = json.loads(raw_json) + except (ValueError, TypeError): + return None + if not isinstance(parsed, dict): + return None + raw_allowed = parsed.get("allowed_paths") + if not isinstance(raw_allowed, list): + return None + # Drop invalid entries silently; if all entries drop, treat as no contract.
+ allowed = tuple(p.strip() for p in raw_allowed if _is_valid_contract_path(p)) + if not allowed: + return None + raw_companion = parsed.get("companion_allowlist", []) + if not isinstance(raw_companion, list): + raw_companion = [] + companion = tuple( + p.strip() for p in raw_companion if isinstance(p, str) and p.strip() + ) + return IssueContract( + allowed_paths=allowed, + companion_allowlist=companion, + source="html-comment", + ) + + +def _parse_fenced_block_contract(text: str) -> Optional[IssueContract]: + """Return a contract parsed from a heading + fenced code block, else None.""" + header_match = _FENCED_BLOCK_HEADER_RE.search(text) + if not header_match: + return None + after_header = text[header_match.end():] + block_match = _FENCED_BLOCK_RE.search(after_header) + if not block_match: + return None + body = block_match.group("body") + # One repo-relative path per line; ignore blank lines and "#" comments. + paths: List[str] = [] + seen: set = set() + for raw_line in body.splitlines(): + line = raw_line.strip() + if not line or line.startswith("#"): + continue + # Strip surrounding backticks if a user wrapped the path + line = line.strip("`").strip() + if not _is_valid_contract_path(line): + continue + if line not in seen: + paths.append(line) + seen.add(line) + if not paths: + return None + return IssueContract( + allowed_paths=tuple(paths), + companion_allowlist=(), + source="fenced-block", + ) + + +def _parse_contract_from_text(text: Optional[str]) -> Optional[IssueContract]: + """Try HTML-comment first, then fenced-block. 
Returns None on any failure.""" + if not text: + return None + try: + html_contract = _parse_html_comment_contract(text) + except Exception: # noqa: BLE001 — parser MUST NOT raise on any input + html_contract = None + if html_contract is not None: + return html_contract + try: + return _parse_fenced_block_contract(text) + except Exception: # noqa: BLE001 — parser MUST NOT raise on any input + return None + + +def parse_issue_contract( + issue_body: Optional[str], + issue_comments: Optional[List[str]] = None, +) -> Optional[IssueContract]: + """ + Parse an issue split-contract from an issue body or its comments. + + Two declaration formats are supported (Issue #1013): + + 1. HTML-comment block (authoritative):: + + + + JSON ``allowed_paths`` is required (list of repo-relative path + strings); ``companion_allowlist`` is optional (list of glob patterns). + + 2. Fenced-code-block following a heading line matching + ``allowed write set`` or ``split contract``:: + + ## Allowed Write Set + ```text + pdd/foo.py + tests/test_foo.py + ``` + + One repo-relative path per line; blank lines and ``#``-prefixed + comments are ignored. + + The body is scanned first; if no contract is found there, each comment is + scanned in order. When both sources declare a contract, the body wins + (issues are edited authoritatively; comments are append-only and may carry + stale snapshots from earlier workflow steps). + + Path entries are validated as repo-relative POSIX paths: invalid entries + (absolute, containing ``..``, empty, or using Windows separators ``\\``) + are dropped silently. If ``allowed_paths`` becomes empty after dropping, + the parser returns ``None`` (treated as "no contract → permissive + fallback"). Resolution to absolute filesystem paths is the caller's job + once it knows the repo root. + + The parser MUST NOT raise on any input: malformed JSON, missing fields, + unexpected types, or absent markers all return ``None``. + + Args: + issue_body: Raw issue body markdown. 
+ issue_comments: Optional list of raw issue comment markdown bodies + (oldest first or newest first is fine; the parser picks the first + comment with a valid contract). + + Returns: + Parsed :class:`IssueContract` or ``None`` when no valid contract is + present. + """ + body_contract = _parse_contract_from_text(issue_body) + if body_contract is not None: + return body_contract + for comment in issue_comments or []: + contract = _parse_contract_from_text(comment) + if contract is not None: + return contract + return None + + _CLAUDE_OAUTH_PROBE_TIMEOUT_SECONDS = 10 _ANTHROPIC_KEY_STRIP_NOTICE_LOGGED: Dict[str, bool] = {} From d23b1ef400c78f2d31ea1a3100af8c5b6e5863a9 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:16:48 -0700 Subject: [PATCH 04/42] fix(sync): F3+F4+F11+F15 wire parse_issue_contract through run_agentic_sync --- pdd/agentic_sync.py | 126 ++++++++++++++++++++++++++------------------ 1 file changed, 75 insertions(+), 51 deletions(-) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index bb87696f9..fbf071591 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -21,7 +21,12 @@ from rich.console import Console from .agentic_change import _check_gh_cli, _escape_format_braces, _parse_issue_url, _run_gh_command -from .agentic_common import run_agentic_task +from .agentic_common import ( + DEFAULT_SYNC_COMPANION_ALLOWLIST, + IssueContract, + parse_issue_contract, + run_agentic_task, +) from .agentic_sync_runner import ( AsyncSyncRunner, _architecture_entry_aliases, @@ -736,7 +741,13 @@ def run_global_sync( scope_guard: bool = True, ) -> Tuple[bool, str, float, str]: """Run project-wide Tier 1 global sync from architecture.json.""" - del scope_guard + # Per ``agentic_sync_python.prompt`` § Global Sync 1: the ``scope_guard`` + # kwarg is accepted for CLI signature parity with ``run_agentic_sync`` but + # has no effect in global mode. 
Global sync has no issue body to parse, so + # the runner is always constructed in permissive fallback mode regardless + # of ``scope_guard``. The kwarg exists so ``pdd sync --no-scope-guard`` + # does not raise ``TypeError`` when dispatched into global mode. + _ = scope_guard project_root = _find_project_root(Path.cwd()) architecture, arch_path = _load_architecture_json(project_root) if architecture is None: @@ -1356,50 +1367,19 @@ def _parse_llm_response(response: str) -> Tuple[List[str], bool, List[Dict[str, def _extract_allowed_write_paths(issue_text: str) -> List[str]: - """Extract a split-contract allowed write set from issue text.""" - if not issue_text: - return [] - - allowed: List[str] = [] - seen: set[str] = set() - capture = False - path_re = re.compile(r"`([^`]+)`") - - for raw_line in issue_text.splitlines(): - line = raw_line.strip() - lower = line.lower() - marker_line = re.sub(r"^(?:#+\s*|[-*]\s*)", "", lower).strip() - if ( - marker_line.startswith(("allowed write set", "allowed write-set")) - or marker_line.startswith(("allowed files", "allowed paths")) - or marker_line.startswith("allowed only") - or ( - marker_line.startswith(("split contract", "issue contract")) - and "allowed" in marker_line - ) - ): - capture = True - elif capture and line.startswith("#"): - break - - if not capture: - continue - - matches = path_re.findall(line) - if not matches: - if allowed and not line: - break - continue - - for match in matches: - path = match.strip().replace("\\", "/").lstrip("./") - if not path or " " in path or path.startswith("#"): - continue - if path not in seen: - allowed.append(path) - seen.add(path) - - return allowed + """ + Deprecated thin wrapper around :func:`parse_issue_contract` (Issue #1013, F3). + + This helper used to do its own loose markdown scan for allowed-write + paths. 
It now delegates to the structured contract parser in + :mod:`pdd.agentic_common` so the public contract API (HTML-comment JSON + and fenced-block formats) is the single source of truth. The wrapper is + kept for one release so any external caller that imported the private + name does not crash at import time; it returns an empty list whenever + :func:`parse_issue_contract` cannot find a valid contract. + """ + contract = parse_issue_contract(issue_text) + return list(contract.allowed_paths) if contract is not None else [] def _apply_architecture_corrections( @@ -1547,7 +1527,7 @@ def run_agentic_sync( # 5. Build issue content issue_content = f"Title: {title}\n\nDescription:\n{body}\n" - raw_contract_text = body + comment_bodies: List[str] = [] if comments_data and isinstance(comments_data, list): issue_content += "\nComments:\n" for comment in comments_data: @@ -1555,11 +1535,48 @@ def run_agentic_sync( c_user = comment.get("user", {}).get("login", "unknown") c_body = comment.get("body", "") issue_content += f"\n--- Comment by {c_user} ---\n{c_body}\n" - raw_contract_text += f"\n{c_body}\n" + if isinstance(c_body, str) and c_body: + comment_bodies.append(c_body) + + # Issue #1013 — split-contract scope guard (F3, F4, F11): + # Parse the structured contract from the issue body first, then comments. + # When ``scope_guard=False``, log a single WARNING and skip parsing so the + # runner falls back to permissive mode regardless of contract content. 
+ issue_contract: Optional[IssueContract] = None + if scope_guard: + issue_contract = parse_issue_contract(body, comment_bodies) + if not quiet: + if issue_contract is not None: + console.print( + f"[dim]Sync scope guard: contract loaded from " + f"{issue_contract.source} " + f"({len(issue_contract.allowed_paths)} allowed paths)[/dim]" + ) + else: + console.print( + "[dim]Sync scope guard: no contract on issue — " + "running in permissive mode[/dim]" + ) + else: + if not quiet: + console.print( + "[yellow]Sync scope guard: disabled via --no-scope-guard[/yellow]" + ) - allowed_write_paths = ( - _extract_allowed_write_paths(raw_contract_text) if scope_guard else [] - ) + # Resolve effective allow set / companion allowlist for the runner. + # ``None`` (permissive) is preserved when no contract was parsed so the + # runner can distinguish "no contract" from "explicit empty contract". + if issue_contract is not None: + allowed_write_paths: Optional[List[str]] = list(issue_contract.allowed_paths) + effective_companion_allowlist: Tuple[str, ...] 
= tuple( + dict.fromkeys( + tuple(issue_contract.companion_allowlist) + + tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST) + ) + ) + else: + allowed_write_paths = None + effective_companion_allowlist = tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST) issue_content = _escape_format_braces(issue_content) @@ -1815,6 +1832,9 @@ def run_agentic_sync( "cwd": project_root, } if use_github_state else None + contract_source: Optional[str] = ( + issue_contract.source if issue_contract is not None else None + ) if durable: runner = DurableSyncRunner( basenames=modules_to_sync, @@ -1832,7 +1852,9 @@ def run_agentic_sync( module_cwds=module_cwds, initial_cost=llm_cost, allowed_write_set=allowed_write_paths, + companion_allowlist=effective_companion_allowlist, scope_guard_enabled=scope_guard, + contract_source=contract_source, ) else: runner = AsyncSyncRunner( @@ -1846,7 +1868,9 @@ def run_agentic_sync( module_cwds=module_cwds, initial_cost=llm_cost, allowed_write_set=allowed_write_paths, + companion_allowlist=effective_companion_allowlist, scope_guard_enabled=scope_guard, + contract_source=contract_source, ) runner_success, runner_msg, total_cost = runner.run() From 3423c71da0ce9d4ce42980e2dd8ae34a6962f7ef Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:19:03 -0700 Subject: [PATCH 05/42] fix(sync): F5-F9+F12+F14 rebuild scope guard around shared revert helpers --- pdd/agentic_sync_runner.py | 267 ++++++++++++++++++++++++------ tests/test_agentic_sync_runner.py | 59 +++---- 2 files changed, 249 insertions(+), 77 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 762465666..7c80d6058 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -8,6 +8,7 @@ import csv as _csv import datetime +import fnmatch import json import os import re @@ -18,13 +19,19 @@ import tempfile import threading import time +from collections import defaultdict from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait from dataclasses 
import dataclass, field from pathlib import Path -from typing import Any, Dict, List, NamedTuple, Optional, Tuple +from typing import Any, Dict, Iterable, List, NamedTuple, Optional, Set, Tuple from rich.console import Console +from .agentic_common import ( + DEFAULT_SYNC_COMPANION_ALLOWLIST, + _revert_out_of_scope_changes, +) +from .agentic_common_worktree import revert_out_of_scope_changes_with_dirs from .construct_paths import _is_known_language console = Console() @@ -855,10 +862,10 @@ def __init__( issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Any]] = None, initial_cost: float = 0.0, - allowed_write_paths: Optional[List[str]] = None, - allowed_write_set: Optional[List[str]] = None, - companion_allowlist: Optional[List[str]] = None, + allowed_write_set: Optional[Iterable[str]] = None, + companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True, + contract_source: Optional[str] = None, ): self.basenames: List[str] = list(basenames) self.dep_graph: Dict[str, List[str]] = { @@ -872,24 +879,43 @@ def __init__( self.project_root: Path = Path.cwd() self.module_cwds: Dict[str, Any] = dict(module_cwds or {}) self.initial_cost = float(initial_cost or 0.0) - del companion_allowlist # accepted for prompt/CLI compatibility - active_allowed_paths = ( - allowed_write_paths - if allowed_write_paths is not None - else allowed_write_set - ) - if not scope_guard_enabled: - active_allowed_paths = None - self.allowed_write_paths: set[str] = { - _normalize_repo_path(path) for path in (active_allowed_paths or []) if path - } - self._baseline_changed_paths: set[str] = ( - _git_changed_paths(self.project_root) if self.allowed_write_paths else set() + + # Issue #1013 — split-contract scope guard (F5, F9, F14): + # Track contract presence separately from set truthiness. ``None`` + # means "no contract → permissive fallback"; an explicit empty + # iterable means "contract present but empty → reject everything" + # (degenerate but legal). 
The single accepted kwarg name is + # ``allowed_write_set``; the legacy ``allowed_write_paths`` alias is + # gone per F14. + self.scope_guard_enabled: bool = bool(scope_guard_enabled) + self.contract_source: Optional[str] = contract_source + if scope_guard_enabled and allowed_write_set is not None: + self.allowed_write_paths: Optional[Set[str]] = { + _normalize_repo_path(path) + for path in allowed_write_set + if isinstance(path, str) and path.strip() + } + else: + self.allowed_write_paths = None + self.companion_allowlist: Tuple[str, ...] = tuple( + companion_allowlist if companion_allowlist is not None + else DEFAULT_SYNC_COMPANION_ALLOWLIST ) + # Per-`git toplevel` locks for scope-guard git operations (F12). + # Modules may share a repo root (shared-worktree non-durable sync) so + # the lock key MUST resolve to the actual git toplevel, not the raw + # module_cwd path — otherwise two modules in the same repo get + # separate locks and the race we're trying to prevent reappears. + self._scope_guard_locks: Dict[str, threading.Lock] = defaultdict(threading.Lock) + self._scope_guard_locks_lock = threading.Lock() + self.total_budget = self.sync_options.get("total_budget") self.max_workers = 1 if self.total_budget is not None else MAX_WORKERS - if self.allowed_write_paths: + # When a contract narrows writes, serialise scope-guard enforcement + # across modules so the per-cwd lock isn't fighting parallel git + # status / git checkout calls. + if self.allowed_write_paths is not None: self.max_workers = 1 self.module_states: Dict[str, ModuleState] = { @@ -1483,6 +1509,27 @@ def run(self) -> Tuple[bool, str, float]: f"module(s): {resumed}[/green]" ) + # Issue #1013 — split-contract scope guard logging on run entry. + # WARN on opt-out, dim INFO on permissive fallback, dim INFO with + # source/count when a contract was parsed. Suppress all of these + # under ``quiet`` to honour the orchestrator's no-non-error contract. 
+ if not self.quiet: + if not self.scope_guard_enabled: + console.print( + "[yellow]Scope guard disabled via --no-scope-guard[/yellow]" + ) + elif self.allowed_write_paths is None: + console.print( + "[dim]Scope guard: no contract on issue — " + "running in permissive mode[/dim]" + ) + else: + source = self.contract_source or "" + console.print( + f"[dim]Scope guard: contract loaded from {source} " + f"({len(self.allowed_write_paths)} allowed paths)[/dim]" + ) + self._update_github_comment() prev_sigint = signal.getsignal(signal.SIGINT) @@ -1705,6 +1752,25 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: last_stdout = "" last_stderr = "" repair_directive: Optional[str] = None + module_cwd = Path(self.module_cwds.get(basename, self.project_root)) + + def _apply_scope_guard( + success: bool, total_cost: float, error: str + ) -> Tuple[bool, float, str]: + """ + Wrap the result of a per-module attempt with scope-guard + enforcement (Issue #1013, F6, F7, F8). Runs after every attempt — + success OR failure — before returning so out-of-scope artifacts + are reverted even when ``pdd sync`` itself failed. 
+ """ + diagnostic = self._enforce_scope_guard(basename, module_cwd) + if diagnostic is None: + return success, total_cost, error + scope_failure = ( + "Scope guard hard-fail: out-of-scope artifacts detected\n" + + diagnostic + ) + return False, total_cost, scope_failure for attempt in range(MAX_CONFORMANCE_ATTEMPTS): success, cost, error, stdout, stderr = self._run_attempt( @@ -1718,19 +1784,12 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: last_stderr = stderr if success: - violations = self._allowed_write_set_violations() - if violations: - return ( - False, - total_cost, - self._format_allowed_write_set_error(violations), - ) - return True, total_cost, "" + return _apply_scope_guard(True, total_cost, "") conformance = _parse_conformance_failure(stdout, stderr) if conformance is None: # Not a conformance failure: do not retry - return False, total_cost, error + return _apply_scope_guard(False, total_cost, error) new_directive, new_missing = conformance if last_missing is not None and new_missing == last_missing: @@ -1749,32 +1808,142 @@ def _sync_one_module(self, basename: str) -> Tuple[bool, float, str]: ) break - # Hard-failure path: include structured conformance block + # Hard-failure path: include structured conformance block, then run + # the scope guard so a failing conformance loop still cleans up + # out-of-scope writes the LLM made on the way to the failure. 
hard_block = self._build_conformance_hard_failure( basename, last_error, last_stdout, last_stderr ) - return False, total_cost, hard_block - - def _allowed_write_set_violations(self) -> List[str]: - """Return newly changed paths outside the issue's allowed write set.""" - if not self.allowed_write_paths: - return [] - current = _git_changed_paths(self.project_root) - newly_changed = current - self._baseline_changed_paths - violations = [ - path for path in newly_changed if path not in self.allowed_write_paths - ] - return sorted(violations) + return _apply_scope_guard(False, total_cost, hard_block) - def _format_allowed_write_set_error(self, violations: List[str]) -> str: - allowed = ", ".join(sorted(self.allowed_write_paths)) or "" - blocked = ", ".join(violations) - return ( - "Issue split-contract allowed write set violation. " - f"Out-of-scope changed path(s): {blocked}. " - f"Allowed path(s): {allowed}. " - "Revert or explicitly justify companion artifacts before rerunning pdd sync." - ) + # ------------------------------------------------------------------ + # Issue #1013 — split-contract scope guard + # ------------------------------------------------------------------ + + def _resolve_repo_root(self, module_cwd: Path) -> Path: + """ + Return the git toplevel for *module_cwd*, falling back to *module_cwd* + when git is unavailable or the directory is not in a repo. 
+ """ + try: + result = subprocess.run( + ["git", "-C", str(module_cwd), "rev-parse", "--show-toplevel"], + capture_output=True, + text=True, + timeout=10, + ) + except (OSError, subprocess.SubprocessError): + return module_cwd + if result.returncode != 0: + return module_cwd + toplevel = result.stdout.strip() + if not toplevel: + return module_cwd + return Path(toplevel) + + def _scope_guard_lock(self, repo_root: Path) -> threading.Lock: + """Return a per-repo-root :class:`threading.Lock` (F12).""" + key = str(repo_root.resolve()) + with self._scope_guard_locks_lock: + return self._scope_guard_locks[key] + + def _matches_companion_allowlist( + self, rel_posix_path: str, allowlist: Iterable[str] + ) -> bool: + """Return True if *rel_posix_path* matches any companion glob.""" + for pattern in allowlist: + if not pattern: + continue + if fnmatch.fnmatch(rel_posix_path, pattern): + return True + return False + + def _enforce_scope_guard( + self, basename: str, module_cwd: Path + ) -> Optional[str]: + """ + Issue #1013 split-contract enforcement after each per-module sync. + + Returns: + ``None`` when the module is in scope (or enforcement is disabled); + a multi-line diagnostic string when out-of-scope artifacts were + detected and reverted/removed. + + This is a no-op when ``self.scope_guard_enabled`` is False or + ``self.allowed_write_paths is None`` (no parseable contract). + """ + if not self.scope_guard_enabled: + return None + if self.allowed_write_paths is None: + return None + + repo_root = self._resolve_repo_root(Path(module_cwd)) + lock = self._scope_guard_lock(repo_root) + with lock: + # Resolve contract paths to absolute paths under the repo root. + allowed_files: Set[Path] = set() + for rel in self.allowed_write_paths: + if not rel: + continue + allowed_files.add((repo_root / rel).resolve()) + + # Auto-allow companion artifacts (e.g. ``.pdd/meta/*.json``) that + # currently exist or are about to be created under the repo + # root. 
We add them to the allowed-files set so the helpers in + # ``agentic_common`` / ``agentic_common_worktree`` skip them. + allowlist = tuple(self.companion_allowlist) or DEFAULT_SYNC_COMPANION_ALLOWLIST + for path in repo_root.rglob("*"): + if not path.is_file(): + continue + try: + rel_posix = path.resolve().relative_to(repo_root).as_posix() + except ValueError: + continue + if self._matches_companion_allowlist(rel_posix, allowlist): + allowed_files.add(path.resolve()) + + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) + untracked_reverted = revert_out_of_scope_changes_with_dirs( + repo_root, allowed_dirs=set(), allowed_files=allowed_files + ) + + # Combine while preserving order and uniqueness for the diagnostic. + seen: Set[str] = set() + offending: List[str] = [] + for path in list(tracked_reverted) + list(untracked_reverted): + try: + rel = Path(path).resolve().relative_to(repo_root).as_posix() + except ValueError: + rel = str(path) + if rel in seen: + continue + # Filter out anything that ended up in the allowed set — + # e.g. companion artifacts that the helpers do not revert + # but that we still surface as no-ops. 
+ if (repo_root / rel).resolve() in allowed_files: + continue + seen.add(rel) + offending.append(rel) + + if not offending: + return None + + source = self.contract_source or "" + allowed_lines = "\n".join( + f" - {p}" for p in sorted(self.allowed_write_paths) + ) or " - " + companion_lines = "\n".join( + f" - {p}" for p in allowlist + ) or " - " + offending_lines = "\n".join(f" - {p}" for p in offending) + diagnostic = ( + f"Scope guard reverted {len(offending)} out-of-scope file(s) " + f"for module '{basename}' (contract source: {source}):\n" + f"{offending_lines}\n" + f"Allowed write set:\n{allowed_lines}\n" + f"Companion allowlist:\n{companion_lines}" + ) + return diagnostic def _build_conformance_hard_failure( self, diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index da4df2c61..f7a8cb082 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2519,50 +2519,53 @@ def test_missing_module_falls_back_to_project_root(self, mock_find, mock_popen, class TestAllowedWriteSet: - def test_successful_module_fails_when_new_path_outside_allowed_set(self, monkeypatch): - import pdd.agentic_sync_runner as mod - - snapshots = iter([ - {"README.md"}, - {"README.md", "pdd/allowed.py", "architecture.json"}, - ]) - monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: next(snapshots)) + """ + Issue #1013 (F14): the legacy ``allowed_write_paths`` kwarg was removed. + Only ``allowed_write_set`` is accepted by ``AsyncSyncRunner``. Deeper + behavioural coverage for the new ``_enforce_scope_guard`` helper lives + in ``TestEnforceScopeGuard`` below. 
+ """ + def test_allowed_write_set_forces_sequential_execution(self): runner = AsyncSyncRunner( - basenames=["allowed"], - dep_graph={"allowed": []}, + basenames=["a", "b"], + dep_graph={"a": [], "b": []}, sync_options={}, github_info=None, quiet=True, - allowed_write_paths=["pdd/allowed.py"], - ) - monkeypatch.setattr( - runner, - "_run_attempt", - lambda *_args, **_kwargs: (True, 0.0, "", "ok", ""), + allowed_write_set=["pdd/a.py"], ) - success, cost, error = runner._sync_one_module("allowed") + assert runner.max_workers == 1 - assert success is False - assert cost == 0.0 - assert "allowed write set violation" in error - assert "architecture.json" in error - assert "README.md" not in error + def test_permissive_mode_when_no_contract(self): + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=None, + ) - def test_allowed_write_set_forces_sequential_execution(self, monkeypatch): - import pdd.agentic_sync_runner as mod + assert runner.allowed_write_paths is None + assert runner.scope_guard_enabled is True + # No contract → no forced sequential execution + assert runner.max_workers == 4 - monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: set()) + def test_explicit_empty_contract_rejects_all_changes(self): runner = AsyncSyncRunner( - basenames=["a", "b"], - dep_graph={"a": [], "b": []}, + basenames=["a"], + dep_graph={"a": []}, sync_options={}, github_info=None, quiet=True, - allowed_write_paths=["pdd/a.py"], + allowed_write_set=[], ) + # Empty-but-present contract is still "contract present"; max_workers + # is forced to 1 and the allow set is the empty set (NOT None). 
+ assert runner.allowed_write_paths == set() assert runner.max_workers == 1 From 4f80645fc8d7d1b1ff308db49d51ab2044d3a234 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:20:23 -0700 Subject: [PATCH 06/42] fix(sync): F13 reuse scope guard in durable runner with companion allowlist --- pdd/durable_sync_runner.py | 38 +++++++++++++++++++++++-------- tests/test_durable_sync_runner.py | 37 ++++++++++++++++++++++++++++-- 2 files changed, 64 insertions(+), 11 deletions(-) diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 794b51a0d..bc3eae52c 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -8,6 +8,7 @@ # pylint: disable=too-few-public-methods from __future__ import annotations +import fnmatch import os import re import shutil @@ -20,6 +21,7 @@ from pathlib import Path from typing import Dict, List, Optional, Set, Tuple +from .agentic_common import DEFAULT_SYNC_COMPANION_ALLOWLIST from .agentic_sync_runner import AsyncSyncRunner, MAX_WORKERS CHECKPOINT_TRAILER = "PDD-Sync-Checkpoint-V1" @@ -47,10 +49,10 @@ def __init__( issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0, - allowed_write_paths: Optional[List[str]] = None, allowed_write_set: Optional[List[str]] = None, companion_allowlist: Optional[List[str]] = None, scope_guard_enabled: bool = True, + contract_source: Optional[str] = None, ) -> None: self.issue_number = issue_number self.git_root = project_root.resolve() @@ -76,10 +78,10 @@ def __init__( issue_url=issue_url, module_cwds={}, initial_cost=initial_cost, - allowed_write_paths=allowed_write_paths, allowed_write_set=allowed_write_set, companion_allowlist=companion_allowlist, scope_guard_enabled=scope_guard_enabled, + contract_source=contract_source, ) self.project_root = self.git_root if self.total_budget is not None: @@ -390,15 +392,33 @@ def _stage_module_changes( return True, "", empty def _out_of_scope_staged_paths(self, paths: List[str]) 
-> List[str]: - if not self.allowed_write_paths: + """ + Return staged paths that violate the issue split-contract. + + Issue #1013 (F5, F13): when no contract is parsed, + ``self.allowed_write_paths is None`` and durable sync runs in + permissive mode — never reject. When a contract is present, accept + both contract paths AND companion-allowlist matches (e.g. + ``.pdd/meta/*.json``) so fingerprint metadata can still be + checkpointed alongside the primary write set. + """ + # Permissive mode: scope_guard disabled or no contract parsed. + if not self.scope_guard_enabled or self.allowed_write_paths is None: return [] - return sorted( - { - path.replace(os.sep, "/").lstrip("./") - for path in paths - if path.replace(os.sep, "/").lstrip("./") not in self.allowed_write_paths - } + allowlist = ( + tuple(self.companion_allowlist) or DEFAULT_SYNC_COMPANION_ALLOWLIST ) + offending: Set[str] = set() + for raw in paths: + normalized = raw.replace(os.sep, "/").lstrip("./") + if normalized in self.allowed_write_paths: + continue + if any( + fnmatch.fnmatch(normalized, pattern) for pattern in allowlist + ): + continue + offending.add(normalized) + return sorted(offending) def _force_add_module_metadata(self, basename: str, module_worktree: Path) -> None: safe = basename.replace("/", "_") diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index e6cb1176e..ad9c558a5 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -326,12 +326,45 @@ def test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): def test_allowed_write_set_rejects_out_of_scope_checkpoint_paths(tmp_path: Path): + """ + Issue #1013 (F5, F13, F14): kwarg is now ``allowed_write_set`` (the + ``allowed_write_paths`` alias was removed) and ``.pdd/meta/*.json`` is + auto-allowed via ``DEFAULT_SYNC_COMPANION_ALLOWLIST`` — only paths + outside both the contract AND the companion allowlist are rejected. 
+ """ repo = _init_repo_with_remote(tmp_path) - runner = _runner(repo, allowed_write_paths=["src/app.py"]) + runner = _runner(repo, allowed_write_set=["src/app.py"]) assert runner._out_of_scope_staged_paths( ["src/app.py", "architecture.json", ".pdd/meta/foo_python.json"] - ) == [".pdd/meta/foo_python.json", "architecture.json"] + ) == ["architecture.json"] + + +def test_allowed_write_set_none_means_permissive_for_durable_runner(tmp_path: Path): + """ + Issue #1013 (F9): when no contract is parsed (``allowed_write_set=None``), + durable sync runs in permissive mode — out-of-scope rejection is a no-op. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo, allowed_write_set=None) + + assert runner._out_of_scope_staged_paths( + ["src/app.py", "architecture.json", "anything/else.txt"] + ) == [] + + +def test_allowed_write_set_empty_rejects_everything_for_durable_runner(tmp_path: Path): + """ + Issue #1013 (F9): explicit empty contract means "reject every primary + write" — though companion artifacts still pass via the default allowlist. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo, allowed_write_set=[]) + + result = runner._out_of_scope_staged_paths( + ["src/app.py", ".pdd/meta/foo_python.json"] + ) + assert result == ["src/app.py"] def test_push_failure_preserves_local_checkpoint_and_next_run_pushes_it(tmp_path: Path): From 604c2582fda4b93ba21b2cb7b8d615b0c76931d8 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:21:20 -0700 Subject: [PATCH 07/42] fix(sync): F10 add --no-scope-guard CLI flag to pdd sync --- pdd/commands/maintenance.py | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) diff --git a/pdd/commands/maintenance.py b/pdd/commands/maintenance.py index 6b35ae717..714a1acdd 100644 --- a/pdd/commands/maintenance.py +++ b/pdd/commands/maintenance.py @@ -121,6 +121,17 @@ default=None, help="Maximum parallel module worktrees in durable mode. 
Default: current runner concurrency.", ) +@click.option( + "--no-scope-guard", + "no_scope_guard", + is_flag=True, + default=False, + help="Issue-sync only. Disable the split-contract scope guard for this run. " + "By default, when the linked GitHub issue declares an allowed write set " + "(split contract), `pdd sync` enforces it and rejects out-of-scope generated " + "artifacts. Pass this flag only when intentionally overriding contract " + "enforcement (e.g. recovering from a stale contract).", +) @click.pass_context @track_cost def sync( @@ -143,6 +154,7 @@ def sync( durable_branch: Optional[str], no_resume: bool, durable_max_parallel: Optional[int], + no_scope_guard: bool, ) -> Optional[Tuple[str, float, str]]: """ Synchronize prompts with code and tests. @@ -179,6 +191,7 @@ def sync( max_attempts=max_attempts, one_session=effective_one_session, timeout_adder=timeout_adder, + scope_guard=not no_scope_guard, ) # Detect GitHub issue URL -> dispatch to agentic sync @@ -208,6 +221,7 @@ def sync( durable_branch=durable_branch, no_resume=no_resume, durable_max_parallel=durable_max_parallel, + scope_guard=not no_scope_guard, ) if durable or durable_branch or no_resume or durable_max_parallel is not None: @@ -255,6 +269,7 @@ def _run_agentic_sync_dispatch( durable_branch: Optional[str] = None, no_resume: bool = False, durable_max_parallel: Optional[int] = None, + scope_guard: bool = True, ) -> Optional[Tuple[str, float, str]]: """Dispatch to agentic sync runner for GitHub issue URLs.""" ctx.ensure_object(dict) @@ -281,6 +296,7 @@ def _run_agentic_sync_dispatch( durable_branch=durable_branch, no_resume=no_resume, durable_max_parallel=durable_max_parallel, + scope_guard=scope_guard, ) if not quiet: @@ -314,6 +330,7 @@ def _run_global_sync_dispatch( max_attempts: Optional[int], one_session: bool = False, timeout_adder: float = 0.0, + scope_guard: bool = True, ) -> Optional[Tuple[str, float, str]]: """Dispatch to global sync runner for no-argument `pdd sync`.""" 
ctx.ensure_object(dict) @@ -337,6 +354,7 @@ def _run_global_sync_dispatch( one_session=one_session, local=ctx.obj.get("local", False), timeout_adder=timeout_adder, + scope_guard=scope_guard, ) if not quiet: From 7b41c91e2a4bdeee9118c6b358e94055ff9cf423 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:40:39 -0700 Subject: [PATCH 08/42] test(sync): F16 migrate _extract_allowed_write_paths tests to structured contract formats The deprecated wrapper now delegates to parse_issue_contract, which only recognizes HTML-comment JSON or heading+fenced-block formats. Update the tests to cover both supported formats and confirm the loose-markdown-bullet format from earlier iterations is no longer accepted. --- tests/test_agentic_sync.py | 45 ++++++++++++++++++++++++++++++-------- 1 file changed, 36 insertions(+), 9 deletions(-) diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index f03e18d86..0ab652099 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -171,14 +171,23 @@ def test_deps_valid_case_insensitive(self): class TestExtractAllowedWritePaths: - def test_extracts_split_contract_allowed_paths(self): - issue = """ -## Split Contract -Allowed write set: + """ + Issue #1013 (F1, F3, F16): the deprecated ``_extract_allowed_write_paths`` + wrapper now delegates to :func:`pdd.agentic_common.parse_issue_contract`, + which only recognizes two structured contract formats: HTML-comment + blocks and heading+fenced-block. The legacy loose-markdown parsing tested + here previously is intentionally NOT supported by the new contract API — + deeper coverage lives in ``tests/test_agentic_common.py``. 
+ """ - * `pdd/update_main.py` - * `pdd/prompts/update_main_python.prompt` - * `tests/test_update_main.py` + def test_extracts_split_contract_allowed_paths_from_fenced_block(self): + issue = """ +## Allowed Write Set +```text +pdd/update_main.py +pdd/prompts/update_main_python.prompt +tests/test_update_main.py +``` But sync wrote other files. """ @@ -188,12 +197,30 @@ def test_extracts_split_contract_allowed_paths(self): "tests/test_update_main.py", ] + def test_extracts_split_contract_allowed_paths_from_html_comment(self): + issue = """ +Some discussion. + + + +More discussion. +""" + assert _extract_allowed_write_paths(issue) == [ + "pdd/update_main.py", + "tests/test_update_main.py", + ] + def test_returns_empty_without_contract_marker(self): assert _extract_allowed_write_paths("Touch `pdd/foo.py` if needed.") == [] - def test_ignores_historical_allowed_only_examples(self): + def test_ignores_loose_markdown_bullets_without_structured_block(self): + # The legacy markdown-bullet format is no longer supported; the new + # contract API requires either an HTML-comment or a fenced block. issue = """ -In PR #1010 for issue #1005, the issue contract allowed only: +## Split Contract +Allowed write set: * `pdd/update_main.py` * `tests/test_update_main.py` From 241021ba536f8c3c3e3f655807884a78ad192af4 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:42:39 -0700 Subject: [PATCH 09/42] fix(sync): strip only leading './' from contract paths, not arbitrary dot/slash MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Replaces .lstrip('./') in two scope-guard normalizers. lstrip strips any combination of '.' and '/' characters from the left, so '.pdd/meta/foo.json' was being rewritten to 'pdd/meta/foo.json' and missing the '.pdd/meta/*.json' companion glob — the auto-allow that issue #1013 documents. Now strips a single './' prefix only, preserving paths whose first segment begins with a dot. 
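For illustration only, a minimal sketch of the difference this commit describes. The helper name `strip_dot_slash` is hypothetical and is not part of the patch; it mirrors the single-prefix strip that the normalizers now perform, and the `.pdd/meta/*.json` glob is the companion pattern named above.

```python
import fnmatch


def strip_dot_slash(path: str) -> str:
    """Hypothetical helper: drop at most one leading './' from a repo-relative path."""
    cleaned = path.replace("\\", "/").strip()
    if cleaned.startswith("./"):
        cleaned = cleaned[2:]
    return cleaned


# str.lstrip treats its argument as a set of characters, not as a prefix,
# so the leading dot of a dot-directory is stripped as well.
mangled = ".pdd/meta/foo.json".lstrip("./")
kept = strip_dot_slash(".pdd/meta/foo.json")

# Only the unmangled path can still match the companion glob.
mangled_matches = fnmatch.fnmatch(mangled, ".pdd/meta/*.json")
kept_matches = fnmatch.fnmatch(kept, ".pdd/meta/*.json")
```

Under these assumptions, a path such as `./src/app.py` still normalizes to `src/app.py`, while dot-directories keep their leading dot.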
--- pdd/agentic_sync_runner.py | 14 ++++++++++++-- pdd/durable_sync_runner.py | 4 +++- 2 files changed, 15 insertions(+), 3 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 7c80d6058..fd18324d9 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -141,8 +141,18 @@ class DepGraphFromArchitectureResult(NamedTuple): def _normalize_repo_path(path: str) -> str: - """Normalize a repository-relative path for contract comparisons.""" - return str(path or "").replace("\\", "/").strip().lstrip("./") + """Normalize a repository-relative path for contract comparisons. + + Strip a single leading ``./`` segment only. Do NOT use ``str.lstrip("./")`` + which strips arbitrary leading ``.`` and ``/`` characters and would mangle + legitimate paths whose first segment starts with a dot (e.g. + ``.pdd/meta/foo.json`` would become ``pdd/meta/foo.json`` and miss the + ``.pdd/meta/*.json`` companion glob — Issue #1013 F5 regression). + """ + cleaned = str(path or "").replace("\\", "/").strip() + if cleaned.startswith("./"): + cleaned = cleaned[2:] + return cleaned def _git_changed_paths(project_root: Path) -> set[str]: diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index bc3eae52c..f77405acc 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -410,7 +410,9 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: ) offending: Set[str] = set() for raw in paths: - normalized = raw.replace(os.sep, "/").lstrip("./") + normalized = raw.replace(os.sep, "/").strip() + if normalized.startswith("./"): + normalized = normalized[2:] if normalized in self.allowed_write_paths: continue if any( From fc57b44896cd95b5dd7ef30c93e8531c14a2fb68 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:51:49 -0700 Subject: [PATCH 10/42] fix(sync): F1+F2 keep empty contracts as reject-all and tighten fenced-block parser MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: 8bit F1: parse_issue_contract used to return None when allowed_paths parsed to an empty list (either declared as [] or reduced to [] after dropping invalid entries). Per Issue #1013, a syntactically valid empty contract means "reject every change as out-of-scope" — return IssueContract(allowed_paths=()) instead so the runner can enforce reject-all. Update the docstring to match. F2: tighten the fenced-block regex so it requires the fence to IMMEDIATELY follow the heading (anchored with \A, only whitespace between) and restrict the info string to empty/text/json. Prevents the parser from picking up a random ```python``` or ```bash``` block somewhere later in the issue body. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_common.py | 51 ++++++++++++++++++++++++++++--------------- 1 file changed, 33 insertions(+), 18 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index 625a27816..465ede62a 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2342,10 +2342,13 @@ class IssueContract: re.DOTALL, ) -# Matches a fenced code block (```text``` or ```json```) optionally preceded by -# whitespace/newlines. Captures the inner body. +# Matches a fenced code block (```text``` or ```json```, or bare ```) that +# IMMEDIATELY follows the heading. Only whitespace/newlines may precede the +# fence (anchored via ``\A`` once we slice the text after the heading); the +# info string is restricted to empty/``text``/``json`` per spec. Captures the +# inner body. _FENCED_BLOCK_RE = re.compile( - r"```(?:[a-zA-Z0-9_-]+)?\s*\n(?P<body>.*?)```", + r"\A\s*```(?:text|json)?[ \t]*\n(?P<body>.*?)```", re.DOTALL, ) @@ -2356,8 +2359,10 @@ def _is_valid_contract_path(raw: object) -> bool: no traversal segments and no Windows separators. Validation runs inside the parser so a malformed entry never reaches the - runner.
Per the docstring on :func:`parse_issue_contract`, invalid entries - are dropped silently; if all entries drop, the parser returns None. + runner. Per Issue #1013 spec, syntactically invalid entries are dropped + silently; the *contract* itself remains valid even if the resulting + ``allowed_paths`` ends up empty (empty contract → reject-all enforcement, + see :func:`parse_issue_contract`). """ if not isinstance(raw, str): return False @@ -2392,10 +2397,10 @@ def _parse_html_comment_contract(text: str) -> Optional[IssueContract]: raw_allowed = parsed.get("allowed_paths") if not isinstance(raw_allowed, list): return None - # Drop invalid entries silently; if all entries drop, treat as no contract. + # Drop syntactically invalid entries silently. Per Issue #1013, a + # syntactically valid contract with an empty ``allowed_paths`` is a + # legal degenerate contract meaning "reject every change" — keep it. allowed = tuple(p.strip() for p in raw_allowed if _is_valid_contract_path(p)) - if not allowed: - return None raw_companion = parsed.get("companion_allowlist", []) if not isinstance(raw_companion, list): raw_companion = [] @@ -2410,12 +2415,19 @@ def _parse_html_comment_contract(text: str) -> Optional[IssueContract]: def _parse_fenced_block_contract(text: str) -> Optional[IssueContract]: - """Return a contract parsed from a heading + fenced code block, else None.""" + """Return a contract parsed from a heading + immediately-following fenced + code block, else None. The fence MUST be ```` ``` ```` / + ```` ```text ```` / ```` ```json ```` and MUST appear immediately after + the heading (only whitespace permitted between). 
When the fence body + contains no valid paths, the contract is still returned as an empty + (reject-all) contract per Issue #1013.""" header_match = _FENCED_BLOCK_HEADER_RE.search(text) if not header_match: return None after_header = text[header_match.end():] - block_match = _FENCED_BLOCK_RE.search(after_header) + # ``\A``-anchored regex: the fence must IMMEDIATELY follow the heading + # (only whitespace/newlines between heading end and the opening fence). + block_match = _FENCED_BLOCK_RE.match(after_header) if not block_match: return None body = block_match.group("body") @@ -2433,8 +2445,7 @@ def _parse_fenced_block_contract(text: str) -> Optional[IssueContract]: if line not in seen: paths.append(line) seen.add(line) - if not paths: - return None + # Empty fenced block is a legal degenerate contract (reject all). return IssueContract( allowed_paths=tuple(paths), companion_allowlist=(), @@ -2494,12 +2505,16 @@ def parse_issue_contract( (issues are edited authoritatively; comments are append-only and may carry stale snapshots from earlier workflow steps). - Path entries are validated as repo-relative POSIX paths: invalid entries - (absolute, containing ``..``, empty, or using Windows separators ``\\``) - are dropped silently. If ``allowed_paths`` becomes empty after dropping, - the parser returns ``None`` (treated as "no contract → permissive - fallback"). Resolution to absolute filesystem paths is the caller's job - once it knows the repo root. + Path entries are validated as repo-relative POSIX paths: syntactically + invalid entries (absolute, containing ``..``, empty, or using Windows + separators ``\\``) are dropped silently. A syntactically valid contract + whose ``allowed_paths`` ends up empty (either declared as ``[]`` or + reduced to ``[]`` after dropping invalid entries) is still returned as + an :class:`IssueContract` with ``allowed_paths=()``; the caller treats + that as a degenerate "reject every change" contract. 
The parser returns + ``None`` only when there is no parseable marker at all or when the + marker payload is syntactically malformed. Resolution to absolute + filesystem paths is the caller's job once it knows the repo root. The parser MUST NOT raise on any input: malformed JSON, missing fields, unexpected types, or absent markers all return ``None``. From d297f87b6dea356e0e87dcd54bf50b40a580dea3 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:53:01 -0700 Subject: [PATCH 11/42] fix(sync): F3 use pathlib-style match for companion allowlist globs fnmatch.fnmatch treats ``*`` as matching any character including ``/``, so a companion pattern of ``.pdd/meta/*.json`` was inadvertently allowing nested paths like ``.pdd/meta/nested/foo.json``. Issue #1013 specifies pathlib-style glob semantics. Switch both AsyncSyncRunner._matches_companion_allowlist and DurableSyncRunner._out_of_scope_staged_paths to PurePosixPath(rel).match(pat). Drop the now-unused fnmatch / DEFAULT_SYNC_COMPANION_ALLOWLIST imports. 
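The difference described above can be reproduced with a minimal sketch. The helper name `compare_glob_semantics` is hypothetical and exists only for illustration; it is not part of the patch.

```python
import fnmatch
from pathlib import PurePosixPath
from typing import Tuple


def compare_glob_semantics(rel_path: str, pattern: str) -> Tuple[bool, bool]:
    """Return (fnmatch_result, pathlib_result) for the same path and pattern.

    Hypothetical helper for illustration only; not part of pdd.
    """
    # fnmatch: ``*`` may span ``/``, so nested paths can match.
    loose = fnmatch.fnmatch(rel_path, pattern)
    # PurePosixPath.match: ``*`` stays within a single path segment.
    strict = PurePosixPath(rel_path).match(pattern)
    return loose, strict


PATTERN = ".pdd/meta/*.json"
print(compare_glob_semantics(".pdd/meta/foo.json", PATTERN))
print(compare_glob_semantics(".pdd/meta/nested/foo.json", PATTERN))
```

Note that ``PurePosixPath.match`` with a relative pattern matches from the right, so this check presumably relies on the candidate already being a normalised repo-relative path.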
Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync_runner.py | 19 ++++++++++++++----- pdd/durable_sync_runner.py | 26 +++++++++++++++++--------- 2 files changed, 31 insertions(+), 14 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index fd18324d9..a5695a261 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -8,7 +8,6 @@ import csv as _csv import datetime -import fnmatch import json import os import re @@ -22,7 +21,7 @@ from collections import defaultdict from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait from dataclasses import dataclass, field -from pathlib import Path +from pathlib import Path, PurePosixPath from typing import Any, Dict, Iterable, List, NamedTuple, Optional, Set, Tuple from rich.console import Console @@ -1860,12 +1859,22 @@ def _scope_guard_lock(self, repo_root: Path) -> threading.Lock: def _matches_companion_allowlist( self, rel_posix_path: str, allowlist: Iterable[str] ) -> bool: - """Return True if *rel_posix_path* matches any companion glob.""" + """Return True if *rel_posix_path* matches any companion glob. + + Uses ``pathlib.PurePosixPath.match`` (not ``fnmatch.fnmatch``) so that + ``.pdd/meta/*.json`` does NOT inadvertently match nested paths like + ``.pdd/meta/nested/foo.json``. Matches Issue #1013 spec. + """ + candidate = PurePosixPath(rel_posix_path) for pattern in allowlist: if not pattern: continue - if fnmatch.fnmatch(rel_posix_path, pattern): - return True + try: + if candidate.match(pattern): + return True + except ValueError: + # Invalid glob pattern — treat as non-match rather than raise. 
+ continue return False def _enforce_scope_guard( diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index f77405acc..af0a8e0f7 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -8,7 +8,6 @@ # pylint: disable=too-few-public-methods from __future__ import annotations -import fnmatch import os import re import shutil @@ -18,10 +17,9 @@ import time import uuid from hashlib import sha1 -from pathlib import Path +from pathlib import Path, PurePosixPath from typing import Dict, List, Optional, Set, Tuple -from .agentic_common import DEFAULT_SYNC_COMPANION_ALLOWLIST from .agentic_sync_runner import AsyncSyncRunner, MAX_WORKERS CHECKPOINT_TRAILER = "PDD-Sync-Checkpoint-V1" @@ -405,9 +403,7 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: # Permissive mode: scope_guard disabled or no contract parsed. if not self.scope_guard_enabled or self.allowed_write_paths is None: return [] - allowlist = ( - tuple(self.companion_allowlist) or DEFAULT_SYNC_COMPANION_ALLOWLIST - ) + allowlist = tuple(self.companion_allowlist) offending: Set[str] = set() for raw in paths: normalized = raw.replace(os.sep, "/").strip() @@ -415,9 +411,21 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: normalized = normalized[2:] if normalized in self.allowed_write_paths: continue - if any( - fnmatch.fnmatch(normalized, pattern) for pattern in allowlist - ): + # F3 (Issue #1013): companion glob matching uses pathlib-style + # semantics so ``.pdd/meta/*.json`` does NOT match nested paths + # like ``.pdd/meta/nested/foo.json``. 
+ candidate = PurePosixPath(normalized) + matched = False + for pattern in allowlist: + if not pattern: + continue + try: + if candidate.match(pattern): + matched = True + break + except ValueError: + continue + if matched: continue offending.add(normalized) return sorted(offending) From 7de4c4d037ef28fcd627d12c2e94c927a379e571 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:54:31 -0700 Subject: [PATCH 12/42] fix(sync): F4+F6+F7 union companion allowlist, record contract under opt-out, single log line MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit F4: AsyncSyncRunner now ALWAYS unions the caller-supplied companion_allowlist with DEFAULT_SYNC_COMPANION_ALLOWLIST (caller patterns first, defaults appended, deterministic dedup). Previously the runner stored the tuple as-is and only fell back to the default when empty, so a caller passing [".github/*.yml"] would silently lose .pdd/meta/*.json coverage. The dead ``or DEFAULT_SYNC_COMPANION_ALLOWLIST`` fallback at the enforcement call site is removed now that __init__ guarantees a non-empty allowlist. F6: parse_issue_contract is now called UNCONDITIONALLY in run_agentic_sync, and the parsed contract is plumbed through to the runner regardless of the ``--no-scope-guard`` flag. The runner records ``allowed_write_paths`` and ``contract_source`` for diagnostics even when scope_guard_enabled=False; _enforce_scope_guard short-circuits on the disabled flag so behaviour is unchanged at enforcement time. The max_workers serialisation gate is also updated to require ``scope_guard_enabled AND contract`` — opt-out runs no longer needlessly drop to single-threaded execution. F7: scope-guard status logging now lives in exactly one layer (``agentic_sync.run_agentic_sync``, closer to the user). The runner's duplicate ``run()``-entry log block is removed. 
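A minimal sketch of the F4 union, assuming the default allowlist holds only ``.pdd/meta/*.json``. The real ``DEFAULT_SYNC_COMPANION_ALLOWLIST`` lives in ``pdd.agentic_common`` and its exact contents are not shown in this patch; ``effective_allowlist`` is a hypothetical name.

```python
from typing import Iterable, Optional, Tuple

# Assumption: stand-in for DEFAULT_SYNC_COMPANION_ALLOWLIST; the real
# constant is defined in pdd.agentic_common and may differ.
ASSUMED_DEFAULT_ALLOWLIST: Tuple[str, ...] = (".pdd/meta/*.json",)


def effective_allowlist(provided: Optional[Iterable[object]]) -> Tuple[str, ...]:
    """Union caller patterns with the defaults, caller patterns first."""
    # Keep only non-empty strings; ``None`` behaves like an empty iterable.
    cleaned = tuple(p for p in (provided or ()) if isinstance(p, str) and p)
    # dict.fromkeys preserves first-seen order, giving a deterministic dedup.
    return tuple(dict.fromkeys(cleaned + ASSUMED_DEFAULT_ALLOWLIST))


print(effective_allowlist([".github/*.yml"]))
print(effective_allowlist(None))
```

Under this sketch a caller passing only ``.github/*.yml`` keeps the default companion coverage, which is the regression F4 describes.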
Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync.py | 49 +++++++++++++++++------------- pdd/agentic_sync_runner.py | 61 ++++++++++++++++++++------------------ 2 files changed, 60 insertions(+), 50 deletions(-) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index fbf071591..7a182fd35 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -1538,34 +1538,41 @@ def run_agentic_sync( if isinstance(c_body, str) and c_body: comment_bodies.append(c_body) - # Issue #1013 — split-contract scope guard (F3, F4, F11): - # Parse the structured contract from the issue body first, then comments. - # When ``scope_guard=False``, log a single WARNING and skip parsing so the - # runner falls back to permissive mode regardless of contract content. - issue_contract: Optional[IssueContract] = None - if scope_guard: - issue_contract = parse_issue_contract(body, comment_bodies) - if not quiet: - if issue_contract is not None: - console.print( - f"[dim]Sync scope guard: contract loaded from " - f"{issue_contract.source} " - f"({len(issue_contract.allowed_paths)} allowed paths)[/dim]" - ) - else: - console.print( - "[dim]Sync scope guard: no contract on issue — " - "running in permissive mode[/dim]" - ) - else: - if not quiet: + # Issue #1013 — split-contract scope guard (F3, F4, F6, F11): + # Parse the structured contract from the issue body first, then comments, + # *regardless* of whether scope-guard enforcement is enabled — the + # ``--no-scope-guard`` opt-out should still record the parsed contract + # for diagnostics. The runner short-circuits enforcement when + # ``scope_guard_enabled=False`` (see AsyncSyncRunner._enforce_scope_guard). + issue_contract: Optional[IssueContract] = parse_issue_contract( + body, comment_bodies + ) + if not quiet: + if not scope_guard: + # F7: this is the single user-facing log line for the + # ``--no-scope-guard`` opt-out. The runner no longer logs the + # same state on entry — kept here so it's closer to the user. 
console.print( "[yellow]Sync scope guard: disabled via --no-scope-guard[/yellow]" ) + elif issue_contract is not None: + console.print( + f"[dim]Sync scope guard: contract loaded from " + f"{issue_contract.source} " + f"({len(issue_contract.allowed_paths)} allowed paths)[/dim]" + ) + else: + console.print( + "[dim]Sync scope guard: no contract on issue — " + "running in permissive mode[/dim]" + ) # Resolve effective allow set / companion allowlist for the runner. # ``None`` (permissive) is preserved when no contract was parsed so the # runner can distinguish "no contract" from "explicit empty contract". + # The runner unions the companion allowlist with + # DEFAULT_SYNC_COMPANION_ALLOWLIST in its __init__ (F4); we still pre-union + # here so the durable runner's parent __init__ does the same dedup pass. if issue_contract is not None: allowed_write_paths: Optional[List[str]] = list(issue_contract.allowed_paths) effective_companion_allowlist: Tuple[str, ...] = tuple( diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index a5695a261..18a7688ba 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -889,16 +889,22 @@ def __init__( self.module_cwds: Dict[str, Any] = dict(module_cwds or {}) self.initial_cost = float(initial_cost or 0.0) - # Issue #1013 — split-contract scope guard (F5, F9, F14): + # Issue #1013 — split-contract scope guard (F5, F9, F14, F4, F6): # Track contract presence separately from set truthiness. ``None`` # means "no contract → permissive fallback"; an explicit empty # iterable means "contract present but empty → reject everything" # (degenerate but legal). The single accepted kwarg name is # ``allowed_write_set``; the legacy ``allowed_write_paths`` alias is # gone per F14. + # + # F6: the parsed contract is recorded for diagnostics *regardless* of + # whether scope-guard enforcement is enabled. 
``_enforce_scope_guard`` + # short-circuits on ``scope_guard_enabled=False``, so storing the + # contract in opt-out mode is safe and matches the spec requirement + # that disabled runners still see the parsed contract. self.scope_guard_enabled: bool = bool(scope_guard_enabled) self.contract_source: Optional[str] = contract_source - if scope_guard_enabled and allowed_write_set is not None: + if allowed_write_set is not None: self.allowed_write_paths: Optional[Set[str]] = { _normalize_repo_path(path) for path in allowed_write_set @@ -906,9 +912,18 @@ def __init__( } else: self.allowed_write_paths = None + # F4: the effective companion allowlist is *always* the caller-provided + # patterns unioned with DEFAULT_SYNC_COMPANION_ALLOWLIST. Order is + # preserved (caller patterns first, defaults appended) and duplicates + # are removed deterministically. Passing an empty iterable still + # produces at least the default; passing ``None`` is identical to + # passing an empty iterable. + provided: Tuple[str, ...] = tuple( + p for p in (companion_allowlist or ()) + if isinstance(p, str) and p + ) self.companion_allowlist: Tuple[str, ...] = tuple( - companion_allowlist if companion_allowlist is not None - else DEFAULT_SYNC_COMPANION_ALLOWLIST + dict.fromkeys(provided + tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST)) ) # Per-`git toplevel` locks for scope-guard git operations (F12). @@ -921,10 +936,12 @@ def __init__( self.total_budget = self.sync_options.get("total_budget") self.max_workers = 1 if self.total_budget is not None else MAX_WORKERS - # When a contract narrows writes, serialise scope-guard enforcement - # across modules so the per-cwd lock isn't fighting parallel git - # status / git checkout calls. - if self.allowed_write_paths is not None: + # When a contract narrows writes AND scope-guard enforcement is + # active, serialise across modules so the per-cwd lock isn't fighting + # parallel git status / git checkout calls. 
With ``--no-scope-guard`` + # the contract is recorded for diagnostics only — no enforcement runs + # — so parallelism is preserved (F6). + if self.scope_guard_enabled and self.allowed_write_paths is not None: self.max_workers = 1 self.module_states: Dict[str, ModuleState] = { @@ -1518,26 +1535,10 @@ def run(self) -> Tuple[bool, str, float]: f"module(s): {resumed}[/green]" ) - # Issue #1013 — split-contract scope guard logging on run entry. - # WARN on opt-out, dim INFO on permissive fallback, dim INFO with - # source/count when a contract was parsed. Suppress all of these - # under ``quiet`` to honour the orchestrator's no-non-error contract. - if not self.quiet: - if not self.scope_guard_enabled: - console.print( - "[yellow]Scope guard disabled via --no-scope-guard[/yellow]" - ) - elif self.allowed_write_paths is None: - console.print( - "[dim]Scope guard: no contract on issue — " - "running in permissive mode[/dim]" - ) - else: - source = self.contract_source or "" - console.print( - f"[dim]Scope guard: contract loaded from {source} " - f"({len(self.allowed_write_paths)} allowed paths)[/dim]" - ) + # Issue #1013 (F7): scope-guard status logging happens once in the + # sync-layer dispatch (``agentic_sync.run_agentic_sync``) — that's + # closer to the user and emits exactly one INFO/WARNING line. The + # runner intentionally does NOT log the same state again on entry. self._update_github_comment() @@ -1910,7 +1911,9 @@ def _enforce_scope_guard( # currently exist or are about to be created under the repo # root. We add them to the allowed-files set so the helpers in # ``agentic_common`` / ``agentic_common_worktree`` skip them. - allowlist = tuple(self.companion_allowlist) or DEFAULT_SYNC_COMPANION_ALLOWLIST + # ``self.companion_allowlist`` already includes DEFAULT_* + # (unioned in __init__ per F4); no fallback needed here. 
+ allowlist = tuple(self.companion_allowlist) for path in repo_root.rglob("*"): if not path.is_file(): continue From eb775a95477ce96d6580f6d8f50565c7c4dab85f Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:55:54 -0700 Subject: [PATCH 13/42] fix(sync): F5 use --untracked-files=all and handle untracked dirs in revert helper revert_out_of_scope_changes_with_dirs invoked ``git status --porcelain -u``, which is ambiguous: in some git versions/configs the result can collapse an untracked directory into a single ``?? subdir/`` entry instead of listing the files within. ``os.remove`` then fails on the directory and the contained files are left behind, defeating the scope-guard's "remove out-of-scope untracked files" promise that Issue #1013 relies on. Fix: - Use the explicit ``--untracked-files=all`` form so individual files are always listed. - Defensively detect directory targets (path ending in ``/`` or filesystem shows is_dir) and use ``shutil.rmtree`` instead of ``os.remove`` so any remaining nested files are still removed. Update the corresponding prompt text in pdd/prompts/agentic_common_worktree_python.prompt:46 to match the corrected behavior (the requirement is strengthened, not weakened). Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_common_worktree.py | 30 +++++++++++++++++-- .../agentic_common_worktree_python.prompt | 2 +- 2 files changed, 28 insertions(+), 4 deletions(-) diff --git a/pdd/agentic_common_worktree.py b/pdd/agentic_common_worktree.py index ad08b66b8..98afeb9f9 100644 --- a/pdd/agentic_common_worktree.py +++ b/pdd/agentic_common_worktree.py @@ -389,8 +389,14 @@ def revert_out_of_scope_changes_with_dirs( reverted: List[Path] = [] try: + # ``--untracked-files=all`` (a.k.a. ``-uall``) forces git to list every + # individual untracked file even when ``status.showUntrackedFiles`` is + # ``normal`` or when bare ``-u`` would be interpreted as "default mode" + # by older git releases. 
Without this, an untracked directory would be + # reported as a single ``?? path/`` entry and the os.remove below would + # leave the contained files behind. (Issue #1013, F5.) result = subprocess.run( - ["git", "status", "--porcelain", "-u"], + ["git", "status", "--porcelain", "--untracked-files=all"], cwd=str(cwd), capture_output=True, text=True, @@ -448,9 +454,27 @@ def revert_out_of_scope_changes_with_dirs( rel_path = Path(filepath_str) if is_untracked: + # Defensive: even with ``--untracked-files=all`` above, exotic git + # configs / submodule edge cases could conceivably hand us a path + # ending in ``/`` (an untracked directory). Detect and use + # ``shutil.rmtree`` so contained files don't get left behind. + # (Issue #1013, F5.) + target = cwd / filepath_str try: - os.remove(str(cwd / filepath_str)) - logger.info("Removed untracked out-of-scope file: %s", filepath_str) + if filepath_str.endswith("/") or ( + target.exists() and target.is_dir() and not target.is_symlink() + ): + shutil.rmtree(str(target)) + logger.info( + "Removed untracked out-of-scope directory: %s", + filepath_str, + ) + else: + os.remove(str(target)) + logger.info( + "Removed untracked out-of-scope file: %s", + filepath_str, + ) reverted.append(rel_path) except OSError as exc: logger.warning("Failed to remove %s: %s", filepath_str, exc) diff --git a/pdd/prompts/agentic_common_worktree_python.prompt b/pdd/prompts/agentic_common_worktree_python.prompt index b7afe1e44..165af32cf 100644 --- a/pdd/prompts/agentic_common_worktree_python.prompt +++ b/pdd/prompts/agentic_common_worktree_python.prompt @@ -43,7 +43,7 @@ All functions are public (no leading underscore) so orchestrators can import the 7. **`setup_worktree(cwd, issue_number, quiet, *, resume_existing=False, branch_prefix="fix", worktree_prefix="fix") -> Tuple[Optional[Path], Optional[str]]`**: Create an isolated git worktree at `.pdd/worktrees/{worktree_prefix}-issue-{issue_number}/` on branch `{branch_prefix}/issue-{issue_number}`. 
Clean up existing worktree/directory and branch before creating. If `resume_existing` is True and branch exists, reuse it (attach with `--force`). Otherwise delete the old branch first. When reusing an undeletable branch, reset to main ref after attaching. Print worktree path unless `quiet`. The `branch_prefix` and `worktree_prefix` kwargs let callers customize naming (e.g. `change` prefix for change workflows, `fix` for bug workflows). Return `(worktree_path, None)` on success, `(None, error_msg)` on failure. 8. **`get_modified_and_untracked(cwd: Path) -> List[str]`**: Return modified tracked files (`git diff --name-only HEAD`) plus untracked files (`git ls-files --others --exclude-standard`). 9. **`check_target_file_unchanged(cwd: Path, target_file: str, baseline_sha: Optional[str] = None) -> Tuple[bool, Optional[str]]`**: Detect concurrent edits. Run `git fetch origin` then `git rev-parse origin/main:{target_file}`. If `baseline_sha` is provided, compare current SHA against it — return `(True, current_sha)` if unchanged, `(False, current_sha)` if changed. If `baseline_sha` is None, just return `(True, current_sha)` to establish the baseline. Return `(True, None)` on git failures (fail-open to avoid blocking workflows). -10. **`revert_out_of_scope_changes_with_dirs(cwd: Path, allowed_dirs: set[str], allowed_files: set[Path]) -> List[Path]`**: Scope guard that detects both tracked changes AND new untracked files via `git status --porcelain -u`. For each changed/new file, check if its path starts with any prefix in `allowed_dirs` OR its resolved absolute path is in `allowed_files`. Revert tracked out-of-scope changes via `git checkout HEAD --`. Remove untracked out-of-scope files via `os.remove`. Return list of reverted/removed paths. Log actions via module logger. Handle timeout and OS errors gracefully. +10. 
**`revert_out_of_scope_changes_with_dirs(cwd: Path, allowed_dirs: set[str], allowed_files: set[Path]) -> List[Path]`**: Scope guard that detects both tracked changes AND new untracked files via `git status --porcelain --untracked-files=all` (a.k.a. `-uall`). The explicit `--untracked-files=all` is required so untracked content nested under a brand-new directory is expanded into individual `?? path/to/file` entries — bare `-u` is ambiguous across git versions/configs and may collapse the directory into a single `?? subdir/` entry that `os.remove` cannot delete. For each changed/new file, check if its path starts with any prefix in `allowed_dirs` OR its resolved absolute path is in `allowed_files`. Revert tracked out-of-scope changes via `git checkout HEAD --`. Remove untracked out-of-scope files via `os.remove`; if (defensively) git ever reports an untracked directory (path ending in `/` or whose target resolves to a directory), use `shutil.rmtree` instead so contained files don't get left behind. Return list of reverted/removed paths. Log actions via module logger. Handle timeout and OS errors gracefully. 11. **`extract_block_marker(output: str, name: str) -> str`**: Parse a multi-line block delimited by `BEGIN_{name}` and `END_{name}` markers from agent output. Return the content between markers (stripped), or empty string if markers not found. Case-insensitive marker matching. % Dependencies From 5074a95485287d06862157a6f8aa4bbf9108eea8 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 17:56:20 -0700 Subject: [PATCH 14/42] fix(sync): F8 emit scope-guard diagnostic to stderr at revert time MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit The spec (pdd/prompts/agentic_sync_runner_python.prompt:69) says the scope-guard diagnostic is printed to stderr. 
Until now the diagnostic only appeared inside the assembled module-failure error string surfaced later by maintenance.py — operators tailing stderr in real time never saw it. Add ``print(diagnostic, file=sys.stderr)`` immediately after the revert operations inside ``_enforce_scope_guard``. The deferred stdout echo in maintenance.py is a separate event and remains. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync_runner.py | 5 +++++ 1 file changed, 5 insertions(+) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 18a7688ba..52b3d8f1e 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -1965,6 +1965,11 @@ def _enforce_scope_guard( f"Allowed write set:\n{allowed_lines}\n" f"Companion allowlist:\n{companion_lines}" ) + # F8 (Issue #1013): print the diagnostic to stderr immediately + # after reverting. ``maintenance.py`` separately echoes the + # assembled module-failure error at the end of the run — two + # distinct events, so keep both. + print(diagnostic, file=sys.stderr) return diagnostic def _build_conformance_hard_failure( From f8c62a2507843c7884dd8d6be1209742b1fdf786 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 18:18:00 -0700 Subject: [PATCH 15/42] test(sync): F9 add parse_issue_contract regression coverage Adds TestParseIssueContract with 9 cases covering Issue #1013 prompt requirements (item 21): HTML-comment happy path, empty allowed_paths as reject-all contract (F1), malformed JSON returns None, body-marker wins over comment-marker, path traversal dropped silently, fenced block accepts only text/json/bare fence (F2), fenced block must immediately follow heading (F2), empty fenced body returns empty contract. 
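The fenced-block cases listed above can be reproduced with a reduced sketch of the anchored match. ``HEADING_RE`` and ``fenced_paths`` are hypothetical simplifications: the real ``_FENCED_BLOCK_HEADER_RE`` is not shown in this series, and the real parser also validates each path and returns an ``IssueContract``.

```python
import re
from typing import Optional, Tuple

# Assumption: simplified heading pattern for illustration; the real
# _FENCED_BLOCK_HEADER_RE is not shown in this patch series.
HEADING_RE = re.compile(
    r"^#+[ \t]*Allowed Write Set[ \t]*$", re.MULTILINE | re.IGNORECASE
)
# Same shape as the F2 regex: anchored at the start of the sliced text,
# info string limited to empty / text / json.
FENCE_RE = re.compile(r"\A\s*```(?:text|json)?[ \t]*\n(?P<body>.*?)```", re.DOTALL)


def fenced_paths(text: str) -> Optional[Tuple[str, ...]]:
    """Return the fence's non-empty lines, or None if no qualifying fence."""
    heading = HEADING_RE.search(text)
    if heading is None:
        return None
    # ``match`` plus ``\A`` means only whitespace may precede the fence.
    block = FENCE_RE.match(text[heading.end():])
    if block is None:
        return None
    lines = (line.strip() for line in block.group("body").splitlines())
    return tuple(line for line in lines if line)


print(fenced_paths("## Allowed Write Set\n```text\npdd/foo.py\n```\n"))
print(fenced_paths("## Allowed Write Set\n```python\npdd/foo.py\n```\n"))
```

An empty tuple here corresponds to the reject-all contract, while ``None`` corresponds to no contract being found.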
--- tests/test_agentic_common.py | 113 +++++++++++++++++++++++++++++++++++ 1 file changed, 113 insertions(+) diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 42b64f368..24972f2b6 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -6986,3 +6986,116 @@ def test_anthropic_is_error_json_envelope_skips_retries( ) # 3. No backoff sleep — permanent errors must NOT delay the fallback sleep_mock.assert_not_called() + + +# --------------------------------------------------------------------------- +# Issue #1013 — IssueContract / parse_issue_contract regression coverage (F9) +# --------------------------------------------------------------------------- + +class TestParseIssueContract: + """Regression coverage for ``pdd.agentic_common.parse_issue_contract``. + + Exercises the prompt-level requirements at + ``pdd/prompts/agentic_common_python.prompt:21`` (item 21 — issue contract + parsing) and the F1+F2 hardening done in Issue #1013 review iteration 2. 
+ """ + + def test_html_comment_happy_path_returns_contract(self): + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.companion_allowlist == (".pdd/meta/*.json",) + assert c.source == "html-comment" + + def test_empty_allowed_paths_returns_reject_all_contract(self): + """F1: an explicit empty contract is a valid 'reject every change' + contract, NOT permissive fallback.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = '' + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () + assert c.source == "html-comment" + + def test_malformed_json_returns_none(self): + from pdd.agentic_common import parse_issue_contract + + body = "" + assert parse_issue_contract(body) is None + + def test_body_marker_wins_over_comment_marker(self): + from pdd.agentic_common import parse_issue_contract + + body = '' + comment = ( + '' + ) + c = parse_issue_contract(body, [comment]) + assert c is not None + assert c.allowed_paths == ("from_body.py",) + + def test_path_traversal_entries_are_dropped_but_contract_kept(self): + """F1: syntactically invalid entries are dropped silently; the + contract itself remains valid even if filtering leaves it empty.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/ok.py",) + + def test_fenced_block_only_text_or_json_languages_are_accepted(self): + """F2: the parser must reject arbitrary fence info strings such as + ``python`` or ``bash`` — only empty / ``text`` / ``json`` are + accepted.""" + from pdd.agentic_common import parse_issue_contract + + for lang in ("python", "bash", "yaml", "shell"): + body = f"## Allowed Write 
Set\n```{lang}\npdd/foo.py\n```\n" + assert parse_issue_contract(body) is None, ( + f"fence language {lang!r} must be rejected" + ) + + def test_fenced_block_must_immediately_follow_heading(self): + """F2: a fence that appears later in the body (after intervening + prose) must NOT be picked up — only whitespace is allowed between + the heading and the fence.""" + from pdd.agentic_common import parse_issue_contract + + body = ( + "## Allowed Write Set\n\n" + "Some discussion paragraph here.\n\n" + "```text\npdd/foo.py\n```\n" + ) + assert parse_issue_contract(body) is None + + def test_fenced_block_accepts_text_json_or_bare_fence(self): + from pdd.agentic_common import parse_issue_contract + + for fence in ("```text", "```json", "```"): + body = f"## Allowed Write Set\n{fence}\npdd/foo.py\n```\n" + c = parse_issue_contract(body) + assert c is not None and c.allowed_paths == ("pdd/foo.py",), fence + assert c.source == "fenced-block" + + def test_fenced_block_empty_body_returns_empty_contract(self): + """F1: an empty fenced block is a degenerate but legal contract.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = "## Allowed Write Set\n```text\n```\n" + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () From fea52003a60d06a8b0d0f59099589f2aca924f45 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 18:21:22 -0700 Subject: [PATCH 16/42] test(sync): F9 add scope-guard runner and CLI flag regression coverage TestEnforceScopeGuard in test_agentic_sync_runner.py covers: - permissive mode returns None - scope_guard_enabled=False returns None - pathlib companion match (F3): .pdd/meta/*.json matches only top-level - companion_allowlist always unions DEFAULT_SYNC_COMPANION_ALLOWLIST (F4) - diagnostic prefix "Scope guard reverted N out-of-scope file(s) for module" test_commands_maintenance.py covers --no-scope-guard CLI flag (F10): - --no-scope-guard propagates scope_guard=False to 
run_agentic_sync - omitted flag defaults to scope_guard=True --- tests/test_agentic_sync_runner.py | 97 ++++++++++++++++++++++++++++++ tests/test_commands_maintenance.py | 42 +++++++++++++ 2 files changed, 139 insertions(+) diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index f7a8cb082..2cd050268 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2569,6 +2569,103 @@ def test_explicit_empty_contract_rejects_all_changes(self): assert runner.max_workers == 1 +class TestEnforceScopeGuard: + """Issue #1013 (F9): direct behavioural coverage for ``_enforce_scope_guard`` + and ``_matches_companion_allowlist``. The constructor-state checks above + establish baseline; these tests exercise the methods themselves. + """ + + def _make_runner(self, **kwargs): + defaults = { + "basenames": ["mod"], + "dep_graph": {"mod": []}, + "sync_options": {}, + "github_info": None, + "quiet": True, + } + defaults.update(kwargs) + return AsyncSyncRunner(**defaults) + + def test_returns_none_when_permissive_mode(self, tmp_path): + runner = self._make_runner(allowed_write_set=None) + assert runner._enforce_scope_guard("mod", tmp_path) is None + + def test_returns_none_when_scope_guard_disabled(self, tmp_path): + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + scope_guard_enabled=False, + ) + assert runner._enforce_scope_guard("mod", tmp_path) is None + + def test_companion_allowlist_strict_pathlib_match(self): + """F3: ``.pdd/meta/*.json`` must NOT match nested directories.""" + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + # Top-level companion match → allowed + assert runner._matches_companion_allowlist( + ".pdd/meta/foo_python.json", runner.companion_allowlist + ) is True + # Nested under companion dir → rejected (pathlib semantics) + assert runner._matches_companion_allowlist( + ".pdd/meta/nested/foo_python.json", 
runner.companion_allowlist + ) is False + # Unrelated path → rejected + assert runner._matches_companion_allowlist( + "pdd/unrelated.py", runner.companion_allowlist + ) is False + + def test_companion_allowlist_unions_default(self): + """F4: caller-supplied allowlist is always unioned with the default.""" + from pdd.agentic_common import DEFAULT_SYNC_COMPANION_ALLOWLIST + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=["docs/*.md"], + ) + # Both caller pattern AND default appear in effective allowlist + assert "docs/*.md" in runner.companion_allowlist + for default in DEFAULT_SYNC_COMPANION_ALLOWLIST: + assert default in runner.companion_allowlist + + def test_diagnostic_format_has_scope_guard_reverted_prefix( + self, tmp_path, monkeypatch + ): + """Verify the spec-required diagnostic prefix is emitted.""" + # Stub the revert helpers so the test does not require a real git repo: + # _enforce_scope_guard composes the diagnostic from their return values. 
+ from pdd import agentic_sync_runner as mod + + offending = tmp_path / "out_of_scope.txt" + offending.write_text("oops") + monkeypatch.setattr( + mod, + "_revert_out_of_scope_changes", + lambda _root, _allowed: [offending], + ) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.contract_source = "html-comment" + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + assert diagnostic is not None + assert diagnostic.startswith( + "Scope guard reverted 1 out-of-scope file(s) for module 'mod' " + "(contract source: html-comment):" + ) + assert "Allowed write set:" in diagnostic + assert "Companion allowlist:" in diagnostic + + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking # --------------------------------------------------------------------------- diff --git a/tests/test_commands_maintenance.py b/tests/test_commands_maintenance.py index 15b0d657c..33f416fdc 100644 --- a/tests/test_commands_maintenance.py +++ b/tests/test_commands_maintenance.py @@ -684,6 +684,48 @@ def test_sync_architecture_calls_handle_error_on_exception(mock_sync_prompts, mo assert call_args[1] == "sync-architecture" +@patch('pdd.core.cli.auto_update') +@patch('pdd.commands.maintenance.run_agentic_sync') +def test_sync_no_scope_guard_flag_propagates_to_run_agentic_sync( + mock_agentic_sync, + mock_auto_update, + runner, +): + """Issue #1013 (F9, F10): ``pdd sync --no-scope-guard`` must + propagate ``scope_guard=False`` to ``run_agentic_sync``.""" + mock_agentic_sync.return_value = (True, "ok", 0.0, "model") + + result = runner.invoke( + cli.cli, + ["sync", "https://github.com/owner/repo/issues/42", "--no-scope-guard"], + ) + + assert result.exit_code == 0, result.output + 
mock_agentic_sync.assert_called_once() + assert mock_agentic_sync.call_args.kwargs["scope_guard"] is False + + +@patch('pdd.core.cli.auto_update') +@patch('pdd.commands.maintenance.run_agentic_sync') +def test_sync_without_no_scope_guard_defaults_to_enforcement( + mock_agentic_sync, + mock_auto_update, + runner, +): + """Issue #1013 (F10): the default for ``--no-scope-guard`` is False, so + ``scope_guard=True`` should flow through when the flag is omitted.""" + mock_agentic_sync.return_value = (True, "ok", 0.0, "model") + + result = runner.invoke( + cli.cli, + ["sync", "https://github.com/owner/repo/issues/42"], + ) + + assert result.exit_code == 0, result.output + mock_agentic_sync.assert_called_once() + assert mock_agentic_sync.call_args.kwargs["scope_guard"] is True + + @patch('pdd.core.cli.auto_update') def test_sync_architecture_uses_nearest_cwd_project(mock_auto_update, runner, tmp_path, monkeypatch): """CLI should target the nearest ancestor project, not always the repo root.""" From 9c98bc18973649ac1de2b30c28ef7cffa928048f Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 18:28:43 -0700 Subject: [PATCH 17/42] fix(sync): F1+F2+F3 (iter-3) scope companion globs to module_cwd, restore runner-entry log, reject bare fences Iter-3 codex review surfaced three remaining gaps against the prompt spec: - F1 (MAJOR): _enforce_scope_guard auto-allowed any file matching the companion glob anywhere under repo_root. The prompt says "every path under module_cwd", so sibling-module .pdd/meta/*.json could leak through in a shared-worktree run. Scan rglob from module_cwd instead. - F2 (MAJOR): iter-2 deduplicated logging in favor of the sync layer only, but the spec at agentic_sync_runner_python.prompt items 22 requires the runner to ALSO log at run() entry. The two logs report different events: the sync-layer line records *contract detection*; the runner-entry line records *runtime enforcement state* ("permissive mode" or "disabled via --no-scope-guard"). 
Both are required. - F3 (MINOR): the fenced-block regex accepted bare ``` fences with the ``(?:text|json)?`` optional group. The spec at agentic_common_python.prompt item 21 says only ``text`` or ``json`` info strings are legal. Make the language required and update the test. --- pdd/agentic_common.py | 12 ++++++------ pdd/agentic_sync_runner.py | 25 ++++++++++++++++++++----- tests/test_agentic_common.py | 14 ++++++++++++-- 3 files changed, 38 insertions(+), 13 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index 465ede62a..f1412a435 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2342,13 +2342,13 @@ class IssueContract: re.DOTALL, ) -# Matches a fenced code block (```text``` or ```json```, or bare ```) that -# IMMEDIATELY follows the heading. Only whitespace/newlines may precede the -# fence (anchored via ``\A`` once we slice the text after the heading); the -# info string is restricted to empty/``text``/``json`` per spec. Captures the -# inner body. +# Matches a fenced code block (```text``` or ```json```) that IMMEDIATELY +# follows the heading. Only whitespace/newlines may precede the fence +# (anchored via ``\A`` once we slice the text after the heading); the info +# string MUST be ``text`` or ``json`` per spec (Issue #1013, iter-3 F3 — bare +# fences are rejected). Captures the inner body. _FENCED_BLOCK_RE = re.compile( - r"\A\s*```(?:text|json)?[ \t]*\n(?P.*?)```", + r"\A\s*```(?:text|json)[ \t]*\n(?P.*?)```", re.DOTALL, ) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 52b3d8f1e..1d3af9dcb 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -1535,10 +1535,21 @@ def run(self) -> Tuple[bool, str, float]: f"module(s): {resumed}[/green]" ) - # Issue #1013 (F7): scope-guard status logging happens once in the - # sync-layer dispatch (``agentic_sync.run_agentic_sync``) — that's - # closer to the user and emits exactly one INFO/WARNING line. 
The - # runner intentionally does NOT log the same state again on entry. + # Issue #1013: scope-guard run-entry logging required by the prompt + # spec (``agentic_sync_runner_python.prompt`` items 22 permissive + # fallback / opt-out). Distinct from the sync-layer "contract loaded" + # INFO line in ``run_agentic_sync`` — that one reports *detection*; + # this one reports the runtime *enforcement state* at dispatch. + if not self.quiet: + if not self.scope_guard_enabled: + console.print( + "[yellow dim]Scope guard disabled via --no-scope-guard[/yellow dim]" + ) + elif self.allowed_write_paths is None: + console.print( + "[dim]Scope guard: no contract on issue — running in " + "permissive mode[/dim]" + ) self._update_github_comment() @@ -1913,8 +1924,12 @@ def _enforce_scope_guard( # ``agentic_common`` / ``agentic_common_worktree`` skip them. # ``self.companion_allowlist`` already includes DEFAULT_* # (unioned in __init__ per F4); no fallback needed here. + # F1 (Issue #1013 iter-3): only files UNDER ``module_cwd`` count + # as companion artifacts — never auto-allow a sibling module's + # ``.pdd/meta/*.json`` just because it lives in the same repo. 
allowlist = tuple(self.companion_allowlist) - for path in repo_root.rglob("*"): + cwd_path = Path(module_cwd).resolve() + for path in cwd_path.rglob("*"): if not path.is_file(): continue try: diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 24972f2b6..6cd64f90d 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -7082,15 +7082,25 @@ def test_fenced_block_must_immediately_follow_heading(self): ) assert parse_issue_contract(body) is None - def test_fenced_block_accepts_text_json_or_bare_fence(self): + def test_fenced_block_accepts_only_text_or_json(self): + """Iter-3 F3: the spec at agentic_common_python.prompt:110 requires + ``text`` or ``json`` info strings; bare ``` (no language) is NOT + accepted as a split-contract fence.""" from pdd.agentic_common import parse_issue_contract - for fence in ("```text", "```json", "```"): + for fence in ("```text", "```json"): body = f"## Allowed Write Set\n{fence}\npdd/foo.py\n```\n" c = parse_issue_contract(body) assert c is not None and c.allowed_paths == ("pdd/foo.py",), fence assert c.source == "fenced-block" + def test_fenced_block_rejects_bare_fence(self): + """Iter-3 F3: bare ``` fence (no language) must be rejected.""" + from pdd.agentic_common import parse_issue_contract + + body = "## Allowed Write Set\n```\npdd/foo.py\n```\n" + assert parse_issue_contract(body) is None + def test_fenced_block_empty_body_returns_empty_contract(self): """F1: an empty fenced block is a degenerate but legal contract.""" from pdd.agentic_common import parse_issue_contract, IssueContract From 336987b6fe5ed97241b83a281fc05a91cd1ffb3f Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 18:31:01 -0700 Subject: [PATCH 18/42] test(sync): iter-3 F1+F2 add sibling-module-companion and run-entry log coverage F1: test_companion_glob_scoped_to_module_cwd_not_sibling locks in the module_cwd scoping; a sibling module's .pdd/meta/*.json is no longer auto-allowed when scanning from the 
current module's cwd. F2: test_run_entry_logs_permissive_mode and test_run_entry_logs_opt_out_warning verify the runner-entry INFO/WARNING required by agentic_sync_runner_python.prompt items 22. The entry log moved above the empty-basenames short-circuit so state is visible even on no-op runs. --- pdd/agentic_sync_runner.py | 24 +++++----- tests/test_agentic_sync_runner.py | 75 +++++++++++++++++++++++++++++++ 2 files changed, 88 insertions(+), 11 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 1d3af9dcb..d41d2fd2c 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -1525,21 +1525,13 @@ def _record_result( # ------------------------------------------------------------------ def run(self) -> Tuple[bool, str, float]: """Run all syncs respecting dependencies.""" - if not self.basenames: - return True, "No modules to sync", self.initial_cost - - if self._resumed_modules and not self.quiet: - resumed = sorted(self._resumed_modules) - console.print( - f"[green]Resuming: skipping {len(resumed)} already-succeeded " - f"module(s): {resumed}[/green]" - ) - # Issue #1013: scope-guard run-entry logging required by the prompt # spec (``agentic_sync_runner_python.prompt`` items 22 permissive # fallback / opt-out). Distinct from the sync-layer "contract loaded" # INFO line in ``run_agentic_sync`` — that one reports *detection*; - # this one reports the runtime *enforcement state* at dispatch. + # this one reports the runtime *enforcement state* at dispatch. Log + # before the empty-basenames short-circuit so the state is visible + # even when there are no modules to sync. 
if not self.quiet: if not self.scope_guard_enabled: console.print( @@ -1551,6 +1543,16 @@ def run(self) -> Tuple[bool, str, float]: "permissive mode[/dim]" ) + if not self.basenames: + return True, "No modules to sync", self.initial_cost + + if self._resumed_modules and not self.quiet: + resumed = sorted(self._resumed_modules) + console.print( + f"[green]Resuming: skipping {len(resumed)} already-succeeded " + f"module(s): {resumed}[/green]" + ) + self._update_github_comment() prev_sigint = signal.getsignal(signal.SIGINT) diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 2cd050268..61598eaee 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2665,6 +2665,81 @@ def test_diagnostic_format_has_scope_guard_reverted_prefix( assert "Allowed write set:" in diagnostic assert "Companion allowlist:" in diagnostic + def test_companion_glob_scoped_to_module_cwd_not_sibling( + self, tmp_path, monkeypatch + ): + """Iter-3 F1: a sibling module's companion artifact (under a different + ``module_cwd``) must NOT be auto-allowed. The rglob must scope to + the current module's cwd only. + """ + from pdd import agentic_sync_runner as mod + + # Build a fake repo with two module dirs; place ``.pdd/meta/foo.json`` + # under EACH so the companion glob would match both if scanned at + # repo level. 
+ repo = tmp_path + module_a = repo / "mod_a" + module_b = repo / "mod_b" + for m in (module_a, module_b): + (m / ".pdd" / "meta").mkdir(parents=True) + (m / ".pdd" / "meta" / "x.json").write_text("{}") + + captured_allowed = {} + + def fake_revert(repo_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + # Skip git toplevel resolution; pretend repo root == tmp_path. + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: repo.resolve() + ) + + runner._enforce_scope_guard("mod_a", module_a) + + files = captured_allowed["files"] + assert (module_a / ".pdd" / "meta" / "x.json").resolve() in files + # Sibling module's companion artifact must NOT be auto-allowed. + assert (module_b / ".pdd" / "meta" / "x.json").resolve() not in files + + def test_run_entry_logs_permissive_mode(self, capsys): + """Iter-3 F2: runner emits dim INFO on run() entry when no contract.""" + runner = self._make_runner( + allowed_write_set=None, + quiet=False, + ) + # Make run() return immediately by emptying the basenames list AFTER + # construction; the dispatch loop short-circuits and we only want the + # entry log. 
+ runner.basenames = [] + runner.run() + out = capsys.readouterr().out + assert "permissive mode" in out + + def test_run_entry_logs_opt_out_warning(self, capsys): + """Iter-3 F2: runner emits dim WARNING on run() entry when scope guard + is disabled via --no-scope-guard.""" + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + scope_guard_enabled=False, + quiet=False, + ) + runner.basenames = [] + runner.run() + out = capsys.readouterr().out + assert "--no-scope-guard" in out + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking From e8795e8445656b45a53a2aecfb569f507deaa5a4 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 18:37:17 -0700 Subject: [PATCH 19/42] fix(sync): iter-4 F1 preserve deleted companion artifacts in scope guard Iter-4 codex flagged that the companion-allowlist build pass used ``cwd_path.rglob("*")``, which only sees files that still exist on disk. When sync legitimately deletes a ``.pdd/meta/.json`` companion (module renamed/removed), the deletion appears in ``git status`` as a tracked ``D `` but the file is gone, so it was missing from the allowed-files set. The subsequent revert helper would resurrect the deletion and hard-fail the module on a legitimate operation. Now also pulls paths from ``_git_changed_paths`` and adds those matching the companion allowlist to the allowed set, while still respecting the module_cwd scoping from iter-3 F1. 
--- pdd/agentic_sync_runner.py | 16 ++++++++++++ tests/test_agentic_sync_runner.py | 43 +++++++++++++++++++++++++++++++ 2 files changed, 59 insertions(+) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index d41d2fd2c..6ebacce7e 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -1941,6 +1941,22 @@ def _enforce_scope_guard( if self._matches_companion_allowlist(rel_posix, allowlist): allowed_files.add(path.resolve()) + # Iter-4 F1: rglob only sees files that still exist on disk. Sync + # legitimately DELETES companion artifacts (e.g. ``.pdd/meta/foo_python.json`` + # when a module is renamed/removed); those deletions appear in + # ``git status`` as tracked ``D ``. Without this pass the revert + # helper would resurrect the deleted companion and hard-fail. + for rel_posix in _git_changed_paths(repo_root): + if not self._matches_companion_allowlist(rel_posix, allowlist): + continue + absolute = (repo_root / rel_posix).resolve() + try: + absolute.relative_to(cwd_path) + except ValueError: + # Outside the module's cwd — scoped out by F1 iter-3. + continue + allowed_files.add(absolute) + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) untracked_reverted = revert_out_of_scope_changes_with_dirs( repo_root, allowed_dirs=set(), allowed_files=allowed_files diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 61598eaee..724abdeac 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2740,6 +2740,49 @@ def test_run_entry_logs_opt_out_warning(self, capsys): out = capsys.readouterr().out assert "--no-scope-guard" in out + def test_deleted_companion_in_git_status_is_preserved( + self, tmp_path, monkeypatch + ): + """Iter-4 F1: when sync legitimately deletes ``.pdd/meta/foo.json``, + the file no longer exists on disk so ``rglob`` misses it. 
The + deletion appears in ``git status`` as tracked ``D ``; the runner + MUST still treat it as auto-allowed so the revert helper does not + resurrect it and hard-fail the module.""" + from pdd import agentic_sync_runner as mod + + # No file is created on disk — simulates the post-delete state. + deleted_rel = ".pdd/meta/old_module_python.json" + + monkeypatch.setattr( + mod, "_git_changed_paths", lambda _root: {deleted_rel} + ) + captured = {} + + def fake_revert(repo_root, allowed_files): + captured["allowed"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("old_module", tmp_path) + assert diagnostic is None, ( + "Deleted companion artifact should be auto-allowed, not flagged" + ) + assert (tmp_path / deleted_rel).resolve() in captured["allowed"] + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking From 96746b09c8e21500a9c05339049e5bc10f3711cb Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 19:16:46 -0700 Subject: [PATCH 20/42] fix(sync): iter-6 B1 preserve pre-existing untracked files in scope guard MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit DATA-LOSS BUG (reported by external review): the scope guard removed any untracked file that wasn't in the contract or companion allowlist, including pre-existing user files like scratch.txt or unrelated WIP that existed before pdd sync started. 
Fix: capture self._baseline_changed_paths at runner __init__ (set of paths that git status reported BEFORE any module ran), and during _enforce_scope_guard add each baseline path to allowed_files so the revert helpers never touch them. Regression test uses a real tmp_path git repo with scratch.txt as the pre-existing untracked file — mock-based tests would not have caught this because the revert helpers were stubbed out in iter-1..5. --- pdd/agentic_sync_runner.py | 17 ++++++++++++++ tests/test_agentic_sync_runner.py | 39 +++++++++++++++++++++++++++++++ 2 files changed, 56 insertions(+) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 6ebacce7e..1eca59551 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -934,6 +934,16 @@ def __init__( self._scope_guard_locks: Dict[str, threading.Lock] = defaultdict(threading.Lock) self._scope_guard_locks_lock = threading.Lock() + # Iter-6 B1 (data-loss bug): snapshot the working-tree changed/untracked + # set BEFORE any module sync runs. Pre-existing untracked files + # (e.g. user's ``scratch.txt``) are not the sync run's responsibility + # and MUST be preserved by the scope guard. + self._baseline_changed_paths: Set[str] = ( + _git_changed_paths(self.project_root) + if self.scope_guard_enabled and self.allowed_write_paths is not None + else set() + ) + self.total_budget = self.sync_options.get("total_budget") self.max_workers = 1 if self.total_budget is not None else MAX_WORKERS # When a contract narrows writes AND scope-guard enforcement is @@ -1957,6 +1967,13 @@ def _enforce_scope_guard( continue allowed_files.add(absolute) + # Iter-6 B1 (data-loss bug): pre-existing untracked files + # captured at runner __init__ are NEVER out-of-scope. Without + # this pass, a user's ``scratch.txt`` or unrelated WIP under + # the repo root would be removed by the revert helper. 
+ for rel_posix in self._baseline_changed_paths: + allowed_files.add((repo_root / rel_posix).resolve()) + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) untracked_reverted = revert_out_of_scope_changes_with_dirs( repo_root, allowed_dirs=set(), allowed_files=allowed_files diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 724abdeac..4d4b239e8 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2740,6 +2740,45 @@ def test_run_entry_logs_opt_out_warning(self, capsys): out = capsys.readouterr().out assert "--no-scope-guard" in out + def test_pre_existing_untracked_files_are_preserved(self, tmp_path): + """Iter-6 B1 (data-loss bug): a user's pre-existing untracked file + (``scratch.txt``) must NOT be removed by the scope guard. Uses a + real ``git init`` repo because the bug only reproduces when the + revert helpers actually touch the filesystem. + """ + subprocess.run(["git", "init", "-b", "main", str(tmp_path)], check=True, + capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.email", + "t@t.invalid"], check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.name", + "T"], check=True, capture_output=True) + (tmp_path / "README.md").write_text("initial") + subprocess.run(["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True) + + scratch = tmp_path / "scratch.txt" + scratch.write_text("user work-in-progress — do not delete") + assert scratch.exists() + + from unittest.mock import patch + with patch("pdd.agentic_sync_runner.Path.cwd", return_value=tmp_path): + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path + + assert "scratch.txt" in runner._baseline_changed_paths + + 
diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert scratch.exists(), ( + "scope guard incorrectly deleted user's pre-existing scratch.txt" + ) + assert diagnostic is None or "scratch.txt" not in diagnostic + def test_deleted_companion_in_git_status_is_preserved( self, tmp_path, monkeypatch ): From 2ca65e9f84ded59a46fcfa8df5d29a780f4fa959 Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 19:17:54 -0700 Subject: [PATCH 21/42] fix(sync): iter-6 B2 correctly revert staged renames in scope guard MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit REVERT-CLAIMED-BUT-NOT-DONE BUG (reported by external review): for a staged rename ``R old -> new``, the helper read the whole ``old -> new`` payload as a single path. The subsequent ``git checkout HEAD -- "old -> new"`` silently failed (pathspec didn't match) AND the return code was never checked, so the helper reported the rename as reverted when in fact ``git status`` still showed it. Fix: split rename payloads so source and destination are independently membership-checked, switch from ``git checkout HEAD --`` to ``git restore --staged --worktree --source=HEAD --`` so rename destinations not present in HEAD are correctly removed, and check the return code — clearing the reverted list on failure so callers see real-vs-claimed revert state. Falls back to legacy ``git checkout`` on pre-2.23 git. Affects every caller of _revert_out_of_scope_changes (agentic_update, agentic_fix, agentic_crash, agentic_verify, agentic_e2e_fix_orchestrator, agentic_sync_runner) — all of them were previously silently no-op on rename out-of-scope cases. 
--- pdd/agentic_common.py | 73 ++++++++++++++++++++++++++++-------- tests/test_agentic_common.py | 34 +++++++++++++++++ 2 files changed, 92 insertions(+), 15 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index f1412a435..f8153dc6b 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2268,30 +2268,73 @@ def _revert_out_of_scope_changes( for line in result.stdout.splitlines(): if len(line) < 4: continue - rel_path = line[3:].strip() - full_path = (cwd / rel_path).resolve() - if full_path not in allowed_paths: - to_restore.append(rel_path) - reverted.append(full_path) + payload = line[3:].strip() + # Iter-6 B2 (rename revert bug): ``git status --porcelain`` reports + # renames as ``R old -> new``. Treating the whole payload as one + # path caused ``git checkout HEAD --`` to be called with a literal + # ``"old -> new"`` arg, which silently failed and left the rename + # in place. Split renames so BOTH source and destination are + # membership-checked independently. + if " -> " in payload: + old_raw, new_raw = payload.split(" -> ", 1) + entry_paths = [old_raw.strip().strip('"'), new_raw.strip().strip('"')] + else: + entry_paths = [payload.strip('"')] + for rel_path in entry_paths: + if not rel_path: + continue + full_path = (cwd / rel_path).resolve() + if full_path not in allowed_paths: + to_restore.append(rel_path) + reverted.append(full_path) if to_restore: + # Iter-6 B2: use ``git restore --staged --worktree --source=HEAD`` + # so staged renames are correctly undone (``git checkout HEAD --`` + # cannot remove a rename destination unknown to HEAD). Falls back + # to the legacy command for git < 2.23. 
+ checkout_failed = False try: - subprocess.run( - ["git", "-C", str(cwd), "checkout", "HEAD", "--"] + to_restore, + restore_result = subprocess.run( + ["git", "-C", str(cwd), "restore", + "--staged", "--worktree", "--source=HEAD", "--"] + to_restore, capture_output=True, timeout=30, ) + if restore_result.returncode != 0: + stderr = (restore_result.stderr or b"").decode(errors="replace") + if "'restore' is not a git command" in stderr: + legacy = subprocess.run( + ["git", "-C", str(cwd), "checkout", "HEAD", "--"] + + to_restore, + capture_output=True, timeout=30, + ) + if legacy.returncode != 0: + checkout_failed = True + _scope_guard_logger.warning( + "Scope guard: legacy git checkout returned %d " + "for %d file(s): %s", + legacy.returncode, len(to_restore), + (legacy.stderr or b"").decode(errors="replace").strip(), + ) + else: + checkout_failed = True + _scope_guard_logger.warning( + "Scope guard: git restore returned %d for %d file(s): %s", + restore_result.returncode, len(to_restore), stderr.strip(), + ) except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as exc: _scope_guard_logger.warning( - "Scope guard: git checkout failed for %d file(s): %s", + "Scope guard: git restore failed for %d file(s): %s", len(to_restore), exc, ) + checkout_failed = True + if checkout_failed: reverted.clear() - else: - if reverted: - _scope_guard_logger.info( - "Scope guard reverted %d out-of-scope file(s): %s", - len(reverted), - ", ".join(str(p.name) for p in reverted[:10]), - ) + elif reverted: + _scope_guard_logger.info( + "Scope guard reverted %d out-of-scope file(s): %s", + len(reverted), + ", ".join(str(p.name) for p in reverted[:10]), + ) return reverted diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 6cd64f90d..4738bdc07 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -4689,6 +4689,40 @@ def _init_test_git_repo(path): class TestRevertOutOfScopeChanges: """Tests for _revert_out_of_scope_changes scope 
guard utility.""" + def test_reverts_out_of_scope_staged_rename(self, tmp_path): + """Iter-6 B2 (rename revert bug): ``git status --porcelain`` reports + renames as ``R old -> new``. The helper used to treat the whole + payload as one path, so the subsequent ``git checkout HEAD --`` + was passed a literal ``"old -> new"`` and silently failed. + + After the fix both source and destination are restored and + ``git status`` is clean. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "old.py").write_text("contents\n") + (proj / "in_scope.py").write_text("in_scope\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", "old.py", "new.py"], + check=True, capture_output=True) + + allowed = {(proj / "in_scope.py").resolve()} + reverted = _revert_out_of_scope_changes(proj, allowed) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"git status should be clean after rename revert; got: {status!r}" + ) + assert (proj / "old.py").exists() + assert not (proj / "new.py").exists() + assert len(reverted) >= 1 + def test_reverts_deleted_files(self, tmp_path): """Deleted files outside allowed set must be restored.""" from pdd.agentic_common import _revert_out_of_scope_changes From e2bbed33860da0200531a53e3cad9c0dc65ae32a Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 19:18:43 -0700 Subject: [PATCH 22/42] fix(sync): iter-6 B3 detect rename source side in durable scope check MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit OUT-OF-SCOPE-DELETION-MISSED BUG (reported by external review): the durable runner staged-paths inspection used ``git diff --cached --name-only``, which for a staged ``git mv old new`` emits only the destination ``new``. 
A contract that allowed ``new`` but not ``old`` passed validation while the rename silently deleted the out-of-scope ``old``. Fix: switch to ``git diff --cached --name-status -M``. Rename and copy lines now emit ``R\told\tnew`` / ``C\told\tnew``; both columns past the status are treated as scope-checked paths. Regression test uses a real durable runner against a real git repo with a staged rename — the bug only reproduces against real git output. --- pdd/durable_sync_runner.py | 19 +++++++++++++++++-- tests/test_durable_sync_runner.py | 28 ++++++++++++++++++++++++++++ 2 files changed, 45 insertions(+), 2 deletions(-) diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index af0a8e0f7..25fc4f8e3 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -361,14 +361,29 @@ def _stage_module_changes( self._force_add_module_metadata(basename, module_worktree) + # Iter-6 B3 (rename detection bug): ``--name-only`` for a staged + # rename ``git mv old new`` emits ONLY ``new``. A contract that + # allows ``new`` but not ``old`` would pass validation while the + # rename silently deletes the out-of-scope ``old``. Use + # ``--name-status -M`` so rename lines surface as + # ``R\told\tnew`` and BOTH paths count for scope checking. names = self._git( - ["diff", "--cached", "--name-only", "--diff-filter=ACMRTD"], + ["diff", "--cached", "--name-status", "-M", "--diff-filter=ACMRTD"], cwd=module_worktree, ) if names.returncode != 0: return False, f"Failed to inspect staged changes: {_combined_output(names)}", False - changed_paths = [line.strip() for line in names.stdout.splitlines() if line.strip()] + changed_paths: List[str] = [] + for raw in names.stdout.splitlines(): + line = raw.rstrip("\n") + if not line.strip(): + continue + parts = line.split("\t") + # Whether the entry is a rename/copy (R/C with similarity score) + # or a single-path status (A/M/D/T), every column past the + # status code is a path that should be scope-checked. 
+ changed_paths.extend(p.strip() for p in parts[1:] if p.strip()) out_of_scope = self._out_of_scope_staged_paths(changed_paths) if out_of_scope: return ( diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index ad9c558a5..812e3b55f 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -367,6 +367,34 @@ def test_allowed_write_set_empty_rejects_everything_for_durable_runner(tmp_path: assert result == ["src/app.py"] +def test_staged_rename_source_side_is_scope_checked(tmp_path: Path): + """Iter-6 B3 (rename detection bug): ``git diff --cached --name-only`` + for a staged ``git mv old new`` emits ONLY ``new``. A contract that + allows ``new`` but not ``old`` would pass validation while the rename + silently DELETES the out-of-scope ``old``. + + After the fix the durable runner uses ``--name-status -M`` so both + sides of the rename surface and the out-of-scope deletion is rejected. + """ + repo = _init_repo_with_remote(tmp_path) + (repo / "src").mkdir(exist_ok=True) + (repo / "src" / "old.py").write_text("contents\n", encoding="utf-8") + _git(repo, "add", "src/old.py") + _git(repo, "commit", "-m", "add src/old.py") + _git(repo, "mv", "src/old.py", "src/new.py") + + runner = _runner(repo, allowed_write_set=["src/new.py"]) + success, message, _empty = runner._stage_module_changes("foo", repo) + + assert not success, ( + "Durable sync must reject a checkpoint that deletes src/old.py " + "even when the contract permits src/new.py." 
+ ) + assert "src/old.py" in message, ( + f"Diagnostic must call out the out-of-scope source path; got: {message!r}" + ) + + def test_push_failure_preserves_local_checkpoint_and_next_run_pushes_it(tmp_path: Path): repo = _init_repo_with_remote(tmp_path) first = _runner(repo, runner_cls=PushFailingMetadataRunner) From 478cf79c3947e8b379fbcf10f6e5fbf60afda3ce Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 19:27:53 -0700 Subject: [PATCH 23/42] fix(sync): iter-7 B4 revert renames atomically (both sides) when either is out of scope MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PARTIAL-RENAME BUG (reported by external review): a rename is one atomic git operation. The iter-6 B2 fix correctly split rename payloads so both sides were membership-checked, but then independently restored only the disallowed side — leaving the working tree in a half-renamed state. Concretely: contract = {pdd/old.py}; sync runs `git mv pdd/old.py pdd/new.py` iter-6: restores only pdd/new.py → ``D pdd/old.py`` left staged iter-7: detects rename, restores BOTH sides → clean working tree Inverse case (allowed=new.py) is symmetric. When BOTH sides are in-scope the rename is left in place. --- pdd/agentic_common.py | 32 +++++++---- tests/test_agentic_common.py | 100 +++++++++++++++++++++++++++++++++++ 2 files changed, 123 insertions(+), 9 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index f8153dc6b..8f2073b9e 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2273,20 +2273,34 @@ def _revert_out_of_scope_changes( # renames as ``R old -> new``. Treating the whole payload as one # path caused ``git checkout HEAD --`` to be called with a literal # ``"old -> new"`` arg, which silently failed and left the rename - # in place. Split renames so BOTH source and destination are - # membership-checked independently. + # in place. Split renames so both source and destination surface. 
if " -> " in payload: old_raw, new_raw = payload.split(" -> ", 1) entry_paths = [old_raw.strip().strip('"'), new_raw.strip().strip('"')] + is_rename = True else: entry_paths = [payload.strip('"')] - for rel_path in entry_paths: - if not rel_path: - continue - full_path = (cwd / rel_path).resolve() - if full_path not in allowed_paths: - to_restore.append(rel_path) - reverted.append(full_path) + is_rename = False + entry_paths = [p for p in entry_paths if p] + if not entry_paths: + continue + # Iter-7 B4 (partial-rename revert): a rename is an atomic git + # operation. If EITHER side of the rename is out of scope, the + # rename as a whole has to be undone — restoring only one side + # leaves the working tree in a half-renamed state (``D old`` or + # ``A new`` depending on which side was in scope). + full_paths = [(cwd / rel).resolve() for rel in entry_paths] + out_of_scope = any(fp not in allowed_paths for fp in full_paths) + if is_rename: + if out_of_scope: + for rel, fp in zip(entry_paths, full_paths): + to_restore.append(rel) + reverted.append(fp) + else: + for rel, fp in zip(entry_paths, full_paths): + if fp not in allowed_paths: + to_restore.append(rel) + reverted.append(fp) if to_restore: # Iter-6 B2: use ``git restore --staged --worktree --source=HEAD`` # so staged renames are correctly undone (``git checkout HEAD --`` diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 4738bdc07..3b49dfc6e 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -4723,6 +4723,106 @@ def test_reverts_out_of_scope_staged_rename(self, tmp_path): assert not (proj / "new.py").exists() assert len(reverted) >= 1 + def test_partial_rename_restores_both_sides_when_source_allowed(self, tmp_path): + """Iter-7 B4 (partial-rename bug): a rename is atomic. If the + contract allows the SOURCE (``old``) but NOT the destination + (``new``), restoring only ``new`` leaves ``D old`` staged. 
Fix: + when either side of a rename is out of scope, revert BOTH so the + rename is fully undone. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + # Contract allows source but not destination. + allowed = {(proj / "pdd" / "old.py").resolve()} + _revert_out_of_scope_changes(proj, allowed) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"git status should be clean — rename must be fully undone; " + f"got: {status!r}" + ) + assert (proj / "pdd" / "old.py").exists() + assert not (proj / "pdd" / "new.py").exists() + + def test_partial_rename_restores_both_sides_when_destination_allowed( + self, tmp_path + ): + """Iter-7 B4 (partial-rename bug): inverse of the above. If the + contract allows the DESTINATION but NOT the source, restoring only + ``old`` leaves ``A new`` staged. The whole rename must be reverted. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + # Contract allows destination but not source. 
+ allowed = {(proj / "pdd" / "new.py").resolve()} + _revert_out_of_scope_changes(proj, allowed) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"git status should be clean — rename must be fully undone; " + f"got: {status!r}" + ) + assert (proj / "pdd" / "old.py").exists() + assert not (proj / "pdd" / "new.py").exists() + + def test_rename_left_in_place_when_both_sides_allowed(self, tmp_path): + """Iter-7 B4 negative case: when BOTH sides of the rename are in + scope, the rename must NOT be reverted — only out-of-scope changes + get touched. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + allowed = { + (proj / "pdd" / "old.py").resolve(), + (proj / "pdd" / "new.py").resolve(), + } + _revert_out_of_scope_changes(proj, allowed) + + # Rename should remain staged. 
+ status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert "R" in status and "old.py" in status and "new.py" in status, ( + f"In-scope rename must remain staged; got: {status!r}" + ) + def test_reverts_deleted_files(self, tmp_path): """Deleted files outside allowed set must be restored.""" from pdd.agentic_common import _revert_out_of_scope_changes From 22de0c0fdc7bf9a3d1f51da30dd1f131e3b5ae5a Mon Sep 17 00:00:00 2001 From: Serhan Date: Thu, 14 May 2026 22:18:24 -0700 Subject: [PATCH 24/42] fix(sync): iter-8 B5+B6 empty contracts revert fully + align prompt with restore-based scope guard MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit B5a (empty-contract early-exit): _revert_out_of_scope_changes used to return [] when allowed_paths was empty — the historical "scope guard for a different module" optimization. With Issue #1013's degenerate- empty contract (allowed_write_set=[] meaning reject-all), this early exit silently bypassed enforcement. Now the check applies only when allowed_paths is non-empty. B5b (worktree-helper partial-rename bug): revert_out_of_scope_changes_with_dirs kept only the rename destination from "R old -> new" entries; the out-of-scope source side was silently deleted. Treat renames atomically (both sides scope-checked together) and revert via `git restore --staged --worktree --source=HEAD -- ` so the rename is fully undone instead of half-undone. B6 (prompt drift prevention): pdd/prompts/agentic_common_python.prompt item 23 documented the OLD `git checkout HEAD --` revert and said "behavior MUST remain". Future pdd sync regeneration would have drifted the code back to the buggy version. Updated prompt 23 and the worktree- helper prompt to describe restore-based revert + atomic rename treatment + empty-contract reject-all semantics. The signature contract is unchanged. 
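The B5a/B5b semantics above can be sketched as a pure function over `git status --porcelain` text. This is a minimal illustration, not the patched helper: `paths_to_revert` is a hypothetical name, and it compares repo-relative strings, whereas the real helpers compare resolved absolute paths and shell out to git.

```python
from typing import List, Set


def paths_to_revert(porcelain: str, allowed: Set[str]) -> List[str]:
    """Return repo-relative paths to restore, given porcelain status text.

    Hypothetical helper for illustration only. An empty ``allowed`` set is
    treated as a reject-all contract, so every changed path is returned.
    """
    to_restore: List[str] = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        payload = line[3:]  # skip the two status columns and the separator
        if " -> " in payload:
            old, new = payload.split(" -> ", 1)
            sides = [old.strip().strip('"'), new.strip().strip('"')]
            # A rename is atomic: one out-of-scope side undoes both sides.
            if any(side not in allowed for side in sides):
                to_restore.extend(sides)
        else:
            path = payload.strip().strip('"')
            if path and path not in allowed:
                to_restore.append(path)
    return to_restore
```

With this shape, a contract that permits only one side of a rename still yields both paths, which is what lets a single `git restore` call undo the rename as a unit.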
New regression tests use real tmp_path git repos to lock in: - empty contract reverts a rename - partial rename revert atomic in worktree helper - mock-based rename test updated to assert both paths --- pdd/agentic_common.py | 8 ++- pdd/agentic_common_worktree.py | 63 ++++++++++++++----- pdd/prompts/agentic_common_python.prompt | 8 ++- .../agentic_common_worktree_python.prompt | 2 +- tests/test_agentic_common.py | 30 +++++++++ tests/test_agentic_common_worktree.py | 48 +++++++++++++- 6 files changed, 139 insertions(+), 20 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index 8f2073b9e..94e3051a4 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2247,7 +2247,13 @@ def _revert_out_of_scope_changes( List of paths that were reverted. """ cwd_str = str(cwd.resolve()) - if not any(str(p).startswith(cwd_str) for p in allowed_paths): + # Iter-8 B5a (empty contract reject-all): when ``allowed_paths`` is + # non-empty, skip when none of the entries fall under *cwd* — that is + # the historical "scope guard for a different module" optimization. + # When ``allowed_paths`` is EMPTY, however, the caller is asking for a + # reject-all sweep (Issue #1013 degenerate-empty contract). Don't + # short-circuit; proceed with revert. 
+ if allowed_paths and not any(str(p).startswith(cwd_str) for p in allowed_paths): return [] try: result = subprocess.run( diff --git a/pdd/agentic_common_worktree.py b/pdd/agentic_common_worktree.py index 98afeb9f9..f2f17cff2 100644 --- a/pdd/agentic_common_worktree.py +++ b/pdd/agentic_common_worktree.py @@ -416,6 +416,16 @@ def revert_out_of_scope_changes_with_dirs( logger.warning("OS error running git status: %s", exc) return reverted + def _scope_check_path(filepath_str: str) -> bool: + """Return True if *filepath_str* is in-scope (allowed).""" + for prefix in allowed_dirs: + if filepath_str.startswith(prefix): + return True + abs_path = (cwd / filepath_str).resolve() + if abs_path in allowed_files: + return True + return False + for line in result.stdout.splitlines(): if len(line) < 4: continue @@ -423,28 +433,51 @@ def revert_out_of_scope_changes_with_dirs( status = line[:2] filepath_raw = line[3:] - # Handle renames: "R old_name -> new_name" + # Iter-8 B5b (worktree-helper rename bug): renames are reported as + # ``R old -> new``. Previously this helper kept only the destination, + # so a partial-rename out-of-scope situation (allowed=new, old + # disallowed) silently deleted ``old`` without reverting. Treat + # renames atomically: if EITHER side is out of scope, restore both. if " -> " in filepath_raw: - filepath_raw = filepath_raw.split(" -> ")[-1] + old_raw, new_raw = filepath_raw.split(" -> ", 1) + old_path = old_raw.strip().strip('"') + new_path = new_raw.strip().strip('"') + if _scope_check_path(old_path) and _scope_check_path(new_path): + continue + # Out-of-scope rename: undo via ``git restore --staged --worktree + # --source=HEAD`` so the destination unknown-to-HEAD is removed. 
+ try: + restore = subprocess.run( + ["git", "restore", "--staged", "--worktree", + "--source=HEAD", "--", old_path, new_path], + cwd=str(cwd), + capture_output=True, + text=True, + timeout=30, + ) + if restore.returncode == 0: + logger.info( + "Reverted out-of-scope rename: %s -> %s", + old_path, new_path, + ) + reverted.append(Path(old_path)) + reverted.append(Path(new_path)) + else: + logger.warning( + "Failed to revert rename %s -> %s: %s", + old_path, new_path, + (restore.stderr or "").strip(), + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError) as exc: + logger.warning("git restore failed for rename: %s", exc) + continue filepath_str = filepath_raw.strip().strip('"') # ------------------------------------------------------------------ # Scope check # ------------------------------------------------------------------ - in_scope = False - - for prefix in allowed_dirs: - if filepath_str.startswith(prefix): - in_scope = True - break - - if not in_scope: - abs_path = (cwd / filepath_str).resolve() - if abs_path in allowed_files: - in_scope = True - - if in_scope: + if _scope_check_path(filepath_str): continue # ------------------------------------------------------------------ diff --git a/pdd/prompts/agentic_common_python.prompt b/pdd/prompts/agentic_common_python.prompt index 15e9c177f..8d0ceffda 100644 --- a/pdd/prompts/agentic_common_python.prompt +++ b/pdd/prompts/agentic_common_python.prompt @@ -109,7 +109,13 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O 20. **OpenCode Optional Knobs**: Honor `OPENCODE_AGENT` by passing `--agent ` and `OPENCODE_VARIANT` by passing `--variant ` when set. Omit both flags when unset. `PDD_OPENCODE_MODE` is out of scope for this module version; use `opencode run` only. 21. 
**Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block (```text``` or ```json```) whose lines list one repo-relative path per line (blank lines and `#`-prefixed comments ignored). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). 22. **Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. 
The default value MUST be `(".pdd/meta/*.json",)` — only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline. -23. **Scope Guard Helper Re-export**: `_revert_out_of_scope_changes(cwd, allowed_paths)` already exists in this module and is reused by sync scope enforcement (Issue #1013). The signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and the behavior MUST remain: skip silently when `cwd` is not a git repo or when no allowed path lies under `cwd`; detect tracked changes via `git status --porcelain -uno`; restore out-of-scope tracked files via `git checkout HEAD --`; return the list of resolved paths that were reverted. This requirement is documentation-only — do not change the existing behavior, callers from `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_e2e_fix_orchestrator`, and the new sync caller all depend on the current contract. +23. **Scope Guard Helper (Issue #1013)**: `_revert_out_of_scope_changes(cwd, allowed_paths) -> List[Path]` is the shared revert helper used by `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_verify`, `agentic_e2e_fix_orchestrator`, and the sync scope guard. Signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and return the list of resolved paths that were reverted. Behavior contract: + - Skip silently when `cwd` is not a git repo, when `git status` is unavailable, or when `allowed_paths` is NON-EMPTY and none of its entries fall under `cwd` (the "scope guard meant for a different module" optimization). An EMPTY `allowed_paths` is a legal reject-all contract (Issue #1013 degenerate-empty case) — proceed with revert. 
+ - Detect tracked changes via `git status --porcelain -uno`. + - Parse each status line. For rename entries (`R old -> new`), surface BOTH source and destination as separate paths; treat the rename atomically when deciding to revert — if EITHER side is out of scope, revert BOTH sides (otherwise the working tree is left in a half-renamed state). + - Restore out-of-scope tracked files via `git restore --staged --worktree --source=HEAD -- ` (not `git checkout HEAD --`, which cannot remove rename destinations unknown to HEAD). Fall back to `git checkout HEAD --` only when the local `git` is too old to support `git restore` (pre-2.23). + - Check the restore subprocess return code; on non-zero, log a WARNING and clear the reverted list so callers see real-vs-claimed revert state. + - Log INFO with the count and first ~10 reverted file names on success. % Function Signatures `get_agent_provider_preference() -> List[str]` diff --git a/pdd/prompts/agentic_common_worktree_python.prompt b/pdd/prompts/agentic_common_worktree_python.prompt index 165af32cf..239c808ef 100644 --- a/pdd/prompts/agentic_common_worktree_python.prompt +++ b/pdd/prompts/agentic_common_worktree_python.prompt @@ -43,7 +43,7 @@ All functions are public (no leading underscore) so orchestrators can import the 7. **`setup_worktree(cwd, issue_number, quiet, *, resume_existing=False, branch_prefix="fix", worktree_prefix="fix") -> Tuple[Optional[Path], Optional[str]]`**: Create an isolated git worktree at `.pdd/worktrees/{worktree_prefix}-issue-{issue_number}/` on branch `{branch_prefix}/issue-{issue_number}`. Clean up existing worktree/directory and branch before creating. If `resume_existing` is True and branch exists, reuse it (attach with `--force`). Otherwise delete the old branch first. When reusing an undeletable branch, reset to main ref after attaching. Print worktree path unless `quiet`. The `branch_prefix` and `worktree_prefix` kwargs let callers customize naming (e.g. 
`change` prefix for change workflows, `fix` for bug workflows). Return `(worktree_path, None)` on success, `(None, error_msg)` on failure. 8. **`get_modified_and_untracked(cwd: Path) -> List[str]`**: Return modified tracked files (`git diff --name-only HEAD`) plus untracked files (`git ls-files --others --exclude-standard`). 9. **`check_target_file_unchanged(cwd: Path, target_file: str, baseline_sha: Optional[str] = None) -> Tuple[bool, Optional[str]]`**: Detect concurrent edits. Run `git fetch origin` then `git rev-parse origin/main:{target_file}`. If `baseline_sha` is provided, compare current SHA against it — return `(True, current_sha)` if unchanged, `(False, current_sha)` if changed. If `baseline_sha` is None, just return `(True, current_sha)` to establish the baseline. Return `(True, None)` on git failures (fail-open to avoid blocking workflows). -10. **`revert_out_of_scope_changes_with_dirs(cwd: Path, allowed_dirs: set[str], allowed_files: set[Path]) -> List[Path]`**: Scope guard that detects both tracked changes AND new untracked files via `git status --porcelain --untracked-files=all` (a.k.a. `-uall`). The explicit `--untracked-files=all` is required so untracked content nested under a brand-new directory is expanded into individual `?? path/to/file` entries — bare `-u` is ambiguous across git versions/configs and may collapse the directory into a single `?? subdir/` entry that `os.remove` cannot delete. For each changed/new file, check if its path starts with any prefix in `allowed_dirs` OR its resolved absolute path is in `allowed_files`. Revert tracked out-of-scope changes via `git checkout HEAD --`. Remove untracked out-of-scope files via `os.remove`; if (defensively) git ever reports an untracked directory (path ending in `/` or whose target resolves to a directory), use `shutil.rmtree` instead so contained files don't get left behind. Return list of reverted/removed paths. Log actions via module logger. Handle timeout and OS errors gracefully. +10. 
**`revert_out_of_scope_changes_with_dirs(cwd: Path, allowed_dirs: set[str], allowed_files: set[Path]) -> List[Path]`**: Scope guard that detects both tracked changes AND new untracked files via `git status --porcelain --untracked-files=all` (a.k.a. `-uall`). The explicit `--untracked-files=all` is required so untracked content nested under a brand-new directory is expanded into individual `?? path/to/file` entries — bare `-u` is ambiguous across git versions/configs and may collapse the directory into a single `?? subdir/` entry that `os.remove` cannot delete. For each changed/new file, check if its path starts with any prefix in `allowed_dirs` OR its resolved absolute path is in `allowed_files`. Rename entries (`R old -> new`) are treated ATOMICALLY: surface both source and destination, scope-check each side, and if EITHER side is out of scope, revert BOTH via `git restore --staged --worktree --source=HEAD -- ` (so the rename destination unknown to HEAD is removed and the source restored — `git checkout HEAD --` alone cannot undo this). For non-rename tracked out-of-scope changes, revert via `git restore --staged --worktree --source=HEAD --` (or `git checkout HEAD --` on git < 2.23). Remove untracked out-of-scope files via `os.remove`; if (defensively) git ever reports an untracked directory (path ending in `/` or whose target resolves to a directory), use `shutil.rmtree` instead so contained files don't get left behind. Return list of reverted/removed paths. Log actions via module logger. Handle timeout and OS errors gracefully. 11. **`extract_block_marker(output: str, name: str) -> str`**: Parse a multi-line block delimited by `BEGIN_{name}` and `END_{name}` markers from agent output. Return the content between markers (stripped), or empty string if markers not found. Case-insensitive marker matching. 
% Dependencies diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 3b49dfc6e..50e6456d2 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -4823,6 +4823,36 @@ def test_rename_left_in_place_when_both_sides_allowed(self, tmp_path): f"In-scope rename must remain staged; got: {status!r}" ) + def test_empty_contract_reverts_rename_fully(self, tmp_path): + """Iter-8 B5a (empty-contract early-exit) + B5b: a reject-all + empty contract (``allowed_paths=set()``) used to short-circuit + the helper. After the fix, the helper proceeds with revert; the + rename is fully undone. + """ + from pdd.agentic_common import _revert_out_of_scope_changes + + proj = tmp_path / "repo" + proj.mkdir() + (proj / "pdd").mkdir() + (proj / "pdd" / "old.py").write_text("contents\n") + _init_test_git_repo(proj) + + _subprocess.run(["git", "-C", str(proj), "mv", + "pdd/old.py", "pdd/new.py"], + check=True, capture_output=True) + + # Empty contract: nothing is allowed → revert everything. 
+ _revert_out_of_scope_changes(proj, set()) + + status = _subprocess.run( + ["git", "-C", str(proj), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"Empty contract must revert all changes including renames; " + f"got: {status!r}" + ) + def test_reverts_deleted_files(self, tmp_path): """Deleted files outside allowed set must be restored.""" from pdd.agentic_common import _revert_out_of_scope_changes diff --git a/tests/test_agentic_common_worktree.py b/tests/test_agentic_common_worktree.py index 6a3868605..ff660b8a9 100644 --- a/tests/test_agentic_common_worktree.py +++ b/tests/test_agentic_common_worktree.py @@ -519,16 +519,60 @@ def test_keeps_in_scope_by_allowed_files(self, tmp_path): assert result == [] def test_handles_renames(self): + """Iter-8 B5b: renames are reverted atomically — both old and new + sides appear in the result list, and the helper invokes + ``git restore --staged --worktree --source=HEAD`` (not the old + ``git checkout HEAD --``) so the rename destination is properly + removed from the working tree.""" porcelain = "R old.py -> new.py\n" with patch(f"{MODULE}.subprocess.run") as mock_run: mock_run.side_effect = [ _cp(stdout=porcelain), - _cp(), # checkout + _cp(), # restore ] result = revert_out_of_scope_changes_with_dirs( Path("/repo"), allowed_dirs=set(), allowed_files=set() ) - assert Path("new.py") in result + assert Path("old.py") in result and Path("new.py") in result + # Verify the second subprocess call used ``git restore`` with + # BOTH paths (atomic rename revert). + restore_call = mock_run.call_args_list[1] + args = restore_call.args[0] + assert "restore" in args + assert "old.py" in args and "new.py" in args + + def test_partial_rename_atomic_revert(self, tmp_path): + """Iter-8 B5b (worktree helper): when one side of a rename is + allowed and the other is not, the rename must be reverted as a + unit. 
Uses a real ``tmp_path`` git repo because mocks would not + catch the actual half-staged state. + """ + import subprocess as _sp + env = {**os.environ, "GIT_AUTHOR_NAME": "T", "GIT_AUTHOR_EMAIL": "t@t", + "GIT_COMMITTER_NAME": "T", "GIT_COMMITTER_EMAIL": "t@t"} + _sp.run(["git", "init", "-b", "main", str(tmp_path)], check=True, + capture_output=True, env=env) + (tmp_path / "old.py").write_text("c\n") + _sp.run(["git", "-C", str(tmp_path), "add", "-A"], check=True, + capture_output=True, env=env) + _sp.run(["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, env=env) + _sp.run(["git", "-C", str(tmp_path), "mv", "old.py", "new.py"], + check=True, capture_output=True, env=env) + + # Allow only one side of the rename. + allowed = {(tmp_path / "old.py").resolve()} + revert_out_of_scope_changes_with_dirs( + tmp_path, allowed_dirs=set(), allowed_files=allowed + ) + + status = _sp.run( + ["git", "-C", str(tmp_path), "status", "--porcelain"], + capture_output=True, text=True, check=True, + ).stdout + assert status.strip() == "", ( + f"Partial rename must be fully undone; got: {status!r}" + ) def test_handles_git_status_failure(self): with patch(f"{MODULE}.subprocess.run", return_value=_cp(returncode=1)): From 52fed06f0ffa2caa263504c4fbd1145322931554 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 11:03:25 -0700 Subject: [PATCH 25/42] fix(sync): iter-9 M-1 scope guard fail-closed boundary via post-revert re-scan MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-9 review (Major): _enforce_scope_guard treated empty lists from the revert helpers as "nothing out of scope" — but those helpers fail-open on git timeout, permission error, or restore failure (log a warning and return []). The module could be marked successful while the contract violation remained on disk. 
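One way to close the gap described above is to judge the module by what is still on disk rather than by what the helpers report. The following is a minimal sketch under that assumption; `guard_verdict` and its parameters are hypothetical names chosen for illustration and do not appear in the patch.

```python
from typing import List, Optional


def guard_verdict(
    reverted: List[str], remaining: Optional[List[str]]
) -> Optional[str]:
    """Return None when the module may pass, else a failure message.

    reverted:  paths the revert helpers claim to have restored.
    remaining: out-of-scope paths found by a fresh scan after the revert,
               or None when that scan itself could not run.
    """
    if remaining is None:
        # Unobservable tree: fail closed instead of assuming it is clean.
        return "scope guard could not inspect the working tree"
    if not reverted and not remaining:
        return None
    parts = []
    if reverted:
        parts.append("reverted: " + ", ".join(reverted))
    if remaining:
        parts.append("unrecovered: " + ", ".join(remaining))
    return "; ".join(parts)
```

The point of the design is that an empty `reverted` list alone never counts as success; only a clean post-revert scan does.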
Add _remaining_out_of_scope_paths() that re-scans the worktree via `git status --porcelain --untracked-files=all` after both revert helpers run. Filter duplicates against the already-reported offending set. Surface remaining paths under a new "Unrecovered (revert failed, manual cleanup required):" diagnostic section and hard-fail the module when either set is non-empty. On git failure, the helper returns the sentinel [""] so the orchestrator still hard-fails rather than silently treating an unobservable working tree as clean. When offending is empty but remaining is non-empty, the diagnostic header switches to "Scope guard detected out-of-scope artifacts ... but the revert helpers reported no successful reverts." instead of the misleading "reverted 0 out-of-scope file(s)". Tests: 4 new regression cases in tests/test_agentic_sync_runner.py cover fail-open, clean tree, mixed reverted+unrecovered, and the git-status sentinel. Prompt updated to document the fail-closed contract. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync_runner.py | 115 ++++++++++- pdd/prompts/agentic_sync_runner_python.prompt | 6 +- tests/test_agentic_sync_runner.py | 180 ++++++++++++++++++ 3 files changed, 291 insertions(+), 10 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 1eca59551..f911cc5ca 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -1901,6 +1901,65 @@ def _matches_companion_allowlist( continue return False + def _remaining_out_of_scope_paths( + self, repo_root: Path, allowed_files: Set[Path] + ) -> List[str]: + """ + Iter-9 M-1 (fail-closed boundary): re-scan the worktree after the + revert helpers have run and return any repo-relative paths still NOT + in *allowed_files*. + + This guards against silent fail-open when either revert helper + cannot inspect / revert / remove an out-of-scope path (git timeout, + permission error, restore failure). 
Those helpers log a warning and + return ``[]``; without this re-scan ``_enforce_scope_guard`` would + treat the empty list as "nothing was out of scope" and let the + module succeed with the contract still violated on disk. + + Returns: + Sorted list of POSIX repo-relative paths still out of scope, OR + the sentinel ``[""]`` when ``git status`` + itself cannot be executed (timeout / missing git / non-zero + return). The sentinel is consistent with the warning-log + + empty-list style used elsewhere in the scope guard, but still + forces ``_enforce_scope_guard`` to hard-fail rather than treat + the unobservable working tree as clean. + """ + try: + result = subprocess.run( + ["git", "-C", str(repo_root), "status", + "--porcelain", "--untracked-files=all"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if result.returncode != 0: + return [""] + + remaining: Set[str] = set() + for line in result.stdout.splitlines(): + if len(line) < 4: + continue + payload = line[3:].strip() + if not payload: + continue + # Renames: ``R old -> new``. Both sides count. + if " -> " in payload: + old_raw, new_raw = payload.split(" -> ", 1) + entry_paths = [old_raw.strip().strip('"'), + new_raw.strip().strip('"')] + else: + entry_paths = [payload.strip('"')] + for rel in entry_paths: + rel = _normalize_repo_path(rel) + if not rel: + continue + absolute = (repo_root / rel).resolve() + if absolute in allowed_files: + continue + remaining.add(rel) + return sorted(remaining) + def _enforce_scope_guard( self, basename: str, module_cwd: Path ) -> Optional[str]: @@ -1997,7 +2056,23 @@ def _enforce_scope_guard( seen.add(rel) offending.append(rel) - if not offending: + # Iter-9 M-1 (fail-closed boundary): re-scan the worktree after + # the revert helpers have run. Either helper can fail silently + # (git timeout, permission error, restore failure) and return + # ``[]``. 
Without this re-scan we would conclude "nothing was + # out of scope" and let the module succeed with the contract + # still violated on disk. + remaining_raw = self._remaining_out_of_scope_paths( + repo_root, allowed_files + ) + # Filter out paths already surfaced as ``offending`` so the + # re-scan does not double-list. In practice when helpers succeed + # the path is gone from ``git status``; when helpers fail with + # ``reverted.clear()`` ``offending`` is empty. Defensive filter. + offending_set = set(offending) + remaining = [p for p in remaining_raw if p not in offending_set] + + if not offending and not remaining: return None source = self.contract_source or "" @@ -2007,14 +2082,36 @@ def _enforce_scope_guard( companion_lines = "\n".join( f" - {p}" for p in allowlist ) or " - " - offending_lines = "\n".join(f" - {p}" for p in offending) - diagnostic = ( - f"Scope guard reverted {len(offending)} out-of-scope file(s) " - f"for module '{basename}' (contract source: {source}):\n" - f"{offending_lines}\n" - f"Allowed write set:\n{allowed_lines}\n" - f"Companion allowlist:\n{companion_lines}" - ) + + # Header line shape depends on whether anything was actually + # reverted. When ``offending`` is empty but ``remaining`` is + # non-empty, emitting "reverted 0 out-of-scope file(s)" plus an + # empty bullet list reads incorrectly; use a distinct header. + if offending: + offending_lines = "\n".join(f" - {p}" for p in offending) + header = ( + f"Scope guard reverted {len(offending)} out-of-scope " + f"file(s) for module '{basename}' " + f"(contract source: {source}):\n" + f"{offending_lines}" + ) + else: + header = ( + f"Scope guard detected out-of-scope artifacts for " + f"module '{basename}' (contract source: {source}) " + f"but the revert helpers reported no successful reverts." 
+ ) + + parts = [header] + if remaining: + unrecovered_lines = "\n".join(f" - {p}" for p in remaining) + parts.append( + "Unrecovered (revert failed, manual cleanup required):\n" + f"{unrecovered_lines}" + ) + parts.append(f"Allowed write set:\n{allowed_lines}") + parts.append(f"Companion allowlist:\n{companion_lines}") + diagnostic = "\n".join(parts) # F8 (Issue #1013): print the diagnostic to stderr immediately # after reverting. ``maintenance.py`` separately echoes the # assembled module-failure error at the end of the run — two diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index e34dab40b..0f188ae7c 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -66,11 +66,14 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin 22. **Split-Contract Scope Guard (Issue #1013)**: After each per-module `pdd sync` subprocess completes (success or failure), and **before** the runner declares that module successful or persists state, the runner MUST invoke `_enforce_scope_guard(basename, module_cwd)` when `self.scope_guard_enabled` is True AND `self.allowed_write_set is not None`. The helper: - Builds the effective allow set for the module: every path in `self.allowed_write_set` resolved against the module's repo root (the git toplevel of `module_cwd`, falling back to `module_cwd` itself), plus every path under `module_cwd` that matches any glob in the effective companion allowlist (`self.companion_allowlist` ∪ `DEFAULT_SYNC_COMPANION_ALLOWLIST`). - Calls `_revert_out_of_scope_changes(repo_root, allowed_paths)` from `pdd.agentic_common` to revert tracked out-of-scope modifications, AND calls `revert_out_of_scope_changes_with_dirs(repo_root, allowed_dirs=set(), allowed_files=allowed_paths)` from `pdd.agentic_common_worktree` to additionally remove untracked out-of-scope new files. 
The combination matches the existing scope-guard pattern used by `agentic_update`/`agentic_fix`/`agentic_crash`/`agentic_e2e_fix_orchestrator`. + - **Post-revert re-scan (Issue #1013 iter-9, M-1 fail-closed boundary)**: after both helpers return, the runner MUST call `_remaining_out_of_scope_paths(repo_root, allowed_files)` to detect anything the helpers could not revert/remove (git timeout, permission error, restore failure — those helpers log a warning and return `[]`, which the orchestrator otherwise mistakes for "clean"). Paths returned by the re-scan and not already in the helper-returned offending list go into an `Unrecovered (revert failed, manual cleanup required):` section in the diagnostic. A non-empty Unrecovered set MUST cause `_enforce_scope_guard` to return a diagnostic string (hard-fail the module) even when the revert helpers themselves returned empty lists. When `offending` is empty but `Unrecovered` is non-empty, emit the alternate header `Scope guard detected out-of-scope artifacts for module '' (contract source: ) but the revert helpers reported no successful reverts.` instead of the misleading `Scope guard reverted 0 out-of-scope file(s)...`. The `Unrecovered` section is OMITTED entirely when empty (no empty headers). 
- Diagnostic format (printed to stderr; structured for downstream parsers — checkup, review-loop reports): ``` Scope guard reverted N out-of-scope file(s) for module '' (contract source: ): - path/relative/to/repo - another/path + Unrecovered (revert failed, manual cleanup required): + - path/the/guard/could/not/revert Allowed write set: - path/from/contract Companion allowlist: @@ -139,7 +142,8 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `_find_pdd_executable() -> Optional[str]`: find pdd binary (same pattern as `server/jobs.py`) - `_parse_cost_from_csv(csv_path: str) -> float`: sum cost column from PDD_OUTPUT_COST_PATH CSV - `_format_duration(start, end) -> str`: format seconds as "Xs" or "Xm Ys" -- `_enforce_scope_guard(self, basename: str, module_cwd: Path) -> Optional[str]`: Issue #1013 scope guard. Returns `None` when the module is in scope; returns a multi-line diagnostic string (see Req 22) when out-of-scope artifacts were detected. Callers (the per-future completion handler) treat a non-None return as a module failure and replace any prior success record with it. No-ops when `self.scope_guard_enabled is False` or `self.allowed_write_set is None`. Reuses `pdd.agentic_common._revert_out_of_scope_changes` and `pdd.agentic_common_worktree.revert_out_of_scope_changes_with_dirs` rather than reimplementing git scanning. +- `_enforce_scope_guard(self, basename: str, module_cwd: Path) -> Optional[str]`: Issue #1013 scope guard. Returns `None` when the module is in scope; returns a multi-line diagnostic string (see Req 22) when out-of-scope artifacts were detected. Callers (the per-future completion handler) treat a non-None return as a module failure and replace any prior success record with it. No-ops when `self.scope_guard_enabled is False` or `self.allowed_write_set is None`. 
Reuses `pdd.agentic_common._revert_out_of_scope_changes` and `pdd.agentic_common_worktree.revert_out_of_scope_changes_with_dirs` rather than reimplementing git scanning. **Fail-closed safety (Issue #1013 iter-9, M-1)**: after invoking both revert helpers, the runner MUST re-scan the worktree via `_remaining_out_of_scope_paths(repo_root, allowed_files)` and hard-fail when ANY out-of-scope artifacts remain, even if the revert helpers returned empty lists. The helpers fail-open on git timeout / permission error / restore failure (they log a warning and return `[]`); without the re-scan the orchestrator would treat that as "nothing was out of scope" and let the module succeed with the contract still violated on disk. Unrecovered paths surface in the diagnostic under a distinct `Unrecovered (revert failed, manual cleanup required):` section. +- `_remaining_out_of_scope_paths(self, repo_root: Path, allowed_files: Set[Path]) -> List[str]`: Issue #1013 iter-9 (M-1) re-scan helper. Runs `git status --porcelain --untracked-files=all` in *repo_root* with a 30s timeout, parses each line (handling the `R old -> new` rename format and the `_normalize_repo_path` cleanup), resolves each path against *allowed_files*, and returns a sorted list of POSIX repo-relative paths still NOT in the allow set. On git failure (timeout, missing binary, non-zero return) returns the single-element sentinel `[""]` so `_enforce_scope_guard` hard-fails rather than silently treating an unobservable worktree as clean. Consistent with the warning-log + empty-list pattern used by `_revert_out_of_scope_changes` and `revert_out_of_scope_changes_with_dirs` — but the sentinel value, not an empty list, is what forces the hard-fail. 
- `_parse_conformance_failure(stdout: str, stderr: str) -> Optional[Tuple[str, Tuple[str, ...]]]`: scan combined stdout+stderr for the line prefix `Architecture conformance error for ` and, when matched, return `(repair_directive, missing_symbols)` where `missing_symbols` is a sorted tuple of the symbols listed after any of the following inline shapes (route each into its own directive bucket — they MUST NOT be merged): - (a) `declared symbols missing from generated code:` — default `ArchitectureConformanceError` shape (architecture.json symbol-existence check). - (b) `Python code uses camelCase names (...)` parenthesised list — camelCase guard. diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 4d4b239e8..472054a0f 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2815,6 +2815,14 @@ def fake_revert(repo_root, allowed_files): monkeypatch.setattr( runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() ) + # The iter-9 M-1 post-revert re-scan calls ``git status``; this test + # uses a tmp_path without a real ``git init`` and only cares about + # the rglob/_git_changed_paths companion-allowlist behavior, so stub + # the re-scan to return [] (matching the helpers' mocked behavior). 
+ monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [], + ) diagnostic = runner._enforce_scope_guard("old_module", tmp_path) assert diagnostic is None, ( @@ -2822,6 +2830,178 @@ def fake_revert(repo_root, allowed_files): ) assert (tmp_path / deleted_rel).resolve() in captured["allowed"] + # ------------------------------------------------------------------ + # Iter-9 M-1: fail-closed boundary — re-scan after revert helpers + # ------------------------------------------------------------------ + def _init_repo(self, tmp_path): + """Create a minimal git repo so re-scan ``git status`` succeeds.""" + subprocess.run(["git", "init", "-b", "main", str(tmp_path)], check=True, + capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.email", + "t@t.invalid"], check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "config", "user.name", + "T"], check=True, capture_output=True) + (tmp_path / "README.md").write_text("initial") + subprocess.run(["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True) + subprocess.run(["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True) + + def test_fail_open_regression_unrecovered_path_hard_fails( + self, tmp_path, monkeypatch + ): + """Iter-9 M-1: when both revert helpers return ``[]`` (simulating + helper failure) AND an out-of-scope file remains on disk, the scope + guard MUST hard-fail by returning a diagnostic that surfaces the + unrecovered path under ``Unrecovered``.""" + from pdd import agentic_sync_runner as mod + + self._init_repo(tmp_path) + # Out-of-scope untracked file the revert helper "failed" to remove. + stray = tmp_path / "stray.txt" + stray.write_text("contract violation") + + # Simulate both helpers fail-open (returning []). 
+ monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "Fail-open regression: helpers returned [] but stray.txt still " + "violates the contract; guard must hard-fail." + ) + assert "Unrecovered" in diagnostic + assert "stray.txt" in diagnostic + + def test_clean_working_tree_returns_none(self, tmp_path, monkeypatch): + """No out-of-scope files on disk AND both helpers return [] → + guard MUST return ``None`` (the in-scope path).""" + from pdd import agentic_sync_runner as mod + + self._init_repo(tmp_path) + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + assert runner._enforce_scope_guard("mod", tmp_path) is None + + def test_mixed_success_reverted_and_unrecovered_both_surface( + self, tmp_path, monkeypatch + ): + """Helpers report some reverted paths AND additional out-of-scope + paths remain. Diagnostic MUST contain both 'reverted' and + 'Unrecovered' sections. Companion-allowlisted files (e.g. 
+ ``.pdd/meta/.json`` under ``module_cwd``) must NOT appear + under Unrecovered — they are auto-allowed.""" + from pdd import agentic_sync_runner as mod + + self._init_repo(tmp_path) + + # Companion-allowlisted file under module_cwd: must be auto-allowed. + meta_dir = tmp_path / ".pdd" / "meta" + meta_dir.mkdir(parents=True) + companion = meta_dir / "mod_python.json" + companion.write_text("{}") + # Out-of-scope untracked file the helpers "failed" to remove. + stray = tmp_path / "stray.txt" + stray.write_text("contract violation") + + reverted_path = tmp_path / "pdd" / "reverted_already.py" + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: [reverted_path], + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None + # Reverted-side: existing diagnostic prefix surfaces the revert count. + assert "Scope guard reverted 1 out-of-scope file(s)" in diagnostic + assert "pdd/reverted_already.py" in diagnostic + # Unrecovered-side: the stray file shows up under the new section. + assert "Unrecovered" in diagnostic + assert "stray.txt" in diagnostic + # Companion artifact must NOT be flagged. 
+ assert ".pdd/meta/mod_python.json" not in diagnostic + + def test_git_status_failed_sentinel_surfaces_in_diagnostic( + self, tmp_path, monkeypatch + ): + """When the post-revert ``git status`` itself fails (timeout, missing + binary, non-zero return), ``_remaining_out_of_scope_paths`` returns + the sentinel ``['']`` and the diagnostic MUST + clearly indicate the failure under ``Unrecovered`` so operators do + not silently trust an unobservable working tree.""" + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + # Force the re-scan to report the sentinel directly. + monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [""], + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "Sentinel [''] must hard-fail the module." + ) + assert "Unrecovered" in diagnostic + # Allow the sentinel wording to evolve — just check the failure + # indicator is present. 
+ assert "git-status-failed" in diagnostic + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking From bd0caa28011e524d05a9a382b762343d6a443e3b Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 11:19:47 -0700 Subject: [PATCH 26/42] fix(sync): iter-10 M-1 validate companion_allowlist anchor to prevent guard bypass MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-10 review (Major): a contract that declared ``companion_allowlist: ["*"]`` (or ``**``, ``**/*``, ``?``) would let ``_matches_companion_allowlist`` treat arbitrary repo-wide writes as auto-allowed companion artifacts, effectively neutralizing the scope guard. Add ``_is_valid_companion_pattern`` in ``agentic_common`` that requires at least one path segment with a literal character (anything outside ``*?``) and rejects absolute, traversal, Windows-separator, and empty patterns. Apply it at parse time in ``_parse_html_comment_contract`` (silent drop, matching the existing ``allowed_paths`` style) AND at match time in both ``AsyncSyncRunner._matches_companion_allowlist`` and ``DurableSyncRunner._out_of_scope_staged_paths`` (defense-in-depth in case a wildcard-only pattern reaches a runner via direct construction bypassing the parser). The default ``DEFAULT_SYNC_COMPANION_ALLOWLIST = (".pdd/meta/*.json",)`` passes the validator — regression-tested. Tests: parse-time validator (4 cases on wildcard-only, anchored, absolute/traversal, and default), runner-side defense (1 case per runner). Prompt Req 21 cites the iter-10 finding and defines what "invalid" means for companion patterns. 
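For illustration only (not part of this patch): a standalone sketch of the anchor rule described above, assuming it mirrors the ``_is_valid_companion_pattern`` helper in the diff that follows. The names ``is_valid_companion_pattern`` and ``would_auto_allow`` are hypothetical stand-ins, and the sketch assumes the runner matches companions with ``pathlib``-style ``match()``. It shows why an unanchored pattern is dangerous: a bare ``*`` matches the last component of any path.

```python
from pathlib import PurePosixPath


def is_valid_companion_pattern(raw: object) -> bool:
    """Illustrative mirror of the validator described in this commit."""
    if not isinstance(raw, str):
        return False
    candidate = raw.strip()
    # Reject empty, Windows-separator, and absolute patterns.
    if not candidate or "\\" in candidate or candidate.startswith("/"):
        return False
    parts = candidate.split("/")
    # Reject parent-directory traversal.
    if ".." in parts:
        return False
    # Valid only if some segment has a character other than '*' or '?'.
    return any(any(ch not in "*?" for ch in seg) for seg in parts if seg)


def would_auto_allow(path: str, pattern: str) -> bool:
    """Hypothetical helper: a companion match gated by the validator."""
    if not is_valid_companion_pattern(pattern):
        return False
    return PurePosixPath(path).match(pattern)


# Without the gate, a bare '*' matches an unrelated file.
unguarded = PurePosixPath("unrelated/file.py").match("*")
guarded = would_auto_allow("unrelated/file.py", "*")
default_ok = would_auto_allow(".pdd/meta/mod_python.json", ".pdd/meta/*.json")
```

Applying the same check at parse time and again at match time means a pattern injected by direct construction is still refused, which is the defense-in-depth point made above.
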
Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_common.py | 40 +++++++++++++- pdd/agentic_sync_runner.py | 6 +++ pdd/durable_sync_runner.py | 7 +++ pdd/prompts/agentic_common_python.prompt | 2 +- tests/test_agentic_common.py | 65 +++++++++++++++++++++++ tests/test_agentic_sync_runner.py | 66 ++++++++++++++++++++++++ tests/test_durable_sync_runner.py | 20 +++++++ 7 files changed, 204 insertions(+), 2 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index 94e3051a4..5a4ef2455 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2443,6 +2443,41 @@ def _is_valid_contract_path(raw: object) -> bool: return True +def _is_valid_companion_pattern(raw: object) -> bool: + """ + Return True iff *raw* is a repo-relative companion glob pattern with at + least one literal-character anchor. + + Issue #1013 iter-10 M-1: ``companion_allowlist`` accepts arbitrary glob + patterns, so a contract that declares ``*``, ``**``, or ``**/*`` would + let ``_matches_companion_allowlist`` auto-allow repo-wide changes and + bypass the split-contract write set. Reject patterns whose every + segment is wildcard-only (no character outside ``*?``), as well as + absolute, Windows-separator, traversal, and empty patterns. + """ + if not isinstance(raw, str): + return False + candidate = raw.strip() + if not candidate: + return False + if "\\" in candidate: + return False + if candidate.startswith("/"): + return False + parts = candidate.split("/") + if any(part == ".." for part in parts): + return False + # At least one segment MUST contain a literal character (anything + # outside ``*?``); otherwise the pattern is wildcard-only and would + # match arbitrary repo paths, defeating the scope guard. + for segment in parts: + if not segment: + continue + if any(ch not in "*?" 
for ch in segment): + return True + return False + + def _parse_html_comment_contract(text: str) -> Optional[IssueContract]: """Return a contract parsed from a ```` block, else None.""" match = _HTML_COMMENT_CONTRACT_RE.search(text) @@ -2467,8 +2502,11 @@ def _parse_html_comment_contract(text: str) -> Optional[IssueContract]: raw_companion = parsed.get("companion_allowlist", []) if not isinstance(raw_companion, list): raw_companion = [] + # Issue #1013 iter-10 M-1: drop wildcard-only / absolute / traversal / + # Windows-separator patterns silently so a contract cannot declare + # ``*``/``**``/``**/*`` and bypass the split-contract write set. companion = tuple( - p.strip() for p in raw_companion if isinstance(p, str) and p.strip() + p.strip() for p in raw_companion if _is_valid_companion_pattern(p) ) return IssueContract( allowed_paths=allowed, diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index f911cc5ca..2e56151fb 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -28,6 +28,7 @@ from .agentic_common import ( DEFAULT_SYNC_COMPANION_ALLOWLIST, + _is_valid_companion_pattern, _revert_out_of_scope_changes, ) from .agentic_common_worktree import revert_out_of_scope_changes_with_dirs @@ -1893,6 +1894,11 @@ def _matches_companion_allowlist( for pattern in allowlist: if not pattern: continue + # Issue #1013 iter-10 M-1 (defense-in-depth): even if a + # wildcard-only pattern slipped past the parser, refuse to + # treat it as auto-allowing repo-wide writes. 
+ if not _is_valid_companion_pattern(pattern): + continue try: if candidate.match(pattern): return True diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 25fc4f8e3..f2dce857f 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -20,6 +20,7 @@ from pathlib import Path, PurePosixPath from typing import Dict, List, Optional, Set, Tuple +from .agentic_common import _is_valid_companion_pattern from .agentic_sync_runner import AsyncSyncRunner, MAX_WORKERS CHECKPOINT_TRAILER = "PDD-Sync-Checkpoint-V1" @@ -434,6 +435,12 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: for pattern in allowlist: if not pattern: continue + # Issue #1013 iter-10 M-1 (defense-in-depth): a + # wildcard-only / absolute / traversal pattern that + # slipped past the parser must NOT auto-allow repo-wide + # writes. + if not _is_valid_companion_pattern(pattern): + continue try: if candidate.match(pattern): matched = True diff --git a/pdd/prompts/agentic_common_python.prompt b/pdd/prompts/agentic_common_python.prompt index 8d0ceffda..6173b9a06 100644 --- a/pdd/prompts/agentic_common_python.prompt +++ b/pdd/prompts/agentic_common_python.prompt @@ -107,7 +107,7 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O 18. **Post Final Comment**: `post_final_comment(repo_owner, repo_name, issue_number, reason, total_cost, steps_completed, total_steps, cwd) -> bool`: Post a generated workflow summary comment to the GitHub issue when the workflow stops early. The function builds the comment body from the stop reason, cumulative cost, and completed/total step counts; callers do not pass a preformatted body. 19. 
**OpenCode Model Resolution**: Resolve the OpenCode model in this order: (1) `OPENCODE_MODEL` env var, kept verbatim including nested slashes like `openrouter/openai/gpt-5.3-codex`; (2) derive a candidate from `llm_model.csv` using PDD's existing model-strength semantics, then translate LiteLLM-oriented IDs via `_translate_to_opencode_model()`. The CSV fallback MUST be auth-aware: build the configured OpenCode provider set from parsed provider credentials in `~/.local/share/opencode/auth.json`, parsed usable OpenCode config provider/model entries (`~/.config/opencode/opencode.json`, nearest project `opencode.json`, `OPENCODE_CONFIG`, `OPENCODE_CONFIG_CONTENT`), and every provider credential env var represented in `llm_model.csv`; filter candidate rows to providers that are configured before selecting a model. OpenCode config sources contribute a configured provider only when they declare a provider/model path with resolvable auth or explicit local/no-key provider semantics; bare config existence is diagnostic-only. OpenCode agentic runs use `OPENCODE_MODEL` or the auth-aware CSV fallback, not generic direct-prompt model defaults. Required translations include `github_copilot/X -> github-copilot/X`, `gemini/X -> google/X`, bare Anthropic rows like `claude-sonnet-... -> anthropic/claude-sonnet-...`, and bare OpenAI rows like `gpt-5 -> openai/gpt-5`; IDs already in OpenCode `provider/model` form pass through unchanged. If no configured provider can serve the selected model, fail fast with an actionable error telling the user to set `OPENCODE_MODEL=provider/model`, configure the matching provider, or run `opencode models` after authentication. Do not rely on OpenCode default model resolution. 20. **OpenCode Optional Knobs**: Honor `OPENCODE_AGENT` by passing `--agent ` and `OPENCODE_VARIANT` by passing `--variant ` when set. Omit both flags when unset. `PDD_OPENCODE_MODE` is out of scope for this module version; use `opencode run` only. -21. 
**Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block (```text``` or ```json```) whose lines list one repo-relative path per line (blank lines and `#`-prefixed comments ignored). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). +21. **Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. 
The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block (```text``` or ```json```) whose lines list one repo-relative path per line (blank lines and `#`-prefixed comments ignored). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). Per Issue #1013 iter-10 M-1, the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, or is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`). Patterns with at least one segment containing a non-wildcard character (`.pdd/meta/*.json`, `architecture.json`, `**/foo.json`) remain valid. 22. 
**Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. The default value MUST be `(".pdd/meta/*.json",)` — only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline. 23. **Scope Guard Helper (Issue #1013)**: `_revert_out_of_scope_changes(cwd, allowed_paths) -> List[Path]` is the shared revert helper used by `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_verify`, `agentic_e2e_fix_orchestrator`, and the sync scope guard. Signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and return the list of resolved paths that were reverted. Behavior contract: - Skip silently when `cwd` is not a git repo, when `git status` is unavailable, or when `allowed_paths` is NON-EMPTY and none of its entries fall under `cwd` (the "scope guard meant for a different module" optimization). An EMPTY `allowed_paths` is a legal reject-all contract (Issue #1013 degenerate-empty case) — proceed with revert. 
diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 50e6456d2..46a5b9ce6 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -7273,3 +7273,68 @@ def test_fenced_block_empty_body_returns_empty_contract(self): c = parse_issue_contract(body) assert isinstance(c, IssueContract) assert c.allowed_paths == () + + def test_companion_allowlist_rejects_wildcard_only_patterns(self): + """Iter-10 M-1: wildcard-only patterns (``*``, ``**``, ``**/*``, ``?``) + would let a contract auto-allow repo-wide changes; the parser MUST + drop them silently.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.companion_allowlist == () + assert c.allowed_paths == ("pdd/foo.py",) + + def test_companion_allowlist_keeps_anchored_patterns(self): + """Iter-10 M-1: patterns with at least one literal-character segment + anchor remain valid.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.companion_allowlist == ( + ".pdd/meta/*.json", + "architecture.json", + "**/foo.json", + ) + + def test_companion_allowlist_rejects_traversal_and_absolute(self): + """Iter-10 M-1: absolute paths, parent-traversal, and Windows + separators in companion patterns must be dropped silently.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.companion_allowlist == () + + def test_default_companion_allowlist_passes_validation(self): + """Iter-10 M-1: the shipped default allowlist MUST itself pass the + validator — otherwise the runner's defense-in-depth filter would + strip every entry and the scope guard would have no companions.""" + from pdd.agentic_common import ( + 
DEFAULT_SYNC_COMPANION_ALLOWLIST, + _is_valid_companion_pattern, + ) + + assert DEFAULT_SYNC_COMPANION_ALLOWLIST + for pattern in DEFAULT_SYNC_COMPANION_ALLOWLIST: + assert _is_valid_companion_pattern(pattern), pattern diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 472054a0f..3da283596 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2779,6 +2779,72 @@ def test_pre_existing_untracked_files_are_preserved(self, tmp_path): ) assert diagnostic is None or "scratch.txt" not in diagnostic + def test_wildcard_only_companion_pattern_does_not_auto_allow( + self, tmp_path, monkeypatch + ): + """Iter-10 M-1: a wildcard-only pattern (``**/*``) that bypassed the + parser (e.g. constructed directly, injected through a non-issue + code path) MUST NOT cause ``_matches_companion_allowlist`` to + auto-allow repo-wide writes. The defense-in-depth filter inside + the runner rejects wildcard-only patterns the same way the parser + does.""" + from pdd import agentic_sync_runner as mod + + repo = tmp_path + # Create an out-of-scope file under the module's cwd. + unrelated = repo / "unrelated" + unrelated.mkdir() + offending = unrelated / "file.py" + offending.write_text("out of scope") + + captured_allowed = {} + + def fake_revert(repo_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + # Pretend we reverted the out-of-scope file so the diagnostic + # path returns a non-None message; the assertion below is on + # the auto-allow decision, not on the revert mechanics. + return [offending] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + # Inject the dangerous wildcard-only pattern directly, + # bypassing the parser. 
+ companion_allowlist=["**/*"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: repo.resolve() + ) + # iter-9 M-1 re-scan needs git; stub it out — this test only + # cares about the auto-allow decision feeding ``fake_revert``. + monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [], + ) + + # The defense-in-depth filter must reject ``**/*`` directly. + assert runner._matches_companion_allowlist( + "unrelated/file.py", ("**/*",) + ) is False + + diagnostic = runner._enforce_scope_guard("mod", repo) + # The offending file must NOT be in the auto-allowed set despite + # the wildcard-only pattern living in self.companion_allowlist. + assert offending.resolve() not in captured_allowed.get("files", set()), ( + "wildcard-only companion pattern must NOT auto-allow " + "repo-wide writes (iter-10 M-1)" + ) + # Because fake_revert returned the offending file, the diagnostic + # is non-None — confirming the scope guard did flag it. + assert diagnostic is not None + def test_deleted_companion_in_git_status_is_preserved( self, tmp_path, monkeypatch ): diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 812e3b55f..832fd07e3 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -367,6 +367,26 @@ def test_allowed_write_set_empty_rejects_everything_for_durable_runner(tmp_path: assert result == ["src/app.py"] +def test_wildcard_only_companion_pattern_is_ignored_by_durable_runner( + tmp_path: Path, +): + """Iter-10 M-1: even if a wildcard-only pattern (``**/*``) bypasses the + parser and lands in ``self.companion_allowlist``, the durable runner's + defense-in-depth filter MUST refuse to treat it as auto-allowing + repo-wide writes.""" + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=["**/*"], + ) + + # ``**/*`` is wildcard-only, so it must NOT auto-allow 
``unrelated/file.py``. + assert runner._out_of_scope_staged_paths( + ["unrelated/file.py"] + ) == ["unrelated/file.py"] + + def test_staged_rename_source_side_is_scope_checked(tmp_path: Path): """Iter-6 B3 (rename detection bug): ``git diff --cached --name-only`` for a staged ``git mv old new`` emits ONLY ``new``. A contract that From 5144b258649ed70ce30230ad2b179e14fd6494a1 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 11:39:35 -0700 Subject: [PATCH 27/42] fix(sync): iter-12 B-1 json fenced contracts + M-1 prompt drift; defer B-2 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-11 review: B-1 (Blocker, fixed): _parse_fenced_block_contract treated both `text` and `json` fences as line-separated path lists, so a JSON fence body of `["pdd/foo.py", "tests/test_foo.py"]` parsed as the single literal path `'["pdd/foo.py", "tests/test_foo.py"]'`, and `[]` parsed as the literal `('[]',)` instead of the empty (reject-all) contract. Capture the fence language in the regex's `lang` group and branch `_parse_fenced_block_contract` so the `json` body goes through `json.loads()` (array → paths, `[]` → empty contract, non-array/malformed → None for permissive fallback) while the `text` body keeps its existing line-by-line semantics. Both still emit `source="fenced-block"`. Spec sentence in agentic_common_python.prompt §21 updated so prompt regeneration won't silently reintroduce the bug. M-1 (Major, fixed): the AsyncSyncRunner `` JSON in agentic_sync_runner_python.prompt omitted the `contract_source: Optional[str] = None` constructor kwarg that exists in the code (agentic_sync_runner.py:878). Append it to the prompt's signature and to the Req 1 constructor bullet list with a one-line description of its diagnostic purpose. PDD prompts are source of truth; a regeneration would otherwise drop the kwarg. B-2 (deferred): the post-revert re-scan does not include gitignored files. 
This is a theoretical attack surface — sync writes prompts, tests, and code, none of which are typically gitignored — and no concrete trigger has been observed. To be filed as a follow-up issue; no code change here. Tests: 5 new fenced-block parse tests (JSON array, empty JSON array, malformed JSON, JSON object rejection, text-fence regression). One existing test (`test_fenced_block_accepts_only_text_or_json`) updated because it locked in the B-1 bug by feeding a bare path into a JSON fence. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_common.py | 77 ++++++++++++----- pdd/prompts/agentic_common_python.prompt | 2 +- pdd/prompts/agentic_sync_runner_python.prompt | 5 +- tests/test_agentic_common.py | 84 ++++++++++++++++++- 4 files changed, 143 insertions(+), 25 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index 5a4ef2455..c160b685a 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2409,9 +2409,12 @@ class IssueContract: # follows the heading. Only whitespace/newlines may precede the fence # (anchored via ``\A`` once we slice the text after the heading); the info # string MUST be ``text`` or ``json`` per spec (Issue #1013, iter-3 F3 — bare -# fences are rejected). Captures the inner body. +# fences are rejected). Captures the language (``text`` / ``json``) into the +# named ``lang`` group so the parser can branch on the body format +# (Issue #1013 iter-12 B-1: a ``json`` body must be parsed as a JSON array, +# not as a line-separated path list). _FENCED_BLOCK_RE = re.compile( - r"\A\s*```(?:text|json)[ \t]*\n(?P<body>.*?)```", + r"\A\s*```(?P<lang>text|json)[ \t]*\n(?P<body>.*?)```", re.DOTALL, ) @@ -2517,11 +2520,25 @@ def _parse_html_comment_contract(text: str) -> Optional[IssueContract]: def _parse_fenced_block_contract(text: str) -> Optional[IssueContract]: """Return a contract parsed from a heading + immediately-following fenced - code block, else None.
The fence MUST be ```` ``` ```` / - ```` ```text ```` / ```` ```json ```` and MUST appear immediately after - the heading (only whitespace permitted between). When the fence body - contains no valid paths, the contract is still returned as an empty - (reject-all) contract per Issue #1013.""" + code block, else None. The fence MUST carry a ``text`` or ``json`` info + string (bare fences are rejected per Issue #1013 iter-3 F3) and MUST + appear immediately after the heading (only whitespace permitted between). + + Body format depends on the fence language (Issue #1013 iter-12 B-1): + + - ``text``: one repo-relative POSIX path per line; blank lines and lines + starting with ``#`` are ignored. Surrounding backticks on a path line + are stripped before validation. + - ``json``: a JSON array of repo-relative POSIX path strings, e.g. + ``["pdd/foo.py", "tests/test_foo.py"]``. Anything else (object, + number, string, malformed JSON) returns ``None`` so the caller falls + back to permissive mode. + + When the fence body parses syntactically but contains no valid paths + (every entry dropped by ``_is_valid_contract_path``), the contract is + still returned with ``allowed_paths=()`` — a degenerate but legal + reject-all contract per the iter-8 B5 semantics. + """ header_match = _FENCED_BLOCK_HEADER_RE.search(text) if not header_match: return None @@ -2531,21 +2548,43 @@ def _parse_fenced_block_contract(text: str) -> Optional[IssueContract]: block_match = _FENCED_BLOCK_RE.match(after_header) if not block_match: return None + lang = block_match.group("lang") body = block_match.group("body") - # One repo-relative path per line; ignore blank lines and "#" comments. 
paths: List[str] = [] seen: set = set() - for raw_line in body.splitlines(): - line = raw_line.strip() - if not line or line.startswith("#"): - continue - # Strip surrounding backticks if a user wrapped the path - line = line.strip("`").strip() - if not _is_valid_contract_path(line): - continue - if line not in seen: - paths.append(line) - seen.add(line) + if lang == "json": + # Issue #1013 iter-12 B-1: a JSON fence holds a JSON array of path + # strings, NOT a line-separated list. Parsing failures and any + # non-array payload (object, number, string) signal a malformed + # contract; return ``None`` so the caller falls back to permissive + # mode (matching the HTML-comment branch's tolerance). + try: + parsed = json.loads(body) + except (ValueError, TypeError): + return None + if not isinstance(parsed, list): + return None + for entry in parsed: + if not _is_valid_contract_path(entry): + continue + candidate = entry.strip() + if candidate not in seen: + paths.append(candidate) + seen.add(candidate) + else: + # ``text`` fence: one repo-relative path per line; ignore blank + # lines and ``#`` comments. Strip surrounding backticks if a user + # wrapped the path. + for raw_line in body.splitlines(): + line = raw_line.strip() + if not line or line.startswith("#"): + continue + line = line.strip("`").strip() + if not _is_valid_contract_path(line): + continue + if line not in seen: + paths.append(line) + seen.add(line) # Empty fenced block is a legal degenerate contract (reject all). return IssueContract( allowed_paths=tuple(paths), diff --git a/pdd/prompts/agentic_common_python.prompt b/pdd/prompts/agentic_common_python.prompt index 6173b9a06..50c560d80 100644 --- a/pdd/prompts/agentic_common_python.prompt +++ b/pdd/prompts/agentic_common_python.prompt @@ -107,7 +107,7 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O 18. 
**Post Final Comment**: `post_final_comment(repo_owner, repo_name, issue_number, reason, total_cost, steps_completed, total_steps, cwd) -> bool`: Post a generated workflow summary comment to the GitHub issue when the workflow stops early. The function builds the comment body from the stop reason, cumulative cost, and completed/total step counts; callers do not pass a preformatted body. 19. **OpenCode Model Resolution**: Resolve the OpenCode model in this order: (1) `OPENCODE_MODEL` env var, kept verbatim including nested slashes like `openrouter/openai/gpt-5.3-codex`; (2) derive a candidate from `llm_model.csv` using PDD's existing model-strength semantics, then translate LiteLLM-oriented IDs via `_translate_to_opencode_model()`. The CSV fallback MUST be auth-aware: build the configured OpenCode provider set from parsed provider credentials in `~/.local/share/opencode/auth.json`, parsed usable OpenCode config provider/model entries (`~/.config/opencode/opencode.json`, nearest project `opencode.json`, `OPENCODE_CONFIG`, `OPENCODE_CONFIG_CONTENT`), and every provider credential env var represented in `llm_model.csv`; filter candidate rows to providers that are configured before selecting a model. OpenCode config sources contribute a configured provider only when they declare a provider/model path with resolvable auth or explicit local/no-key provider semantics; bare config existence is diagnostic-only. OpenCode agentic runs use `OPENCODE_MODEL` or the auth-aware CSV fallback, not generic direct-prompt model defaults. Required translations include `github_copilot/X -> github-copilot/X`, `gemini/X -> google/X`, bare Anthropic rows like `claude-sonnet-... -> anthropic/claude-sonnet-...`, and bare OpenAI rows like `gpt-5 -> openai/gpt-5`; IDs already in OpenCode `provider/model` form pass through unchanged. 
If no configured provider can serve the selected model, fail fast with an actionable error telling the user to set `OPENCODE_MODEL=provider/model`, configure the matching provider, or run `opencode models` after authentication. Do not rely on OpenCode default model resolution. 20. **OpenCode Optional Knobs**: Honor `OPENCODE_AGENT` by passing `--agent ` and `OPENCODE_VARIANT` by passing `--variant ` when set. Omit both flags when unset. `PDD_OPENCODE_MODE` is out of scope for this module version; use `opencode run` only. -21. **Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block (```text``` or ```json```) whose lines list one repo-relative path per line (blank lines and `#`-prefixed comments ignored). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. 
When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). Per Issue #1013 iter-10 M-1, the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, or is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`). Patterns with at least one segment containing a non-wildcard character (`.pdd/meta/*.json`, `architecture.json`, `**/foo.json`) remain valid. +21. **Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block with info string ```text``` (one repo-relative path per line; blank lines and `#`-prefixed comments ignored) or ```json``` (body is a JSON array of repo-relative path strings, e.g. `["pdd/foo.py", "tests/test_foo.py"]`; non-array payloads such as objects/numbers/strings and malformed JSON yield `None` so the caller falls back to permissive mode — Issue #1013 iter-12 B-1). 
Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). Per Issue #1013 iter-10 M-1, the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, or is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`). Patterns with at least one segment containing a non-wildcard character (`.pdd/meta/*.json`, `architecture.json`, `**/foo.json`) remain valid. 22. **Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. The default value MUST be `(".pdd/meta/*.json",)` — only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. 
This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline. 23. **Scope Guard Helper (Issue #1013)**: `_revert_out_of_scope_changes(cwd, allowed_paths) -> List[Path]` is the shared revert helper used by `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_verify`, `agentic_e2e_fix_orchestrator`, and the sync scope guard. Signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and return the list of resolved paths that were reverted. Behavior contract: - Skip silently when `cwd` is not a git repo, when `git status` is unavailable, or when `allowed_paths` is NON-EMPTY and none of its entries fall under `cwd` (the "scope guard meant for a different module" optimization). An EMPTY `allowed_paths` is a legal reject-all contract (Issue #1013 degenerate-empty case) — proceed with revert. diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index 0f188ae7c..97b730702 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -5,7 +5,7 @@ "type": "module", "module": { "functions": [ - {"name": "AsyncSyncRunner", "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: float = 0.0, *, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True)", "returns": "AsyncSyncRunner"}, + {"name": "AsyncSyncRunner", "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, initial_cost: 
float = 0.0, *, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True, contract_source: Optional[str] = None)", "returns": "AsyncSyncRunner"}, {"name": "AsyncSyncRunner.run", "signature": "() -> Tuple[bool, str, float]", "returns": "Tuple[bool, str, float]"}, {"name": "build_dep_graph_from_architecture", "signature": "(arch_path: Path, target_basenames: List[str]) -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"}, {"name": "build_dep_graph_from_architecture_data", "signature": "(architecture: Any, target_basenames: List[str], *, source_name: str = 'architecture.json') -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"} @@ -27,7 +27,7 @@ Write the `pdd/agentic_sync_runner.py` module. Parallel sync engine that runs `pdd sync` for multiple modules concurrently using a ThreadPoolExecutor, respecting dependency ordering. Posts live progress updates to a GitHub issue comment. Supports state persistence for resumability across runs, phase tracking, and graceful interrupt handling. % Requirements -1. Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, verbose, issue_url, module_cwds, initial_cost=0.0, *, allowed_write_set=None, companion_allowlist=None, scope_guard_enabled=True)` +1. 
Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, verbose, issue_url, module_cwds, initial_cost=0.0, *, allowed_write_set=None, companion_allowlist=None, scope_guard_enabled=True, contract_source=None)` - `basenames: List[str]` — modules to sync - `dep_graph: Dict[str, List[str]]` — basename -> [dependency basenames] - `sync_options: Dict` — budget, total_budget, target_coverage, skip_verify, skip_tests, agentic, no_steer, max_attempts, one_session, local, timeout_adder @@ -38,6 +38,7 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `allowed_write_set: Optional[Iterable[str]]` — repo-relative path strings from the issue split contract that this sync run is permitted to modify. `None` means "no contract was parseable from the issue → run in permissive mode (no enforcement)". An explicit empty iterable means "contract present but empty → reject every change as out-of-scope" (a degenerate but legal contract). Resolved against each module's `cwd`/repo root inside the runner. - `companion_allowlist: Optional[Iterable[str]]` — additional glob patterns (e.g. `".pdd/meta/*.json"`) describing companion artifacts that MAY be modified outside the primary `allowed_write_set`. Defaults to `DEFAULT_SYNC_COMPANION_ALLOWLIST` from `agentic_common` (currently `(".pdd/meta/*.json",)`) when `None`. Issue contracts MAY widen the companion allowlist by passing a superset. - `scope_guard_enabled: bool` — master switch (default `True`). When `False`, the runner records the parsed contract for diagnostics but performs no enforcement, no revert, and no hard-fail. Maps to the CLI `--no-scope-guard` opt-out. + - `contract_source: Optional[str]` — diagnostic label carrying the parse source of the issue contract (`"html-comment"` or `"fenced-block"`, matching `IssueContract.source`) so scope-guard diagnostics and downstream review-loop reporters can surface where the contract was detected. 
`None` when no contract was parsed (permissive fallback). - Tracks per-module state: pending -> running -> success | failed 2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs 3. Use `concurrent.futures.ThreadPoolExecutor` with `MAX_WORKERS = 4`; when `sync_options["total_budget"]` is set, run sequentially and pass only the remaining total budget to each child process so the total budget is not multiplied per module. diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 46a5b9ce6..8dd4978e4 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -7249,11 +7249,18 @@ def test_fenced_block_must_immediately_follow_heading(self): def test_fenced_block_accepts_only_text_or_json(self): """Iter-3 F3: the spec at agentic_common_python.prompt:110 requires ``text`` or ``json`` info strings; bare ``` (no language) is NOT - accepted as a split-contract fence.""" + accepted as a split-contract fence. 
+ + Iter-12 B-1: each fence language has its own body format — ``text`` + is line-separated paths, ``json`` is a JSON array of path strings.""" from pdd.agentic_common import parse_issue_contract - for fence in ("```text", "```json"): - body = f"## Allowed Write Set\n{fence}\npdd/foo.py\n```\n" + cases = ( + ("```text", "pdd/foo.py"), + ("```json", '["pdd/foo.py"]'), + ) + for fence, payload in cases: + body = f"## Allowed Write Set\n{fence}\n{payload}\n```\n" c = parse_issue_contract(body) assert c is not None and c.allowed_paths == ("pdd/foo.py",), fence assert c.source == "fenced-block" @@ -7274,6 +7281,77 @@ def test_fenced_block_empty_body_returns_empty_contract(self): assert isinstance(c, IssueContract) assert c.allowed_paths == () + def test_fenced_json_array_of_paths_parses_correctly(self): + """Iter-12 B-1: a ``json`` fence whose body is a JSON array of path + strings must parse to those paths (NOT to a single literal path + equal to the raw JSON text).""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Allowed Write Set\n" + "```json\n" + '["pdd/foo.py", "tests/test_foo.py"]\n' + "```\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "fenced-block" + + def test_fenced_json_empty_array_returns_empty_contract(self): + """Iter-12 B-1: ``[]`` in a ``json`` fence is a syntactically valid + degenerate reject-all contract — the parser MUST return an + ``IssueContract`` with ``allowed_paths=()``, NOT a single-element + tuple containing the literal string ``'[]'``.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = "## Allowed Write Set\n```json\n[]\n```\n" + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () + assert c.source == "fenced-block" + + def test_fenced_json_malformed_returns_none(self): + """Iter-12 B-1: malformed JSON in 
a ``json`` fence MUST cause the + parser to return ``None`` (permissive fallback), not raise.""" + from pdd.agentic_common import parse_issue_contract + + body = "## Allowed Write Set\n```json\n{not valid json\n```\n" + assert parse_issue_contract(body) is None + + def test_fenced_json_object_returns_none(self): + """Iter-12 B-1: a JSON *object* in a ``json`` fence is the + HTML-comment format leaking into a fence — the fenced-block + ``json`` format is documented as an array of paths only, so the + parser MUST return ``None`` for objects.""" + from pdd.agentic_common import parse_issue_contract + + body = ( + "## Allowed Write Set\n" + "```json\n" + '{"allowed_paths": ["pdd/foo.py"]}\n' + "```\n" + ) + assert parse_issue_contract(body) is None + + def test_fenced_text_still_parses_line_by_line(self): + """Iter-12 B-1 regression: the ``text`` fence branch MUST keep its + original line-by-line semantics after the parser branched on + language.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Allowed Write Set\n" + "```text\n" + "pdd/foo.py\n" + "tests/test_foo.py\n" + "```\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "fenced-block" + def test_companion_allowlist_rejects_wildcard_only_patterns(self): """Iter-10 M-1: wildcard-only patterns (``*``, ``**``, ``**/*``, ``?``) would let a contract auto-allow repo-wide changes; the parser MUST From 2b90d8871e44670f793c841cd7aa6ba56746ecec Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 12:00:35 -0700 Subject: [PATCH 28/42] fix(sync): iter-14 M-1+M-2 anchored companion matcher; defer M-3 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-13 review: M-1+M-2 (Major, fixed): pathlib.PurePosixPath.match is suffix-based when the pattern is relative, so the default .pdd/meta/*.json companion glob falsely matched 
subdir/.pdd/meta/foo.json and a/b/c/.pdd/meta/foo.json. A contract violator could bypass the scope guard by writing fingerprint-shaped files under any nested directory. Empirically verified before fixing. Add _matches_companion_pattern_anchored() in agentic_common.py: segment-aware, anchored at the start of the path, equal segment count required, per-segment fnmatch.fnmatchcase. Replace the inline PurePosixPath.match call in both AsyncSyncRunner._matches_companion_allowlist and DurableSyncRunner._out_of_scope_staged_paths so the bug stays fixed in one place. Tighten _is_valid_companion_pattern to also reject ** doublestar segments. The anchored matcher does not implement recursive globbing; allowing ** would reintroduce the same depth-bypass foot-gun via the backdoor. Contracts that genuinely need depth-wildcard companions should enumerate directories explicitly. Two-part fix (not just the matcher swap): _enforce_scope_guard now normalizes companion candidates MODULE-relative (was repo-relative). The pattern .pdd/meta/*.json describes fingerprint metadata at the top of each module's working directory; in a multi-module repo (module_cwd is a subdirectory), the file lives at mod_a/.pdd/meta/x.json relative to the repo root but at .pdd/meta/x.json relative to the module. The old suffix-matcher obscured this by accidentally auto-allowing the repo-relative form — that same accident was the M-1 bug surface. The durable runner doesn't need this normalization (its staged paths are already worktree-rooted). M-3 (deferred): symlink-resolved out-of-scope path in _revert_out_of_scope_changes. This helper is used by agentic_update, agentic_fix, agentic_crash, agentic_e2e_fix_orchestrator, and the sync scope guard — blast radius is beyond PR #1014. No concrete trigger (LLMs do not write symlinks). To be filed as a follow-up issue together with iter-11 B-2 (ignored-file rescan). 
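The suffix-match behavior described in M-1+M-2 can be reproduced with the standard library alone. The sketch below contrasts `PurePosixPath.match` with an anchored, segment-aware matcher modeled on `_matches_companion_pattern_anchored` from this patch. `anchored_match` is an illustrative stand-in name, and the sketch omits the backslash normalization the shipped helper performs.

```python
import fnmatch
from pathlib import PurePosixPath


def anchored_match(rel_posix: str, pattern: str) -> bool:
    """Illustrative stand-in: match segment-by-segment from the path start."""
    if not pattern or not rel_posix:
        return False
    path_parts = rel_posix.strip("/").split("/")
    pattern_parts = pattern.strip("/").split("/")
    # Equal segment count is required, so a pattern cannot match at any depth.
    if len(path_parts) != len(pattern_parts):
        return False
    return all(
        fnmatch.fnmatchcase(segment, glob)
        for segment, glob in zip(path_parts, pattern_parts)
    )


PATTERN = ".pdd/meta/*.json"
NESTED = "subdir/.pdd/meta/foo.json"

# Relative patterns are matched from the right, so the nested path matches.
suffix_result = PurePosixPath(NESTED).match(PATTERN)
# The anchored matcher rejects the nested path and accepts the top-level one.
anchored_nested = anchored_match(NESTED, PATTERN)
anchored_top = anchored_match(".pdd/meta/foo.json", PATTERN)
```

Because the segment counts must be equal, a `**` segment would be treated as one ordinary wildcard segment rather than a recursive glob, which is consistent with the validator now rejecting `**` outright.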
Tests: 3 anchored-matcher cases, 1 doublestar-validator case, 1 async-runner nested-meta-rejection case, 1 durable-runner equivalent. Updated iter-10 test that previously accepted **/foo.json now expects it dropped. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_common.py | 44 +++++++++++- pdd/agentic_sync_runner.py | 52 +++++++++----- pdd/durable_sync_runner.py | 33 +++++---- pdd/prompts/agentic_common_python.prompt | 4 +- pdd/prompts/agentic_sync_runner_python.prompt | 2 +- pdd/prompts/durable_sync_runner_python.prompt | 2 +- tests/test_agentic_common.py | 71 ++++++++++++++++++- tests/test_agentic_sync_runner.py | 62 ++++++++++++++++ tests/test_durable_sync_runner.py | 29 ++++++++ 9 files changed, 259 insertions(+), 40 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index c160b685a..7b1c69df9 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -1,5 +1,6 @@ from __future__ import annotations +import fnmatch import functools import os import signal @@ -2449,7 +2450,7 @@ def _is_valid_contract_path(raw: object) -> bool: def _is_valid_companion_pattern(raw: object) -> bool: """ Return True iff *raw* is a repo-relative companion glob pattern with at - least one literal-character anchor. + least one literal-character anchor and no ``**`` doublestar segment. Issue #1013 iter-10 M-1: ``companion_allowlist`` accepts arbitrary glob patterns, so a contract that declares ``*``, ``**``, or ``**/*`` would @@ -2457,6 +2458,14 @@ def _is_valid_companion_pattern(raw: object) -> bool: bypass the split-contract write set. Reject patterns whose every segment is wildcard-only (no character outside ``*?``), as well as absolute, Windows-separator, traversal, and empty patterns. + + Issue #1013 iter-14 M-1/M-2: also reject patterns whose any segment is + exactly ``**``. 
The segment-aware matcher (``fnmatch.fnmatchcase`` per + segment) treats ``**`` as just another wildcard segment, which would + let a contract like ``**/foo.json`` auto-allow ``foo.json`` at any + depth — exactly the suffix-match foot-gun ``PurePosixPath.match`` + exhibited. Contracts that genuinely need a depth-wildcard companion + artifact should enumerate the directories explicitly. """ if not isinstance(raw, str): return False @@ -2470,6 +2479,11 @@ def _is_valid_companion_pattern(raw: object) -> bool: parts = candidate.split("/") if any(part == ".." for part in parts): return False + # Iter-14: reject doublestar segments. ``**`` only has well-defined + # semantics for recursive matching, which the anchored segment-aware + # matcher does NOT implement (it requires equal segment counts). + if any(part == "**" for part in parts): + return False # At least one segment MUST contain a literal character (anything # outside ``*?``); otherwise the pattern is wildcard-only and would # match arbitrary repo paths, defeating the scope guard. @@ -2481,6 +2495,34 @@ def _is_valid_companion_pattern(raw: object) -> bool: return False +def _matches_companion_pattern_anchored(rel_posix: str, pattern: str) -> bool: + """ + Issue #1013 iter-14 M-1/M-2: anchored, segment-aware glob match for + companion allowlist patterns. + + Unlike :meth:`pathlib.PurePosixPath.match` (which matches from the + right and lets ``.pdd/meta/*.json`` falsely match + ``subdir/.pdd/meta/foo.json``), this matcher requires path and + pattern to align segment-by-segment from the START of the path with + equal segment count. Each segment is matched via + :func:`fnmatch.fnmatchcase` for ``*`` / ``?`` semantics. + + Returns False on invalid patterns (already filtered by + :func:`_is_valid_companion_pattern`); callers should validate first + so an invalid pattern can never auto-allow a path. 
+ """ + if not pattern or not rel_posix: + return False + path_parts = rel_posix.replace("\\", "/").strip("/").split("/") + pattern_parts = pattern.replace("\\", "/").strip("/").split("/") + if len(path_parts) != len(pattern_parts): + return False + return all( + fnmatch.fnmatchcase(pp, patp) + for pp, patp in zip(path_parts, pattern_parts) + ) + + def _parse_html_comment_contract(text: str) -> Optional[IssueContract]: """Return a contract parsed from a ```` block, else None.""" match = _HTML_COMMENT_CONTRACT_RE.search(text) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 2e56151fb..d8e6becb5 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -21,7 +21,7 @@ from collections import defaultdict from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait from dataclasses import dataclass, field -from pathlib import Path, PurePosixPath +from pathlib import Path from typing import Any, Dict, Iterable, List, NamedTuple, Optional, Set, Tuple from rich.console import Console @@ -29,6 +29,7 @@ from .agentic_common import ( DEFAULT_SYNC_COMPANION_ALLOWLIST, _is_valid_companion_pattern, + _matches_companion_pattern_anchored, _revert_out_of_scope_changes, ) from .agentic_common_worktree import revert_out_of_scope_changes_with_dirs @@ -1886,25 +1887,25 @@ def _matches_companion_allowlist( ) -> bool: """Return True if *rel_posix_path* matches any companion glob. - Uses ``pathlib.PurePosixPath.match`` (not ``fnmatch.fnmatch``) so that - ``.pdd/meta/*.json`` does NOT inadvertently match nested paths like - ``.pdd/meta/nested/foo.json``. Matches Issue #1013 spec. + Issue #1013 iter-14 M-1: uses + :func:`_matches_companion_pattern_anchored` (segment-aware, + anchored at the START of the path) rather than + :meth:`pathlib.PurePosixPath.match`. 
The pathlib matcher is + suffix-based when the pattern is relative, so + ``.pdd/meta/*.json`` falsely matches ``subdir/.pdd/meta/foo.json`` + — letting a contract violator bypass the guard by writing + fingerprint-shaped files nested under any directory. """ - candidate = PurePosixPath(rel_posix_path) for pattern in allowlist: if not pattern: continue # Issue #1013 iter-10 M-1 (defense-in-depth): even if a - # wildcard-only pattern slipped past the parser, refuse to - # treat it as auto-allowing repo-wide writes. + # wildcard-only / doublestar pattern slipped past the parser, + # refuse to treat it as auto-allowing repo-wide writes. if not _is_valid_companion_pattern(pattern): continue - try: - if candidate.match(pattern): - return True - except ValueError: - # Invalid glob pattern — treat as non-match rather than raise. - continue + if _matches_companion_pattern_anchored(rel_posix_path, pattern): + return True return False def _remaining_out_of_scope_paths( @@ -2004,13 +2005,25 @@ def _enforce_scope_guard( # F1 (Issue #1013 iter-3): only files UNDER ``module_cwd`` count # as companion artifacts — never auto-allow a sibling module's # ``.pdd/meta/*.json`` just because it lives in the same repo. + # + # Iter-14 M-1: companion patterns are matched MODULE-RELATIVE, + # not repo-relative. The pattern ``.pdd/meta/*.json`` describes + # fingerprint metadata at the top of each module's working + # directory; in a multi-module repo where ``module_cwd`` is a + # subdirectory (e.g. ``mod_a/``), the file lives at + # ``mod_a/.pdd/meta/x.json`` relative to the repo root but at + # ``.pdd/meta/x.json`` relative to the module — and the latter + # is what the segment-aware anchored matcher must see. (The old + # ``PurePosixPath.match`` suffix-matching obscured this by + # accidentally auto-allowing the repo-relative form, which also + # auto-allowed any ``subdir/.pdd/meta/foo.json`` — the bug.) 
allowlist = tuple(self.companion_allowlist) cwd_path = Path(module_cwd).resolve() for path in cwd_path.rglob("*"): if not path.is_file(): continue try: - rel_posix = path.resolve().relative_to(repo_root).as_posix() + rel_posix = path.resolve().relative_to(cwd_path).as_posix() except ValueError: continue if self._matches_companion_allowlist(rel_posix, allowlist): @@ -2021,15 +2034,20 @@ def _enforce_scope_guard( # when a module is renamed/removed); those deletions appear in # ``git status`` as tracked ``D ``. Without this pass the revert # helper would resurrect the deleted companion and hard-fail. + # + # Iter-14 M-1: ``_git_changed_paths`` returns repo-relative paths; + # scope to ``cwd_path`` FIRST, then match the module-relative form + # against the companion pattern (same semantics as the rglob loop + # above). for rel_posix in _git_changed_paths(repo_root): - if not self._matches_companion_allowlist(rel_posix, allowlist): - continue absolute = (repo_root / rel_posix).resolve() try: - absolute.relative_to(cwd_path) + module_rel_posix = absolute.relative_to(cwd_path).as_posix() except ValueError: # Outside the module's cwd — scoped out by F1 iter-3. 
continue + if not self._matches_companion_allowlist(module_rel_posix, allowlist): + continue allowed_files.add(absolute) # Iter-6 B1 (data-loss bug): pre-existing untracked files diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index f2dce857f..700144d5c 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -17,10 +17,13 @@ import time import uuid from hashlib import sha1 -from pathlib import Path, PurePosixPath +from pathlib import Path from typing import Dict, List, Optional, Set, Tuple -from .agentic_common import _is_valid_companion_pattern +from .agentic_common import ( + _is_valid_companion_pattern, + _matches_companion_pattern_anchored, +) from .agentic_sync_runner import AsyncSyncRunner, MAX_WORKERS CHECKPOINT_TRAILER = "PDD-Sync-Checkpoint-V1" @@ -427,26 +430,26 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: normalized = normalized[2:] if normalized in self.allowed_write_paths: continue - # F3 (Issue #1013): companion glob matching uses pathlib-style - # semantics so ``.pdd/meta/*.json`` does NOT match nested paths - # like ``.pdd/meta/nested/foo.json``. - candidate = PurePosixPath(normalized) + # F3 (Issue #1013): companion glob matching uses anchored, + # segment-aware semantics so ``.pdd/meta/*.json`` does NOT + # match nested paths like ``.pdd/meta/nested/foo.json`` or + # ``subdir/.pdd/meta/foo.json``. Iter-14 M-2: replaced + # ``PurePosixPath.match`` (suffix-based, falsely matched + # ``subdir/.pdd/meta/foo.json``) with the centralized + # anchored matcher in ``agentic_common``. matched = False for pattern in allowlist: if not pattern: continue # Issue #1013 iter-10 M-1 (defense-in-depth): a - # wildcard-only / absolute / traversal pattern that - # slipped past the parser must NOT auto-allow repo-wide - # writes. + # wildcard-only / doublestar / absolute / traversal + # pattern that slipped past the parser must NOT + # auto-allow repo-wide writes. 
if not _is_valid_companion_pattern(pattern): continue - try: - if candidate.match(pattern): - matched = True - break - except ValueError: - continue + if _matches_companion_pattern_anchored(normalized, pattern): + matched = True + break if matched: continue offending.add(normalized) diff --git a/pdd/prompts/agentic_common_python.prompt b/pdd/prompts/agentic_common_python.prompt index 50c560d80..e592d0c6f 100644 --- a/pdd/prompts/agentic_common_python.prompt +++ b/pdd/prompts/agentic_common_python.prompt @@ -107,8 +107,8 @@ Shared infrastructure for agentic CLI invocations (Claude Code, Gemini, Codex, O 18. **Post Final Comment**: `post_final_comment(repo_owner, repo_name, issue_number, reason, total_cost, steps_completed, total_steps, cwd) -> bool`: Post a generated workflow summary comment to the GitHub issue when the workflow stops early. The function builds the comment body from the stop reason, cumulative cost, and completed/total step counts; callers do not pass a preformatted body. 19. **OpenCode Model Resolution**: Resolve the OpenCode model in this order: (1) `OPENCODE_MODEL` env var, kept verbatim including nested slashes like `openrouter/openai/gpt-5.3-codex`; (2) derive a candidate from `llm_model.csv` using PDD's existing model-strength semantics, then translate LiteLLM-oriented IDs via `_translate_to_opencode_model()`. The CSV fallback MUST be auth-aware: build the configured OpenCode provider set from parsed provider credentials in `~/.local/share/opencode/auth.json`, parsed usable OpenCode config provider/model entries (`~/.config/opencode/opencode.json`, nearest project `opencode.json`, `OPENCODE_CONFIG`, `OPENCODE_CONFIG_CONTENT`), and every provider credential env var represented in `llm_model.csv`; filter candidate rows to providers that are configured before selecting a model. 
OpenCode config sources contribute a configured provider only when they declare a provider/model path with resolvable auth or explicit local/no-key provider semantics; bare config existence is diagnostic-only. OpenCode agentic runs use `OPENCODE_MODEL` or the auth-aware CSV fallback, not generic direct-prompt model defaults. Required translations include `github_copilot/X -> github-copilot/X`, `gemini/X -> google/X`, bare Anthropic rows like `claude-sonnet-... -> anthropic/claude-sonnet-...`, and bare OpenAI rows like `gpt-5 -> openai/gpt-5`; IDs already in OpenCode `provider/model` form pass through unchanged. If no configured provider can serve the selected model, fail fast with an actionable error telling the user to set `OPENCODE_MODEL=provider/model`, configure the matching provider, or run `opencode models` after authentication. Do not rely on OpenCode default model resolution. 20. **OpenCode Optional Knobs**: Honor `OPENCODE_AGENT` by passing `--agent ` and `OPENCODE_VARIANT` by passing `--variant ` when set. Omit both flags when unset. `PDD_OPENCODE_MODE` is out of scope for this module version; use `opencode run` only. -21. **Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. 
The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block with info string ```text``` (one repo-relative path per line; blank lines and `#`-prefixed comments ignored) or ```json``` (body is a JSON array of repo-relative path strings, e.g. `["pdd/foo.py", "tests/test_foo.py"]`; non-array payloads such as objects/numbers/strings and malformed JSON yield `None` so the caller falls back to permissive mode — Issue #1013 iter-12 B-1). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). Per Issue #1013 iter-10 M-1, the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. 
An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, or is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`). Patterns with at least one segment containing a non-wildcard character (`.pdd/meta/*.json`, `architecture.json`, `**/foo.json`) remain valid. -22. **Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. The default value MUST be `(".pdd/meta/*.json",)` — only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline. +21. **Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. 
The parser scans the issue body first, then each comment (newest last is fine), looking for either (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block with info string ```text``` (one repo-relative path per line; blank lines and `#`-prefixed comments ignored) or ```json``` (body is a JSON array of repo-relative path strings, e.g. `["pdd/foo.py", "tests/test_foo.py"]`; non-array payloads such as objects/numbers/strings and malformed JSON yield `None` so the caller falls back to permissive mode — Issue #1013 iter-12 B-1). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). Per Issue #1013 iter-10 M-1 (tightened in iter-14 M-1/M-2 paired with iter-10 M-1), the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. 
An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`), OR contains any segment that is exactly `**` (the doublestar — `fnmatch` does not implement recursive glob semantics, and the anchored segment-aware matcher requires equal segment counts, so a `**`-bearing pattern would be ambiguous and is rejected at parse time). Patterns with at least one segment containing a non-wildcard character and no `**` segment (`.pdd/meta/*.json`, `architecture.json`, `tests/test_*.py`) remain valid. +22. **Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. The default value MUST be `(".pdd/meta/*.json",)` — only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline. Also expose `_matches_companion_pattern_anchored(rel_posix: str, pattern: str) -> bool` (Issue #1013 iter-14 M-1/M-2) — anchored, segment-aware glob matching used by both runner-side companion-allowlist checks. Unlike `pathlib.PurePosixPath.match` (which matches from the right and falsely treats `subdir/.pdd/meta/foo.json` as matching `.pdd/meta/*.json`), this helper requires the path and pattern to align segment-by-segment from the START of the path with equal segment count, and matches each segment via `fnmatch.fnmatchcase` for `*`/`?` semantics. 
Returns False on invalid patterns; callers should validate with `_is_valid_companion_pattern` first so an invalid pattern can never auto-allow a path. 23. **Scope Guard Helper (Issue #1013)**: `_revert_out_of_scope_changes(cwd, allowed_paths) -> List[Path]` is the shared revert helper used by `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_verify`, `agentic_e2e_fix_orchestrator`, and the sync scope guard. Signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and return the list of resolved paths that were reverted. Behavior contract: - Skip silently when `cwd` is not a git repo, when `git status` is unavailable, or when `allowed_paths` is NON-EMPTY and none of its entries fall under `cwd` (the "scope guard meant for a different module" optimization). An EMPTY `allowed_paths` is a legal reject-all contract (Issue #1013 degenerate-empty case) — proceed with revert. - Detect tracked changes via `git status --porcelain -uno`. diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index 97b730702..491d039db 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -65,7 +65,7 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin 20. Forward "Successfully submitted example" messages from child stdout to parent console 21. Heartbeat logging: during long-running syncs, print progress updates every 60s. Prefer parsed `PDD_PHASE` state — `f" — phase: {current_phase} ({len(completed_phases)} done)"` — so operators see real progress through the generate/test/fix phases instead of a stale `Preprocessing complete` line. Fall back to the last non-box-drawing stdout line only when no phase has been reported yet. 22. 
**Split-Contract Scope Guard (Issue #1013)**: After each per-module `pdd sync` subprocess completes (success or failure), and **before** the runner declares that module successful or persists state, the runner MUST invoke `_enforce_scope_guard(basename, module_cwd)` when `self.scope_guard_enabled` is True AND `self.allowed_write_set is not None`. The helper: - - Builds the effective allow set for the module: every path in `self.allowed_write_set` resolved against the module's repo root (the git toplevel of `module_cwd`, falling back to `module_cwd` itself), plus every path under `module_cwd` that matches any glob in the effective companion allowlist (`self.companion_allowlist` ∪ `DEFAULT_SYNC_COMPANION_ALLOWLIST`). + - Builds the effective allow set for the module: every path in `self.allowed_write_set` resolved against the module's repo root (the git toplevel of `module_cwd`, falling back to `module_cwd` itself), plus every path under `module_cwd` that matches any glob in the effective companion allowlist (`self.companion_allowlist` ∪ `DEFAULT_SYNC_COMPANION_ALLOWLIST`). Companion-allowlist matching MUST use `_matches_companion_pattern_anchored` from `pdd.agentic_common` (Issue #1013 iter-14 M-1) — anchored, segment-aware glob matching — NOT `pathlib.PurePosixPath.match`, whose suffix-based semantics would falsely auto-allow nested paths like `subdir/.pdd/meta/foo.json` against the default `.pdd/meta/*.json` pattern. Candidate paths are normalized as POSIX relative to `module_cwd`, NOT relative to the repo root, so a multi-module repo with per-module cwds correctly auto-allows `/.pdd/meta/*.json` under the default top-level companion pattern. For the `git status`-driven branch (companion-shaped paths that are TRACKED-DELETED and therefore invisible to `rglob`, iter-4 F1), scope each repo-relative path to `module_cwd` first, then strip the cwd prefix before invoking the matcher. 
- Calls `_revert_out_of_scope_changes(repo_root, allowed_paths)` from `pdd.agentic_common` to revert tracked out-of-scope modifications, AND calls `revert_out_of_scope_changes_with_dirs(repo_root, allowed_dirs=set(), allowed_files=allowed_paths)` from `pdd.agentic_common_worktree` to additionally remove untracked out-of-scope new files. The combination matches the existing scope-guard pattern used by `agentic_update`/`agentic_fix`/`agentic_crash`/`agentic_e2e_fix_orchestrator`. - **Post-revert re-scan (Issue #1013 iter-9, M-1 fail-closed boundary)**: after both helpers return, the runner MUST call `_remaining_out_of_scope_paths(repo_root, allowed_files)` to detect anything the helpers could not revert/remove (git timeout, permission error, restore failure — those helpers log a warning and return `[]`, which the orchestrator otherwise mistakes for "clean"). Paths returned by the re-scan and not already in the helper-returned offending list go into an `Unrecovered (revert failed, manual cleanup required):` section in the diagnostic. A non-empty Unrecovered set MUST cause `_enforce_scope_guard` to return a diagnostic string (hard-fail the module) even when the revert helpers themselves returned empty lists. When `offending` is empty but `Unrecovered` is non-empty, emit the alternate header `Scope guard detected out-of-scope artifacts for module '' (contract source: ) but the revert helpers reported no successful reverts.` instead of the misleading `Scope guard reverted 0 out-of-scope file(s)...`. The `Unrecovered` section is OMITTED entirely when empty (no empty headers). 
- Diagnostic format (printed to stderr; structured for downstream parsers — checkup, review-loop reports): diff --git a/pdd/prompts/durable_sync_runner_python.prompt b/pdd/prompts/durable_sync_runner_python.prompt index e4366f3fd..8097393cf 100644 --- a/pdd/prompts/durable_sync_runner_python.prompt +++ b/pdd/prompts/durable_sync_runner_python.prompt @@ -14,7 +14,7 @@ Durable execution engine for `pdd sync --durable`. It must pr 4. Use `.pdd/worktrees/durable-issue-` as the main durable worktree and `.pdd/worktrees/sync-issue--` for per-module worktrees. 5. Resume by scanning pushed checkpoint commits on the durable branch for trailers formatted as `PDD-Sync-Checkpoint-V1: issue= module=`. Ignore trailers for other issues. 6. Do not rely on `.pdd/agentic_sync_state.json` for durable resume. Corrupt or missing local state must not prevent resuming from remote checkpoint trailers. -7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. If the parent issue supplied an allowed write set, reject any staged path outside that exact repo-relative set before creating the checkpoint. Push the checkpoint before printing `PDD_CHECKPOINT:`. +7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. If the parent issue supplied an allowed write set, reject any staged path outside that exact repo-relative set before creating the checkpoint. Companion-allowlist matching for staged paths MUST use `_matches_companion_pattern_anchored` from `pdd.agentic_common` (Issue #1013 iter-14 M-2) — anchored, segment-aware glob matching — NOT `pathlib.PurePosixPath.match`, whose suffix-based semantics would let a nested path like `subdir/.pdd/meta/foo.json` falsely match the default `.pdd/meta/*.json` companion pattern and bypass the split contract. Push the checkpoint before printing `PDD_CHECKPOINT:`. 8. 
If a module succeeds with no file diff, create an empty checkpoint commit so resume can still skip it later. 9. Never checkpoint unsafe files: `.env`, `.env.local`, `cost.csv`, `crash.log`, `fix_errors.log`, `.pem`, `.key`, token/secret paths, `.pdd/worktrees`, `.pdd/agentic_sync_state.json`, or unrelated `.pdd` files. 10. On patch conflict or failed module output, exit non-zero for the durable run, abort any in-progress `git am`, preserve prior checkpoints, and do not create later checkpoints. diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 8dd4978e4..1dcb1ba4f 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -7371,22 +7371,26 @@ def test_companion_allowlist_rejects_wildcard_only_patterns(self): def test_companion_allowlist_keeps_anchored_patterns(self): """Iter-10 M-1: patterns with at least one literal-character segment - anchor remain valid.""" + anchor remain valid. Iter-14 M-1/M-2: ``**``-bearing patterns are + ALSO dropped now (segment-aware matcher requires equal segment + counts, so a doublestar segment would be ambiguous).""" from pdd.agentic_common import parse_issue_contract, IssueContract body = ( "" ) c = parse_issue_contract(body) assert isinstance(c, IssueContract) + # ``**/foo.json`` is dropped by the iter-14 doublestar rule; + # the three remaining anchored patterns are kept. 
assert c.companion_allowlist == ( ".pdd/meta/*.json", "architecture.json", - "**/foo.json", + "tests/test_*.py", ) def test_companion_allowlist_rejects_traversal_and_absolute(self): @@ -7416,3 +7420,64 @@ def test_default_companion_allowlist_passes_validation(self): assert DEFAULT_SYNC_COMPANION_ALLOWLIST for pattern in DEFAULT_SYNC_COMPANION_ALLOWLIST: assert _is_valid_companion_pattern(pattern), pattern + + def test_anchored_matcher_rejects_nested_default_pattern(self): + """Iter-14 M-1/M-2: the anchored, segment-aware matcher MUST treat + ``.pdd/meta/*.json`` as a TOP-LEVEL pattern — paths nested under + any other directory (``subdir/.pdd/meta/foo.json``) MUST NOT auto- + allow, because ``PurePosixPath.match`` is suffix-based and would + let a contract violator bypass the guard by writing fingerprint- + shaped files under any prefix. + """ + from pdd.agentic_common import _matches_companion_pattern_anchored + + # Intended: top-level match auto-allows. + assert _matches_companion_pattern_anchored( + ".pdd/meta/foo.json", ".pdd/meta/*.json" + ) is True + # Bug repro: nested-prefix path must NOT match. + assert _matches_companion_pattern_anchored( + "subdir/.pdd/meta/foo.json", ".pdd/meta/*.json" + ) is False + # Deeper-prefix path must NOT match. + assert _matches_companion_pattern_anchored( + "a/b/c/.pdd/meta/foo.json", ".pdd/meta/*.json" + ) is False + # Path nested UNDER the meta dir (different segment count) must + # NOT match — preserves the iter-3 F3 strict-pathlib semantics. + assert _matches_companion_pattern_anchored( + ".pdd/meta/sub/foo.json", ".pdd/meta/*.json" + ) is False + + def test_anchored_matcher_handles_segment_wildcards(self): + """Iter-14 M-1/M-2: ``*`` matches a single segment only. The + matcher MUST NOT collapse multiple segments into one wildcard. 
+ """ + from pdd.agentic_common import _matches_companion_pattern_anchored + + assert _matches_companion_pattern_anchored( + "tests/foo.txt", "tests/*.txt" + ) is True + # Segment count mismatch — ``*`` does not span ``sub/foo.txt``. + assert _matches_companion_pattern_anchored( + "tests/sub/foo.txt", "tests/*.txt" + ) is False + + def test_is_valid_companion_pattern_rejects_doublestar(self): + """Iter-14 M-1/M-2: ``**`` segments are rejected at parse time so + the segment-aware matcher never sees them. The validator MUST + reject both pure ``**`` and any pattern with a ``**`` segment. + """ + from pdd.agentic_common import _is_valid_companion_pattern + + # Pure wildcard-only patterns (already iter-10 territory). + assert _is_valid_companion_pattern("**") is False + # Iter-14: ``**`` SEGMENTS rejected even when paired with literals. + assert _is_valid_companion_pattern("**/foo.json") is False + assert _is_valid_companion_pattern("foo/**") is False + assert _is_valid_companion_pattern("foo/**/bar.json") is False + # Regression: the shipped default and other anchored patterns + # without ``**`` segments remain valid. + assert _is_valid_companion_pattern(".pdd/meta/*.json") is True + assert _is_valid_companion_pattern("architecture.json") is True + assert _is_valid_companion_pattern("tests/test_*.py") is True diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 3da283596..621e88d24 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2845,6 +2845,68 @@ def fake_revert(repo_root, allowed_files): # is non-None — confirming the scope guard did flag it. 
assert diagnostic is not None + def test_nested_meta_path_is_not_auto_allowed( + self, tmp_path, monkeypatch + ): + """Iter-14 M-1: the default companion pattern ``.pdd/meta/*.json`` + was previously matched with ``PurePosixPath.match`` (suffix-based), + which falsely matched any path ending in ``.pdd/meta/.json`` + — including ``subdir/.pdd/meta/bar.json``. The anchored matcher + MUST treat the default pattern as TOP-LEVEL, so a nested path is + out of scope even though it carries the right basename and dir + suffix. + """ + from pdd import agentic_sync_runner as mod + + repo = tmp_path + # Create an out-of-scope file at a NESTED .pdd/meta path — the + # exact bug shape: a fingerprint-shaped file under a subdir. + nested = repo / "subdir" / ".pdd" / "meta" + nested.mkdir(parents=True) + offending = nested / "bar.json" + offending.write_text("{}", encoding="utf-8") + + captured_allowed = {} + + def fake_revert(repo_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [offending] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: repo.resolve() + ) + monkeypatch.setattr( + runner, "_remaining_out_of_scope_paths", + lambda _root, _allowed: [], + ) + + # Direct matcher assertion: anchored, segment-aware match must + # REJECT a nested .pdd/meta/*.json path against the top-level + # pattern (the iter-14 M-1 bug shape). + assert runner._matches_companion_allowlist( + "subdir/.pdd/meta/bar.json", (".pdd/meta/*.json",) + ) is False + + diagnostic = runner._enforce_scope_guard("mod", repo) + # Nested file must NOT be auto-allowed even though it is shaped + # like the default companion pattern. 
+ assert offending.resolve() not in captured_allowed.get("files", set()), ( + "nested .pdd/meta path must NOT be auto-allowed by the " + "default top-level companion pattern (iter-14 M-1)" + ) + assert diagnostic is not None + def test_deleted_companion_in_git_status_is_preserved( self, tmp_path, monkeypatch ): diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 832fd07e3..b9cee2319 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -387,6 +387,35 @@ def test_wildcard_only_companion_pattern_is_ignored_by_durable_runner( ) == ["unrelated/file.py"] +def test_durable_nested_meta_path_is_not_in_companion_allowlist( + tmp_path: Path, +): + """Iter-14 M-2: durable checkpoint scope checking previously used + ``PurePosixPath.match`` (suffix-based), which falsely matched + ``subdir/.pdd/meta/foo.json`` against the default top-level + ``.pdd/meta/*.json`` companion pattern. The anchored matcher MUST + refuse to auto-allow nested fingerprint-shaped paths, so staged + nested ``.pdd/meta/*.json`` files surface as out-of-scope and the + checkpoint is rejected. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # The nested path is shaped like a fingerprint-meta artifact but + # sits under ``subdir/`` — the iter-14 M-2 bug shape. + result = runner._out_of_scope_staged_paths( + ["subdir/.pdd/meta/bar.json"] + ) + assert result == ["subdir/.pdd/meta/bar.json"], ( + "nested .pdd/meta path must NOT be auto-allowed by the " + "default top-level companion pattern (iter-14 M-2)" + ) + + def test_staged_rename_source_side_is_scope_checked(tmp_path: Path): """Iter-6 B3 (rename detection bug): ``git diff --cached --name-only`` for a staged ``git mv old new`` emits ONLY ``new``. 
A contract that From 74796136c501f396bbc97596ed3985cb458018b0 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 12:12:46 -0700 Subject: [PATCH 29/42] fix(sync): iter-16 M-1 thread module context into durable scope check MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-15 review (Major): iter-14's anchored matcher fix landed in the durable runner, but with the assumption "staged paths are already worktree-rooted." That assumption held for single-module durable sync but BROKE multi-module sync: when module_cwd is a subdirectory of the worktree (e.g. `worktree/pkg` for module `pkg_mod`), staged paths surface as `pkg/.pdd/meta/foo.json` relative to the worktree git root. The default companion pattern `.pdd/meta/*.json` describes MODULE-RELATIVE artifacts and does not match the worktree-rooted form, so legitimate fingerprint metadata gets rejected — blocking valid checkpoint commits under an otherwise correct split contract. Thread `basename` and `module_worktree` through `_out_of_scope_staged_paths` and resolve `module_cwd_rel` via `_module_cwd_for_worktree(...).relative_to(module_worktree)`. When `module_cwd_rel` is non-empty: - Staged paths NOT starting with `module_cwd_rel + "/"` are sibling-module artifacts and never auto-allow (preserves F1 iter-3 sibling rule). - Staged paths within the module strip the prefix before the anchored companion matcher runs, so `pkg/.pdd/meta/foo.json` becomes `.pdd/meta/foo.json` for the match decision. When `module_cwd_rel` is `""` or `"."` (single-module worktree), the match stays repo-relative — preserves iter-14 single-module semantics (`.pdd/meta/*.json` matches `.pdd/meta/foo.json`, NOT `subdir/.pdd/meta/foo.json`). The allowed_write_paths check stays repo-relative — the contract is declared with repo-rooted paths and unchanged. 
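A minimal sketch of the prefix-stripping rule described above, for reviewers who want to poke at the edge cases. The names `module_relative_candidate`, `anchored_match`, and `companion_allowed` are hypothetical stand-ins introduced for illustration; they are not the functions in `pdd/durable_sync_runner.py` or `pdd.agentic_common`, and the matcher here is a simplified assumption of "anchored, segment-aware" matching built on `fnmatch.fnmatchcase`.

```python
from fnmatch import fnmatchcase
from typing import Optional


def module_relative_candidate(staged: str, module_cwd_rel: str) -> Optional[str]:
    """Return the path to hand to the companion matcher, or None for siblings.

    Hypothetical helper: `staged` is a worktree-rooted POSIX path and
    `module_cwd_rel` is the module cwd relative to the worktree.
    """
    if module_cwd_rel in ("", "."):
        # Single-module worktree: the match stays repo-relative.
        return staged
    prefix = module_cwd_rel + "/"
    if not staged.startswith(prefix):
        # Outside the module's cwd: sibling artifact, never auto-allowed.
        return None
    return staged[len(prefix):]


def anchored_match(path: str, pattern: str) -> bool:
    """Simplified stand-in for an anchored, segment-aware glob match."""
    path_parts = path.split("/")
    pattern_parts = pattern.split("/")
    # Equal segment counts are required, so a pattern cannot match a suffix.
    if len(path_parts) != len(pattern_parts):
        return False
    return all(fnmatchcase(p, q) for p, q in zip(path_parts, pattern_parts))


def companion_allowed(staged: str, module_cwd_rel: str, pattern: str) -> bool:
    """True when the staged path is a companion artifact of this module."""
    candidate = module_relative_candidate(staged, module_cwd_rel)
    return candidate is not None and anchored_match(candidate, pattern)
```

Under these assumptions, `companion_allowed("pkg/.pdd/meta/foo.json", "pkg", ".pdd/meta/*.json")` is true, while the sibling path `pkg_other/.pdd/meta/foo.json` and the single-module nested path `subdir/.pdd/meta/foo.json` are both rejected.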
Tests: multi-module companion auto-allow, sibling-module rejection, single-module regression (companion still matches), single-module nested-meta regression (iter-14 fix still in effect). Five existing call sites updated to pass new kwargs. Co-Authored-By: Claude Opus 4.7 --- pdd/durable_sync_runner.py | 55 +++++++- pdd/prompts/durable_sync_runner_python.prompt | 2 +- tests/test_durable_sync_runner.py | 132 +++++++++++++++++- 3 files changed, 180 insertions(+), 9 deletions(-) diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 700144d5c..59b7d205c 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -388,7 +388,9 @@ def _stage_module_changes( # or a single-path status (A/M/D/T), every column past the # status code is a path that should be scope-checked. changed_paths.extend(p.strip() for p in parts[1:] if p.strip()) - out_of_scope = self._out_of_scope_staged_paths(changed_paths) + out_of_scope = self._out_of_scope_staged_paths( + changed_paths, basename, module_worktree + ) if out_of_scope: return ( False, @@ -408,7 +410,12 @@ def _stage_module_changes( empty = not changed_paths return True, "", empty - def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: + def _out_of_scope_staged_paths( + self, + paths: List[str], + basename: str, + module_worktree: Path, + ) -> List[str]: """ Return staged paths that violate the issue split-contract. @@ -418,16 +425,44 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: both contract paths AND companion-allowlist matches (e.g. ``.pdd/meta/*.json``) so fingerprint metadata can still be checkpointed alongside the primary write set. + + Issue #1013 iter-16 M-1: in a multi-module repo where + ``module_cwd`` is a SUBDIRECTORY of the worktree (e.g. + ``worktree/pkg``), staged paths surface relative to the worktree + git root (e.g. ``pkg/.pdd/meta/foo.json``). Companion-allowlist + patterns describe MODULE-RELATIVE artifacts (e.g. 
+ ``.pdd/meta/*.json``), so the staged path must be stripped of the + module cwd prefix before the anchored matcher runs. Mirrors the + async runner's iter-14 M-1 part-2 fix. Paths that sit OUTSIDE the + module's cwd (sibling-module artifacts) never auto-allow — see + F1 iter-3. """ # Permissive mode: scope_guard disabled or no contract parsed. if not self.scope_guard_enabled or self.allowed_write_paths is None: return [] allowlist = tuple(self.companion_allowlist) + # Resolve the module's cwd relative to its worktree once. For a + # single-module sync where ``module_cwd == module_worktree``, this + # is ``"."`` (or ``""`` if the helper ever returned the worktree + # path itself) and we treat it as "no prefix to strip". + module_cwd = self._module_cwd_for_worktree(basename, module_worktree) + try: + module_cwd_rel = module_cwd.resolve().relative_to( + module_worktree.resolve() + ).as_posix() + except ValueError: + module_cwd_rel = "" + if module_cwd_rel in ("", "."): + module_cwd_rel = "" + offending: Set[str] = set() for raw in paths: normalized = raw.replace(os.sep, "/").strip() if normalized.startswith("./"): normalized = normalized[2:] + # Allowed-write-set match is REPO-RELATIVE by contract (the + # split contract is declared with repo-rooted paths) and stays + # unchanged. if normalized in self.allowed_write_paths: continue # F3 (Issue #1013): companion glob matching uses anchored, @@ -437,6 +472,20 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: # ``PurePosixPath.match`` (suffix-based, falsely matched # ``subdir/.pdd/meta/foo.json``) with the centralized # anchored matcher in ``agentic_common``. + # + # Iter-16 M-1: for multi-module sync, strip the module_cwd + # prefix before matching so the module-relative pattern works + # against the repo-relative staged path. Paths outside the + # module's cwd are sibling artifacts and never auto-allow. 
+ if module_cwd_rel: + prefix = module_cwd_rel + "/" + if not normalized.startswith(prefix): + offending.add(normalized) + continue + candidate_rel = normalized[len(prefix):] + else: + candidate_rel = normalized + matched = False for pattern in allowlist: if not pattern: @@ -447,7 +496,7 @@ def _out_of_scope_staged_paths(self, paths: List[str]) -> List[str]: # auto-allow repo-wide writes. if not _is_valid_companion_pattern(pattern): continue - if _matches_companion_pattern_anchored(normalized, pattern): + if _matches_companion_pattern_anchored(candidate_rel, pattern): matched = True break if matched: diff --git a/pdd/prompts/durable_sync_runner_python.prompt b/pdd/prompts/durable_sync_runner_python.prompt index 8097393cf..1ce2ea224 100644 --- a/pdd/prompts/durable_sync_runner_python.prompt +++ b/pdd/prompts/durable_sync_runner_python.prompt @@ -14,7 +14,7 @@ Durable execution engine for `pdd sync --durable`. It must pr 4. Use `.pdd/worktrees/durable-issue-` as the main durable worktree and `.pdd/worktrees/sync-issue--` for per-module worktrees. 5. Resume by scanning pushed checkpoint commits on the durable branch for trailers formatted as `PDD-Sync-Checkpoint-V1: issue= module=`. Ignore trailers for other issues. 6. Do not rely on `.pdd/agentic_sync_state.json` for durable resume. Corrupt or missing local state must not prevent resuming from remote checkpoint trailers. -7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. If the parent issue supplied an allowed write set, reject any staged path outside that exact repo-relative set before creating the checkpoint. 
Companion-allowlist matching for staged paths MUST use `_matches_companion_pattern_anchored` from `pdd.agentic_common` (Issue #1013 iter-14 M-2) — anchored, segment-aware glob matching — NOT `pathlib.PurePosixPath.match`, whose suffix-based semantics would let a nested path like `subdir/.pdd/meta/foo.json` falsely match the default `.pdd/meta/*.json` companion pattern and bypass the split contract. Push the checkpoint before printing `PDD_CHECKPOINT:`. +7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. If the parent issue supplied an allowed write set, reject any staged path outside that exact repo-relative set before creating the checkpoint. Companion-allowlist matching for staged paths MUST use `_matches_companion_pattern_anchored` from `pdd.agentic_common` (Issue #1013 iter-14 M-2) — anchored, segment-aware glob matching — NOT `pathlib.PurePosixPath.match`, whose suffix-based semantics would let a nested path like `subdir/.pdd/meta/foo.json` falsely match the default `.pdd/meta/*.json` companion pattern and bypass the split contract. Issue #1013 iter-16 M-1: companion patterns are matched MODULE-RELATIVE, so when `module_cwd != module_worktree` (multi-module sync where the module lives in a subdirectory like `worktree/pkg`), strip the module_cwd prefix from each staged path before invoking the anchored matcher — otherwise legitimate metadata such as `pkg/.pdd/meta/foo.json` would be rejected. Staged paths that fall outside the module's cwd (sibling-module artifacts) MUST NOT auto-allow under any companion pattern (F1 iter-3 sibling rule). Push the checkpoint before printing `PDD_CHECKPOINT:`. 8. If a module succeeds with no file diff, create an empty checkpoint commit so resume can still skip it later. 9. 
Never checkpoint unsafe files: `.env`, `.env.local`, `cost.csv`, `crash.log`, `fix_errors.log`, `.pem`, `.key`, token/secret paths, `.pdd/worktrees`, `.pdd/agentic_sync_state.json`, or unrelated `.pdd` files. 10. On patch conflict or failed module output, exit non-zero for the durable run, abort any in-progress `git am`, preserve prior checkpoints, and do not create later checkpoints. diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index b9cee2319..695e4f850 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -336,7 +336,9 @@ def test_allowed_write_set_rejects_out_of_scope_checkpoint_paths(tmp_path: Path) runner = _runner(repo, allowed_write_set=["src/app.py"]) assert runner._out_of_scope_staged_paths( - ["src/app.py", "architecture.json", ".pdd/meta/foo_python.json"] + ["src/app.py", "architecture.json", ".pdd/meta/foo_python.json"], + "foo", + repo, ) == ["architecture.json"] @@ -349,7 +351,9 @@ def test_allowed_write_set_none_means_permissive_for_durable_runner(tmp_path: Pa runner = _runner(repo, allowed_write_set=None) assert runner._out_of_scope_staged_paths( - ["src/app.py", "architecture.json", "anything/else.txt"] + ["src/app.py", "architecture.json", "anything/else.txt"], + "foo", + repo, ) == [] @@ -362,7 +366,9 @@ def test_allowed_write_set_empty_rejects_everything_for_durable_runner(tmp_path: runner = _runner(repo, allowed_write_set=[]) result = runner._out_of_scope_staged_paths( - ["src/app.py", ".pdd/meta/foo_python.json"] + ["src/app.py", ".pdd/meta/foo_python.json"], + "foo", + repo, ) assert result == ["src/app.py"] @@ -383,7 +389,9 @@ def test_wildcard_only_companion_pattern_is_ignored_by_durable_runner( # ``**/*`` is wildcard-only, so it must NOT auto-allow ``unrelated/file.py``. 
assert runner._out_of_scope_staged_paths( - ["unrelated/file.py"] + ["unrelated/file.py"], + "foo", + repo, ) == ["unrelated/file.py"] @@ -408,7 +416,9 @@ def test_durable_nested_meta_path_is_not_in_companion_allowlist( # The nested path is shaped like a fingerprint-meta artifact but # sits under ``subdir/`` — the iter-14 M-2 bug shape. result = runner._out_of_scope_staged_paths( - ["subdir/.pdd/meta/bar.json"] + ["subdir/.pdd/meta/bar.json"], + "foo", + repo, ) assert result == ["subdir/.pdd/meta/bar.json"], ( "nested .pdd/meta path must NOT be auto-allowed by the " @@ -416,6 +426,118 @@ def test_durable_nested_meta_path_is_not_in_companion_allowlist( ) +def test_multi_module_durable_companion_matched_module_relative( + tmp_path: Path, +): + """Iter-16 M-1: in a multi-module repo where ``module_cwd`` is a + SUBDIRECTORY of the worktree (``worktree/pkg``), staged paths + surface relative to the worktree git root (``pkg/.pdd/meta/foo.json``). + The companion pattern ``.pdd/meta/*.json`` is module-relative, so the + durable scope check MUST strip the module_cwd prefix before matching; + otherwise legitimate fingerprint metadata is rejected and the + checkpoint commit fails. Mirrors the async-side iter-14 M-1 part-2 + fix. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + basenames=["pkg_mod"], + module_cwds={"pkg_mod": repo / "pkg"}, + allowed_write_set=["pkg/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # Staged path is repo-relative (``pkg/.pdd/meta/foo.json``) but the + # companion pattern describes module-relative metadata. With the + # iter-16 fix the prefix is stripped and the anchored matcher sees + # ``.pdd/meta/foo.json`` — a clean match. 
+ result = runner._out_of_scope_staged_paths( + ["pkg/.pdd/meta/foo.json"], + "pkg_mod", + repo, + ) + assert result == [], ( + "multi-module durable runner must strip module_cwd prefix before " + "companion-pattern matching (iter-16 M-1)" + ) + + +def test_durable_sibling_module_metadata_rejected(tmp_path: Path): + """Iter-16 M-1 (sibling-module regression for F1 iter-3): when + ``module_cwd = worktree/pkg``, a sibling module's metadata path like + ``pkg_other/.pdd/meta/foo.json`` sits OUTSIDE the active module's + cwd. The companion allowlist must NOT auto-allow it; only files + UNDER the module's own cwd qualify as companion artifacts. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + basenames=["pkg_mod"], + module_cwds={"pkg_mod": repo / "pkg"}, + allowed_write_set=["pkg/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pkg_other/.pdd/meta/foo.json"], + "pkg_mod", + repo, + ) + assert result == ["pkg_other/.pdd/meta/foo.json"], ( + "sibling-module metadata must NOT be auto-allowed by the " + "companion allowlist (F1 iter-3 sibling rule, iter-16 M-1)" + ) + + +def test_single_module_durable_companion_still_matches(tmp_path: Path): + """Iter-16 M-1 (single-module regression): when ``module_cwd == + module_worktree`` (no submodule prefix), top-level + ``.pdd/meta/foo.json`` must still match the default companion + pattern. The iter-16 prefix-stripping must be a no-op in this case. 
+ """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["src/app.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + [".pdd/meta/foo.json"], + "foo", + repo, + ) + assert result == [], ( + "single-module durable runner must still auto-allow top-level " + ".pdd/meta artifacts (iter-16 M-1 single-module regression)" + ) + + +def test_single_module_durable_nested_meta_not_allowed(tmp_path: Path): + """Iter-14 M-2 (regression): single-module durable runner with + ``module_cwd == module_worktree`` must still reject a NESTED + ``subdir/.pdd/meta/foo.json`` — the anchored matcher refuses + suffix-style matches, and iter-16's prefix-stripping must not + weaken that. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["src/app.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["subdir/.pdd/meta/foo.json"], + "foo", + repo, + ) + assert result == ["subdir/.pdd/meta/foo.json"], ( + "nested .pdd/meta path must remain out-of-scope under single-" + "module mode (iter-14 M-2 regression preserved by iter-16)" + ) + + def test_staged_rename_source_side_is_scope_checked(tmp_path: Path): """Iter-6 B3 (rename detection bug): ``git diff --cached --name-only`` for a staged ``git mv old new`` emits ONLY ``new``. A contract that From 37e32731ad1a3e68e3c08475f83bb1f92a361efb Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 12:50:54 -0700 Subject: [PATCH 30/42] fix(sync): iter-18 bullet-list parser + durable baseline + dedupe permissive log MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Three verified findings from external review: Blocker — the parser only supported HTML-comment and fenced-code-block markers, but the real-world split-contract format used in issue #1005 (the original motivation for #1013) is neither. 
#1005's body declares the contract under `## Split Contract` with a `**Allowed write set:**` label and a markdown bullet list of paths. Our parser returned None for this and pdd sync fell back to permissive mode — the exact regression #1013 was meant to prevent. Verified end-to-end: fetching #1005's body and feeding it into parse_issue_contract returned allowed_paths=('pdd/update_main.py', 'pdd/prompts/update_main_python.prompt', 'tests/test_update_main.py'), source='bullet-list' after the fix. Add a third parser branch `_parse_bullet_list_contract` in agentic_common.py that anchors on `## Split Contract` / `## Allowed Write Set` headings + `**Allowed write set:**` label + a `-`/`*`/`+` bullet list. The list terminates at the next `**Label:**`, `---` horizontal rule, next heading, blank-then-non-bullet, or EOF. Strips surrounding backticks. Drops invalid entries via `_is_valid_contract_path`. Returns the iter-8 B5 empty-contract reject-all when nothing valid remains. Priority order: HTML-comment > fenced-block > bullet-list. Major — DurableSyncRunner called super().__init__() BEFORE pinning self.project_root = self.git_root, so AsyncSyncRunner snapshotted _baseline_changed_paths from the caller's Path.cwd() rather than the durable worktree's root. Those baseline paths then got auto-allowed during enforcement, so a dirty out.py in the caller's checkout caused out.py generated by sync in a separate worktree to pass scope guard with diag=None. Add a keyword-only project_root: Optional[Path] = None to AsyncSyncRunner.__init__; when provided it overrides Path.cwd() BEFORE the baseline snapshot. DurableSyncRunner now passes project_root=self.git_root in its super() call; the redundant post-super self.project_root assignment is removed. Minor — permissive-mode runs were logging the "no contract" message twice (once in agentic_sync.py before dispatch, once in AsyncSyncRunner.run() entry). 
Drop the runner-side log; keep the dispatch-site log which already has richer context. Tests: 7 new bullet-list parser cases (incl. #1005 body verbatim, section-terminator variants, backtick stripping, HTML-comment priority); 3 new project_root tests (durable pinning, async kwarg, baseline content); 2 existing iter-3 runner tests inverted to assert the duplicate log is absent. 664 passed across the scope-guard suite. README §"Split-Contract Scope Guard" documents the third format; prompt § 21 updated. Co-Authored-By: Claude Opus 4.7 --- README.md | 17 +- pdd/agentic_common.py | 146 +++++++++++++++- pdd/agentic_sync_runner.py | 36 ++-- pdd/durable_sync_runner.py | 10 +- pdd/prompts/agentic_common_python.prompt | 2 +- pdd/prompts/agentic_sync_runner_python.prompt | 11 +- tests/test_agentic_common.py | 157 ++++++++++++++++++ tests/test_agentic_sync_runner.py | 90 ++++++++-- tests/test_durable_sync_runner.py | 59 +++++++ 9 files changed, 486 insertions(+), 42 deletions(-) diff --git a/README.md b/README.md index 01d8da009..15b5ead0a 100644 --- a/README.md +++ b/README.md @@ -1065,7 +1065,7 @@ Options (agentic mode): When the linked GitHub issue declares an allowed write set (a "split contract"), `pdd sync` enforces it: each per-module subprocess is followed by a scope check that reverts tracked changes and removes untracked new files that fall outside the contract. Companion artifacts under `.pdd/meta/*.json` are auto-allowed because they are sync's own fingerprint bookkeeping; issues may opt additional companions (e.g. examples or architecture entries) into the allowlist explicitly. -The contract is read from the issue body or any of its comments in one of two forms: +The contract is read from the issue body or any of its comments in one of three forms (tried in priority order — the first match wins): 1. 
An HTML-comment block (preferred — invisible in rendered Markdown): ```html @@ -1086,6 +1086,21 @@ The contract is read from the issue body or any of its comments in one of two fo pdd/prompts/update_main_python.prompt tests/test_update_main.py ``` +3. A bullet list under an inline `**Allowed write set:**` label (the + real-world shape used by sub-issues such as #1005): + + ```markdown + ## Split Contract + **Command sequence:** change → sync + **Allowed write set:** + - `pdd/update_main.py` + - `pdd/prompts/update_main_python.prompt` + - `tests/test_update_main.py` + **Acceptance criteria:** + - ... + ``` + + The heading regex is the same as form 2; the inline `**Allowed write set:**` label discriminates the bullet list so unrelated bullets earlier in the body (e.g. a `## Files` section) are NOT captured. Each bullet is one repo-relative POSIX path with optional surrounding backticks. The list terminates at the next `**Label:**` (such as `**Acceptance criteria:**`), a `---` rule, another heading, a non-blank non-bullet line, or end of body. When an out-of-scope change is detected, the run records a hard failure for that module with a diagnostic of the form: diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index 7b1c69df9..80eb8417e 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -2380,7 +2380,8 @@ class IssueContract: :data:`DEFAULT_SYNC_COMPANION_ALLOWLIST` to produce the effective allowlist. source: Diagnostic label describing where the contract was detected - (currently ``"html-comment"`` or ``"fenced-block"``). + (currently ``"html-comment"``, ``"fenced-block"``, or + ``"bullet-list"``). """ allowed_paths: Tuple[str, ...] @@ -2419,6 +2420,26 @@ class IssueContract: re.DOTALL, ) +# Issue #1013 iter-18 B-1: third declaration format — bullet-list contract. +# Matches the bold inline label ``**Allowed write set:**`` that introduces the +# bullet list. 
Anchored to its own line; surrounding whitespace tolerated; +# optional whitespace inside the bold delimiters. +_BULLET_LIST_LABEL_RE = re.compile( + r"^\s*\*\*\s*allowed\s+write\s+set\s*:\s*\*\*\s*$", + re.IGNORECASE | re.MULTILINE, +) + +# Matches one ``-``/``*``/``+`` bullet line whose content is captured. +_BULLET_LINE_RE = re.compile( + r"^\s*[-*+]\s+(?P<entry>.+?)\s*$" +) + +# Matches another bold inline label (e.g. ``**Acceptance criteria:**``). +# Used as a stop terminator while scanning bullets. +_NEXT_LABEL_RE = re.compile( + r"^\s*\*\*[^*\n]+\*\*\s*$" +) + + def _is_valid_contract_path(raw: object) -> bool: """ @@ -2635,8 +2656,93 @@ def _parse_fenced_block_contract(text: str) -> Optional[IssueContract]: ) + +def _parse_bullet_list_contract(text: str) -> Optional[IssueContract]: + """Return a contract parsed from a heading + ``**Allowed write set:**`` + inline label + bullet list, else ``None``. This is the third supported + format (Issue #1013 iter-18 B-1), motivated by the real-world shape used + in #1005 where the contract is written as Markdown bullets under a bold + inline label rather than inside a fenced code block or HTML comment. + + Algorithm: + + 1. Find a heading line matching :data:`_FENCED_BLOCK_HEADER_RE` + (``## Split Contract`` / ``## Allowed Write Set`` / etc.). + 2. After the heading, scan forward for the inline label line + ``**Allowed write set:**`` (case-insensitive). The label is the + discriminator — bullets BEFORE the label (e.g. under a ``## Files`` + section earlier in the body) MUST NOT be picked up. + 3. Collect contiguous ``-`` / ``*`` / ``+`` bullets that follow the + label, skipping blank lines. Stop the list at the first of: + + - another ``**Label:**`` line (e.g. ``**Acceptance criteria:**``) + - a ``---`` horizontal rule + - another ``#``-prefixed heading + - a non-blank line that is not a bullet + - end of body. + + 4. 
Each captured bullet entry has its surrounding backticks stripped + (users write ``- `pdd/foo.py``` because they're showing code paths), + then is validated by :func:`_is_valid_contract_path`. Invalid entries + are dropped silently — same policy as the other branches. + 5. An empty bullet list (no valid paths captured) is still returned as a + degenerate reject-all contract per the iter-8 B5 semantics. + """ + header_match = _FENCED_BLOCK_HEADER_RE.search(text) + if not header_match: + return None + after_header = text[header_match.end():] + label_match = _BULLET_LIST_LABEL_RE.search(after_header) + if not label_match: + return None + + body = after_header[label_match.end():] + paths: List[str] = [] + seen: set = set() + for raw_line in body.splitlines(): + stripped = raw_line.strip() + if not stripped: + # Blank line: do not terminate; bullets can be separated by + # blank lines in some Markdown styles. Continue scanning. + continue + if stripped.startswith("---"): + break + if stripped.startswith("#"): + break + # Another bold inline label (e.g. ``**Acceptance criteria:**``) + # terminates the bullet list. + if _NEXT_LABEL_RE.match(raw_line): + break + bullet_match = _BULLET_LINE_RE.match(raw_line) + if not bullet_match: + # First non-blank, non-terminator, non-bullet line ends the list. + break + entry = bullet_match.group("entry").strip() + # Users wrap code-shaped paths in backticks; strip a single layer of + # surrounding backticks before validation. + entry = entry.strip("`").strip() + if not _is_valid_contract_path(entry): + continue + if entry not in seen: + paths.append(entry) + seen.add(entry) + + return IssueContract( + allowed_paths=tuple(paths), + companion_allowlist=(), + source="bullet-list", + ) + + def _parse_contract_from_text(text: Optional[str]) -> Optional[IssueContract]: - """Try HTML-comment first, then fenced-block. Returns None on any failure.""" + """Try HTML-comment first, then fenced-block, then bullet-list. 
Returns + ``None`` when no branch matches. + + Priority order (Issue #1013 iter-18 B-1): HTML comment is authoritative + (spec-preferred form, invisible in rendered Markdown), then fenced block + (the existing iter-3 form), then bullet list (the real-world shape used + in #1005 and similar issues). The first branch that returns a + non-``None`` :class:`IssueContract` wins. + """ if not text: return None try: @@ -2646,7 +2752,13 @@ def _parse_contract_from_text(text: Optional[str]) -> Optional[IssueContract]: if html_contract is not None: return html_contract try: - return _parse_fenced_block_contract(text) + fenced_contract = _parse_fenced_block_contract(text) + except Exception: # noqa: BLE001 — parser MUST NOT raise on any input + fenced_contract = None + if fenced_contract is not None: + return fenced_contract + try: + return _parse_bullet_list_contract(text) except Exception: # noqa: BLE001 — parser MUST NOT raise on any input return None @@ -2658,9 +2770,9 @@ def parse_issue_contract( """ Parse an issue split-contract from an issue body or its comments. - Two declaration formats are supported (Issue #1013): + Three declaration formats are supported (Issue #1013): - 1. HTML-comment block (authoritative):: + 1. HTML-comment block (authoritative, spec-preferred):: ` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns), or (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block with info string ```text``` (one repo-relative path per line; blank lines and `#`-prefixed comments ignored) or ```json``` (body is a JSON array of repo-relative path strings, e.g. 
`["pdd/foo.py", "tests/test_foo.py"]`; non-array payloads such as objects/numbers/strings and malformed JSON yield `None` so the caller falls back to permissive mode — Issue #1013 iter-12 B-1). Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or the value that was matched, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). Per Issue #1013 iter-10 M-1 (tightened in iter-14 M-1/M-2 paired with iter-10 M-1), the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`), OR contains any segment that is exactly `**` (the doublestar — `fnmatch` does not implement recursive glob semantics, and the anchored segment-aware matcher requires equal segment counts, so a `**`-bearing pattern would be ambiguous and is rejected at parse time). Patterns with at least one segment containing a non-wildcard character and no `**` segment (`.pdd/meta/*.json`, `architecture.json`, `tests/test_*.py`) remain valid. +21. 
**Issue Contract Parsing (Issue #1013 — sync scope guard)**: Provide `IssueContract` (frozen dataclass with `allowed_paths: Tuple[str, ...]`, `companion_allowlist: Tuple[str, ...]`, `source: str`) and `parse_issue_contract(issue_body, issue_comments=None) -> Optional[IssueContract]`. The parser scans the issue body first, then each comment (newest last is fine), looking for one of THREE declaration formats: (a) an HTML-comment block of the form `` whose JSON declares `allowed_paths` (required, list of repo-relative path strings) and optionally `companion_allowlist` (list of `pathlib`-style glob patterns); (b) a fenced code block introduced by a heading-like line matching `(?im)^\s*(?:#+\s*)?(?:allowed[\s_-]*write[\s_-]*set|split[\s_-]*contract)\b.*$` immediately followed by a fenced block with info string ```text``` (one repo-relative path per line; blank lines and `#`-prefixed comments ignored) or ```json``` (body is a JSON array of repo-relative path strings, e.g. `["pdd/foo.py", "tests/test_foo.py"]`; non-array payloads such as objects/numbers/strings and malformed JSON yield `None` so the caller falls back to permissive mode — Issue #1013 iter-12 B-1); or (c) **Bullet-list under an inline label (Issue #1013 iter-18 B-1)**: the same heading regex as (b), followed somewhere later by an inline bold label line `(?im)^\s*\*\*\s*allowed\s+write\s+set\s*:\s*\*\*\s*$`, followed by `-`/`*`/`+` bullets each carrying one repo-relative POSIX path (optional surrounding backticks are stripped). The label is the discriminator: bullets that appear BEFORE the label (e.g. under an earlier `## Files` section) MUST NOT be captured. The bullet list terminates at the first of (i) another `**Label:**` line such as `**Acceptance criteria:**`, (ii) a `---` horizontal rule, (iii) a `#`-prefixed heading, (iv) a non-blank, non-bullet line, or (v) end of body; blank lines do NOT terminate. Each bullet's surrounding backticks are stripped before validation. 
Branches MUST be tried in priority order (a) → (b) → (c); the first non-`None` match wins. Path strings are repo-relative POSIX paths; do NOT resolve to absolute filesystem paths here — that is the caller's job once it knows the repo root. The parser MUST be tolerant: malformed JSON, missing fields, or no matching marker returns `None` (the caller treats `None` as "no contract → scope guard runs in permissive fallback mode, no enforcement"). Set `source` to `"html-comment"`, `"fenced-block"`, or `"bullet-list"` matching the branch that produced the contract, for diagnostics. The parser MUST NOT raise on any input; wrap the JSON load in try/except and return `None` on failure. When both a body marker and a comment marker are present, prefer the body marker (issues are edited authoritatively in the body; comments are append-only and may contain stale snapshots from earlier workflow steps). Per Issue #1013 iter-10 M-1 (tightened in iter-14 M-1/M-2 paired with iter-10 M-1), the parser MUST drop syntactically invalid `companion_allowlist` entries silently — same policy as `allowed_paths`. An entry is invalid if it is empty after `.strip()`, absolute (starts with `/`), uses a Windows separator (`\`), contains a `..` traversal segment, is wildcard-only with no literal-character anchor (every segment consists exclusively of `*` and `?`, e.g. `*`, `**`, `**/*`, `?`), OR contains any segment that is exactly `**` (the doublestar — `fnmatch` does not implement recursive glob semantics, and the anchored segment-aware matcher requires equal segment counts, so a `**`-bearing pattern would be ambiguous and is rejected at parse time). Patterns with at least one segment containing a non-wildcard character and no `**` segment (`.pdd/meta/*.json`, `architecture.json`, `tests/test_*.py`) remain valid. 22. 
**Default Sync Companion Allowlist (Issue #1013)**: Expose a module-level constant `DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...]` listing glob patterns for files that `pdd sync` MAY touch as legitimate companion artifacts even when an issue contract restricts the primary write set. The default value MUST be `(".pdd/meta/*.json",)` — only fingerprint metadata under `.pdd/meta/` is auto-allowed. Architecture, examples, and unrelated prompt files are NOT in the default companion allowlist; the issue contract must opt them in explicitly via its own `companion_allowlist` field. This constant exists so `agentic_sync_runner` and `agentic_sync` import a single shared default rather than redefining it inline. Also expose `_matches_companion_pattern_anchored(rel_posix: str, pattern: str) -> bool` (Issue #1013 iter-14 M-1/M-2) — anchored, segment-aware glob matching used by both runner-side companion-allowlist checks. Unlike `pathlib.PurePosixPath.match` (which matches from the right and falsely treats `subdir/.pdd/meta/foo.json` as matching `.pdd/meta/*.json`), this helper requires the path and pattern to align segment-by-segment from the START of the path with equal segment count, and matches each segment via `fnmatch.fnmatchcase` for `*`/`?` semantics. Returns False on invalid patterns; callers should validate with `_is_valid_companion_pattern` first so an invalid pattern can never auto-allow a path. 23. **Scope Guard Helper (Issue #1013)**: `_revert_out_of_scope_changes(cwd, allowed_paths) -> List[Path]` is the shared revert helper used by `agentic_update`, `agentic_fix`, `agentic_crash`, `agentic_verify`, `agentic_e2e_fix_orchestrator`, and the sync scope guard. Signature MUST remain `(cwd: Path, allowed_paths: set[Path]) -> List[Path]` and return the list of resolved paths that were reverted. 
Behavior contract: - Skip silently when `cwd` is not a git repo, when `git status` is unavailable, or when `allowed_paths` is NON-EMPTY and none of its entries fall under `cwd` (the "scope guard meant for a different module" optimization). An EMPTY `allowed_paths` is a legal reject-all contract (Issue #1013 degenerate-empty case) — proceed with revert. diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index a8f39e350..f4bf40f76 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -5,7 +5,7 @@ "type": "module", "module": { "functions": [ - {"name": "AsyncSyncRunner", "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, *, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, module_targets: Optional[Dict[str, str]] = None, initial_cost: float = 0.0, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True, contract_source: Optional[str] = None)", "returns": "AsyncSyncRunner"}, + {"name": "AsyncSyncRunner", "signature": "(basenames: List[str], dep_graph: Dict[str, List[str]], sync_options: Dict[str, Any], github_info: Optional[Dict[str, Any]], quiet: bool = False, *, verbose: bool = False, issue_url: Optional[str] = None, module_cwds: Optional[Dict[str, Path]] = None, module_targets: Optional[Dict[str, str]] = None, initial_cost: float = 0.0, allowed_write_set: Optional[Iterable[str]] = None, companion_allowlist: Optional[Iterable[str]] = None, scope_guard_enabled: bool = True, contract_source: Optional[str] = None, project_root: Optional[Path] = None)", "returns": "AsyncSyncRunner"}, {"name": "AsyncSyncRunner.run", "signature": "() -> Tuple[bool, str, float]", "returns": "Tuple[bool, str, float]"}, {"name": 
"build_dep_graph_from_architecture", "signature": "(arch_path: Path, target_basenames: List[str]) -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"}, {"name": "build_dep_graph_from_architecture_data", "signature": "(architecture: Any, target_basenames: List[str], *, source_name: str = 'architecture.json') -> DepGraphFromArchitectureResult", "returns": "DepGraphFromArchitectureResult"} @@ -27,7 +27,7 @@ Write the `pdd/agentic_sync_runner.py` module. Parallel sync engine that runs `pdd sync` for multiple modules concurrently using a ThreadPoolExecutor, respecting dependency ordering. Posts live progress updates to a GitHub issue comment. Supports state persistence for resumability across runs, phase tracking, and graceful interrupt handling. % Requirements -1. Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, *, verbose, issue_url, module_cwds, module_targets, initial_cost=0.0, allowed_write_set=None, companion_allowlist=None, scope_guard_enabled=True, contract_source=None)` +1. Class: `AsyncSyncRunner(basenames, dep_graph, sync_options, github_info, quiet, *, verbose, issue_url, module_cwds, module_targets, initial_cost=0.0, allowed_write_set=None, companion_allowlist=None, scope_guard_enabled=True, contract_source=None, project_root=None)` - `basenames: List[str]` — modules to sync - `dep_graph: Dict[str, List[str]]` — basename -> [dependency basenames] - `sync_options: Dict` — budget, total_budget, target_coverage, skip_verify, skip_tests, agentic, no_steer, max_attempts, one_session, local, timeout_adder @@ -39,7 +39,8 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `allowed_write_set: Optional[Iterable[str]]` — repo-relative path strings from the issue split contract that this sync run is permitted to modify. `None` means "no contract was parseable from the issue → run in permissive mode (no enforcement)". 
An explicit empty iterable means "contract present but empty → reject every change as out-of-scope" (a degenerate but legal contract). Resolved against each module's `cwd`/repo root inside the runner. - `companion_allowlist: Optional[Iterable[str]]` — additional glob patterns (e.g. `".pdd/meta/*.json"`) describing companion artifacts that MAY be modified outside the primary `allowed_write_set`. Defaults to `DEFAULT_SYNC_COMPANION_ALLOWLIST` from `agentic_common` (currently `(".pdd/meta/*.json",)`) when `None`. Issue contracts MAY widen the companion allowlist by passing a superset. - `scope_guard_enabled: bool` — master switch (default `True`). When `False`, the runner records the parsed contract for diagnostics but performs no enforcement, no revert, and no hard-fail. Maps to the CLI `--no-scope-guard` opt-out. - - `contract_source: Optional[str]` — diagnostic label carrying the parse source of the issue contract (`"html-comment"` or `"fenced-block"`, matching `IssueContract.source`) so scope-guard diagnostics and downstream review-loop reporters can surface where the contract was detected. `None` when no contract was parsed (permissive fallback). + - `contract_source: Optional[str]` — diagnostic label carrying the parse source of the issue contract (`"html-comment"`, `"fenced-block"`, or `"bullet-list"`, matching `IssueContract.source`) so scope-guard diagnostics and downstream review-loop reporters can surface where the contract was detected. `None` when no contract was parsed (permissive fallback). + - `project_root: Optional[Path]` — when non-`None`, overrides the default `Path.cwd()` used to seed `self.project_root` and to take the baseline-changed-paths snapshot (Issue #1013 iter-18 M-1). Subclasses such as `DurableSyncRunner` pin this to the durable worktree's git root so the baseline reflects the worktree where syncs will actually run, not the caller's current working directory. Resolved with `Path(project_root).resolve()` when provided. 
- Tracks per-module state: pending -> running -> success | failed 2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs 3. Use `concurrent.futures.ThreadPoolExecutor` with `MAX_WORKERS = 4`; when `sync_options["total_budget"]` is set, run sequentially and pass only the remaining total budget to each child process so the total budget is not multiplied per module. @@ -82,8 +83,8 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - .pdd/meta/*.json ``` - **Hard-fail policy (Issue #1013 acceptance criteria 3 and 4)**: if any out-of-scope path was detected, the module MUST be recorded as failed with `error="Scope guard hard-fail: out-of-scope artifacts detected"` followed by the diagnostic body. This blocks the per-module success record, blocks dependent modules from scheduling, and ensures checkup/review-loop reports surface the failure rather than burying it under an apparently-successful sync. Hard-fail applies even when the underlying `pdd sync` subprocess succeeded — the contract violation is the failure mode the scope guard exists to catch. - - **Permissive fallback**: when `self.allowed_write_set is None` (no parseable contract on the issue), `_enforce_scope_guard` returns immediately without enforcement. Document this in a one-line dim INFO log on `run()` entry so operators understand why enforcement is off for that run. - - **Opt-out**: when `self.scope_guard_enabled is False`, log a one-line dim WARNING on `run()` entry ("Scope guard disabled via --no-scope-guard") and skip enforcement entirely. Even an explicit `allowed_write_set` is recorded only for diagnostics in this mode. + - **Permissive fallback**: when `self.allowed_write_set is None` (no parseable contract on the issue), `_enforce_scope_guard` returns immediately without enforcement. 
The user-facing INFO log for this state is owned by the sync layer (`run_agentic_sync` in `pdd/agentic_sync.py`) — the runner MUST NOT emit a duplicate line on `run()` entry (Issue #1013 iter-18 m-1: previously logged twice per invocation; one source of truth now). + - **Opt-out**: when `self.scope_guard_enabled is False`, skip enforcement entirely. Even an explicit `allowed_write_set` is recorded only for diagnostics in this mode. The user-facing WARNING for this state is likewise owned by the sync layer — the runner MUST NOT log it again on `run()` entry (Issue #1013 iter-18 m-1). - The scope-guard step MUST run with a `threading.Lock` held around git operations on a per-`module_cwd` basis to avoid `git status` / `git checkout` races when modules share a repo root (the common shared-worktree case for non-durable issue sync). % Dataclass: `ModuleState` diff --git a/tests/test_agentic_common.py b/tests/test_agentic_common.py index 1dcb1ba4f..da818e576 100644 --- a/tests/test_agentic_common.py +++ b/tests/test_agentic_common.py @@ -7481,3 +7481,160 @@ def test_is_valid_companion_pattern_rejects_doublestar(self): assert _is_valid_companion_pattern(".pdd/meta/*.json") is True assert _is_valid_companion_pattern("architecture.json") is True assert _is_valid_companion_pattern("tests/test_*.py") is True + + # ------------------------------------------------------------------ + # Issue #1013 iter-18 B-1: bullet-list contract format + # ------------------------------------------------------------------ + + def test_bullet_list_contract_from_issue_1005(self): + """Iter-18 B-1: the real-world #1005 issue body (verbatim) MUST + parse to the three paths under ``**Allowed write set:**`` and + NOT the six unrelated paths under the earlier ``## Files`` + section. + """ + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Problem\n" + "Single-file `pdd update ` clears stale `_run.json` reports but does not reliably save a fingerprint on success. 
This leaves `.pdd/meta/` looking partially synced.\n" + "\n" + "## Files\n" + "- `pdd/update_main.py`\n" + "- `pdd/prompts/update_main_python.prompt`\n" + "- `pdd/prompts/agentic_update_python.prompt`\n" + "- `.pdd/meta/update_main_python.json`\n" + "- `.pdd/meta/update_main_python_run.json`\n" + "- `tests/test_update_main.py` (regression test)\n" + "\n" + "## Desired Behavior\n" + "...\n" + "\n" + "---\n" + "## Split Contract\n" + "**Command sequence:** change → sync\n" + "**Allowed write set:**\n" + "- `pdd/update_main.py`\n" + "- `pdd/prompts/update_main_python.prompt`\n" + "- `tests/test_update_main.py`\n" + "**Acceptance criteria:**\n" + "- Successful `pdd update ` writes a current fingerprint to `.pdd/meta/.json`.\n" + "- Finalization failure produces an explicit user-visible warning (no silent stale metadata).\n" + "- Regression test in `tests/test_update_main.py` covers fingerprint save on success and the warning path on finalization failure.\n" + "**Independently mergeable:** True\n" + "**Scope rule:** Do not expand beyond this contract or implement sibling sub-issue work. If the contract is insufficient, report the gap instead.\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ( + "pdd/update_main.py", + "pdd/prompts/update_main_python.prompt", + "tests/test_update_main.py", + ) + assert c.source == "bullet-list" + + def test_bullet_list_stops_at_next_label(self): + """Iter-18 B-1: the ``**Acceptance criteria:**`` label terminates + the bullet list — bullets under it MUST NOT join the write set. 
+ """ + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + "- tests/test_foo.py\n" + "**Acceptance criteria:**\n" + "- a thing\n" + "- another thing\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "bullet-list" + + def test_bullet_list_stops_at_horizontal_rule(self): + """Iter-18 B-1: ``---`` terminates the bullet list.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + "- tests/test_foo.py\n" + "---\n" + "Other section\n" + "- not_in_contract.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + + def test_bullet_list_strips_backticks_on_paths(self): + """Iter-18 B-1: backtick-wrapped paths in bullets are accepted; the + backticks are stripped before validation.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- `pdd/foo.py`\n" + "- `tests/test_foo.py`\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "bullet-list" + + def test_bullet_list_under_allowed_write_set_heading(self): + """Iter-18 B-1: ``## Allowed Write Set`` is an accepted heading + (matches the same regex as ``## Split Contract``).""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Allowed Write Set\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + "- tests/test_foo.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("pdd/foo.py", "tests/test_foo.py") + assert c.source == "bullet-list" + + 
def test_bullet_list_with_no_valid_paths(self): + """Iter-18 B-1: bullets that are all invalid (parent-traversal, + absolute, etc.) reduce to a degenerate reject-all contract per + the iter-8 B5 semantics — the contract is still returned with + ``allowed_paths=()``, NOT ``None``.""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + "## Split Contract\n" + "**Allowed write set:**\n" + "- ../escape\n" + "- /absolute/path\n" + "- pdd\\windows_sep.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == () + assert c.source == "bullet-list" + + def test_html_comment_wins_over_bullet_list(self): + """Iter-18 B-1: when BOTH formats appear, the HTML-comment branch + wins (spec-preferred priority order is preserved).""" + from pdd.agentic_common import parse_issue_contract, IssueContract + + body = ( + '\n' + "\n" + "## Split Contract\n" + "**Allowed write set:**\n" + "- from_bullets.py\n" + ) + c = parse_issue_contract(body) + assert isinstance(c, IssueContract) + assert c.allowed_paths == ("from_html.py",) + assert c.source == "html-comment" diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index cb3372769..560f6871a 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2600,6 +2600,72 @@ def test_explicit_empty_contract_rejects_all_changes(self): assert runner.allowed_write_paths == set() assert runner.max_workers == 1 + def test_async_runner_project_root_kwarg_overrides_cwd( + self, tmp_path, monkeypatch + ): + """Iter-18 M-1: the new keyword-only ``project_root`` kwarg MUST + override the default ``Path.cwd()`` and MUST be applied BEFORE the + baseline-changed-paths snapshot is taken — otherwise subclasses + (e.g. ``DurableSyncRunner``) cannot pin the baseline to a known + repo root. 
+ """ + import subprocess + + # Initialise a real git repo at ``durable_root`` so the baseline + # snapshot's ``git status`` invocation actually runs. + durable_root = tmp_path / "durable_root" + durable_root.mkdir() + subprocess.run( + ["git", "init", "-b", "main", str(durable_root)], + check=True, + capture_output=True, + ) + subprocess.run( + ["git", "-C", str(durable_root), "config", "user.email", "t@t.invalid"], + check=True, + capture_output=True, + ) + subprocess.run( + ["git", "-C", str(durable_root), "config", "user.name", "T"], + check=True, + capture_output=True, + ) + (durable_root / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(durable_root), "add", "README.md"], + check=True, + capture_output=True, + ) + subprocess.run( + ["git", "-C", str(durable_root), "commit", "-m", "init"], + check=True, + capture_output=True, + ) + + # Dirty file inside durable_root (should appear in baseline). + (durable_root / "dirty.py").write_text("user wip") + + # The CALLER's cwd is a different directory entirely. A dirty file + # there MUST NOT leak into the runner's baseline. + caller_cwd = tmp_path / "caller_cwd" + caller_cwd.mkdir() + (caller_cwd / "out.py").write_text("dirty file in caller cwd") + monkeypatch.chdir(caller_cwd) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + project_root=durable_root, + ) + + assert runner.project_root == durable_root.resolve() + assert "dirty.py" in runner._baseline_changed_paths + assert "out.py" not in runner._baseline_changed_paths + class TestEnforceScopeGuard: """Issue #1013 (F9): direct behavioural coverage for ``_enforce_scope_guard`` @@ -2745,23 +2811,27 @@ def fake_revert(repo_root, allowed_files): # Sibling module's companion artifact must NOT be auto-allowed. 
assert (module_b / ".pdd" / "meta" / "x.json").resolve() not in files - def test_run_entry_logs_permissive_mode(self, capsys): - """Iter-3 F2: runner emits dim INFO on run() entry when no contract.""" + def test_run_entry_does_not_log_permissive_mode_again(self, capsys): + """Iter-18 m-1: ``run_agentic_sync`` already emits one user-facing + line per invocation covering all three states (disabled / contract + loaded / no contract). The runner used to emit a second duplicate + line on ``run()`` entry — removed in iter-18 so the operator sees + a single authoritative status line. + """ runner = self._make_runner( allowed_write_set=None, quiet=False, ) - # Make run() return immediately by emptying the basenames list AFTER - # construction; the dispatch loop short-circuits and we only want the - # entry log. runner.basenames = [] runner.run() out = capsys.readouterr().out - assert "permissive mode" in out + # The runner-side duplicate is gone. + assert "permissive mode" not in out - def test_run_entry_logs_opt_out_warning(self, capsys): - """Iter-3 F2: runner emits dim WARNING on run() entry when scope guard - is disabled via --no-scope-guard.""" + def test_run_entry_does_not_log_opt_out_warning_again(self, capsys): + """Iter-18 m-1: caller-side log owns the opt-out warning; the + runner-side duplicate was removed. 
+ """ runner = self._make_runner( allowed_write_set=["pdd/foo.py"], scope_guard_enabled=False, @@ -2770,7 +2840,7 @@ def test_run_entry_logs_opt_out_warning(self, capsys): runner.basenames = [] runner.run() out = capsys.readouterr().out - assert "--no-scope-guard" in out + assert "--no-scope-guard" not in out def test_pre_existing_untracked_files_are_preserved(self, tmp_path): """Iter-6 B1 (data-loss bug): a user's pre-existing untracked file diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 695e4f850..57cd11447 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -746,3 +746,62 @@ def test_total_budget_keeps_durable_runner_single_worker(tmp_path: Path): ) assert runner.max_workers == 1 + + +def test_durable_baseline_paths_use_git_root_not_caller_cwd( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Issue #1013 iter-18 M-1: ``DurableSyncRunner`` MUST take its baseline- + changed-paths snapshot from the durable repo root, NOT from the caller's + current working directory. Reviewer reproduced a regression where a + dirty file in the main checkout (``out.py``) was auto-allowed by the + scope guard inside the durable worktree because the baseline was taken + from ``Path.cwd()`` before ``DurableSyncRunner.__init__`` reassigned + ``project_root`` to ``self.git_root``. + """ + caller_cwd = tmp_path / "caller_cwd" + caller_cwd.mkdir() + # Dirty file under the caller's cwd; should NOT leak into baseline. + (caller_cwd / "out.py").write_text("dirty file in caller's cwd") + + durable_root = _init_repo_with_remote(tmp_path) + monkeypatch.chdir(caller_cwd) + + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + assert runner.project_root == durable_root.resolve() + # Baseline snapshot was taken against durable_root, where ``out.py`` + # does not exist as a dirty file. 
The caller's dirty ``out.py`` MUST + # NOT appear in the baseline. + assert "out.py" not in runner._baseline_changed_paths + + +def test_durable_baseline_includes_dirty_files_in_durable_root( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Iter-18 M-1 (positive case): a dirty file ACTUALLY under the durable + repo root MUST appear in ``_baseline_changed_paths`` so the scope guard + preserves pre-existing user work-in-progress under the durable worktree. + """ + caller_cwd = tmp_path / "caller_cwd" + caller_cwd.mkdir() + durable_root = _init_repo_with_remote(tmp_path) + + # Dirty (untracked) file inside the durable repo root. + (durable_root / "dirty.py").write_text("user work-in-progress") + + monkeypatch.chdir(caller_cwd) + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + assert runner.project_root == durable_root.resolve() + assert "dirty.py" in runner._baseline_changed_paths From 7e46cd38ce301cd3c49a5d98af86628857379663 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 13:01:55 -0700 Subject: [PATCH 31/42] fix(sync): iter-20 detect gitignored out-of-scope writes MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex flagged this gap three times (iter-11 B-2, iter-17 M-1, iter-19 M-1). Earlier passes deferred it as theoretical, but the project itself gitignores .pdd/ (main commit a7ce5f0ee chore: ignore untracked pdd metadata artifacts), so the trigger is no longer hypothetical: - User has `build/` (or `.pdd/`) in .gitignore. - Sync writes a file there. - The post-revert re-scan uses `git status --untracked-files=all` which OMITS gitignored files. - Scope guard returns None and the contract violation lands in the PR. Add `_git_ignored_paths()` using `git ls-files --others --ignored --exclude-standard` (enumerates every ignored file, not just dirs). 
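The gap described above can be reproduced with a minimal sketch, assuming `git` is on PATH. The names `list_status_paths`, `list_ignored_files`, and `demo` are illustrative only and are not the helpers added by this patch. The sketch creates a throwaway repo that ignores `build/`, writes a file there, and compares what the two scans report.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Assumption: the demo only makes sense when a git binary is available.
GIT_AVAILABLE = shutil.which("git") is not None


def list_status_paths(repo_root: Path) -> set[str]:
    """Paths reported by `git status --porcelain --untracked-files=all`."""
    result = subprocess.run(
        ["git", "-C", str(repo_root), "status",
         "--porcelain", "--untracked-files=all"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return set()
    # Porcelain v1 lines are "XY <path>": two status columns, a space, the path.
    return {line[3:] for line in result.stdout.splitlines() if len(line) > 3}


def list_ignored_files(repo_root: Path) -> set[str]:
    """Individual ignored files, one repo-relative path per line."""
    result = subprocess.run(
        ["git", "-C", str(repo_root), "ls-files",
         "--others", "--ignored", "--exclude-standard"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return set()
    return {line.strip() for line in result.stdout.splitlines() if line.strip()}


def demo() -> tuple[set[str], set[str]]:
    """Build a temporary repo that ignores build/ and scan it both ways."""
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        subprocess.run(["git", "init", "-q", str(repo)],
                       check=True, capture_output=True)
        (repo / ".gitignore").write_text("build/\n")
        (repo / "build").mkdir()
        (repo / "build" / "junk.txt").write_text("written by sync\n")
        return list_status_paths(repo), list_ignored_files(repo)


if __name__ == "__main__" and GIT_AVAILABLE:
    status_paths, ignored_paths = demo()
    print("git status sees:", sorted(status_paths))
    print("ls-files --ignored sees:", sorted(ignored_paths))
```

In this sketch the ignored file is absent from the `git status` result and present in the `git ls-files` result, which is why a second scan is needed before the guard can treat the working tree as clean.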
Snapshot `_baseline_ignored_paths` at runner init, gated on `scope_guard_enabled AND allowed_write_paths is not None` so non-contract runs skip the cost (potentially slow on repos with large ignored trees like node_modules/, build/). In `_remaining_out_of_scope_paths`, run a second scan over ignored files after the existing `git status` scan. Skip paths in the baseline (pre-existing — not sync's fault) and paths in `allowed_files` (companion-allowlisted, e.g. `.pdd/meta/*.json` for users who gitignore `.pdd/`). The remainder is treated as out-of-scope and surfaces under the existing `Unrecovered (revert failed, manual cleanup required):` section. The `` sentinel still applies — failure of the ignored scan also forces a hard-fail rather than silent pass. DurableSyncRunner left untouched: it does `git add -A` before scope checking, so ignored files surface as staged regardless. Tests: gitignored-out-of-scope detection, baseline-preserved-on-existing, gitignored-companion-allowlisted-still-allowed (verifies the .pdd/ workflow the project actually uses), ignored-scan-failure-sentinel. 162 passed in test_agentic_sync_runner.py. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync_runner.py | 109 +++++++- pdd/prompts/agentic_sync_runner_python.prompt | 3 +- tests/test_agentic_sync_runner.py | 234 ++++++++++++++++++ 3 files changed, 339 insertions(+), 7 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index facf3d88b..295f24ba7 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -187,6 +187,42 @@ def _git_changed_paths(project_root: Path) -> set[str]: return {p for p in paths if p} +def _git_ignored_paths(project_root: Path) -> set[str]: + """Return repo-relative POSIX paths of git-ignored files (Issue #1013 iter-20). + + Uses ``git ls-files --others --ignored --exclude-standard`` to enumerate + every individual ignored file (no directory entries, no status prefix). 
+ May be slow on repos with large ignored trees (``node_modules/``, + ``build/``, etc.) — callers MUST gate the call on + ``scope_guard_enabled AND allowed_write_paths is not None`` so non-contract + runs do not pay the cost. + + Returns an empty set on any subprocess failure or non-zero return — the + baseline snapshot is best-effort. The post-revert re-scan in + ``_remaining_out_of_scope_paths`` handles ignored-scan failures with the + explicit ```` sentinel instead. + """ + try: + result = subprocess.run( + ["git", "ls-files", "--others", "--ignored", "--exclude-standard"], + cwd=project_root, + capture_output=True, + text=True, + check=False, + ) + except (OSError, subprocess.SubprocessError): + return set() + if result.returncode != 0: + return set() + + paths: set[str] = set() + for line in result.stdout.splitlines(): + rel = _normalize_repo_path(line.strip().strip('"')) + if rel: + paths.add(rel) + return paths + + # --------------------------------------------------------------------------- # Helper functions # --------------------------------------------------------------------------- @@ -958,6 +994,24 @@ def __init__( else set() ) + # Iter-20 M-1 (gitignored fail-open): also snapshot pre-existing + # gitignored files (e.g. user-side ``build/cache.bin`` under a + # repo-wide ``.gitignore: build/``). ``git status`` does not show + # ignored files by default, so without this baseline a sync that + # writes to a gitignored path outside the allowed write set would be + # invisible to the post-revert re-scan and the module would be + # marked successful with the contract violated on disk. + # + # Gated on ``scope_guard_enabled AND allowed_write_paths is not None`` + # (same gate as ``_baseline_changed_paths``) so non-contract runs do + # not pay the ``git ls-files`` cost on repos with large ignored + # trees (``node_modules/``, ``build/``, etc.). 
+ self._baseline_ignored_paths: Set[str] = ( + _git_ignored_paths(self.project_root) + if self.scope_guard_enabled and self.allowed_write_paths is not None + else set() + ) + self.total_budget = self.sync_options.get("total_budget") self.max_workers = 1 if self.total_budget is not None else MAX_WORKERS # When a contract narrows writes AND scope-guard enforcement is @@ -1925,14 +1979,24 @@ def _remaining_out_of_scope_paths( treat the empty list as "nothing was out of scope" and let the module succeed with the contract still violated on disk. + Iter-20 M-1 (gitignored fail-open): the standard ``git status`` scan + does NOT report gitignored files. A sync that writes outside the + contract into a gitignored path (e.g. ``build/junk.txt`` under a + repo-wide ``.gitignore: build/``) would be invisible. A SECOND scan + via ``git ls-files --others --ignored --exclude-standard`` enumerates + every individual ignored file; results not in ``allowed_files`` and + not in ``self._baseline_ignored_paths`` (pre-existing ignored files + the user owned BEFORE sync ran) are added to the ``remaining`` set. + Returns: Sorted list of POSIX repo-relative paths still out of scope, OR - the sentinel ``[""]`` when ``git status`` - itself cannot be executed (timeout / missing git / non-zero - return). The sentinel is consistent with the warning-log + - empty-list style used elsewhere in the scope guard, but still - forces ``_enforce_scope_guard`` to hard-fail rather than treat - the unobservable working tree as clean. + the sentinel ``[""]`` when EITHER the + ``git status`` scan OR the ``git ls-files --ignored`` scan + cannot be executed (timeout / missing git / non-zero return). + The sentinel is consistent with the warning-log + empty-list + style used elsewhere in the scope guard, but still forces + ``_enforce_scope_guard`` to hard-fail rather than treat the + unobservable working tree as clean. 
""" try: result = subprocess.run( @@ -1967,6 +2031,39 @@ def _remaining_out_of_scope_paths( if absolute in allowed_files: continue remaining.add(rel) + + # Iter-20 M-1: scan for gitignored files that the standard + # ``git status`` pass above cannot see. ``git ls-files + # --others --ignored --exclude-standard`` lists every individual + # ignored file (no directory rollup, no status prefix). + try: + ignored_result = subprocess.run( + ["git", "-C", str(repo_root), "ls-files", + "--others", "--ignored", "--exclude-standard"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if ignored_result.returncode != 0: + return [""] + + baseline_ignored = getattr(self, "_baseline_ignored_paths", set()) + for line in ignored_result.stdout.splitlines(): + rel = _normalize_repo_path(line.strip().strip('"')) + if not rel: + continue + # Pre-existing ignored files (snapshot at runner init) are NOT + # the sync run's responsibility. + if rel in baseline_ignored: + continue + absolute = (repo_root / rel).resolve() + # Companion-allowlisted files (e.g. ``.pdd/meta/*.json`` when + # the user has ``.pdd/`` in ``.gitignore``) are in + # ``allowed_files`` via the rglob pass in ``_enforce_scope_guard``. + if absolute in allowed_files: + continue + remaining.add(rel) + return sorted(remaining) def _enforce_scope_guard( diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index f4bf40f76..d9182eb36 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -41,6 +41,7 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `scope_guard_enabled: bool` — master switch (default `True`). When `False`, the runner records the parsed contract for diagnostics but performs no enforcement, no revert, and no hard-fail. Maps to the CLI `--no-scope-guard` opt-out. 
- `contract_source: Optional[str]` — diagnostic label carrying the parse source of the issue contract (`"html-comment"`, `"fenced-block"`, or `"bullet-list"`, matching `IssueContract.source`) so scope-guard diagnostics and downstream review-loop reporters can surface where the contract was detected. `None` when no contract was parsed (permissive fallback). - `project_root: Optional[Path]` — when non-`None`, overrides the default `Path.cwd()` used to seed `self.project_root` and to take the baseline-changed-paths snapshot (Issue #1013 iter-18 M-1). Subclasses such as `DurableSyncRunner` pin this to the durable worktree's git root so the baseline reflects the worktree where syncs will actually run, not the caller's current working directory. Resolved with `Path(project_root).resolve()` when provided. + - `_baseline_ignored_paths: Set[str]` (Issue #1013 iter-20 M-1) — sibling snapshot to `_baseline_changed_paths`, populated from `git ls-files --others --ignored --exclude-standard` at init via the helper `_git_ignored_paths(project_root)`. Records repo-relative POSIX paths of pre-existing gitignored files (e.g. user-side `build/cache.bin` under a repo-wide `.gitignore: build/`) so the post-revert re-scan does not flag them as the sync run's out-of-scope writes. Gated identically to `_baseline_changed_paths` (`scope_guard_enabled AND allowed_write_paths is not None`) so non-contract runs do not pay the `git ls-files` cost on repos with large ignored trees (`node_modules/`, `build/`, etc.). - Tracks per-module state: pending -> running -> success | failed 2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs 3. Use `concurrent.futures.ThreadPoolExecutor` with `MAX_WORKERS = 4`; when `sync_options["total_budget"]` is set, run sequentially and pass only the remaining total budget to each child process so the total budget is not multiplied per module. 
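A minimal sketch of the gating and baseline-skip behaviour described above for `_baseline_ignored_paths`, under stated assumptions. `snapshot_ignored_baseline` and `ignored_paths_written_by_sync` are hypothetical names, not functions from `pdd`, and the `list_ignored` callable stands in for the `git ls-files --others --ignored --exclude-standard` call so the example runs without git.

```python
from typing import Callable, Iterable, List, Optional, Set


def snapshot_ignored_baseline(
    scope_guard_enabled: bool,
    allowed_write_paths: Optional[Set[str]],
    list_ignored: Callable[[], Iterable[str]],
) -> Set[str]:
    """Snapshot pre-existing ignored paths, but only when a contract is enforced."""
    # Same gate as the changed-paths baseline: non-contract runs never call
    # the (possibly expensive) listing at all.
    if not scope_guard_enabled or allowed_write_paths is None:
        return set()
    return {path.strip() for path in list_ignored() if path.strip()}


def ignored_paths_written_by_sync(
    current_ignored: Iterable[str],
    baseline_ignored: Set[str],
    allowed: Set[str],
) -> List[str]:
    """Return ignored paths that are neither pre-existing nor allowed."""
    return sorted(
        path
        for path in current_ignored
        if path not in baseline_ignored and path not in allowed
    )
```

In this sketch a pre-existing `build/cache.bin` is skipped because it is in the baseline, a gitignored companion file is skipped because it is in the allow set, and anything else that shows up in the ignored listing after the sync is reported.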
@@ -146,7 +147,7 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `_parse_cost_from_csv(csv_path: str) -> float`: sum cost column from PDD_OUTPUT_COST_PATH CSV - `_format_duration(start, end) -> str`: format seconds as "Xs" or "Xm Ys" - `_enforce_scope_guard(self, basename: str, module_cwd: Path) -> Optional[str]`: Issue #1013 scope guard. Returns `None` when the module is in scope; returns a multi-line diagnostic string (see Req 22) when out-of-scope artifacts were detected. Callers (the per-future completion handler) treat a non-None return as a module failure and replace any prior success record with it. No-ops when `self.scope_guard_enabled is False` or `self.allowed_write_set is None`. Reuses `pdd.agentic_common._revert_out_of_scope_changes` and `pdd.agentic_common_worktree.revert_out_of_scope_changes_with_dirs` rather than reimplementing git scanning. **Fail-closed safety (Issue #1013 iter-9, M-1)**: after invoking both revert helpers, the runner MUST re-scan the worktree via `_remaining_out_of_scope_paths(repo_root, allowed_files)` and hard-fail when ANY out-of-scope artifacts remain, even if the revert helpers returned empty lists. The helpers fail-open on git timeout / permission error / restore failure (they log a warning and return `[]`); without the re-scan the orchestrator would treat that as "nothing was out of scope" and let the module succeed with the contract still violated on disk. Unrecovered paths surface in the diagnostic under a distinct `Unrecovered (revert failed, manual cleanup required):` section. -- `_remaining_out_of_scope_paths(self, repo_root: Path, allowed_files: Set[Path]) -> List[str]`: Issue #1013 iter-9 (M-1) re-scan helper. 
Runs `git status --porcelain --untracked-files=all` in *repo_root* with a 30s timeout, parses each line (handling the `R old -> new` rename format and the `_normalize_repo_path` cleanup), resolves each path against *allowed_files*, and returns a sorted list of POSIX repo-relative paths still NOT in the allow set. On git failure (timeout, missing binary, non-zero return) returns the single-element sentinel `[""]` so `_enforce_scope_guard` hard-fails rather than silently treating an unobservable worktree as clean. Consistent with the warning-log + empty-list pattern used by `_revert_out_of_scope_changes` and `revert_out_of_scope_changes_with_dirs` — but the sentinel value, not an empty list, is what forces the hard-fail. +- `_remaining_out_of_scope_paths(self, repo_root: Path, allowed_files: Set[Path]) -> List[str]`: Issue #1013 iter-9 (M-1) re-scan helper. Runs `git status --porcelain --untracked-files=all` in *repo_root* with a 30s timeout, parses each line (handling the `R old -> new` rename format and the `_normalize_repo_path` cleanup), resolves each path against *allowed_files*, and returns a sorted list of POSIX repo-relative paths still NOT in the allow set. **Iter-20 M-1 (gitignored fail-open):** the standard `git status` scan does NOT report gitignored files, so a sync that writes outside the contract into a gitignored path (e.g. `build/junk.txt` under a repo-wide `.gitignore: build/`) would otherwise be invisible to the re-scan. A SECOND scan via `git ls-files --others --ignored --exclude-standard` (also with a 30s timeout) enumerates every individual ignored file; entries are skipped when present in `self._baseline_ignored_paths` (pre-existing user-owned ignored files snapshotted at runner init) or in `allowed_files` (companion artifacts like `.pdd/meta/*.json` when the user has `.pdd/` in `.gitignore`); the rest are merged into the returned set. 
On EITHER git scan failing (timeout, missing binary, non-zero return) the function returns the single-element sentinel `[""]` so `_enforce_scope_guard` hard-fails rather than silently treating an unobservable worktree as clean. Consistent with the warning-log + empty-list pattern used by `_revert_out_of_scope_changes` and `revert_out_of_scope_changes_with_dirs` — but the sentinel value, not an empty list, is what forces the hard-fail. - `_parse_conformance_failure(stdout: str, stderr: str) -> Optional[Tuple[str, Tuple[str, ...]]]`: scan combined stdout+stderr for the line prefix `Architecture conformance error for ` and, when matched, return `(repair_directive, missing_symbols)` where `missing_symbols` is a sorted tuple of the symbols listed after any of the following inline shapes (route each into its own directive bucket — they MUST NOT be merged): - (a) `declared symbols missing from generated code:` — default `ArchitectureConformanceError` shape (architecture.json symbol-existence check). - (b) `Python code uses camelCase names (...)` parenthesised list — camelCase guard. diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 560f6871a..5e25ff2cd 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -3232,6 +3232,240 @@ def test_git_status_failed_sentinel_surfaces_in_diagnostic( # indicator is present. 
assert "git-status-failed" in diagnostic + # --------------------------------------------------------------------- + # Iter-20 M-1: gitignored out-of-scope detection + # --------------------------------------------------------------------- + + @staticmethod + def _init_git_repo(repo_root: Path) -> None: + """Initialise a minimal committed git repo for ignored-scan tests.""" + subprocess.run( + ["git", "init", "-b", "main", str(repo_root)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(repo_root), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(repo_root), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (repo_root / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(repo_root), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(repo_root), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + def test_gitignored_out_of_scope_file_is_detected( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1: a sync that writes to a gitignored path outside the + contract (e.g. ``build/junk.txt`` under ``.gitignore: build/``) is + invisible to ``git status --untracked-files=all`` — but the second + ``git ls-files --ignored`` scan MUST catch it and surface it via the + ```` set so ``_enforce_scope_guard`` hard-fails.""" + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + (tmp_path / ".gitignore").write_text("build/\n") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "ignore build"], + check=True, capture_output=True, + ) + + # Construct the runner FIRST (empty ignored baseline), then create + # the gitignored stray AFTER — simulates sync writing it. 
+ monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + assert runner._baseline_ignored_paths == set(), ( + "no ignored files exist yet — baseline must be empty" + ) + + # Sync writes a gitignored file outside the contract. + (tmp_path / "build").mkdir() + (tmp_path / "build" / "junk.txt").write_text("bad") + + # Revert helpers can't see gitignored files — simulate them + # returning empty as Codex's repro describes. + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "gitignored out-of-scope file must hard-fail the module" + ) + assert "build/junk.txt" in diagnostic + assert "Unrecovered" in diagnostic + + def test_gitignored_baseline_file_is_not_falsely_flagged( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1: pre-existing gitignored files (snapshotted at runner + init) MUST NOT be flagged as the sync run's out-of-scope writes.""" + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + (tmp_path / ".gitignore").write_text("cache.bin\n") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "ignore cache"], + check=True, capture_output=True, + ) + + # Create the gitignored file BEFORE constructing the runner so it + # lands in the baseline ignored set. 
+ (tmp_path / "cache.bin").write_text("user cache") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "cache.bin" in runner._baseline_ignored_paths, ( + "pre-existing ignored file must be captured in baseline" + ) + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is None, ( + "pre-existing ignored file must not be flagged: " + f"got {diagnostic!r}" + ) + + def test_gitignored_companion_artifact_is_allowed( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1 integration: when the user has ``.pdd/`` in + ``.gitignore`` (this very project does — see commit a7ce5f0ee), the + fingerprint metadata file ``.pdd/meta/mod_python.json`` is gitignored + — but it's also in the default companion allowlist, so the existing + ``rglob`` + companion-match path in ``_enforce_scope_guard`` adds it + to ``allowed_files``. The new ignored-files scan MUST skip it. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + (tmp_path / ".gitignore").write_text(".pdd/\n") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "ignore .pdd"], + check=True, capture_output=True, + ) + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # Sync writes a companion-allowed fingerprint file — also gitignored. 
+ (tmp_path / ".pdd" / "meta").mkdir(parents=True) + (tmp_path / ".pdd" / "meta" / "mod_python.json").write_text("{}") + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", + lambda _root, _allowed: []) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is None, ( + "gitignored companion artifact must remain auto-allowed: " + f"got {diagnostic!r}" + ) + + def test_ignored_scan_failure_returns_sentinel( + self, tmp_path, monkeypatch + ): + """Iter-20 M-1 fail-closed: when the ``git ls-files --ignored`` scan + itself fails (here: forced FileNotFoundError), the remaining-paths + helper MUST return the existing ```` sentinel so + ``_enforce_scope_guard`` hard-fails rather than treating an + unobservable worktree as clean. + + The status scan succeeds (returning an empty set); only the ignored + scan fails. The sentinel symmetry across both scans is what the + iter-19 review flagged as required. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + real_run = subprocess.run + + def fake_run(cmd, *args, **kwargs): + # The status scan keeps working; the ignored scan blows up. 
+ if ( + isinstance(cmd, list) + and "ls-files" in cmd + and "--ignored" in cmd + ): + raise FileNotFoundError("git ls-files unavailable") + return real_run(cmd, *args, **kwargs) + + monkeypatch.setattr( + mod.subprocess, "run", fake_run + ) + + result = runner._remaining_out_of_scope_paths( + tmp_path.resolve(), allowed_files=set() + ) + assert result == [""], ( + "ignored-scan failure must surface the sentinel" + ) + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking From ff52568e09c7ad05c2907fea54eaa352d847157d Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 13:11:07 -0700 Subject: [PATCH 32/42] fix(sync): iter-22 M-1 clear durable baseline so main-checkout dirty WIP doesn't leak into worktrees MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-21 caught a baseline leakage that iter-18's project_root threading did NOT cover. iter-18 pinned the AsyncSyncRunner baseline snapshot to self.git_root, but in production self.git_root IS the user's main checkout — where dirty WIP lives. Per-module durable sync then runs in .pdd/worktrees/sync-issue--/ (a separate directory). Inheriting the main-checkout baseline meant dirty out.py in the main checkout was treated as allowed inside fresh worktrees, silently bypassing the split contract. External reviewer's repro confirmed: dirty out.py in main → sync writes out.py to worktree → scope guard returned None. Fix: after super().__init__(), clear both _baseline_changed_paths (iter-6 B1) and _baseline_ignored_paths (iter-20 M-1) in DurableSyncRunner. Per-module worktrees are freshly created via git worktree add; they have no pre-existing user WIP by construction, so the "preserve untracked WIP" carve-out has no analog in durable mode. Anything that surfaces in a worktree at scope-guard time was put there by this sync run. 
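The shape of the fix can be sketched as below. This is an illustrative toy, not the code in `pdd/durable_sync_runner.py`: `InPlaceRunner`, `WorktreeRunner`, and `allowed_paths` are hypothetical stand-ins, and the baselines are modelled as plain sets of repo-relative paths.

```python
from typing import Iterable, Set


class InPlaceRunner:
    """Stand-in for the async runner: snapshots dirty paths at init."""

    def __init__(self, dirty_paths: Iterable[str]) -> None:
        self._baseline_changed_paths: Set[str] = set(dirty_paths)
        self._baseline_ignored_paths: Set[str] = set()

    def allowed_paths(self, contract: Set[str]) -> Set[str]:
        # Pre-existing dirty paths are preserved alongside the contract.
        return contract | self._baseline_changed_paths


class WorktreeRunner(InPlaceRunner):
    """Stand-in for the durable runner: syncs run in fresh worktrees."""

    def __init__(self, dirty_paths: Iterable[str]) -> None:
        super().__init__(dirty_paths)
        # Fresh worktrees hold no user WIP, so the snapshot inherited from
        # the main checkout is dropped rather than reused.
        self._baseline_changed_paths = set()
        self._baseline_ignored_paths = set()
```

With a dirty `out.py` in the main checkout, the in-place stand-in still allows `out.py`, while the worktree stand-in allows only the contract paths.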
AsyncSyncRunner unchanged — async still runs in-place in the user's checkout, and the baseline preserves pre-existing WIP correctly there. Tests: empty-baseline invariant (even when git_root is dirty), and the reviewer's repro (dirty out.py in caller cwd + out.py in separate worktree → diagnostic NOT None). iter-18 test test_durable_baseline_paths_use_git_root_not_caller_cwd inverted to the new invariant; iter-18 test for dirty-files-in-durable-root replaced by the iter-22 empty-baseline invariant. Co-Authored-By: Claude Opus 4.7 --- pdd/durable_sync_runner.py | 20 +++ pdd/prompts/durable_sync_runner_python.prompt | 2 +- tests/test_durable_sync_runner.py | 125 +++++++++++++++--- 3 files changed, 126 insertions(+), 21 deletions(-) diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 1e4a44e63..fec60dc46 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -94,6 +94,26 @@ def __init__( contract_source=contract_source, project_root=self.git_root, ) + + # Issue #1013 iter-22 M-1 (durable baseline-leakage bug): per-module + # durable worktrees are freshly created via ``git worktree add`` and + # have no pre-existing user WIP by construction. Whatever surfaces in + # a worktree at scope-guard time was put there BY this sync run, so + # the iter-6 B1 "preserve user's pre-existing untracked files" + # carve-out — which exists for the in-place async case where sync + # runs inside the user's main checkout — has no analog in durable + # mode. iter-18 pinned the baseline snapshot to ``self.git_root`` + # (the main checkout), but in production ``git_root`` IS the user's + # main checkout where dirty WIP lives; the actual per-module sync + # then runs inside ``.pdd/worktrees/sync-issue--/``, a + # DIFFERENT directory. 
Inheriting the main checkout's baseline LEAKS + # the caller's dirty paths into the worktree's allow set (see + # ``_enforce_scope_guard``: each baseline ``rel_posix`` is resolved + # against the scope-guard-time ``repo_root``, which is the per-module + # worktree root), bypassing the split contract. Clear the baseline + # so each fresh module worktree starts clean. + self._baseline_changed_paths = set() + self._baseline_ignored_paths = set() if self.total_budget is not None: self.max_workers = 1 elif durable_max_parallel is not None: diff --git a/pdd/prompts/durable_sync_runner_python.prompt b/pdd/prompts/durable_sync_runner_python.prompt index 1ce2ea224..492cdf489 100644 --- a/pdd/prompts/durable_sync_runner_python.prompt +++ b/pdd/prompts/durable_sync_runner_python.prompt @@ -11,7 +11,7 @@ Durable execution engine for `pdd sync --durable`. It must pr 1. Class: `DurableSyncRunner`, compatible with the runner contract used by `run_agentic_sync()`: `run() -> Tuple[bool, str, float]`. 2. Prepare a durable branch named `sync/issue-` by default, unless a safe `durable_branch` override is supplied. 3. Reject unsafe branches: `main`, `master`, the repository default branch, missing/broken `origin`, non-git directories, and durable branches checked out in another worktree. -4. Use `.pdd/worktrees/durable-issue-` as the main durable worktree and `.pdd/worktrees/sync-issue--` for per-module worktrees. +4. Use `.pdd/worktrees/durable-issue-` as the main durable worktree and `.pdd/worktrees/sync-issue--` for per-module worktrees. Issue #1013 iter-22 M-1: after invoking `AsyncSyncRunner.__init__`, explicitly clear the inherited `_baseline_changed_paths` and `_baseline_ignored_paths` sets to the empty set. 
Per-module durable worktrees are freshly created via `git worktree add` and contain no pre-existing user WIP by construction, so the iter-6 B1 "preserve pre-existing untracked files" carve-out (which protects the user's main checkout in the in-place async case) has no analog in durable mode. Inheriting a non-empty baseline from `git_root` leaks the caller's dirty paths into each worktree's scope-guard allow set (`_enforce_scope_guard` resolves baseline `rel_posix` entries against the per-module worktree root) and silently bypasses the split contract. 5. Resume by scanning pushed checkpoint commits on the durable branch for trailers formatted as `PDD-Sync-Checkpoint-V1: issue= module=`. Ignore trailers for other issues. 6. Do not rely on `.pdd/agentic_sync_state.json` for durable resume. Corrupt or missing local state must not prevent resuming from remote checkpoint trailers. 7. For each successful module, create a checkpoint commit containing only safe, relevant project files and allowed `.pdd/meta/_*.json` metadata. If the parent issue supplied an allowed write set, reject any staged path outside that exact repo-relative set before creating the checkpoint. Companion-allowlist matching for staged paths MUST use `_matches_companion_pattern_anchored` from `pdd.agentic_common` (Issue #1013 iter-14 M-2) — anchored, segment-aware glob matching — NOT `pathlib.PurePosixPath.match`, whose suffix-based semantics would let a nested path like `subdir/.pdd/meta/foo.json` falsely match the default `.pdd/meta/*.json` companion pattern and bypass the split contract. Issue #1013 iter-16 M-1: companion patterns are matched MODULE-RELATIVE, so when `module_cwd != module_worktree` (multi-module sync where the module lives in a subdirectory like `worktree/pkg`), strip the module_cwd prefix from each staged path before invoking the anchored matcher — otherwise legitimate metadata such as `pkg/.pdd/meta/foo.json` would be rejected. 
Staged paths that fall outside the module's cwd (sibling-module artifacts) MUST NOT auto-allow under any companion pattern (F1 iter-3 sibling rule). Push the checkpoint before printing `PDD_CHECKPOINT:`. diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 57cd11447..dbb6bf582 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -751,17 +751,17 @@ def test_total_budget_keeps_durable_runner_single_worker(tmp_path: Path): def test_durable_baseline_paths_use_git_root_not_caller_cwd( tmp_path: Path, monkeypatch: pytest.MonkeyPatch ): - """Issue #1013 iter-18 M-1: ``DurableSyncRunner`` MUST take its baseline- - changed-paths snapshot from the durable repo root, NOT from the caller's - current working directory. Reviewer reproduced a regression where a - dirty file in the main checkout (``out.py``) was auto-allowed by the - scope guard inside the durable worktree because the baseline was taken - from ``Path.cwd()`` before ``DurableSyncRunner.__init__`` reassigned - ``project_root`` to ``self.git_root``. + """Issue #1013 iter-18 M-1 + iter-22 M-1: ``DurableSyncRunner`` MUST NOT + inherit baseline-changed-paths from the caller's cwd. Iter-18 first + pinned the snapshot to the durable ``git_root`` (so caller-cwd dirty + files would not leak); iter-22 then made the durable baseline EMPTY by + construction (per-module worktrees are freshly-created and have no + pre-existing user WIP), so this assertion is now vacuously true but is + kept as an explicit guard against regressions. """ caller_cwd = tmp_path / "caller_cwd" caller_cwd.mkdir() - # Dirty file under the caller's cwd; should NOT leak into baseline. + # Dirty file under the caller's cwd; must NOT leak into baseline. 
(caller_cwd / "out.py").write_text("dirty file in caller's cwd") durable_root = _init_repo_with_remote(tmp_path) @@ -775,27 +775,40 @@ def test_durable_baseline_paths_use_git_root_not_caller_cwd( ) assert runner.project_root == durable_root.resolve() - # Baseline snapshot was taken against durable_root, where ``out.py`` - # does not exist as a dirty file. The caller's dirty ``out.py`` MUST - # NOT appear in the baseline. + # Iter-22 M-1 invariant: durable baseline is always empty. assert "out.py" not in runner._baseline_changed_paths + assert runner._baseline_changed_paths == set() -def test_durable_baseline_includes_dirty_files_in_durable_root( +def test_durable_baseline_is_empty_even_when_git_root_has_dirty_files( tmp_path: Path, monkeypatch: pytest.MonkeyPatch ): - """Iter-18 M-1 (positive case): a dirty file ACTUALLY under the durable - repo root MUST appear in ``_baseline_changed_paths`` so the scope guard - preserves pre-existing user work-in-progress under the durable worktree. + """Issue #1013 iter-22 M-1 (durable baseline-leakage bug): in production + ``git_root`` IS the user's main checkout where dirty WIP lives, but the + per-module sync runs in a SEPARATE ``.pdd/worktrees/sync-issue-N-mod/`` + worktree. If the durable runner inherits the main-checkout baseline, + ``_enforce_scope_guard`` resolves each baseline ``rel_posix`` against the + per-module worktree root and silently auto-allows any same-named file + written there by sync, bypassing the split contract. + + Iter-18 fixed the iter-17 regression where the snapshot was taken from + ``Path.cwd()`` BEFORE the durable runner reassigned ``project_root``; + iter-22 closes the residual leak by making the durable baseline empty + by construction. Per-module worktrees are freshly created via + ``git worktree add`` and have no pre-existing user WIP — so the + iter-6 B1 "preserve pre-existing untracked files" carve-out (which + exists for the in-place async case) has no analog here. 
""" - caller_cwd = tmp_path / "caller_cwd" - caller_cwd.mkdir() durable_root = _init_repo_with_remote(tmp_path) - # Dirty (untracked) file inside the durable repo root. + # Dirty (untracked) file inside the durable repo root. In production this + # stands in for the user's WIP in their main checkout. (durable_root / "dirty.py").write_text("user work-in-progress") + # Also stage a tracked modification so both flavours of "dirty" are + # represented in what ``git status`` would otherwise report. + (durable_root / "README.md").write_text("locally modified\n") - monkeypatch.chdir(caller_cwd) + monkeypatch.chdir(durable_root) runner = _runner( durable_root, runner_cls=EmptyDurableRunner, @@ -803,5 +816,77 @@ def test_durable_baseline_includes_dirty_files_in_durable_root( companion_allowlist=[".pdd/meta/*.json"], ) + # Iter-22 M-1: durable baseline is empty regardless of the git_root's + # state. The dirty paths from the main checkout must NOT bleed into the + # per-module worktree's allow set. assert runner.project_root == durable_root.resolve() - assert "dirty.py" in runner._baseline_changed_paths + assert runner._baseline_changed_paths == set() + assert runner._baseline_ignored_paths == set() + + +def test_durable_scope_guard_does_not_whitelist_main_checkout_dirty_files( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Iter-22 M-1 reviewer repro: a dirty ``out.py`` in the main checkout + must NOT silently whitelist an ``out.py`` written by sync in a separate + per-module worktree. Before iter-22, the durable runner snapshotted the + main checkout's ``out.py`` into ``_baseline_changed_paths``; the scope + guard then resolved that path against the per-module worktree root and + added it to ``allowed_files``, so the contract-violating worktree + ``out.py`` slid through. + """ + from pdd import agentic_sync_runner as mod + + # Main checkout (becomes the durable runner's ``git_root``) with a dirty + # ``out.py`` standing in for the user's WIP. 
+ main_checkout = _init_repo_with_remote(tmp_path) + (main_checkout / "out.py").write_text("user WIP in main checkout") + + # Separate worktree directory, where sync actually runs. Initialize it + # as its own git repo so ``_resolve_repo_root`` and ``git status`` + # operate locally there. + worktree_path = tmp_path / "sync-worktree" + worktree_path.mkdir() + _git(worktree_path, "init", "-b", "main", ".") + _git(worktree_path, "config", "user.name", "Test User") + _git(worktree_path, "config", "user.email", "test@example.invalid") + (worktree_path / ".gitignore").write_text(".pdd/\n", encoding="utf-8") + _git(worktree_path, "add", ".gitignore") + _git(worktree_path, "commit", "-m", "initial") + + # Sync wrote ``out.py`` inside the worktree — this is the contract + # violation that must be detected. + (worktree_path / "out.py").write_text("written by sync, NOT in contract") + + monkeypatch.chdir(main_checkout) + runner = _runner( + main_checkout, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # Mock the revert helpers to return [] so the diagnostic depends purely + # on the re-scan + baseline interaction, not the helpers' behaviour. + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, + "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + + # Pretend ``module_cwd`` resolves to the separate worktree. + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: worktree_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", worktree_path) + + # Without the iter-22 fix the diagnostic would be ``None`` because + # ``out.py`` from the (leaked) baseline resolves to + # ``/out.py`` and lands in ``allowed_files``. With the + # fix the baseline is empty, so ``out.py`` is correctly out of scope. 
+ assert diagnostic is not None + assert "out.py" in diagnostic From eb021957eedad734e8d87966759d17ac3203b86e Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 13:24:50 -0700 Subject: [PATCH 33/42] fix(sync): iter-24 hash-aware baseline preservation closes clobber gap MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-23 demonstrated: dirty out-of-scope outside.py + sync overwrite → scope guard returned None, leaving "sync clobbered" content in place of user WIP. The iter-6 B1 baseline-preservation rule (don't delete pre-existing untracked files) was implemented as name-based pass: any path in _baseline_changed_paths was added to allowed_files unconditionally, so an LLM that wrote to an out-of-scope path which happened to coincide with the user's dirty WIP bypassed the contract check silently. Make baseline preservation content-aware. _baseline_changed_paths and _baseline_ignored_paths become Dict[path, Optional[str]] mapping repo-relative path → init-time SHA-1. At scope-guard time, re-hash each baseline path; only auto-allow when current SHA matches init-time SHA. Clobbered (different SHA) and deleted (None) entries fall through to the normal contract check — sync's out-of-scope writes now surface in the diagnostic even when they happen to coincide with a baseline filename. Unreadable-at-init paths (init SHA was None) fall back to the legacy name-based preserve to avoid false-positives on permission-flaky paths. Cost: hashlib.sha1 over each baseline path, only when scope_guard_enabled AND allowed_write_paths is not None. Baseline sets are typically small (user's local dirty WIP, not whole tree). DurableSyncRunner's iter-22 baseline-clearing now uses empty dicts ({}) instead of empty sets — the iteration is a no-op either way, but the type swap matches the new invariant. Verified the existing durable test_baseline_remains_empty_after_init continues to pass. 
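A minimal sketch of the content-aware preservation rule described above, assuming a baseline stored as a mapping from repo-relative path to init-time SHA-1. The names `file_sha1`, `snapshot_baseline`, and `preserved_paths` are hypothetical and chosen for illustration; they are not the helpers added by this patch.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, Optional, Set


def file_sha1(root: Path, rel: str) -> Optional[str]:
    """Return the SHA-1 of root/rel, or None when the file cannot be read."""
    try:
        return hashlib.sha1((root / rel).read_bytes()).hexdigest()
    except OSError:
        return None


def snapshot_baseline(root: Path, rels: Iterable[str]) -> Dict[str, Optional[str]]:
    """Record each baseline path together with its init-time fingerprint."""
    return {rel: file_sha1(root, rel) for rel in rels}


def preserved_paths(root: Path, baseline: Dict[str, Optional[str]]) -> Set[str]:
    """Return the baseline paths that should still be auto-allowed."""
    preserved: Set[str] = set()
    for rel, init_sha in baseline.items():
        if init_sha is None:
            # Unreadable at init: fall back to name-based preservation.
            preserved.add(rel)
        elif file_sha1(root, rel) == init_sha:
            # Byte-identical to the snapshot: still the user's own WIP.
            preserved.add(rel)
        # Clobbered (different SHA) or deleted (None) entries are left out,
        # so they fall through to the normal contract check.
    return preserved
```

Under these assumptions, a baseline file that the sync overwrites or deletes drops out of the preserved set, while an untouched file stays in it.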
Verified the codex repro is closed: before this fix `diag: None / outside: sync clobbered`; after, diagnostic surfaces outside.py in the Unrecovered block. Tests: clobber-detected, unchanged-still-allowed (iter-6 B1 regression), deleted-baseline-drops-from-allowed, durable-baseline empty-dict invariant. 673 passed across the scope-guard suite. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync_runner.py | 130 +++++++++-- pdd/durable_sync_runner.py | 9 +- pdd/prompts/agentic_sync_runner_python.prompt | 3 +- tests/test_agentic_sync_runner.py | 216 +++++++++++++++++- tests/test_durable_sync_runner.py | 42 +++- 5 files changed, 377 insertions(+), 23 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 295f24ba7..b41ae492c 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -8,6 +8,7 @@ import csv as _csv import datetime +import hashlib import json import os import re @@ -223,6 +224,35 @@ def _git_ignored_paths(project_root: Path) -> set[str]: return paths +def _hash_file(project_root: Path, rel_posix: str) -> Optional[str]: + """Return the SHA-1 of *rel_posix* under *project_root*, or None. + + Issue #1013 iter-24 (M-1) baseline-clobber fix: the scope guard preserves + pre-existing dirty/untracked paths (iter-6 B1) so the sync run does not + delete unrelated user WIP. The original implementation matched paths by + NAME only, which let a buggy LLM SILENTLY OVERWRITE an out-of-scope + baseline file with different content — the post-revert re-scan saw the + name in the baseline and skipped the contract check. + + This hash is captured per baseline path at runner init and re-computed + at scope-guard time; only an unchanged SHA (same bytes on disk) is + treated as pre-existing user WIP and auto-allowed. A divergent SHA + falls through to the contract check, surfacing the clobber. + + SHA-1 is sufficient — this is clobber detection, not adversarial + collision resistance. 
Returns ``None`` when the file cannot be read + (missing, permission denied, etc.); callers MUST treat ``None`` as + "no fingerprint available" and decide policy explicitly. + """ + try: + path = (project_root / rel_posix).resolve() + with open(path, "rb") as handle: + data = handle.read() + except (OSError, FileNotFoundError): + return None + return hashlib.sha1(data).hexdigest() + + # --------------------------------------------------------------------------- # Helper functions # --------------------------------------------------------------------------- @@ -988,11 +1018,25 @@ def __init__( # set BEFORE any module sync runs. Pre-existing untracked files # (e.g. user's ``scratch.txt``) are not the sync run's responsibility # and MUST be preserved by the scope guard. - self._baseline_changed_paths: Set[str] = ( - _git_changed_paths(self.project_root) - if self.scope_guard_enabled and self.allowed_write_paths is not None - else set() - ) + # + # Iter-24 M-1 (baseline-clobber bug): the snapshot is now a DICT + # mapping repo-relative POSIX path → init-time SHA-1 instead of a + # bare set. ``_enforce_scope_guard`` re-hashes each baseline path at + # check time; ONLY paths whose content is byte-identical to the init + # snapshot are auto-allowed. Name-based preservation let a buggy LLM + # silently OVERWRITE pre-existing dirty files outside the contract + # (the iter-23 codex repro). Empty dict — never bare set — when the + # gate (``scope_guard_enabled AND allowed_write_paths is not None``) + # is off, so the dict.items() loops in the enforcement path are + # safe no-ops. + if self.scope_guard_enabled and self.allowed_write_paths is not None: + _baseline_paths = _git_changed_paths(self.project_root) + self._baseline_changed_paths: Dict[str, Optional[str]] = { + rel: _hash_file(self.project_root, rel) + for rel in _baseline_paths + } + else: + self._baseline_changed_paths = {} # Iter-20 M-1 (gitignored fail-open): also snapshot pre-existing # gitignored files (e.g. 
user-side ``build/cache.bin`` under a @@ -1006,11 +1050,19 @@ def __init__( # (same gate as ``_baseline_changed_paths``) so non-contract runs do # not pay the ``git ls-files`` cost on repos with large ignored # trees (``node_modules/``, ``build/``, etc.). - self._baseline_ignored_paths: Set[str] = ( - _git_ignored_paths(self.project_root) - if self.scope_guard_enabled and self.allowed_write_paths is not None - else set() - ) + # + # Iter-24 M-1: same dict-with-SHA shape as ``_baseline_changed_paths`` + # — pre-existing ignored files are skipped from the gitignored + # re-scan ONLY when their content is unchanged. A clobbered ignored + # baseline path surfaces via the re-scan. + if self.scope_guard_enabled and self.allowed_write_paths is not None: + _baseline_ignored = _git_ignored_paths(self.project_root) + self._baseline_ignored_paths: Dict[str, Optional[str]] = { + rel: _hash_file(self.project_root, rel) + for rel in _baseline_ignored + } + else: + self._baseline_ignored_paths = {} self.total_budget = self.sync_options.get("total_budget") self.max_workers = 1 if self.total_budget is not None else MAX_WORKERS @@ -2047,15 +2099,33 @@ def _remaining_out_of_scope_paths( if ignored_result.returncode != 0: return [""] - baseline_ignored = getattr(self, "_baseline_ignored_paths", set()) + # Iter-24 M-1: ``_baseline_ignored_paths`` is now a Dict[path, SHA]. + # ``getattr`` fallback keeps the runtime robust against subclasses + # that bypass __init__ (none in-tree today, but the iter-20 fallback + # used ``set()`` for the same reason); we still expect a mapping. + baseline_ignored = getattr(self, "_baseline_ignored_paths", {}) for line in ignored_result.stdout.splitlines(): rel = _normalize_repo_path(line.strip().strip('"')) if not rel: continue - # Pre-existing ignored files (snapshot at runner init) are NOT - # the sync run's responsibility. 
+ # Iter-6 B1 / iter-24 M-1: pre-existing ignored files snapshotted + # at runner init are NOT the sync run's responsibility — but ONLY + # if their content is byte-identical to the init snapshot. A + # clobbered ignored baseline path must surface as out-of-scope. if rel in baseline_ignored: - continue + baseline_hash = baseline_ignored[rel] + current_hash = _hash_file(repo_root, rel) + # ``current_hash is None`` means the file disappeared from + # disk between the init snapshot and now — but the ignored + # scan still listed it, which is contradictory. Fall through + # to surface it; the diagnostic is the safer direction. + if (current_hash is not None + and (baseline_hash is None + or current_hash == baseline_hash)): + # Unchanged baseline (or unreadable at init → preserve + # by name, same conservative carve-out as the changed + # baseline branch in ``_enforce_scope_guard``). + continue absolute = (repo_root / rel).resolve() # Companion-allowlisted files (e.g. ``.pdd/meta/*.json`` when # the user has ``.pdd/`` in ``.gitignore``) are in @@ -2153,8 +2223,36 @@ def _enforce_scope_guard( # captured at runner __init__ are NEVER out-of-scope. Without # this pass, a user's ``scratch.txt`` or unrelated WIP under # the repo root would be removed by the revert helper. - for rel_posix in self._baseline_changed_paths: - allowed_files.add((repo_root / rel_posix).resolve()) + # + # Iter-24 M-1 (baseline-clobber bug): preservation is now + # CONTENT-AWARE. Re-hash each baseline path against the + # init-time SHA. Only byte-identical content is auto-allowed; + # divergent SHAs fall through to the contract check so a + # sync-side clobber is surfaced. Note: the init hash uses + # ``self.project_root`` and the enforcement hash uses + # ``repo_root`` — in the non-durable async case those resolve + # to the same path; the durable runner clears the baseline + # entirely (iter-22) so this loop is a no-op there. 
+ for rel_posix, baseline_hash in self._baseline_changed_paths.items(): + current_hash = _hash_file(repo_root, rel_posix) + if current_hash is None: + # File was deleted (or unreadable) at check time — + # don't auto-allow a ghost. The revert helpers handle + # restore semantics; we just refuse to whitelist. + continue + if baseline_hash is None: + # Couldn't hash at init (the file was unreadable + # then). Be conservative and preserve by name, the + # legacy iter-6 B1 behaviour — avoids false-positives + # on permission-flaky paths that pre-date the run. + allowed_files.add((repo_root / rel_posix).resolve()) + continue + if current_hash == baseline_hash: + # Unchanged user WIP — preserve. + allowed_files.add((repo_root / rel_posix).resolve()) + # else: sync (or some other writer) clobbered the file. + # Do NOT add to allowed_files — let the contract check + # flag it as out-of-scope. tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) untracked_reverted = revert_out_of_scope_changes_with_dirs( diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index fec60dc46..807412e50 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -112,8 +112,13 @@ def __init__( # against the scope-guard-time ``repo_root``, which is the per-module # worktree root), bypassing the split contract. Clear the baseline # so each fresh module worktree starts clean. - self._baseline_changed_paths = set() - self._baseline_ignored_paths = set() + # + # Iter-24 M-1: baseline snapshots are now ``Dict[str, Optional[str]]`` + # (path → SHA-1) for content-aware preservation; clear to empty dicts + # so iteration in ``_enforce_scope_guard`` and + # ``_remaining_out_of_scope_paths`` remains a no-op. 
+ self._baseline_changed_paths = {} + self._baseline_ignored_paths = {} if self.total_budget is not None: self.max_workers = 1 elif durable_max_parallel is not None: diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index d9182eb36..a690730ae 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -41,7 +41,8 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `scope_guard_enabled: bool` — master switch (default `True`). When `False`, the runner records the parsed contract for diagnostics but performs no enforcement, no revert, and no hard-fail. Maps to the CLI `--no-scope-guard` opt-out. - `contract_source: Optional[str]` — diagnostic label carrying the parse source of the issue contract (`"html-comment"`, `"fenced-block"`, or `"bullet-list"`, matching `IssueContract.source`) so scope-guard diagnostics and downstream review-loop reporters can surface where the contract was detected. `None` when no contract was parsed (permissive fallback). - `project_root: Optional[Path]` — when non-`None`, overrides the default `Path.cwd()` used to seed `self.project_root` and to take the baseline-changed-paths snapshot (Issue #1013 iter-18 M-1). Subclasses such as `DurableSyncRunner` pin this to the durable worktree's git root so the baseline reflects the worktree where syncs will actually run, not the caller's current working directory. Resolved with `Path(project_root).resolve()` when provided. - - `_baseline_ignored_paths: Set[str]` (Issue #1013 iter-20 M-1) — sibling snapshot to `_baseline_changed_paths`, populated from `git ls-files --others --ignored --exclude-standard` at init via the helper `_git_ignored_paths(project_root)`. Records repo-relative POSIX paths of pre-existing gitignored files (e.g. 
user-side `build/cache.bin` under a repo-wide `.gitignore: build/`) so the post-revert re-scan does not flag them as the sync run's out-of-scope writes. Gated identically to `_baseline_changed_paths` (`scope_guard_enabled AND allowed_write_paths is not None`) so non-contract runs do not pay the `git ls-files` cost on repos with large ignored trees (`node_modules/`, `build/`, etc.). + - `_baseline_changed_paths: Dict[str, Optional[str]]` (Issue #1013 iter-6 B1 + iter-24 M-1) — snapshot of pre-existing dirty/untracked working-tree paths captured from `_git_changed_paths(project_root)` at runner init, mapping each repo-relative POSIX path to its init-time SHA-1 (`_hash_file(project_root, rel)`) or `None` when the file was unreadable at init. Iter-6 B1 originated the "preserve user's pre-existing untracked files" carve-out so the scope guard does not delete unrelated user WIP. **Iter-24 M-1 (baseline-clobber bug)** upgraded preservation from name-based to content-aware: the old `Set[str]` snapshot let a buggy LLM silently OVERWRITE an out-of-scope baseline file (the iter-23 codex repro: `outside: sync clobbered`). The dict + SHA invariant: a baseline path is auto-allowed (added to `allowed_files`) by `_enforce_scope_guard` ONLY IF its current SHA-1 matches the init-time SHA-1; a divergent SHA falls through to the contract check and surfaces the clobber. Two edge cases: a path that is deleted or unreadable at check time (current SHA is `None`) is never auto-allowed, and a path whose init-time SHA is `None` (unreadable at init) but which is readable at check time is preserved by name, the legacy iter-6 B1 behaviour, to avoid false positives on permission-flaky paths. Gated on `scope_guard_enabled AND allowed_write_paths is not None`; when the gate is off the snapshot is an empty dict so the `.items()` iteration in the enforcement path is a no-op. The `_hash_file` helper uses SHA-1 because this is clobber detection, not adversarial collision resistance. + - `_baseline_ignored_paths: Dict[str, Optional[str]]` (Issue #1013 iter-20 M-1 + iter-24 M-1) — sibling snapshot to `_baseline_changed_paths`, populated from `git ls-files --others --ignored --exclude-standard` at init via the helper `_git_ignored_paths(project_root)`.
Records repo-relative POSIX paths of pre-existing gitignored files (e.g. user-side `build/cache.bin` under a repo-wide `.gitignore: build/`) so the post-revert re-scan does not flag them as the sync run's out-of-scope writes. Gated identically to `_baseline_changed_paths` (`scope_guard_enabled AND allowed_write_paths is not None`) so non-contract runs do not pay the `git ls-files` cost on repos with large ignored trees (`node_modules/`, `build/`, etc.). **Iter-24 M-1** same dict-with-SHA shape as `_baseline_changed_paths`: pre-existing ignored files are skipped from the gitignored re-scan ONLY when their content is byte-identical to the init snapshot; a clobbered ignored baseline path surfaces via the re-scan as out-of-scope. - Tracks per-module state: pending -> running -> success | failed 2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs 3. Use `concurrent.futures.ThreadPoolExecutor` with `MAX_WORKERS = 4`; when `sync_options["total_budget"]` is set, run sequentially and pass only the remaining total budget to each child process so the total budget is not multiplied per module. 
diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 5e25ff2cd..d59e9b59b 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2881,6 +2881,219 @@ def test_pre_existing_untracked_files_are_preserved(self, tmp_path): ) assert diagnostic is None or "scratch.txt" not in diagnostic + # --------------------------------------------------------------------- + # Iter-24 M-1: hash-aware baseline preservation + # --------------------------------------------------------------------- + + def test_baseline_preservation_clobber_is_detected( + self, tmp_path, monkeypatch + ): + """Iter-24 M-1 (baseline-clobber bug, codex iter-23 repro): a + pre-existing dirty file outside the contract MUST be flagged when + sync overwrites it with different content. Name-based preservation + (iter-6 B1) silently auto-allowed any same-named write.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing dirty file outside the contract — codex repro. 
+ outside = tmp_path / "outside.py" + outside.write_text("user wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "outside.py" in runner._baseline_changed_paths + baseline_hash = runner._baseline_changed_paths["outside.py"] + assert baseline_hash is not None, ( + "iter-24: baseline SHA must be captured for readable files" + ) + + # Simulate sync (buggy LLM) clobbering the file with different + # content. + outside.write_text("sync clobbered") + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-24: clobbered baseline file must hard-fail the module" + ) + assert "outside.py" in diagnostic, ( + "iter-24: clobbered path must appear in the diagnostic" + ) + + def test_baseline_preservation_unchanged_file_still_allowed( + self, tmp_path, monkeypatch + ): + """Iter-24 M-1 (iter-6 B1 regression): an unchanged pre-existing + dirty file MUST still be auto-allowed (preserved). 
The iter-24 fix + adds content-awareness but does not break the iter-6 B1 carve-out + for unchanged user WIP.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + outside = tmp_path / "outside.py" + outside.write_text("user wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # Sync did NOT touch outside.py — content unchanged. + assert outside.read_text() == "user wip" + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is None, ( + f"iter-24: unchanged baseline file must be preserved, got: " + f"{diagnostic!r}" + ) + + def test_baseline_preservation_deleted_file_drops_from_allowed( + self, tmp_path, monkeypatch + ): + """Iter-24 M-1: when a baseline path is deleted between init and + scope-guard time, the iter-24 logic skips it (``current_hash is + None``). 
The deleted path MUST NOT appear in ``allowed_files`` — + use a TRACKED-AND-MODIFIED baseline path so ``git status`` shows + the deletion (advisor #3 fix). Capture the revert helpers' allowed + set to assert directly.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + # Commit ``outside.py`` so it's tracked; then dirty it. ``git + # status`` will list the subsequent deletion as ``D ``. + (tmp_path / "outside.py").write_text("tracked content") + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "outside.py", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + outside = tmp_path / "outside.py" + outside.write_text("dirty content") # now tracked + modified + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "outside.py" in runner._baseline_changed_paths + + # Simulate the file being deleted between baseline snapshot and + # scope-guard run. 
+ outside.unlink() + + captured_allowed: Dict[str, set] = {} + + def fake_revert(_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + runner._enforce_scope_guard("mod", tmp_path) + + # The KEY assertion: the deleted baseline path was NOT auto-allowed. + deleted_abs = (tmp_path / "outside.py").resolve() + assert deleted_abs not in captured_allowed["files"], ( + "iter-24: deleted baseline path must not be in allowed_files; " + f"got: {captured_allowed['files']}" + ) + def test_wildcard_only_companion_pattern_does_not_auto_allow( self, tmp_path, monkeypatch ): @@ -3290,7 +3503,8 @@ def test_gitignored_out_of_scope_file_is_detected( companion_allowlist=[".pdd/meta/*.json"], ) runner.project_root = tmp_path.resolve() - assert runner._baseline_ignored_paths == set(), ( + # Iter-24 M-1: baseline snapshots are now Dict[str, Optional[str]]. + assert runner._baseline_ignored_paths == {}, ( "no ignored files exist yet — baseline must be empty" ) diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index dbb6bf582..4c0cb2952 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -776,8 +776,9 @@ def test_durable_baseline_paths_use_git_root_not_caller_cwd( assert runner.project_root == durable_root.resolve() # Iter-22 M-1 invariant: durable baseline is always empty. + # Iter-24 M-1: empty dict (was empty set before the type swap). 
assert "out.py" not in runner._baseline_changed_paths - assert runner._baseline_changed_paths == set() + assert runner._baseline_changed_paths == {} def test_durable_baseline_is_empty_even_when_git_root_has_dirty_files( @@ -819,9 +820,44 @@ def test_durable_baseline_is_empty_even_when_git_root_has_dirty_files( # Iter-22 M-1: durable baseline is empty regardless of the git_root's # state. The dirty paths from the main checkout must NOT bleed into the # per-module worktree's allow set. + # Iter-24 M-1: empty dict (was empty set before the type swap). assert runner.project_root == durable_root.resolve() - assert runner._baseline_changed_paths == set() - assert runner._baseline_ignored_paths == set() + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + + +def test_durable_baseline_remains_empty_dict_after_init( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Issue #1013 iter-24 M-1: baseline snapshots changed from ``Set[str]`` + to ``Dict[str, Optional[str]]`` (path → SHA-1) for content-aware + preservation. The durable runner's iter-22 "clear baseline" invariant + still holds — but the cleared value is now an empty dict, not an empty + set. Iterating a dict with no entries yields nothing, so all the + ``.items()`` loops in ``_enforce_scope_guard`` and + ``_remaining_out_of_scope_paths`` remain safe no-ops in durable mode. + """ + durable_root = _init_repo_with_remote(tmp_path) + # Dirty paths in the git_root that would have populated the baseline + # under the inherited AsyncSyncRunner init path. 
+ (durable_root / "dirty.py").write_text("user wip") + (durable_root / "build.log").write_text("ignored junk") + + monkeypatch.chdir(durable_root) + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + # Iter-24 invariant: the cleared baseline is a Mapping (so .items() is + # safe). Iteration yields no entries — confirms downstream loops are + # no-ops. + assert list(runner._baseline_changed_paths.items()) == [] + assert list(runner._baseline_ignored_paths.items()) == [] def test_durable_scope_guard_does_not_whitelist_main_checkout_dirty_files( From ed5e3e4798b231b6da19c47dec2677caad853adc Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 13:43:12 -0700 Subject: [PATCH 34/42] fix(sync): iter-26 B-1 close orchestrator-level scope leak in arch corrections MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Reviewer caught a real contract bypass: pdd/agentic_sync.py:1903 calls _apply_architecture_corrections() BEFORE any runner exists, modifying architecture.json at the orchestrator level. The per-module scope guard never sees this write. Plus the "All modules are already synced — nothing to do." early return at line 1973 short-circuits the dispatch entirely without running enforcement. Reviewer reproduced: contract allowed only pdd/foo.py, LLM dependency-correction modified architecture.json, sync returned success with M architecture.json in git status. Add an orchestrator-level scope gate around the deps-correction call. Architecture.json may be modified iff: - No contract parsed (issue_contract is None → permissive mode), OR - The caller passed --no-scope-guard (scope_guard is False → opt-out), OR - architecture.json is explicitly in the contract's allowed_paths. 
Otherwise, emit an explicit skip warning telling the operator to add architecture.json to the contract or rerun with --no-scope-guard, and do not invoke _apply_architecture_corrections. This closes the bypass WITHOUT needing to revert post-hoc — the orchestrator simply refuses the only out-of-contract write it can perform. The "already synced" early return now has nothing to enforce because the orchestrator never performs the write in the first place. Plus docs drift: CHANGELOG.md:5 listed only HTML-comment and fenced-block contract formats; the iter-18 bullet-list format ("## Split Contract" + "**Allowed write set:**" + bullets) was missing. _extract_allowed_write_paths docstring at agentic_sync.py:1543 had the same omission. Tests: 4 codepath cases (skipped/applied/no-contract/opt-out) plus a defensive "already-synced early-return does not leak arch changes" test that uses a real `git init` tmp repo and asserts `git status` is clean for architecture.json. 678 passed across the scope-guard surface. Co-Authored-By: Claude Opus 4.7 --- CHANGELOG.md | 2 +- pdd/agentic_sync.py | 37 +++- tests/test_agentic_sync.py | 432 +++++++++++++++++++++++++++++++++++++ 3 files changed, 464 insertions(+), 7 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 96123ae64..52058512a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -2,7 +2,7 @@ ### Fix -- **#1013 sync**: enforce split-contract allowed write sets. When the linked GitHub issue declares an allowed write set (HTML comment `` or a fenced "Allowed Write Set" / "Split Contract" block), `pdd sync` now reverts tracked changes and removes untracked new files that fall outside the contract after each per-module subprocess, hard-fails the module on out-of-scope artifacts, and surfaces the contract source plus offending paths in checkup/review-loop reports. Companion artifacts under `.pdd/meta/*.json` are auto-allowed; additional companions can be opted in via the contract's `companion_allowlist` field. 
Use `--no-scope-guard` to opt out for a single run. Issues without a contract marker remain in permissive mode (no enforcement). +- **#1013 sync**: enforce split-contract allowed write sets. When the linked GitHub issue declares an allowed write set (HTML comment ``, a fenced "Allowed Write Set" / "Split Contract" block, or a `## Split Contract` heading with an `**Allowed write set:**` label followed by a bullet list), `pdd sync` now reverts tracked changes and removes untracked new files that fall outside the contract after each per-module subprocess, hard-fails the module on out-of-scope artifacts, and surfaces the contract source plus offending paths in checkup/review-loop reports. Companion artifacts under `.pdd/meta/*.json` are auto-allowed; additional companions can be opted in via the contract's `companion_allowlist` field. Use `--no-scope-guard` to opt out for a single run. Issues without a contract marker remain in permissive mode (no enforcement). ## v0.0.238 (2026-05-14) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index a981e1d52..8e0c19bf7 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -1536,11 +1536,11 @@ def _extract_allowed_write_paths(issue_text: str) -> List[str]: This helper used to do its own loose markdown scan for allowed-write paths. It now delegates to the structured contract parser in - :mod:`pdd.agentic_common` so the public contract API (HTML-comment JSON - and fenced-block formats) is the single source of truth. The wrapper is - kept for one release so any external caller that imported the private - name does not crash at import time; it returns an empty list whenever - :func:`parse_issue_contract` cannot find a valid contract. + :mod:`pdd.agentic_common` so the public contract API (HTML-comment JSON, + fenced-block, and bullet-list formats) is the single source of truth. 
+ The wrapper is kept for one release so any external caller that imported + the private name does not crash at import time; it returns an empty list + whenever :func:`parse_issue_contract` cannot find a valid contract. """ contract = parse_issue_contract(issue_text) return list(contract.allowed_paths) if contract is not None else [] @@ -1891,6 +1891,22 @@ def run_agentic_sync( console.print(f"[green]Modules to sync: {modules_to_sync}[/green]") # 10. Apply dependency corrections if needed + # + # Iter-26: scope-guard the LLM dependency-correction step. This runs + # at the ORCHESTRATOR level — before the runner exists — so the per- + # module scope guard cannot catch it. If the issue contract does not + # include ``architecture.json`` in its allowed write set, skip the + # correction so the contract is not silently violated. Pre-existing + # behavior is preserved when no contract was parsed (issue_contract + # is None → permissive mode) or when ``architecture.json`` IS in the + # contract (legitimate architecture-touching PRs). ``--no-scope-guard`` + # (``scope_guard is False``) is treated as an explicit opt-out and + # also bypasses the gate. + arch_in_scope = ( + issue_contract is None + or scope_guard is False + or "architecture.json" in tuple(issue_contract.allowed_paths or ()) + ) if not deps_valid and deps_corrections and architecture is not None: if dry_run: if not quiet: @@ -1898,9 +1914,18 @@ def run_agentic_sync( "[yellow]Dry run: dependency corrections were suggested; " "architecture.json was not modified.[/yellow]" ) + elif not arch_in_scope: + if not quiet: + console.print( + "[yellow]Sync scope guard: skipping LLM dependency " + "corrections — architecture.json is outside the issue " + "split-contract allowed write set. 
Add architecture.json " + "to the contract or rerun with --no-scope-guard to apply " + "corrections.[/yellow]" + ) elif not quiet: console.print("[yellow]LLM flagged dependency corrections, updating architecture.json...[/yellow]") - if not dry_run: + if not dry_run and arch_in_scope: architecture = _apply_architecture_corrections(arch_path, architecture, deps_corrections, quiet) # 11. Build dependency graph diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index 84f0b9170..fda91f139 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -1479,6 +1479,438 @@ def test_durable_mode_uses_durable_runner( assert runner_kwargs["durable_max_parallel"] == 2 +# --------------------------------------------------------------------------- +# Iter-26: orchestrator-level scope guard for the LLM dependency-correction +# step. The per-module scope guard runs INSIDE the runner; the dependency- +# correction step writes architecture.json BEFORE any runner exists. If the +# split-contract allowed write set does not include ``architecture.json``, +# the orchestrator must skip the correction so the contract is not silently +# violated. These tests cover the gate decision plus the already-synced +# early-return path which dispatches no runner at all. +# --------------------------------------------------------------------------- + + +# A bullet-list contract that the parser in pdd.agentic_common recognizes. +# The ``**Allowed write set:**`` inline label is the discriminator; ``## Split +# Contract`` is just the surrounding heading. NOTE: architecture.json is NOT +# in this allow set, so the orchestrator must skip the deps-correction step. +_CONTRACT_BODY_ARCH_OUT_OF_SCOPE = ( + "Fix foo.\n" + "\n" + "## Split Contract\n" + "\n" + "**Allowed write set:**\n" + "\n" + "- `pdd/foo.py`\n" +) + +# Same shape but architecture.json IS in the allow set — the correction should +# be applied. 
+_CONTRACT_BODY_ARCH_IN_SCOPE = ( + "Fix foo.\n" + "\n" + "## Split Contract\n" + "\n" + "**Allowed write set:**\n" + "\n" + "- `pdd/foo.py`\n" + "- `architecture.json`\n" +) + + +class TestDependencyCorrectionsScopeGuard: + """Verify the orchestrator-level scope gate on + ``_apply_architecture_corrections``. The gate runs BEFORE any runner is + dispatched, so per-module scope enforcement cannot catch this write.""" + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_skipped_when_arch_outside_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + tmp_path, + capsys, + ): + """Contract excludes architecture.json → corrections must NOT run.""" + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_OUT_OF_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + [{"filename": "foo_python.prompt", "dependencies": []}], + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 
'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, msg, cost, model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=False, # capture the skip-warning text + ) + + assert success is True + mock_apply_corrections.assert_not_called() + captured = capsys.readouterr() + # Warning text from the orchestrator skip branch. Rich console may + # line-wrap the message at any width, so we collapse whitespace + # before substring-matching for the discriminating phrase. + flat = " ".join(captured.out.split()) + assert "Sync scope guard: skipping LLM dependency corrections" in flat + assert "architecture.json is outside" in flat + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_when_arch_in_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + 
tmp_path, + ): + """Contract includes architecture.json → corrections must run.""" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + arch_data, + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_when_no_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + 
mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """No contract markers → permissive mode preserves pre-iter-26 behavior.""" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + # No HTML comment, no fenced block, no ``**Allowed write set:**`` label. + issue_data = { + "title": "Fix foo", + "body": "Just fix foo, no contract here.", + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + arch_data, + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + 
@patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_when_scope_guard_disabled( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """``--no-scope-guard`` bypasses the gate even when arch is out of scope.""" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_OUT_OF_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = ( + arch_data, + tmp_path / "architecture.json", + ) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + scope_guard=False, + ) + + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.DurableSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=[]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, 
[]), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_already_synced_early_return_does_not_leak_arch_changes( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_durable_runner_cls, + mock_async_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """Defensive: even if every module is already synced and the runner is + never dispatched, the orchestrator must NOT write architecture.json + out-of-scope. Verifies both the mock-level assertion AND that no + ``M architecture.json`` shows up in a real git repo's ``git status`` + after the orchestrator returns its early "already synced" success. + """ + # Build a tiny git repo with a committed architecture.json so any + # subsequent write would show as a tracked modification. 
+ repo = tmp_path / "repo" + repo.mkdir() + subprocess.run(["git", "init", "--quiet"], cwd=repo, check=True) + subprocess.run( + ["git", "config", "user.email", "test@example.com"], + cwd=repo, + check=True, + ) + subprocess.run( + ["git", "config", "user.name", "Test"], + cwd=repo, + check=True, + ) + arch_file = repo / "architecture.json" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + arch_file.write_text(json.dumps(arch_data, indent=2)) + subprocess.run( + ["git", "add", "architecture.json"], cwd=repo, check=True + ) + subprocess.run( + ["git", "commit", "--quiet", "-m", "init arch"], + cwd=repo, + check=True, + ) + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_OUT_OF_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + mock_load_arch.return_value = (arch_data, arch_file) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": repo}, [], 0.0) + + old_cwd = Path.cwd() + try: + os.chdir(repo) + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + finally: + os.chdir(old_cwd) + + # Orchestrator returns the "already synced" early-success path. + assert success is True + assert "already synced" in msg.lower() + + # The gate must have refused the only out-of-contract write the + # orchestrator can perform. + mock_apply_corrections.assert_not_called() + # No runner is dispatched on the already-synced path. + mock_async_runner_cls.assert_not_called() + mock_durable_runner_cls.assert_not_called() + + # Defense-in-depth: a real git status check confirms the on-disk + # architecture.json is untouched. 
+ status = subprocess.run( + ["git", "status", "--porcelain"], + cwd=repo, + check=True, + capture_output=True, + text=True, + ) + assert "architecture.json" not in status.stdout + + # --------------------------------------------------------------------------- # _resolve_module_cwd # --------------------------------------------------------------------------- From 56fc6b1cc5e549a82b215076e58ea813d8876741 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 14:04:23 -0700 Subject: [PATCH 35/42] fix(sync): iter-28 close two orchestrator-level scope bypasses MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-27 found two BLOCKERS in the pre-runner orchestrator path that bypass scope-guard enforcement entirely. B-1: pdd/agentic_sync.py:1284 executes an LLM-suggested ``pdd sync`` shell command to validate the cwd for a dry-run failure. The cmd is passed through to ``subprocess.run(shell=True)`` verbatim. If the LLM omits ``--dry-run`` (or returns it without the flag), the validation step actually performs a real sync and writes files BEFORE the scope-guarded runner is constructed. Add ``_inject_dry_run_flag()`` that injects ``--dry-run`` into every ``pdd sync`` invocation in the suggested command using a regex with a positional lookahead so we don't match ``pdd sync-architecture`` or ``pdd synchronize``. Idempotent when the flag is already present. Paranoia check refuses to execute if injection didn't land. LLM prompt template updated to require ``--dry-run`` and note auto-injection. B-2: the iter-26 ``arch_in_scope`` gate compared the literal string ``"architecture.json"`` against the contract. 
But ``arch_path`` can be a nested file like ``frontend/architecture.json``, in which case the nested path is NOT in ``allowed_paths`` yet the gate passes the write because the contract names ``architecture.json`` (root only) — or, in the reverse, blocks legitimate nested-arch writes when the contract correctly names ``frontend/architecture.json``. Add ``_arch_path_in_scope(arch_path, project_root, issue_contract, scope_guard)`` that resolves the ACTUAL ``arch_path`` to a repo-relative POSIX string and compares against ``issue_contract.allowed_paths``. Returns False when arch_path resolves outside project_root. Tests: B-1 — injection cases (no-flag, has-flag, cd-chained, must-not-match sync-architecture) plus 7 helper-level cases. B-2 — nested arch in/out-of-contract, arch outside project root, plus 7 helper-level cases. 699 passed across the scope-guard surface. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync.py | 113 +++- .../agentic_sync_fix_dry_run_LLM.prompt | 5 + tests/test_agentic_sync.py | 552 ++++++++++++++++++ 3 files changed, 658 insertions(+), 12 deletions(-) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index 8e0c19bf7..76c74e878 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -1191,6 +1191,33 @@ def _run_single_dry_run( return False, str(e) +# Matches a ``pdd sync`` invocation as two whitespace-separated tokens (NOT +# ``pdd sync-architecture`` or ``pdd synchronize``). The lookahead requires +# whitespace or end-of-string after ``sync`` — ``\b`` between ``c`` and ``-`` +# would otherwise match ``sync-*`` subcommands. Used to inject ``--dry-run`` +# into LLM-suggested sync commands before they are executed. +_PDD_SYNC_INVOCATION_RE = re.compile(r"(\bpdd\s+sync)(?=\s|$)") + + +def _inject_dry_run_flag(cmd: str) -> str: + """Inject ``--dry-run`` into every ``pdd sync`` invocation in *cmd*. 
+ + Iter-28 B-1: the LLM-suggested command is supposed to be a dry-run probe + to validate cwd, but the LLM could omit ``--dry-run`` and cause a real + sync write before the scope-guarded runner exists. This injection is the + authoritative safety mechanism — the prompt template also instructs the + LLM to include it, but we cannot rely on prompt compliance alone. + + Only matches ``pdd sync`` as two whitespace-separated tokens; subcommands + like ``pdd sync-architecture`` are intentionally left untouched (and + rejected downstream by the paranoia check in + :func:`_llm_fix_dry_run_failure`). + """ + if "--dry-run" in cmd: + return cmd + return _PDD_SYNC_INVOCATION_RE.sub(r"\1 --dry-run", cmd) + + def _llm_fix_dry_run_failure( basename: str, project_root: Path, @@ -1273,10 +1300,30 @@ def _llm_fix_dry_run_failure( if "pdd" not in suggested_cmd or "sync" not in suggested_cmd: return False, None, llm_cost, f"LLM suggested unexpected command: {suggested_cmd}" + # B-1 (iter-28): force ``--dry-run`` onto the LLM-suggested command. The + # probe is supposed to validate cwd only — never perform a real sync. If + # the LLM omits ``--dry-run`` we inject it; if injection cannot land + # (e.g. ``pdd sync-architecture`` which intentionally is not matched by + # :data:`_PDD_SYNC_INVOCATION_RE`) the paranoia check below refuses to + # execute. The scope-guarded runner does not yet exist at this point, so + # there is no fallback enforcement. + safe_cmd = _inject_dry_run_flag(suggested_cmd) + if "--dry-run" not in safe_cmd: + return ( + False, + None, + llm_cost, + ( + "LLM suggested command does not contain a ``pdd sync`` invocation " + "where ``--dry-run`` could be injected; refusing to execute: " + f"{suggested_cmd}" + ), + ) + # Append a pwd marker after the command so we can extract the effective cwd. # This avoids fragile regex parsing of cd segments from the command string. 
pwd_marker = "__PDD_EFFECTIVE_CWD__" - augmented_cmd = f"{suggested_cmd} && echo {pwd_marker} && pwd" + augmented_cmd = f"{safe_cmd} && echo {pwd_marker} && pwd" # Run the suggested command directly via shell from project root. # This handles relative cd paths, chained cd's, etc. naturally. @@ -1546,6 +1593,42 @@ def _extract_allowed_write_paths(issue_text: str) -> List[str]: return list(contract.allowed_paths) if contract is not None else [] +def _arch_path_in_scope( + arch_path: Path, + project_root: Path, + issue_contract: Optional[IssueContract], + scope_guard: bool, +) -> bool: + """Return True if *arch_path* is in the issue contract's allowed write set. + + Iter-28 B-2: the iter-26 gate compared the literal string + ``"architecture.json"`` against the contract. That bypassed the guard when + the project uses a nested architecture (e.g. ``frontend/architecture.json``) + — the literal check would either falsely allow a write the contract + forbids, or falsely block a write the contract permits. The check now + resolves the ACTUAL ``arch_path`` to a repo-relative POSIX path and + compares that against the contract. + + The gate is bypassed (returns True) when: + - no contract was parsed (``issue_contract`` is None → permissive mode), or + - ``--no-scope-guard`` was passed (``scope_guard`` is False). + + Returns False when ``arch_path`` resolves outside ``project_root`` — that + is treated as out-of-scope by definition (the contract is repo-relative, + so a path outside the repo cannot be allowed by it). + """ + if issue_contract is None or scope_guard is False: + return True + try: + arch_rel = ( + arch_path.resolve().relative_to(project_root.resolve()).as_posix() + ) + except ValueError: + # arch_path resolves outside project_root — by definition not in scope. 
+ return False + return arch_rel in tuple(issue_contract.allowed_paths or ()) + + def _apply_architecture_corrections( arch_path: Path, architecture: List[Dict[str, Any]], @@ -1895,17 +1978,23 @@ def run_agentic_sync( # Iter-26: scope-guard the LLM dependency-correction step. This runs # at the ORCHESTRATOR level — before the runner exists — so the per- # module scope guard cannot catch it. If the issue contract does not - # include ``architecture.json`` in its allowed write set, skip the - # correction so the contract is not silently violated. Pre-existing - # behavior is preserved when no contract was parsed (issue_contract - # is None → permissive mode) or when ``architecture.json`` IS in the - # contract (legitimate architecture-touching PRs). ``--no-scope-guard`` - # (``scope_guard is False``) is treated as an explicit opt-out and - # also bypasses the gate. - arch_in_scope = ( - issue_contract is None - or scope_guard is False - or "architecture.json" in tuple(issue_contract.allowed_paths or ()) + # include the ACTUAL ``arch_path`` (as a repo-relative POSIX path) in + # its allowed write set, skip the correction so the contract is not + # silently violated. Pre-existing behavior is preserved when no contract + # was parsed (``issue_contract`` is None → permissive mode) or when the + # arch path IS in the contract (legitimate architecture-touching PRs). + # ``--no-scope-guard`` (``scope_guard is False``) is treated as an + # explicit opt-out and also bypasses the gate. + # + # Iter-28 B-2: use the resolved ``arch_path`` rather than the literal + # string ``"architecture.json"``. A nested arch path such as + # ``frontend/architecture.json`` would otherwise either bypass the gate + # (when the contract allows ``architecture.json`` literally but the + # actual file is nested) or be incorrectly skipped (when the contract + # allows the nested path explicitly). The actual file being written is + # the source of truth, delegated to :func:`_arch_path_in_scope`. 
+ arch_in_scope = _arch_path_in_scope( + arch_path, project_root, issue_contract, scope_guard ) if not deps_valid and deps_corrections and architecture is not None: if dry_run: diff --git a/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt b/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt index 39ac1dbfb..dbffe7fb5 100644 --- a/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt +++ b/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt @@ -20,6 +20,11 @@ Look at where .pddrc files are located and which subdirectory contains the relev Output the full shell command on a single line, prefixed with `SYNC_CMD:`. The command MUST use `cd && pdd --force sync {basename} --dry-run --agentic --no-steer`. +The command MUST include `--dry-run`. This invocation is a probe to validate +the working directory only — it must never perform a real sync. If `--dry-run` +is omitted, the orchestrator will inject it automatically before executing, +but you should always emit it explicitly so the intent is unambiguous. + SYNC_CMD: cd && pdd --force sync {basename} --dry-run --agentic --no-steer Examples: diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index fda91f139..e6f1f2ae7 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -18,6 +18,7 @@ from pdd.agentic_sync import ( _apply_architecture_corrections, _analyze_global_sync_modules, + _arch_path_in_scope, _architecture_module_basenames, _architecture_sync_modules, _augment_architecture_from_pr_branch, @@ -26,6 +27,7 @@ _detect_modules_from_branch_diff, _filter_already_synced, _find_project_root, + _inject_dry_run_flag, _is_catchall_match, _is_github_issue_url, _is_runtime_llm_template, @@ -40,6 +42,7 @@ run_agentic_sync, run_global_sync, ) +from pdd.agentic_common import IssueContract from pdd.agentic_sync_runner import ( DepGraphFromArchitectureResult, build_dep_graph_from_architecture, @@ -1517,6 +1520,20 @@ def test_durable_mode_uses_durable_runner( "- `architecture.json`\n" ) +# Iter-28 B-2: 
contract allows a NESTED architecture path. Used by the +# nested-arch B-2 tests to assert the gate compares the real ``arch_path`` +# (resolved repo-relative) rather than the literal string ``architecture.json``. +_CONTRACT_BODY_NESTED_ARCH_IN_SCOPE = ( + "Fix foo.\n" + "\n" + "## Split Contract\n" + "\n" + "**Allowed write set:**\n" + "\n" + "- `pdd/foo.py`\n" + "- `frontend/architecture.json`\n" +) + class TestDependencyCorrectionsScopeGuard: """Verify the orchestrator-level scope gate on @@ -1595,6 +1612,7 @@ def test_dependency_corrections_skipped_when_arch_outside_contract( assert "Sync scope guard: skipping LLM dependency corrections" in flat assert "architecture.json is outside" in flat + @patch("pdd.agentic_sync._find_project_root") @patch("pdd.agentic_sync._apply_architecture_corrections") @patch("pdd.agentic_sync.AsyncSyncRunner") @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) @@ -1622,9 +1640,11 @@ def test_dependency_corrections_applied_when_arch_in_contract( mock_filter_synced, mock_runner_cls, mock_apply_corrections, + mock_find_root, tmp_path, ): """Contract includes architecture.json → corrections must run.""" + mock_find_root.return_value = tmp_path arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] mock_apply_corrections.return_value = arch_data @@ -1910,6 +1930,356 @@ def test_already_synced_early_return_does_not_leak_arch_changes( ) assert "architecture.json" not in status.stdout + # ------------------------------------------------------------------ + # Iter-28 B-2: nested arch_path bypass + # ------------------------------------------------------------------ + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + 
"pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_skipped_for_nested_arch_outside_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """Contract allows the literal string ``architecture.json`` but the + REAL arch path is ``frontend/architecture.json``. Iter-28 B-2: the + gate must compare the resolved arch path, not the bare string, so + the nested arch write is rejected.""" + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + # Contract allows root architecture.json only — NOT the nested path. + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + # arch_path resolves nested: frontend/architecture.json under + # tmp_path. The literal-string gate would have matched the contract's + # ``architecture.json`` entry and let the write through; the + # resolved-path gate must NOT. 
+ nested_arch = tmp_path / "frontend" / "architecture.json" + (tmp_path / "frontend").mkdir() + mock_load_arch.return_value = (arch_data, nested_arch) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_not_called() + + @patch("pdd.agentic_sync._find_project_root") + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_applied_for_nested_arch_in_contract( + self, + mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + mock_find_root, + tmp_path, + ): + """Contract explicitly allows ``frontend/architecture.json`` and the + arch path 
matches → gate must permit the write.""" + mock_find_root.return_value = tmp_path + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + mock_apply_corrections.return_value = arch_data + + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_NESTED_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + nested_arch = tmp_path / "frontend" / "architecture.json" + (tmp_path / "frontend").mkdir() + mock_load_arch.return_value = (arch_data, nested_arch) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_called_once() + + @patch("pdd.agentic_sync._apply_architecture_corrections") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) + @patch("pdd.agentic_sync._detect_modules_from_branch_diff", return_value=[]) + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch( + "pdd.agentic_sync.build_dep_graph_from_architecture_data", + return_value=DepGraphFromArchitectureResult({"foo": []}, []), + ) + @patch("pdd.agentic_sync.load_prompt_template", return_value="template {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_dependency_corrections_skipped_for_arch_outside_project_root( + self, 
+ mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_load_prompt, + mock_build_graph, + mock_dry_run, + mock_branch_diff, + mock_filter_synced, + mock_runner_cls, + mock_apply_corrections, + tmp_path, + ): + """``arch_path`` resolves outside ``project_root`` → never in scope. + + Defense-in-depth: even if some upstream bug threads an arch path + outside the repo root into the orchestrator, ``_arch_path_in_scope`` + catches the ``ValueError`` from ``relative_to`` and returns False so + the write is refused. + """ + arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] + issue_data = { + "title": "Fix foo", + "body": _CONTRACT_BODY_ARCH_IN_SCOPE, + "comments_url": "", + } + mock_gh_cmd.return_value = (True, json.dumps(issue_data)) + # Force an arch path that resolves OUTSIDE project_root. + outside_arch = (tmp_path.parent / "outside_root" / "architecture.json").resolve() + outside_arch.parent.mkdir(parents=True, exist_ok=True) + mock_load_arch.return_value = (arch_data, outside_arch) + mock_agentic_task.return_value = ( + True, + ( + 'MODULES_TO_SYNC: ["foo"]\n' + "DEPS_VALID: false\n" + 'DEPS_CORRECTIONS: [{"filename": "foo_python.prompt", "dependencies": []}]' + ), + 0.05, + "anthropic", + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + + mock_runner = MagicMock() + mock_runner.run.return_value = (True, "All 1 modules synced successfully", 0.10) + mock_runner_cls.return_value = mock_runner + + success, _msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + mock_apply_corrections.assert_not_called() + + +# --------------------------------------------------------------------------- +# _arch_path_in_scope (iter-28 B-2 helper, unit-level) +# --------------------------------------------------------------------------- + + +class TestArchPathInScope: + """Unit-level coverage of the resolved-path scope check used by the + iter-26 orchestrator 
gate, post-iter-28 B-2.""" + + @staticmethod + def _contract(*allowed: str) -> IssueContract: + return IssueContract( + allowed_paths=tuple(allowed), + companion_allowlist=(), + source="test", + ) + + def test_no_contract_permissive(self, tmp_path): + """No contract → always in scope (pre-iter-26 behavior preserved).""" + assert _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=None, + scope_guard=True, + ) + + def test_scope_guard_disabled_bypasses_check(self, tmp_path): + """``--no-scope-guard`` → always in scope, contract ignored.""" + contract = self._contract("pdd/foo.py") # arch NOT in contract + assert _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=False, + ) + + def test_literal_arch_in_contract_match(self, tmp_path): + """Root arch + contract allows ``architecture.json`` → in scope.""" + contract = self._contract("pdd/foo.py", "architecture.json") + assert _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_nested_arch_with_literal_contract_rejected(self, tmp_path): + """Nested arch + contract only allows literal ``architecture.json`` + → out of scope (the iter-28 B-2 fix).""" + contract = self._contract("pdd/foo.py", "architecture.json") + assert not _arch_path_in_scope( + tmp_path / "frontend" / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_nested_arch_with_nested_contract_match(self, tmp_path): + """Nested arch + contract names the same nested path → in scope.""" + contract = self._contract("pdd/foo.py", "frontend/architecture.json") + assert _arch_path_in_scope( + tmp_path / "frontend" / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_arch_outside_project_root_rejected(self, tmp_path): + """``arch_path`` outside the repo → out of scope (ValueError → False).""" + contract = 
self._contract("architecture.json") + outside = (tmp_path.parent / "outside_root" / "architecture.json").resolve() + outside.parent.mkdir(parents=True, exist_ok=True) + assert not _arch_path_in_scope( + outside, + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + def test_empty_contract_allowed_paths_rejected(self, tmp_path): + """Empty ``allowed_paths`` tuple → no path is in scope.""" + contract = self._contract() # ``allowed_paths=()`` + assert not _arch_path_in_scope( + tmp_path / "architecture.json", + tmp_path, + issue_contract=contract, + scope_guard=True, + ) + + +# --------------------------------------------------------------------------- +# _inject_dry_run_flag (iter-28 B-1 helper, unit-level) +# --------------------------------------------------------------------------- + + +class TestInjectDryRunFlag: + """Unit-level coverage of the ``--dry-run`` injector used by the + LLM dry-run fallback executor.""" + + def test_injects_after_sync_token(self): + assert _inject_dry_run_flag("pdd sync foo") == "pdd sync --dry-run foo" + + def test_idempotent_when_dry_run_already_present(self): + assert ( + _inject_dry_run_flag("pdd sync foo --dry-run") + == "pdd sync foo --dry-run" + ) + + def test_injects_into_cd_chained_command(self): + assert ( + _inject_dry_run_flag("cd subdir && pdd sync foo") + == "cd subdir && pdd sync --dry-run foo" + ) + + def test_does_not_match_sync_architecture(self): + # ``pdd sync-architecture`` is a different subcommand; the regex + # lookahead refuses to match so injection leaves the command alone. + assert ( + _inject_dry_run_flag("pdd sync-architecture") + == "pdd sync-architecture" + ) + + def test_does_not_match_synchronize(self): + assert _inject_dry_run_flag("pdd synchronize") == "pdd synchronize" + + def test_injects_at_end_of_command(self): + # ``pdd sync`` with no trailing arg → injection still lands. 
+ assert _inject_dry_run_flag("pdd sync") == "pdd sync --dry-run" + + def test_injects_before_ampersand_chain(self): + assert ( + _inject_dry_run_flag("pdd sync && echo done") + == "pdd sync --dry-run && echo done" + ) + # --------------------------------------------------------------------------- # _resolve_module_cwd @@ -2478,6 +2848,188 @@ def test_dry_run_success_rejects_changed_no_self_include_prompt_contract( assert "includes no existing module source context" in errors[0] +# --------------------------------------------------------------------------- +# _llm_fix_dry_run_failure --dry-run injection (iter-28 B-1) +# --------------------------------------------------------------------------- + + +class TestLlmFixDryRunInjection: + """Verify that LLM-suggested sync commands always get ``--dry-run``. + + Iter-28 B-1: the orchestrator-level dry-run LLM fallback executes the + LLM's suggested command via shell. If the LLM forgets ``--dry-run`` the + command would perform a real sync write before the scope-guarded runner + exists. The orchestrator must inject ``--dry-run`` (and refuse to execute + if injection cannot land) to keep the probe non-destructive. 
+ """ + + @staticmethod + def _llm_response(cmd: str) -> str: + return f"SYNC_CMD: {cmd}\n" + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def test_llm_fix_dry_run_injects_dry_run_flag( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """LLM omits ``--dry-run`` → orchestrator injects it before exec.""" + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + self._llm_response("pdd sync foo"), + 0.01, + "anthropic", + ) + mock_subprocess.return_value = MagicMock( + returncode=0, + stdout=f"__PDD_EFFECTIVE_CWD__\n{tmp_path}\n", + stderr="", + ) + + ok, cwd, cost, err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + assert ok is True + assert cwd == tmp_path.resolve() + assert err == "" + # Captured subprocess command must contain the injected flag in the + # correct token position (right after ``sync``). 
+ executed_cmd = mock_subprocess.call_args[0][0] + assert "pdd sync --dry-run foo" in executed_cmd + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def test_llm_fix_dry_run_preserves_existing_dry_run_flag( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """LLM already includes ``--dry-run`` → command unchanged.""" + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + self._llm_response("pdd sync foo --dry-run"), + 0.01, + "anthropic", + ) + mock_subprocess.return_value = MagicMock( + returncode=0, + stdout=f"__PDD_EFFECTIVE_CWD__\n{tmp_path}\n", + stderr="", + ) + + ok, _cwd, _cost, _err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + assert ok is True + executed_cmd = mock_subprocess.call_args[0][0] + # Only one ``--dry-run`` appears in the executed command (the LLM's + # own copy was preserved verbatim, no second copy was injected). 
+ assert executed_cmd.count("--dry-run") == 1 + assert "pdd sync foo --dry-run" in executed_cmd + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def test_llm_fix_dry_run_handles_cd_chained_command( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """LLM emits ``cd subdir && pdd sync foo`` → injection still lands.""" + subdir = tmp_path / "subdir" + subdir.mkdir() + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + self._llm_response("cd subdir && pdd sync foo"), + 0.01, + "anthropic", + ) + mock_subprocess.return_value = MagicMock( + returncode=0, + stdout=f"__PDD_EFFECTIVE_CWD__\n{subdir}\n", + stderr="", + ) + + ok, _cwd, _cost, _err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + assert ok is True + executed_cmd = mock_subprocess.call_args[0][0] + assert "cd subdir && pdd sync --dry-run foo" in executed_cmd + + @patch("pdd.agentic_sync.subprocess.run") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync.load_prompt_template") + def test_llm_fix_dry_run_does_not_inject_into_sync_architecture( + self, + mock_load_prompt, + mock_agentic_task, + mock_subprocess, + tmp_path, + ): + """``pdd sync-architecture`` is a different subcommand — must not match. + + The regex (``\\bpdd\\s+sync`` with whitespace/end-of-string lookahead) + deliberately rejects ``pdd sync-architecture`` so the injection cannot + produce ``pdd sync --dry-run-architecture``. The downstream paranoia + check then refuses to execute the un-injected command, since the + scope-guarded runner does not exist at this point. 
+ """ + mock_load_prompt.return_value = ( + "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" + ) + mock_agentic_task.return_value = ( + True, + self._llm_response("pdd sync-architecture"), + 0.01, + "anthropic", + ) + + ok, cwd, _cost, err = _llm_fix_dry_run_failure( + basename="foo", + project_root=tmp_path, + dry_run_error="prompt not found", + quiet=True, + ) + + # Paranoia check must reject the command outright. + assert ok is False + assert cwd is None + assert "--dry-run" in err + # And subprocess.run must NEVER be invoked for a non-injectable cmd. + mock_subprocess.assert_not_called() + + # --------------------------------------------------------------------------- # _filter_already_synced # --------------------------------------------------------------------------- From 1f9fdc0ccc06659d911905c6dc3c42255fbeb7a0 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 14:52:17 -0700 Subject: [PATCH 36/42] fix(sync): iter-30 unified orchestrator scope guard (Option A) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Closes the entire class of pre-dispatch contract bypasses that iter-26/-28/-29 surfaced one site at a time. The previous fixes each gated a single orchestrator write site; codex kept finding new ones (architecture corrections → LLM-suggested shell cmd → nested arch path → write-capable identify-modules → free-form shell). The structural issue: run_agentic_sync does write-capable work BEFORE the runner exists, so AsyncSyncRunner._enforce_scope_guard cannot see those writes. PART 1 — Replace LLM shell execution with safe argv. iter-28's --dry-run injection was the wrong shape: shell=True with an LLM-provided string is the actual wound, and injection doesn't stop `rm`, redirects, or chained writes (codex iter-29 B-2). Rewrite the prompt template (agentic_sync_fix_dry_run_LLM.prompt) to ask for SYNC_CWD: only — the LLM identifies the directory, nothing else. 
Parse and validate (no shell metachars; resolves under project_root). Build the argv ourselves: [pdd_exe, --force, sync, basename, --dry-run, --agentic, --no-steer]. Run with shell=False. LLM-controlled shell execution is gone by construction.

Legacy SYNC_CMD: format produces an explicit migration error so stale cached responses surface a clear retry hint. _inject_dry_run_flag helper deleted; it has no surface anymore.

PART 2: Orchestrator-level scope guard.

Add _enforce_orchestrator_scope in agentic_sync.py. Snapshot the working tree at run_agentic_sync entry: _git_changed_paths + _git_ignored_paths with SHA fingerprints (iter-24 shape, shared via new _hash_baseline_paths helper in agentic_sync_runner.py). At every early-return BEFORE runner dispatch, revert anything outside the contract + companion allowlist + baseline. Same primitives the per-module guard uses: _revert_out_of_scope_changes (tracked), revert_out_of_scope_changes_with_dirs (untracked), iter-9 fail-closed re-scan for unrecovered paths, iter-14 anchored companion matcher, iter-24 SHA-aware baseline preservation.

9 early-return sites wrapped via _orch_scope_check_return helper. Dispatch sites (2164/2166) intentionally not wrapped; the runner's own guard handles enforcement once it exists. Gated on (scope_guard AND issue_contract is not None) so non-contract runs and --no-scope-guard opt-out pay zero cost.

iter-26's _arch_path_in_scope gate retained as defense-in-depth: the orchestrator guard would also catch out-of-contract arch.json writes, but the gate prevents them in the first place, which reads better in operator output.

Tests: 6 unit cases on _enforce_orchestrator_scope (revert/preserve/clobber/permissive/opt-out/companion auto-allow); 5 integration cases on run_agentic_sync with the new wrap; 4 cases for the new safe argv path. iter-28's TestInjectDryRunFlag and TestLlmFixDryRunInjection deleted (the helper they exercised is gone). 703 passed across the scope-guard suite.
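
Illustration only: a minimal sketch of the PART 1 flow, assuming simplified behavior. The names validate_sync_cwd and build_dry_run_argv are hypothetical and do not appear in this patch; the real logic lives inline in _llm_fix_dry_run_failure, and quote stripping here is looser than the patch's matched-pair check.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Characters refused in an LLM-supplied cwd (mirrors the set described above).
FORBIDDEN_CHARS: Tuple[str, ...] = (
    ";", "&", "|", "<", ">", "`", "$", "(", ")", "\n", "\r",
)


def validate_sync_cwd(raw_cwd: str, project_root: Path) -> Tuple[Optional[Path], str]:
    """Hypothetical helper: return (resolved_cwd, "") or (None, reason)."""
    # Simplification: strip any surrounding quote characters.
    value = raw_cwd.strip().strip("'\"").strip()
    if not value:
        return None, "empty SYNC_CWD value"
    for ch in FORBIDDEN_CHARS:
        if ch in value:
            return None, f"forbidden character {ch!r}"
    candidate = Path(value)
    if not candidate.is_absolute():
        candidate = project_root / candidate
    resolved = candidate.resolve()
    try:
        # Raises ValueError when the path escapes the project root.
        resolved.relative_to(project_root.resolve())
    except ValueError:
        return None, "resolves outside project root"
    if not resolved.is_dir():
        return None, "not a directory"
    return resolved, ""


def build_dry_run_argv(pdd_exe: str, basename: str) -> List[str]:
    """Hypothetical helper: fixed probe argv, meant to be run with shell=False."""
    return [pdd_exe, "--force", "sync", basename,
            "--dry-run", "--agentic", "--no-steer"]
```

Under these assumptions the caller would pass the list to subprocess.run(argv, cwd=str(resolved), shell=False), so no shell ever parses the LLM-supplied value.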
Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync.py | 593 +++++++++++++++--- pdd/agentic_sync_runner.py | 20 + .../agentic_sync_fix_dry_run_LLM.prompt | 23 +- tests/test_agentic_sync.py | 569 ++++++++++++++--- 4 files changed, 1004 insertions(+), 201 deletions(-) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index 76c74e878..6ceee9477 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -16,7 +16,7 @@ import subprocess import sys from pathlib import Path -from typing import Any, Dict, List, NamedTuple, Optional, Tuple +from typing import Any, Dict, Iterable, List, NamedTuple, Optional, Tuple from rich.console import Console @@ -24,14 +24,22 @@ from .agentic_common import ( DEFAULT_SYNC_COMPANION_ALLOWLIST, IssueContract, + _is_valid_companion_pattern, + _matches_companion_pattern_anchored, + _revert_out_of_scope_changes, parse_issue_contract, run_agentic_task, ) +from .agentic_common_worktree import revert_out_of_scope_changes_with_dirs from .agentic_sync_runner import ( AsyncSyncRunner, _architecture_entry_aliases, _basename_from_architecture_filename, _find_pdd_executable, + _git_changed_paths, + _git_ignored_paths, + _hash_baseline_paths, + _hash_file, build_dep_graph_from_architecture_data, ) from .durable_sync_runner import DurableSyncRunner @@ -1191,31 +1199,15 @@ def _run_single_dry_run( return False, str(e) -# Matches a ``pdd sync`` invocation as two whitespace-separated tokens (NOT -# ``pdd sync-architecture`` or ``pdd synchronize``). The lookahead requires -# whitespace or end-of-string after ``sync`` — ``\b`` between ``c`` and ``-`` -# would otherwise match ``sync-*`` subcommands. Used to inject ``--dry-run`` -# into LLM-suggested sync commands before they are executed. -_PDD_SYNC_INVOCATION_RE = re.compile(r"(\bpdd\s+sync)(?=\s|$)") - - -def _inject_dry_run_flag(cmd: str) -> str: - """Inject ``--dry-run`` into every ``pdd sync`` invocation in *cmd*. 
- - Iter-28 B-1: the LLM-suggested command is supposed to be a dry-run probe - to validate cwd, but the LLM could omit ``--dry-run`` and cause a real - sync write before the scope-guarded runner exists. This injection is the - authoritative safety mechanism — the prompt template also instructs the - LLM to include it, but we cannot rely on prompt compliance alone. - - Only matches ``pdd sync`` as two whitespace-separated tokens; subcommands - like ``pdd sync-architecture`` are intentionally left untouched (and - rejected downstream by the paranoia check in - :func:`_llm_fix_dry_run_failure`). - """ - if "--dry-run" in cmd: - return cmd - return _PDD_SYNC_INVOCATION_RE.sub(r"\1 --dry-run", cmd) +# Iter-30 B-2: shell metacharacters disallowed in the LLM-supplied +# ``SYNC_CWD`` value. We build the argv ourselves and pass ``shell=False`` +# so injection via the cwd string is impossible by construction, but reject +# these characters anyway as defense-in-depth — a path containing them is +# almost certainly the LLM ignoring the new prompt shape and trying to send +# a shell fragment. +_SYNC_CWD_FORBIDDEN_CHARS: Tuple[str, ...] = ( + ";", "&", "|", "<", ">", "`", "$", "(", ")", "\n", "\r", +) def _llm_fix_dry_run_failure( @@ -1226,7 +1218,16 @@ def _llm_fix_dry_run_failure( verbose: bool = False, reasoning_time: Optional[float] = None, ) -> Tuple[bool, Optional[Path], float, str]: - """Ask the LLM to suggest the correct cwd/command when dry-run fails. + """Ask the LLM to suggest the correct cwd when dry-run fails. + + Iter-30 B-2 replaces the prior ``shell=True`` exec of an LLM-supplied + string with a hardened approach: the LLM only identifies a working + directory (``SYNC_CWD: ``) and the orchestrator + builds the ``pdd --force sync --dry-run --agentic --no-steer`` + argv itself, then executes with ``shell=False``. 
This closes iter-29 B-2 + (LLM shell injection at the orchestrator level) — ``shell=True`` with an + LLM-provided string allowed ``rm``/redirects/chained writes that the + iter-28 ``--dry-run`` flag injection could not block. Returns: Tuple of (success, suggested_cwd_or_None, llm_cost, error_msg). @@ -1289,88 +1290,139 @@ def _llm_fix_dry_run_failure( if not llm_success: return False, None, llm_cost, f"LLM failed to suggest fix: {llm_output}" - # Parse SYNC_CMD from response - cmd_match = re.search(r"SYNC_CMD:\s*(.+)", llm_output) - if not cmd_match: - return False, None, llm_cost, "LLM response did not contain SYNC_CMD marker" - - suggested_cmd = cmd_match.group(1).strip() - - # Safety: reject commands that don't look like a pdd sync invocation - if "pdd" not in suggested_cmd or "sync" not in suggested_cmd: - return False, None, llm_cost, f"LLM suggested unexpected command: {suggested_cmd}" - - # B-1 (iter-28): force ``--dry-run`` onto the LLM-suggested command. The - # probe is supposed to validate cwd only — never perform a real sync. If - # the LLM omits ``--dry-run`` we inject it; if injection cannot land - # (e.g. ``pdd sync-architecture`` which intentionally is not matched by - # :data:`_PDD_SYNC_INVOCATION_RE`) the paranoia check below refuses to - # execute. The scope-guarded runner does not yet exist at this point, so - # there is no fallback enforcement. - safe_cmd = _inject_dry_run_flag(suggested_cmd) - if "--dry-run" not in safe_cmd: + # Iter-30: explicitly reject the legacy ``SYNC_CMD:`` shape so a stale + # cached LLM response surfaces a clear migration error rather than a + # vague "no SYNC_CWD marker" message. The new prompt only asks for the + # cwd; if the response carries the old shell-command shape the LLM is + # acting on a prior cached version of the prompt and must be re-asked + # with the iter-30 wording. 
+ if "SYNC_CWD:" not in llm_output and "SYNC_CMD:" in llm_output: + return ( + False, + None, + llm_cost, + ( + "LLM returned legacy ``SYNC_CMD:`` format. The orchestrator now " + "builds the sync argv itself and only expects ``SYNC_CWD: `` " + "from the LLM. Re-run; this usually clears after one retry." + ), + ) + + cwd_match = re.search(r"SYNC_CWD:\s*(.+)", llm_output) + if not cwd_match: + return False, None, llm_cost, "LLM response did not contain SYNC_CWD marker" + + raw_cwd = cwd_match.group(1).strip() + if not raw_cwd: + return False, None, llm_cost, "LLM returned an empty SYNC_CWD value" + + # Strip surrounding quotes the LLM may emit. + if (raw_cwd.startswith('"') and raw_cwd.endswith('"')) or ( + raw_cwd.startswith("'") and raw_cwd.endswith("'") + ): + raw_cwd = raw_cwd[1:-1].strip() + if not raw_cwd: + return False, None, llm_cost, "LLM returned an empty SYNC_CWD value" + + # Defense-in-depth: reject any shell metacharacter. We pass ``shell=False`` + # downstream so injection through the cwd is structurally impossible, but + # a path containing these characters is almost certainly the LLM trying + # to smuggle a shell fragment past the new prompt shape. + for ch in _SYNC_CWD_FORBIDDEN_CHARS: + if ch in raw_cwd: + return ( + False, + None, + llm_cost, + ( + f"LLM SYNC_CWD value contains forbidden character " + f"{ch!r}; refusing to execute: {raw_cwd!r}" + ), + ) + + # Resolve relative-to-project-root or absolute paths. Both are accepted + # so long as the resolved location lives under ``project_root``. 
+ candidate = Path(raw_cwd) + if not candidate.is_absolute(): + candidate = project_root / candidate + try: + resolved_cwd = candidate.resolve() + except (OSError, RuntimeError) as exc: + return ( + False, + None, + llm_cost, + f"Failed to resolve SYNC_CWD path {raw_cwd!r}: {exc}", + ) + + project_root_resolved = project_root.resolve() + try: + resolved_cwd.relative_to(project_root_resolved) + except ValueError: return ( False, None, llm_cost, ( - "LLM suggested command does not contain a ``pdd sync`` invocation " - "where ``--dry-run`` could be injected; refusing to execute: " - f"{suggested_cmd}" + f"SYNC_CWD resolves outside project root " + f"({resolved_cwd} not under {project_root_resolved}); " + "refusing to execute." ), ) - # Append a pwd marker after the command so we can extract the effective cwd. - # This avoids fragile regex parsing of cd segments from the command string. - pwd_marker = "__PDD_EFFECTIVE_CWD__" - augmented_cmd = f"{safe_cmd} && echo {pwd_marker} && pwd" + if not resolved_cwd.is_dir(): + return ( + False, + None, + llm_cost, + f"SYNC_CWD does not resolve to a directory: {resolved_cwd}", + ) + + # Build the argv ourselves — the LLM never sees or supplies a shell line. + pdd_exe = _find_pdd_executable() + if pdd_exe: + cmd: List[str] = [pdd_exe] + else: + cmd = [sys.executable, "-m", "pdd"] + cmd.extend( + ["--force", "sync", basename, "--dry-run", "--agentic", "--no-steer"] + ) - # Run the suggested command directly via shell from project root. - # This handles relative cd paths, chained cd's, etc. naturally. 
try: result = subprocess.run( - augmented_cmd, - shell=True, - cwd=str(project_root), + cmd, + cwd=str(resolved_cwd), + shell=False, capture_output=True, text=True, timeout=60, env={**os.environ, "PDD_FORCE": "1", "CI": "1"}, ) except subprocess.TimeoutExpired: - return False, None, llm_cost, f"LLM suggested command timed out: {suggested_cmd}" - except Exception as e: - return False, None, llm_cost, f"Failed to run LLM suggested command: {e}" - - if result.returncode == 0: - # Extract effective cwd from the pwd output after our marker - stdout_lines = result.stdout.strip().splitlines() - effective_cwd = project_root.resolve() - for i, line in enumerate(stdout_lines): - if line.strip() == pwd_marker and i + 1 < len(stdout_lines): - effective_cwd = Path(stdout_lines[i + 1].strip()).resolve() - break - - # Validate resolved cwd is within project root - try: - effective_cwd.relative_to(project_root.resolve()) - except ValueError: - return ( - False, - None, - llm_cost, - f"LLM command resolves outside project root: {suggested_cmd}", - ) - - return True, effective_cwd, llm_cost, "" - else: - err_output = result.stderr or result.stdout or f"Exit code {result.returncode}" return ( False, None, llm_cost, - f"LLM suggested command failed: {err_output[:500]}", + f"LLM-suggested cwd dry-run timed out at {resolved_cwd}", ) + except Exception as e: # pragma: no cover — defensive + return ( + False, + None, + llm_cost, + f"Failed to run dry-run probe at {resolved_cwd}: {e}", + ) + + if result.returncode == 0: + return True, resolved_cwd, llm_cost, "" + + err_output = result.stderr or result.stdout or f"Exit code {result.returncode}" + return ( + False, + None, + llm_cost, + f"LLM-suggested cwd dry-run failed at {resolved_cwd}: {err_output[:500]}", + ) def _run_dry_run_validation( @@ -1593,6 +1645,302 @@ def _extract_allowed_write_paths(issue_text: str) -> List[str]: return list(contract.allowed_paths) if contract is not None else [] +def _enforce_orchestrator_scope( + 
project_root: Path, + issue_contract: Optional[IssueContract], + scope_guard: bool, + baseline_changed: Dict[str, Optional[str]], + baseline_ignored: Dict[str, Optional[str]], + *, + quiet: bool = False, +) -> Optional[str]: + """Iter-30: scope-guard the orchestrator's pre-dispatch write surface. + + Reverts any working-tree changes that fall outside the issue contract's + allowed write set + companion allowlist + the baseline of pre-existing + files captured at orchestrator entry. Returns ``None`` when the working + tree is clean (within scope), else returns a multi-line diagnostic + string describing what was reverted and what (if anything) could not be + reverted. + + Permissive when ``issue_contract is None`` or ``scope_guard is False`` — + returns ``None`` unconditionally, matching the spec for the per-module + scope guard in :class:`AsyncSyncRunner._enforce_scope_guard`. + + Why this exists (iter-29 → iter-30 promotion): + The per-module scope guard in :class:`AsyncSyncRunner` only runs + AFTER the runner is constructed and only for per-module sync + subprocesses. The orchestrator does write-capable work BEFORE the + runner exists — LLM module identification (write-capable + :func:`run_agentic_task`), LLM dry-run-fix subprocesses, + :func:`_apply_architecture_corrections`, etc. Each pre-dispatch + write site is unguarded by the per-module scope guard, so iter-26, + iter-28 and iter-29 each kept finding new orchestrator-level + bypasses. This unified guard runs at every orchestrator return site + between the baseline snapshot and the runner dispatch, replacing the + per-site patches. + + Args: + project_root: Repo root the orchestrator is operating against + (always the user's main checkout, in both async and durable + mode — the orchestrator does not branch to worktrees). + issue_contract: Parsed contract for the issue, or ``None`` when no + structured contract was found (permissive mode). 
+ scope_guard: ``True`` when the runner-level scope guard is enabled + (CLI default). ``False`` when the user passed ``--no-scope-guard``. + ``False`` short-circuits to a no-op for the same reason the + per-module guard does. + baseline_changed: Snapshot of working-tree changes (``git status + --porcelain --untracked-files=all``) at orchestrator entry, + mapping repo-relative POSIX paths to init-time SHA-1. Only + byte-identical content is auto-allowed (iter-24); divergent + SHAs fall through to the contract check so a clobber surfaces. + baseline_ignored: Snapshot of gitignored paths (``git ls-files + --others --ignored --exclude-standard``) at orchestrator entry, + same SHA-aware preservation rule as ``baseline_changed`` + (iter-20 + iter-24). + quiet: When True, suppress the stderr echo of the diagnostic. + + Returns: + ``None`` when working tree is clean within the contract OR when + enforcement is disabled (permissive mode, or ``--no-scope-guard``). + Diagnostic string otherwise — callers prepend it to the return + message so the user sees what was reverted. + """ + if not scope_guard or issue_contract is None: + return None + + # Build the allowed-files set. Same shape as + # ``AsyncSyncRunner._enforce_scope_guard`` (iter-14 anchored matcher + + # iter-24 SHA-aware baseline preservation), specialised for the + # orchestrator's project_root cwd: there is no per-module ``module_cwd`` + # to anchor the rglob/companion matcher against, so the orchestrator + # uses the repo root as both ``repo_root`` and ``module_cwd`` — every + # pre-dispatch write the orchestrator can produce is repo-rooted. 
+ repo_root = project_root.resolve() + + allowed_paths_iter = tuple(issue_contract.allowed_paths or ()) + allowed_files: set[Path] = set() + for rel in allowed_paths_iter: + if not rel: + continue + allowed_files.add((repo_root / rel).resolve()) + + # Companion allowlist union with DEFAULT — mirrors the iter-26 gate + # in :func:`_arch_path_in_scope`'s neighbour and the runner's own + # ``__init__`` union. Order preserved for log determinism. + allowlist: Tuple[str, ...] = tuple( + dict.fromkeys( + tuple(issue_contract.companion_allowlist or ()) + + tuple(DEFAULT_SYNC_COMPANION_ALLOWLIST) + ) + ) + + # rglob for currently-on-disk companion files. The orchestrator scope + # guard cannot rely on a single module_cwd, so the scan is repo-wide + # — same union as the per-module guard, just with a wider net. + for path in repo_root.rglob("*"): + if not path.is_file(): + continue + try: + rel_posix = path.resolve().relative_to(repo_root).as_posix() + except ValueError: + continue + if _matches_companion(rel_posix, allowlist): + allowed_files.add(path.resolve()) + + # Also pick up companion-shaped tracked deletions (sync legitimately + # removes ``.pdd/meta/foo_python.json`` when a module is renamed; the + # revert helper would otherwise resurrect it). + for rel_posix in _git_changed_paths(repo_root): + absolute = (repo_root / rel_posix).resolve() + if _matches_companion(rel_posix, allowlist): + allowed_files.add(absolute) + + # Iter-24 SHA-aware preservation of pre-existing changed baseline. + for rel_posix, baseline_hash in baseline_changed.items(): + current_hash = _hash_file(repo_root, rel_posix) + if current_hash is None: + # File was deleted after baseline. Let revert helpers decide. + continue + if baseline_hash is None or current_hash == baseline_hash: + # Unreadable at snapshot (preserve by name) or unchanged content + # → preserve. 
+ allowed_files.add((repo_root / rel_posix).resolve()) + + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) + untracked_reverted = revert_out_of_scope_changes_with_dirs( + repo_root, allowed_dirs=set(), allowed_files=allowed_files + ) + + seen: set[str] = set() + offending: List[str] = [] + for path in list(tracked_reverted) + list(untracked_reverted): + try: + rel = Path(path).resolve().relative_to(repo_root).as_posix() + except ValueError: + rel = str(path) + if rel in seen: + continue + if (repo_root / rel).resolve() in allowed_files: + continue + seen.add(rel) + offending.append(rel) + + # Iter-9 fail-closed re-scan: either revert helper can fail silently + # and return []. We re-scan the working tree after revert to be sure + # the contract is now satisfied; anything still on disk that is not + # allowed becomes the "unrecovered" set. + remaining_raw = _orchestrator_remaining_out_of_scope_paths( + repo_root, allowed_files, baseline_ignored + ) + offending_set = set(offending) + remaining = [p for p in remaining_raw if p not in offending_set] + + if not offending and not remaining: + return None + + source = issue_contract.source or "" + allowed_lines = ( + "\n".join(f" - {p}" for p in sorted(allowed_paths_iter)) + or " - " + ) + companion_lines = ( + "\n".join(f" - {p}" for p in allowlist) or " - " + ) + if offending: + offending_lines = "\n".join(f" - {p}" for p in offending) + header = ( + f"Orchestrator scope guard reverted {len(offending)} " + f"out-of-scope file(s) before runner dispatch " + f"(contract source: {source}):\n{offending_lines}" + ) + else: + header = ( + "Orchestrator scope guard detected out-of-scope artifacts " + f"before runner dispatch (contract source: {source}) but the " + "revert helpers reported no successful reverts." 
+ ) + + parts = [header] + if remaining: + unrecovered_lines = "\n".join(f" - {p}" for p in remaining) + parts.append( + "Unrecovered (revert failed, manual cleanup required):\n" + f"{unrecovered_lines}" + ) + parts.append(f"Allowed write set:\n{allowed_lines}") + parts.append(f"Companion allowlist:\n{companion_lines}") + diagnostic = "\n".join(parts) + + if not quiet: + # F8 parity with the per-module scope guard: echo diagnostic to + # stderr immediately so the user sees what was reverted before + # the orchestrator's combined return message appears. + print(diagnostic, file=sys.stderr) + return diagnostic + + +def _matches_companion(rel_posix: str, allowlist: Iterable[str]) -> bool: + """Anchored companion-allowlist match for orchestrator scope guard. + + Iter-30: thin wrapper around the iter-14 module-relative anchored + matcher used by :class:`AsyncSyncRunner._enforce_scope_guard`. Kept + local to the orchestrator so the orchestrator and the per-module + guard share semantics without the orchestrator importing the runner's + instance method. + """ + for pattern in allowlist: + if not pattern: + continue + if not _is_valid_companion_pattern(pattern): + continue + if _matches_companion_pattern_anchored(rel_posix, pattern): + return True + return False + + +def _orchestrator_remaining_out_of_scope_paths( + repo_root: Path, + allowed_files: set[Path], + baseline_ignored: Dict[str, Optional[str]], +) -> List[str]: + """Iter-30 fail-closed re-scan for the orchestrator scope guard. + + Re-runs the iter-9 + iter-20 + iter-24 re-scan logic from + :meth:`AsyncSyncRunner._remaining_out_of_scope_paths` but parameterised + on the orchestrator's snapshots (the runner uses its own instance + state). Returns the sentinel ``[""]`` when either + of the underlying git probes fails so the orchestrator surfaces the + failure rather than silently passing. 
+ """ + try: + result = subprocess.run( + ["git", "-C", str(repo_root), "status", + "--porcelain", "--untracked-files=all"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if result.returncode != 0: + return [""] + + remaining: set[str] = set() + for line in result.stdout.splitlines(): + if len(line) < 4: + continue + payload = line[3:].strip() + if not payload: + continue + if " -> " in payload: + old_raw, new_raw = payload.split(" -> ", 1) + entry_paths = [old_raw.strip().strip('"'), + new_raw.strip().strip('"')] + else: + entry_paths = [payload.strip('"')] + for rel in entry_paths: + rel = rel.strip() + if rel.startswith("./"): + rel = rel[2:] + if not rel: + continue + absolute = (repo_root / rel).resolve() + if absolute in allowed_files: + continue + remaining.add(rel) + + try: + ignored_result = subprocess.run( + ["git", "-C", str(repo_root), "ls-files", + "--others", "--ignored", "--exclude-standard"], + capture_output=True, text=True, timeout=30, + ) + except (subprocess.TimeoutExpired, FileNotFoundError, OSError): + return [""] + if ignored_result.returncode != 0: + return [""] + + for line in ignored_result.stdout.splitlines(): + rel = line.strip().strip('"') + if rel.startswith("./"): + rel = rel[2:] + if not rel: + continue + if rel in baseline_ignored: + baseline_hash = baseline_ignored[rel] + current_hash = _hash_file(repo_root, rel) + if current_hash is not None and ( + baseline_hash is None or current_hash == baseline_hash + ): + continue + absolute = (repo_root / rel).resolve() + if absolute in allowed_files: + continue + remaining.add(rel) + + return sorted(remaining) + + def _arch_path_in_scope( arch_path: Path, project_root: Path, @@ -1842,6 +2190,59 @@ def run_agentic_sync( if not quiet: console.print("[yellow]No architecture.json found, falling back to include-based dependency graph[/yellow]") + # Iter-30: orchestrator-level scope guard baseline snapshot. 
The + # per-module scope guard in :class:`AsyncSyncRunner` only enforces AFTER + # the runner is constructed, so any pre-dispatch LLM call or shell + # command in this orchestrator can produce out-of-contract writes that + # the per-module guard never sees. We snapshot the changed + ignored + # working tree now, run the orchestrator's pre-dispatch work, then + # invoke :func:`_enforce_orchestrator_scope` at every early return so + # any orchestrator-level violation is reverted before the function + # exits. Gated on ``scope_guard AND issue_contract is not None`` so + # non-contract runs and explicit ``--no-scope-guard`` opt-outs skip the + # baseline cost (mirrors the per-module guard's gate in + # :class:`AsyncSyncRunner.__init__`). The orchestrator's cwd is the + # user's main checkout in BOTH async and durable mode — durable mode + # only branches to worktrees inside :class:`DurableSyncRunner.run`, + # which runs after this guard is no longer relevant. + if scope_guard and issue_contract is not None: + _orch_baseline_changed: Dict[str, Optional[str]] = _hash_baseline_paths( + project_root, _git_changed_paths(project_root) + ) + _orch_baseline_ignored: Dict[str, Optional[str]] = _hash_baseline_paths( + project_root, _git_ignored_paths(project_root) + ) + else: + _orch_baseline_changed = {} + _orch_baseline_ignored = {} + + def _orch_scope_check_return( + msg: str, cost: float, prov: str, success: bool + ) -> Tuple[bool, str, float, str]: + """Iter-30: wrap an early return with orchestrator scope-guard enforcement. + + Closes the entire class of orchestrator-level scope bypasses that + iter-26, iter-28, and iter-29 each found a new instance of. When the + scope guard reverts anything, the return is forced to ``success=False`` + and the diagnostic is appended to the caller's message so the user + sees what was reverted before the orchestrator returned. 
+ """ + scope_diagnostic = _enforce_orchestrator_scope( + project_root, + issue_contract, + scope_guard, + _orch_baseline_changed, + _orch_baseline_ignored, + quiet=quiet, + ) + if scope_diagnostic is None: + return success, msg, cost, prov + combined = ( + f"{msg}\n\nOrchestrator scope guard: out-of-contract artifacts " + f"detected before dispatch:\n{scope_diagnostic}" + ) + return False, combined, cost, prov + # 7. Try git diff-based module detection first (deterministic, free) branch_modules = _detect_modules_from_branch_diff(project_root) llm_cost = 0.0 @@ -1867,7 +2268,7 @@ def run_agentic_sync( "skipping LLM identification.[/green]" ) console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) else: # 7b. Fall back to LLM-based module identification prompt_template = load_prompt_template("agentic_sync_identify_modules_LLM") @@ -1901,7 +2302,7 @@ def run_agentic_sync( msg = f"LLM failed to identify modules: {llm_output}" if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) # 9. Parse LLM response modules_to_sync, deps_valid, deps_corrections = _parse_llm_response(llm_output) @@ -1918,11 +2319,11 @@ def run_agentic_sync( msg = "All modules are already synced — nothing to do." if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) msg = "LLM identified no modules to sync" if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) # LLM returns basenames from architecture.json filenames (e.g., "crm_models_Python"). # pdd sync expects basenames without the language suffix (e.g., "crm_models"). 
@@ -1954,7 +2355,7 @@ def run_agentic_sync( msg = "All modules are already synced — nothing to do." if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) # 9.4 Augment architecture with entries from the PR branch (new modules created by pdd-change) architecture = _augment_architecture_from_pr_branch(architecture, project_root, issue_number) @@ -1968,7 +2369,7 @@ def run_agentic_sync( msg = f"No valid modules to sync (all basenames were invalid: {invalid_basenames})" if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) if not quiet: console.print(f"[green]Modules to sync: {modules_to_sync}[/green]") @@ -2066,7 +2467,7 @@ def run_agentic_sync( console.print(f"[red]{msg}[/red]") if use_github_state: _post_error_comment(owner, repo, issue_number, msg) - return False, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=False) if not quiet: for bn, cwd in module_cwds.items(): @@ -2089,14 +2490,14 @@ def run_agentic_sync( msg = "All modules are already synced — nothing to do." if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) if dry_run: module_list = ", ".join(modules_to_sync) msg = f"Dry run complete: {len(modules_to_sync)} module(s) would sync: {module_list}" if not quiet: console.print(f"[green]{msg}[/green]") - return True, msg, llm_cost, provider + return _orch_scope_check_return(msg, llm_cost, provider, success=True) # 12. 
Run parallel sync sync_options = { diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index b41ae492c..280a1ed14 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -253,6 +253,26 @@ def _hash_file(project_root: Path, rel_posix: str) -> Optional[str]: return hashlib.sha1(data).hexdigest() +def _hash_baseline_paths( + project_root: Path, paths: Iterable[str] +) -> Dict[str, Optional[str]]: + """Map each repo-relative path under *project_root* to its SHA-1. + + Iter-30: extracted helper. Previously this was an inline dict + comprehension in :class:`AsyncSyncRunner.__init__` (mirrored twice for + the changed-paths and ignored-paths baselines). The orchestrator scope + guard now reuses it to snapshot baseline before any pre-dispatch LLM + call or shell command runs in :func:`pdd.agentic_sync.run_agentic_sync`. + + Returns: + Dict from repo-relative POSIX path to SHA-1 hex string. ``None`` is + recorded when the file cannot be read at snapshot time — callers + (the scope guard) must treat ``None`` as "no fingerprint available" + and decide preservation policy explicitly. See :func:`_hash_file`. + """ + return {rel: _hash_file(project_root, rel) for rel in paths} + + # --------------------------------------------------------------------------- # Helper functions # --------------------------------------------------------------------------- diff --git a/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt b/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt index dbffe7fb5..5a597aa61 100644 --- a/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt +++ b/pdd/prompts/agentic_sync_fix_dry_run_LLM.prompt @@ -13,22 +13,23 @@ You are debugging a failed `pdd sync` dry-run for module "{basename}". The dry-run failed when running `pdd sync {basename} --dry-run` from `{attempted_cwd}`. This usually means the prompt file was not found, or the working directory is wrong. -Determine the correct working directory and output the full command to run. 
+Determine the correct working directory from which `pdd sync {basename}` should be invoked. Look at where .pddrc files are located and which subdirectory contains the relevant prompt files. ## Output Format -Output the full shell command on a single line, prefixed with `SYNC_CMD:`. -The command MUST use `cd && pdd --force sync {basename} --dry-run --agentic --no-steer`. +Output the working directory ONLY as a single line, prefixed with `SYNC_CWD:`. +The value MUST be a path relative to the project root (use `.` for the project root itself) +or an absolute path that resolves under the project root. Do NOT emit a shell command. +Do NOT include any shell metacharacters (`;`, `&`, `|`, `<`, `>`, `` ` ``, `$`, `(`, `)`, newline). -The command MUST include `--dry-run`. This invocation is a probe to validate -the working directory only — it must never perform a real sync. If `--dry-run` -is omitted, the orchestrator will inject it automatically before executing, -but you should always emit it explicitly so the intent is unambiguous. +The orchestrator will build and execute the `pdd sync {basename} --dry-run` argv itself +from this working directory. You only need to identify the directory. -SYNC_CMD: cd && pdd --force sync {basename} --dry-run --agentic --no-steer +SYNC_CWD: Examples: -- SYNC_CMD: cd examples/hello && pdd --force sync {basename} --dry-run --agentic --no-steer -- SYNC_CMD: cd . && pdd --force sync {basename} --dry-run --agentic --no-steer +- SYNC_CWD: examples/hello +- SYNC_CWD: . +- SYNC_CWD: frontend -Only output ONE SYNC_CMD line. Do not output SYNC_CWD. +Only output ONE SYNC_CWD line. Do not output SYNC_CMD. 
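Reviewer note: the revised prompt above narrows the LLM's output to a single `SYNC_CWD:` line and leaves command construction to the orchestrator. The sketch below shows one way that contract could be validated. It is a minimal illustration under stated assumptions, not the code in `pdd/agentic_sync.py`: the helper names `parse_sync_cwd` and `build_dry_run_argv`, the exact error strings, and the minimal argv are hypothetical, chosen only to mirror what the prompt text and the accompanying tests describe.

```python
from pathlib import Path
from typing import List, Optional, Tuple

# Characters the revised prompt forbids in a SYNC_CWD value.
_FORBIDDEN_CHARS = set(";&|<>`$()\n")


def parse_sync_cwd(llm_output: str, project_root: Path) -> Tuple[Optional[Path], str]:
    """Hypothetical helper: return (resolved_cwd, "") or (None, reason)."""
    values: List[str] = []
    for raw in llm_output.splitlines():
        line = raw.strip()
        if line.startswith("SYNC_CMD:"):
            # Old-format output is refused rather than executed.
            return None, "legacy SYNC_CMD output rejected; expected a SYNC_CWD line"
        if line.startswith("SYNC_CWD:"):
            values.append(line[len("SYNC_CWD:"):].strip())
    if len(values) != 1 or not values[0]:
        return None, "expected exactly one non-empty SYNC_CWD line"

    value = values[0]
    bad = sorted(ch for ch in set(value) if ch in _FORBIDDEN_CHARS)
    if bad:
        return None, f"forbidden character in SYNC_CWD: {bad[0]!r}"

    root = project_root.resolve()
    # Joining with an absolute value discards `root`, so absolute paths are
    # checked by the same containment test as relative ones.
    candidate = (root / value).resolve()
    if candidate != root and root not in candidate.parents:
        return None, f"SYNC_CWD is outside project root: {candidate}"
    return candidate, ""


def build_dry_run_argv(basename: str) -> List[str]:
    """Hypothetical helper: argv list intended for subprocess.run(..., shell=False).

    The real orchestrator may pass additional flags; only `sync`, the
    basename, and `--dry-run` are taken from the prompt text.
    """
    return ["pdd", "sync", basename, "--dry-run"]
```

A caller would pass the resolved directory as `cwd=` to `subprocess.run` together with the argv list, so no LLM-supplied text is ever interpreted by a shell. Checking that the directory exists is left out of this sketch.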
diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index e6f1f2ae7..1959061c3 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -25,9 +25,9 @@ _build_scoped_global_dep_graph, _branch_diff_is_runtime_llm_only, _detect_modules_from_branch_diff, + _enforce_orchestrator_scope, _filter_already_synced, _find_project_root, - _inject_dry_run_flag, _is_catchall_match, _is_github_issue_url, _is_runtime_llm_template, @@ -45,6 +45,7 @@ from pdd.agentic_common import IssueContract from pdd.agentic_sync_runner import ( DepGraphFromArchitectureResult, + _hash_baseline_paths, build_dep_graph_from_architecture, ) @@ -2236,50 +2237,446 @@ def test_empty_contract_allowed_paths_rejected(self, tmp_path): # --------------------------------------------------------------------------- -# _inject_dry_run_flag (iter-28 B-1 helper, unit-level) +# _enforce_orchestrator_scope (iter-30 unified orchestrator scope guard) # --------------------------------------------------------------------------- -class TestInjectDryRunFlag: - """Unit-level coverage of the ``--dry-run`` injector used by the - LLM dry-run fallback executor.""" +def _init_git_repo(tmp_path: Path) -> None: + """Create a minimal git repo at *tmp_path* for orchestrator scope tests.""" + subprocess.run( + ["git", "init", "--quiet", "--initial-branch=main", str(tmp_path)], + check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "test@example.com"], + check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "Test"], + check=True, + ) + # Seed with a committed file so HEAD exists. 
+ seed = tmp_path / ".pddrc" + seed.write_text("# pddrc\n", encoding="utf-8") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "-A"], + check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "--quiet", "-m", "init"], + check=True, + ) + + +def _hash_baseline_single(project_root: Path, rel: str) -> str: + """Tiny helper: SHA-1 of the file at *project_root / rel* (for baseline maps).""" + import hashlib + return hashlib.sha1((project_root / rel).read_bytes()).hexdigest() + + +class TestEnforceOrchestratorScope: + """Iter-30: unit-level coverage of the orchestrator scope guard helper. + + These tests exercise :func:`_enforce_orchestrator_scope` directly. Higher- + level integration tests that drive :func:`run_agentic_sync` end-to-end are + in :class:`TestOrchestratorScopeGuardIntegration`. + """ + + @staticmethod + def _contract(*allowed: str) -> IssueContract: + return IssueContract( + allowed_paths=tuple(allowed), + companion_allowlist=(), + source="test", + ) + + def test_no_op_when_no_contract(self, tmp_path): + """``issue_contract is None`` → permissive, returns None unconditionally.""" + _init_git_repo(tmp_path) + out_of_scope = tmp_path / "wild.py" + out_of_scope.write_text("unsanctioned\n", encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=None, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is None + # Permissive mode preserves the file on disk. 
+ assert out_of_scope.exists() + + def test_no_op_when_scope_guard_disabled(self, tmp_path): + """``scope_guard=False`` → no-op even with a contract.""" + _init_git_repo(tmp_path) + out_of_scope = tmp_path / "wild.py" + out_of_scope.write_text("unsanctioned\n", encoding="utf-8") + + contract = self._contract("pdd/foo.py") + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=False, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is None + assert out_of_scope.exists() + + def test_reverts_untracked_out_of_contract_writes(self, tmp_path): + """Untracked out-of-contract file → reverted, diagnostic returned.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + out_of_scope = tmp_path / "outside.py" + out_of_scope.write_text("oops\n", encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is not None + assert "outside.py" in result + assert "Orchestrator scope guard" in result + assert not out_of_scope.exists() + + def test_preserves_pre_existing_baseline_path(self, tmp_path): + """Pre-existing untracked file in baseline (unchanged SHA) → preserved.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + user_wip = tmp_path / "userwip.py" + user_wip.write_text("user code\n", encoding="utf-8") + + # Snapshot the baseline (matches what run_agentic_sync does). + baseline = {"userwip.py": _hash_baseline_single(tmp_path, "userwip.py")} + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed=baseline, + baseline_ignored={}, + quiet=True, + ) + # Unchanged baseline → no revert needed. 
+ assert result is None + assert user_wip.exists() + + def test_detects_baseline_clobber(self, tmp_path): + """Baseline path overwritten with different content → flagged & reverted.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + user_wip = tmp_path / "userwip.py" + user_wip.write_text("original\n", encoding="utf-8") + baseline = {"userwip.py": _hash_baseline_single(tmp_path, "userwip.py")} + + # Now clobber the baseline. + user_wip.write_text("CLOBBERED by LLM\n", encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed=baseline, + baseline_ignored={}, + quiet=True, + ) + assert result is not None + assert "userwip.py" in result + # The file is gone after the revert helper sweeps it (untracked + + # not allowed → removed). + assert not user_wip.exists() + + def test_companion_allowlist_default_auto_allows_pdd_meta(self, tmp_path): + """``.pdd/meta/*.json`` is auto-allowed by DEFAULT_SYNC_COMPANION_ALLOWLIST.""" + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + meta_dir = tmp_path / ".pdd" / "meta" + meta_dir.mkdir(parents=True) + meta_file = meta_dir / "foo_python.json" + meta_file.write_text('{"fingerprint": "x"}', encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + # Companion-allowlisted → no revert, no diagnostic. 
+ assert result is None + assert meta_file.exists() + + +# --------------------------------------------------------------------------- +# Orchestrator scope guard integration (iter-30) +# --------------------------------------------------------------------------- + + +class TestOrchestratorScopeGuardIntegration: + """Iter-30: integration tests that drive :func:`run_agentic_sync` and + verify the orchestrator scope guard reverts/preserves correctly at the + early-return boundary.""" + + _ISSUE_BODY_WITH_BULLET_CONTRACT = ( + "Title: feat: foo\n\n" + "## Allowed Write Set\n\n" + "**Allowed write set:**\n" + "- pdd/foo.py\n" + ) + _ISSUE_BODY_NO_CONTRACT = "Title: feat: foo\n\nNo structured contract here." + + def _issue_payload(self, body: str) -> str: + return json.dumps({"title": "Test", "body": body, "comments_url": ""}) + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_reverts_out_of_contract_llm_writes( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """LLM writes an out-of-contract file during identify-modules → + orchestrator scope guard reverts before the orchestrator returns. + + The mock for ``run_agentic_task`` writes ``outside.py`` to disk and + returns an empty module list, so the orchestrator hits the + "LLM identified no modules to sync" early return (line ~1925). The + scope guard wrap on that return must observe and revert the write. 
+ """ + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + # Simulate LLM writing an out-of-contract file mid-call. + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + # Return a parse-failing response so we land on the + # "no modules to sync" early return inside the scope guard. + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is False + assert "Orchestrator scope guard" in msg + assert "outside.py" in msg + assert not (tmp_path / "outside.py").exists(), ( + "scope guard must revert the out-of-contract write" + ) + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_preserves_pre_existing_baseline( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """User WIP that pre-exists at orchestrator entry → preserved.""" + _init_git_repo(tmp_path) + # Pre-existing dirty user WIP. 
+ (tmp_path / "userwip.py").write_text("user work in progress\n", encoding="utf-8") - def test_injects_after_sync_token(self): - assert _inject_dry_run_flag("pdd sync foo") == "pdd sync --dry-run foo" + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") - def test_idempotent_when_dry_run_already_present(self): - assert ( - _inject_dry_run_flag("pdd sync foo --dry-run") - == "pdd sync foo --dry-run" + # LLM does NOT touch userwip.py; just returns no modules. + mock_agentic_task.return_value = ( + True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" ) - def test_injects_into_cd_chained_command(self): - assert ( - _inject_dry_run_flag("cd subdir && pdd sync foo") - == "cd subdir && pdd sync --dry-run foo" + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True ) - def test_does_not_match_sync_architecture(self): - # ``pdd sync-architecture`` is a different subcommand; the regex - # lookahead refuses to match so injection leaves the command alone. - assert ( - _inject_dry_run_flag("pdd sync-architecture") - == "pdd sync-architecture" + # Pre-existing untracked file MUST be preserved by the baseline rule. + assert (tmp_path / "userwip.py").exists() + assert (tmp_path / "userwip.py").read_text(encoding="utf-8") == ( + "user work in progress\n" ) + # Scope guard MUST NOT mention userwip.py in the diagnostic. 
+ assert "userwip.py" not in msg + mock_runner_cls.assert_not_called() - def test_does_not_match_synchronize(self): - assert _inject_dry_run_flag("pdd synchronize") == "pdd synchronize" + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_detects_baseline_clobber( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """LLM overwrites a baseline path with new content → flagged.""" + _init_git_repo(tmp_path) + (tmp_path / "userwip.py").write_text("original\n", encoding="utf-8") - def test_injects_at_end_of_command(self): - # ``pdd sync`` with no trailing arg → injection still lands. - assert _inject_dry_run_flag("pdd sync") == "pdd sync --dry-run" + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") - def test_injects_before_ampersand_chain(self): - assert ( - _inject_dry_run_flag("pdd sync && echo done") - == "pdd sync --dry-run && echo done" + def llm_side_effect(*_args, **_kwargs): + # Clobber pre-existing baseline path. 
+ (tmp_path / "userwip.py").write_text( + "CLOBBERED by malicious LLM\n", encoding="utf-8" + ) + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True ) + assert success is False + assert "Orchestrator scope guard" in msg + assert "userwip.py" in msg + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_no_op_when_no_contract( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """Permissive mode: no contract on issue → no revert, existing + behavior preserved.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_NO_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + # No contract → permissive mode → no revert, no diagnostic. 
+ assert "Orchestrator scope guard" not in msg + assert (tmp_path / "outside.py").exists() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_no_op_with_no_scope_guard_flag( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """``scope_guard=False`` → explicit opt-out, no revert, existing + behavior preserved (matches iter-26 / iter-28 gate semantics).""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, "MODULES_TO_SYNC: []\nDEPS_VALID: true", 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + scope_guard=False, + ) + + # Explicit opt-out → no revert, no diagnostic. 
+ assert "Orchestrator scope guard" not in msg + assert (tmp_path / "outside.py").exists() + # --------------------------------------------------------------------------- # _resolve_module_cwd @@ -2849,47 +3246,45 @@ def test_dry_run_success_rejects_changed_no_self_include_prompt_contract( # --------------------------------------------------------------------------- -# _llm_fix_dry_run_failure --dry-run injection (iter-28 B-1) +# _llm_fix_dry_run_failure safe-argv (iter-30 B-2 replacement) # --------------------------------------------------------------------------- -class TestLlmFixDryRunInjection: - """Verify that LLM-suggested sync commands always get ``--dry-run``. - - Iter-28 B-1: the orchestrator-level dry-run LLM fallback executes the - LLM's suggested command via shell. If the LLM forgets ``--dry-run`` the - command would perform a real sync write before the scope-guarded runner - exists. The orchestrator must inject ``--dry-run`` (and refuse to execute - if injection cannot land) to keep the probe non-destructive. - """ +class TestLlmFixDryRunSafeArgv: + """Iter-30: the orchestrator no longer executes an LLM-supplied shell + string. The LLM only returns ``SYNC_CWD: ``; the orchestrator + builds the argv itself and runs with ``shell=False``. 
Closes iter-29 B-2 + (shell injection at the orchestrator level).""" @staticmethod - def _llm_response(cmd: str) -> str: - return f"SYNC_CMD: {cmd}\n" + def _llm_response(cwd_value: str) -> str: + return f"SYNC_CWD: {cwd_value}\n" @patch("pdd.agentic_sync.subprocess.run") @patch("pdd.agentic_sync.run_agentic_task") @patch("pdd.agentic_sync.load_prompt_template") - def test_llm_fix_dry_run_injects_dry_run_flag( + def test_llm_fix_dry_run_uses_safe_argv_not_shell( self, mock_load_prompt, mock_agentic_task, mock_subprocess, tmp_path, ): - """LLM omits ``--dry-run`` → orchestrator injects it before exec.""" + """LLM returns ``SYNC_CWD: subdir`` → argv is a list, shell=False.""" + subdir = tmp_path / "subdir" + subdir.mkdir() mock_load_prompt.return_value = ( "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" ) mock_agentic_task.return_value = ( True, - self._llm_response("pdd sync foo"), + self._llm_response("subdir"), 0.01, "anthropic", ) mock_subprocess.return_value = MagicMock( returncode=0, - stdout=f"__PDD_EFFECTIVE_CWD__\n{tmp_path}\n", + stdout="", stderr="", ) @@ -2901,132 +3296,118 @@ def test_llm_fix_dry_run_injects_dry_run_flag( ) assert ok is True - assert cwd == tmp_path.resolve() + assert cwd == subdir.resolve() assert err == "" - # Captured subprocess command must contain the injected flag in the - # correct token position (right after ``sync``). - executed_cmd = mock_subprocess.call_args[0][0] - assert "pdd sync --dry-run foo" in executed_cmd + + # argv must be a list (not a string), shell must be False. + call_args, call_kwargs = mock_subprocess.call_args + argv = call_args[0] + assert isinstance(argv, list), "argv must be a list — shell=False shape" + assert call_kwargs.get("shell", False) is False + assert "--dry-run" in argv + assert "sync" in argv + assert "foo" in argv + # cwd is the resolved SYNC_CWD path, not the project root. 
+ assert call_kwargs.get("cwd") == str(subdir.resolve()) @patch("pdd.agentic_sync.subprocess.run") @patch("pdd.agentic_sync.run_agentic_task") @patch("pdd.agentic_sync.load_prompt_template") - def test_llm_fix_dry_run_preserves_existing_dry_run_flag( + def test_llm_fix_dry_run_rejects_path_outside_project_root( self, mock_load_prompt, mock_agentic_task, mock_subprocess, tmp_path, ): - """LLM already includes ``--dry-run`` → command unchanged.""" + """LLM returns ``SYNC_CWD: /etc`` (outside project root) → reject.""" mock_load_prompt.return_value = ( "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" ) mock_agentic_task.return_value = ( True, - self._llm_response("pdd sync foo --dry-run"), + self._llm_response("/etc"), 0.01, "anthropic", ) - mock_subprocess.return_value = MagicMock( - returncode=0, - stdout=f"__PDD_EFFECTIVE_CWD__\n{tmp_path}\n", - stderr="", - ) - ok, _cwd, _cost, _err = _llm_fix_dry_run_failure( + ok, cwd, cost, err = _llm_fix_dry_run_failure( basename="foo", project_root=tmp_path, dry_run_error="prompt not found", quiet=True, ) - assert ok is True - executed_cmd = mock_subprocess.call_args[0][0] - # Only one ``--dry-run`` appears in the executed command (the LLM's - # own copy was preserved verbatim, no second copy was injected). 
- assert executed_cmd.count("--dry-run") == 1 - assert "pdd sync foo --dry-run" in executed_cmd + assert ok is False + assert cwd is None + assert "outside project root" in err + mock_subprocess.assert_not_called() @patch("pdd.agentic_sync.subprocess.run") @patch("pdd.agentic_sync.run_agentic_task") @patch("pdd.agentic_sync.load_prompt_template") - def test_llm_fix_dry_run_handles_cd_chained_command( + def test_llm_fix_dry_run_rejects_legacy_sync_cmd_format( self, mock_load_prompt, mock_agentic_task, mock_subprocess, tmp_path, ): - """LLM emits ``cd subdir && pdd sync foo`` → injection still lands.""" - subdir = tmp_path / "subdir" - subdir.mkdir() + """Stale-cache ``SYNC_CMD: pdd sync foo`` → reject with migration error.""" mock_load_prompt.return_value = ( "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" ) mock_agentic_task.return_value = ( True, - self._llm_response("cd subdir && pdd sync foo"), + "SYNC_CMD: pdd --force sync foo --dry-run --agentic --no-steer\n", 0.01, "anthropic", ) - mock_subprocess.return_value = MagicMock( - returncode=0, - stdout=f"__PDD_EFFECTIVE_CWD__\n{subdir}\n", - stderr="", - ) - ok, _cwd, _cost, _err = _llm_fix_dry_run_failure( + ok, cwd, cost, err = _llm_fix_dry_run_failure( basename="foo", project_root=tmp_path, dry_run_error="prompt not found", quiet=True, ) - assert ok is True - executed_cmd = mock_subprocess.call_args[0][0] - assert "cd subdir && pdd sync --dry-run foo" in executed_cmd + assert ok is False + assert cwd is None + assert "SYNC_CWD" in err + assert "legacy" in err.lower() + mock_subprocess.assert_not_called() @patch("pdd.agentic_sync.subprocess.run") @patch("pdd.agentic_sync.run_agentic_task") @patch("pdd.agentic_sync.load_prompt_template") - def test_llm_fix_dry_run_does_not_inject_into_sync_architecture( + def test_llm_fix_dry_run_rejects_shell_metachars_in_cwd( self, mock_load_prompt, mock_agentic_task, mock_subprocess, tmp_path, ): - """``pdd sync-architecture`` is a different 
subcommand — must not match. - - The regex (``\\bpdd\\s+sync`` with whitespace/end-of-string lookahead) - deliberately rejects ``pdd sync-architecture`` so the injection cannot - produce ``pdd sync --dry-run-architecture``. The downstream paranoia - check then refuses to execute the un-injected command, since the - scope-guarded runner does not exist at this point. - """ + """SYNC_CWD containing shell metacharacters is rejected defensively.""" mock_load_prompt.return_value = ( "{basename} {dry_run_error} {project_tree} {pddrc_locations} {attempted_cwd}" ) mock_agentic_task.return_value = ( True, - self._llm_response("pdd sync-architecture"), + self._llm_response("subdir; rm -rf /"), 0.01, "anthropic", ) - ok, cwd, _cost, err = _llm_fix_dry_run_failure( + ok, cwd, cost, err = _llm_fix_dry_run_failure( basename="foo", project_root=tmp_path, dry_run_error="prompt not found", quiet=True, ) - # Paranoia check must reject the command outright. assert ok is False assert cwd is None - assert "--dry-run" in err - # And subprocess.run must NEVER be invoked for a non-injectable cmd. + assert "forbidden character" in err mock_subprocess.assert_not_called() From 967a8534189520dabbd0128887ce956e56de900c Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 15:04:50 -0700 Subject: [PATCH 37/42] fix(sync): iter-32 wrap dispatch boundary with orchestrator scope guard MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-31 caught the gap iter-30 left: the unified orchestrator guard wrapped 9 early-return sites but NOT the successful-dispatch path. Any pre-dispatch write that survived past every early-return check would reach the runner, where AsyncSyncRunner.__init__ snapshots it as `_baseline_changed_paths` and preserves it (auto-allows) for the rest of the sync session. Insert one `_enforce_orchestrator_scope` call immediately before the `if durable:` runner-class branch.
If diagnostic is non-None, post the standard error comment to the issue (when github state is enabled) and return False with the diagnostic. Aborts dispatch cleanly before any runner is constructed. Verified there is exactly one intervening block (the durable/async branch itself) between the iter-30 entry baseline and the runner constructions, so a single check above the branch covers both DurableSyncRunner and AsyncSyncRunner paths. Tests: dispatch blocked when pre-dispatch writes out-of-contract; dispatch allowed when clean; no-op with no contract; no-op with --no-scope-guard. 707 passed. Five pre-existing TestDependencyCorrectionsScopeGuard tests adjusted to init a clean tmp_path git repo so they don't trip on the new dispatch-boundary sweep when run against the real dirty worktree. The dispatch-boundary check coexists with iter-30's early-return wraps — both required because early returns never reach the dispatch boundary. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync.py | 38 ++++++ tests/test_agentic_sync.py | 257 +++++++++++++++++++++++++++++++++++++ 2 files changed, 295 insertions(+) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index 6ceee9477..cac2326ed 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -2521,6 +2521,44 @@ def _orch_scope_check_return( contract_source: Optional[str] = ( issue_contract.source if issue_contract is not None else None ) + + # Iter-32 B-1: orchestrator scope guard at the dispatch boundary. + # iter-30 wrapped every early-return site with + # :func:`_orch_scope_check_return`, but the SUCCESSFUL DISPATCH path — + # where the orchestrator constructs ``AsyncSyncRunner`` or + # ``DurableSyncRunner`` and calls ``.run()`` — was intentionally left + # unwrapped (the runner has its own per-module guard). The gap: any + # pre-dispatch write from LLM module identification, dry-run validation, + # ``_apply_architecture_corrections``, etc. 
that does NOT trigger an + # early return reaches the runner, where the very first thing + # :class:`AsyncSyncRunner.__init__` does is snapshot the working tree + # as ``_baseline_changed_paths``. That baseline AUTO-ALLOWS those + # writes for the entire sync session (per-module guard preserves + # baseline paths). Run the orchestrator guard one last time here so + # out-of-contract pre-dispatch writes are reverted and dispatch fails + # with a clear diagnostic BEFORE the runner snapshots them. + # + # Single check before the ``if durable: ... else: ...`` branch covers + # both async and durable construction sites — the only intervening + # logic is the runner-class selection, which has no write side + # effects. + scope_diagnostic = _enforce_orchestrator_scope( + project_root, + issue_contract, + scope_guard, + _orch_baseline_changed, + _orch_baseline_ignored, + quiet=quiet, + ) + if scope_diagnostic is not None: + combined = ( + f"Orchestrator scope guard hard-fail before dispatch: " + f"out-of-contract artifacts detected.\n{scope_diagnostic}" + ) + if use_github_state: + _post_error_comment(owner, repo, issue_number, combined) + return False, combined, llm_cost, provider + if durable: runner = DurableSyncRunner( basenames=modules_to_sync, diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index 1959061c3..ca4b6a197 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -1541,6 +1541,7 @@ class TestDependencyCorrectionsScopeGuard: ``_apply_architecture_corrections``. 
The gate runs BEFORE any runner is dispatched, so per-module scope enforcement cannot catch this write.""" + @patch("pdd.agentic_sync._find_project_root") @patch("pdd.agentic_sync._apply_architecture_corrections") @patch("pdd.agentic_sync.AsyncSyncRunner") @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) @@ -1568,10 +1569,16 @@ def test_dependency_corrections_skipped_when_arch_outside_contract( mock_filter_synced, mock_runner_cls, mock_apply_corrections, + mock_find_root, tmp_path, capsys, ): """Contract excludes architecture.json → corrections must NOT run.""" + # Iter-32 B-1: pin project root to tmp_path so the dispatch-boundary + # orchestrator scope guard sweeps a clean tmp tree (the real repo + # has dirty worktrees that would trip the guard). + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path issue_data = { "title": "Fix foo", "body": _CONTRACT_BODY_ARCH_OUT_OF_SCOPE, @@ -1645,6 +1652,9 @@ def test_dependency_corrections_applied_when_arch_in_contract( tmp_path, ): """Contract includes architecture.json → corrections must run.""" + # Iter-32 B-1: init git in tmp_path so the dispatch-boundary + # orchestrator scope guard's working-tree probes succeed. 
+ _init_git_repo(tmp_path) mock_find_root.return_value = tmp_path arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] mock_apply_corrections.return_value = arch_data @@ -1935,6 +1945,7 @@ def test_already_synced_early_return_does_not_leak_arch_changes( # Iter-28 B-2: nested arch_path bypass # ------------------------------------------------------------------ + @patch("pdd.agentic_sync._find_project_root") @patch("pdd.agentic_sync._apply_architecture_corrections") @patch("pdd.agentic_sync.AsyncSyncRunner") @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) @@ -1962,12 +1973,17 @@ def test_dependency_corrections_skipped_for_nested_arch_outside_contract( mock_filter_synced, mock_runner_cls, mock_apply_corrections, + mock_find_root, tmp_path, ): """Contract allows the literal string ``architecture.json`` but the REAL arch path is ``frontend/architecture.json``. Iter-28 B-2: the gate must compare the resolved arch path, not the bare string, so the nested arch write is rejected.""" + # Iter-32 B-1: init git + pin root so the dispatch-boundary scope + # guard sweeps a clean tmp tree. + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] # Contract allows root architecture.json only — NOT the nested path. issue_data = { @@ -2039,6 +2055,9 @@ def test_dependency_corrections_applied_for_nested_arch_in_contract( ): """Contract explicitly allows ``frontend/architecture.json`` and the arch path matches → gate must permit the write.""" + # Iter-32 B-1: init git so the dispatch-boundary scope guard's + # working-tree probes succeed. 
+ _init_git_repo(tmp_path) mock_find_root.return_value = tmp_path arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] mock_apply_corrections.return_value = arch_data @@ -2075,6 +2094,7 @@ def test_dependency_corrections_applied_for_nested_arch_in_contract( assert success is True mock_apply_corrections.assert_called_once() + @patch("pdd.agentic_sync._find_project_root") @patch("pdd.agentic_sync._apply_architecture_corrections") @patch("pdd.agentic_sync.AsyncSyncRunner") @patch("pdd.agentic_sync._filter_already_synced", return_value=["foo"]) @@ -2102,6 +2122,7 @@ def test_dependency_corrections_skipped_for_arch_outside_project_root( mock_filter_synced, mock_runner_cls, mock_apply_corrections, + mock_find_root, tmp_path, ): """``arch_path`` resolves outside ``project_root`` → never in scope. @@ -2111,6 +2132,10 @@ def test_dependency_corrections_skipped_for_arch_outside_project_root( catches the ``ValueError`` from ``relative_to`` and returns False so the write is refused. """ + # Iter-32 B-1: init git + pin root so the dispatch-boundary scope + # guard sweeps a clean tmp tree. + _init_git_repo(tmp_path) + mock_find_root.return_value = tmp_path arch_data = [{"filename": "foo_python.prompt", "dependencies": []}] issue_data = { "title": "Fix foo", @@ -2677,6 +2702,238 @@ def llm_side_effect(*_args, **_kwargs): assert "Orchestrator scope guard" not in msg assert (tmp_path / "outside.py").exists() + # --------------------------------------------------------------------- + # Iter-32 B-1: dispatch-boundary scope guard + # --------------------------------------------------------------------- + # iter-30 wrapped every EARLY-RETURN site with + # ``_orch_scope_check_return``. The natural completion (iter-32) is to + # also gate the SUCCESSFUL DISPATCH path so pre-dispatch out-of-contract + # writes are not snapshotted as ``_baseline_changed_paths`` by the + # runner and silently preserved for the entire sync session. 
+ + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_blocks_dispatch_when_predispatch_writes_out_of_contract( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Pre-dispatch out-of-contract write that survives past all + early-return sites → orchestrator scope guard blocks dispatch and + reverts the write before the runner is constructed.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + # Simulate the LLM identify-modules call writing an + # out-of-contract file mid-call AND returning a valid module + # list so the flow proceeds toward dispatch (skipping the + # iter-30 early-return wrap). + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + # Skip dry-run early-return: report success with a cwd for "foo". + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + # Skip "all already synced" early-return: keep "foo" in the list. 
+ mock_filter_synced.return_value = ["foo"] + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is False + assert "before dispatch" in msg + assert "outside.py" in msg + assert not (tmp_path / "outside.py").exists(), ( + "dispatch-boundary scope guard must revert the out-of-contract write" + ) + # Runner was NEVER constructed because dispatch was aborted. + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_allows_dispatch_when_clean( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Clean working tree at dispatch boundary → runner constructed + and dispatched normally.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + # LLM does NOT write anything out-of-contract; returns a valid + # module list. 
+ mock_agentic_task.return_value = ( + True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + # Provide a runnable runner mock so the dispatch can complete. + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + assert msg == "ok" + assert "before dispatch" not in msg + # Runner WAS constructed and .run() WAS called. + mock_runner_cls.assert_called_once() + mock_runner_cls.return_value.run.assert_called_once() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_dispatch_check_is_no_op_with_no_contract( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Permissive mode (no contract markers) → dispatch-boundary check + is a no-op even when the LLM wrote out-of-contract files.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_NO_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def llm_side_effect(*_args, **_kwargs): + (tmp_path / 
"outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + # Permissive mode → dispatch proceeds, no revert, file preserved. + assert success is True + assert "before dispatch" not in msg + assert (tmp_path / "outside.py").exists() + mock_runner_cls.assert_called_once() + mock_runner_cls.return_value.run.assert_called_once() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_scope_guard_dispatch_check_is_no_op_with_scope_guard_disabled( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """``scope_guard=False`` (opt-out) with a contract present → + dispatch-boundary check is a no-op.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + def 
llm_side_effect(*_args, **_kwargs): + (tmp_path / "outside.py").write_text("LLM wrote me\n", encoding="utf-8") + return True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + + mock_agentic_task.side_effect = llm_side_effect + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + scope_guard=False, + ) + + # Opt-out → dispatch proceeds, no revert, file preserved. + assert success is True + assert "before dispatch" not in msg + assert (tmp_path / "outside.py").exists() + mock_runner_cls.assert_called_once() + mock_runner_cls.return_value.run.assert_called_once() + # --------------------------------------------------------------------------- # _resolve_module_cwd From 9206ef8b2fd62148fdbaa456bc9432380e663079 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 15:20:08 -0700 Subject: [PATCH 38/42] fix(sync): iter-34 M-3 detect deletion of pre-existing untracked baseline files MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-33 caught a data-loss blind spot in the iter-24 hash-aware baseline preservation: when `_hash_file` returns None at scope-guard time, iter-24 just `continue`'d (dropped the baseline entry), assuming either git status would surface a deleted-tracked file as `D` or the file genuinely became out-of-scope. For UNTRACKED and IGNORED baseline paths git has no record at all — a deletion left zero trail, the scope guard returned None, and the user's pre-existing WIP was silently lost. Trigger (verifiable): user has dirty untracked `userwip.py`. Sync deletes it (LLM moves/removes during refactor). Scope guard saw `current_hash is None` → dropped baseline → returned None → module passed clean. `userwip.py` is gone, no diagnostic. 
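Sketch of the blind spot (illustrative only; `hash_file` and `find_deleted_baseline` are hypothetical names, not the runner's actual helpers): for an untracked or ignored baseline path, the only evidence of deletion is that the file no longer hashes, so such entries have to be collected rather than skipped.

```python
import hashlib
from pathlib import Path
from typing import Dict, Optional, Set


def hash_file(root: Path, rel_posix: str) -> Optional[str]:
    """Return the SHA-256 hex digest of root/rel_posix, or None if the
    file is missing or unreadable. Hypothetical stand-in for the runner's
    own hashing helper."""
    try:
        return hashlib.sha256((root / rel_posix).read_bytes()).hexdigest()
    except OSError:
        return None


def find_deleted_baseline(
    root: Path, baseline: Dict[str, Optional[str]]
) -> Set[str]:
    """Collect baseline paths that can no longer be hashed.

    Assumption: `baseline` maps repo-relative POSIX paths to the digest
    captured when the runner was constructed. git keeps no record of
    untracked or ignored files, so this direct check is the only way a
    deletion of such a path becomes visible.
    """
    return {rel for rel in baseline if hash_file(root, rel) is None}
```

Under these assumptions the returned set would be unioned into the diagnostic, so the module fails instead of passing clean.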
Fix: collect deleted baseline paths into a `baseline_deleted` set during both baseline iterations (changed AND ignored). At diagnostic-build time, union with the existing `remaining_raw` re-scan set (dedup is fine — overlap covers tracked-deletion cases where git status would also surface the `D`). The new symmetric pass over `_baseline_ignored_paths` is mandatory because `git ls-files --ignored` only lists files that still exist; without this pass a deleted ignored baseline would never appear in the ignored-rescan loop. Tests: untracked baseline deletion flagged; ignored baseline deletion flagged; iter-24 unchanged-file preservation regression still passes. 715 passed. Two pre-existing tests (test_clean_working_tree_returns_none, test_deleted_companion_in_git_status_is_preserved) explicitly clear the baseline dicts to match their stated "clean working tree" intent — previously they relied on the iter-24 silent-drop branch that this fix removes. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync_runner.py | 49 ++++++++- tests/test_agentic_sync_runner.py | 168 ++++++++++++++++++++++++++++++ 2 files changed, 213 insertions(+), 4 deletions(-) diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 280a1ed14..69cc5382d 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -2253,12 +2253,29 @@ def _enforce_scope_guard( # ``repo_root`` — in the non-durable async case those resolve # to the same path; the durable runner clears the baseline # entirely (iter-22) so this loop is a no-op there. + # + # Iter-34 M-3 (baseline-deletion blind spot, codex iter-33): + # ``current_hash is None`` means the baseline file is GONE + # from disk.
For TRACKED baseline paths, ``git status`` will + # surface this as ``D `` and ``_remaining_out_of_scope_paths`` + # picks it up; but for UNTRACKED baseline paths (the user's + # local WIP captured at init) git has no record — the + # deletion is invisible to every subsequent scan and the + # module would succeed with the WIP silently lost. Collect + # the deletions here and union them into the diagnostic's + # ``remaining`` set below. + baseline_deleted: Set[str] = set() for rel_posix, baseline_hash in self._baseline_changed_paths.items(): current_hash = _hash_file(repo_root, rel_posix) if current_hash is None: - # File was deleted (or unreadable) at check time — - # don't auto-allow a ghost. The revert helpers handle - # restore semantics; we just refuse to whitelist. + # Iter-34 M-3: baseline file is GONE. Surface it as + # unrecovered regardless of whether it was tracked + # or untracked at init — we can't distinguish the + # two from the baseline snapshot, and even the + # tracked-deletion case warrants a hard-fail (the + # user didn't expect their pre-existing file to be + # removed by sync). + baseline_deleted.add(rel_posix) continue if baseline_hash is None: # Couldn't hash at init (the file was unreadable @@ -2274,6 +2291,22 @@ def _enforce_scope_guard( # Do NOT add to allowed_files — let the contract check # flag it as out-of-scope. + # Iter-34 M-3: symmetric pass for ignored baseline paths. + # ``_remaining_out_of_scope_paths`` only sees files that + # ``git ls-files --ignored`` currently lists, so a deleted + # gitignored baseline file (e.g. user-side ``cache.bin`` + # erased by sync) leaves no trail in either scan. Iterate + # the ignored baseline directly to catch the deletion. 
+ for rel_posix, baseline_hash in self._baseline_ignored_paths.items(): + current_hash = _hash_file(repo_root, rel_posix) + if current_hash is None: + baseline_deleted.add(rel_posix) + # Present-but-changed ignored baselines are already + # surfaced by ``_remaining_out_of_scope_paths``'s + # ignored loop (the hash comparison there falls through + # to ``remaining.add`` on divergence). Don't duplicate + # that work here. + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) untracked_reverted = revert_out_of_scope_changes_with_dirs( repo_root, allowed_dirs=set(), allowed_files=allowed_files @@ -2310,8 +2343,16 @@ def _enforce_scope_guard( # re-scan does not double-list. In practice when helpers succeed # the path is gone from ``git status``; when helpers fail with # ``reverted.clear()`` ``offending`` is empty. Defensive filter. + # + # Iter-34 M-3: union with ``baseline_deleted`` so a sync-side + # deletion of a pre-existing untracked/ignored baseline path + # hard-fails the module. For tracked baselines ``git status`` + # already surfaces the deletion as ``D ``, so the union just + # dedups via set semantics. 
offending_set = set(offending) - remaining = [p for p in remaining_raw if p not in offending_set] + remaining = sorted( + (set(remaining_raw) | baseline_deleted) - offending_set + ) if not offending and not remaining: return None diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index d59e9b59b..42522ad31 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -3094,6 +3094,162 @@ def fake_revert(_root, allowed_files): f"got: {captured_allowed['files']}" ) + # --------------------------------------------------------------------- + # Iter-34 M-3: baseline-deletion blind spot + # --------------------------------------------------------------------- + + def test_baseline_deletion_of_untracked_file_is_flagged( + self, tmp_path, monkeypatch + ): + """Iter-34 M-3 (baseline-deletion blind spot, codex iter-33): a + user's pre-existing UNTRACKED dirty file that gets deleted during + sync MUST hard-fail the module. Untracked baselines have no git + record, so ``git status`` leaves no trail after deletion — the + iter-24 logic dropped the baseline entry on ``current_hash is + None`` and silently lost the WIP.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing UNTRACKED WIP outside the contract. 
+ userwip = tmp_path / "userwip.py" + userwip.write_text("wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "userwip.py" in runner._baseline_changed_paths, ( + "iter-34: untracked WIP must be captured in baseline" + ) + assert runner._baseline_changed_paths["userwip.py"] is not None, ( + "iter-34: baseline SHA must be captured for readable WIP" + ) + + # Simulate sync deleting the file (e.g. refactor removed it). + userwip.unlink() + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-34: deletion of untracked baseline WIP must hard-fail " + "the module — silent data loss otherwise" + ) + assert "userwip.py" in diagnostic, ( + f"iter-34: deleted baseline path must appear in diagnostic, " + f"got: {diagnostic!r}" + ) + + def test_baseline_deletion_of_ignored_file_is_flagged( + self, tmp_path, monkeypatch + ): + """Iter-34 M-3 (symmetric): a pre-existing gitignored baseline + file that gets deleted during sync MUST hard-fail the module. 
+ ``git ls-files --ignored`` only lists files that currently exist, + so a deletion is invisible to the ignored-rescan loop — a + dedicated baseline iteration is required.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / ".gitignore").write_text("cache.bin\n") + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing gitignored file BEFORE the runner is constructed. + cache = tmp_path / "cache.bin" + cache.write_text("user cache") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "cache.bin" in runner._baseline_ignored_paths, ( + "iter-34: pre-existing ignored file must be captured in baseline" + ) + assert runner._baseline_ignored_paths["cache.bin"] is not None, ( + "iter-34: baseline SHA must be captured for readable ignored file" + ) + + # Simulate sync deleting the gitignored cache. 
+ cache.unlink() + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-34: deletion of ignored baseline file must hard-fail " + "the module — git ls-files --ignored leaves no trail" + ) + assert "cache.bin" in diagnostic, ( + f"iter-34: deleted ignored baseline must appear in diagnostic, " + f"got: {diagnostic!r}" + ) + def test_wildcard_only_companion_pattern_does_not_auto_allow( self, tmp_path, monkeypatch ): @@ -3255,6 +3411,11 @@ def fake_revert(repo_root, allowed_files): allowed_write_set=["pdd/foo.py"], companion_allowlist=[".pdd/meta/*.json"], ) + # Iter-34 M-3: clear the baseline snapshot taken from the real + # working tree so this test isolates the rglob/_git_changed_paths + # companion-allowlist behavior under inspection. + runner._baseline_changed_paths = {} + runner._baseline_ignored_paths = {} monkeypatch.setattr( runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() ) @@ -3347,6 +3508,13 @@ def test_clean_working_tree_returns_none(self, tmp_path, monkeypatch): allowed_write_set=["pdd/foo.py"], companion_allowlist=[".pdd/meta/*.json"], ) + # Iter-34 M-3: ``_make_runner`` snapshots the baseline from the + # current working directory (the real ``pdd`` repo when the test + # didn't chdir). Reset the snapshot to ``{}`` so the post-fix + # baseline-deletion scan does not flag pre-existing dirty paths + # from the developer's working tree as silently deleted. 
+ runner._baseline_changed_paths = {} + runner._baseline_ignored_paths = {} monkeypatch.setattr( runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() ) From a127f4cd3d4be813a43d72314744f220d94b56ab Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 15:40:04 -0700 Subject: [PATCH 39/42] fix(sync): iter-36 close B-1/B-2/B-3 orchestrator parity gaps MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-35 found three Blockers where the iter-30/32/34 orchestrator guard didn't fully mirror the per-module runner-side safety features. B-1: PDD's own `run_agentic_task` writes `.pdd/agentic-logs/session_*.jsonl` audit logs during EVERY pre-dispatch LLM call. The orchestrator scope guard treated them as out-of-contract and hard-failed otherwise clean contracted runs. B-2: `AsyncSyncRunner._record_result` writes `.pdd/agentic_sync_state.json` after each per-module scope guard. In a multi-module sync, the state file is on disk when the NEXT module's guard runs and the next module would hard-fail on the previous module's state file. B-3: `_enforce_orchestrator_scope` iterated baseline dicts but on `current_hash is None` it silently `continue`'d, dropping the deleted entry. iter-34 closed this for the runner guard via a `baseline_deleted` set unioned into the unrecovered diagnostic — the orchestrator guard didn't mirror that pattern. Pre-dispatch LLM/subprocess deletion of pre-existing user WIP would pass the orchestrator guard silently. Fix: add `PDD_INTERNAL_PATH_ALLOWLIST` constant in agentic_common as a SEPARATE concept from the user-facing `DEFAULT_SYNC_COMPANION_ALLOWLIST`. Internal allowlist is fixed (not user-extensible) and represents tool infrastructure (`.pdd/agentic-logs/*`, `.pdd/agentic_sync_state.json`, `.pdd/bug-state/*`, `.pdd/checkup-review-loop/*`). Wire it through both guards via the existing anchored matcher so `subdir/.pdd/agentic-logs/` still does NOT match (anchoring preserved). 
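What "anchored" means here can be sketched as below. This is an illustration under stated assumptions: `matches_anchored` is a hypothetical stand-in, not the real `_matches_companion_pattern_anchored`, and it assumes patterns are compared segment by segment against a repo-root-relative POSIX path.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def matches_anchored(rel_posix: str, pattern: str) -> bool:
    """Match a repo-root-relative path against a glob, segment by segment.

    Assumption for this sketch: a wildcard never crosses a "/" and the
    pattern must cover the whole path starting at the repo root, so a
    nested copy of the same directory name cannot match.
    """
    path_parts = PurePosixPath(rel_posix).parts
    pattern_parts = PurePosixPath(pattern).parts
    if len(path_parts) != len(pattern_parts):
        return False
    return all(
        fnmatchcase(part, pat) for part, pat in zip(path_parts, pattern_parts)
    )


# A top-level audit log matches the internal pattern.
top_level = matches_anchored(
    ".pdd/agentic-logs/session_1.jsonl", ".pdd/agentic-logs/*"
)
# The same layout under a subdirectory does not.
nested = matches_anchored(
    "subdir/.pdd/agentic-logs/session_1.jsonl", ".pdd/agentic-logs/*"
)
```

With this segment-wise reading, a separate `.pdd/agentic-logs/*/*` entry is what would admit one level of nested per-task directories.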
Per-module guard scans repo-rooted (not module-rooted) so a top-level audit log matches even when module_cwd is a subdirectory. DurableSyncRunner inherits `_enforce_scope_guard` from AsyncSyncRunner so B-1/B-2 fix applies transitively — no separate durable change. Mirror iter-34's `baseline_deleted` set into the orchestrator guard for both `_orch_baseline_changed` and `_orch_baseline_ignored`. Union into the existing "Unrecovered (revert failed, manual cleanup required)" diagnostic section — parity with runner-side wording. Tests: audit log auto-allowed at orchestrator level; audit log auto-allowed at runner level (including the multi-module anchoring asymmetry); state file auto-allowed at runner level; deleted untracked baseline flagged; deleted ignored baseline flagged. 721 passed. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_common.py | 22 +++++ pdd/agentic_sync.py | 71 ++++++++++++++++- pdd/agentic_sync_runner.py | 42 ++++++++++ tests/test_agentic_sync.py | 123 ++++++++++++++++++++++++++++ tests/test_agentic_sync_runner.py | 128 ++++++++++++++++++++++++++++++ 5 files changed, 384 insertions(+), 2 deletions(-) diff --git a/pdd/agentic_common.py b/pdd/agentic_common.py index 80eb8417e..b0707905a 100644 --- a/pdd/agentic_common.py +++ b/pdd/agentic_common.py @@ -53,6 +53,28 @@ def _load_model_data(*args, **kwargs): # own ``companion_allowlist`` field. DEFAULT_SYNC_COMPANION_ALLOWLIST: Tuple[str, ...] = (".pdd/meta/*.json",) +# Issue #1013 iter-36 B-1/B-2: PDD's own infrastructure writes during a +# guarded sync run (audit logs, runner state, etc.). These are NEVER part +# of a contract — they're internal artifacts the tool produces as side +# effects of running. The scope guard (both orchestrator-level and +# per-module) MUST auto-allow them or it would hard-fail every contracted +# run. 
+# +# Distinct from DEFAULT_SYNC_COMPANION_ALLOWLIST: the user-facing default +# may be widened by an issue's ``companion_allowlist`` field; this set is +# fixed tool infrastructure and is NOT user-extensible. Patterns here are +# always interpreted as REPO-ROOT-anchored (matched against a path +# computed relative to the repo root), not module-relative, because the +# infrastructure writes happen at the top of the project regardless of +# which module is being synced. +PDD_INTERNAL_PATH_ALLOWLIST: Tuple[str, ...] = ( + ".pdd/agentic-logs/*", # session audit logs (run_agentic_task) + ".pdd/agentic-logs/*/*", # nested per-task subdirs if any + ".pdd/agentic_sync_state.json", # runner state file + ".pdd/bug-state/*", # bug command state + ".pdd/checkup-review-loop/*", # checkup state +) + # Semantic fallback patterns for when LLMs paraphrase instead of emitting exact tokens. # Each token maps to a list of regex patterns that capture common paraphrases. # Patterns are checked only after exact and case-insensitive matching fail, diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index 403c1a64d..57abc6e38 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -23,6 +23,7 @@ from .agentic_change import _check_gh_cli, _escape_format_braces, _parse_issue_url, _run_gh_command from .agentic_common import ( DEFAULT_SYNC_COMPANION_ALLOWLIST, + PDD_INTERNAL_PATH_ALLOWLIST, IssueContract, _is_valid_companion_pattern, _matches_companion_pattern_anchored, @@ -1815,6 +1816,15 @@ def _enforce_orchestrator_scope( # rglob for currently-on-disk companion files. The orchestrator scope # guard cannot rely on a single module_cwd, so the scan is repo-wide # — same union as the per-module guard, just with a wider net. 
+ # + # Iter-36 B-1/B-2: ``PDD_INTERNAL_PATH_ALLOWLIST`` is a SEPARATE pass + # (not merged into ``allowlist``) because internal patterns are + # interpreted as REPO-ROOT-anchored while user-facing companion + # patterns are anchored against the iteration root for the loop they + # appear in. The orchestrator happens to iterate repo-rooted anyway, + # but keeping the passes separate preserves parity with the per-module + # guard (which iterates module-rooted) and matches the documented + # semantics of ``PDD_INTERNAL_PATH_ALLOWLIST``. for path in repo_root.rglob("*"): if not path.is_file(): continue @@ -1824,26 +1834,58 @@ def _enforce_orchestrator_scope( continue if _matches_companion(rel_posix, allowlist): allowed_files.add(path.resolve()) + elif _matches_internal(rel_posix, PDD_INTERNAL_PATH_ALLOWLIST): + allowed_files.add(path.resolve()) # Also pick up companion-shaped tracked deletions (sync legitimately # removes ``.pdd/meta/foo_python.json`` when a module is renamed; the # revert helper would otherwise resurrect it). + # + # Iter-36 B-1/B-2: same internal-allowlist parallel pass — a tracked + # deletion of ``.pdd/agentic_sync_state.json`` (e.g. between sync + # invocations) must NOT be resurrected by the revert helper. for rel_posix in _git_changed_paths(repo_root): absolute = (repo_root / rel_posix).resolve() if _matches_companion(rel_posix, allowlist): allowed_files.add(absolute) + elif _matches_internal(rel_posix, PDD_INTERNAL_PATH_ALLOWLIST): + allowed_files.add(absolute) # Iter-24 SHA-aware preservation of pre-existing changed baseline. + # + # Iter-36 B-3: mirror :meth:`AsyncSyncRunner._enforce_scope_guard`'s + # iter-34 ``baseline_deleted`` set — when ``current_hash is None``, + # the pre-existing file is gone from disk. For TRACKED baselines git + # status surfaces this as ``D `` and the re-scan picks it up, but + # UNTRACKED baselines leave no trail; without this collection the + # orchestrator silently passes a sync-side deletion of user WIP. 
+ baseline_deleted: set[str] = set() for rel_posix, baseline_hash in baseline_changed.items(): current_hash = _hash_file(repo_root, rel_posix) if current_hash is None: - # File was deleted after baseline. Let revert helpers decide. + # File was deleted after baseline — surface it via the + # ``remaining`` set below regardless of whether it was tracked + # or untracked (we can't distinguish from the snapshot, and + # even the tracked-deletion case warrants a hard-fail). + baseline_deleted.add(rel_posix) continue if baseline_hash is None or current_hash == baseline_hash: # Unreadable at snapshot (preserve by name) or unchanged content # → preserve. allowed_files.add((repo_root / rel_posix).resolve()) + # Iter-36 B-3: symmetric pass for ignored baseline. The orchestrator + # re-scan only sees files that ``git ls-files --ignored`` currently + # lists; deleted ignored baselines (e.g. user-side ``cache.bin`` erased + # before runner dispatch) leave no trail there. Iterate the ignored + # baseline directly to catch the deletion. Present-but-changed + # ignored baselines are already surfaced by + # :func:`_orchestrator_remaining_out_of_scope_paths`'s ignored loop. + for rel_posix, baseline_hash in baseline_ignored.items(): + current_hash = _hash_file(repo_root, rel_posix) + if current_hash is None: + baseline_deleted.add(rel_posix) + tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) untracked_reverted = revert_out_of_scope_changes_with_dirs( repo_root, allowed_dirs=set(), allowed_files=allowed_files @@ -1867,11 +1909,18 @@ def _enforce_orchestrator_scope( # and return []. We re-scan the working tree after revert to be sure # the contract is now satisfied; anything still on disk that is not # allowed becomes the "unrecovered" set. + # + # Iter-36 B-3: union with ``baseline_deleted`` so an orchestrator-side + # deletion of a pre-existing untracked/ignored baseline path + # hard-fails the run. 
Mirrors :meth:`AsyncSyncRunner._enforce_scope_guard`'s + # iter-34 union for the per-module guard. remaining_raw = _orchestrator_remaining_out_of_scope_paths( repo_root, allowed_files, baseline_ignored ) offending_set = set(offending) - remaining = [p for p in remaining_raw if p not in offending_set] + remaining = sorted( + (set(remaining_raw) | baseline_deleted) - offending_set + ) if not offending and not remaining: return None @@ -1936,6 +1985,24 @@ def _matches_companion(rel_posix: str, allowlist: Iterable[str]) -> bool: return False +def _matches_internal(rel_posix: str, internal_allowlist: Iterable[str]) -> bool: + """Iter-36 B-1/B-2: anchored match for PDD-internal infrastructure paths. + + Distinct from :func:`_matches_companion` so internal allowlist patterns + bypass the user-facing :func:`_is_valid_companion_pattern` gate (the + internal list is curated, not parsed from user input) but still go + through the iter-14 segment-aware anchored matcher so e.g. + ``subdir/.pdd/agentic-logs/x.jsonl`` does NOT match — only top-level + ``.pdd/agentic-logs/x.jsonl`` does. + """ + for pattern in internal_allowlist: + if not pattern: + continue + if _matches_companion_pattern_anchored(rel_posix, pattern): + return True + return False + + def _orchestrator_remaining_out_of_scope_paths( repo_root: Path, allowed_files: set[Path], diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 69cc5382d..aa9942bbe 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -29,6 +29,7 @@ from .agentic_common import ( DEFAULT_SYNC_COMPANION_ALLOWLIST, + PDD_INTERNAL_PATH_ALLOWLIST, _is_valid_companion_pattern, _matches_companion_pattern_anchored, _revert_out_of_scope_changes, @@ -2218,6 +2219,34 @@ def _enforce_scope_guard( if self._matches_companion_allowlist(rel_posix, allowlist): allowed_files.add(path.resolve()) + # Iter-36 B-1/B-2: PDD-internal infrastructure paths + # (``.pdd/agentic-logs/*``, ``.pdd/agentic_sync_state.json``, + # etc.) 
are written by the tool itself during a guarded run + # (audit logs from ``run_agentic_task``; runner state file + # from ``_record_result`` after each module). They are + # NEVER part of a contract. This pass is SEPARATE from the + # user-facing companion pass above because internal patterns + # are REPO-ROOT-anchored (the writes happen at the top of + # the project regardless of which module is being synced) — + # in the multi-module case ``module_cwd`` is a subdirectory + # and the audit log under ``/.pdd/agentic-logs/`` + # would NOT match a module-rooted match pass. + for path in repo_root.rglob("*"): + if not path.is_file(): + continue + try: + repo_rel_posix = ( + path.resolve().relative_to(repo_root).as_posix() + ) + except ValueError: + continue + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored( + repo_rel_posix, pattern + ): + allowed_files.add(path.resolve()) + break + # Iter-4 F1: rglob only sees files that still exist on disk. Sync # legitimately DELETES companion artifacts (e.g. ``.pdd/meta/foo_python.json`` # when a module is renamed/removed); those deletions appear in @@ -2230,6 +2259,19 @@ def _enforce_scope_guard( # above). for rel_posix in _git_changed_paths(repo_root): absolute = (repo_root / rel_posix).resolve() + # Iter-36 B-1/B-2: tracked deletion of a PDD-internal + # artifact (e.g. ``.pdd/agentic_sync_state.json`` between + # runs) must not be resurrected by the revert helper. + # Match against the REPO-relative form before the + # module-cwd scoping below. 
+ matched_internal = False + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored(rel_posix, pattern): + allowed_files.add(absolute) + matched_internal = True + break + if matched_internal: + continue try: module_rel_posix = absolute.relative_to(cwd_path).as_posix() except ValueError: diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index b1d9c119d..0b0a48486 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -2669,6 +2669,129 @@ def test_companion_allowlist_default_auto_allows_pdd_meta(self, tmp_path): assert result is None assert meta_file.exists() + def test_pdd_audit_logs_do_not_trip_orchestrator_guard(self, tmp_path): + """Iter-36 B-1: PDD's own audit logs at ``.pdd/agentic-logs/`` written + by :func:`run_agentic_task` during the orchestrator's pre-dispatch + LLM calls MUST NOT hard-fail a contracted sync run. The audit log is + tool infrastructure (NEVER part of a contract) and the internal + allowlist auto-allows it without the contract needing to opt in. + + Baseline snapshot is empty (the log appears AFTER snapshot, mid-run); + the guard MUST still return None purely on the internal allowlist + match. + """ + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + + # Audit log appears AFTER baseline snapshot — this is the realistic + # scenario: ``run_agentic_task`` writes a session record during the + # LLM call that itself happens between snapshot and guard. 
+ log_dir = tmp_path / ".pdd" / "agentic-logs" + log_dir.mkdir(parents=True) + log_file = log_dir / "session_20251215_120000.jsonl" + log_file.write_text('{"label": "step1"}\n', encoding="utf-8") + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored={}, + quiet=True, + ) + assert result is None, ( + f"iter-36 B-1: PDD audit log under .pdd/agentic-logs/ must be " + f"auto-allowed by the internal allowlist; got diagnostic: " + f"{result!r}" + ) + # The log must still exist — internal-allowlisted, not reverted. + assert log_file.exists() + + def test_orchestrator_guard_flags_deleted_untracked_baseline(self, tmp_path): + """Iter-36 B-3: untracked baseline files that disappear between + snapshot and guard MUST be surfaced as ``remaining`` (hard-fail) by + the orchestrator guard. Prior to iter-36 the orchestrator silently + ``continue``d on ``current_hash is None`` and lost user WIP without + a trace. Mirrors the per-module guard's iter-34 fix. + """ + _init_git_repo(tmp_path) + contract = self._contract("pdd/foo.py") + user_wip = tmp_path / "userwip.py" + user_wip.write_text("user code\n", encoding="utf-8") + + # Snapshot the baseline (untracked WIP captured at orchestrator entry). + baseline = {"userwip.py": _hash_baseline_single(tmp_path, "userwip.py")} + + # Orchestrator deletes the WIP before runner dispatch — simulate by + # deleting the file after snapshot. 
+ user_wip.unlink() + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed=baseline, + baseline_ignored={}, + quiet=True, + ) + assert result is not None, ( + "iter-36 B-3: deletion of untracked baseline WIP must hard-fail " + "the orchestrator — silent data loss otherwise" + ) + assert "userwip.py" in result, ( + f"iter-36 B-3: deleted baseline path must appear in diagnostic, " + f"got: {result!r}" + ) + + def test_orchestrator_guard_flags_deleted_ignored_baseline(self, tmp_path): + """Iter-36 B-3 (symmetric): pre-existing gitignored baseline files + that disappear between snapshot and guard MUST also surface in the + orchestrator's ``remaining`` set. ``git ls-files --ignored`` only + lists files that currently exist, so a deleted ignored baseline + leaves no trail in the ignored-rescan loop. + """ + _init_git_repo(tmp_path) + # gitignore must be committed before the cache file is created so + # the cache is treated as a tracked-ignore at baseline time. + gi = tmp_path / ".gitignore" + gi.write_text("cache.bin\n", encoding="utf-8") + subprocess.run( + ["git", "-C", str(tmp_path), "add", ".gitignore"], check=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "--quiet", "-m", "gi"], + check=True, + ) + + contract = self._contract("pdd/foo.py") + cache = tmp_path / "cache.bin" + cache.write_text("user cache\n", encoding="utf-8") + + # Baseline snapshot of the ignored file. + baseline_ignored = { + "cache.bin": _hash_baseline_single(tmp_path, "cache.bin") + } + + # Orchestrator deletes the cache before runner dispatch. 
+ cache.unlink() + + result = _enforce_orchestrator_scope( + tmp_path, + issue_contract=contract, + scope_guard=True, + baseline_changed={}, + baseline_ignored=baseline_ignored, + quiet=True, + ) + assert result is not None, ( + "iter-36 B-3: deletion of pre-existing ignored baseline must " + "hard-fail the orchestrator — git ls-files --ignored cannot see it" + ) + assert "cache.bin" in result, ( + f"iter-36 B-3: deleted ignored baseline must appear in diagnostic, " + f"got: {result!r}" + ) + # --------------------------------------------------------------------------- # Orchestrator scope guard integration (iter-30) diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 42522ad31..3c8d15ba8 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -3848,6 +3848,134 @@ def fake_run(cmd, *args, **kwargs): "ignored-scan failure must surface the sentinel" ) + # --------------------------------------------------------------------- + # Iter-36 B-1/B-2: PDD-internal-path allowlist + # --------------------------------------------------------------------- + + def test_pdd_audit_logs_do_not_trip_runner_guard( + self, tmp_path, monkeypatch + ): + """Iter-36 B-1: PDD's own audit logs at ``.pdd/agentic-logs/`` written + by :func:`run_agentic_task` during a per-module sync MUST NOT + hard-fail the per-module scope guard. The audit log is tool + infrastructure (NEVER part of a contract) and the internal allowlist + auto-allows it. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + monkeypatch.chdir(tmp_path) + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # Audit log appears AFTER runner init — simulates run_agentic_task + # writing a session record during the per-module subprocess. 
+ log_dir = tmp_path / ".pdd" / "agentic-logs" + log_dir.mkdir(parents=True) + log_file = log_dir / "session_20251215_120000.jsonl" + log_file.write_text('{"label": "step1"}\n', encoding="utf-8") + + # Real revert helpers — internal allowlist must keep the log alive. + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + assert diagnostic is None, ( + f"iter-36 B-1: PDD audit log under .pdd/agentic-logs/ must be " + f"auto-allowed by the internal allowlist in the per-module " + f"guard; got diagnostic: {diagnostic!r}" + ) + assert log_file.exists(), ( + "iter-36 B-1: internal-allowlisted audit log must not be removed" + ) + + def test_pdd_audit_logs_do_not_trip_runner_guard_multi_module( + self, tmp_path, monkeypatch + ): + """Iter-36 B-1/B-2 multi-module variant: when ``module_cwd`` is a + subdirectory (multi-module sync), the audit log under + ``/.pdd/agentic-logs/`` is REPO-rooted, not module-rooted. + The internal allowlist pass must scan repo-rooted so it still matches + — a module-rooted-only pass would miss it. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + monkeypatch.chdir(tmp_path) + + # Module lives at /mod_a/; audit log lives at + # /.pdd/agentic-logs/. 
+ module_cwd = tmp_path / "mod_a" + module_cwd.mkdir() + log_dir = tmp_path / ".pdd" / "agentic-logs" + log_dir.mkdir(parents=True) + log_file = log_dir / "session_20251215_120000.jsonl" + log_file.write_text('{"label": "step1"}\n', encoding="utf-8") + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod_a", module_cwd) + assert diagnostic is None, ( + f"iter-36 B-1/B-2: in multi-module sync the audit log lives at " + f"repo-rooted .pdd/agentic-logs/, NOT module-rooted; a " + f"module-rooted-only allowlist pass would miss it. " + f"Got diagnostic: {diagnostic!r}" + ) + assert log_file.exists() + + def test_runner_state_file_does_not_trip_per_module_guard( + self, tmp_path, monkeypatch + ): + """Iter-36 B-2: ``.pdd/agentic_sync_state.json`` written by + :meth:`AsyncSyncRunner._record_result` after the previous module's + scope guard runs is on disk when the NEXT module's guard runs in a + multi-module sync. Without the internal allowlist, the next module's + guard hard-fails on the previous module's state file. Verify the + state file is auto-allowed. + """ + from pdd import agentic_sync_runner as mod + + self._init_git_repo(tmp_path) + monkeypatch.chdir(tmp_path) + + # Simulate previous-module state file present on disk. 
+ state_dir = tmp_path / ".pdd" + state_dir.mkdir(parents=True, exist_ok=True) + state_file = state_dir / "agentic_sync_state.json" + state_file.write_text('{"version": 1, "modules": {}}', encoding="utf-8") + + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + assert diagnostic is None, ( + f"iter-36 B-2: runner state file at .pdd/agentic_sync_state.json " + f"must be auto-allowed by the internal allowlist; " + f"got diagnostic: {diagnostic!r}" + ) + assert state_file.exists(), ( + "iter-36 B-2: internal-allowlisted state file must not be removed" + ) + # --------------------------------------------------------------------------- # Issue #745: initial_cost (LLM module analysis cost) tracking From 2db79b20988bf2a600b9538dd23416be966f5bf4 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 16:01:34 -0700 Subject: [PATCH 40/42] fix(sync): iter-38 fail-closed baseline acquisition at init MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-37 (last remaining major after iter-36 closed parity gaps): when ``_git_changed_paths`` / ``_git_ignored_paths`` fails at runner / orchestrator ``__init__`` (transient git timeout, missing binary, OSError, non-zero return), the helpers returned ``set()`` — indistinguishable from "scan succeeded, worktree clean." The empty baseline was stored. Later enforcement-time probes typically succeed (transient passed), and the scope guard treated the empty baseline as "user had nothing dirty," so any pre-existing user file was falsely flagged as out-of-scope and reverted/deleted. Same fail-open pattern as iter-9 sentinel but at init-time rather than enforcement-time. 
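The ambiguity can be sketched as follows. This is an illustration under assumptions, not the runner's code: ``acquire_baseline`` and the two scan callables are hypothetical names. The only point is that ``None`` (scan failed) and ``set()`` (scan succeeded, worktree clean) must stay distinguishable so the caller can abort instead of storing an empty baseline.

```python
from typing import Callable, Optional, Set, Tuple

# Hypothetical scan signature: a set of repo-relative paths, or None on failure.
Scan = Callable[[], Optional[Set[str]]]


def acquire_baseline(
    scan_changed: Scan, scan_ignored: Scan
) -> Tuple[bool, Set[str], Set[str]]:
    """Return (failed, changed_paths, ignored_paths).

    If either scan reports failure (None), the baseline is unusable and
    the caller should abort before doing any write-capable work.
    """
    changed = scan_changed()
    ignored = scan_ignored()
    if changed is None or ignored is None:
        # Fail closed: do not pretend the worktree was clean.
        return True, set(), set()
    return False, changed, ignored


# A failed scan is reported as a failure, not as a clean worktree.
failed_result = acquire_baseline(lambda: None, lambda: set())
# A clean worktree is a successful scan with empty sets.
clean_result = acquire_baseline(lambda: set(), lambda: set())
```

With a helper that returns ``set()`` on failure, the two calls above would be indistinguishable, which is the fail-open case described here.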
Change helper return type from ``set[str]`` to ``Optional[set[str]]``. ``None`` signals failure; empty set still signals "clean worktree." Runner/orchestrator init records ``_baseline_acquisition_failed`` flag when either helper returned ``None`` (only when scope_guard is enabled with a contract — permissive mode skips the scan and the gate). ``AsyncSyncRunner.run()`` short-circuits with a fail-closed message before any write-capable work. ``run_agentic_sync`` mirrors the gate before any pre-dispatch LLM/shell work AND posts the GitHub comment. DurableSyncRunner inherits via super().__init__()/run(). Enforcement-time call sites (iter-20 ignored re-scan, iter-9 fail- closed boundary) treat ``None`` as ``or set()`` since their separate ```` sentinel handles those policy decisions. Prompt drift fix in agentic_sync_runner_python.prompt §44-46 documents the new helper signature and ``_baseline_acquisition_failed`` contract. Tests: 6 runner-level cases (changed-None, ignored-None, OSError, permissive skip, scope-guard-disabled skip, success-empty regression) plus 4 orchestrator-level cases (changed-None abort, ignored-None abort, success regression, permissive skip). 731 passed. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync.py | 33 ++- pdd/agentic_sync_runner.py | 128 ++++++++--- pdd/prompts/agentic_sync_runner_python.prompt | 7 +- tests/test_agentic_sync.py | 214 ++++++++++++++++++ tests/test_agentic_sync_runner.py | 164 ++++++++++++++ 5 files changed, 506 insertions(+), 40 deletions(-) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index 57abc6e38..5c3fe0091 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -1844,7 +1844,12 @@ def _enforce_orchestrator_scope( # Iter-36 B-1/B-2: same internal-allowlist parallel pass — a tracked # deletion of ``.pdd/agentic_sync_state.json`` (e.g. between sync # invocations) must NOT be resurrected by the revert helper. 
- for rel_posix in _git_changed_paths(repo_root): + # + # Iter-38 M-1: ``_git_changed_paths`` now returns ``None`` on scan + # failure (was empty set). Enforcement-time scan failures are already + # handled separately downstream; here we treat ``None`` as the empty + # set so iteration is a no-op rather than crashing. + for rel_posix in (_git_changed_paths(repo_root) or set()): absolute = (repo_root / rel_posix).resolve() if _matches_companion(rel_posix, allowlist): allowed_files.add(absolute) @@ -2349,11 +2354,33 @@ def run_agentic_sync( # only branches to worktrees inside :class:`DurableSyncRunner.run`, # which runs after this guard is no longer relevant. if scope_guard and issue_contract is not None: + # Iter-38 M-1 (fail-closed baseline acquisition): the helpers now + # return ``None`` on transient git failure (lock contention, missing + # binary, OSError) instead of an empty set. Without this + # discrimination an init-time scan failure here would silently + # produce an empty baseline that the orchestrator scope guard + # later treats as "no pre-existing files," so any pre-existing + # user WIP could be reverted/deleted by the pre-dispatch check. + # When EITHER scan fails we abort BEFORE any LLM call or shell + # command runs in this orchestrator. The runner has its own + # symmetric guard in :meth:`AsyncSyncRunner.run`. + _raw_orch_changed = _git_changed_paths(project_root) + _raw_orch_ignored = _git_ignored_paths(project_root) + if _raw_orch_changed is None or _raw_orch_ignored is None: + msg = ( + "Scope guard fail-closed: could not snapshot working-tree " + "baseline at orchestrator init (git scan failed). Aborting " + "before any pre-dispatch LLM/shell work to prevent " + "false-positive reverts of pre-existing user files." 
+ ) + if use_github_state: + _post_error_comment(owner, repo, issue_number, msg) + return False, msg, 0.0, "" _orch_baseline_changed: Dict[str, Optional[str]] = _hash_baseline_paths( - project_root, _git_changed_paths(project_root) + project_root, _raw_orch_changed ) _orch_baseline_ignored: Dict[str, Optional[str]] = _hash_baseline_paths( - project_root, _git_ignored_paths(project_root) + project_root, _raw_orch_ignored ) else: _orch_baseline_changed = {} diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index aa9942bbe..961b0ba0e 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -158,8 +158,26 @@ def _normalize_repo_path(path: str) -> str: return cleaned -def _git_changed_paths(project_root: Path) -> set[str]: - """Return changed paths from git status, including untracked files.""" +def _git_changed_paths(project_root: Path) -> Optional[set[str]]: + """Return changed paths from git status, or ``None`` on scan failure. + + Iter-38 M-1 (fail-closed baseline acquisition): previously returned an + empty set on any subprocess failure or non-zero return, indistinguishable + from "scan succeeded but worktree was clean." That ambiguity let a + transient git failure at runner/orchestrator init time produce an empty + baseline that the scope guard later treats as "user had nothing dirty," + so any pre-existing user file is falsely flagged as out-of-scope and + reverted/deleted. + + A successful scan that finds no changes returns an empty set; only + failures (OSError, ``subprocess.SubprocessError``, non-zero return code) + return ``None``. Init-time callers MUST treat ``None`` as a fail-closed + abort signal (see :class:`AsyncSyncRunner.__init__` and + :func:`pdd.agentic_sync.run_agentic_sync`). Enforcement-time callers + that already have a separate ```` policy (see + :meth:`_remaining_out_of_scope_paths`) treat ``None`` as the empty set + via ``or set()``. 
+ """ try: result = subprocess.run( ["git", "status", "--porcelain", "--untracked-files=all"], @@ -169,9 +187,9 @@ def _git_changed_paths(project_root: Path) -> set[str]: check=False, ) except (OSError, subprocess.SubprocessError): - return set() + return None if result.returncode != 0: - return set() + return None paths: set[str] = set() for line in result.stdout.splitlines(): @@ -189,7 +207,7 @@ def _git_changed_paths(project_root: Path) -> set[str]: return {p for p in paths if p} -def _git_ignored_paths(project_root: Path) -> set[str]: +def _git_ignored_paths(project_root: Path) -> Optional[set[str]]: """Return repo-relative POSIX paths of git-ignored files (Issue #1013 iter-20). Uses ``git ls-files --others --ignored --exclude-standard`` to enumerate @@ -199,10 +217,19 @@ def _git_ignored_paths(project_root: Path) -> set[str]: ``scope_guard_enabled AND allowed_write_paths is not None`` so non-contract runs do not pay the cost. - Returns an empty set on any subprocess failure or non-zero return — the - baseline snapshot is best-effort. The post-revert re-scan in - ``_remaining_out_of_scope_paths`` handles ignored-scan failures with the - explicit ```` sentinel instead. + Iter-38 M-1 (fail-closed baseline acquisition): returns ``None`` on any + subprocess failure or non-zero return, NOT an empty set. The init-time + callers that snapshot the baseline (see :class:`AsyncSyncRunner.__init__` + and :func:`pdd.agentic_sync.run_agentic_sync`) MUST treat ``None`` as a + fail-closed abort signal — otherwise a transient git failure at init + silently produces an empty baseline that the scope guard later treats as + "no pre-existing files," so any pre-existing user WIP is falsely flagged + as out-of-scope and reverted/deleted. + + Enforcement-time callers (post-revert re-scan in + :meth:`_remaining_out_of_scope_paths`) handle ignored-scan failures with + the separate ```` sentinel; those sites treat ``None`` + as the empty set via ``or set()``. 
""" try: result = subprocess.run( @@ -213,9 +240,9 @@ def _git_ignored_paths(project_root: Path) -> set[str]: check=False, ) except (OSError, subprocess.SubprocessError): - return set() + return None if result.returncode != 0: - return set() + return None paths: set[str] = set() for line in result.stdout.splitlines(): @@ -1050,15 +1077,7 @@ def __init__( # gate (``scope_guard_enabled AND allowed_write_paths is not None``) # is off, so the dict.items() loops in the enforcement path are # safe no-ops. - if self.scope_guard_enabled and self.allowed_write_paths is not None: - _baseline_paths = _git_changed_paths(self.project_root) - self._baseline_changed_paths: Dict[str, Optional[str]] = { - rel: _hash_file(self.project_root, rel) - for rel in _baseline_paths - } - else: - self._baseline_changed_paths = {} - + # # Iter-20 M-1 (gitignored fail-open): also snapshot pre-existing # gitignored files (e.g. user-side ``build/cache.bin`` under a # repo-wide ``.gitignore: build/``). ``git status`` does not show @@ -1067,22 +1086,36 @@ def __init__( # invisible to the post-revert re-scan and the module would be # marked successful with the contract violated on disk. # - # Gated on ``scope_guard_enabled AND allowed_write_paths is not None`` - # (same gate as ``_baseline_changed_paths``) so non-contract runs do - # not pay the ``git ls-files`` cost on repos with large ignored - # trees (``node_modules/``, ``build/``, etc.). + # Iter-24 M-1: same dict-with-SHA shape as ``_baseline_changed_paths``. # - # Iter-24 M-1: same dict-with-SHA shape as ``_baseline_changed_paths`` - # — pre-existing ignored files are skipped from the gitignored - # re-scan ONLY when their content is unchanged. A clobbered ignored - # baseline path surfaces via the re-scan. + # Iter-38 M-1 (fail-closed baseline acquisition): the helpers now + # return ``None`` on scan failure (transient git lock contention, + # missing binary, OSError) instead of an empty set. 
Without this + # discrimination an init-time scan failure would silently produce + # an empty baseline that the scope guard later treats as "no + # pre-existing files," so any pre-existing user WIP is falsely + # flagged as out-of-scope and reverted/deleted. When EITHER scan + # returns ``None`` we record ``_baseline_acquisition_failed = True`` + # and :meth:`run` aborts before any write-capable work. The flag + # is internal — public signatures are unchanged. if self.scope_guard_enabled and self.allowed_write_paths is not None: - _baseline_ignored = _git_ignored_paths(self.project_root) - self._baseline_ignored_paths: Dict[str, Optional[str]] = { - rel: _hash_file(self.project_root, rel) - for rel in _baseline_ignored - } + _raw_changed = _git_changed_paths(self.project_root) + _raw_ignored = _git_ignored_paths(self.project_root) + if _raw_changed is None or _raw_ignored is None: + self._baseline_acquisition_failed: bool = True + self._baseline_changed_paths: Dict[str, Optional[str]] = {} + self._baseline_ignored_paths: Dict[str, Optional[str]] = {} + else: + self._baseline_acquisition_failed = False + self._baseline_changed_paths = _hash_baseline_paths( + self.project_root, _raw_changed + ) + self._baseline_ignored_paths = _hash_baseline_paths( + self.project_root, _raw_ignored + ) else: + self._baseline_acquisition_failed = False + self._baseline_changed_paths = {} self._baseline_ignored_paths = {} self.total_budget = self.sync_options.get("total_budget") @@ -1684,6 +1717,26 @@ def run(self) -> Tuple[bool, str, float]: # here, which produced a duplicate line for every sync. Removed so # the operator sees one authoritative status line. + # Iter-38 M-1 (fail-closed baseline acquisition): the __init__ + # baseline-scan helpers (``_git_changed_paths`` / ``_git_ignored_paths``) + # now return ``None`` on transient git failure (lock contention, + # missing binary, OSError) rather than an empty set. 
An empty + # baseline indistinguishable from "scan succeeded, worktree clean" + # would later cause the scope guard to flag pre-existing user WIP + # as out-of-scope and revert/delete it. When the init recorded a + # failed acquisition, abort BEFORE any write-capable work runs. + if getattr(self, "_baseline_acquisition_failed", False): + return ( + False, + ( + "Scope guard fail-closed: could not snapshot working-tree " + "baseline at runner init (git scan failed). Aborting " + "before any write-capable work to prevent false-positive " + "reverts of pre-existing user files." + ), + self.initial_cost, + ) + if not self.basenames: return True, "No modules to sync", self.initial_cost @@ -2257,7 +2310,14 @@ def _enforce_scope_guard( # scope to ``cwd_path`` FIRST, then match the module-relative form # against the companion pattern (same semantics as the rglob loop # above). - for rel_posix in _git_changed_paths(repo_root): + # + # Iter-38 M-1: ``_git_changed_paths`` now returns ``None`` on + # scan failure (was empty set). Enforcement-time scan failures + # are already handled by the ```` sentinel in + # :meth:`_remaining_out_of_scope_paths`; here we just treat + # ``None`` as the empty set so the iteration is a no-op rather + # than crashing. + for rel_posix in (_git_changed_paths(repo_root) or set()): absolute = (repo_root / rel_posix).resolve() # Iter-36 B-1/B-2: tracked deletion of a PDD-internal # artifact (e.g. ``.pdd/agentic_sync_state.json`` between diff --git a/pdd/prompts/agentic_sync_runner_python.prompt b/pdd/prompts/agentic_sync_runner_python.prompt index a690730ae..c5f64e9eb 100644 --- a/pdd/prompts/agentic_sync_runner_python.prompt +++ b/pdd/prompts/agentic_sync_runner_python.prompt @@ -41,10 +41,11 @@ Parallel sync engine that runs `pdd sync` for multiple modules concurrently usin - `scope_guard_enabled: bool` — master switch (default `True`). 
When `False`, the runner records the parsed contract for diagnostics but performs no enforcement, no revert, and no hard-fail. Maps to the CLI `--no-scope-guard` opt-out. - `contract_source: Optional[str]` — diagnostic label carrying the parse source of the issue contract (`"html-comment"`, `"fenced-block"`, or `"bullet-list"`, matching `IssueContract.source`) so scope-guard diagnostics and downstream review-loop reporters can surface where the contract was detected. `None` when no contract was parsed (permissive fallback). - `project_root: Optional[Path]` — when non-`None`, overrides the default `Path.cwd()` used to seed `self.project_root` and to take the baseline-changed-paths snapshot (Issue #1013 iter-18 M-1). Subclasses such as `DurableSyncRunner` pin this to the durable worktree's git root so the baseline reflects the worktree where syncs will actually run, not the caller's current working directory. Resolved with `Path(project_root).resolve()` when provided. - - `_baseline_changed_paths: Dict[str, Optional[str]]` (Issue #1013 iter-6 B1 + iter-24 M-1) — snapshot of pre-existing dirty/untracked working-tree paths captured from `_git_changed_paths(project_root)` at runner init, mapping each repo-relative POSIX path to its init-time SHA-1 (`_hash_file(project_root, rel)`) or `None` when the file was unreadable at init. Iter-6 B1 originated the iter-6 B1 "preserve user's pre-existing untracked files" carve-out so the scope guard does not delete unrelated user WIP. **Iter-24 M-1 (baseline-clobber bug)** upgraded preservation from name-based to content-aware: the old `Set[str]` snapshot let a buggy LLM silently OVERWRITE an out-of-scope baseline file (the iter-23 codex repro: `outside: sync clobbered`). The dict + SHA invariant: a baseline path is auto-allowed (added to `allowed_files`) by `_enforce_scope_guard` ONLY IF its current SHA-1 matches the init-time SHA-1; a divergent SHA falls through to the contract check and surfaces the clobber. 
Gated on `scope_guard_enabled AND allowed_write_paths is not None`; when the gate is off the snapshot is an empty dict so the `.items()` iteration in the enforcement path is a no-op. The `_hash_file` helper uses SHA-1 because this is clobber detection, not adversarial collision resistance. - - `_baseline_ignored_paths: Dict[str, Optional[str]]` (Issue #1013 iter-20 M-1 + iter-24 M-1) — sibling snapshot to `_baseline_changed_paths`, populated from `git ls-files --others --ignored --exclude-standard` at init via the helper `_git_ignored_paths(project_root)`. Records repo-relative POSIX paths of pre-existing gitignored files (e.g. user-side `build/cache.bin` under a repo-wide `.gitignore: build/`) so the post-revert re-scan does not flag them as the sync run's out-of-scope writes. Gated identically to `_baseline_changed_paths` (`scope_guard_enabled AND allowed_write_paths is not None`) so non-contract runs do not pay the `git ls-files` cost on repos with large ignored trees (`node_modules/`, `build/`, etc.). **Iter-24 M-1** same dict-with-SHA shape as `_baseline_changed_paths`: pre-existing ignored files are skipped from the gitignored re-scan ONLY when their content is byte-identical to the init snapshot; a clobbered ignored baseline path surfaces via the re-scan as out-of-scope. + - `_baseline_changed_paths: Dict[str, Optional[str]]` (Issue #1013 iter-6 B1 + iter-24 M-1 + iter-38 M-1) — snapshot of pre-existing dirty/untracked working-tree paths captured from `_git_changed_paths(project_root)` at runner init, mapping each repo-relative POSIX path to its init-time SHA-1 (`_hash_file(project_root, rel)`) or `None` when the file was unreadable at init. Iter-6 B1 introduced the "preserve user's pre-existing untracked files" carve-out so the scope guard does not delete unrelated user WIP.
**Iter-24 M-1 (baseline-clobber bug)** upgraded preservation from name-based to content-aware: the old `Set[str]` snapshot let a buggy LLM silently OVERWRITE an out-of-scope baseline file (the iter-23 codex repro: `outside: sync clobbered`). The dict + SHA invariant: a baseline path is auto-allowed (added to `allowed_files`) by `_enforce_scope_guard` ONLY IF its current SHA-1 matches the init-time SHA-1; a divergent SHA falls through to the contract check and surfaces the clobber. Gated on `scope_guard_enabled AND allowed_write_paths is not None`; when the gate is off the snapshot is an empty dict so the `.items()` iteration in the enforcement path is a no-op. The `_hash_file` helper uses SHA-1 because this is clobber detection, not adversarial collision resistance. **Iter-38 M-1 (fail-closed baseline acquisition)** upgraded the helper signature from `set[str]` to `Optional[set[str]]`: a transient init-time git failure (lock contention, missing binary, `OSError`) now returns `None` (distinguishable from "scan succeeded, clean worktree" which returns the empty set). When EITHER `_git_changed_paths` or `_git_ignored_paths` returns `None` at init, the runner records `_baseline_acquisition_failed = True` and `run()` aborts before any write-capable work runs — otherwise the empty baseline would later cause the scope guard to flag pre-existing user WIP as out-of-scope and revert/delete it. + - `_baseline_ignored_paths: Dict[str, Optional[str]]` (Issue #1013 iter-20 M-1 + iter-24 M-1 + iter-38 M-1) — sibling snapshot to `_baseline_changed_paths`, populated from `git ls-files --others --ignored --exclude-standard` at init via the helper `_git_ignored_paths(project_root)`. Records repo-relative POSIX paths of pre-existing gitignored files (e.g. user-side `build/cache.bin` under a repo-wide `.gitignore: build/`) so the post-revert re-scan does not flag them as the sync run's out-of-scope writes. 
Gated identically to `_baseline_changed_paths` (`scope_guard_enabled AND allowed_write_paths is not None`) so non-contract runs do not pay the `git ls-files` cost on repos with large ignored trees (`node_modules/`, `build/`, etc.). **Iter-24 M-1** same dict-with-SHA shape as `_baseline_changed_paths`: pre-existing ignored files are skipped from the gitignored re-scan ONLY when their content is byte-identical to the init snapshot; a clobbered ignored baseline path surfaces via the re-scan as out-of-scope. **Iter-38 M-1** same `Optional[set[str]]` helper signature: a `None` return from `_git_ignored_paths` at init also triggers the fail-closed `_baseline_acquisition_failed` flag. + - `_baseline_acquisition_failed: bool` (Issue #1013 iter-38 M-1) — internal flag set during `__init__` when either init-time baseline scan helper returns `None` (transient git failure). When `True`, `run()` MUST return `(False, fail-closed-message, initial_cost)` before any write-capable work (subprocess dispatch, executor submission) runs. Default `False`; not part of the public API. The flag is also evaluated in the durable runner's inherited `super().run()` call so a fail-closed init aborts durable mode at the same boundary. - Tracks per-module state: pending -> running -> success | failed -2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs +2. Method: `run() -> Tuple[bool, str, float]` — returns (all_success, summary_message, total_cost) where total_cost includes initial_cost + per-module costs. 
**Iter-38 M-1 (fail-closed baseline acquisition):** before any other work, check `self._baseline_acquisition_failed` (set during `__init__` when either init-time baseline scan helper returned `None`) and short-circuit with `(False, "Scope guard fail-closed: could not snapshot working-tree baseline at runner init …", self.initial_cost)` so a transient init-time git failure cannot let the scope guard later treat an empty baseline as "no pre-existing files" and revert/delete user WIP. 3. Use `concurrent.futures.ThreadPoolExecutor` with `MAX_WORKERS = 4`; when `sync_options["total_budget"]` is set, run sequentially and pass only the remaining total budget to each child process so the total budget is not multiplied per module. 4. Dependency-aware scheduling: a module starts only when all its dependencies (within target set) have status "success" 5. Dependencies outside `basenames` are omitted from `dep_graph` (partial sync). Callers should rely on `build_dep_graph_from_architecture` warnings for visibility when an architecture edge points outside the target set; the runner still assumes those deps are out of scope for this run. diff --git a/tests/test_agentic_sync.py b/tests/test_agentic_sync.py index 0b0a48486..77d73d576 100644 --- a/tests/test_agentic_sync.py +++ b/tests/test_agentic_sync.py @@ -3287,6 +3287,220 @@ def llm_side_effect(*_args, **_kwargs): mock_runner_cls.assert_called_once() mock_runner_cls.return_value.run.assert_called_once() + # --------------------------------------------------------------------- + # Iter-38 M-1: fail-closed baseline acquisition at orchestrator init. + # When ``_git_changed_paths`` / ``_git_ignored_paths`` return ``None`` + # (transient git lock contention, missing binary, OSError) the + # orchestrator MUST abort BEFORE any pre-dispatch LLM call or shell + # command. An empty baseline produced by a silent scan failure would + # later let the pre-dispatch scope guard revert pre-existing user WIP. 
+ + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_aborts_when_baseline_changed_scan_fails( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """Init-time ``_git_changed_paths`` returns ``None`` → orchestrator + fail-closes before any LLM call or runner construction.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + # Patch the helpers on the agentic_sync module (where they're + # imported by name) — the orchestrator looks them up here. + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", lambda _root: None) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", lambda _root: set()) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + use_github_state=False, + ) + + assert success is False + assert "fail-closed" in msg + assert "baseline" in msg + # Downstream LLM / runner construction MUST NOT have run. 
+ mock_agentic_task.assert_not_called() + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_aborts_when_baseline_ignored_scan_fails( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + mock_runner_cls, + tmp_path, + monkeypatch, + ): + """Init-time ``_git_ignored_paths`` returns ``None`` → orchestrator + fail-closes before any LLM call or runner construction.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", lambda _root: set()) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", lambda _root: None) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", + quiet=True, + use_github_state=False, + ) + + assert success is False + assert "fail-closed" in msg + mock_agentic_task.assert_not_called() + mock_runner_cls.assert_not_called() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_proceeds_when_baseline_scans_succeed( + 
self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Regression: both baseline scans succeed (empty set is a valid + success result) → orchestrator proceeds normally.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_WITH_BULLET_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + # Successful scans returning empty sets (clean worktree). + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", lambda _root: set()) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", lambda _root: set()) + + mock_agentic_task.return_value = ( + True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + assert success is True + assert "fail-closed" not in msg + mock_runner_cls.assert_called_once() + + @patch("pdd.agentic_sync._filter_already_synced") + @patch("pdd.agentic_sync._run_dry_run_validation") + @patch("pdd.agentic_sync.AsyncSyncRunner") + @patch("pdd.agentic_sync.load_prompt_template", return_value="t {issue_content} {architecture_json}") + @patch("pdd.agentic_sync.run_agentic_task") + @patch("pdd.agentic_sync._load_architecture_json") + @patch("pdd.agentic_sync._run_gh_command") + @patch("pdd.agentic_sync._check_gh_cli", return_value=True) + def test_orchestrator_does_not_fail_closed_in_permissive_mode( + self, + _mock_gh_cli, + mock_gh_cmd, + mock_load_arch, + 
mock_agentic_task, + _mock_load_prompt, + mock_runner_cls, + mock_dry_run, + mock_filter_synced, + tmp_path, + monkeypatch, + ): + """Permissive mode (no contract on issue) → baseline scan is never + invoked, so a hypothetical ``None`` return from the helpers does + NOT trigger the fail-closed abort. Run proceeds normally.""" + _init_git_repo(tmp_path) + monkeypatch.setattr("pdd.agentic_sync._find_project_root", lambda *_: tmp_path) + monkeypatch.setattr( + "pdd.agentic_sync._detect_modules_from_branch_diff", lambda *_: [] + ) + mock_gh_cmd.return_value = ( + True, self._issue_payload(self._ISSUE_BODY_NO_CONTRACT) + ) + mock_load_arch.return_value = (None, tmp_path / "architecture.json") + + called = {"changed": 0, "ignored": 0} + + def fake_changed(_root): + called["changed"] += 1 + return None # Would fail-close if the gate let us reach here. + + def fake_ignored(_root): + called["ignored"] += 1 + return None + + monkeypatch.setattr("pdd.agentic_sync._git_changed_paths", fake_changed) + monkeypatch.setattr("pdd.agentic_sync._git_ignored_paths", fake_ignored) + + mock_agentic_task.return_value = ( + True, 'MODULES_TO_SYNC: ["foo"]\nDEPS_VALID: true', 0.01, "anthropic" + ) + mock_dry_run.return_value = (True, {"foo": tmp_path}, [], 0.0) + mock_filter_synced.return_value = ["foo"] + mock_runner_cls.return_value.run.return_value = (True, "ok", 0.0) + + success, msg, _cost, _model = run_agentic_sync( + "https://github.com/owner/repo/issues/1", quiet=True + ) + + # Init-time helpers are gated on (scope_guard AND issue_contract is + # not None). In permissive mode they MUST NOT be called for the + # baseline acquisition, so the fail-closed abort cannot trigger. 
+ assert called["changed"] == 0, ( + "permissive mode must not invoke the init-time changed scan" + ) + assert called["ignored"] == 0, ( + "permissive mode must not invoke the init-time ignored scan" + ) + assert success is True + assert "fail-closed" not in msg + mock_runner_cls.assert_called_once() + # --------------------------------------------------------------------------- # _resolve_module_cwd diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 3c8d15ba8..8386e0acc 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -2667,6 +2667,170 @@ def test_async_runner_project_root_kwarg_overrides_cwd( assert "out.py" not in runner._baseline_changed_paths +class TestBaselineFailClosed: + """Issue #1013 iter-38 M-1: when the init-time baseline scan fails + (transient git lock contention, missing binary, OSError), the runner + MUST record ``_baseline_acquisition_failed=True`` and abort + :meth:`run` before any write-capable work runs. An empty baseline + indistinguishable from "scan succeeded, worktree clean" would cause + the scope guard to later flag pre-existing user WIP as out-of-scope + and revert/delete it. 
+ """ + + def test_async_runner_aborts_when_baseline_changed_scan_fails( + self, monkeypatch + ): + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: None) + monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: set()) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is True + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + + success, msg, cost = runner.run() + assert success is False + assert "fail-closed" in msg + assert "baseline" in msg + assert cost == 0.0 + + def test_async_runner_aborts_when_baseline_ignored_scan_fails( + self, monkeypatch + ): + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: set()) + monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: None) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is True + success, msg, _cost = runner.run() + assert success is False + assert "fail-closed" in msg + + def test_async_runner_aborts_when_baseline_scan_raises_oserror( + self, monkeypatch + ): + """Verify the actual subprocess exception path: ``_git_changed_paths`` + catches ``OSError`` and returns ``None``, which must propagate to + the fail-closed flag.""" + from pdd import agentic_sync_runner as mod + + def boom(*_args, **_kwargs): + raise OSError("git binary missing") + + monkeypatch.setattr(mod.subprocess, "run", boom) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is True + success, msg, _cost = 
runner.run() + assert success is False + assert "fail-closed" in msg + + def test_async_runner_no_flag_when_baseline_scan_fails_in_permissive_mode( + self, monkeypatch + ): + """When ``allowed_write_set`` is ``None`` (permissive), the helpers + are never called and no failure can be recorded — the run proceeds.""" + from pdd import agentic_sync_runner as mod + + called = {"changed": 0, "ignored": 0} + + def fake_changed(_root): + called["changed"] += 1 + return None + + def fake_ignored(_root): + called["ignored"] += 1 + return None + + monkeypatch.setattr(mod, "_git_changed_paths", fake_changed) + monkeypatch.setattr(mod, "_git_ignored_paths", fake_ignored) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=None, + ) + + # Gate is off → helpers MUST NOT be called and the flag MUST be False. + assert called["changed"] == 0 + assert called["ignored"] == 0 + assert runner._baseline_acquisition_failed is False + + def test_async_runner_no_flag_when_scope_guard_disabled(self, monkeypatch): + """``scope_guard_enabled=False`` skips the baseline scan entirely — + even an OSError from ``subprocess.run`` cannot trigger fail-closed.""" + from pdd import agentic_sync_runner as mod + + def boom(*_args, **_kwargs): + raise OSError("would explode if reached") + + monkeypatch.setattr(mod.subprocess, "run", boom) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + scope_guard_enabled=False, + ) + + assert runner._baseline_acquisition_failed is False + + def test_async_runner_no_flag_when_scan_succeeds(self, monkeypatch): + """Regression: a successful scan returning an EMPTY set (clean + worktree) must NOT trigger fail-closed — only ``None`` does.""" + from pdd import agentic_sync_runner as mod + + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: set()) + 
monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: set()) + + runner = AsyncSyncRunner( + basenames=["a"], + dep_graph={"a": []}, + sync_options={}, + github_info=None, + quiet=True, + allowed_write_set=["pdd/a.py"], + ) + + assert runner._baseline_acquisition_failed is False + assert runner._baseline_changed_paths == {} + assert runner._baseline_ignored_paths == {} + + class TestEnforceScopeGuard: """Issue #1013 (F9): direct behavioural coverage for ``_enforce_scope_guard`` and ``_matches_companion_allowlist``. The constructor-state checks above From 35d41822633a9a17636a9edb2d75a1af9257b300 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 16:24:11 -0700 Subject: [PATCH 41/42] fix(sync): iter-40 distinguish unreadable vs missing baseline + durable init ordering MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-39 (2 Majors, no Blockers — strong convergence signal): M-1: _hash_file() returned None for BOTH "file gone" and "file unreadable" (permission error). The iter-34 deletion-detection code treated any None as deletion, so a baseline file that became unreadable mid-sync (permission flip, locked file) got misclassified as deleted and the diagnostic falsely claimed the file was removed. Add sibling helper _classify_baseline_path() returning a discriminated _BaselinePathStatus(sha, missing) NamedTuple. Route the 4 iter-34 baseline-iteration sites (per-module changed + ignored loops; orchestrator changed + ignored loops) through it. Unreadable files now preserve-by-name (same as iter-24's unreadable-at-init carve-out) instead of being misclassified as deletions. Three other _hash_file callers (_hash_baseline_paths and two collapsed-None re-scan loops) intentionally keep _hash_file since their fall-through semantics are already safe. M-2: DurableSyncRunner.run() called _prepare_durable_branch() BEFORE checking _baseline_acquisition_failed. 
A baseline-scan failure left durable side effects (worktree creation, branch checkout, possibly remote pushes) before the inherited fail-closed abort ran. Hoist the _baseline_acquisition_failed check above _prepare_durable_branch() in DurableSyncRunner.run() so the abort happens BEFORE any durable setup. Iter-22's {} baseline clear preserves the flag, so it correctly reflects the main-checkout scan failure (where the orchestrator scope guard operates). Tests: unreadable-not-misclassified (with chmod, skipped on Windows); missing-still-flagged-as-deleted regression for iter-34; durable-aborts-before-worktree-setup. 734 passed. Co-Authored-By: Claude Opus 4.7 --- pdd/agentic_sync.py | 30 ++++- pdd/agentic_sync_runner.py | 120 +++++++++++++++++++- pdd/durable_sync_runner.py | 30 +++++ tests/test_agentic_sync_runner.py | 183 ++++++++++++++++++++++++++++++ tests/test_durable_sync_runner.py | 68 +++++++++++ 5 files changed, 421 insertions(+), 10 deletions(-) diff --git a/pdd/agentic_sync.py b/pdd/agentic_sync.py index 5c3fe0091..186968097 100644 --- a/pdd/agentic_sync.py +++ b/pdd/agentic_sync.py @@ -36,6 +36,7 @@ AsyncSyncRunner, _architecture_entry_aliases, _basename_from_architecture_filename, + _classify_baseline_path, _find_pdd_executable, _git_changed_paths, _git_ignored_paths, @@ -1864,17 +1865,31 @@ def _enforce_orchestrator_scope( # status surfaces this as ``D `` and the re-scan picks it up, but # UNTRACKED baselines leave no trail; without this collection the # orchestrator silently passes a sync-side deletion of user WIP. + # + # Iter-40 M-1 (unreadable vs missing): use + # :func:`_classify_baseline_path` to distinguish "file deleted" from + # "file exists but unreadable" (permission flip, locked file). The + # latter case must NOT be flagged as deleted — preserve by name to + # avoid the false-deletion diagnostic and prevent downstream revert + # helpers from attempting to remove a still-present path. 
baseline_deleted: set[str] = set() for rel_posix, baseline_hash in baseline_changed.items(): - current_hash = _hash_file(repo_root, rel_posix) - if current_hash is None: + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: # File was deleted after baseline — surface it via the # ``remaining`` set below regardless of whether it was tracked # or untracked (we can't distinguish from the snapshot, and # even the tracked-deletion case warrants a hard-fail). baseline_deleted.add(rel_posix) continue - if baseline_hash is None or current_hash == baseline_hash: + if status.sha is None: + # Iter-40 M-1: present but unreadable. Preserve by name — + # same conservative carve-out as the unreadable-at-snapshot + # branch below — so a permission-flaky baseline is not + # misreported as deleted. + allowed_files.add((repo_root / rel_posix).resolve()) + continue + if baseline_hash is None or status.sha == baseline_hash: # Unreadable at snapshot (preserve by name) or unchanged content # → preserve. allowed_files.add((repo_root / rel_posix).resolve()) @@ -1886,10 +1901,15 @@ def _enforce_orchestrator_scope( # baseline directly to catch the deletion. Present-but-changed # ignored baselines are already surfaced by # :func:`_orchestrator_remaining_out_of_scope_paths`'s ignored loop. + # + # Iter-40 M-1: same unreadable-vs-missing discrimination. for rel_posix, baseline_hash in baseline_ignored.items(): - current_hash = _hash_file(repo_root, rel_posix) - if current_hash is None: + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: baseline_deleted.add(rel_posix) + elif status.sha is None: + # Present but unreadable — preserve by name. 
+ allowed_files.add((repo_root / rel_posix).resolve()) tracked_reverted = _revert_out_of_scope_changes(repo_root, allowed_files) untracked_reverted = revert_out_of_scope_changes_with_dirs( diff --git a/pdd/agentic_sync_runner.py b/pdd/agentic_sync_runner.py index 961b0ba0e..3ac468990 100644 --- a/pdd/agentic_sync_runner.py +++ b/pdd/agentic_sync_runner.py @@ -271,6 +271,13 @@ def _hash_file(project_root: Path, rel_posix: str) -> Optional[str]: collision resistance. Returns ``None`` when the file cannot be read (missing, permission denied, etc.); callers MUST treat ``None`` as "no fingerprint available" and decide policy explicitly. + + Iter-40 M-1: callers that need to DISCRIMINATE between missing and + unreadable (the iter-34 deletion-detection paths in the scope guards) + must use :func:`_classify_baseline_path` instead — this helper collapses + both cases to ``None`` and is kept for the snapshot-time + re-scan + sites where the fall-through "preserve by name" / "surface as + out-of-scope" semantics are already correct. """ try: path = (project_root / rel_posix).resolve() @@ -281,6 +288,79 @@ def _hash_file(project_root: Path, rel_posix: str) -> Optional[str]: return hashlib.sha1(data).hexdigest() +class _BaselinePathStatus(NamedTuple): + """Result of re-classifying a baseline file at scope-guard time. + + Iter-40 M-1: the iter-34 deletion-detection branches in the per-module + and orchestrator scope guards previously collapsed "file missing" and + "file unreadable" to the same ``current_hash is None`` signal, then + treated both as deletions. A pre-existing baseline file that became + UNREADABLE mid-sync (permission flip, locked file) was falsely flagged + as deleted — and downstream revert helpers were asked to remove a + path that still exists on disk. + + Fields: + sha: SHA-1 hex digest when the file was successfully hashed, else + ``None``. + missing: True when the file no longer exists on disk (the iter-34 + deletion case), False otherwise. 
When False AND ``sha is None`` + the file exists but could not be read (permission, OSError). + """ + + sha: Optional[str] + missing: bool + + +def _classify_baseline_path( + project_root: Path, rel_posix: str +) -> _BaselinePathStatus: + """Discriminated re-hash for baseline preservation at enforcement time. + + Iter-40 M-1 fix for the deletion blind spot (iter-34) which previously + treated unreadable files as deleted. Returns: + + - ``_BaselinePathStatus(hex_sha, False)`` — file exists and was hashed + - ``_BaselinePathStatus(None, True)`` — file is gone (iter-34 deletion) + - ``_BaselinePathStatus(None, False)`` — file exists but unreadable + (permission flip / OSError) → callers SHOULD preserve by name to + avoid the false-deletion diagnostic + the downstream revert-helper + attempt to remove a still-present path. + + This helper is intentionally NOT a replacement for :func:`_hash_file`. + The snapshot-time callsite (:func:`_hash_baseline_paths`) and the + re-scan loops in :meth:`AsyncSyncRunner._remaining_out_of_scope_paths` + + :func:`_orchestrator_remaining_out_of_scope_paths` already do the + right thing on a collapsed ``None`` — they either record ``None`` as + "no fingerprint available" (snapshot) or fall through to surfacing + the path as out-of-scope (re-scan, where git already listed the file). + Only the iter-34 baseline-iteration sites in + :meth:`AsyncSyncRunner._enforce_scope_guard` and + :func:`pdd.agentic_sync._enforce_orchestrator_scope` need the + discriminated answer. + """ + try: + path = (project_root / rel_posix).resolve() + except OSError: + # Path resolution itself failed — treat as unreadable, preserve. + return _BaselinePathStatus(None, False) + # ``exists()`` is the primary "missing" probe. ``open()`` below still + # has a defensive ``FileNotFoundError`` catch in case the file is + # raced out from under us between the two syscalls. 
+ if not path.exists(): + return _BaselinePathStatus(None, True) + try: + with open(path, "rb") as handle: + data = handle.read() + except FileNotFoundError: + # Raced removal between exists() and open() — treat as missing. + return _BaselinePathStatus(None, True) + except OSError: + # PermissionError / locked file / generic IO — file is present + # but cannot be read. Distinct from missing. + return _BaselinePathStatus(None, False) + return _BaselinePathStatus(hashlib.sha1(data).hexdigest(), False) + + def _hash_baseline_paths( project_root: Path, paths: Iterable[str] ) -> Dict[str, Optional[str]]: @@ -2366,10 +2446,23 @@ def _enforce_scope_guard( # module would succeed with the WIP silently lost. Collect # the deletions here and union them into the diagnostic's # ``remaining`` set below. + # + # Iter-40 M-1 (unreadable vs missing): the previous code + # collapsed "file missing" and "file unreadable" (permission + # flip, locked file) into the same ``current_hash is None`` + # signal, then treated both as deletions. A pre-existing + # baseline file that became UNREADABLE mid-sync would be + # falsely flagged as deleted, the diagnostic would lie about + # the file being removed, and downstream revert helpers + # would attempt to remove a still-present path. The + # :func:`_classify_baseline_path` helper distinguishes the + # two; unreadable falls through to the legacy iter-6 B1 + # preserve-by-name carve-out (same as the unreadable-at-init + # branch), while genuinely-missing flows the iter-34 path. baseline_deleted: Set[str] = set() for rel_posix, baseline_hash in self._baseline_changed_paths.items(): - current_hash = _hash_file(repo_root, rel_posix) - if current_hash is None: + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: # Iter-34 M-3: baseline file is GONE. 
Surface it as # unrecovered regardless of whether it was tracked # or untracked at init — we can't distinguish the @@ -2379,6 +2472,15 @@ def _enforce_scope_guard( # removed by sync). baseline_deleted.add(rel_posix) continue + if status.sha is None: + # Iter-40 M-1: file exists but unreadable now + # (permission flip, locked file, transient OSError). + # Preserve by name — same conservative carve-out as + # the unreadable-at-init branch below — so a + # permission-flaky baseline path is not misreported + # as deleted. + allowed_files.add((repo_root / rel_posix).resolve()) + continue if baseline_hash is None: # Couldn't hash at init (the file was unreadable # then). Be conservative and preserve by name, the @@ -2386,7 +2488,7 @@ def _enforce_scope_guard( # on permission-flaky paths that pre-date the run. allowed_files.add((repo_root / rel_posix).resolve()) continue - if current_hash == baseline_hash: + if status.sha == baseline_hash: # Unchanged user WIP — preserve. allowed_files.add((repo_root / rel_posix).resolve()) # else: sync (or some other writer) clobbered the file. @@ -2399,10 +2501,18 @@ def _enforce_scope_guard( # gitignored baseline file (e.g. user-side ``cache.bin`` # erased by sync) leaves no trail in either scan. Iterate # the ignored baseline directly to catch the deletion. + # + # Iter-40 M-1: same unreadable-vs-missing discrimination — + # an unreadable ignored baseline must NOT be flagged as + # deleted. Preserve by name so the diagnostic does not + # falsely claim the file was removed. for rel_posix, baseline_hash in self._baseline_ignored_paths.items(): - current_hash = _hash_file(repo_root, rel_posix) - if current_hash is None: + status = _classify_baseline_path(repo_root, rel_posix) + if status.missing: baseline_deleted.add(rel_posix) + elif status.sha is None: + # File exists but unreadable — preserve by name. 
+ allowed_files.add((repo_root / rel_posix).resolve()) # Present-but-changed ignored baselines are already # surfaced by ``_remaining_out_of_scope_paths``'s # ignored loop (the hash comparison there falls through diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 807412e50..9bf551f64 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -136,6 +136,36 @@ def _delete_state(self) -> None: """Durable mode leaves local runner state untouched.""" def run(self) -> Tuple[bool, str, float]: + # Iter-40 M-2 (durable init ordering): iter-38 added a + # fail-closed abort to :meth:`AsyncSyncRunner.run` when the + # init-time baseline scan returned ``None``, but the durable + # subclass calls :meth:`_prepare_durable_branch` BEFORE + # delegating to ``super().run()`` — so a baseline-acquisition + # failure on the main checkout would leave durable side effects + # (worktree creation, branch checkout, remote pushes) in place + # before the fail-closed check ran. Hoist the check above + # ``_prepare_durable_branch`` so no durable infrastructure is + # touched when the baseline scan failed. + # + # The flag reflects the MAIN CHECKOUT's git scan (see iter-22: + # the durable runner intentionally inherits the main-checkout + # baseline via ``super().__init__(project_root=self.git_root)`` + # and only clears the *paths* afterward — the flag is + # preserved). That is precisely the scan we want to abort on: + # the main checkout is where the orchestrator scope guard + # operates. + if getattr(self, "_baseline_acquisition_failed", False): + return ( + False, + ( + "Scope guard fail-closed: could not snapshot working-tree " + "baseline at runner init (git scan failed). Aborting " + "before any write-capable work to prevent false-positive " + "reverts of pre-existing user files." 
+ ), + self.initial_cost, + ) + ok, message = self._prepare_durable_branch() if not ok: return False, message, self.initial_cost diff --git a/tests/test_agentic_sync_runner.py b/tests/test_agentic_sync_runner.py index 8386e0acc..411955384 100644 --- a/tests/test_agentic_sync_runner.py +++ b/tests/test_agentic_sync_runner.py @@ -3414,6 +3414,189 @@ def test_baseline_deletion_of_ignored_file_is_flagged( f"got: {diagnostic!r}" ) + # --------------------------------------------------------------------- + # Iter-40 M-1: unreadable vs missing baseline discrimination + # --------------------------------------------------------------------- + + def test_baseline_unreadable_file_is_not_misclassified_as_deleted( + self, tmp_path, monkeypatch + ): + """Iter-40 M-1: an unreadable (but still on disk) baseline file + MUST NOT be flagged as deleted. The iter-34 deletion-detection + branch previously collapsed "file gone" and "file unreadable" + (permission flip, locked file) to the same ``current_hash is None`` + signal and treated both as deletions. That falsely claimed the + file was removed AND asked downstream revert helpers to remove + a still-present path. 
The :func:`_classify_baseline_path` helper + distinguishes the two; the unreadable case falls through to the + legacy preserve-by-name carve-out.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + # Pre-existing UNTRACKED WIP — the same shape as iter-34's test. + userwip = tmp_path / "userwip.py" + userwip.write_text("wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + assert "userwip.py" in runner._baseline_changed_paths, ( + "iter-40: untracked WIP must be captured in baseline" + ) + assert runner._baseline_changed_paths["userwip.py"] is not None, ( + "iter-40: baseline SHA must be captured for readable WIP" + ) + + # Iter-40: simulate permission flip / locked file by injecting a + # ``PermissionError`` for the baseline path while leaving the file + # itself on disk. Patching the module's ``open`` attribute is + # more reliable than ``os.chmod(path, 0)`` — macOS root, + # filesystem ACLs, and Spotlight can all defeat the latter. The + # module's function bodies use the unqualified name ``open``, + # which Python's LEGB lookup resolves via the module namespace + # before falling back to the builtin, so a ``setattr(mod, "open", + # ...)`` does intercept the call. 
+ import builtins + wip_resolved = userwip.resolve() + builtin_open = builtins.open + + def _open_with_block(path, *args, **kwargs): + try: + resolved = Path(path).resolve() + except (OSError, TypeError): + return builtin_open(path, *args, **kwargs) + if resolved == wip_resolved: + raise PermissionError("simulated permission flip") + return builtin_open(path, *args, **kwargs) + + monkeypatch.setattr(mod, "open", _open_with_block, raising=False) + + captured_allowed: Dict[str, set] = {} + + def fake_revert(_root, allowed_files): + captured_allowed["files"] = set(allowed_files) + return [] + + monkeypatch.setattr(mod, "_revert_out_of_scope_changes", fake_revert) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + # The file is still on disk, so it cannot have been "deleted": + # the diagnostic must NOT flag it as out-of-scope. + assert userwip.exists(), ( + "iter-40 precondition: the unreadable file must still exist" + ) + assert diagnostic is None or "userwip.py" not in diagnostic, ( + "iter-40: unreadable baseline file must be preserved by name " + "(not flagged as deleted). " + f"diagnostic={diagnostic!r}" + ) + # The path must end up in the allowed set so downstream revert + # helpers do not try to remove it. + assert wip_resolved in captured_allowed["files"], ( + "iter-40: unreadable baseline path must be auto-allowed " + "(preserve by name) so revert helpers do not remove it; " + f"got allowed_files={captured_allowed['files']}" + ) + + def test_baseline_missing_file_still_flagged_as_deleted( + self, tmp_path, monkeypatch + ): + """Iter-40 M-1 regression for iter-34: actual deletion of a + pre-existing untracked baseline file MUST still hard-fail the + module. 
The iter-40 discrimination helper preserves the iter-34 + behaviour for the genuinely-missing case.""" + from pdd import agentic_sync_runner as mod + + subprocess.run( + ["git", "init", "-b", "main", str(tmp_path)], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.email", "t@t.invalid"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "config", "user.name", "T"], + check=True, capture_output=True, + ) + (tmp_path / "README.md").write_text("initial") + subprocess.run( + ["git", "-C", str(tmp_path), "add", "README.md"], + check=True, capture_output=True, + ) + subprocess.run( + ["git", "-C", str(tmp_path), "commit", "-m", "init"], + check=True, capture_output=True, + ) + + userwip = tmp_path / "userwip.py" + userwip.write_text("wip") + + monkeypatch.chdir(tmp_path) + runner = self._make_runner( + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + runner.project_root = tmp_path.resolve() + + # GENUINE deletion — iter-34 path must still fire. 
+ userwip.unlink() + + monkeypatch.setattr( + mod, "_revert_out_of_scope_changes", lambda _root, _allowed: [] + ) + monkeypatch.setattr( + mod, "revert_out_of_scope_changes_with_dirs", + lambda _root, allowed_dirs, allowed_files: [], + ) + monkeypatch.setattr( + runner, "_resolve_repo_root", lambda _cwd: tmp_path.resolve() + ) + + diagnostic = runner._enforce_scope_guard("mod", tmp_path) + + assert diagnostic is not None, ( + "iter-40: genuinely-deleted untracked baseline must still " + "hard-fail the module — iter-34 regression" + ) + assert "userwip.py" in diagnostic, ( + f"iter-40: deleted baseline path must still appear in diagnostic, " + f"got: {diagnostic!r}" + ) + def test_wildcard_only_companion_pattern_does_not_auto_allow( self, tmp_path, monkeypatch ): diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 4c0cb2952..29b5332bf 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -926,3 +926,71 @@ def test_durable_scope_guard_does_not_whitelist_main_checkout_dirty_files( # fix the baseline is empty, so ``out.py`` is correctly out of scope. assert diagnostic is not None assert "out.py" in diagnostic + + +def test_durable_runner_aborts_before_worktree_setup_when_baseline_failed( + tmp_path: Path, monkeypatch: pytest.MonkeyPatch +): + """Iter-40 M-2 (durable init ordering): when the inherited + ``AsyncSyncRunner.__init__`` records ``_baseline_acquisition_failed=True`` + (the iter-38 fail-closed signal), the durable runner's + :meth:`~DurableSyncRunner.run` MUST abort BEFORE + :meth:`_prepare_durable_branch` runs. 
The iter-38 fix was added in + ``AsyncSyncRunner.run()``, but ``DurableSyncRunner.run()`` calls + ``_prepare_durable_branch()`` first — without this iter-40 hoist a + transient git scan failure would leave durable side effects + (worktree creation, branch checkout, remote pushes) in place before + the inherited fail-closed check ran.""" + from pdd import agentic_sync_runner as mod + + durable_root = _init_repo_with_remote(tmp_path) + + # Patch the baseline scan to fail. ``DurableSyncRunner.__init__`` + # forwards to ``AsyncSyncRunner.__init__`` which reads the scan and + # records ``_baseline_acquisition_failed`` on ``None``. + monkeypatch.setattr(mod, "_git_changed_paths", lambda _root: None) + monkeypatch.setattr(mod, "_git_ignored_paths", lambda _root: set()) + + monkeypatch.chdir(durable_root) + + runner = _runner( + durable_root, + runner_cls=EmptyDurableRunner, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + # The inherited init must have flagged the baseline acquisition as + # failed. (Iter-22 clears the baseline *paths* to {} but does NOT + # clear this flag — exactly so the iter-40 hoist can see it.) + assert runner._baseline_acquisition_failed is True, ( + "iter-40: inherited fail-closed flag must reach the durable runner" + ) + + prepare_calls: list[bool] = [] + + def fake_prepare() -> tuple[bool, str]: + prepare_calls.append(True) + return True, "" + + monkeypatch.setattr(runner, "_prepare_durable_branch", fake_prepare) + + success, message, cost = runner.run() + + assert success is False, ( + "iter-40: durable runner must fail-closed when baseline scan failed" + ) + assert prepare_calls == [], ( + "iter-40: _prepare_durable_branch MUST NOT run when baseline " + "acquisition failed — otherwise worktree creation and branch " + f"checkout happen before the abort. 
Calls: {prepare_calls}" + ) + assert "fail-closed" in message, ( + f"iter-40: abort message must mention fail-closed; got: {message!r}" + ) + assert "baseline" in message, ( + f"iter-40: abort message must mention baseline; got: {message!r}" + ) + # The abort path returns ``self.initial_cost`` (0.0 for the default + # runner) — mirrors the inherited AsyncSyncRunner.run abort. + assert cost == 0.0 From 4a5814c406631207cc3cba5585132a1e0db88050 Mon Sep 17 00:00:00 2001 From: Serhan Date: Fri, 15 May 2026 16:34:35 -0700 Subject: [PATCH 42/42] fix(sync): iter-42 mirror PDD_INTERNAL_PATH_ALLOWLIST into durable checkpoint validation MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Codex iter-41 — same "async fix needs durable mirror" pattern as iter-14→iter-16 and iter-22 baseline-clear. iter-36 added PDD_INTERNAL_PATH_ALLOWLIST (`.pdd/agentic-logs/*`, `.pdd/agentic_sync_state.json`, etc.) to the async per-module scope guard so PDD's own infrastructure writes don't trip the guard. The durable runner's _out_of_scope_staged_paths and _unsafe_staged_paths didn't honor that allowlist — contracted durable sync hard-failed at checkpoint on PDD's own audit logs and state file. Import the allowlist in durable_sync_runner. In both _out_of_scope_staged_paths and _unsafe_staged_paths, check internal-allowlist patterns BEFORE the contract/unsafe rules and treat matches as in-scope. Patterns are repo-root-anchored so the check runs against `normalized` directly (no module_cwd prefix stripping needed). Anchored matcher enforces equal segment count so a nested `packages/app/.pdd/agentic_sync_state.json` does NOT match the root `.pdd/agentic_sync_state.json` pattern — existing nested-rejection test continues to pass unchanged. 
Confirmed via grep that durable runner only force-adds `.pdd/meta/_*.json` via _force_add_module_metadata; under normal `git add -A` the gitignored `.pdd/` tree is skipped, so the validation-skip approach is sufficient (no separate staging exclusion needed). Tests: audit-log not flagged, state-file not flagged, unrelated .pdd/random/junk.txt still flagged, unsafe-rules skip internal allowlist. One pre-existing test_unsafe_staged_paths_rejects_sensitive_artifacts updated to reflect the new behavior. 738 passed. Co-Authored-By: Claude Opus 4.7 --- pdd/durable_sync_runner.py | 41 +++++++++++ tests/test_durable_sync_runner.py | 117 +++++++++++++++++++++++++++++- 2 files changed, 157 insertions(+), 1 deletion(-) diff --git a/pdd/durable_sync_runner.py b/pdd/durable_sync_runner.py index 9bf551f64..efbdb662e 100644 --- a/pdd/durable_sync_runner.py +++ b/pdd/durable_sync_runner.py @@ -21,6 +21,7 @@ from typing import Dict, List, Optional, Set, Tuple from .agentic_common import ( + PDD_INTERNAL_PATH_ALLOWLIST, _is_valid_companion_pattern, _matches_companion_pattern_anchored, ) @@ -528,6 +529,26 @@ def _out_of_scope_staged_paths( # unchanged. if normalized in self.allowed_write_paths: continue + # Issue #1013 iter-42 M-1 (durable PDD_INTERNAL parity): PDD's + # own infrastructure writes (audit logs, runner state, etc.) + # are NEVER part of a contract — they're internal artifacts + # the tool produces as side effects of running. Mirror the + # async per-module guard (iter-36 B-1/B-2 at + # ``agentic_sync_runner.py`` line ~2376/~2408) so the durable + # checkpoint-staging validation honors the same allowlist; + # otherwise contracted durable runs hard-fail on PDD's own + # audit logs / state file. The internal allowlist patterns + # are REPO-ROOT-anchored (the writes happen at the top of + # the project regardless of module_cwd), so this check runs + # BEFORE the module_cwd prefix stripping below and matches + # against ``normalized`` (the raw repo-relative form). 
+ internal_matched = False + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored(normalized, pattern): + internal_matched = True + break + if internal_matched: + continue # F3 (Issue #1013): companion glob matching uses anchored, # segment-aware semantics so ``.pdd/meta/*.json`` does NOT # match nested paths like ``.pdd/meta/nested/foo.json`` or @@ -593,6 +614,26 @@ def _unsafe_staged_paths(self, basename: str, paths: List[str]) -> List[str]: for path in paths: normalized = path.replace(os.sep, "/") lower = normalized.lower() + # Issue #1013 iter-42 M-1 (durable PDD_INTERNAL parity): PDD's + # own infrastructure writes (e.g. ``.pdd/agentic-logs/*``, + # ``.pdd/agentic_sync_state.json``) match neither a contract's + # allowed_write_set nor the user-facing companion allowlist; + # they're tool internals. The unsafe-path rules below would + # otherwise classify ``.pdd/agentic-logs/foo.jsonl`` as + # unsafe via the ``_pdd_path_index`` branch (because it sits + # under ``.pdd/`` but is NOT a recognized meta artifact). The + # async per-module guard already exempts these patterns; + # mirror it here so a contracted durable run does not + # hard-fail at checkpoint on its own audit logs / state file. + # Patterns are REPO-ROOT-anchored — match the raw normalized + # path without stripping any module prefix. 
+ internal_matched = False + for pattern in PDD_INTERNAL_PATH_ALLOWLIST: + if _matches_companion_pattern_anchored(normalized, pattern): + internal_matched = True + break + if internal_matched: + continue pdd_index = _pdd_path_index(normalized) if pdd_index is not None: matching_meta_prefix = next( diff --git a/tests/test_durable_sync_runner.py b/tests/test_durable_sync_runner.py index 29b5332bf..5509d8f03 100644 --- a/tests/test_durable_sync_runner.py +++ b/tests/test_durable_sync_runner.py @@ -298,6 +298,13 @@ def test_metadata_allowlist_rejects_nested_pdd_state_and_wrong_meta_scope(tmp_pa def test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): + """Iter-42 M-1: PDD's own infrastructure writes (``.pdd/agentic-logs/*``, + ``.pdd/agentic_sync_state.json``) match the internal allowlist and must + be treated as safe at checkpoint validation time — mirrors the async + per-module guard. Paths that sit under ``.pdd/`` but are NOT in the + internal allowlist (e.g. ``.pdd/worktrees/...``, ``.pdd/cache/...``) + remain unsafe. + """ repo = _init_repo_with_remote(tmp_path) runner = _runner(repo) @@ -312,12 +319,14 @@ def test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): "config/token.txt", "config/secrets/api.txt", ".pdd/worktrees/sync-issue-1328-foo", - ".pdd/agentic_sync_state.json", ".pdd/cache/unrelated.json", ] safe_paths = [ "src/app.py", ".pdd/meta/foo_python.json", + # Internal allowlist: tool infrastructure, never user-contracted. + ".pdd/agentic_sync_state.json", + ".pdd/agentic-logs/session_test.jsonl", ] result = runner._unsafe_staged_paths("foo", [*unsafe_paths, *safe_paths]) @@ -325,6 +334,112 @@ def test_unsafe_staged_paths_rejects_sensitive_artifacts(tmp_path: Path): assert result == sorted(unsafe_paths) +def test_durable_does_not_flag_pdd_audit_logs_at_checkpoint(tmp_path: Path): + """Iter-42 M-1: PDD's own audit logs under ``.pdd/agentic-logs/`` are + tool-infrastructure side effects of running, never user-contracted. 
+ The durable checkpoint-staging validation must mirror the async + per-module guard (iter-36 B-1/B-2) and skip ``PDD_INTERNAL_PATH_ALLOWLIST`` + matches; otherwise contracted durable runs hard-fail at checkpoint on + PDD's own audit logs. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pdd/foo.py", ".pdd/agentic-logs/session_test.jsonl"], + "foo", + repo, + ) + assert ".pdd/agentic-logs/session_test.jsonl" not in result, ( + "PDD audit logs must NOT be flagged as out-of-contract by the " + "durable checkpoint validation (iter-42 M-1)" + ) + assert result == [] + + +def test_durable_does_not_flag_pdd_state_file_at_checkpoint(tmp_path: Path): + """Iter-42 M-1: ``.pdd/agentic_sync_state.json`` is the runner state + file — internal PDD infrastructure, NOT a contract artifact. The + durable checkpoint validation must auto-allow it via the internal + allowlist (mirrors async per-module guard). + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pdd/foo.py", ".pdd/agentic_sync_state.json"], + "foo", + repo, + ) + assert ".pdd/agentic_sync_state.json" not in result, ( + "PDD runner state file must NOT be flagged as out-of-contract " + "by the durable checkpoint validation (iter-42 M-1)" + ) + assert result == [] + + +def test_durable_still_flags_unrelated_pdd_artifacts(tmp_path: Path): + """Iter-42 M-1 (negative): the internal allowlist must NOT widen into + a generic ``.pdd/**`` bypass. A path like ``.pdd/random/junk.txt`` + that does NOT match any ``PDD_INTERNAL_PATH_ALLOWLIST`` pattern must + still be flagged as out-of-contract. 
+ """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner( + repo, + allowed_write_set=["pdd/foo.py"], + companion_allowlist=[".pdd/meta/*.json"], + ) + + result = runner._out_of_scope_staged_paths( + ["pdd/foo.py", ".pdd/random/junk.txt"], + "foo", + repo, + ) + assert result == [".pdd/random/junk.txt"], ( + "unrelated .pdd artifacts must still be flagged as out-of-" + "contract — the internal allowlist is fixed, not a generic " + ".pdd/** bypass (iter-42 M-1 negative)" + ) + + +def test_durable_unsafe_skips_pdd_internal_allowlist(tmp_path: Path): + """Iter-42 M-1 (unsafe parity): the per-path unsafe-classification + rules in ``_unsafe_staged_paths`` would otherwise reject + ``.pdd/agentic-logs/foo.jsonl`` via the ``_pdd_path_index`` branch + (under ``.pdd/`` but NOT a recognized meta artifact). Internal + allowlist patterns must take precedence so PDD's own infrastructure + writes are not classified as unsafe at checkpoint time. + """ + repo = _init_repo_with_remote(tmp_path) + runner = _runner(repo) + + result = runner._unsafe_staged_paths( + "foo", + [ + ".pdd/agentic-logs/session_test.jsonl", + ".pdd/agentic_sync_state.json", + ".pdd/bug-state/foo.json", + # Negative control: not in internal allowlist, must still + # land in unsafe via _pdd_path_index branch. + ".pdd/random/junk.txt", + ], + ) + assert result == [".pdd/random/junk.txt"], ( + "internal allowlist matches must be skipped before unsafe-" + "classification rules run; only paths NOT in the allowlist " + "should surface as unsafe (iter-42 M-1)" + ) + + def test_allowed_write_set_rejects_out_of_scope_checkpoint_paths(tmp_path: Path): """ Issue #1013 (F5, F13, F14): kwarg is now ``allowed_write_set`` (the