Skip to content

fix(calm-suite): validate remediated CALM before opening a CALMGuard PR#2693

Draft
eddie-knight wants to merge 1 commit into
fix/calmguard-canonical-relationship-typefrom
fix/calmguard-validate-pr-writeback
Draft

fix(calm-suite): validate remediated CALM before opening a CALMGuard PR#2693
eddie-knight wants to merge 1 commit into
fix/calmguard-canonical-relationship-typefrom
fix/calmguard-validate-pr-writeback

Conversation

@eddie-knight

Copy link
Copy Markdown
Contributor

Description

The journey this fixes: You run a CALMGuard analysis, then click "Create remediation PR" to have CALMGuard open a pull request that applies its compliance fixes to your CALM document. Today, CALMGuard commits and opens that PR without validating the remediated document first — so if the remediation produces a schema-invalid CALM file, an invalid document lands in a PR against your repo, and downstream tools (the CLI validator, the Hub, anything consuming CALM) choke on it.

The most realistic way this happens: applyChangesToCalm writes an LLM-supplied control key (change.after) directly as a control-object key, with no check against the CALM schema's key pattern ^[a-zA-Z0-9-]+$. An LLM emitting something like "PCI DSS Req 8.4.2" as a key yields a schema-invalid document — and nothing was stopping it from being committed.

After this PR, the remediated document is validated before any GitHub write:

  • ✅ The remediated CALM is checked with the canonical CALM CLI (the same validator the /api/calm/validate route uses) right after the merge and before the branch/commit/PR.
  • ✅ On failure the run aborts cleanly — no branch, commit, PR, or label is created (fail-closed) — and the user sees why in the streamed error.
  • ✅ A validator that fails to run is distinguished from a genuinely invalid document (clear "could not validate" message), both fail-closed.

What changed

  • src/app/api/github/create-pr/route.ts (remediation branch): after applyChangesToCalm produces the remediated CALM, run validateWithCalmCli; on {valid:false} or a validator throw, throw → the route's existing SSE error/close path aborts before the first GitHub mutation (createBranch).
  • Pin the route to the Node runtime (export const runtime = 'nodejs') — the validator shells out via child_process, which is unavailable on the Edge runtime. Matches /api/calm/validate.
  • Added a route-level test (src/__tests__/api/create-pr.test.ts): an invalid result never reaches githubFetch (no commit); a valid result proceeds past the gate.

Scope notes

  • Remediation write-backs only. The pipeline/infra write-backs commit YAML/IaC, not CALM, so they're untouched.
  • This gates the committed output. The input document is Zod-checked at ingest (not CLI-checked); validating the artifact that actually lands in the PR is the meaningful guarantee.
  • This is defense-in-depth against bad LLM control keys and any future regression in the merge — the merge only ever adds (never drops), so it can't produce dangling refs / dropped required fields.

ℹ️ Components note: the change is in CALMGuard (calm-suite/calm-guard/), which has no checkbox in the template's "Affected Components", so none are ticked.

🔗 Stacked PR: targets fix/calmguard-canonical-relationship-type (the CALMGuard nested-relationship fix) rather than main — it depends on that PR so the remediated output is canonical and actually passes validation. Re-point to main once the parent merges. (Sibling to the Studio doc-keys PR, which also branches off it.)

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature
  • 💥 Breaking change
  • 📚 Documentation update
  • 🎨 Code style/formatting changes
  • ♻️ Refactoring
  • ⚡ Performance improvements
  • ✅ Test additions or updates
  • 🔧 Chore

Affected Components

(All unchecked — surface is CALMGuard (calm-suite/calm-guard/), which has no template checkbox.)

Commit Message Format ✅

Conventional Commits, DCO signed-off:

  • fix(calm-suite): validate remediated CALM before opening a CALMGuard PR

Testing

  • I have tested my changes locally
  • I have added/updated unit tests
  • All existing tests pass

Verified on Node 22: npm run typecheck --workspace=calmguard (clean), npm run lint --workspace=calmguard (clean), npm run test:run --workspace=calmguard (119 pass, incl. 2 new), and npm run build --workspace=calmguard. The new route-level test mocks the CLI validator, the LLM remediation agent, and the GitHub client, and asserts that an invalid validation result aborts the stream and never calls githubFetch (no GitHub write), while a valid result proceeds past the gate.

Checklist

  • Conventional commit format
  • Documentation updated if necessary
  • Tests added
  • Follows coding standards

CALMGuard's remediation flow opened a GitHub PR with the remediated CALM
document without validating it first — so a malformed change (most realistically
an LLM-supplied control key that violates the schema's ^[a-zA-Z0-9-]+$ pattern),
or any future regression in applyChangesToCalm, could be committed and PR'd as a
schema-invalid document, undermining cross-tool interop.

- Validate the remediated output via the existing validateWithCalmCli (the same
  canonical CALM CLI check the /api/calm/validate route uses) right after
  applyChangesToCalm and BEFORE any GitHub mutation. On failure the stream emits
  an error and aborts — no branch, commit, PR, or label is created (fail-closed).
- Pin the route to the Node runtime (export const runtime = 'nodejs'); the
  validator shells out via child_process, which is unavailable on Edge. Matches
  /api/calm/validate.
- Distinguish a genuine schema-invalid document from a validator that failed to
  run (clear "could not validate" message), both fail-closed.

Scoped to remediation write-backs (pipeline/infra commit YAML/IaC, not CALM).
The input document is Zod-checked at ingest; this gates the committed output.

Adds a route-level test: an invalid result never reaches githubFetch (no
commit), a valid result proceeds past the gate.

Signed-off-by: Eddie Knight <knight@linux.com>
@eddie-knight

Copy link
Copy Markdown
Contributor Author

Stacked on #2692, can be taken out of draft when that's merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant