From b50432229a64a9b764d6fd228766963b2b930a2a Mon Sep 17 00:00:00 2001
From: Joe Eftekhari
Date: Fri, 17 Apr 2026 20:22:24 -1000
Subject: [PATCH] Fork-path validation: fix deploy workflow bug, add CONTRIBUTING, patch README
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Third open-source-readiness blocker. Dry-ran the fork path from a fresh
clone, surfaced a real deploy-workflow bug, patched it, then updated
docs so the intended flow is actually the documented flow.

## What was broken

The ORG_NAME sed in .github/workflows/deploy.yml uncommented
`ORG_NAME = "..."` but left `[vars]` commented. Post-sed:

    # [vars]
    ORG_NAME = "Acme"

ORG_NAME ended up as a top-level TOML key instead of inside [vars], so
`c.env.ORG_NAME` in the worker was always undefined. Nobody noticed
because the upstream dashboard ran without ORG_NAME set most of the
time, and the header subtitle silently fell back to an empty string.

Any forker who set `ORG_NAME` as a repo variable expecting it to show
up in the dashboard header would have gotten zero signal that the sed
was broken.

## Fix

- **Drop the ORG_NAME sed entirely.** Pass ORG_NAME (and now
  GRC_AUDIENCE, for fork-specific OIDC audience scoping) via
  `wrangler deploy --var KEY:VALUE`. `--var` binds cleanly regardless
  of the TOML structure and is a no-op when the repo var is unset.
- **Preflight step** that fails fast with a clear `::error::` message
  when `CLOUDFLARE_API_TOKEN` or `CLOUDFLARE_KV_ID` is missing. Prior
  behaviour was a cryptic wrangler auth failure or a literal
  `YOUR_KV_NAMESPACE_ID` being sent to the CF API; neither pointed
  back at the forker's repo settings.
- Kept the KV id sed: that's a narrow string substitution that works
  correctly (the placeholder is structurally distinct and the file
  stays valid).

## Docs

- **README Setup rewrite.** Split into two clearly-labelled paths:
  "Auto-deploy (production)" step-by-step, and "Local-only
  (development)". Previously the two were intermixed, which made the
  fork path ambiguous. Documents exactly which secrets and vars to set
  and where they live in the Repo Settings UI.
- **GRC_AUDIENCE** now listed as an optional repo variable. Both the
  deploy workflow and the README describe the fork-scoping pattern
  consistently.
- **CONTRIBUTING.md** (new): dev loop for scanner and dashboard,
  .dev.vars instructions for local OIDC bypass, and three
  extension-point walkthroughs (add a scan rule, add a policy
  template, add a framework). Idempotency rules for templates spelled
  out so future contributors don't accidentally produce noisy commits
  on every scan. Linked from the README.
- **Implementation checklist** items closed: fork-path validation,
  contributing guide, how-to-add-scan-rules,
  how-to-add-policy-templates.

## Side cleanup

`npm audit fix` bumped `hono` past the moderate HTML-injection
advisory (GHSA-458j-xx4x-4375). Backwards-compatible lockfile change
only, no source edits required.

## Verified

- Fresh clone → `npm install` (3s, 127 packages, no errors).
- `wrangler dev --local` boots against the committed placeholder KV id
  and serves `/health` (200) + `/` (empty-state 200).
- `npm run scan -- .` and `npx tsx scripts/smoke-dashboard.ts` both
  pass after the hono bump, with no behaviour regression.

## Not tested here

The live CF deploy path (needs real secrets in a real fork). The
workflow changes are small enough to review by inspection; first fork
that sets the secrets will exercise them end-to-end.
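
To make the ORG_NAME failure from "What was broken" concrete, here is a
minimal TypeScript sketch. It assumes a deliberately tiny, hypothetical
reader (`parseToyToml`, which is not part of this repo or of wrangler)
that understands only `#` comments, `[table]` headers, and
`KEY = "string"` pairs. It is an illustration of the key-placement
rule, not a real TOML parser and not how wrangler loads its config.

```typescript
// Toy reader for illustration only: handles `#` comments, `[table]`
// headers, and `KEY = "string"` pairs. Hypothetical helper, not a real
// TOML implementation and not part of wrangler.
type Table = Record<string, string>;

interface ParsedToml {
  top: Table; // keys seen before any [table] header
  tables: Record<string, Table>; // keys grouped under their [table] header
}

function parseToyToml(source: string): ParsedToml {
  const parsed: ParsedToml = { top: {}, tables: {} };
  let current: Table = parsed.top;

  for (const raw of source.split("\n")) {
    const line = raw.trim();
    // A commented-out header such as `# [vars]` is skipped like any comment,
    // so it never switches the current table.
    if (line === "" || line.startsWith("#")) continue;

    const header = /^\[([A-Za-z0-9_.-]+)\]$/.exec(line);
    if (header) {
      const table: Table = {};
      parsed.tables[header[1] as string] = table;
      current = table;
      continue;
    }

    const pair = /^([A-Za-z0-9_-]+)\s*=\s*"(.*)"$/.exec(line);
    if (pair) current[pair[1] as string] = pair[2] as string;
  }
  return parsed;
}

// What the old sed left behind: header still commented, key uncommented.
const afterOldSed = parseToyToml('# [vars]\nORG_NAME = "Acme"');
// What the sed was meant to produce.
const intended = parseToyToml('[vars]\nORG_NAME = "Acme"');

// After the old sed, ORG_NAME sits in `top` and there is no `vars` table.
console.log(afterOldSed.top["ORG_NAME"], afterOldSed.tables["vars"]);
// With the header uncommented, ORG_NAME sits inside the `vars` table.
console.log(intended.top["ORG_NAME"], intended.tables["vars"]);
```

In this sketch the only difference between the two inputs is the
leading `# ` on the header line, which is why the old sed looked
correct on a quick read. Passing the value with `--var` avoids
depending on the file's table structure at all.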
---
 .github/workflows/deploy.yml     |  41 +++++-
 CONTRIBUTING.md                  | 222 +++++++++++++++++++++++++++++++
 README.md                        |  60 ++++++---
 docs/implementation-checklist.md |   8 +-
 package-lock.json                |   6 +-
 5 files changed, 307 insertions(+), 30 deletions(-)
 create mode 100644 CONTRIBUTING.md

diff --git a/.github/workflows/deploy.yml b/.github/workflows/deploy.yml
index c69b000..5ddb291 100644
--- a/.github/workflows/deploy.yml
+++ b/.github/workflows/deploy.yml
@@ -18,12 +18,45 @@ jobs:
       - name: Install dependencies
         run: npm ci
 
-      - name: Configure wrangler.toml
+      # Preflight: fail fast with a clear message if required secrets are
+      # missing. Without this, wrangler would either error on the literal
+      # placeholder "YOUR_KV_NAMESPACE_ID" or fail to authenticate, and
+      # forkers would get an opaque API response instead of a pointer back
+      # to their repo settings.
+      - name: Preflight secrets
         run: |
-          sed -i "s/YOUR_KV_NAMESPACE_ID/${{ secrets.CLOUDFLARE_KV_ID }}/" wrangler.toml
-          sed -i 's/# ORG_NAME = "your-org"/ORG_NAME = "${{ vars.ORG_NAME }}"/' wrangler.toml
+          if [ -z "${{ secrets.CLOUDFLARE_API_TOKEN }}" ]; then
+            echo "::error::CLOUDFLARE_API_TOKEN secret is not set. Create one at https://dash.cloudflare.com/profile/api-tokens and add it as a repo secret."
+            exit 1
+          fi
+          if [ -z "${{ secrets.CLOUDFLARE_KV_ID }}" ]; then
+            echo "::error::CLOUDFLARE_KV_ID secret is not set. Create a KV namespace with 'npx wrangler kv namespace create GRC_KV' and add its id as a repo secret."
+            exit 1
+          fi
 
+      # KV namespace id has to live in wrangler.toml — no CLI flag for it —
+      # so we substitute the placeholder at deploy time. The placeholder
+      # stays in the committed file so local `wrangler dev --local` still
+      # works against miniflare without any secret rewiring.
+      - name: Inject KV namespace id into wrangler.toml
+        run: sed -i "s/YOUR_KV_NAMESPACE_ID/${{ secrets.CLOUDFLARE_KV_ID }}/" wrangler.toml
+
+      # ORG_NAME and GRC_AUDIENCE are passed via `--var` rather than a TOML
+      # rewrite. The prior approach uncommented `ORG_NAME = "..."` but left
+      # `[vars]` commented, leaving ORG_NAME at the top level where the
+      # worker never saw it. `--var` binds cleanly regardless of TOML
+      # structure and is a no-op when the repo variable is unset.
       - name: Deploy to Cloudflare Workers
-        run: npx wrangler deploy
         env:
           CLOUDFLARE_API_TOKEN: ${{ secrets.CLOUDFLARE_API_TOKEN }}
+          ORG_NAME_VAR: ${{ vars.ORG_NAME }}
+          GRC_AUDIENCE_VAR: ${{ vars.GRC_AUDIENCE }}
+        run: |
+          ARGS=()
+          if [ -n "$ORG_NAME_VAR" ]; then
+            ARGS+=(--var "ORG_NAME:$ORG_NAME_VAR")
+          fi
+          if [ -n "$GRC_AUDIENCE_VAR" ]; then
+            ARGS+=(--var "GRC_AUDIENCE:$GRC_AUDIENCE_VAR")
+          fi
+          npx wrangler deploy "${ARGS[@]}"
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000..efe06c6
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,222 @@
+# Contributing
+
+Thanks for looking at the GRC Observability Dashboard. This guide covers the dev loop and the three most common extension points: adding a scan rule, adding a policy template, and adding a framework.
+
+If you're new to the codebase, start with `CLAUDE.md` for a one-page architecture overview, then `docs/implementation-checklist.md` for what's shipped and what's next.
+
+---
+
+## Dev loop
+
+```bash
+# Clone and install
+git clone https://github.com/YOUR_FORK/GRC-Observability-Dashboard.git
+cd GRC-Observability-Dashboard
+npm install
+```
+
+### Running the scanner
+
+The scanner is a Node CLI that scans a repo (defaults to the current directory) and writes a manifest + reports to `.grc/`.
+
+```bash
+# Scan this repo
+npm run scan -- .
+
+# Scan another repo
+npm run scan -- /path/to/some/repo --url=https://that-site.com
+```
+
+After a scan, look at:
+
+- `.grc/manifest.yml` — the canonical output. What the dashboard stores.
+- `.grc/nist-csf-report.md` / `.grc/ai-compliance-report.md` / `.grc/risk-assessment.md` — human-readable reports.
+- `docs/policies/*.md` + `.well-known/security.txt` — generated policies committed to the consuming repo.
+
+### Running the dashboard locally
+
+```bash
+# Boot miniflare-backed wrangler with an in-memory KV
+npx wrangler dev --local
+
+# Open http://localhost:8787
+```
+
+For local iteration you'll usually want to skip OIDC verification on manifest POSTs. Create `.dev.vars` (gitignored) at the repo root:
+
+```
+GRC_AUTH_BYPASS=1
+```
+
+Then POST a manifest from another scan into the local dashboard:
+
+```bash
+curl -X POST -H "Content-Type: text/yaml" \
+  --data-binary @.grc/manifest.yml \
+  "http://localhost:8787/api/report?site_url=https://example.com"
+```
+
+### CI
+
+Every PR runs `.github/workflows/ci.yml`:
+
+1. `npm ci`
+2. Scanner smoke: `npm run scan -- .` against this repo.
+3. Dashboard smoke: `npx tsx scripts/smoke-dashboard.ts` — exercises every render function with two manifest fixtures (new-shape and pre-Phase-8-shape).
+
+If you're making a render or summarize change, extend `scripts/smoke-dashboard.ts` to cover the new path.
+
+---
+
+## Adding a scan rule
+
+A scan rule is a function that inspects the repo tree (or the scan context) and returns structured findings for the manifest.
+
+### 1. Write the rule
+
+Rules live in `scanner/rules/*.ts`. Each exports a single async function taking a `ScanContext` and returning whatever shape the manifest expects for that finding.
+
+```ts
+// scanner/rules/my-rule.ts
+import type { ScanContext } from "../types.js";
+import { walkFiles, readFileContent } from "../utils.js";
+
+export async function scanMyThing(ctx: ScanContext): Promise {
+  const files = await walkFiles(ctx.repoPath);
+  // ...inspect files, return findings
+  return { detected: false, findings: [] };
+}
+```
+
+Use `walkFiles` / `readFileContent` / `fileExists` from `scanner/utils.ts` for I/O. They honor `SKIP_DIRS` so you don't accidentally walk `node_modules/` or `.grc/`.
+
+### 2. Add the finding type to the manifest
+
+`scanner/types.ts` defines the manifest schema. Add an interface for your finding and a field on `Manifest`. Keep the field optional if you want older scanners' output to still parse (the 2026-04-18 outage was a failure to do this).
+
+### 3. Wire the rule into the scan pipeline
+
+`scanner/index.ts` runs most rules in parallel inside `Promise.all`. Add your rule alongside the others, then include its result in the manifest.
+
+### 4. Cover it in smoke tests
+
+Add a minimal fixture to `scripts/smoke-dashboard.ts` so any future regression in the rule's output shape fails CI.
+
+### 5. Surface it on the dashboard (optional)
+
+If the finding deserves a UI row, extend `dashboard/views/render.ts` — usually `renderRepoDetail` or a new tab.
+
+---
+
+## Adding a policy template
+
+Policies are Handlebars templates that render to markdown files and commit to the consuming repo's `docs/policies/` (or wherever `output_dir` points).
+
+### 1. Add the template
+
+Create `scanner/templates/my-policy.hbs`. Use helpers already registered in `scanner/render.ts`: `eq`, `hasGdpr`, `hasCcpa`, `joinFields`, `nextSection`. Add new helpers to the same file if you need them — they register on import.
+
+### 2. Add the render function
+
+In `scanner/render.ts`:
+
+```ts
+export async function renderMyPolicy(ctx: RenderContext): Promise {
+  const templatePath = join(getTemplateDir(), "my-policy.hbs");
+  const templateSource = await readFileContent(templatePath);
+  const template = Handlebars.compile(templateSource);
+  return template({
+    config: ctx.config,
+    scanDate: formatScanDate(ctx.manifest.scanDate),
+    branch: ctx.manifest.branch,
+    commit: ctx.manifest.commit,
+    // ... any template-specific data
+  });
+}
+```
+
+### 3. Wire it into the scan pipeline
+
+`scanner/index.ts` renders policies inside `main()` after `scan()` returns. Add your renderer alongside the others, write the output to `policiesDir/.md`, and log the path.
+
+### 4. Extend `ArtifactStatus` and `scanArtifacts`
+
+`scanner/types.ts` → add a field to `ArtifactStatus` for your policy (use `"present" | "missing" | "not-applicable"` or `"generated" | "manual" | "missing"` depending on semantics).
+
+`scanner/rules/artifacts.ts` → check for your file and set the state.
+
+### 5. Credit the artifact in framework checks
+
+If the policy satisfies a specific framework control (like an EU AI Act article or a NIST CSF control), update the relevant check in `scanner/frameworks/*.ts` so presence of the file flips the check from `fail` → `partial` or `pass`.
+
+### 6. Keep it idempotent
+
+Scans run on every push and PR. The policy output MUST be byte-identical across scans with unchanged inputs, or every scan will produce a noisy commit. Two rules:
+
+- No scan date in the body (git history already records when).
+- No commit hash in the body (same reason).
+
+The bottom of each template has `Policy generated: {{scanDate}} — Branch: {{branch}} ({{commit}})` — that's intentionally the ONLY non-idempotent line.
+
+---
+
+## Adding a framework
+
+Frameworks map scan findings to external compliance standards (NIST CSF, EU AI Act, SOC 2, etc.).
+
+### 1. Define the controls
+
+Create `scanner/frameworks/my-framework.ts`. Each control is a value with:
+
+- `id` — framework-specific identifier
+- `function` / `phase` / category — grouping field
+- `description`
+- `check(manifest)` — returns `"pass" | "partial" | "fail" | "not-applicable"`
+- `evidence(manifest)` — human-readable reasoning string
+
+Look at `scanner/frameworks/eu-ai-act.ts` for a 13-control example and `scanner/frameworks/nist-csf.ts` for the 18-control NIST reference.
+
+### 2. Add cross-references
+
+`scanner/frameworks/cross-map.ts` stores mappings to other frameworks (SOC 2, ISO 27001, NIST AI RMF, ISO/IEC 42001). Extend the array for any cross-refs your framework has.
+
+### 3. Generate a report
+
+Copy `scanner/generators/framework-report.ts` as a starting point. Output goes to `.grc/-report.md`. Follow the same markdown structure (score, per-category breakdown, details per control, cross-reference tables, methodology, caveat).
+
+### 4. Surface on the dashboard
+
+Extend `dashboard/worker.ts` with a score computation (`calcScore`) and a per-category score helper. Add a new tab in `dashboard/views/render.ts` (mirroring the NIST CSF tab in `renderNistView`).
+
+### 5. Add to the stats row (optional)
+
+If the framework score belongs on the top of the dashboard, extend `renderDashboard`'s stats row.
+
+### 6. Don't overclaim coverage
+
+A framework mapping that covers 18 of NIST CSF 2.0's ~100 subcategories is not "NIST CSF compliant" — it's "75% of our 18 controls". Be explicit about partial coverage in the report methodology.
+
+---
+
+## Local-only things the scanner doesn't check for you
+
+- **Shell scripts:** no shellcheck run. Run it manually if you touch `action.yml` or deploy workflows.
+- **Handlebars template syntax:** templates compile lazily inside render functions. A typo only surfaces at scan time. Smoke-test locally with `npm run scan -- .` before pushing.
+- **Wrangler config:** if you edit `wrangler.toml`, boot `npx wrangler dev --local` to confirm it parses.
+
+---
+
+## Submitting changes
+
+Feature branches + PRs. The repo's `main` is protected — there's no direct push.
+
+- Keep commits coherent; one concept per commit where feasible.
+- Follow the existing commit message shape: a one-line subject, blank line, body explaining the "why" with enough context to remain useful in six months.
+- Update `docs/implementation-checklist.md` when you check off or add items.
+- CI must be green before merge.
+
+---
+
+## Getting help
+
+Open a discussion or issue in the upstream repo. For questions about the architecture, start with `CLAUDE.md` — it's kept up to date with the current shape.
diff --git a/README.md b/README.md
index 49d23aa..0ac36bd 100644
--- a/README.md
+++ b/README.md
@@ -25,31 +25,51 @@ Reports (`.grc/`) are gitignored and regenerated each scan. Policies (`docs/poli
 
 ### 1. Deploy the Dashboard
 
+There are two paths. Most forkers want **auto-deploy via GitHub Actions** — it's the supported production flow. **Local-only** is for development and iteration.
+
+#### Auto-deploy (production)
+
+1. **Fork this repo** (or clone into a repo you control).
+2. **Create a KV namespace** once, locally, to get an ID:
+   ```bash
+   npm install
+   npx wrangler login
+   npx wrangler kv namespace create GRC_KV
+   # Copy the id from the output (a 32-char hex string)
+   ```
+3. **Create a Cloudflare API token** at [dash.cloudflare.com/profile/api-tokens](https://dash.cloudflare.com/profile/api-tokens) with the "Edit Cloudflare Workers" template.
+4. **Add the secrets and vars to your forked repo** (Settings → Secrets and variables → Actions):
+   - **Secrets:**
+     - `CLOUDFLARE_API_TOKEN` — the token from step 3
+     - `CLOUDFLARE_KV_ID` — the KV id from step 2
+   - **Variables (optional):**
+     - `ORG_NAME` — displayed in the dashboard header
+     - `GRC_AUDIENCE` — OIDC audience the dashboard expects on incoming JWTs (defaults to `grc-dashboard`). Set it if you want consumer workflows pointed at your fork to pass a matching `audience:` input so tokens minted for your dashboard can't be replayed against another.
+5. **Push to `main`**. `.github/workflows/deploy.yml` runs automatically: it validates both secrets are present, injects the KV id into `wrangler.toml`, passes `ORG_NAME` via `--var` at deploy time, and runs `npx wrangler deploy`.
+
+After the first successful deploy, the dashboard is live at `https://grc-dashboard..workers.dev`. Point your consuming repos at it via the `dashboard_url` input on the action (step 2 below).
+
+#### Local-only (development)
+
+Miniflare provides an in-memory KV namespace, so you don't need a Cloudflare account or real secrets:
+
 ```bash
-git clone https://github.com/YOUR_ORG/GRC-Observability-Dashboard.git
-cd GRC-Observability-Dashboard
 npm install
+npx wrangler dev --local
+# open http://localhost:8787
+```
-
-# Login to Cloudflare
-npx wrangler login
-
-# Create KV storage
-npx wrangler kv namespace create GRC_KV
-# Copy the ID from the output
-
-# Edit wrangler.toml - paste the KV namespace ID
-# Optionally set ORG_NAME in [vars]
 
+The committed `wrangler.toml` carries the literal `YOUR_KV_NAMESPACE_ID` placeholder — miniflare ignores it. Don't commit a real id into the file; the auto-deploy workflow injects it at build time.
 
-# Run locally
-npx wrangler dev
+**Skipping OIDC locally.** The dashboard verifies every `POST /api/report` against GitHub's OIDC provider. For local iteration, create a `.dev.vars` file (gitignored) at the repo root:
 
-# Or deploy to Cloudflare
-npx wrangler deploy
+```
+GRC_AUTH_BYPASS=1
 ```
 
-**Authentication.** The dashboard verifies incoming manifest POSTs against GitHub's OIDC provider — no shared secret to configure. Consumer workflows mint a short-lived JWT that the dashboard validates against GitHub's public JWKS, and the token's `repository` claim must match the manifest's `repo` field. Fork deployers optionally set `GRC_AUDIENCE` in `[vars]` on `wrangler.toml` to scope tokens to their deployment (defaults to `grc-dashboard`).
+Never set this in production — the bypass is a development-only ergonomics flag.
 
-For local development with `wrangler dev`, set `GRC_AUTH_BYPASS=1` in your local `.dev.vars` to skip verification while you iterate; never set this in production.
+**Authentication in production.** The dashboard verifies incoming manifest POSTs against GitHub's OIDC provider — no shared secret to configure. Consumer workflows mint a short-lived JWT that the dashboard validates against GitHub's public JWKS, and the token's `repository` claim must match the manifest's `repo` field. Forks that want to scope tokens to their deployment can set `GRC_AUDIENCE` as a repo variable; it's passed to `wrangler deploy --var` alongside `ORG_NAME`. Defaults to `grc-dashboard`.
 
 ### 2. Add the Action to Your Repos
 
@@ -288,14 +308,16 @@ Your repo/
 | `site_url` | Live URL for the Check Production button | No |
 | `dashboard_url` | Dashboard URL to POST manifests to | No |
 
+## Contributing
+
+Dev loop, adding scan rules, adding policy templates, adding frameworks — all in [CONTRIBUTING.md](CONTRIBUTING.md).
+
 ## Future
 
-- **AI Compliance Layer** (next up): EU AI Act detection and risk tiering, AI system inventory, auto-generated model cards and FRIAs, dashboard AI compliance tab
 - GitHub App (zero-config install, no workflow file needed per repo)
 - SBOM generation (CycloneDX)
 - SAST via Semgrep integration
 - Auditor evidence export (PDF/ZIP per framework)
-- Dashboard authentication
 
 ## Roadmap
 
diff --git a/docs/implementation-checklist.md b/docs/implementation-checklist.md
index f31e8cb..231e1ba 100644
--- a/docs/implementation-checklist.md
+++ b/docs/implementation-checklist.md
@@ -194,13 +194,13 @@ Optional module — scanner works fully without AI. If an API key is provided, A
 - [x] Removed personal/draft files from repo
 - [x] Cleaned up outdated docs
 - [x] Deploy workflow injects KV ID and ORG_NAME from secrets/vars at deploy time (not in committed config)
-- [ ] **Validate the fork path end-to-end** — follow the README from a fresh fork and document broken steps
+- [x] **Validate the fork path end-to-end** — dry-ran from a fresh clone: `npm install` (127 packages, 3s, clean), `wrangler dev --local` boots against the placeholder KV id and serves `/health` + `/` correctly. Found a real bug in `.github/workflows/deploy.yml`: the ORG_NAME sed uncommented `ORG_NAME = "..."` but left `[vars]` commented, so the var ended up at top-level TOML and never reached the worker. Fixed by switching ORG_NAME + GRC_AUDIENCE to `wrangler deploy --var`. Added a preflight step that fails fast with a clear message when `CLOUDFLARE_API_TOKEN` or `CLOUDFLARE_KV_ID` is missing. Also fixed `hono` moderate CVE via `npm audit fix`.
 - [x] Add authentication to dashboard API — POST /api/report verifies a GitHub OIDC JWT against GitHub's JWKS and checks that the token's `repository` claim matches the manifest's `repo` field. No shared secret required — consumers add `id-token: write` to workflow permissions and the composite action handles token minting. Dashboard optionally overrides audience via `GRC_AUDIENCE` env var. `GRC_AUTH_BYPASS=1` available for local `wrangler dev` only. `/api/check-production` intentionally left unauthenticated — it's UI-triggered and can only act on already-authenticated stored state.
 
 ### Documentation
-- [ ] Contributing guide
-- [ ] How to add new scan rules
-- [ ] How to add new policy templates
+- [x] Contributing guide — `CONTRIBUTING.md` ships with the dev loop and three extension-point walkthroughs (scan rule, policy template, framework). Linked from the README.
+- [x] How to add new scan rules — in `CONTRIBUTING.md` "Adding a scan rule" section.
+- [x] How to add new policy templates — in `CONTRIBUTING.md` "Adding a policy template" section, including the idempotency rules that keep scans from producing noisy commits.
 
 ## Phase 7: Policy Deployment Flow — DONE
 
diff --git a/package-lock.json b/package-lock.json
index 8948072..2697106 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -1687,9 +1687,9 @@
       }
     },
     "node_modules/hono": {
-      "version": "4.12.12",
-      "resolved": "https://registry.npmjs.org/hono/-/hono-4.12.12.tgz",
-      "integrity": "sha512-p1JfQMKaceuCbpJKAPKVqyqviZdS0eUxH9v82oWo1kb9xjQ5wA6iP3FNVAPDFlz5/p7d45lO+BpSk1tuSZMF4Q==",
+      "version": "4.12.14",
+      "resolved": "https://registry.npmjs.org/hono/-/hono-4.12.14.tgz",
+      "integrity": "sha512-am5zfg3yu6sqn5yjKBNqhnTX7Cv+m00ox+7jbaKkrLMRJ4rAdldd1xPd/JzbBWspqaQv6RSTrgFN95EsfhC+7w==",
       "license": "MIT",
       "engines": {
         "node": ">=16.9.0"