A mine's canary dies first so the miners live — these nine die first so your codebase lives.
9 Quality-Safeguard Canaries for AI Coding Agents — Detect code rot, weak rules, hallucinations, supply-chain vulnerabilities, brittle architectures, and API contract drift before they pollute your codebase.
Design Principles · Benchmark · Contributing · Changelog · Security · Privacy · Releases
Part of TheColliery — siblings: CoalTipple (model/effort routing) · CoalBoard (consensus & debate board).
| Skill Name | Catches | Run Mode |
|---|---|---|
| rot-canary | Dead code, bugs, resource leaks, race conditions, silent failures, stale docs | Auto + Manual (runs on session end / manual trigger) |
| gold-standard | Audits project completeness against world-class exemplars | One-time (triggered once, governs the session) |
| source-grounding | Prevents AI hallucinations by forcing cross-source verification | Always-on (background rule for all chat sessions) |
| supply-chain-audit | Audits dependency vulnerabilities, licenses, phone-home code, and build/CI security | On-demand (manually run when relevant) |
| resilience-audit | Audits failure path handling (FMEA), rollbacks, retry limits, and idempotency | On-demand (manually run when relevant) |
| telemetry-canary | Audits observability, log structures, metrics, and telemetry quality | On-demand (manually run when relevant) |
| testability-canary | Audits testing ease, code coupling, mockability, and Dependency Injection (DI) | On-demand (manually run when relevant) |
| scale-canary | Audits performance scaling issues, | On-demand (manually run when relevant) |
| drift-canary | Prevents contract and schema drift (API/database contract inconsistencies) | On-demand (manually run when relevant) |
Run Mode Details:
- 📌 Always-on: Runs implicitly in the background to verify facts.
- 🔄 Auto + Manual: Scans affected files at session end via lifecycle hooks (auto-wired in Claude Code; manual snippets in platform-configs/hooks/ for other agents). Manual trigger via /rot-canary.
- ⚡ One-time: Governs the session by scanning and filling project-local rules.
- 🎯 On-demand: Manually run for specific tasks to conserve tokens.
Canaries follow three rules: ground every finding in evidence, never inflate grades, and report before fixing. Fixes apply through a safe loop: Stash/Commit -> Apply fix -> Run build+tests -> Auto-revert if tests fail.
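The safe loop is easier to reason about as control flow. The sketch below is illustrative only: `applyFixSafely` and the four callbacks on `steps` are hypothetical names, not CoalMine APIs, and the stand-ins in the usage example replace whatever VCS and test runner a real host would call.

```javascript
// Hypothetical sketch of the safe fix loop; none of these names come from CoalMine.
// `steps` supplies the four operations so the control flow can be shown without
// assuming a particular VCS or build tool.
function applyFixSafely(steps) {
  const checkpoint = steps.saveCheckpoint(); // e.g. a stash or a commit
  try {
    steps.applyFix();
    const passed = steps.runBuildAndTests(); // expected to return true on success
    if (!passed) {
      steps.revertTo(checkpoint);
      return { applied: false, reason: "tests failed, reverted" };
    }
    return { applied: true, reason: "tests passed" };
  } catch (err) {
    // A crash while applying or testing is treated like a failed test run.
    steps.revertTo(checkpoint);
    return { applied: false, reason: `error, reverted: ${err.message}` };
  }
}

// Usage with in-memory stand-ins for the VCS and the test runner.
const fixLog = [];
const fixResult = applyFixSafely({
  saveCheckpoint: () => { fixLog.push("checkpoint"); return "cp-1"; },
  applyFix: () => { fixLog.push("fix"); },
  runBuildAndTests: () => { fixLog.push("test"); return false; },
  revertTo: (cp) => { fixLog.push(`revert:${cp}`); },
});
console.log(fixResult.applied, fixLog.join(" > "));
```

Taking the checkpoint before the `try` block means a failure to create it aborts the loop before anything is modified.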
SKILL.md is an open standard compatible with all major AI coding agents:
| AI Agent | Target Skills Folder | Installation Shortcut | Choice Tool Support |
|---|---|---|---|
| Claude Code | plugin cache (recommended) or ~/.claude/skills/ | /plugin install coalmine@coalmine | ✅ Native: AskUserQuestion |
| Antigravity | .agents/skills/ | node scripts/install.mjs antigravity | ✅ Native: built-in question prompt |
| Cursor | .cursor/skills/ | node scripts/install.mjs cursor | ✅ Native: built-in ask-question tool |
| Windsurf | .windsurf/skills/ | node scripts/install.mjs windsurf | ✅ Native: suggested_responses |
| GitHub Copilot | .github/skills/ | node scripts/install.mjs copilot | ✅ Native: askQuestions |
| Cline | .claude/skills/ | node scripts/install.mjs cline | ✅ Native: ask_question |
| Gemini CLI | .gemini/skills/ | node scripts/install.mjs gemini | ✅ Native: ask_user |
| Goose | .agents/skills/ | node scripts/install.mjs goose | |
| Amp | .agents/skills/ | node scripts/install.mjs amp | |
| Junie | .junie/skills/ | node scripts/install.mjs junie | |
| Codex | .agents/skills/ | node scripts/install.mjs codex | ✅ Native: request_user_input |
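Skill paths follow the cross-vendor Agent Skills spec. Antigravity, Goose, Amp, and Codex share .agents/skills/; Cline reads .claude/skills/, Junie reads .junie/skills/, and the remaining agents use their own vendor folder as listed in the table.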
| Part | Portable? |
|---|---|
| The 9 skills (the audits) | ✅ All targets natively via Agent Skills spec |
| Interactive choice menus (ask_question) | ✅ Native question tools on most agents; text fallback on Goose/Amp/Junie |
| Sub-agent fan-out + tiers | ✅ Supported if host has sub-agent system; inline fallback |
| rot-canary auto-cadence | ✅ Auto-wired on Claude Code; 🔧 manual snippets in platform-configs/hooks/ for other major agents; ⛔ unsupported on Cline/Junie |
Manual Fallback: Copy conformed skill body from plugin/skills/<name>/SKILL.md (strip YAML frontmatter) into AGENTS.md / rules file.
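For the manual fallback, stripping the frontmatter can be scripted. This is a minimal sketch, assuming the frontmatter is a leading block delimited by two `---` lines; `stripFrontmatter` and the sample fields are illustrative, not part of CoalMine.

```javascript
// Sketch: remove a leading YAML frontmatter block ("---" ... "---") from a
// SKILL.md body. `stripFrontmatter` is an illustrative name, not a CoalMine API.
function stripFrontmatter(markdown) {
  const lines = markdown.split(/\r?\n/);
  if (lines[0].trim() !== "---") return markdown; // no frontmatter present
  const close = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (close === -1) return markdown; // unterminated block: leave the text untouched
  // Drop the block and any blank lines that directly follow it.
  return lines.slice(close + 1).join("\n").replace(/^\n+/, "");
}

// Made-up sample; the real SKILL.md files carry their own fields.
const sampleSkill = [
  "---",
  "name: rot-canary",
  "description: example only",
  "---",
  "",
  "# rot-canary",
  "Body text.",
].join("\n");
console.log(stripFrontmatter(sampleSkill));
```

The function returns the input unchanged when no complete frontmatter block is found, so running it twice on the same text is harmless.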
/plugin marketplace add HetCreep/CoalMine
/plugin install coalmine@coalmine
🔧 Maintainers: plugin/ is generated output. After edits in skills/, skills/_shared/, hooks/, or .claude-plugin/plugin.json, run node scripts/build-plugin.mjs.
npx skills add HetCreep/CoalMine
git clone https://github.com/HetCreep/CoalMine.git
Run from your project's root folder (not inside the CoalMine clone):
cd /path/to/your-project
node /path/to/CoalMine/scripts/install.mjs <agent|all|PATH>
- Passing all auto-detects and installs to all configured agents in the directory.
- The installer sets up pre-commit/pre-push gates in .git/hooks, writes trigger rules, and generates the .coalmine.json config.
- Verify: node /path/to/CoalMine/scripts/verify.mjs <agent|PATH>
- Uninstall: node scripts/install.mjs --uninstall <agent|PATH>
Installing is the power button. The agent conducts the canaries and asks for consent before running expensive tasks:
| What | When it fires | Your part |
|---|---|---|
| gold-standard | Offered once on new projects, and again when a rule's revalidate date passes | Run now / Queue / Skip |
| rot-canary | Auto-scans touched files at session end (QUICK); findings end with a fix menu | Choose a fix option |
| Specialists | Offered when conversation enters their domain (deps, schemas, async, loops, etc.) | Accept / Skip |
| source-grounding | Always-on background fact verification | — |
Consent Rule: Nothing expensive runs silently. Revocable via .coalmine.json, ~/.claude/.rot-canary-off, or --uninstall.
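One of the revocation paths is the config file. The sketch below shows how a host could honor the disabledCanaries key from .coalmine.json, assuming the parsed config is passed in as a plain object. `isCanaryEnabled` is a hypothetical helper, and the real check in CoalMine may differ.

```javascript
// Hypothetical helper: decide whether a canary may run, based on the
// `disabledCanaries` array (e.g. ["rot-canary"] or ["all"]).
function isCanaryEnabled(config, canaryName) {
  const disabled = Array.isArray(config.disabledCanaries)
    ? config.disabledCanaries
    : []; // missing or malformed key: nothing is disabled
  return !disabled.includes("all") && !disabled.includes(canaryName);
}

console.log(isCanaryEnabled({ disabledCanaries: ["rot-canary"] }, "rot-canary"));
console.log(isCanaryEnabled({ disabledCanaries: ["rot-canary"] }, "drift-canary"));
console.log(isCanaryEnabled({ disabledCanaries: ["all"] }, "drift-canary"));
```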
- Work Execution Gate: Before starting a task, the agent presents a confirm menu:
  - Do now: Assess scope, recommend tier, and execute.
  - Add to plan: Queue in task.md.
  - View full plan: View queued tasks, adjust tiers, and run.
- Haldane Safety Protocol (for sub-agents):
  - Active files edited by sub-agents are marked [/] in-flight in task.md to prevent collisions.
  - If conversation shifts to topics affecting in-flight files, the agent warns you first.
- Proactive Suggestions: The agent automatically offers canary runs via ask_question when relevant changes (e.g., adding a package) are detected.
- General Users (Zero-Config): Automatically generated .coalmine.json pre-configured with safe, token-optimal defaults.
- Programmers (Overrides): Inline comments document every key. Run the configurator tool or edit manually.
The full key set (the source of truth is scripts/lib/config-schema.mjs; the shipped platform-configs/.coalmine.json template documents every key inline):
| Key | Type | Default | Description |
|---|---|---|---|
| language | String | auto | Override language detection (auto \| en \| th \| ja \| zh \| es) |
| enableConductor | Boolean | true | Set false to disable rules injection on Session Start |
| skipOnboarding | Boolean | false | Skip the gold-standard onboarding offer at session start |
| defaultTier | String | auto | Force execution tier (Light \| Standard \| Heavy \| auto) |
| rotCanaryMode | String | auto | rot-canary auto-scan mode (auto \| manual \| off) |
| autoScanFileCap | Number | 10 | Maximum touched files to scan at session end |
| autoScanFileCapSlice | Number | 5 | Most-recently-modified files kept when autoScanFileCap is exceeded (a count, not a fraction) |
| tripwireMaxFileSizeKb | Number | 100 | Size limit in KB for the tripwire scan |
| tripwireMaxLines | Number | 800 | Line count that flags a file as a smell |
| tempSweepStaleDays | Number | 7 | Age in days before session temp files are swept |
| watchedExtensions | Array | [] | File extensions the touch hook watches (empty = defaults) |
| updateMode | String | ask | Self-update behavior at session start (ask \| auto \| remind \| off) |
| updateCheckDays | Number | 14 | Days between self-update checks/reminders |
| ruleRevalidateDays | Number | 90 | Days before general rules need re-validation |
| platformRuleRevalidateDays | Number | 30 | Days before platform/model rules need re-validation |
| definitionRevalidateDays | Number | 90 | Days before general reference definitions are stale |
| platformDefinitionRevalidateDays | Number | 30 | Days before platform definitions are stale |
| disabledCanaries | Array | [] | Canaries to disable (e.g. ["rot-canary"] or ["all"]) |
| autoFixMode | String | interactive | Default fix-mode behavior (interactive \| safe \| off) |
| schemaPaths | Array | [] | Glob paths to schemas / API specs |
| migrationDirs | Array | [] | Database migration directories |
| packageManifests | Array | [] | Package manifest / lockfile paths |
| trustedDomains | Array | [] | Extra trusted domains for source grounding |
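The interaction between autoScanFileCap and autoScanFileCapSlice is the least obvious part of the key set. The sketch below shows one reading of the two descriptions: under the cap everything is scanned, over the cap only the most recently modified files are kept. `selectFilesToScan` and the `{ path, mtimeMs }` shape are assumptions for illustration; the authoritative behavior lives in the project's scripts.

```javascript
// Defaults taken from the key table; the selection logic itself is an assumption.
const SCAN_DEFAULTS = { autoScanFileCap: 10, autoScanFileCapSlice: 5 };

// `touched` is a hypothetical list of { path, mtimeMs } records for files
// modified during the session.
function selectFilesToScan(touched, config = {}) {
  const { autoScanFileCap, autoScanFileCapSlice } = { ...SCAN_DEFAULTS, ...config };
  if (touched.length <= autoScanFileCap) {
    return touched.map((file) => file.path); // under the cap: scan everything
  }
  // Over the cap: keep only the N most recently modified files (a count).
  return [...touched]
    .sort((a, b) => b.mtimeMs - a.mtimeMs)
    .slice(0, autoScanFileCapSlice)
    .map((file) => file.path);
}

// Twelve touched files exceed the default cap of 10, so only 5 are kept.
const touchedFiles = Array.from({ length: 12 }, (_, i) => ({
  path: `src/file-${i}.js`,
  mtimeMs: 1000 + i,
}));
console.log(selectFilesToScan(touchedFiles));
```

Under this reading, raising only the cap (for example with --file-cap 15) would make the same twelve files scan in full.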
# Set language and cap limit
node scripts/configure.mjs --language th --file-cap 15
# Disable specific canaries
node scripts/configure.mjs --disable rot-canary,drift-canary
Canaries report in a lean shape (one-line verdict plus a severity table of confirmed findings) to save tokens:
| # | path:line | category | severity | finding | evidence |
Severity levels: CRITICAL · HIGH · MEDIUM · LOW. Clean scan outputs a single line.
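As a rough illustration of the lean shape, the sketch below renders a list of findings into the column layout shown above, or a single line when the scan is clean. `renderReport`, the exact verdict wording, and the sample findings are all invented for this example; only the column names and severity levels come from this README.

```javascript
const SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];

// Hypothetical renderer: one-line verdict, then a table sorted by severity.
function renderReport(canary, findings) {
  if (findings.length === 0) return `${canary}: clean`; // wording is an assumption
  const sorted = [...findings].sort(
    (a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity)
  );
  const verdict = `${canary}: ${sorted.length} confirmed finding(s), worst ${sorted[0].severity}`;
  const header = "| # | path:line | category | severity | finding | evidence |";
  const divider = "|---|---|---|---|---|---|";
  const rows = sorted.map(
    (f, i) =>
      `| ${i + 1} | ${f.path}:${f.line} | ${f.category} | ${f.severity} | ${f.finding} | ${f.evidence} |`
  );
  return [verdict, header, divider, ...rows].join("\n");
}

// Invented findings, for illustration only.
const sampleFindings = [
  { path: "src/a.js", line: 10, category: "dead-code", severity: "LOW", finding: "unused export", evidence: "no importers found" },
  { path: "src/b.js", line: 4, category: "resource-leak", severity: "HIGH", finding: "handle never closed", evidence: "open() without close()" },
];
console.log(renderReport("rot-canary", sampleFindings));
console.log(renderReport("rot-canary", []));
```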
| Tier | Trigger | Orchestration | Token Cost |
|---|---|---|---|
| Light | Small scope / targeted review | Primary agent, quick | Very Low 🟢 |
| Standard | Moderate scope / module review | Multi-threaded routing, detailed | Moderate 🟡 |
| Heavy | Large scope / release prep | Sub-agent fan-out, deep paths | High 🔴 |
Headline: rot-canary scored 100% recall · 100% precision · 0/4 decoy false-positives over the 16-fixture corpus, scored mechanically (skill v3.4.0, 2026-06-13).
rot-canary is measured AV-Comparatives-style — recall, precision, decoy false-positives, and severity accuracy over a fixed fixture corpus, scored mechanically, cross-engine. Honest scope: small, dated samples authored in-project — a regression floor, not an independent benchmark; re-run on model/skill changes.
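For readers unfamiliar with the metrics, recall is TP / (TP + FN) and precision is TP / (TP + FP). The sketch below computes them mechanically from per-fixture verdicts, assuming each fixture is either a seeded defect or a decoy and yields a single flagged/not-flagged result. `scoreRun` and the sample data are hypothetical and are not the project's benchmark harness or its results.

```javascript
// Hypothetical scorer: each fixture is { isDecoy, flagged }.
function scoreRun(fixtures) {
  let tp = 0, fp = 0, fn = 0, decoys = 0, decoyFp = 0;
  for (const fx of fixtures) {
    if (fx.isDecoy) {
      decoys += 1;
      if (fx.flagged) { fp += 1; decoyFp += 1; } // flagged decoy = false positive
    } else if (fx.flagged) {
      tp += 1; // seeded defect that was caught
    } else {
      fn += 1; // seeded defect that was missed
    }
  }
  return {
    recall: tp + fn === 0 ? null : tp / (tp + fn),
    precision: tp + fp === 0 ? null : tp / (tp + fp),
    decoyFalsePositives: `${decoyFp}/${decoys}`,
  };
}

// Made-up sample: 3 seeded defects (2 caught) and 2 decoys (1 wrongly flagged).
const sampleRun = [
  { isDecoy: false, flagged: true },
  { isDecoy: false, flagged: true },
  { isDecoy: false, flagged: false },
  { isDecoy: true, flagged: true },
  { isDecoy: true, flagged: false },
];
console.log(scoreRun(sampleRun));
```

Severity accuracy is left out of the sketch because it needs an expected severity per fixture, which this README does not describe.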
Full method, per-category scoring, and the cross-engine comparison live in the series records: TheColliery/.github/benchmarks/CoalMine.
Bound by the 11 principles of the Quantum Computer Spec: maximum performance, zero visible errors, single-brand, minimum power, essential accessories, error correction, determinism, isolation, measurement, trustworthiness, and entanglement.
CoalMine shares its engineering doctrine with CoalTipple (model/effort routing) and CoalBoard (consensus & debate board): Phoenix-13 hooks (zero-dependency, no network, fail-silent, no child processes, deterministic), single-source-of-truth config schemas, and a strict no-overkill discipline. Install one and it stands alone; install all and they compose without conflict.
MIT License. See LICENSE for details.