HetCreep/CoalMine

🐤 CoalMine

A mine's canary dies first so the miners live — these nine die first so your codebase lives.

9 Quality-Safeguard Canaries for AI Coding Agents — Detect code rot, weak rules, hallucinations, supply-chain vulnerabilities, brittle architectures, and API contract drift before they pollute your codebase.

Design Principles · Benchmark · Contributing · Changelog · Security · Privacy · Releases

Part of TheColliery — siblings: CoalTipple (model/effort routing) · CoalBoard (consensus & debate board).


🐤 The 9 Canaries

| Skill Name | Catches | Run Mode |
| --- | --- | --- |
| rot-canary | Dead code, bugs, resource leaks, race conditions, silent failures, stale docs | Auto + Manual (runs on session end / manual trigger) |
| gold-standard | Audits project completeness against world-class exemplars | One-time (triggered once, governs the session) |
| source-grounding | Prevents AI hallucinations by forcing cross-source verification | Always-on (background rule for all chat sessions) |
| supply-chain-audit | Audits dependency vulnerabilities, licenses, phone-home code, and build/CI security | On-demand (manually run when relevant) |
| resilience-audit | Audits failure path handling (FMEA), rollbacks, retry limits, and idempotency | On-demand (manually run when relevant) |
| telemetry-canary | Audits observability, log structures, metrics, and telemetry quality | On-demand (manually run when relevant) |
| testability-canary | Audits testing ease, code coupling, mockability, and Dependency Injection (DI) | On-demand (manually run when relevant) |
| scale-canary | Audits performance scaling issues, $O(N^2)$ loops, and duplicate (N+1) database queries | On-demand (manually run when relevant) |
| drift-canary | Prevents contract and schema drift (API/database contract inconsistencies) | On-demand (manually run when relevant) |

Run Mode Details:

  • 📌 Always-on: Runs implicitly in the background to verify facts.
  • 🔄 Auto + Manual: Scans affected files at session end via lifecycle hooks (auto-wired in Claude Code; manual snippets in platform-configs/hooks/ for other agents). Manual trigger via /rot-canary.
  • One-time: Governs the session by scanning and filling project-local rules.
  • 🎯 On-demand: Manually run for specific tasks to conserve tokens.

Canaries follow three rules: ground every finding in evidence, never inflate grades, and report before fixing. Fixes apply through a safe loop: Stash/Commit -> Apply fix -> Run build+tests -> Auto-revert if tests fail.
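
The safe loop can be pictured with a small sketch. This is not CoalMine's implementation: `runSafeFix` and its four injected steps (`checkpoint`, `applyFix`, `runChecks`, `revert`) are hypothetical names chosen for illustration, and a real version would presumably wrap git and the project's own build and test commands.

```javascript
// Hypothetical sketch of the safe fix loop; none of these names come from CoalMine.
// Each step is injected so the loop itself stays free of git or build specifics.
function runSafeFix({ checkpoint, applyFix, runChecks, revert }) {
  const log = [];

  // 1. Stash/Commit: record a restore point before touching anything.
  const restorePoint = checkpoint();
  log.push("checkpoint");

  try {
    // 2. Apply the proposed fix.
    applyFix();
    log.push("apply");

    // 3. Run build + tests; a falsy result counts as failure.
    const passed = Boolean(runChecks());
    log.push(passed ? "checks:pass" : "checks:fail");

    if (!passed) {
      // 4. Auto-revert when checks fail.
      revert(restorePoint);
      log.push("revert");
    }
    return { kept: passed, log };
  } catch (err) {
    // A crash while applying or checking is treated like a failed check.
    revert(restorePoint);
    log.push("error", "revert");
    return { kept: false, log, error: String(err) };
  }
}

// Usage with an in-memory "working tree" standing in for real files.
let tree = { "a.js": "old" };
const result = runSafeFix({
  checkpoint: () => ({ ...tree }),
  applyFix: () => { tree["a.js"] = "new"; },
  runChecks: () => false, // simulate a failing test run
  revert: (snapshot) => { tree = snapshot; },
});
console.log(result.kept, tree["a.js"]); // false old
```

Because the simulated checks fail, the fix is rolled back and the file content returns to its checkpointed value.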


🔌 Universal Agent Support

SKILL.md is an open standard compatible with all major AI coding agents:

| AI Agent | Target Skills Folder | Installation Shortcut | Choice Tool Support |
| --- | --- | --- | --- |
| Claude Code | plugin cache (recommended) or ~/.claude/skills/ | /plugin install coalmine@coalmine | Native: AskUserQuestion |
| Antigravity | .agents/skills/ | node scripts/install.mjs antigravity | Native: built-in question prompt |
| Cursor | .cursor/skills/ | node scripts/install.mjs cursor | Native: built-in ask-question tool |
| Windsurf | .windsurf/skills/ | node scripts/install.mjs windsurf | Native: suggested_responses |
| GitHub Copilot | .github/skills/ | node scripts/install.mjs copilot | Native: askQuestions |
| Cline | .claude/skills/ | node scripts/install.mjs cline | Native: ask_question |
| Gemini CLI | .gemini/skills/ | node scripts/install.mjs gemini | Native: ask_user |
| Goose | .agents/skills/ | node scripts/install.mjs goose | ⚠️ Text Fallback: no question tool |
| Amp | .agents/skills/ | node scripts/install.mjs amp | ⚠️ Text Fallback: tool not documented |
| Junie | .junie/skills/ | node scripts/install.mjs junie | ⚠️ Text Fallback: tool not documented |
| Codex | .agents/skills/ | node scripts/install.mjs codex | Native: request_user_input |

Skill paths follow the cross-vendor Agent Skills spec. Cline reads .claude/skills/ and Junie reads .junie/skills/; agents without a vendor-specific folder (Antigravity, Goose, Amp, Codex) use .agents/skills/.

What ports where

| Part | Portable? |
| --- | --- |
| The 9 skills (the audits) | ✅ All targets natively via Agent Skills spec |
| Interactive choice menus (ask_question) | ✅ Native question tools on most agents; text fallback on Goose/Amp/Junie |
| Sub-agent fan-out + tiers | ✅ Supported if host has sub-agent system; inline fallback |
| rot-canary auto-cadence | ✅ Auto-wired on Claude Code; 🔧 manual snippets in platform-configs/hooks/ for other major agents; ⛔ unsupported on Cline/Junie |

Manual Fallback: Copy the conformed skill body from plugin/skills/<name>/SKILL.md (with the YAML frontmatter stripped) into your AGENTS.md or rules file.
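
For the manual fallback, stripping the frontmatter by hand is easy to get wrong on longer files. The sketch below shows one way to do it; `stripFrontmatter` is a hypothetical helper that does not ship with CoalMine, and the sample frontmatter fields are made up.

```javascript
// Hypothetical helper; not part of CoalMine's scripts.
// Removes a leading YAML frontmatter block (--- ... ---) from SKILL.md text.
function stripFrontmatter(markdown) {
  // Normalize Windows line endings so the delimiter check stays simple.
  const text = markdown.replace(/\r\n/g, "\n");
  const lines = text.split("\n");

  // Frontmatter only counts when the very first line is the '---' delimiter.
  if (lines[0].trim() !== "---") return text;

  // Find the closing '---' delimiter after the first line.
  const end = lines.findIndex((line, i) => i > 0 && line.trim() === "---");
  if (end === -1) return text; // unterminated block: leave the text untouched

  // Drop the block and any blank lines directly after it.
  return lines.slice(end + 1).join("\n").replace(/^\n+/, "");
}

const sample = [
  "---",
  "name: rot-canary",
  "description: example frontmatter (made up for this sketch)",
  "---",
  "",
  "# rot-canary",
  "Skill body starts here.",
].join("\n");

console.log(stripFrontmatter(sample));
```

The printed result is only the skill body (the heading line and the sentence after it), ready to paste into a rules file.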


🚀 Install

Option A — Claude Code Plugin (No clone needed)

/plugin marketplace add HetCreep/CoalMine
/plugin install coalmine@coalmine

🔧 Maintainers: plugin/ is generated output. After edits in skills/, skills/_shared/, hooks/, or .claude-plugin/plugin.json, run node scripts/build-plugin.mjs.

Option A2 — skills.sh (One line)

npx skills add HetCreep/CoalMine

Option B — Universal Installer

1. Clone the Repository

git clone https://github.com/HetCreep/CoalMine.git

2. Run the Installer

Run from your project's root folder (not inside the CoalMine clone):

cd /path/to/your-project
node /path/to/CoalMine/scripts/install.mjs <agent|all|PATH>
  • all detects every agent configured in the directory and installs to each one.
  • The installer sets up pre-commit/pre-push gates in .git/hooks, writes trigger rules, and generates .coalmine.json config.

3. Verify & Uninstall

  • Verify: node /path/to/CoalMine/scripts/verify.mjs <agent|PATH>
  • Uninstall: node /path/to/CoalMine/scripts/install.mjs --uninstall <agent|PATH>

🔋 One button: install — the suite drives itself

Installing is the power button. The agent conducts the canaries and asks for consent before running expensive tasks:

| What | When it fires | Your part |
| --- | --- | --- |
| gold-standard | Offered once on new projects, and again when a rule's revalidate date passes | Run now / Queue / Skip |
| rot-canary | Auto-scans touched files at session end (QUICK); findings end with a fix menu | Choose a fix option |
| Specialists | Offered when conversation enters their domain (deps, schemas, async, loops, etc.) | Accept / Skip |
| source-grounding | Always-on background fact verification | |

Consent Rule: Nothing expensive runs silently. Revocable via .coalmine.json, ~/.claude/.rot-canary-off, or --uninstall.
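
The session-end auto-scan is bounded by the autoScanFileCap and autoScanFileCapSlice keys described in the configuration schema. One plausible reading of how the two interact is sketched below; `selectFilesToScan` and the `{ path, mtimeMs }` record shape are assumptions for illustration, and the shipped hook may select files differently.

```javascript
// Hypothetical sketch; the real selection logic lives in CoalMine's hooks and may differ.
// Assumption: when more than `cap` files were touched, only the `slice`
// most recently modified ones are kept for the QUICK scan.
function selectFilesToScan(touched, { cap = 10, slice = 5 } = {}) {
  // At or under the cap: scan everything that was touched.
  if (touched.length <= cap) return touched.map((f) => f.path);

  return [...touched]
    .sort((a, b) => b.mtimeMs - a.mtimeMs) // newest first
    .slice(0, slice)
    .map((f) => f.path);
}

// 12 touched files with increasing modification times (made-up data).
const touched = Array.from({ length: 12 }, (_, i) => ({
  path: `src/file${i}.js`,
  mtimeMs: 1000 + i,
}));

console.log(selectFilesToScan(touched));
```

With 12 touched files and the default cap of 10, only the five newest paths (src/file11.js down to src/file7.js) are returned.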


🛡️ Work Execution Gate & Haldane Safety

  1. Work Execution Gate: Before starting a task, the agent presents a confirm menu:
    • Do now: Assess scope, recommend tier, and execute.
    • Add to plan: Queue in task.md.
    • View full plan: View queued tasks, adjust tiers, and run.
  2. Haldane Safety Protocol (for sub-agents):
    • Active files edited by sub-agents are marked [/] in-flight in task.md to prevent collisions.
    • If conversation shifts to topics affecting in-flight files, the agent warns you first.
  3. Proactive Suggestions: The agent automatically offers canary runs via ask_question when relevant changes (e.g., adding a package) are detected.
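
The collision check behind the Haldane Safety Protocol can be sketched as follows. The task.md line layout used here (a list bullet, the [/] marker, then a file path) is assumed for illustration, as are the function names `inFlightFiles` and `findCollisions`; the actual file format may differ.

```javascript
// Hypothetical sketch: the task.md layout below is assumed, not taken from CoalMine.
// Assumed line shape: "- [/] path/to/file.js  (optional note)"
function inFlightFiles(taskMd) {
  const files = new Set();
  for (const line of taskMd.split("\n")) {
    // Capture the first token after a "[/]" in-flight marker.
    const match = line.match(/^\s*[-*]\s*\[\/\]\s+(\S+)/);
    if (match) files.add(match[1]);
  }
  return files;
}

// Returns the files a new request would touch that a sub-agent is still editing.
function findCollisions(taskMd, requestedFiles) {
  const busy = inFlightFiles(taskMd);
  return requestedFiles.filter((file) => busy.has(file));
}

const taskMd = [
  "- [x] src/db.js",
  "- [/] src/api/users.js",
  "- [ ] src/cache.js",
  "- [/] src/auth.js",
].join("\n");

console.log(findCollisions(taskMd, ["src/auth.js", "src/cache.js"])); // [ 'src/auth.js' ]
```

A non-empty result is the cue for the agent to warn before proceeding.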

⚙️ Configure (.coalmine.json)

  • General Users (Zero-Config): The installer generates a .coalmine.json pre-configured with safe, token-optimal defaults.
  • Programmers (Overrides): Inline comments document every key. Run the configurator tool or edit the file manually.

Configuration Schema

The full key set (the source of truth is scripts/lib/config-schema.mjs; the shipped platform-configs/.coalmine.json template documents every key inline):

| Key | Type | Default | Description |
| --- | --- | --- | --- |
| language | String | auto | Override language detection (auto, en, th, ja, zh, es) |
| enableConductor | Boolean | true | Set false to disable rules injection on Session Start |
| skipOnboarding | Boolean | false | Skip the gold-standard onboarding offer at session start |
| defaultTier | String | auto | Force execution tier (Light, Standard, Heavy, auto) |
| rotCanaryMode | String | auto | rot-canary auto-scan mode (auto, manual, off) |
| autoScanFileCap | Number | 10 | Maximum touched files to scan at session end |
| autoScanFileCapSlice | Number | 5 | Most-recently-modified files kept when autoScanFileCap is exceeded (a count, not a fraction) |
| tripwireMaxFileSizeKb | Number | 100 | Size limit in KB for the tripwire scan |
| tripwireMaxLines | Number | 800 | Line count that flags a file as a smell |
| tempSweepStaleDays | Number | 7 | Age in days before session temp files are swept |
| watchedExtensions | Array | [] | File extensions the touch hook watches (empty = defaults) |
| updateMode | String | ask | Self-update behavior at session start (ask, auto, remind, off) |
| updateCheckDays | Number | 14 | Days between self-update checks/reminders |
| ruleRevalidateDays | Number | 90 | Days before general rules need re-validation |
| platformRuleRevalidateDays | Number | 30 | Days before platform/model rules need re-validation |
| definitionRevalidateDays | Number | 90 | Days before general reference definitions are stale |
| platformDefinitionRevalidateDays | Number | 30 | Days before platform definitions are stale |
| disabledCanaries | Array | [] | Canaries to disable (e.g. ["rot-canary"] or ["all"]) |
| autoFixMode | String | interactive | Default fix-mode behavior (interactive, safe, off) |
| schemaPaths | Array | [] | Glob paths to schemas / API specs |
| migrationDirs | Array | [] | Database migration directories |
| packageManifests | Array | [] | Package manifest / lockfile paths |
| trustedDomains | Array | [] | Extra trusted domains for source grounding |
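
To illustrate how overrides might layer over these defaults, here is a minimal sketch covering a handful of the keys above. The defaults and allowed values are copied from the table; `resolveConfig`, its warning messages, and the fall-back-to-default behavior are assumptions, since the authoritative logic lives in scripts/lib/config-schema.mjs.

```javascript
// Hypothetical sketch; the real schema is scripts/lib/config-schema.mjs and may differ.
// Only a subset of keys is modeled here, so other real keys would be reported as unknown.
const DEFAULTS = {
  language: "auto",
  rotCanaryMode: "auto",
  autoScanFileCap: 10,
  autoFixMode: "interactive",
  disabledCanaries: [],
};

const ALLOWED = {
  language: ["auto", "en", "th", "ja", "zh", "es"],
  rotCanaryMode: ["auto", "manual", "off"],
  autoFixMode: ["interactive", "safe", "off"],
};

// Layers user overrides over the defaults, keeping the default
// (and recording a warning) when a value has the wrong type or is not allowed.
function resolveConfig(overrides = {}) {
  const config = { ...DEFAULTS };
  const warnings = [];

  for (const [key, value] of Object.entries(overrides)) {
    if (!(key in DEFAULTS)) {
      warnings.push(`unknown key: ${key}`);
      continue;
    }
    const sameType =
      Array.isArray(DEFAULTS[key]) === Array.isArray(value) &&
      typeof DEFAULTS[key] === typeof value;
    const allowed = !ALLOWED[key] || ALLOWED[key].includes(value);
    if (sameType && allowed) {
      config[key] = value;
    } else {
      warnings.push(`invalid value for ${key}; using default`);
    }
  }
  return { config, warnings };
}

const { config, warnings } = resolveConfig({
  language: "th",
  autoScanFileCap: 15,
  autoFixMode: "yolo", // not an allowed value, so the default is kept
});
console.log(config.language, config.autoScanFileCap, config.autoFixMode, warnings);
```

Here language and autoScanFileCap take the override values, while autoFixMode stays at interactive and produces one warning.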

Configurator Utility

# Set language and cap limit
node scripts/configure.mjs --language th --file-cap 15

# Disable specific canaries
node scripts/configure.mjs --disable rot-canary,drift-canary

📝 Ultra-Short Summary Format

Canaries report in a lean shape (one-line verdict + severity table of confirmed findings) to save tokens:

| # | path:line | category | severity | finding | evidence |

Severity levels: CRITICAL · HIGH · MEDIUM · LOW. Clean scan outputs a single line.
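
As a rough illustration of this report shape, the sketch below renders a list of findings into a verdict line plus rows matching the column header above. The `formatReport` function, the finding field names, and the exact verdict wording are assumptions; only the columns and the four severity levels come from this README.

```javascript
// Hypothetical sketch of the lean report shape; field names mirror the column
// header above, but the exact verdict wording is an assumption.
const SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM", "LOW"];

function formatReport(findings) {
  // A clean scan collapses to a single line.
  if (findings.length === 0) return "CLEAN: no confirmed findings";

  // Most severe findings first.
  const sorted = [...findings].sort(
    (a, b) => SEVERITY_ORDER.indexOf(a.severity) - SEVERITY_ORDER.indexOf(b.severity)
  );

  const verdict = `FINDINGS: ${sorted.length} confirmed, worst severity ${sorted[0].severity}`;
  const header = "| # | path:line | category | severity | finding | evidence |";
  const rows = sorted.map(
    (f, i) =>
      `| ${i + 1} | ${f.path}:${f.line} | ${f.category} | ${f.severity} | ${f.finding} | ${f.evidence} |`
  );
  return [verdict, header, ...rows].join("\n");
}

// Made-up findings for demonstration only.
const report = formatReport([
  { path: "src/io.js", line: 42, category: "resource-leak", severity: "MEDIUM",
    finding: "file handle never closed", evidence: "open() without close()" },
  { path: "src/pay.js", line: 7, category: "silent-failure", severity: "HIGH",
    finding: "empty catch block", evidence: "catch (e) {}" },
]);
console.log(report);
```

The HIGH finding is listed first even though it was supplied second, and an empty findings list yields a single line.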


⚡ Escalation Tiers

| Tier | Trigger | Orchestration | Token Cost |
| --- | --- | --- | --- |
| Light | Small scope / targeted review | Primary agent, quick | Very Low 🟢 |
| Standard | Moderate scope / module review | Multi-threaded routing, detailed | Moderate 🟡 |
| Heavy | Large scope / release prep | Sub-agent fan-out, deep paths | High 🔴 |

📊 Benchmark

Headline: rot-canary scored 100% recall · 100% precision · 0/4 decoy false-positives over the 16-fixture corpus, scored mechanically (skill v3.4.0, 2026-06-13).

rot-canary is measured AV-Comparatives-style — recall, precision, decoy false-positives, and severity accuracy over a fixed fixture corpus, scored mechanically, cross-engine. Honest scope: small, dated samples authored in-project — a regression floor, not an independent benchmark; re-run on model/skill changes.
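
For readers unfamiliar with these metrics, the sketch below shows how recall, precision, and decoy false-positives can be computed mechanically from per-fixture results. The fixture shape, the `scoreRun` name, and the sample data are all hypothetical; they are not the project's scorer or its benchmark corpus.

```javascript
// Hypothetical scoring sketch; fixture shape and field names are assumptions.
// Each fixture lists the defects planted in it, what the scanner reported,
// and whether the fixture is a decoy (clean code that merely looks suspicious).
function scoreRun(fixtures) {
  let truePos = 0, falsePos = 0, falseNeg = 0, decoys = 0, decoyFalsePos = 0;

  for (const { planted, reported, decoy = false } of fixtures) {
    const plantedSet = new Set(planted);
    const reportedSet = new Set(reported);
    const hits = [...reportedSet].filter((id) => plantedSet.has(id)).length;
    truePos += hits;
    falsePos += reportedSet.size - hits;
    falseNeg += plantedSet.size - hits;
    if (decoy) {
      decoys += 1;
      if (reportedSet.size > 0) decoyFalsePos += 1; // any report on a decoy counts against the scanner
    }
  }

  return {
    // recall = TP / (TP + FN); precision = TP / (TP + FP)
    // Both are NaN if the corresponding denominator is zero.
    recall: truePos / (truePos + falseNeg),
    precision: truePos / (truePos + falsePos),
    decoyFalsePositives: `${decoyFalsePos}/${decoys}`,
  };
}

// Made-up results, not CoalMine benchmark data.
const run = scoreRun([
  { planted: ["leak-1", "race-1"], reported: ["leak-1", "race-1"] },
  { planted: ["dead-1"], reported: ["dead-1", "bogus-1"] },
  { planted: ["bug-1"], reported: [] },
  { planted: [], reported: [], decoy: true },
]);
console.log(run); // { recall: 0.75, precision: 0.75, decoyFalsePositives: '0/1' }
```

In this toy run there are three true positives, one false positive, and one miss, so recall and precision are both 0.75.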

Full method, per-category scoring, and the cross-engine comparison live in the series records: TheColliery/.github/benchmarks/CoalMine.


🧭 Design Principles

Bound by the 11 principles of the Quantum Computer Spec: maximum performance, zero visible errors, single-brand, minimum power, essential accessories, error correction, determinism, isolation, measurement, trustworthiness, and entanglement.


🏭 Part of TheColliery

CoalMine shares its engineering doctrine with CoalTipple (model/effort routing) and CoalBoard (consensus & debate board): Phoenix-13 hooks (zero-dependency, no network, fail-silent, no child processes, deterministic), single-source-of-truth config schemas, and a strict no-overkill discipline. Install one and it stands alone; install all and they compose without conflict.


📄 License

MIT License. See LICENSE for details.
