A Local-First Developer-Agent Control Harness
TriageCore is an early research workbench for AI-assisted software work that keeps local control, reviewable artifacts, and privacy boundaries visible to the operator. It can generate preflight and handoff packets, inspect privacy-safe route audit events, run local benchmark/report workflows, and support a bounded Qwen Cloud path for external-safe packets. The project is real and runnable today, but it should be read as a prototype and research harness, not a finished production framework. Cloud supervisors are optional and intentional here, not the default path for every task.
Status
TriageCore is active as a local-first prototype/workbench. Current capabilities, supporting docs, tests, and demo paths are in-repo now. Broader governance, release polish, and long-term environmental-edge integrations should be treated as ongoing work, not completed product claims.
- Verifies operator environment and local repo state with tc doctor.
- Generates reviewable preflight and handoff artifacts with tc preflight and tc handoff.
- Records and inspects privacy-safe route audit events with tc audit.
- Supports local benchmark fixtures and benchmark reports without hiding the evidence trail.
- Enforces local-only privacy boundaries before any optional external-safe Qwen Cloud path is considered.
- It makes local vs cloud execution explicit instead of burying that decision inside an agent loop.
- It preserves human review, permission boundaries, and fail-closed local-only handling as core workflow rules.
- It produces inspectable artifacts and route evidence instead of relying on vague autonomy claims.
- It is useful today as safer AI-assisted SDLC framing and useful later as a control pattern for environmental edge workflows.
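The fail-closed, local-only rule described above can be illustrated with a small routing function. This is a minimal sketch, not TriageCore's actual routing code: the Packet class, its field names, and choose_route are all hypothetical names introduced here for illustration. The point it shows is that the cloud path is reached only when every check passes, and any doubt falls back to local.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical task packet; field names are illustrative assumptions."""
    task_id: str
    privacy: str         # e.g. "public", "internal", "private"
    scan_passed: bool    # result of a local privacy scan
    external_safe: bool  # operator explicitly marked the packet external-safe


def choose_route(packet: Packet, cloud_enabled: bool = False) -> str:
    """Return "cloud" only when every condition holds; otherwise stay local."""
    if not cloud_enabled:
        return "local"   # cloud is opt-in, never the default
    if packet.privacy != "public":
        return "local"   # anything not clearly public stays local
    if not packet.scan_passed:
        return "local"   # a failed or missing scan fails closed
    if not packet.external_safe:
        return "local"   # no explicit external-safe marking
    return "cloud"
```

Writing each condition as an early return to "local" keeps the default visible: a new check can only narrow the cloud path, never widen it by accident.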
Current capabilities
- local-first operator workflow
- route audit inspection
- benchmark scaffolding and reports
- bounded Qwen Cloud escalation for external-safe packets only
Planned / future-facing
- public release polish such as release metadata upkeep and GitHub metadata
- deeper environmental-edge packaging around Clear Lake Watch style workflows
Research framing
- methodology, evidence schema, and benchmark comparison docs remain first-class because the project is also an evaluation workbench, not only a tool wrapper
Install locally:
git clone https://github.com/coreytshaffer/TriageCore
cd TriageCore
pip install -e .
Then run:
tc doctor
tc demo --dry-run
tc preflight CR-017
tc handoff latest --print
tc audit --self-test
tc audit --kind route_audit --last 10
tc audit --kind demo_dry_run --last 5
tc audit --privacy-invariants
triagecore benchmark --list-only
Optional deeper verification:
tc model check --manifest docs\security\examples\model_route_manifest_local_ollama.json
tc model warn --manifest docs\security\examples\model_route_manifest_local_ollama.json --route docs\security\examples\route_payload_local_ollama.json
tc model warn --manifest docs\security\examples\model_route_manifest_cloud_qwen.json --route docs\security\examples\route_payload_local_ollama.json
Expected outputs:
- tc doctor confirms repo root, Python, CLI path, ledger path, and pytest visibility.
- tc demo --dry-run shows the offline safety-control loop from packet summary through human review and writes one metadata-only demo event.
- tc preflight CR-017 writes a handoff artifact under .triagecore/handoffs/.
- tc handoff latest --print prints a reviewable handoff packet.
- tc audit --self-test writes one privacy-safe route_audit event.
- tc audit --kind route_audit --last 10 shows routing metadata without raw prompt/data leakage.
- tc audit --kind demo_dry_run --last 5 shows the deterministic demo evidence without raw request or proposed-output content.
- tc audit --privacy-invariants scans the persistent ledger for forbidden raw-content keys and confirms the CR-021 invariant still holds.
- triagecore benchmark --list-only shows the benchmark fixture set without contacting a backend.
- tc model check validates the documented manifest example locally.
- tc model warn provides warning-only route/manifest comparison visibility and remains non-blocking when mismatches exist.
Sample audit transcript:
> tc audit --self-test
Success: Wrote privacy-safe route_audit self-test event to ...\.triagecore\ledger.jsonl.
> tc audit --kind route_audit --last 10
[2026-06-11T03:39:17.292773+00:00] Task: audit-self-test | Type: route_audit
Decision: allowed | Reason: audit_self_test
Privacy: public (Scan Passed: True)
Local Only: False | Route: self_test | Backend: self_test
The deterministic demo runs offline, calls no model backend, and changes no source files. It demonstrates the current workflow structure and review gates; it is not evidence of production safety certification.
The manifest warning commands are optional deeper verification only. They demonstrate route/manifest comparison visibility, not runtime enforcement, backend probing, or production certification.
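The privacy-invariant check mentioned above scans the ledger for forbidden raw-content keys. The following is a minimal sketch of that kind of scan over JSONL lines, not the project's implementation: the FORBIDDEN_KEYS set and both function names are assumptions made for illustration, and the real key list lives in the repository.

```python
import json
from typing import Iterable

# Assumed key names for illustration; the real forbidden-key list may differ.
FORBIDDEN_KEYS = {"prompt", "raw_prompt", "raw_request", "proposed_output"}


def find_forbidden_keys(obj, path=""):
    """Recursively yield dotted paths of forbidden keys inside a JSON value."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            here = f"{path}.{key}" if path else key
            if key in FORBIDDEN_KEYS:
                yield here
            yield from find_forbidden_keys(value, here)
    elif isinstance(obj, list):
        for index, item in enumerate(obj):
            yield from find_forbidden_keys(item, f"{path}[{index}]")


def scan_ledger_lines(lines: Iterable[str]):
    """Return (line_number, key_path) pairs for every violation found."""
    violations = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the ledger
        event = json.loads(line)
        for key_path in find_forbidden_keys(event):
            violations.append((number, key_path))
    return violations
```

To run it against a real ledger, pass an open file handle, for example scan_ledger_lines(open(".triagecore/ledger.jsonl", encoding="utf-8")). An empty result means no forbidden key was found at any nesting depth.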
Start here if you want the shortest guided path:
- Hackathon Demo
- Judge Submission Bundle
- Verification Guide
- Evidence Schema
- Benchmark Fixtures
- Public Evidence Example
Current in-repo proof markers:
- a runnable reviewer path using existing commands
- a judge-facing submission bundle under docs/submission/
- a privacy-safe route audit self-test and public evidence example
- persistent artifact privacy invariant audit via tc audit --privacy-invariants
- benchmark fixtures and benchmark-report scaffolding
- a full offline-oriented test suite runnable with python -m pytest -q
- a public README test badge backed by the GitHub Actions workflow
Proof markers that still depend on GitHub/release state rather than repository files:
- release metadata upkeep
- GitHub About description
- GitHub topics
You can install TriageCore locally for CLI access:
git clone https://github.com/coreytshaffer/TriageCore
cd TriageCore
pip install -e .
For a bounded operator walkthrough that works with existing commands, see docs/workflows/hackathon_demo.md.
For the judge-facing submission bundle, start with docs/submission/README.md.
That demo is designed to support:
- TriageCore local-first plus optional Qwen Cloud escalation as the primary story
- safer AI-assisted SDLC as the secondary framing
- Clear Lake Watch or other environmental edge workflows as a future extension
Launch the local control plane GUI to actively manage tasks, monitor telemetry, and interact with the local LLM engine:
triagecore desk
- Live Local Engine: Hooks directly into Ollama or LM Studio to stream generated code right into the UI.
- Energy-Aware Routing: psutil integration actively monitors your battery life. If your battery dips below 20% while unplugged, TriageCore refuses to run heavy LLM tasks and prompts you to plug in (Permacomputing in action).
- Telemetry & Resource Accounting: Tracks measured or heuristic resource estimates for energy consumption (kWh/Joules) and carbon emissions (gCO2e) in a local append-only ledger (.triagecore/ledger.jsonl).
- Local-First Benefit Signals: The dashboard foregrounds accepted yield, local-first routing share, accepted local work, and review-light tasks so the bench encourages continued evidence collection while formal reports remain baseline-bound.
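The energy-aware routing rule above (defer heavy work below 20% battery while unplugged) can be sketched as a small pure function plus an optional psutil reader. This is an illustrative sketch, not TriageCore's code: should_defer_heavy_task, read_battery, and BATTERY_FLOOR_PERCENT are hypothetical names. The only psutil call used is psutil.sensors_battery(), which returns None when no battery is present.

```python
from typing import Optional

BATTERY_FLOOR_PERCENT = 20.0  # threshold taken from the description above


def should_defer_heavy_task(percent: Optional[float], plugged: Optional[bool]) -> bool:
    """Defer heavy work only when running on battery below the floor."""
    if percent is None:   # no battery detected (e.g. a desktop)
        return False
    if plugged:           # on mains power, so there is no need to defer
        return False
    # Unplugged, or power state unknown (None): treat conservatively.
    return percent < BATTERY_FLOOR_PERCENT


def read_battery():
    """Return (percent, plugged) via psutil, or (None, None) if unavailable."""
    try:
        import psutil  # third-party and optional for this sketch
    except ImportError:
        return None, None
    sensors_battery = getattr(psutil, "sensors_battery", None)
    battery = sensors_battery() if sensors_battery else None
    if battery is None:
        return None, None
    return battery.percent, battery.power_plugged
```

Keeping the decision separate from the sensor read makes the threshold logic testable without hardware: should_defer_heavy_task(*read_battery()) is the combined call.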
Audit the files your agents modify to ensure they didn't bypass the initial risk assessment:
triagecore audit <task_id> --files src/main.py
- Scope Verification: Flags modified files that were not in the original target list.
- Profile Adherence: Blocks changes if the task was rated read-only.
- Escalation Detection: Static analysis checks for requests, socket, subprocess, etc., if the task was classified as low-risk.
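The scope and escalation checks above can be approximated with Python's standard ast module. This is a hedged sketch of the general technique, not the project's audit implementation: find_escalation_imports, out_of_scope, and ESCALATION_MODULES are hypothetical names, and the module set contains only the three modules named above because the source leaves the rest as "etc.".

```python
import ast

# Module names taken from the description above; the "etc." is left open.
ESCALATION_MODULES = {"requests", "socket", "subprocess"}


def find_escalation_imports(source: str) -> set:
    """Return the flagged top-level modules imported by the given source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Relative imports (level > 0) refer to local packages; skip them.
            names = [node.module] if node.module and node.level == 0 else []
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in ESCALATION_MODULES:
                found.add(root)
    return found


def out_of_scope(modified: list, targets: list) -> list:
    """Return modified files that were not in the original target list."""
    allowed = set(targets)
    return [path for path in modified if path not in allowed]
```

Import scanning of this kind only sees static import statements; dynamic imports through importlib or __import__ would need a separate check.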
TriageCore supports pluggable backends so you can process tasks against any local runner without manually wrangling URLs. All local generations route through a unified OpenAICompatibleBackend adapter, and the Qwen Cloud path stays bounded behind explicit external-safe routing.
You can configure your TriageClient with the following presets:
from triage_core import TriageClient
# Ollama preset
client = TriageClient(backend_type="ollama", model="qwen2.5-coder:7b")
# vLLM preset
client = TriageClient(backend_type="vllm", model="Qwen/Qwen2.5-Coder-7B-Instruct")
# llama.cpp preset
client = TriageClient(backend_type="llama.cpp", model="local-model")
# Custom OpenAI-compatible endpoint (LM Studio in this example)
from triage_core.backends import OpenAICompatibleBackend
from triage_core import TriageClient
backend = OpenAICompatibleBackend(
name="lmstudio",
base_url="http://localhost:1234/v1",
model="local-model"
)
client = TriageClient(backend=backend)
TriageCore is also being developed as a scientific model evaluation and token-balancing workbench. Each task attempt can be treated as an experimental observation that records routing decisions, backend behavior, token use, validation outcomes, energy estimates, and human review results.
The project methodology is documented in docs/methodology.md. Supporting literature is collected in docs/references.md. Together, these describe the evidence loop for model evaluation, safety routing, mistake logging, and human-reviewed learning.
The shared evidence schema is documented in docs/evidence_schema.md. The first repeatable study plan is docs/study_001_local_model_baseline.md, model/backend comparison is planned in docs/study_002_model_backend_comparison.md, and Codex/Antigravity supervision is described in docs/codex_antigravity_bridge.md.
Use docs/verification_guide.md for practical code, UI, study-evidence, and human-review verification checks.
TriageCore is inspired by sustainable and permacomputing practices that emphasize sufficiency, repairability, visible infrastructure, and graceful operation under constraints.
Rather than optimizing for maximum automation, TriageCore optimizes for bounded, reviewable, locally controlled developer-agent work.
Design commitments:
- Prefer local files over remote services.
- Prefer Markdown, JSON, and TOML over opaque state.
- Prefer small task packets over broad autonomous sessions.
- Prefer explicit permission recommendations over silent execution.
- Prefer deferral or refusal when a task is too broad, risky, or wasteful.
- Preserve human review as a first-class part of the workflow.
- Treat compute, attention, battery, trust, and hardware lifespan as scarce resources.
TriageCore includes repeatable benchmark fixtures in benchmarks/tasks.jsonl. List them without contacting a backend:
triagecore benchmark --list-only
Run them against a local backend and append model-evaluation evidence to .triagecore/ledger.jsonl:
triagecore benchmark --backend-type ollama --model qwen2.5-coder:7b
Tag formal study runs so reports can exclude exploratory ledger history:
triagecore benchmark --study-id study_001 --run-id trial_001
Summarize benchmark evidence:
triagecore benchmark-report
triagecore benchmark-report --output reports/benchmark-report.md
triagecore benchmark-report --study-id study_001 --run-id trial_001 --output reports/study_001_benchmark_report.md
Compare backend/model pairs by giving each run a unique run_id under one study:
triagecore benchmark --study-id study_002 --run-id ollama_qwen25_coder_7b_trial_001 --backend-type ollama --model qwen2.5-coder:7b-triagecore
triagecore benchmark --study-id study_002 --run-id lmstudio_loaded_model_trial_001 --backend-type custom --base-url http://localhost:1234/v1 --model <loaded-model-name>
triagecore benchmark-report --study-id study_002 --output reports/study_002_model_backend_comparison.md
Comparison reports include By Supervision, By Backend, By Model, and By Category sections. When supervised benchmark records exist, reports also include a Supervisor Reviews table with decision counts and estimated supervisor token totals under the same study/run filter.
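The study/run filter and the grouped report sections can be pictured as a filter followed by a per-field count. The sketch below is an assumption-laden illustration, not the benchmark-report code: filter_records and count_by are hypothetical helpers, and the record keys (study_id, run_id, backend, model) are assumed names that may not match the real ledger schema in docs/evidence_schema.md.

```python
from collections import Counter


def filter_records(records, study_id=None, run_id=None):
    """Keep records matching the given study/run; None means no filter."""
    return [
        record for record in records
        if (study_id is None or record.get("study_id") == study_id)
        and (run_id is None or record.get("run_id") == run_id)
    ]


def count_by(records, field):
    """Count records per value of one field; missing values count as "unknown"."""
    return Counter(record.get(field, "unknown") for record in records)


# Illustrative records only; these are not real benchmark results.
records = [
    {"study_id": "study_002", "run_id": "a", "backend": "ollama", "model": "m1"},
    {"study_id": "study_002", "run_id": "b", "backend": "custom", "model": "m2"},
    {"study_id": "study_002", "run_id": "b", "backend": "custom"},
    {"run_id": "x", "backend": "ollama", "model": "m1"},
]
selected = filter_records(records, study_id="study_002")
by_backend = count_by(selected, "backend")
by_model = count_by(selected, "model")
```

Records without a study_id are excluded by the filter, which mirrors the stated goal of keeping exploratory ledger history out of formal study reports.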
TriageCore can generate pending learning proposals from ledger evidence, but it does not automatically change routing behavior:
triagecore propose-lessons
Record an explicit human decision:
triagecore review-lesson <proposal_id> --decision accepted --notes "Evidence supports this routing change."
Record a Codex or Antigravity supervisor decision for a task:
triagecore record-supervisor-review <task_id> --tool codex --decision needs_revision --notes "Local draft missed tests." --model gpt-5 --profile high
triagecore record-supervisor-review <task_id> --tool antigravity --decision accepted --notes "IDE supervisor accepted the local draft." --model gemini-3.1-pro-high --profile supervisor
Import supervisor usage from a verified JSON or JSONL artifact:
triagecore scan-supervisor-usage supervisor_logs\
triagecore import-supervisor-usage supervisor_usage.jsonl --tool codex --token-source imported_exact --dry-run
triagecore import-supervisor-usage supervisor_usage.jsonl --tool codex --token-source imported_exact
TriageCore provides a convenient CLI for generating agent task bundles offline:
Generate a default AGENTS.md file in your repository:
triagecore init-agents
Create a standalone markdown task file (triage_tasks/codex_task_low.md) for Codex:
triagecore codex-task --prompt "Refactor the database connection string logic" --files src/db.py
Create a robust multi-file bundle (.agent_tasks/my-slug/TASK.md, ACCEPTANCE_CRITERIA.md):
triagecore antigravity-task --prompt "Add pytest coverage for handoff.py" --files tests/test_handoff.py --slug add-tests
TriageCore uses pytest to ensure all routing and safety logic operates completely offline without network calls. Tests actively mock the backend requests module to verify payload structures across Ollama, vLLM, and llama.cpp presets.
TriageCore is a research-stage orchestration and workflow-control project. It is designed to support privacy-aware routing, local-first execution, human approval gates, and auditable task records.
TriageCore is not a certified safety system, compliance system, medical device, legal decision system, emergency dispatch system, or critical infrastructure control system. It does not guarantee safe, lawful, complete, accurate, or compliant outcomes.
Operators are responsible for validating outputs, configuring policies, reviewing logs, and ensuring that any deployment satisfies applicable legal, security, privacy, safety, and sector-specific requirements.
pip install pytest
pytest tests/
License: MIT