AgentForge Clinical Co-Pilot

This repository is a fork of OpenEMR extended with an AI clinical co-pilot for primary care physicians, built for the Gauntlet AI Austin admission sprint.

Week 2 final demo video: the link will be added here once the video is recorded. See W2_DEMO_SCRIPT.md for the 5:00 walkthrough.

What changed since the W2 MVP grade

The MVP passed Wednesday with the note "harden the UX and reliability, and stronger visibility into the retrieval architecture, eval coverage, and worker orchestration." Everything below shipped after that grade in direct response, plus the two surprise-challenge additions:

Area	Shipped	Where
Visibility	`/visibility` page — corpus inspector, ASCII supervisor graph, deterministic routing rule table, eval coverage with locked-baseline rates, recent supervisor decisions, live retrieval inspector showing BM25 / dense / rerank scores per chunk before any LLM sees them	https://copilot-agent-production-ba87.up.railway.app/visibility
UX hardening	Retry button on every error / refusal message in the chat panel; per-status friendly HTTP error copy (502/503/504/401/403/429 each get specific actionable text instead of "Server returned 502")	`interface/modules/custom_modules/oe-module-clinical-copilot/public/js/copilot-chat.js`
Multi-format ingestion (W2 surprise #1)	HL7 v2 ORU + ADT (structured parse, zero LLM cost), DOCX referral letters, XLSX patient workbooks, TIFF fax packets — all routed through `/agent/extract` with the appropriate `doc_type`. Cohort 5 W2 asset pack is committed as fixtures + a smoke runner	`agent-service/src/copilot/extraction/{hl7v2,docx,xlsx,tiff}.py`
Modern dashboard (W2 surprise #2)	Next.js 15 + Auth.js v5 OIDC against OpenEMR's existing OAuth flow. Six clinical cards + bidirectional cross-app navigation. Defense in PATIENT_DASHBOARD_MIGRATION.md	`dashboard/`
Cookbook stages 3-5 (production-evals reference)	Replay harness (`--record` / `--replay` JSONL), LLM-as-judge tier (`judge_yes_no` on Haiku for clinical-quality binary checks), A/B experiment diff (compare two recordings side-by-side)	`agent-service/evals/w2/{replay,judge,experiments}.py`
Eval gate	Locked at 63 cases / 11 categories / 6 rubrics / 100% baseline. PR-blocking GitHub Action enforces ≥95% per category (5pp regression delta)	`.github/workflows/eval-gate.yml`
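
The per-category threshold in the eval gate row can be illustrated with a small check. This is a minimal sketch, not the code behind `.github/workflows/eval-gate.yml`: the `CategoryResult` type, the `failing_categories` helper, and the choice to treat an empty category as failing are all assumptions made for illustration. Only the 100% baseline and the 5pp allowed regression come from the table above.

```python
from dataclasses import dataclass

BASELINE_PCT = 100       # locked baseline pass rate stated in the README
MAX_REGRESSION_PP = 5    # allowed regression in percentage points (floor = 95%)


@dataclass(frozen=True)
class CategoryResult:
    """Hypothetical per-category eval summary (not the repo's real type)."""
    name: str
    passed: int
    total: int


def failing_categories(results, baseline_pct=BASELINE_PCT,
                       max_regression_pp=MAX_REGRESSION_PP):
    """Return the names of categories whose pass rate is below the floor.

    Integer arithmetic avoids float rounding at exactly 95%.
    Assumption: a category with zero cases counts as failing.
    """
    floor_pct = baseline_pct - max_regression_pp
    failing = []
    for r in results:
        if r.total == 0 or r.passed * 100 < floor_pct * r.total:
            failing.append(r.name)
    return failing
```

A CI step built on a check like this would exit non-zero whenever the returned list is non-empty, which is what makes the gate PR-blocking.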

Document	Purpose
SETUP.md	Bring the stack up locally (`docker compose up -d`)
AUDIT.md	Security / performance / architecture / data-quality / compliance audit of the OpenEMR codebase
USERS.md	Target user (PCP, 20-patient day) and the use cases the agent addresses
W1_ARCHITECTURE.md	Week 1 AI integration plan — verification, observability, failure modes, scaling
W2_ARCHITECTURE.md	Week 2 multimodal evidence agent — schemas, vision pipeline, hybrid RAG, supervisor + worker graph, eval gate
THREAT_MODEL.md	Week 3 attack surface map for the adversarial platform — 6 categories, highest-risk findings, coverage prioritization
ARCHITECTURE.md	Week 3 multi-agent adversarial-platform architecture — Orchestrator / Red Team / Judge / Documentation agents, inter-agent comms, regression harness, observability
PATIENT_DASHBOARD_MIGRATION.md	Week 2 surprise port — defense of Next.js 15 + Auth.js v5 framework choice for the FHIR-backed patient dashboard
W1_COSTS.md	W1 cost & scale projections (100 → 100K users)
W2_COSTS.md	Week 2 cost & latency report — vision / RAG / multi-format / cookbook tier / dashboard, p50/p95, bottleneck analysis
COSTS.md	Week 3 adversarial-platform cost analysis — per-attempt decomposition, dev spend, projections at 100/1K/10K/100K runs/day, architectural changes per scale
DEMO_SCRIPT.md / W2_DEMO_SCRIPT.md	W1 / W2 demo video scripts
INTERVIEW_PREP.md	AI-interview talking points keyed to repo artifacts
W2_PRESEARCH.md	W2 architectural-discovery checklist (16 questions) — design rationale for the Clinical Co-Pilot
PRESEARCH.md	Week 3 architectural-discovery checklist (11 sections × 3 phases) — design rationale for the adversarial platform that attacks the W2 target
EVIDENCE.md	Week 3 — one finding traced end-to-end (Orchestrator decision → Red Team attack → target response → Judge verdict → Documentation Agent → trust gate → W2 eval gate). Each step links to the deployed URL where the artifact can be inspected directly.

W3 hard gate — deployed target URL (the live system the adversarial platform attacks; required with every checkpoint per the W3 PRD):

https://copilot-agent-production-ba87.up.railway.app

Health check: GET /healthz → {"status":"ok"}. The Red Team Agent's httpx.AsyncClient (agent-service/src/redteam/target.py) points at this base URL by default; every attempt in agent-service/evals/redteam_runs/ records the real status code + response body from this endpoint. Nothing is mocked.
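
A reviewer who wants to confirm the target is up before starting a campaign could probe the health endpoint with something like the sketch below. It uses only the Python standard library (`urllib`), not the `httpx.AsyncClient` the Red Team Agent uses, and the helper names `is_healthy` and `check_health` are hypothetical. The URL and the expected `{"status":"ok"}` body are taken from the text above.

```python
import json
import urllib.request

# Deployed target URL from this README.
BASE_URL = "https://copilot-agent-production-ba87.up.railway.app"


def is_healthy(body: str) -> bool:
    """Return True only if the body is a JSON object with status == "ok"."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("status") == "ok"


def check_health(base_url: str = BASE_URL, timeout: float = 10.0) -> bool:
    """GET {base_url}/healthz and validate the response.

    Network errors, timeouts, and HTTP error statuses all count as unhealthy.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/healthz", timeout=timeout) as resp:
            return resp.status == 200 and is_healthy(resp.read().decode("utf-8"))
    except OSError:  # URLError, HTTPError, and socket timeouts all derive from OSError
        return False
```

Calling `check_health()` with no arguments probes the deployed target; passing a different `base_url` points the same check at a local instance.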

Deployed apps:

Embedded co-pilot (PHP module rendered into the patient chart): https://openemr-production-0996.up.railway.app/
Modern dashboard (Next.js port of the patient chart): https://openemr-dashboard-production.up.railway.app/
System visibility page (W2: corpus, routing, eval coverage, live retrieval inspector): https://copilot-agent-production-ba87.up.railway.app/visibility
Adversarial platform dashboard (W3: coverage, vuln pipeline, recent campaigns with Orchestrator rationale, attempts trend): https://copilot-agent-production-ba87.up.railway.app/adversarial
Standalone agent UI (token-less demo / fallback): https://copilot-agent-production-ba87.up.railway.app/

Reviewer entry points:

No login required:
- /visibility — W2 agent introspection (corpus, routing, eval coverage, live retrieval inspector)
- /adversarial — W3 platform operator view (92 attempts of evidence, 3 live + 7 pending vulns, Orchestrator rationale per campaign)
- The standalone agent UI (token-less /demo/chat)
OpenEMR login required:
- The embedded co-pilot panel (Farrah Rolle is a good demo patient with rich data)
OAuth flow on first hit:
- The Next.js dashboard (signs in against OpenEMR's existing OIDC server)

Stack:

OpenEMR (PHP 8.2 + Apache + MariaDB) → interface/modules/custom_modules/oe-module-clinical-copilot/ embedded chat panel
agent-service/ — Python (FastAPI + Anthropic Claude) → Redis context cache → Langfuse traces. Hybrid RAG (BM25 + Voyage + Cohere Rerank) over a 24-chunk USPSTF / ADA / ACIP / ACC-AHA / CDC guideline corpus
Multi-format ingestion: PDF (vision), HL7 v2 ORU + ADT (structured parse), DOCX referral letters, XLSX workbooks, TIFF fax packets
dashboard/ — Next.js 15 (App Router, React 19 server components)
- Auth.js v5 OIDC against OpenEMR's /oauth2/default endpoint
Five Railway services: OpenEMR, agent, Redis, MySQL, and the dashboard
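
To show why the HL7 v2 path listed in the stack needs no LLM call, here is a minimal sketch of pulling observations out of an ORU message with plain string splitting. It is not the implementation in `agent-service/src/copilot/extraction/hl7v2.py`; the `Observation` type and `parse_oru_observations` function are hypothetical. It assumes the default HL7 delimiters (`|` for fields, `^` for components) and ignores escape sequences and repetitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """Hypothetical container for one OBX result."""
    code: str    # OBX-3.1, e.g. a LOINC code
    name: str    # OBX-3.2, human-readable test name
    value: str   # OBX-5, observation value as sent
    units: str   # OBX-6.1, units identifier


def parse_oru_observations(message: str) -> list[Observation]:
    """Extract OBX segments from an HL7 v2 ORU message.

    Assumes default delimiters; a full parser would read them from MSH-1/MSH-2.
    """
    observations = []
    # HL7 v2 separates segments with carriage returns; tolerate newlines too.
    normalized = message.replace("\r\n", "\r").replace("\n", "\r")
    for segment in normalized.split("\r"):
        fields = segment.split("|")
        if fields[0] != "OBX":
            continue
        # Pad so short segments do not raise IndexError.
        fields += [""] * (7 - len(fields))
        ident = fields[3].split("^") + ["", ""]
        units = fields[6].split("^")[0]
        observations.append(Observation(ident[0], ident[1], fields[5], units))
    return observations
```

Because every value comes from a fixed field position, the parse is deterministic and cheap, which matches the "structured parse, zero LLM cost" description in the table above.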

Quick start (local):

docker compose up -d         # OpenEMR + MariaDB + phpMyAdmin
docker compose logs -f openemr   # watch first-boot install (~5 min)
# then open http://localhost:8080  (admin / pass)

Quick start (Railway deploy):

railway login
railway init                 # create a new project
railway add --plugin mysql   # provision the MySQL plugin (database for OpenEMR)
railway up                   # build + deploy from this Dockerfile
# then configure env vars in the Railway dashboard (see Dockerfile header)

OpenEMR (upstream README)

OpenEMR

OpenEMR is a Free and Open Source electronic health records and medical practice management application. It features fully integrated electronic health records, practice management, scheduling, electronic billing, internationalization, free support, a vibrant community, and a whole lot more. It runs on Windows, Linux, Mac OS X, and many other platforms.

Contributing

OpenEMR is a leader in healthcare open source software and comprises a large and diverse community of software developers, medical providers and educators with a very healthy mix of both volunteers and professionals. Join us and learn how to start contributing today!

Already comfortable with git? Check out CONTRIBUTING.md for quick setup instructions and requirements for contributing to OpenEMR by resolving a bug or adding an awesome feature 😊.

Support

Community and Professional support can be found here.

Extensive documentation and forums can be found on the OpenEMR website that can help you become more familiar with the project 📖.

Reporting Issues and Bugs

Report these on the Issue Tracker. If you are unsure whether something is an issue/bug, feel free to use the Forum and Chat to discuss it 🪲.

For Developers

If you are using OpenEMR directly from the code repository, the following commands will build OpenEMR (Node.js version 24.* is required):

composer install --no-dev
npm install
npm run build
composer dump-autoload -o

Contributors

This project exists thanks to all the people who have contributed. [Contribute].

License

GNU GPL
