NathanMaine-Labs
Senior TPM who builds. 12+ years enterprise delivery. Agentic AI, compliance automation, conversational AI. 48+ repos.
Senior Technical Program Manager | AI Systems Builder | 12+ Years Enterprise Delivery
I build the AI systems I've spent a career learning to manage: compliance LLMs, governed inference gateways, evaluation harnesses for autonomous agents, and patent-pending token governance infrastructure. 12+ years of enterprise delivery across identity platforms (700K users), data unification (89M records, 95.48% match rate), and $20M+ multi-cloud programs.
CMMC Compliance AI Platform (v1.5.1)
The only CMMC-specific fine-tuned LLM suite in the open-source ecosystem. Four models (7B–72B) trained for $77 in total compute, deployed fully air-gapped via Ollama. 708 tests across three tiers, including 140 blind holdout scenarios that caught 3 real security bugs the 568 internal tests missed. 27 CMMC controls across 5 families (AC, AU, IA, SC, SI) plus 3 DFARS clauses.
Commercial AI gateways log to editable databases — useless for compliance audits
Every AI request passes through an 11-step pipeline that logs who asked what, blocks sensitive data, and chains each entry to the previous one using SHA-256 hashes — any tampering breaks the chain
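The hash-chaining step can be sketched in a few lines. This is a minimal illustration of the idea, not the gateway's actual pipeline: the entry fields, genesis sentinel, and function names here are assumptions.

```python
import hashlib
import json

def chain_entry(prev_hash, record):
    """One append-only audit entry: the hash covers the previous entry's hash
    plus this record, so editing any earlier record breaks every later hash."""
    payload = json.dumps(record, sort_keys=True)
    digest = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    return {"record": record, "prev_hash": prev_hash, "hash": digest}

def verify_chain(entries):
    """Recompute every hash from the genesis sentinel; any tampering fails."""
    prev = "0" * 64
    for e in entries:
        payload = json.dumps(e["record"], sort_keys=True)
        if e["prev_hash"] != prev or \
           hashlib.sha256((prev + payload).encode()).hexdigest() != e["hash"]:
            return False
        prev = e["hash"]
    return True

# Build a three-entry log, then tamper with the middle record.
log, prev = [], "0" * 64
for user, action in [("alice", "query"), ("bob", "export"), ("carol", "query")]:
    entry = chain_entry(prev, {"user": user, "action": action})
    log.append(entry)
    prev = entry["hash"]

intact = verify_chain(log)            # True before tampering
log[1]["record"]["user"] = "mallory"  # edit one historical record
tampered = verify_chain(log)          # False: the chain is broken from entry 1 on
```

Because each hash commits to the previous one, an auditor only needs the final hash to detect any edit anywhere in the log.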
CMMC compliance consultants cost $125K–$250K and no open-source AI alternative exists
Four AI models fine-tuned on government compliance documents, sized from fast lookups (7B) to deep multi-framework analysis (72B), running entirely on local hardware with zero cloud dependency
Standard AI vulnerability scanners don't test whether models leak regulated data like CUI, HIPAA, or DFARS content
Four custom probes and six detectors for NVIDIA's garak scanner that specifically try to trick compliance models into revealing controlled information. PR #1619 — 20 files, 1,599 lines. Dev fork
Compliance policies written in documents are hard to trace and enforce with software
Reads policy Markdown files and converts them into a connected graph where each requirement links to its evidence artifacts and enforcement points
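The parsing idea can be sketched as follows. The heading and `Evidence:` conventions below are invented for illustration; the real tool's policy format may differ.

```python
import re
from collections import defaultdict

def parse_policy(markdown):
    """Build a requirement -> evidence adjacency map from a policy doc where
    '## REQ-...' headings open a requirement and 'Evidence:' lines list artifacts."""
    graph, current = defaultdict(list), None
    for line in markdown.splitlines():
        match = re.match(r"##\s+(REQ-\S+)", line)
        if match:
            current = match.group(1)
            graph[current]  # register the requirement even if no evidence follows
        elif current and line.startswith("Evidence:"):
            artifacts = line[len("Evidence:"):].split(",")
            graph[current].extend(a.strip() for a in artifacts)
    return dict(graph)

doc = """\
## REQ-AC-001
Restrict access to authorized users.
Evidence: iam_policy.json, access_review.csv
## REQ-AU-002
Retain audit logs.
Evidence: log_retention.cfg
"""
graph = parse_policy(doc)
```

Once requirements are nodes and evidence artifacts are edges, "which requirements have no evidence" becomes a simple graph query.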
AI Agent Evaluation & Dark Factory Testing
140 black-box behavioral scenarios live in a physically separate holdout repository; the AI that builds the platform never sees the tests, and an agent cannot game what it cannot see. The first sweep caught 3 real security bugs that passed all 568 internal tests: broken MFA setup (High), a missing X-Content-Type-Options header (Medium), and a missing X-Frame-Options header (Medium). The architecture converged independently on the same design as StrongDM's Software Factory pattern (published February 2026); the Agentic Evaluation Sandbox predates that publication, created December 2025.
568 internal tests all passed but 3 real security bugs shipped — visible tests get gamed by AI agents
140 black-box HTTP scenarios in a separate repo the AI never sees. Docker digital twin with mock Ollama (no GPU). Covers auth, RBAC, PII/CUI blocking, prompt injection, audit integrity, security headers, and 8 CMMC policy rules. First sweep caught 3 bugs — all fixed in v1.5.1
AI agents can learn to game their own tests when the tests live inside the codebase
The original Dark Factory framework (December 2025). Defines four evaluation roles — Doer, Judge, Adversary, Observer — with holdout scenarios and probabilistic satisfaction scoring instead of simple pass/fail
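Probabilistic satisfaction scoring can be illustrated with a toy aggregator. The rule here (mean of per-criterion Judge probabilities against a threshold) is an assumption for illustration; the framework's actual scoring may differ.

```python
from statistics import mean

def satisfaction(judge_probs, threshold=0.8):
    """Aggregate per-criterion probabilities from the Judge role into one
    score with a margin, instead of collapsing to a binary pass/fail."""
    score = mean(judge_probs)
    return {"score": round(score, 3),
            "satisfied": score >= threshold,
            "margin": round(score - threshold, 3)}

# A borderline scenario: two strong criteria, one weak one.
borderline = satisfaction([0.9, 0.7, 0.85])
```

The margin is the point of this over pass/fail: a scenario that scrapes by at 0.817 gets flagged differently than one that passes at 0.99.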
Voice assistants and NLU classifiers break under noisy or unusual input — failures need to be found before users find them
Injects noise, varied phrasing, and edge cases into voice endpoints, classifies each response into three outcome states, and generates a robustness report with evidence
Writing load test plans for steady traffic, burst traffic, and endurance runs is repetitive and error-prone
Takes a service profile and SLO targets as input, then generates ready-to-run test configurations with pass/fail thresholds derived directly from the SLOs
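A minimal sketch of that derivation, with invented profile/SLO field names and burst/endurance multipliers; only the shape of the transformation (SLO targets become pass/fail thresholds verbatim) reflects the description above.

```python
def make_plans(profile, slo):
    """Derive steady, burst, and endurance plans from a service profile;
    pass/fail thresholds are copied straight from the SLO targets."""
    thresholds = {"p95_ms": slo["p95_ms"], "max_error_rate": slo["error_rate"]}
    rps = profile["baseline_rps"]
    return {
        "steady":    {"rps": rps,     "duration_s": 600,   "thresholds": thresholds},
        "burst":     {"rps": rps * 5, "duration_s": 120,   "thresholds": thresholds},
        "endurance": {"rps": rps,     "duration_s": 14400, "thresholds": thresholds},
    }

plans = make_plans({"baseline_rps": 200}, {"p95_ms": 250, "error_rate": 0.01})
```

Deriving thresholds from the SLOs rather than hand-typing them removes the usual copy-paste drift between the SLO doc and the load test plan.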
Line coverage numbers don't tell you whether every function's actual behavior is tested
Walks the code's syntax tree to find every function, matches each one against existing test files, and generates skeleton tests for anything that's untested
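The core of that walk fits in a few lines using Python's `ast` module. The matching rule here (a function is "tested" if its name appears anywhere in the test file) is a simplification of whatever the real tool does.

```python
import ast

def untested_functions(source, test_source):
    """Walk the module's syntax tree, collect every function name, and
    report the ones never referenced in the test file."""
    funcs = [node.name for node in ast.walk(ast.parse(source))
             if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))]
    test_tree = ast.parse(test_source)
    referenced = {n.id for n in ast.walk(test_tree) if isinstance(n, ast.Name)} \
               | {n.attr for n in ast.walk(test_tree) if isinstance(n, ast.Attribute)}
    return [f for f in funcs if f not in referenced]

src = "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b\n"
tests = "import mod\n\ndef test_add():\n    assert mod.add(1, 2) == 3\n"
missing = untested_functions(src, tests)  # ['sub'] has no test
```

Line coverage would report `sub` as covered if any test happened to import the module; walking the AST asks the sharper question of whether each function is actually exercised.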
Agent Infrastructure
Purpose-built components for agent memory, recovery, planning, and coordination. Deterministic and auditable — identical inputs always produce identical outputs.
AI agents forget what happened in previous tasks — they have no persistent, queryable memory
Stores facts and relationships in a knowledge graph, retrieves relevant memories using similarity search, and can explain why it recalled something by tracing the graph path
When an AI agent fails mid-task, most systems just crash instead of recovering
Wraps agent tasks in retry logic with exponential backoff, fallback chains (try Plan B if Plan A fails), and circuit breakers that stop calling a broken service
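Those three mechanisms compose naturally; here is a compact sketch with invented names and thresholds, not the library's actual API:

```python
import time

class CircuitBreaker:
    """Stop calling a service after `max_failures` consecutive errors."""
    def __init__(self, max_failures=3):
        self.failures, self.max_failures = 0, max_failures

    @property
    def open(self):
        return self.failures >= self.max_failures

    def record(self, ok):
        self.failures = 0 if ok else self.failures + 1

def run_with_recovery(plans, breaker, retries=2, base_delay=0.01):
    """Try each plan in order (fallback chain); retry each with exponential
    backoff; refuse to call anything once the breaker is open."""
    for plan in plans:
        for attempt in range(retries + 1):
            if breaker.open:
                raise RuntimeError("circuit open")
            try:
                result = plan()
                breaker.record(True)
                return result
            except Exception:
                breaker.record(False)
                time.sleep(base_delay * 2 ** attempt)  # 0.01s, 0.02s, 0.04s...
    raise RuntimeError("all plans exhausted")

calls = []
def plan_a():
    calls.append("a"); raise IOError("primary service down")
def plan_b():
    calls.append("b"); return "ok"

breaker = CircuitBreaker(max_failures=5)
result = run_with_recovery([plan_a, plan_b], breaker)
```

Plan A is retried three times with growing delays, then the chain falls back to Plan B, which succeeds and resets the breaker.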
When multiple agents compete for tasks, some get overloaded while others sit idle
Coordinates agents using weighted round-robin allocation with capacity constraints and a skew-ratio metric that detects and corrects workload imbalance
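The allocation idea can be approximated in a few lines. This sketch uses a least-utilization pick as a stand-in for weighted round-robin and an invented skew definition (max/min utilization); the real coordinator's formulas may differ.

```python
def assign(tasks, capacity):
    """Capacity-weighted allocation: each task goes to the agent with the
    lowest load-to-capacity ratio that still has headroom; a skew ratio
    (max/min utilization) flags imbalance when it drifts above 1."""
    load = {agent: 0 for agent in capacity}
    for _ in tasks:
        open_agents = [a for a in capacity if load[a] < capacity[a]]
        pick = min(open_agents, key=lambda a: load[a] / capacity[a])
        load[pick] += 1
    util = [load[a] / capacity[a] for a in capacity]
    skew = max(util) / min(util) if min(util) > 0 else float("inf")
    return load, skew

# Agent x can hold 4 tasks, agent y can hold 2; distribute 6 tasks.
load, skew = assign(range(6), {"x": 4, "y": 2})
```

Both agents end at full utilization, so the skew ratio is 1.0; a naive unweighted round-robin would have overloaded `y` and left `x` idle capacity.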
Architecture reviews are inconsistent — different reviewers check different things
Accepts YAML architecture briefs, runs them through an LLM, and generates structured reviews with risk assessments, open questions, and checklists. Includes stub mode for testing without an LLM
AI operations metrics are scattered across tools with no unified view
ETL pipeline that ingests KPI snapshots from multiple sources, aggregates them using suffix-based heuristics, and produces a single evidence-backed report
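One plausible reading of "suffix-based heuristics", sketched with invented conventions (`_count` keys summed, `_ms` keys averaged, everything else last-write-wins); the pipeline's actual rules are not documented here.

```python
def aggregate(snapshots):
    """Merge KPI snapshots by key suffix: counters are summed, latencies
    are averaged, and remaining keys take the most recent value."""
    merged = {}
    for key in {k for snap in snapshots for k in snap}:
        values = [snap[key] for snap in snapshots if key in snap]
        if key.endswith("_count"):
            merged[key] = sum(values)
        elif key.endswith("_ms"):
            merged[key] = sum(values) / len(values)
        else:
            merged[key] = values[-1]  # last snapshot wins
    return merged

snapshots = [
    {"req_count": 10, "p95_ms": 200},
    {"req_count": 5, "p95_ms": 300, "status": "green"},
]
report = aggregate(snapshots)
```

Encoding the aggregation rule in the key name keeps the pipeline source-agnostic: any tool that emits `*_count` and `*_ms` keys merges correctly without per-source config.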
Meetings produce hours of audio — decisions and action items get lost because nobody re-listens
Captures audio in real time, transcribes with faster-whisper, identifies who said what via speaker diarization, and generates structured summaries through an LLM. v2.0 adds NVIDIA GPU acceleration
Salesforce AI agents can access org data without respecting field-level security or sharing rules — a compliance risk
Auto-discovers the org schema (objects, fields, relationships), enforces FLS and sharing rules, then runs safe actions (SOQL, Flow, Apex) within those boundaries
Modern AI is all neural nets — classic rule-based reasoning that can explain its conclusions is underrepresented
Common Lisp expert system that works backward from a goal, checking rules and facts until it can prove or disprove the goal, with certainty scores on every conclusion
Demonstrating practical symbolic AI with a real-world application and a complete dev environment
Rule-based car troubleshooting system with a forward chaining inference engine — starts from symptoms and fires rules until it reaches a diagnosis. Includes VS Code + SBCL setup
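The forward-chaining loop itself is small; here is the same idea in Python (the repo's engine is Common Lisp, and these troubleshooting rules are invented examples):

```python
def forward_chain(facts, rules):
    """Fire every rule whose premises are all satisfied, adding its
    conclusion as a new fact, until a full pass adds nothing new."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if set(premises) <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

rules = [
    ({"engine_cranks", "no_start"}, "fuel_or_spark_issue"),
    ({"fuel_or_spark_issue", "no_fuel_smell"}, "check_spark_plugs"),
]
diagnosis = forward_chain({"engine_cranks", "no_start", "no_fuel_smell"}, rules)
```

Starting from observed symptoms, the first rule derives an intermediate fact, which then satisfies the second rule's premises: rules chain forward until the diagnosis appears in the fact set.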
Holdout scenario evaluation harness for AI agents. Doer/Judge/Adversary/Observer roles, probabilistic satisfaction scoring, append-only JSONL audit trails with integrity hashes. Created Dec 2025.
Suite of 4 fine-tuned LLMs (7B/14B/32B/72B) for CMMC 2.0, NIST 800-171, NIST 800-53, HIPAA, and DFARS compliance. Air-gappable, runs on Ollama with zero cloud dependency.
Compliance-first LLM gateway with tamper-evident audit trails, policy-as-code enforcement, and compliance evidence export. Built for regulated industries.