CLAUDE.md

This file provides guidance to Codex CLI (OpenAI Codex) when working with code in this repository.

Project Overview

Auto-Codex is a multi-agent autonomous coding framework that builds software through coordinated AI agent sessions. It uses the OpenAI Codex CLI to run agents in isolated workspaces with security controls.

Note on naming:

Repo/app name: Auto-Codex
Python backend directory: auto-codex/
Desktop UI directory: auto-codex-ui/
Per-project working data: .auto-codex/ (created inside your target project)
Work branches/worktrees: auto-codex/*

Commands

Prerequisites

Python 3.12+
Node.js 18+
Codex CLI (npm install -g @openai/codex)

Setup

# Install dependencies (from auto-codex/)
cd auto-codex
uv venv && uv pip install -r requirements.txt
# Or: python3 -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt

# Install Codex CLI (if not already installed)
npm install -g @openai/codex

# Authenticate Codex CLI (recommended) or set API key directly
codex login
# Or set OPENAI_API_KEY in your shell or .env file
export OPENAI_API_KEY=your-key
# Or add to auto-codex/.env: OPENAI_API_KEY=your-key

Creating and Running Specs

# Create a spec interactively
python3 auto-codex/runners/spec_runner.py --interactive

# Create spec from task description
python3 auto-codex/runners/spec_runner.py --task "Add user authentication"

# Force complexity level (simple/standard/complex)
python3 auto-codex/runners/spec_runner.py --task "Fix button" --complexity simple

# Run autonomous build
python3 auto-codex/run.py --spec 001

# List all specs
python3 auto-codex/run.py --list

Workspace Management

# Review changes in isolated worktree
python3 auto-codex/run.py --spec 001 --review

# Merge completed build into project
python3 auto-codex/run.py --spec 001 --merge

# Discard build
python3 auto-codex/run.py --spec 001 --discard

QA Validation

# Run QA manually
python3 auto-codex/run.py --spec 001 --qa

# Check QA status
python3 auto-codex/run.py --spec 001 --qa-status

Testing

# Install test dependencies (required first time)
cd auto-codex && uv pip install -r ../tests/requirements-test.txt

# Run all tests (use virtual environment pytest)
auto-codex/.venv/bin/python -m pytest -v

# Run single test file
auto-codex/.venv/bin/python -m pytest tests/test_security.py -v

# Run specific test
auto-codex/.venv/bin/python -m pytest tests/test_security.py::test_bash_command_validation -v

# Skip slow tests
auto-codex/.venv/bin/python -m pytest -m "not slow"

Spec Validation

python3 auto-codex/spec/validate_spec.py --spec-dir /path/to/your-project/.auto-codex/specs/001-feature --checkpoint all

Releases

# Automated version bump and release (recommended)
node scripts/bump-version.js patch   # 2.5.5 -> 2.5.6
node scripts/bump-version.js minor   # 2.5.5 -> 2.6.0
node scripts/bump-version.js major   # 2.5.5 -> 3.0.0
node scripts/bump-version.js 2.6.0   # Set specific version

# Then push to trigger GitHub release workflows
git push origin main
git push origin v2.6.0

See RELEASE.md for detailed release process documentation.

Architecture

Core Pipeline

Spec Creation (spec_runner.py) - Dynamic 3-8 phase pipeline based on task complexity:

SIMPLE (3 phases): Discovery → Quick Spec → Validate
STANDARD (6-7 phases): Discovery → Requirements → [Research] → Context → Spec → Plan → Validate
COMPLEX (8 phases): Full pipeline with Research and Self-Critique phases

Implementation (run.py → agent.py) - Multi-session build:

Planner Agent creates subtask-based implementation plan
Coder Agent implements subtasks (can spawn subagents for parallel work)
QA Reviewer validates acceptance criteria
QA Fixer resolves issues in a loop

Key Components

client.py - Codex CLI client with security hooks and tool permissions
security.py + project_analyzer.py - Dynamic command allowlisting based on detected project stack
worktree.py - Git worktree isolation for safe feature development
memory.py - File-based session memory (primary, always-available storage)
graphiti_memory.py - Optional graph-based cross-session memory with semantic search
graphiti_providers.py - Multi-provider factory for Graphiti (OpenAI, Anthropic, Azure, Ollama, Google AI)
graphiti_config.py - Configuration and validation for Graphiti integration
linear_updater.py - Optional Linear integration for progress tracking

Agent Prompts (auto-codex/prompts/)

Prompt	Purpose
planner.md	Creates implementation plan with subtasks
coder.md	Implements individual subtasks
coder_recovery.md	Recovers from stuck/failed subtasks
qa_reviewer.md	Validates acceptance criteria
qa_fixer.md	Fixes QA-reported issues
spec_gatherer.md	Collects user requirements
spec_researcher.md	Validates external integrations
spec_writer.md	Creates spec.md document
spec_critic.md	Self-critique using ultrathink
complexity_assessor.md	AI-based complexity assessment

Spec Directory Structure

Each spec in .auto-codex/specs/XXX-name/ (inside your target project) contains:

spec.md - Feature specification
requirements.json - Structured user requirements
context.json - Discovered codebase context
implementation_plan.json - Subtask-based plan with status tracking
qa_report.md - QA validation results
QA_FIX_REQUEST.md - Issues to fix (when rejected)

Branching & Worktree Strategy

Auto-Codex uses git worktrees for isolated builds. All branches stay LOCAL until user explicitly pushes:

main (user's branch)
└── auto-codex/{spec-name}  ← spec branch (isolated worktree)

Key principles:

ONE branch per spec (auto-codex/{spec-name})
Parallel work uses subagents (agent decides when to spawn)
NO automatic pushes to GitHub - user controls when to push
User reviews in spec worktree (.worktrees/{spec-name}/)
Final merge: spec branch → main (after user approval)

Workflow:

Build runs in isolated worktree on spec branch
Agent implements subtasks (can spawn subagents for parallel work)
User tests feature in .worktrees/{spec-name}/
User runs --merge to add to their project
User pushes to remote when ready

Security Model

Three-layer defense:

OS Sandbox - Bash command isolation
Filesystem Permissions - Operations restricted to project directory
Command Allowlist - Dynamic allowlist from project analysis (security.py + project_analyzer.py)

Security profile cached in .auto-codex-security.json.

Memory System

Dual-layer memory architecture:

File-Based Memory (Primary) - memory.py

Zero dependencies, always available
Human-readable files in specs/XXX/memory/
Session insights, patterns, gotchas, codebase map

Graphiti Memory (Optional Enhancement) - graphiti_memory.py

Graph database with semantic search (FalkorDB)
Cross-session context retrieval
Multi-provider support (V2):
- LLM: OpenAI, Anthropic, Azure OpenAI, Ollama, Google AI (Gemini)
- Embedders: OpenAI, Voyage AI, Azure OpenAI, Ollama, Google AI

Enable with: GRAPHITI_ENABLED=true + provider credentials. See .env.example.

Project Structure

Auto-Codex can be used in two ways:

As a standalone CLI tool:

python3 auto-codex/run.py --spec 001

With the optional Electron frontend (auto-codex-ui/):

Provides a GUI for task management and progress tracking
Wraps the CLI commands - the backend works independently

Directory layout:

auto-codex/ - Python backend/CLI (the framework code)
auto-codex-ui/ - Optional Electron frontend
.auto-codex/specs/ - Per-project data (specs, plans, QA reports) - gitignored

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

CLAUDE.md

Project Overview

Commands

Prerequisites

Setup

Creating and Running Specs

Workspace Management

QA Validation

Testing

Spec Validation

Releases

Architecture

Core Pipeline

Key Components

Agent Prompts (auto-codex/prompts/)

Spec Directory Structure

Branching & Worktree Strategy

Security Model

Memory System

Project Structure

FilesExpand file tree

CLAUDE.md

Latest commit

History

CLAUDE.md

File metadata and controls

CLAUDE.md

Project Overview

Commands

Prerequisites

Setup

Creating and Running Specs

Workspace Management

QA Validation

Testing

Spec Validation

Releases

Architecture

Core Pipeline

Key Components

Agent Prompts (auto-codex/prompts/)

Spec Directory Structure

Branching & Worktree Strategy

Security Model

Memory System

Project Structure