A serverless monorepo platform for ephemeral AI coding agent environments on Cloudflare Workers + Hetzner Cloud VMs.
apps/
├── api/ # Cloudflare Worker API (Hono)
├── web/ # Control plane UI (React + Vite)
├── www/ # Marketing website, blog & docs (Astro + Starlight) — simple-agent-manager.org
└── tail-worker/ # Cloudflare Tail Worker (observability)
packages/
├── shared/ # Shared types and utilities
├── providers/ # Cloud provider abstraction (Hetzner, Scaleway)
├── terminal/ # Shared terminal component
├── cloud-init/ # Cloud-init template generator
├── acp-client/ # Shared ACP React components (MessageBubble, MessageActions, AudioPlayer)
├── ui/ # Design system tokens and shared UI components
└── vm-agent/ # Go VM agent (PTY, WebSocket, ACP, MCP tool endpoints)
tasks/ # Task tracking (backlog -> active -> archive)
specs/ # Feature specifications
docs/ # Documentation
strategy/ # Strategic planning (competitive, business, marketing, engineering, content)
pnpm install # Install dependencies
pnpm build # Build all packages
pnpm test # Run tests
pnpm typecheck # Type check
pnpm lint # Lint
pnpm format # Format

Build packages in dependency order: shared -> providers -> cloud-init -> api / web
pnpm --filter @simple-agent-manager/shared build
pnpm --filter @simple-agent-manager/providers build
pnpm --filter @simple-agent-manager/api build

This monorepo has TWO separate web surfaces. Do NOT confuse them:
| Surface | Directory | Domain | Stack | What it is |
|---|---|---|---|---|
| Marketing website | `apps/www/` | `simple-agent-manager.org` | Astro + Starlight | Public website, landing pages, blog, docs |
| App (control plane) | `apps/web/` | `app.simple-agent-manager.org` | React + Vite | Authenticated SaaS UI (dashboard, projects, settings) |
When the user mentions website, marketing, landing page, blog, docs site, or public pages → look in apps/www/.
When the user mentions app, dashboard, projects, settings, or UI → look in apps/web/.
Local-first, Cloudflare-integrated. Prove as much of a feature as you can locally before touching staging. Local iteration takes seconds; staging iteration takes minutes and burns VM quota. Staging is for things that genuinely require real infrastructure (OAuth callbacks, DNS, VM provisioning, edge TLS) — not for discovering whether your code compiles.
- Prototype and test locally first: unit tests, Miniflare integration tests, local Vite dev server, Playwright visual audits. Hybrid loops (local UI against staging API, or local API against staging VM agent) are encouraged. See `.claude/rules/29-local-first-debugging.md`.
- Deploy to staging only when local verification is exhausted, that is, when the remaining work genuinely needs real OAuth, DNS, or VMs. Partial-feature staging deploys are fine for end-to-end plumbing while the rest is still developed locally. Staging deploys take ~7 minutes via `gh workflow run deploy-staging.yml`.
- Query staging directly via Cloudflare API: use `$CF_TOKEN` to query D1 (SQL), read/write KV, check DNS records, and inspect Workers. This is the fastest way to verify deploys, debug issues, and understand staging state. Always check infrastructure state via CF API before guessing at fixes. See `.claude/rules/32-cf-api-debugging.md` for the full cheat sheet.
- When something fails on staging, QUERY THEN READ LOGS before changing any code: first query D1/KV/DNS via CF API to understand the data state, then use `wrangler tail`, `/admin/logs`, `/admin/errors`, the Node detail page's log stream, `journalctl -u vm-agent` via SSH, `docker logs` for containers. Never guess-and-redeploy. See `.claude/rules/29-local-first-debugging.md` for the log location matrix.
- Merge to main: triggers production deployment.
Full local-development guide: docs/guides/local-development.md.
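The "query staging directly" step can be sketched as a small request builder. This is a minimal sketch, not the project's tooling: `buildD1QueryRequest` and its parameter names are hypothetical, the account and database IDs are placeholders you would supply yourself, and `.claude/rules/32-cf-api-debugging.md` remains the authoritative cheat sheet. It targets the Cloudflare REST endpoint for querying a D1 database and only describes the request, so nothing is sent until you pass it to `fetch`.

```typescript
// Plain description of an HTTP request; kept free of runtime-specific types.
interface D1QueryRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds a D1 SQL query request for the Cloudflare API.
// `token` would be the value of $CF_TOKEN; the IDs are caller-supplied.
function buildD1QueryRequest(
  accountId: string,
  databaseId: string,
  token: string,
  sql: string,
  params: string[] = [],
): D1QueryRequest {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/d1/database/${databaseId}/query`,
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    // Bound parameters keep values out of the SQL string itself.
    body: JSON.stringify({ sql, params }),
  };
}
```

A caller would then run something like `fetch(req.url, req)` and inspect the JSON result before deciding on a fix.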
Merge to main automatically deploys to production via GitHub Actions.
- CI (`ci.yml`): lint, typecheck, test, build on all pushes/PRs
- Deploy Staging (`deploy-staging.yml`): manual trigger only (`workflow_dispatch`); agents trigger this explicitly during `/do` Phase 6
- Deploy Production (`deploy.yml`): full Pulumi + Wrangler deployment on push to main
- Teardown (`teardown.yml`): manual only; destroys all resources
Staging deployment is manual — triggered via gh workflow run deploy-staging.yml --ref <branch>. Agents executing the /do workflow MUST deploy to staging and verify the live app before merging. A failed staging deploy blocks merge just like a failed test. Before triggering a deployment, check for existing active runs and wait at least 5 minutes if one is in progress. If the deploy fails due to missing secrets or configuration (not code), alert the user immediately — do not skip verification. See .claude/rules/13-staging-verification.md.
Staging verification means the feature WORKS — not that pages load, not that config endpoints respond, not that the UI renders. The actual feature, exercised as an end user would, must complete successfully with ZERO errors. If the feature errors on staging for ANY reason (missing binding, wrong toolchain version, unconfigured service), do NOT merge — alert the user immediately. Never rationalize a staging error as "expected." See .claude/rules/30-never-ship-broken-features.md.
After merging ANY PR to main, agents MUST monitor the Deploy Production workflow to completion. If the deploy fails, alert the user immediately with the failure reason and whether it requires human intervention. Do NOT silently finish the task when the deploy fails — a merged PR is not shipped until the deploy succeeds. See the /do workflow Phase 7b for the full procedure.
Production data loss is catastrophic and irreversible. Multiple deterministic gates prevent it:
| Gate | Runs in | What it catches |
|---|---|---|
| `pnpm quality:migration-safety` | CI (every PR) | DROP TABLE on CASCADE parents, DELETE without WHERE, PRAGMA foreign_keys=OFF, UPDATE without WHERE, any DROP TABLE in new migrations |
| `pnpm quality:do-migration-safety` | CI (every PR) | DROP TABLE, DELETE without WHERE, UPDATE without WHERE in Durable Object SQLite migrations (no recovery mechanism) |
| Pre-migration D1 backup | Deploy pipeline | Creates time-travel bookmark + explicit backup before every migration run |
| Post-migration row count verification | Deploy pipeline | Compares row counts before/after migrations; blocks deploy if >50% data loss detected in any table |
| D1 Time Travel Restore | Manual workflow | Point-in-time recovery for D1 databases (30-day window). See d1-restore.yml |
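
The post-migration row count gate in the table can be illustrated with a short comparison function. This is a sketch under assumptions, not the deploy pipeline's code: `findExcessiveLoss` and the `RowCounts` shape are hypothetical, and only the 50% threshold comes from the table.

```typescript
// Table name -> row count, captured before and after the migration run.
type RowCounts = Record<string, number>;

// Returns the tables whose row count dropped by more than `maxLossFraction`.
// A non-empty result would block the deploy.
function findExcessiveLoss(
  before: RowCounts,
  after: RowCounts,
  maxLossFraction = 0.5,
): string[] {
  const offenders: string[] = [];
  for (const [table, beforeCount] of Object.entries(before)) {
    if (beforeCount === 0) continue; // nothing to lose in an empty table
    const afterCount = after[table] ?? 0; // a missing table counts as fully lost
    const lostFraction = (beforeCount - afterCount) / beforeCount;
    if (lostFraction > maxLossFraction) offenders.push(table);
  }
  return offenders;
}
```

Note that a table sitting exactly at 50% loss passes in this sketch, because the table above says "more than 50%".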
Migration rules: See .claude/rules/31-migration-safety.md. NEVER use DROP TABLE on any table with CASCADE children. Use ALTER TABLE ADD COLUMN instead of table recreation.
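
To make the CI checks concrete, here is a deliberately naive sketch of a migration lint. It is not the implementation behind `pnpm quality:migration-safety`: the function name, rule identifiers, and the semicolon-based statement splitting are all assumptions, and it does not handle SQL comments or semicolons inside string literals.

```typescript
interface Finding {
  rule: string;
  statement: string;
}

// Flags the destructive patterns named in the gate table above.
function lintMigration(sql: string): Finding[] {
  const findings: Finding[] = [];
  // Naive split: good enough for a sketch, not for arbitrary SQL.
  const statements = sql
    .split(";")
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  for (const statement of statements) {
    const hasWhere = /\bWHERE\b/i.test(statement);
    if (/^DROP\s+TABLE\b/i.test(statement)) {
      findings.push({ rule: "drop-table", statement });
    }
    if (/^DELETE\s+FROM\b/i.test(statement) && !hasWhere) {
      findings.push({ rule: "delete-without-where", statement });
    }
    if (/^UPDATE\b/i.test(statement) && !hasWhere) {
      findings.push({ rule: "update-without-where", statement });
    }
    if (/^PRAGMA\s+foreign_keys\s*=\s*OFF\b/i.test(statement)) {
      findings.push({ rule: "pragma-foreign-keys-off", statement });
    }
  }
  return findings;
}
```

An additive migration such as `ALTER TABLE tasks ADD COLUMN mission_id TEXT` produces no findings, which matches the recommended pattern.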
- Workspace: AI coding environment (VM + devcontainer + Claude Code)
- Node: VM host that runs multiple workspaces
- Provider: Cloud infrastructure abstraction (currently Hetzner only)
- Project: Primary organizational unit linking a GitHub repo to workspaces, chat sessions, tasks, and activity
- ProjectData DO: Per-project Durable Object with embedded SQLite for chat sessions, messages, activity events, and ACP sessions (spec 027). Accessed via `env.PROJECT_DATA.idFromName(projectId)`
- NodeLifecycle DO: Per-node Durable Object managing warm pool state machine (active → warm → destroying). Accessed via `env.NODE_LIFECYCLE.idFromName(nodeId)`. Handles idle timeout alarms; actual infrastructure teardown delegated to cron sweep.
- Warm Node Pooling: After task completion, auto-provisioned nodes enter "warm" state for 30 min (configurable via `NODE_WARM_TIMEOUT_MS`) for fast reuse. Three-layer defense against orphans: DO alarm + cron sweep + max lifetime.
- Task Runner: Autonomous task execution: selects/provisions nodes, creates workspaces, runs agents, cleans up. VM size precedence: explicit override > project default > platform default.
- Lifecycle Control: Workspaces/nodes stopped, restarted, or deleted explicitly via API/UI
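
Two of the rules in this list, the VM size precedence and the warm timeout, can be sketched as pure functions. The function names and the example size labels are hypothetical; only the precedence order, the 30-minute default, and the `NODE_WARM_TIMEOUT_MS` variable name come from the text above.

```typescript
// VM size precedence: explicit override > project default > platform default.
function resolveVmSize(
  explicitOverride: string | undefined,
  projectDefault: string | undefined,
  platformDefault: string,
): string {
  return explicitOverride ?? projectDefault ?? platformDefault;
}

// Default warm window from the text: 30 minutes.
const DEFAULT_NODE_WARM_TIMEOUT_MS = 30 * 60 * 1000;

// Parses the raw NODE_WARM_TIMEOUT_MS value, falling back to the default
// when it is missing, non-numeric, or not positive.
function parseWarmTimeoutMs(raw: string | undefined): number {
  const parsed = Number(raw);
  return raw !== undefined && Number.isFinite(parsed) && parsed > 0
    ? parsed
    : DEFAULT_NODE_WARM_TIMEOUT_MS;
}

// True once a warm node has been idle for the full timeout.
function isWarmExpired(warmSinceMs: number, nowMs: number, timeoutMs: number): boolean {
  return nowMs - warmSinceMs >= timeoutMs;
}
```

For example, `resolveVmSize(undefined, "medium", "small")` picks the project default, since no explicit override was given.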
The root domain does NOT serve any application. Always use subdomains:
| Destination | URL Pattern |
|---|---|
| Web UI | https://app.${BASE_DOMAIN}/... |
| API | https://api.${BASE_DOMAIN}/... |
| Workspace | https://ws-${id}.${BASE_DOMAIN} |
| Workspace Port | https://ws-${id}--${port}.${BASE_DOMAIN} |
- User-facing redirects -> `app.${BASE_DOMAIN}` (NEVER bare `${BASE_DOMAIN}`)
- API-to-API references -> `api.${BASE_DOMAIN}`
- Relative redirects in API worker are WRONG: they resolve to the API subdomain
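
The URL patterns above can be captured in a few helpers so that redirects are always built as absolute URLs. This is an illustrative sketch; the helper names are hypothetical and `baseDomain` stands in for `BASE_DOMAIN`.

```typescript
// Web UI: the target for all user-facing redirects.
function appUrl(baseDomain: string, path = "/"): string {
  return `https://app.${baseDomain}${path}`;
}

// API: the target for API-to-API references.
function apiUrl(baseDomain: string, path = "/"): string {
  return `https://api.${baseDomain}${path}`;
}

// Workspace, optionally with an exposed port (ws-<id>--<port>).
function workspaceUrl(baseDomain: string, id: string, port?: number): string {
  const host = port === undefined ? `ws-${id}` : `ws-${id}--${port}`;
  return `https://${host}.${baseDomain}`;
}
```

Building the redirect with `appUrl(baseDomain, "/dashboard")` instead of a relative `"/dashboard"` avoids the case where the path resolves against the API subdomain.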
GitHub Actions secret names cannot start with GITHUB_*, so GitHub App secrets use GH_* prefix. The deployment script (configure-secrets.sh) maps them to GITHUB_* Worker secrets.
| Context | Prefix | Example |
|---|---|---|
| GitHub Environment | `GH_` | `GH_CLIENT_ID` |
| Worker runtime / .env | `GITHUB_` | `GITHUB_CLIENT_ID` |
Full env var reference: use the env-reference skill or see apps/api/.env.example.
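
The prefix mapping can be pictured as a one-line rename. This is only an illustration of the naming convention: the real mapping lives in `configure-secrets.sh`, which may use an explicit list rather than a blanket prefix rule, and `toWorkerSecretName` is a hypothetical name.

```typescript
// Maps a GitHub Environment secret name (GH_*) to its Worker secret name (GITHUB_*).
// Names without the GH_ prefix are returned unchanged.
function toWorkerSecretName(name: string): string {
  return name.startsWith("GH_") ? `GITHUB_${name.slice("GH_".length)}` : name;
}
```

So `toWorkerSecretName("GH_CLIENT_ID")` yields `GITHUB_CLIENT_ID`, matching the table above.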
Environment-specific [env.*] sections are NOT checked into the repository. They are generated at deploy time by scripts/deploy/sync-wrangler-config.ts from Pulumi outputs + the top-level config. When adding ANY new binding to wrangler.toml, add it to the top-level section only. The sync script copies static bindings (Durable Objects, AI, migrations) and generates dynamic bindings (D1, KV, R2, worker name, routes, tail_consumers) automatically. The CI quality check (pnpm quality:wrangler-bindings) verifies that no env sections are committed and that required binding types are present at the top level. See .claude/rules/07-env-and-urls.md for details.
- BYOC (Bring-Your-Own-Cloud): Users provide their own Hetzner tokens. The platform does NOT have cloud provider credentials.
- User credentials encrypted per-user in the database, NOT stored as env vars or Worker secrets. See `docs/architecture/credential-security.md`.
- Platform secrets (ENCRYPTION_KEY and purpose-specific overrides, JWT keys, CF_API_TOKEN) are Cloudflare Worker secrets set during deployment. See `docs/architecture/secrets-taxonomy.md`.
- Canonical IDs for identity: use `workspaceId`, `nodeId`, `sessionId` for all machine-critical operations (storage, routing, lifecycle). Human-readable labels are for UX/logging only and MUST be treated as mutable and non-unique.
- Hybrid D1 + Durable Object storage: D1 for cross-project queries (dashboard, tasks, users); per-project DOs for write-heavy data (chat sessions, messages, activity events). See `docs/adr/004-hybrid-d1-do-storage.md`.
- Always use worktrees and PRs — never commit directly to main. Create a feature branch in a git worktree and open a PR.
- Push early and often — environments are ephemeral. Unpushed work can be lost at any time.
- Pull and rebase frequently: before starting work and before pushing, run `git fetch origin && git rebase origin/main` to stay current and avoid conflicts.
- After pushing, check CI and fix any failures before moving on.
- Fix all build/lint errors before pushing — even pre-existing ones
- No dead code — if code is no longer referenced, remove it in the same change
- Capability tests required: every multi-component feature needs at least one test that exercises the complete happy path across system boundaries. Component tests alone are not sufficient. See `.claude/rules/10-e2e-verification.md`.
- Verify assumptions, don't trust documentation: when specs or docs say "existing X works," verify with a test or manual check before building on it. See post-mortem: `docs/notes/2026-02-28-missing-initial-prompt-postmortem.md`.
- Cite code paths in behavioral docs: when documenting what the system does, cite specific functions. Never write "X happens" without a code reference. Mark unimplemented behavior as "intended" not present tense.
- Diagrams in markdown: use Mermaid (`mermaid` fenced code blocks) for all diagrams in `.md` files. The markdown renderer supports Mermaid natively.
- Subagents live in `.claude/agents/`; Codex skills in `.agents/skills/`
- Playwright screenshots go in `.codex/tmp/playwright-screenshots/` (gitignored)
- Playwright visual audit required for UI changes: any PR touching `apps/web/`, `packages/ui/`, or `packages/terminal/` must run Playwright visual tests with diverse mock data on mobile (375px) and desktop (1280px) viewports. See `.claude/rules/17-ui-visual-testing.md`.
- No duplicate UI controls: before adding any new settings control or form field, search for existing controls managing the same API field. Consolidate into one canonical location. See `.claude/rules/24-no-duplicate-ui-controls.md`.
Claude Code supports dual authentication: API keys (pay-per-use from Anthropic Console) and OAuth tokens (from Claude Max/Pro subscriptions via claude setup-token). Users toggle between them in Settings. The system injects CLAUDE_CODE_OAUTH_TOKEN or ANTHROPIC_API_KEY based on active credential type.
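
The credential switch described above amounts to choosing one of two environment variables. A minimal sketch, assuming a discriminated union for the active credential; the type name, the `"api-key"` / `"oauth-token"` tags, and `claudeAuthEnv` are hypothetical, while the two variable names come from the text.

```typescript
// Hypothetical shape for the user's active Claude credential.
type ClaudeCredential =
  | { type: "api-key"; value: string }
  | { type: "oauth-token"; value: string };

// Returns the single environment variable to inject for the active credential.
function claudeAuthEnv(credential: ClaudeCredential): Record<string, string> {
  return credential.type === "oauth-token"
    ? { CLAUDE_CODE_OAUTH_TOKEN: credential.value }
    : { ANTHROPIC_API_KEY: credential.value };
}
```

Returning exactly one variable keeps the two authentication modes from being set at the same time.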
- Staging authentication: Use the smoke test token in `SAM_PLAYWRIGHT_PRIMARY_USER` env var. POST it to `https://api.sammy.party/api/auth/token-login` with body `{ "token": "<value>" }` to get a session cookie, then navigate to `https://app.sammy.party`. See `.claude/rules/13-staging-verification.md` for full procedure.
- Production authentication: Use GitHub OAuth credentials at `/workspaces/.tmp/secure/demo-credentials.md` (outside repo)
- Live test cleanup required: delete test workspaces/nodes after verification
- Staging verification required for every code PR; see `.claude/rules/13-staging-verification.md`
- See `.claude/rules/02-quality-gates.md` for full testing requirements
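
The staging login step can be sketched as a request description. The URL and body shape come from the list above; the `Content-Type` header, the `buildTokenLoginRequest` name, and the `apiOrigin` parameter are assumptions, and `.claude/rules/13-staging-verification.md` remains the full procedure.

```typescript
interface TokenLoginRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// `token` would be the value of SAM_PLAYWRIGHT_PRIMARY_USER.
function buildTokenLoginRequest(
  token: string,
  apiOrigin = "https://api.sammy.party",
): TokenLoginRequest {
  return {
    url: `${apiOrigin}/api/auth/token-login`,
    method: "POST",
    headers: { "Content-Type": "application/json" }, // assumed: JSON body
    body: JSON.stringify({ token }),
  };
}
```

In a Playwright test, the response to this request would set the session cookie before navigating to `https://app.sammy.party`.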
When you discover bugs or errors during testing — even if unrelated to your current task — file them as backlog tasks immediately so they don't get lost:
- Create `tasks/backlog/YYYY-MM-DD-descriptive-name.md`
- Include: Problem description, Context (where/when discovered), Acceptance Criteria checklist
- Continue with your current work
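
For illustration, the backlog filename convention can be generated from a date and a short title. This is a sketch: `backlogTaskPath` is a hypothetical helper, the slug rules are an assumption, and the date is taken in UTC.

```typescript
// Builds tasks/backlog/YYYY-MM-DD-descriptive-name.md from a date and title.
function backlogTaskPath(date: Date, title: string): string {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const slug = title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse anything non-alphanumeric into one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  return `tasks/backlog/${day}-${slug}.md`;
}
```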
- Build errors: Run builds in dependency order (see Build Order above)
- Test failures: Check Miniflare bindings are configured in `vitest.config.ts`
- Type errors: Run `pnpm typecheck` from root to see all issues
- Staging issues: Query staging state directly via `$CF_TOKEN` and the Cloudflare API: D1 SQL queries, KV reads, DNS checks. See `.claude/rules/32-cf-api-debugging.md` for copy-paste commands. Always query before guessing.
Tasks tracked as markdown in tasks/ (backlog -> active -> archive). See tasks/README.md for conventions.
Dispatching tasks: When dispatching tasks to other agents, always instruct them to use the /do skill. This ensures the receiving agent follows the full end-to-end workflow (research, implement, review, staging verify, PR). See .claude/rules/09-task-tracking.md.
Strategic planning artifacts live in strategy/ — see strategy/README.md for full structure.
| Domain | Directory | Skill | Key Artifacts |
|---|---|---|---|
| Competitive Research | `strategy/competitive/` | `/competitive-research` | Competitor profiles, feature matrix, positioning map, SWOT |
| Marketing | `strategy/marketing/` | `/marketing-strategy` | Positioning doc, messaging guide, content calendar, gap analysis |
| Business | `strategy/business/` | `/business-strategy` | Market sizing (TAM/SAM/SOM), pricing, business model, GTM plan |
| Engineering | `strategy/engineering/` | `/engineering-strategy` | Roadmap (Now/Next/Later), tech radar, tech debt register |
| Content | `strategy/content/` | `/content-create` | Social posts, blog drafts, changelogs, launch copy |
Domains chain together: competitive research feeds marketing and business strategy, which feed engineering priorities and content creation.
- TypeScript 5.x (API Worker) + @mastra/core (AI agent orchestration), workers-ai-provider (Vercel AI SDK bridge to Workers AI), Cloudflare Workers AI binding (llm-task-title-generation)
- TypeScript 5.x (Worker/Web), Go 1.24+ (VM Agent) + Hono (API framework), Drizzle ORM (D1), React + Vite (Web), Cloudflare Workers SDK (Durable Objects) (018-project-first-architecture)
- Cloudflare D1 (platform metadata) + Durable Objects with SQLite (per-project high-throughput data) + KV (ephemeral tokens) + R2 (agent binaries) (018-project-first-architecture)
- TypeScript 5.x (React 19 + Vite for web UI) + React 19, React Router 7, Vite, existing `@simple-agent-manager/ui` design system (019-ui-overhaul)
- N/A (frontend-only changes; backend APIs already exist from spec 018) (019-ui-overhaul)
- Go 1.24 (VM Agent) with log/slog structured logging, TypeScript 5.x (API Worker + Web UI) (020-node-observability)
- journald (systemd journal) on VM for log aggregation; Docker journald log driver; no new database storage (020-node-observability)
- TypeScript 5.x (API Worker, Web UI), Go 1.24+ (VM Agent) + Hono (API), React 19 + Vite (Web), Drizzle ORM (D1), Cloudflare Workers SDK (DOs), ACP Go SDK, cenkalti/backoff/v5 (new, Go retry) (021-task-chat-architecture)
- Cloudflare D1 (relational metadata), Durable Objects with SQLite (per-project chat data), VM-local SQLite (message outbox) (021-task-chat-architecture)
- TypeScript 5.x (API Worker + Web UI), Go 1.24+ (VM Agent) + Hono (API framework), Drizzle ORM (D1), React 19 + Vite (Web), Cloudflare Workers SDK (Durable Objects), `creack/pty` + `gorilla/websocket` (VM Agent) (022-simplified-chat-ux)
- TypeScript 5.x (API Worker + Web UI) + Hono (API), React 19 + Vite (Web), Drizzle ORM (D1), Cloudflare Workers SDK (Durable Objects, Tail Workers) (023-admin-observability)
- Cloudflare D1 (new `OBSERVABILITY_DATABASE` for errors) + existing D1 (`DATABASE` for health queries) + Cloudflare Workers Observability API (historical logs, 7-day retention) (023-admin-observability)
- TypeScript 5.x (React 19 + Vite 5) + Tailwind CSS v4, `@tailwindcss/vite` plugin, React 19, Vite 5, Lucide React (024-tailwind-adoption)
- N/A (no backend changes) (024-tailwind-adoption)
- TypeScript 5.x (React 19 + Vite) + React 19, `@simple-agent-manager/acp-client` (shared components), Tailwind CSS v4 (026-chat-message-parity)
- N/A (frontend-only changes; no database or API changes) (026-chat-message-parity)
- TypeScript 5.x (API Worker + Web UI), Go 1.24+ (VM Agent) + Hono (API), Drizzle ORM (D1), React 19 + Vite (Web), Cloudflare Workers SDK (Durable Objects), `creack/pty` + `gorilla/websocket` (VM Agent), ACP Go SDK (027-do-session-ownership)
- Cloudflare D1 (cross-project queries), Durable Objects with SQLite (per-project session data), VM-local SQLite (message outbox) (027-do-session-ownership)
- TypeScript 5.x (Cloudflare Workers runtime) + Hono (API framework), Drizzle ORM (D1), `@simple-agent-manager/shared`, `@simple-agent-manager/cloud-init` (028-provider-infrastructure)
- Cloudflare D1 (credentials table with AES-GCM encrypted tokens) (028-provider-infrastructure)
- ai-proxy-universal-tracking: Universal AI proxy passthrough for usage tracking — URL-path-based proxy routes (
apps/api/src/routes/ai-proxy-passthrough.ts) at/ai/proxy/:wstoken/anthropic/v1/messages,/ai/proxy/:wstoken/anthropic/v1/messages/count_tokens,/ai/proxy/:wstoken/openai/v1/chat/completionsembed workspace callback token in URL path, freeing auth headers for user's own API keys; runtime.ts (apps/api/src/routes/workspaces/runtime.ts) now ALWAYS returnsinferenceConfigwhen AI proxy is enabled — two modes:apiKeySource: 'user-credential'with provideranthropic-passthroughoropenai-passthrough(user has own key, passthrough proxy for tracking only) andapiKeySource: 'callback-token'with provideranthropic-proxyoropenai-proxy(platform proxy, existing behavior); base URLs use{wstoken}placeholder replaced at injection time by VM agent (packages/vm-agent/internal/acp/session_host.go); per-user RPM rate limiting and daily token budget applied;cf-aig-metadatainjected for cost attribution via AI Gateway; user credentials forwarded viax-api-key(Anthropic) orAuthorization(OpenAI) headers; configurable via AI_PROXY_ENABLED, AI_PROXY_RATE_LIMIT_RPM, AI_GATEWAY_ID, CF_ACCOUNT_ID - user-ai-budget-controls: User-facing AI budget controls —
GET /api/usage/ai/budgetroute (apps/api/src/routes/usage.ts) returns user's budget settings, daily usage, effective limits (3-tier resolution: user → env → constant), monthly cost from AI Gateway logs, utilization percentages, and exceeded flag;PUT /api/usage/ai/budgetvalidates and saves custom budget settings to KV (ai-budget-settings:{userId});DELETE /api/usage/ai/budgetresets to platform defaults; budget service (apps/api/src/services/ai-token-budget.ts) providesgetUserBudgetSettings(),saveUserBudgetSettings(),deleteUserBudgetSettings(),validateBudgetUpdate(),resolveEffectiveLimits(),checkTokenBudget(),incrementTokenUsage();BudgetSettingsSectioncomponent inSettingsComputeUsage.tsxwithBudgetBarutilization progress bars (color-coded: green < 80%, yellow 80-99%, red ≥ 100%), budget exceeded alert banner, Configure/Save/Cancel form with daily input/output token limits, monthly cost cap, alert threshold; shared typesUserAiBudgetSettings,UserAiBudgetResponse,UpdateAiBudgetRequestinpackages/shared/src/types/ai-usage.ts; configurable via AI_PROXY_DAILY_INPUT_TOKEN_LIMIT (default: 500000), AI_PROXY_DAILY_OUTPUT_TOKEN_LIMIT (default: 200000), AI_USAGE_MAX_DAILY_TOKEN_LIMIT (default: 10000000), AI_USAGE_MAX_MONTHLY_COST_CAP_USD (default: 1000), AI_USAGE_MIN_DAILY_TOKEN_LIMIT (default: 1000), AI_USAGE_MIN_MONTHLY_COST_CAP_USD (default: 0.01), AI_USAGE_BUDGET_TTL_SECONDS (default: 90000) - anthropic-proxy-endpoint: Native Anthropic Messages API proxy —
POST /ai/anthropic/v1/messagespass-through to Cloudflare AI Gateway (apps/api/src/routes/ai-proxy-anthropic.ts); receives native Anthropic format, forwards unchanged to AI Gateway's/anthropic/v1/messagespath (no format translation); auth viax-api-keyheader (workspace callback token — matches Claude Code's auth format); forwardsanthropic-versionandanthropic-betaheaders; SSE streaming pass-through; model validation (claude-* only);POST /ai/anthropic/v1/messages/count_tokenstoken counting endpoint; shared helpers inapps/api/src/services/ai-proxy-shared.ts(extractCallbackToken,verifyAIProxyAuth,buildAIGatewayMetadata,buildAnthropicGatewayUrl,AIProxyAuthError); upstream auth resolved viaresolveUpstreamAuth()fromai-billing.ts— supports Unified Billing (cf-aig-authorizationheader withCF_AIG_TOKEN ?? CF_API_TOKENfallback) and platform API key (x-api-key) modes;cf-aig-metadataheader injected in all billing modes for cost attribution; per-user RPM rate limiting and daily token budget shared with OpenAI proxy; Anthropic-format error responses ({ type: "error", error: { type, message } }); configurable via AI_PROXY_ENABLED (kill switch), AI_PROXY_RATE_LIMIT_RPM (default: 30), AI_PROXY_RATE_LIMIT_WINDOW_SECONDS (default: 60), AI_PROXY_DAILY_INPUT_TOKEN_LIMIT (default: 500000), AI_PROXY_DAILY_OUTPUT_TOKEN_LIMIT (default: 200000), AI_GATEWAY_ID, AI_PROXY_BILLING_MODE (default: auto) - user-ai-usage-dashboard: User-facing LLM usage dashboard —
GET /api/usage/ai?period=current-month|7d|30d|90droute (apps/api/src/routes/usage.ts) queries Cloudflare AI Gateway logs API filtered by authenticated user's metadata.userId; sharedai-gateway-logs.tsservice (apps/api/src/services/ai-gateway-logs.ts) extracted from duplicated admin-costs/admin-ai-usage code providesfetchGatewayLogs(),iterateGatewayLogs(),parseGatewayPeriod(),getGatewayPeriodBounds(),getPeriodLabel(),resolveGatewayPagination(),aggregateByModel(),aggregateByDay()functions; sharedUserAiUsageResponsetype inpackages/shared/src/types/ai-usage.ts;AiUsageSectioncomponent inSettingsComputeUsage.tsxwith KPI cards (total cost, requests, input/output tokens), model breakdown (cost, request count, cached/error counts), daily trend bars, period selector; mobile-first layout (2x2 grid mobile, 4-col desktop); AI Gateway metadata compacted to 5 entries (userId, workspaceId, projectId, source, trialId) to respect CF limit; configurable via AI_GATEWAY_ID, AI_USAGE_PAGE_SIZE (default: 50), AI_USAGE_MAX_PAGES (default: 20, hard cap: 20) - cost-monitoring-dashboard: Admin cost monitoring dashboard —
GET /api/admin/costsroute (apps/api/src/routes/admin-costs.ts) aggregates LLM costs from Cloudflare AI Gateway logs API (paginated, per-model/per-day/per-user breakdown, trial cost tracking, cached/error request counts) and compute costs fromgetAllUsersNodeUsageSummarynode usage service into a unifiedCostSummaryResponse; monthly projection via daily average extrapolation;AdminCosts.tsxpage with KPI cards (LLM cost, monthly projection, compute estimate, combined), daily cost trend AreaChart, cost by model horizontal BarChart + table, cost by user table; period selector (current-month, 30d, 90d); admin tab at/admin/costs; configurable via COST_MONITORING_ENABLED (default: true, set to 'false' to disable), COMPUTE_VCPU_HOUR_COST_USD (default: 0.003), AI_USAGE_PAGE_SIZE (default: 50), AI_USAGE_MAX_PAGES (default: 20, hard cap: 20) - sam-observability-context-tools: SAM observability and codebase context tools — 5 new tools in
apps/api/src/durable-objects/sam-session/tools/enable SAM to search task messages and browse project codebases:list_sessions(browse project chat sessions with status/taskId filters viaprojectDataService.listSessions),get_session_messages(retrieve grouped messages from a session viaprojectDataService.getMessages+groupTokensIntoMessages),search_task_messages(full-text search across task messages viaprojectDataService.searchMessageswith FTS5/LIKE fallback, taskId→sessionId resolution),search_code(GitHub Code Search API withrepo:owner/namequalifier, path/extension filters, text_matches snippets),get_file_content(GitHub Contents API for file content or directory listing with base64 decode); sharedhelpers.tsextractsresolveProjectWithOwnership(),parseRepository(),getUserGitHubToken()fromget-ci-status.ts; SAM_SYSTEM_PROMPT updated with "Task Message Search (Observability)" and "Codebase Context" tool sections; configurable via SAM_SESSION_MESSAGES_LIMIT (default: 50), SAM_SESSION_MESSAGES_MAX_LIMIT (default: 200), SAM_SESSION_LIST_LIMIT (default: 20), SAM_SESSION_LIST_MAX_LIMIT (default: 100), SAM_TASK_MESSAGE_SEARCH_LIMIT (default: 10), SAM_TASK_MESSAGE_SEARCH_MAX_LIMIT (default: 50), SAM_CODE_SEARCH_LIMIT (default: 10), SAM_CODE_SEARCH_MAX_LIMIT (default: 30), SAM_FILE_CONTENT_MAX_BYTES (default: 1048576) - sam-agent-phase-a-tools: SAM Phase A orchestration tools — 4 new tools in
apps/api/src/durable-objects/sam-session/tools/transform SAM from read-only to functional orchestrator:dispatch_task(provisions workspace, runs agent, resolves config via explicit→profile→project→platform chain, reusesstartTaskRunnerDO,generateBranchName,generateTaskTitle,resolveAgentProfile,resolveCredentialSource,projectDataService.createSession/persistMessage),get_task_details(full task details with output/PR/error via D1 tasks+projects join),create_mission(D1 insert + ProjectOrchestrator DO registration, per-project limit enforcement),get_mission(mission status with task summary counts and individual task list via projects join for ownership); all tools registered intools/index.tsSAM_TOOLS array and toolHandlers map; SAM_SYSTEM_PROMPT updated with Observation and Action tool categories;dispatch_taskaccepts optionalmissionIdfor mission-task association; configurable via SAM_DISPATCH_MAX_DESCRIPTION_LENGTH (default: 32000) - sam-conversation-persistence-phase1: SAM conversation persistence and FTS5 search — SamSession DO migration 003 adds
type,linked_session_id,linked_project_idcolumns to conversations table (ALTER TABLE ADD COLUMN, no DROP TABLE) andmessages_ftsFTS5 virtual table (external content, unicode61 tokenizer) with backfill;searchMessages()two-tier strategy (FTS5 MATCH first, LIKE fallback with de-duplication);search_conversation_historytool for SAM agent to query past conversations; frontend (SamPrototype.tsx) loads persisted human conversation on mount with message mapping (assistant→sam, tool_result attachment);GET /searchendpoint on SamSession DO;GET /conversationsacceptstypequery filter;GET /conversations/:id/messagesacceptslimitparam withSAM_HISTORY_LOAD_LIMITdefault; FTS5 sync inpersistMessage()(non-fatal try/catch) and cleanup on conversation eviction; configurable via SAM_FTS_ENABLED (default: true), SAM_SEARCH_LIMIT (default: 10), SAM_SEARCH_MAX_LIMIT (default: 50), SAM_HISTORY_LOAD_LIMIT (default: 200) - policy-propagation-phase4: Policy Propagation system (Phase 4 orchestration) —
project_policiestable in ProjectData DO SQLite (migration 019) with category (rule/constraint/delegation/preference), title, content, source (explicit/inferred), sourceSessionId, confidence, active fields; 5 MCP tools (add_policy,list_policies,get_policy,update_policy,remove_policy) inapps/api/src/routes/mcp/policy-tools.tswith tool definitions intool-definitions-policy-tools.ts; policy injection intoget_instructionsMCP response viaformatPolicyDirectives()(readable text grouped by category with "PROJECT POLICY" headers) andbuildPolicyInstructions()(agent instructions for policy usage); policy propagation viadispatch_task— when dispatching within a mission, active policies are appended to child task descriptions as "## Project Policies (inherited)" section; REST API at/api/projects/:projectId/policies(GET list with category filter and pagination, GET /:id, POST create, PATCH /:id update, DELETE /:id soft-delete) guarded byrequireOwnedProject; service layer inapps/api/src/services/project-data.ts(createPolicy, getPolicy, listPolicies, updatePolicy, removePolicy, getActivePolicies); shared types inpackages/shared/src/types/policy.ts(ProjectPolicy,CreatePolicyRequest,UpdatePolicyRequest,ListPoliciesResponse,isPolicyCategory,isPolicySource); configurable limits inpackages/shared/src/constants/policies.tsviaresolvePolicyLimits(env): POLICY_MAX_PER_PROJECT (default: 100), POLICY_TITLE_MAX_LENGTH (default: 200), POLICY_CONTENT_MAX_LENGTH (default: 2000), POLICY_LIST_PAGE_SIZE (default: 50), POLICY_LIST_MAX_PAGE_SIZE (default: 200), POLICY_DEFAULT_CONFIDENCE (default: 0.8) - project-orchestrator-phase3: Per-project ProjectOrchestrator Durable Object (Phase 3 orchestration) — alarm-driven scheduling loop (default 30s) watches active missions, routes handoff packets from completed tasks to dependents via
enqueueMailboxMessage()withdeliverclass, detects stalled tasks (configurable timeout, default 20min) and sendsinterruptmessages, manages mission lifecycle (pause/resume/cancel); internal SQLite tables (orchestrator_missions,scheduling_queue,decision_log) with idempotent migration; 6 MCP tools (get_orchestrator_status,get_scheduling_queue,pause_mission,resume_mission,cancel_mission,override_task_state) inapps/api/src/routes/mcp/orchestrator-lifecycle-tools.ts; REST API at/api/projects/:projectId/orchestrator/*(status, queue, mission pause/resume/cancel, task state override); service layer inapps/api/src/services/project-orchestrator.ts; integration hooks:create_missionauto-registers with orchestrator,complete_tasktriggers immediate scheduling cycle vianotifyTaskEvent(); wrangler bindingPROJECT_ORCHESTRATORwith migration v10; configurable via ORCHESTRATOR_SCHEDULING_INTERVAL_MS (default: 30000), ORCHESTRATOR_STALL_TIMEOUT_MS (default: 1200000), ORCHESTRATOR_MAX_DISPATCHES_PER_CYCLE (default: 5), ORCHESTRATOR_MAX_ACTIVE_TASKS_PER_MISSION (default: 5), ORCHESTRATOR_DECISION_LOG_MAX_ENTRIES (default: 500), ORCHESTRATOR_RECENT_DECISIONS_LIMIT (default: 20), ORCHESTRATOR_QUEUE_MAX_ENTRIES (default: 100) - durable-interrupts-phase1: Durable messaging layer for orchestration — 5 message classes (notify, deliver, interrupt, preempt_and_replan, shutdown_with_final_prompt) with urgency-based priority ordering; delivery state machine (queued → delivered → acked → expired) with DELIVERY_STATE_TRANSITIONS map enforcing valid transitions; DO alarm-based delivery sweep for retry/re-delivery integrated into recalculateAlarm(); 3 MCP tools (send_durable_message, get_pending_messages, ack_message) in
apps/api/src/routes/mcp/mailbox-tools.ts; REST API at/api/projects/:projectId/mailboxfor message inspection (list with filters, stats, get, cancel); backwards-compatible upgrade of send_message_to_subtask to queue notify-class messages on agent_busy; migration 017-agent-mailbox extends session_inbox via ALTER TABLE ADD COLUMN (no DROP TABLE); configurable via MAILBOX_ACK_TIMEOUT_MS (default: 300000), MAILBOX_REDELIVERY_MAX_ATTEMPTS (default: 5), MAILBOX_TTL_MS (default: 3600000), MAILBOX_DELIVERY_POLL_INTERVAL_MS (default: 30000), MAILBOX_MAX_MESSAGES_PER_PROJECT (default: 1000), MAILBOX_MESSAGE_MAX_LENGTH (default: 32768) - mission-state-handoff-packets: Phase 2 orchestration primitives —
`missions` D1 table (migration 0048) with project/user FKs, status, budget_config; `mission_id` and `scheduler_state` nullable columns on tasks (ALTER TABLE ADD COLUMN); ProjectData DO migration 018 adds `mission_state_entries` and `handoff_packets` tables for per-project high-write storage; 8 MCP tools (`create_mission`, `get_mission`, `add_mission_state`, `get_mission_state`, `update_mission_state`, `delete_mission_state`, `create_handoff`, `get_handoffs`); `dispatch_task` upgraded with `missionId` parameter and parent inheritance; REST API at `/api/projects/:projectId/missions/*` (list, detail, state entries, handoff packets); `computeSchedulerStates()` pure function derives scheduler state from dependency graph (11 states: schedulable, blocked_dependency, blocked_budget, blocked_resource, blocked_human, waiting_delivery, stalled, running, completed, failed, cancelled); shared types in `packages/shared/src/types/mission.ts` and constants in `packages/shared/src/constants/missions.ts`; configurable via MISSION_MAX_PER_PROJECT (default: 50), MISSION_MAX_STATE_ENTRIES (default: 200), MISSION_MAX_HANDOFFS (default: 100), MISSION_TITLE_MAX_LENGTH (default: 200), MISSION_DESCRIPTION_MAX_LENGTH (default: 5000), MISSION_STATE_TITLE_MAX_LENGTH (default: 200), MISSION_STATE_CONTENT_MAX_LENGTH (default: 2000), HANDOFF_SUMMARY_MAX_LENGTH (default: 5000), HANDOFF_MAX_FACTS (default: 50), HANDOFF_MAX_OPEN_QUESTIONS (default: 20), HANDOFF_MAX_ARTIFACT_REFS (default: 30), HANDOFF_MAX_SUGGESTED_ACTIONS (default: 20), MISSION_LIST_PAGE_SIZE (default: 20), MISSION_LIST_MAX_PAGE_SIZE (default: 100)
- ai-proxy-gateway: AI inference proxy routes LLM requests through Cloudflare AI Gateway:
`POST /ai/v1/chat/completions` accepts OpenAI-format requests, transparently routes to Workers AI (`@cf/*` models) or Anthropic (`claude-*` models) with format translation (ai-anthropic-translate.ts); per-user RPM rate limiting + daily token budget via KV; admin model picker at `/admin/ai-proxy`; AI usage analytics dashboard at `/admin/analytics/ai-usage` aggregates AI Gateway logs by model, day, cost; Unified Billing support via `cf-aig-authorization` header allows routing Anthropic requests through Cloudflare credits without a stored provider API key; billing mode resolution in `ai-billing.ts` (`resolveUpstreamAuth()`, `resolveBillingMode()`, `resolveUnifiedBillingToken()`) with KV > env > default precedence; token resolution: `CF_AIG_TOKEN ?? CF_API_TOKEN` (CF_API_TOKEN already a Worker secret); both OpenAI-compat and native Anthropic proxy routes use `resolveUpstreamAuth()` for consistent billing; admin billing mode toggle (auto/unified/platform-key) via `PATCH /api/admin/ai-proxy/config`; configurable via AI_PROXY_ENABLED, AI_PROXY_DEFAULT_MODEL, AI_GATEWAY_ID, AI_PROXY_ALLOWED_MODELS, AI_PROXY_RATE_LIMIT_RPM, AI_PROXY_RATE_LIMIT_WINDOW_SECONDS, AI_PROXY_MAX_INPUT_TOKENS_PER_REQUEST, AI_PROXY_BILLING_MODE (default: auto; unified when CF_AIG_TOKEN or CF_API_TOKEN is set, otherwise falls back to platform key), AI_USAGE_PAGE_SIZE, AI_USAGE_MAX_PAGES
- trial-agent-boot: TrialOrchestrator
`discovery_agent_start` step now runs the full 5-step idempotent VM boot (registers agent session via `createAgentSessionOnNode`, mints MCP token with trialId as synthetic taskId, `startAgentSessionOnNode` with discovery prompt + MCP server URL, drives ACP session `pending → assigned → running`; idempotency flags `mcpToken`, `agentSessionCreatedOnVm`, `agentStartedOnVm`, `acpAssignedOnVm`, `acpRunningOnVm` on DO state let crash/retry resume without double-booking); new `fetchDefaultBranch()` probes GitHub `/repos/:owner/:repo` with AbortController-bounded fetch and threads the real default branch through `projects.defaultBranch` + workspace `git clone --branch` (master-default repos like `octocat/Hello-World` now work); configurable via TRIAL_GITHUB_TIMEOUT_MS (default: 5000); new capability test `apps/api/tests/unit/durable-objects/trial-orchestrator-agent-boot.test.ts` asserts every cross-boundary call fires with correct payload; rule 10 updated with port-of-pattern coverage requirement. See `docs/notes/2026-04-19-trial-orchestrator-agent-boot-postmortem.md`.
- trial-sse-events-fix: Fixed "zero trial.* events on staging":
`formatSse()` in `apps/api/src/routes/trial/events.ts` previously emitted named SSE frames (`event: trial.knowledge\ndata: {...}`), but the frontend subscribes via `source.onmessage`, which only fires for the default (unnamed) event; frames arrived on the wire (curl saw them) but browser EventSource silently dropped them. Now emits unnamed `data: {JSON}\n\n` frames; the `TrialEvent` payload's own `type` discriminator preserves dispatch info. Also fixed `eventsUrl` in `apps/api/src/routes/trial/create.ts` response shape mismatch (`/api/trial/events?trialId=X` → `/api/trial/:trialId/events`). New capability test `apps/api/tests/workers/trial-event-bus-sse.test.ts` asserts no `event:` line + JSON round-trip across the TrialEventBus DO → SSE endpoint boundary; unit tests updated to assert new unnamed-frame contract and exact `eventsUrl` shape (no substring matches on URL contracts). Rule 13 updated to ban curl-only verification of browser-consumed SSE/WebSocket streams: curl confirms bytes, browsers confirm dispatch. See `docs/notes/2026-04-19-trial-sse-named-events-postmortem.md`.
- trial-orchestrator-wire-up: TrialOrchestrator Durable Object + GitHub-API knowledge fast-path:
`POST /api/trial/create` now fire-and-forget dispatches two concurrent `c.executionCtx.waitUntil` tasks: (1) `env.TRIAL_ORCHESTRATOR.idFromName(trialId)` DO state machine (alarm-driven, steps: project_creation → node_provisioning → workspace_creation → workspace_ready → agent_session → completed; idempotent `start()`; terminal guard on completed/failed; overall-timeout emits `trial.error`); (2) `emitGithubKnowledgeEvents()` probe hits unauthenticated `/repos/:o/:n`, `/repos/:o/:n/languages`, `/repos/:o/:n/readme` in parallel with AbortController-bounded fetches, emits up to `TRIAL_KNOWLEDGE_MAX_EVENTS` `trial.knowledge` events (description, primary language, stars, topics, license, language breakdown by bytes, README first paragraph), swallows all errors; `apps/api/src/services/trial/bridge.ts` bridges ACP session transitions (`running` → `trial.ready`, `failed` → `trial.error`) and MCP tool calls (`add_knowledge` → `trial.knowledge`, `create_idea` → `trial.idea`) into the SSE stream via `readTrialByProject()` KV lookup (no-op on non-trial projects); new sentinel `TRIAL_ANONYMOUS_INSTALLATION_ID` row in `github_installations` so trial projects satisfy the FK; configurable via TRIAL_ORCHESTRATOR_OVERALL_TIMEOUT_MS (default: 300000), TRIAL_ORCHESTRATOR_STEP_MAX_RETRIES (default: 5), TRIAL_ORCHESTRATOR_RETRY_BASE_DELAY_MS (default: 1000), TRIAL_ORCHESTRATOR_RETRY_MAX_DELAY_MS (default: 60000), TRIAL_ORCHESTRATOR_NODE_READY_TIMEOUT_MS (default: 180000), TRIAL_ORCHESTRATOR_AGENT_READY_TIMEOUT_MS (default: 60000), TRIAL_ORCHESTRATOR_WORKSPACE_READY_TIMEOUT_MS (default: 180000), TRIAL_ORCHESTRATOR_WORKSPACE_READY_POLL_INTERVAL_MS (default: 5000), TRIAL_VM_SIZE (default: DEFAULT_VM_SIZE), TRIAL_VM_LOCATION (default: DEFAULT_VM_LOCATION), TRIAL_KNOWLEDGE_GITHUB_TIMEOUT_MS (default: 5000), TRIAL_KNOWLEDGE_MAX_EVENTS (default: 10)
- project-credential-overrides: Per-project agent credential overrides:
`credentials.project_id` column (migration 0042, nullable FK to `projects.id ON DELETE CASCADE`) with two partial unique indexes (`WHERE project_id IS NULL` for user-scoped, `WHERE project_id IS NOT NULL` for project-scoped); `getDecryptedAgentKey(db, userId, agentType, key, projectId?)` resolves project → user → platform in order; workspace runtime callback forwards `workspace.projectId`; `CodexRefreshLock` DO preserves scope on OAuth token rotation; new `/api/projects/:id/credentials` routes (GET/PUT/DELETE) guarded by `requireOwnedProject` (404 on cross-user); `ProjectAgentsSection` on Project Settings combines credential override and model/permission override per agent using `AgentKeyCard` (scope='project') with inheritance hints; cross-user writes rejected at query layer AND ownership check; `autoActivate` only affects project-scoped rows (user-scoped untouched)
- project-knowledge-graph: Per-project knowledge graph for persistent agent memory:
`knowledge_entities`, `knowledge_observations`, `knowledge_relations` tables + FTS5 virtual table in ProjectData DO SQLite (migration 016); entity-observation-relation model with confidence scoring and recency weighting; 11 MCP tools (`add_knowledge`, `update_knowledge`, `remove_knowledge`, `get_knowledge`, `search_knowledge`, `get_project_knowledge`, `get_relevant_knowledge`, `relate_knowledge`, `get_related`, `confirm_knowledge`, `flag_contradiction`) in `apps/api/src/routes/mcp/knowledge-tools.ts`; auto-retrieval of relevant knowledge in `get_instructions` MCP tool; REST API at `/api/projects/:projectId/knowledge/*` for UI CRUD; Knowledge Browser page at `/projects/:id/knowledge` with entity list, search, type filters, detail panel; configurable via KNOWLEDGE_AUTO_RETRIEVE_LIMIT (default: 20), KNOWLEDGE_MAX_ENTITIES_PER_PROJECT (default: 500), KNOWLEDGE_MAX_OBSERVATIONS_PER_ENTITY (default: 100), KNOWLEDGE_SEARCH_LIMIT (default: 20), KNOWLEDGE_SEARCH_MAX_LIMIT (default: 100), KNOWLEDGE_LIST_PAGE_SIZE (default: 50), KNOWLEDGE_LIST_MAX_PAGE_SIZE (default: 200), KNOWLEDGE_OBSERVATION_MAX_LENGTH (default: 1000)
- dispatch-task-config-parity: Full task execution config parity for
`dispatch_task` MCP tool: extended schema accepts optional `agentProfileId`, `taskMode` (task/conversation), `agentType`, `workspaceProfile` (default/lightweight), `provider` (hetzner/scaleway/gcp), `vmLocation`; config precedence matches normal submit path: explicit field → agent profile → project default → platform default; `resolveAgentProfile()` from `agent-profiles.ts` resolves profiles by ID or name with built-in seeding; profile-derived values (`model`, `permissionMode`, `systemPromptAppend`) passed through to `startTaskRunnerDO()`; `agentProfileHint` and `taskMode` persisted in task INSERT for observability; location validated against resolved provider; `maxTurns`/`timeoutMinutes` excluded because the runtime does not enforce them (documented in task file)
- project-file-library-mcp-tools: 4 MCP tools for project file library:
`list_library_files` (browse with tag/type/source filters, configurable sort and limit), `download_library_file` (decrypt from R2 and transfer to workspace via VM agent, configurable target directory), `upload_to_library` (read from workspace via VM agent, encrypt and store with agent source/tags, returns FILE_EXISTS with metadata on duplicate filename), `replace_library_file` (download new content from workspace, replace via service, additive tag merge, returns FILE_NOT_FOUND for invalid fileId); handlers in `apps/api/src/routes/mcp/library-tools.ts`; path traversal validation on targetPath; error message sanitization; Authorization Bearer header for server-to-server VM agent calls; configurable via LIBRARY_MCP_DOWNLOAD_DIR (default: .library), LIBRARY_MCP_TRANSFER_TIMEOUT_MS (default: 60000), LIBRARY_UPLOAD_MAX_BYTES (default: 50MB)
- strengthen-eslint-configuration: Strengthened ESLint with
`eslint-plugin-jsx-a11y` (accessibility warnings for `.tsx`), `eslint-plugin-simple-import-sort` (import ordering as errors, CI-breaking), `@typescript-eslint/consistent-type-imports` (enforces `import type` syntax as errors, auto-fixable), `@typescript-eslint/no-non-null-assertion` (warning); 645 files auto-fixed for import ordering and type imports; a11y rules start as warnings for incremental adoption; `no-console` enforcement for API code (from PR #581) remains in place
- neko-browser-streaming-sidecar: Neko remote browser sidecar for workspaces: VM agent
`internal/browser/` package manages Neko container lifecycle (start/stop/status) per workspace, Docker network attachment, and socat port forwarders syncing DevContainer ports via `/proc/net/tcp` polling; 4 HTTP endpoints (`POST/GET/DELETE /workspaces/{id}/browser`, `GET /workspaces/{id}/browser/ports`); API Worker proxy routes at both project-session level (`/projects/:id/sessions/:sessionId/browser`) and workspace level (`/workspaces/:id/browser`); cloud-init optional Neko image pre-pull (`nekoImage`, `nekoPrePull` variables); `BrowserSidecar` React component with workspace/session dual-mode, mobile viewport detection, Neko iframe embed; `useBrowserSidecar` hook with auto-polling; integrated into `WorkspaceSidebar` collapsible section; configurable via NEKO_IMAGE (default: ghcr.io/m1k1o/neko/google-chrome:latest), NEKO_SCREEN_RESOLUTION (default: 1920x1080), NEKO_MAX_FPS (default: 30), NEKO_WEBRTC_PORT (default: 6080), NEKO_SOCAT_POLL_INTERVAL (default: 5s), NEKO_MIN_RAM_MB (default: 2048), NEKO_ENABLE_AUDIO (default: true), NEKO_TCP_FALLBACK (default: true), NEKO_MUX_PORT (default: 59000), NEKO_NAT1TO1 (default: auto-detect public IP), BROWSER_PROXY_TIMEOUT_MS (default: 30000)
- per-project-scaling-provider-locations: Per-project scaling parameters and provider-aware location validation:
`PROVIDER_LOCATIONS` registry in shared constants maps each provider (hetzner, scaleway, gcp) to valid locations; `isValidLocationForProvider()`, `getLocationsForProvider()`, `getDefaultLocationForProvider()` validation functions; 9 new nullable columns on projects table (defaultLocation + 8 scaling params: taskExecutionTimeoutMs, maxConcurrentTasks, maxDispatchDepth, maxSubTasksPerTask, warmNodeTimeoutMs, maxWorkspacesPerNode, nodeCpuThresholdPercent, nodeMemoryThresholdPercent); `resolveProjectScalingConfig()` helper for project→env→default fallback chain; `SCALING_PARAMS` registry with ScalingParamMeta for UI generation; API validation on PATCH projects, POST nodes, POST tasks (submit + run); TaskRunner DO uses projectScaling for node capacity thresholds and warm timeout; MCP dispatch tools use project overrides for depth/concurrency/sub-task limits; NodeLifecycle DO accepts warmTimeoutOverrideMs; ScalingSettings UI component with provider/location dropdowns, task limit fields, node scheduling fields, platform defaults as placeholders, per-field reset; location resolution: explicit override → project defaultLocation → provider default → platform default
- task-submission-file-attachments: Task submission file attachments via R2 presigned uploads:
`POST /api/projects/:id/tasks/request-upload` generates presigned PUT URL for direct browser→R2 upload; `uploadAttachmentToR2()` client function with XHR progress events; `TaskSubmitForm` and `ProjectChat` attachment UI (paperclip button, progress chips, file validation); `validateAttachments()` R2 HEAD checks at submit time; `attachment_transfer` execution step in TaskRunner DO (between `workspace_ready` and `agent_session`) downloads from R2 and uploads to workspace `.private/` via VM agent; augmented initial prompt lists attached files; `cleanupAttachments()` eager R2 delete after transfer; shared types `TaskAttachment`, `RequestAttachmentUploadRequest/Response`, `ATTACHMENT_DEFAULTS`, `SAFE_FILENAME_REGEX`; configurable via R2_ACCESS_KEY_ID, R2_SECRET_ACCESS_KEY, R2_BUCKET_NAME, CF_ACCOUNT_ID, ATTACHMENT_UPLOAD_MAX_BYTES (default: 50MB), ATTACHMENT_UPLOAD_BATCH_MAX_BYTES (default: 200MB), ATTACHMENT_MAX_FILES (default: 20), ATTACHMENT_PRESIGN_EXPIRY_SECONDS (default: 900)
- workspace-file-upload-download: File upload/download for workspace sessions: VM agent
`POST /workspaces/{id}/files/upload` (multipart, `docker exec tee`, no shell interpolation) with configurable per-file max (FILE_UPLOAD_MAX_BYTES, default: 50MB) and batch max (FILE_UPLOAD_BATCH_MAX_BYTES, default: 250MB); `GET /workspaces/{id}/files/download?path=...` with `docker exec cat` and CRLF-stripped Content-Disposition; API proxy routes `POST/GET /api/projects/:id/sessions/:sessionId/files/upload|download` with size-limited streaming; `uploadSessionFiles`/`downloadSessionFile` client functions; Paperclip attach button in `FollowUpInput`; download button in `ChatFilePanel` view mode; `.private` upload destination created during bootstrap (ensureVolumeWritable); safe filename regex rejects shell metacharacters; configurable via FILE_UPLOAD_MAX_BYTES, FILE_UPLOAD_BATCH_MAX_BYTES, FILE_UPLOAD_TIMEOUT (VM agent), FILE_UPLOAD_TIMEOUT_MS, FILE_DOWNLOAD_TIMEOUT_MS, FILE_DOWNLOAD_MAX_BYTES (Worker)
- file-browser-image-rendering: Image rendering in file browser: new
`GET /workspaces/{id}/files/raw` endpoint on VM agent streams binary files with MIME detection (mime.TypeByExtension), ETag/304 support, and SVG Content-Security-Policy; API proxy route `GET /api/projects/:id/sessions/:sessionId/files/raw` with separate 50MB limit (FILE_RAW_PROXY_MAX_BYTES); `ImageViewer` component in `apps/web/src/components/shared-file-viewer/` with fit-to-panel/1:1 toggle, size guardrails (inline < 10MB, click-to-load < 50MB, download-only > 50MB); image detection via `isImageFile()` in `apps/web/src/lib/file-utils.ts`; integrated into both `FileViewerPanel` (workspace view) and `ChatFilePanel` (project chat); image icon in file browser listing; configurable via FILE_RAW_MAX_SIZE (default: 50MB), FILE_RAW_TIMEOUT (default: 60s), FILE_RAW_PROXY_MAX_BYTES (default: 50MB), VITE_FILE_PREVIEW_INLINE_MAX_BYTES (default: 10MB), VITE_FILE_PREVIEW_LOAD_MAX_BYTES (default: 50MB)
- file-browsing-diff-views-in-chat: Inline file viewer in project chat: four API proxy routes (
`GET /api/projects/:id/sessions/:sessionId/files/list`, `files/view`, `git/status`, `git/diff`) resolve the session's workspace via D1 workspaces table, generate a terminal token, and proxy to VM agent; path sanitization via `normalizeProjectFilePath`; configurable via FILE_PROXY_TIMEOUT_MS (default: 15000), FILE_PROXY_MAX_RESPONSE_BYTES (default: 2097152); new `ChatFilePanel` slide-over component (browse/view/diff/git-status modes) accessed from session header "Files"/"Git" buttons and clickable file refs in `ToolCallCard`; shared `DiffRenderer` extracted to `apps/web/src/components/shared-file-viewer/` and reused by both `GitDiffView` (workspace view) and `ChatFilePanel` (chat view)
- global-persistent-audio-player: Global persistent TTS audio player:
`GlobalAudioProvider` wraps the app above the router (App.tsx); `GlobalAudioPlayer` bar renders in `AppShell` (mobile: below main via flexbox, desktop: spanning full width via CSS Grid row 2); audio survives page navigation; three callers migrated from per-component `useAudioPlayback` to `useGlobalAudio` (`ProjectMessageView`, `TruncatedSummary`, `TaskDetail`); `MessageActions` and `MessageBubble` in `acp-client` accept new `onPlayAudio` callback prop to delegate to external player; new `--sam-z-player: 15` token added to design system; slide-in animation with `prefers-reduced-motion` support
- analytics-engine-phase4-forwarding: Analytics Engine Phase 4: external event forwarding; daily cron job (
`0 3 * * *`) queries Analytics Engine for key conversion events (signup, login, project_created, workspace_created, task_submitted) and batch-forwards them to Segment (Track API with Basic auth) and/or GA4 (Measurement Protocol); cursor-based deduplication via KV; new service `analytics-forward.ts` with `runAnalyticsForward()` orchestrator; admin dashboard `ForwardingStatus` card showing enabled state, last-forwarded timestamp, and destination configuration; `GET /api/admin/analytics/forward-status` endpoint; configurable via ANALYTICS_FORWARD_ENABLED (default: false), ANALYTICS_FORWARD_EVENTS, ANALYTICS_FORWARD_LOOKBACK_HOURS (default: 25), SEGMENT_WRITE_KEY, SEGMENT_API_URL, SEGMENT_MAX_BATCH_SIZE (default: 100), GA4_MEASUREMENT_ID, GA4_API_SECRET, GA4_API_URL, GA4_MAX_BATCH_SIZE (default: 25)
- analytics-engine-phase3-dashboards: Analytics Engine Phase 3: dashboard visualizations for feature adoption (horizontal bars + sparklines for feature-event trends), geographic distribution (country-level user breakdown from CF headers), and weekly retention cohorts (heat-map cohort table with server-computed retention matrix); three new API endpoints (
`/api/admin/analytics/feature-adoption`, `/geo`, `/retention`) using Cloudflare Analytics Engine SQL API; `AdminAnalytics.tsx` refactored from monolithic file to `admin-analytics/` directory with individual chart components; configurable via ANALYTICS_GEO_LIMIT (default: 50), ANALYTICS_RETENTION_WEEKS (default: 12)
- chat-idea-association: Many-to-many chat session ↔ idea (task) linking:
`chat_session_ideas` junction table in ProjectData DO SQLite (migration 012); 4 MCP tools (`link_idea`, `unlink_idea`, `list_linked_ideas`, `find_related_ideas`) for agents to manage associations mid-conversation; REST endpoints `GET/POST /sessions/:id/ideas`, `DELETE /sessions/:id/ideas/:taskId`, `GET /tasks/:id/sessions` for UI; batch D1 enrichment with `inArray()`; shared `SessionIdeaLink` type; configurable via MCP_IDEA_CONTEXT_MAX_LENGTH (default: 500)
- message-materialization-fts5: Post-session message materialization + FTS5 full-text search: when a session stops,
`materializeSession()` groups raw streaming tokens into `chat_messages_grouped` and populates `chat_messages_grouped_fts` FTS5 virtual table; `searchMessages()` uses FTS5 MATCH for materialized sessions with LIKE fallback for active sessions; `materializeAllStopped()` backfills existing data; migration 011 adds tables + `materialized_at` column; configurable via MCP_MESSAGE_SEARCH_MAX (default: 20)
- token-message-concatenation: MCP
`get_session_messages` now groups consecutive streaming tokens (assistant, tool, thinking roles) into logical messages via `groupTokensIntoMessages()` before returning to agents; configurable via MCP_MESSAGE_LIST_LIMIT (default: 50), MCP_MESSAGE_LIST_MAX (default: 200)
- notification-system-phase2: Agent-initiated notifications:
`request_human_input` MCP tool for agents to signal when blocked/need decisions (high-urgency needs_input notification); progress notification emission from `update_task_status` with batching (one per task per 5 min window); session_ended notification on conversation-mode `complete_task` remap; task_complete deduplication (60s window); notification grouping by project in NotificationCenter UI; configurable via NOTIFICATION_PROGRESS_BATCH_WINDOW_MS, NOTIFICATION_MIN_SESSION_DURATION_MS, NOTIFICATION_DEDUP_WINDOW_MS
- 027-do-session-ownership: DO-owned ACP session lifecycle: shifts session state machine (pending→assigned→running→completed/failed/interrupted) from VM agent in-memory maps to ProjectData DO SQLite; heartbeat-based VM failure detection via DO alarm; session forking with lineage tracking; workspace-project binding enforcement; configurable via ACP_SESSION_DETECTION_WINDOW_MS, ACP_SESSION_MAX_FORK_DEPTH
- codex-token-refresh-proxy: Centralized Codex OAuth token refresh proxy —
`POST /api/auth/codex-refresh` receives refresh requests from Codex instances in workspaces, serializes them per user via `CodexRefreshLock` Durable Object (keyed by userId), and proxies to OpenAI; prevents rotating refresh token race condition where concurrent refreshes permanently invalidate tokens; auth via workspace callback token in `?token=` query param; compares request's refresh_token with stored credential: match → forward to OpenAI and store new tokens, stale → return latest from DB, missing → 401; VM agent injects `CODEX_REFRESH_TOKEN_URL_OVERRIDE` env var for openai-codex oauth-token sessions; configurable via CODEX_REFRESH_PROXY_ENABLED (kill switch, default: enabled), CODEX_REFRESH_LOCK_TIMEOUT_MS (default: 30000), CODEX_REFRESH_UPSTREAM_URL (default: https://auth.openai.com/oauth/token), CODEX_REFRESH_UPSTREAM_TIMEOUT_MS (default: 10000), CODEX_CLIENT_ID (default: app_EMoamEEZ73f0CkXaXp7hrann)
- codex-oauth-token-sync: Post-session credential sync-back for file-based agent credentials (e.g., codex-acp auth.json); reads updated auth file from container after session ends via
`syncCredentialOnStop()`, sends to API via `POST /api/workspaces/:id/agent-credential-sync` with callbackretry; re-encrypts with fresh AES-GCM IV on change; guards: injectionMode=auth-file + CredentialSyncer configured; best-effort (errors logged, teardown not blocked)
- llm-task-title-generation: AI-powered task title generation via Cloudflare Workers AI (Mastra + workers-ai-provider + @cf/google/gemma-3-12b-it by default); generates concise titles (≤100 chars) from full message text at task submit time; falls back to truncation on failure or timeout; short messages (≤100 chars) bypass AI; configurable via TASK_TITLE_MODEL, TASK_TITLE_MAX_LENGTH, TASK_TITLE_TIMEOUT_MS, TASK_TITLE_GENERATION_ENABLED, TASK_TITLE_SHORT_MESSAGE_THRESHOLD
- fix-streaming-token-ordering: ACP notification serialization via orderedPipe in VM agent; wraps agent stdout with a serializing pipe that waits for each session/update handler to complete before delivering the next, preventing the ACP SDK's concurrent goroutine dispatch from reordering streaming tokens; configurable via
`ACP_NOTIF_SERIALIZE_TIMEOUT` (default: 5s)
- 023-admin-observability: Admin observability dashboard: error storage in D1, health overview, error list with filtering, historical log viewer via CF API proxy, real-time log stream via AdminLogs DO + Tail Worker, error trends visualization
- 022-simplified-chat-ux: Chat-first UX — project page is now a chat interface (no tabs), dashboard shows project cards, descriptive branch naming (sam/...), idle auto-push safety net (15 min DO alarm), settings drawer, agent completion git push + PR creation, gh CLI injection + token refresh wrapper, finalization guard for idempotent git push results
- 021-task-chat-architecture: Task-driven chat with autonomous workspace execution, warm node pooling, project chat view, kanban board, task submit form, project default VM size
- 018-project-first-architecture: Added TypeScript 5.x (Worker/Web), Go 1.24+ (VM Agent) + Hono (API framework), Drizzle ORM (D1), React + Vite (Web), Cloudflare Workers SDK (Durable Objects)
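
The trial-sse-events-fix entry turns on the difference between named and unnamed SSE frames. The following is a minimal sketch of that difference, not the repository's implementation: the `TrialEvent` shape here is a simplified assumption, and `formatNamedSse` is a hypothetical name for the old behavior.

```typescript
// Hypothetical, simplified event shape; the real TrialEvent type may carry more fields.
type TrialEvent =
  | { type: "trial.knowledge"; text: string }
  | { type: "trial.error"; message: string };

// Old behavior (illustrative): a named frame. A browser EventSource delivers this
// only to addEventListener("trial.knowledge", ...), never to onmessage.
function formatNamedSse(event: TrialEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// New behavior: an unnamed frame, which EventSource dispatches to onmessage.
// The payload's own `type` field still tells the client what kind of event it is.
function formatSse(event: TrialEvent): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}

const frame = formatSse({ type: "trial.knowledge", text: "Primary language: Go" });

// Decode the frame the way a client would: strip the "data: " prefix, then parse JSON.
const decoded = JSON.parse(frame.slice("data: ".length).trim()) as TrialEvent;
```

On the client, a single `source.onmessage` handler can then switch on `decoded.type` instead of registering one listener per event name.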
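
The durable-interrupts-phase1 entry mentions a `DELIVERY_STATE_TRANSITIONS` map that enforces valid delivery-state changes. The list names the states and the map but not its contents, so the table below is an assumption made for illustration (in particular the `delivered` → `queued` re-delivery edge), and `canTransition` / `transition` are hypothetical helper names.

```typescript
type DeliveryState = "queued" | "delivered" | "acked" | "expired";

// Assumed transition table; the real map in the repository may allow different edges.
const DELIVERY_STATE_TRANSITIONS: Record<DeliveryState, readonly DeliveryState[]> = {
  queued: ["delivered", "expired"],
  // "queued" here models re-delivery after an ack timeout (assumption).
  delivered: ["acked", "queued", "expired"],
  acked: [], // terminal
  expired: [], // terminal
};

// True when the table lists `to` as a valid successor of `from`.
function canTransition(from: DeliveryState, to: DeliveryState): boolean {
  return DELIVERY_STATE_TRANSITIONS[from].includes(to);
}

// Returns the new state, or throws if the table does not allow the change.
function transition(from: DeliveryState, to: DeliveryState): DeliveryState {
  if (!canTransition(from, to)) {
    throw new Error(`Invalid delivery transition: ${from} -> ${to}`);
  }
  return to;
}
```

Keeping the allowed edges in one table means the alarm-driven sweep and the MCP tools can share a single check rather than each encoding its own rules.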
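
Several entries above describe the same fallback pattern, for example the `dispatch_task` precedence "explicit field → agent profile → project default → platform default". A minimal sketch of that chain follows; `ConfigLayers` and `resolveWithPrecedence` are hypothetical names, and the location values are illustrative only.

```typescript
// Hypothetical container for one setting as seen at each layer.
// null or undefined means "not set at this layer".
interface ConfigLayers<T> {
  explicit?: T | null;
  agentProfile?: T | null;
  projectDefault?: T | null;
  platformDefault: T; // always present, so resolution cannot fail
}

// Walk the layers from most specific to least specific and return the first set value.
function resolveWithPrecedence<T>(layers: ConfigLayers<T>): T {
  return (
    layers.explicit ??
    layers.agentProfile ??
    layers.projectDefault ??
    layers.platformDefault
  );
}

// Nothing explicit and no profile value: the project default wins over the platform default.
const vmLocation = resolveWithPrecedence<string>({
  explicit: undefined,
  agentProfile: null,
  projectDefault: "fsn1",
  platformDefault: "nbg1",
});
```

Using `??` rather than `||` keeps legitimate falsy values such as `0` or an empty string from being skipped.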
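
The codex-token-refresh-proxy entry describes a three-way comparison between the refresh token in the request and the stored credential (match → forward, stale → return latest, missing → 401). The sketch below captures only that decision; `RefreshDecision` and `decideRefresh` are hypothetical names, and treating "missing" as "no stored credential" is an assumption.

```typescript
type RefreshDecision =
  | { action: "forward" } // tokens match: proxy upstream, then store the new tokens
  | { action: "return_latest"; refreshToken: string } // request is stale: hand back the stored token
  | { action: "reject"; status: 401 }; // nothing stored for this user (assumed meaning of "missing")

function decideRefresh(requestToken: string, storedToken: string | null): RefreshDecision {
  if (storedToken === null) {
    return { action: "reject", status: 401 };
  }
  if (requestToken === storedToken) {
    return { action: "forward" };
  }
  // Another refresh already rotated the token, so avoid a second upstream call.
  return { action: "return_latest", refreshToken: storedToken };
}
```

Because requests are serialized per user, a second concurrent refresh would see the already-rotated token and take the `return_latest` branch instead of invalidating it upstream.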