Architecture

Updated for round-2 features: streaming, OpenAI provider, cost/fork, MCP, cargo test hook.

The whole agent is roughly:

            ┌─────────────────────────────────────────────────┐
   stdin -> │ run::chat (REPL)                                 │
            │   ↓                                              │
            │ Agent::turn(prompt)                              │
            │   loop:                                          │
            │     Provider::complete(messages, tools, system) ─┼─► Anthropic API
            │     ↓                                            │
            │     for ToolUse block:                           │
            │       Permission::check ─→ allow / deny / prompt │
            │       Tool::invoke ─→ ToolOutput                 │
            │       CargoVerifier::maybe_annotate (if .rs)     │
            │     append tool_result blocks                    │
            │     break when stop_reason != tool_use           │
            └─────────────────────────────────────────────────┘
                                  │
                          ToolContext { cwd }

Ten modules. Each does one thing.

provider

Defines Message, ContentBlock, ToolSchema, CompletionRequest/Response, and the Provider trait. Two implementations:

  • anthropic.rs — POSTs to /v1/messages, deserialises the response, surfaces useful errors.
  • mock.rs — replays a Vec<CannedResponse> for tests.

The trait is intentionally tiny so a third provider can be added in ~100 lines. There is no provider abstraction over streaming — that comes after the protocol stabilises in non-streaming form.
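To make the "tiny trait" claim concrete, here is a minimal sketch of what such a trait and a replaying mock could look like. It is synchronous and std-only for brevity. The field layouts, the StopReason enum, the String error type, and the use of CompletionResponse as the canned item are assumptions for illustration, not the crate's actual definitions.

```rust
use std::collections::VecDeque;

/// One piece of assistant output: text, or a request to run a tool.
#[derive(Debug, Clone, PartialEq)]
pub enum ContentBlock {
    Text(String),
    ToolUse { id: String, name: String, input: String },
}

/// Hypothetical stop reasons; the loop only cares whether it is ToolUse.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StopReason {
    EndTurn,
    ToolUse,
}

/// Simplified request: the real one also carries tool schemas and typed messages.
pub struct CompletionRequest {
    pub system: String,
    pub messages: Vec<String>,
}

#[derive(Debug, Clone, PartialEq)]
pub struct CompletionResponse {
    pub content: Vec<ContentBlock>,
    pub stop_reason: StopReason,
}

/// The whole provider surface in this sketch: one method.
pub trait Provider {
    fn complete(&mut self, req: &CompletionRequest) -> Result<CompletionResponse, String>;
}

/// Replays canned responses in order, then errors once they run out.
pub struct MockProvider {
    canned: VecDeque<CompletionResponse>,
}

impl MockProvider {
    pub fn new(canned: Vec<CompletionResponse>) -> Self {
        Self { canned: canned.into() }
    }
}

impl Provider for MockProvider {
    fn complete(&mut self, _req: &CompletionRequest) -> Result<CompletionResponse, String> {
        self.canned
            .pop_front()
            .ok_or_else(|| "mock provider exhausted".to_string())
    }
}
```

A test would construct MockProvider::new(vec![...]) with one response per expected model call; an unexpected extra call then surfaces as an error instead of hanging.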

tools

Tool is an async trait. Each tool returns ToolOutput { stdout, summary, is_error }. Tools also declare a SideEffect (Pure / Read / Mutating) that controls whether the permission policy is consulted.

ToolRegistry::defaults() registers the canonical set: read, write, edit, bash, ls, glob, grep. Anything beyond this is opt-in.

dispatch() is the entry point: it looks up the tool, checks permission, invokes, and returns a DispatchOutcome.
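The lookup, permission check, and invoke sequence can be sketched as follows. This is a synchronous simplification (the real Tool is async). The DispatchOutcome variants, the permit callback, and the Echo and FakeWrite tools are hypothetical stand-ins chosen for the example.

```rust
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SideEffect {
    Pure,
    Read,
    Mutating,
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToolOutput {
    pub stdout: String,
    pub summary: String,
    pub is_error: bool,
}

pub trait Tool {
    fn name(&self) -> &str;
    fn side_effect(&self) -> SideEffect;
    fn invoke(&self, input: &str) -> ToolOutput;
}

/// Hypothetical outcome type; the variant names are assumptions.
#[derive(Debug, PartialEq)]
pub enum DispatchOutcome {
    Ran(ToolOutput),
    Denied(String),
    UnknownTool(String),
}

#[derive(Default)]
pub struct ToolRegistry {
    tools: HashMap<String, Box<dyn Tool>>,
}

impl ToolRegistry {
    pub fn register(&mut self, tool: Box<dyn Tool>) {
        let name = tool.name().to_string();
        self.tools.insert(name, tool);
    }

    /// Look up the tool, consult `permit` only for Mutating tools, then invoke.
    pub fn dispatch(
        &self,
        name: &str,
        input: &str,
        permit: &dyn Fn(&str, &str) -> Result<(), String>,
    ) -> DispatchOutcome {
        let tool = match self.tools.get(name) {
            Some(t) => t,
            None => return DispatchOutcome::UnknownTool(name.to_string()),
        };
        if tool.side_effect() == SideEffect::Mutating {
            if let Err(reason) = permit(name, input) {
                return DispatchOutcome::Denied(reason);
            }
        }
        DispatchOutcome::Ran(tool.invoke(input))
    }
}

/// Example Pure tool: returns its input unchanged.
pub struct Echo;

impl Tool for Echo {
    fn name(&self) -> &str { "echo" }
    fn side_effect(&self) -> SideEffect { SideEffect::Pure }
    fn invoke(&self, input: &str) -> ToolOutput {
        ToolOutput {
            stdout: input.to_string(),
            summary: format!("echoed {} bytes", input.len()),
            is_error: false,
        }
    }
}

/// Example Mutating tool: pretends to write a file.
pub struct FakeWrite;

impl Tool for FakeWrite {
    fn name(&self) -> &str { "fake_write" }
    fn side_effect(&self) -> SideEffect { SideEffect::Mutating }
    fn invoke(&self, input: &str) -> ToolOutput {
        ToolOutput {
            stdout: String::new(),
            summary: format!("wrote {}", input),
            is_error: false,
        }
    }
}
```

Keeping the permission check inside dispatch means no caller can reach a Mutating tool without passing through the policy.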

permission

PermissionPolicy::check(tool, input) returns Allow or Deny(reason).

Order of evaluation:

  1. Pure and Read tools always allow.
  2. deny patterns from config — first match wins.
  3. allow patterns from config.
  4. Session-allow list (populated when the user presses "a" at a prompt).
  5. Mode dispatch:
    • Yolo → allow
    • NonInteractive → deny
    • Interactive → prompt the user

Patterns are "<tool>:<glob>". For bash, the right side matches the command string. For path-taking tools, it matches the path argument.
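The evaluation order above, as a std-only sketch. Three things here are assumptions: the glob matcher supports only the * wildcard, session-allow entries are stored in the same "<tool>:<glob>" form, and the interactive case returns a Prompt variant instead of actually asking the user (the real check resolves to Allow or Deny).

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum SideEffect { Pure, Read, Mutating }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Mode { Yolo, NonInteractive, Interactive }

#[derive(Debug, Clone, PartialEq)]
pub enum Decision { Allow, Deny(String), Prompt }

/// Minimal glob: '*' matches any run of characters, everything else is literal.
fn glob_match(pattern: &str, text: &str) -> bool {
    let p: Vec<char> = pattern.chars().collect();
    let t: Vec<char> = text.chars().collect();
    let (mut pi, mut ti) = (0, 0);
    // (pattern index after the last '*', text index that '*' has consumed up to)
    let mut star: Option<(usize, usize)> = None;
    while ti < t.len() {
        if pi < p.len() && p[pi] == '*' {
            star = Some((pi + 1, ti));
            pi += 1;
        } else if pi < p.len() && p[pi] == t[ti] {
            pi += 1;
            ti += 1;
        } else if let Some((sp, st)) = star {
            // Backtrack: let the last '*' swallow one more character.
            pi = sp;
            ti = st + 1;
            star = Some((sp, st + 1));
        } else {
            return false;
        }
    }
    while pi < p.len() && p[pi] == '*' {
        pi += 1;
    }
    pi == p.len()
}

/// A pattern is "<tool>:<glob>"; the glob is matched against the command or path.
fn pattern_matches(pattern: &str, tool: &str, arg: &str) -> bool {
    match pattern.split_once(':') {
        Some((t, glob)) => t == tool && glob_match(glob, arg),
        None => false,
    }
}

pub struct PermissionPolicy {
    pub deny: Vec<String>,
    pub allow: Vec<String>,
    pub session_allow: Vec<String>,
    pub mode: Mode,
}

impl PermissionPolicy {
    pub fn check(&self, tool: &str, effect: SideEffect, arg: &str) -> Decision {
        // 1. Pure and Read tools always allow.
        if effect != SideEffect::Mutating {
            return Decision::Allow;
        }
        // 2. Deny patterns, first match wins.
        if let Some(p) = self.deny.iter().find(|p| pattern_matches(p.as_str(), tool, arg)) {
            return Decision::Deny(format!("matched deny pattern {}", p));
        }
        // 3. Allow patterns, then 4. the session-allow list.
        let allowed = self.allow.iter().chain(self.session_allow.iter());
        if allowed.into_iter().any(|p| pattern_matches(p.as_str(), tool, arg)) {
            return Decision::Allow;
        }
        // 5. Mode dispatch.
        match self.mode {
            Mode::Yolo => Decision::Allow,
            Mode::NonInteractive => Decision::Deny("non-interactive mode".to_string()),
            Mode::Interactive => Decision::Prompt,
        }
    }
}
```

Because deny is evaluated before allow and before the mode, a deny pattern such as "bash:rm *" still blocks the command in Yolo mode.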

verifier

The differentiator. CargoVerifier::maybe_annotate(tool_name, input, ctx, &mut out) is called after every tool invocation. It:

  1. Returns immediately unless the tool was write or edit on a .rs file.
  2. Debounces: skips if the last verifier run was less than debounce_secs ago.
  3. Spawns cargo check --message-format=json -q with a timeout.
  4. Parses the JSON output for compiler-message records.
  5. Appends a --- cargo verifier --- section to the tool's stdout. If errors occurred, sets is_error = true.

The model thus sees compiler diagnostics in the same turn as the offending edit, rather than at some later prompt. In practice, this largely eliminates the "edit → forget → ten turns later, oh no" failure mode that agents without an inline check are prone to.

The agent's system prompt tells the model that this hook exists and that it should never declare a task done while the verifier shows errors.
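A sketch of the gating and annotation steps (1, 2, and 5 in the list above), with the cargo invocation and JSON parsing left out. VerifierGate, should_run, and annotate are hypothetical names, and passing the clock in as a parameter is a choice made here so the debounce can be tested without sleeping.

```rust
use std::path::Path;
use std::time::{Duration, Instant};

#[derive(Debug, Clone, PartialEq)]
pub struct ToolOutput {
    pub stdout: String,
    pub summary: String,
    pub is_error: bool,
}

/// Decides whether a verifier run is due. Hypothetical helper type.
pub struct VerifierGate {
    debounce: Duration,
    last_run: Option<Instant>,
}

impl VerifierGate {
    pub fn new(debounce_secs: u64) -> Self {
        Self { debounce: Duration::from_secs(debounce_secs), last_run: None }
    }

    /// True only for write/edit on a .rs file, and only if the previous
    /// run started at least `debounce` ago. Records `now` when it returns true.
    pub fn should_run(&mut self, tool_name: &str, path: &str, now: Instant) -> bool {
        let is_rust_file = Path::new(path).extension().and_then(|e| e.to_str()) == Some("rs");
        if !matches!(tool_name, "write" | "edit") || !is_rust_file {
            return false;
        }
        if let Some(last) = self.last_run {
            if now.duration_since(last) < self.debounce {
                return false;
            }
        }
        self.last_run = Some(now);
        true
    }
}

/// Append the verifier section to the tool's stdout and flag errors.
/// `diagnostics` stands in for already-rendered compiler messages.
pub fn annotate(out: &mut ToolOutput, diagnostics: &[String], has_errors: bool) {
    out.stdout.push_str("\n--- cargo verifier ---\n");
    if diagnostics.is_empty() {
        out.stdout.push_str("no diagnostics\n"); // placeholder wording
    }
    for d in diagnostics {
        out.stdout.push_str(d);
        out.stdout.push('\n');
    }
    if has_errors {
        out.is_error = true;
    }
}
```

Setting is_error on the edit's own result is what puts the compiler failure in front of the model in the same turn as the edit that caused it.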

agent

Agent::turn(user_input) drives the loop until the model emits a non-tool stop reason or max_turns is hit. Each iteration:

  1. Calls the provider.
  2. Renders any Text blocks via ui and writes them to the journal.
  3. For every ToolUse block: dispatches, runs the verifier hook, renders the result, writes to journal, and accumulates tool_result blocks.
  4. Pushes the assistant turn and the tool-results turn onto history.

Each Agent carries the system prompt, history, registry, policy, verifier, journal, and cumulative Usage. The system prompt is built once at construction (it embeds cwd, the day's date, and MICROCODE.md content).
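The loop can be sketched as a free function, with the provider and the tool dispatcher passed in as closures so the example stays self-contained. Rendering, journaling, and the verifier hook are omitted. The Block, Message, and Role types are simplified stand-ins, and placing tool results in a user-role message follows the Anthropic Messages API convention.

```rust
#[derive(Debug, Clone, PartialEq)]
pub enum Block {
    Text(String),
    ToolUse { id: String, name: String, input: String },
    ToolResult { id: String, output: String },
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum StopReason { EndTurn, ToolUse }

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role { User, Assistant }

#[derive(Debug, Clone, PartialEq)]
pub struct Message {
    pub role: Role,
    pub content: Vec<Block>,
}

pub struct Response {
    pub content: Vec<Block>,
    pub stop_reason: StopReason,
}

/// Drive one user turn. Returns the number of provider calls made,
/// or an error if `max_turns` is exhausted while the model still wants tools.
pub fn turn(
    history: &mut Vec<Message>,
    user_input: &str,
    max_turns: usize,
    mut complete: impl FnMut(&[Message]) -> Response,
    mut run_tool: impl FnMut(&str, &str) -> String,
) -> Result<usize, String> {
    history.push(Message {
        role: Role::User,
        content: vec![Block::Text(user_input.to_string())],
    });
    for iteration in 1..=max_turns {
        let resp = complete(history.as_slice());
        // Dispatch every ToolUse block and collect the matching results.
        let mut results = Vec::new();
        for block in &resp.content {
            if let Block::ToolUse { id, name, input } = block {
                let output = run_tool(name.as_str(), input.as_str());
                results.push(Block::ToolResult { id: id.clone(), output });
            }
        }
        let stop = resp.stop_reason;
        // Push the assistant turn, then the tool-results turn.
        history.push(Message { role: Role::Assistant, content: resp.content });
        if !results.is_empty() {
            history.push(Message { role: Role::User, content: results });
        }
        if stop != StopReason::ToolUse {
            return Ok(iteration);
        }
    }
    Err(format!("max_turns ({}) reached", max_turns))
}
```

With a provider that asks for one tool call and then finishes, history ends up with four messages: the user prompt, the assistant's tool request, the tool result, and the final assistant text.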

journal

A markdown file under .microcode/journal/<timestamp>.md. Each turn and each tool call gets a ## section. Tool bodies are truncated at 4 KB.

The journal is not read back into context. It exists for human auditability.
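Truncating a tool body at a byte limit needs a little care with UTF-8, since cutting in the middle of a multi-byte character would panic on slicing. A minimal sketch, assuming 4 KB means 4096 bytes; the heading text and the "[truncated]" marker are illustrative, not the journal's actual format.

```rust
/// Assumed limit: 4 KB taken as 4096 bytes.
pub const MAX_TOOL_BODY_BYTES: usize = 4 * 1024;

/// Return the longest prefix of `body` that fits in `max_bytes`
/// without splitting a UTF-8 character.
pub fn truncate_body(body: &str, max_bytes: usize) -> &str {
    if body.len() <= max_bytes {
        return body;
    }
    let mut end = max_bytes;
    // Index 0 is always a char boundary, so this loop terminates.
    while !body.is_char_boundary(end) {
        end -= 1;
    }
    &body[..end]
}

/// Render one tool call as a "##" journal section (hypothetical layout).
pub fn tool_section(tool: &str, body: &str) -> String {
    let shown = truncate_body(body, MAX_TOOL_BODY_BYTES);
    let mut section = format!("## tool: {}\n\n{}\n", tool, shown);
    if shown.len() < body.len() {
        section.push_str("\n[truncated]\n");
    }
    section
}
```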

config

Three layers, last wins: defaults → ~/.config/microcode/config.toml → <cwd>/.microcode/config.toml. Then CLI flags override the merged result.

Config mirrors the shape of the TOML file, with optional fields; the model() / max_tokens() / max_turns() accessors return the built-in default when a field is None.
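A sketch of "last wins" layering over Option fields, without the TOML parsing. The merged_with method and the three default constants are hypothetical; the default values are placeholders rather than the project's real defaults.

```rust
// Placeholder defaults, not the project's actual values.
const DEFAULT_MODEL: &str = "example-model";
const DEFAULT_MAX_TOKENS: u32 = 4096;
const DEFAULT_MAX_TURNS: u32 = 25;

/// Every field is optional so that a layer can leave it unset.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct Config {
    pub model: Option<String>,
    pub max_tokens: Option<u32>,
    pub max_turns: Option<u32>,
}

impl Config {
    /// Overlay `over` on top of `self`: any field set in `over` wins.
    pub fn merged_with(self, over: Config) -> Config {
        Config {
            model: over.model.or(self.model),
            max_tokens: over.max_tokens.or(self.max_tokens),
            max_turns: over.max_turns.or(self.max_turns),
        }
    }

    pub fn model(&self) -> &str {
        self.model.as_deref().unwrap_or(DEFAULT_MODEL)
    }

    pub fn max_tokens(&self) -> u32 {
        self.max_tokens.unwrap_or(DEFAULT_MAX_TOKENS)
    }

    pub fn max_turns(&self) -> u32 {
        self.max_turns.unwrap_or(DEFAULT_MAX_TURNS)
    }
}
```

Layering then reads left to right, for example Config::default().merged_with(user).merged_with(project).merged_with(cli), where each argument is the Config parsed from that source.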

run

The CLI entry points. chat builds an Agent with PromptMode::Interactive, spins up rustyline, and dispatches to Agent::turn or to a slash-command handler. one_shot builds with PromptMode::NonInteractive, takes a single prompt, and exits.

init_project writes a minimal .microcode/config.toml and MICROCODE.md. config_cmd exposes show / path / edit.

cost

Token-to-USD math. Reads from the per-model ModelPrice table; supports user overrides under [pricing."model"] in config.
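The arithmetic can be sketched as below, assuming prices are quoted in USD per million tokens and that a user override replaces the built-in entry for that model. The field names and the prices used in the usage note are illustrative only.

```rust
use std::collections::HashMap;

/// Price per million tokens, in USD (assumed unit).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct ModelPrice {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
}

#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct Usage {
    pub input_tokens: u64,
    pub output_tokens: u64,
}

/// Look up a price, preferring the user's override over the built-in table.
pub fn price_for(
    model: &str,
    builtin: &HashMap<String, ModelPrice>,
    overrides: &HashMap<String, ModelPrice>,
) -> Option<ModelPrice> {
    overrides.get(model).or_else(|| builtin.get(model)).copied()
}

/// Convert token counts to USD.
pub fn cost_usd(usage: &Usage, price: &ModelPrice) -> f64 {
    let input = usage.input_tokens as f64 * price.input_per_mtok;
    let output = usage.output_tokens as f64 * price.output_per_mtok;
    (input + output) / 1_000_000.0
}
```

For example, 1,000,000 input tokens at 3.00 per million plus 500,000 output tokens at 15.00 per million comes to 3.00 + 7.50 = 10.50 USD.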

mcp

Minimal stdio JSON-RPC client. Discovers tools via tools/list after the initialize handshake, then exposes each as mcp__<server>__<tool> in the shared registry. Failed servers log and are skipped.
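Two small pieces of this are easy to sketch: the registry naming scheme and the tools/list request. The function names are hypothetical, and the JSON is built by hand only to keep the example dependency-free; a real client would more likely use a JSON library.

```rust
/// Build the registry name for an MCP tool: mcp__<server>__<tool>.
pub fn registry_name(server: &str, tool: &str) -> String {
    format!("mcp__{}__{}", server, tool)
}

/// Split a registry name back into (server, tool).
/// Splits at the first "__" after the prefix, so a server name that itself
/// contains "__" would be ambiguous under this scheme.
pub fn split_registry_name(name: &str) -> Option<(&str, &str)> {
    let rest = name.strip_prefix("mcp__")?;
    rest.split_once("__")
}

/// JSON-RPC 2.0 request for tool discovery, sent after the initialize handshake.
pub fn tools_list_request(id: u64) -> String {
    format!(r#"{{"jsonrpc":"2.0","id":{},"method":"tools/list"}}"#, id)
}
```

The mcp__ prefix keeps external tools from colliding with the built-in names such as read or bash in the shared registry.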

What's not here

  • Prompt caching. We read the cache-related usage fields but don't yet set cache_control markers. Roadmap.
  • TUI. Still a separate project.
  • OAuth-based subscription auth (à la Claude Code's claude login). Microcode uses raw API keys.
  • Auto-compaction. When the conversation approaches model limits, we currently just fail; auto-compact with a summary turn is a plausible add.