| Agent | Repo | Commit | Branch | Clone date |
|---|---|---|---|---|
| OpenClaw | github.com/openclaw/openclaw | 8b2d24b | main | 2026-04-01 |
| Hermes Agent | github.com/NousResearch/hermes-agent | HEAD at review | main | 2026-04-01 |
OpenClaw and Hermes Agent solve the same fundamental problem — making stateless LLMs behave like persistent, tool-using assistants — but they diverge sharply in implementation language, storage architecture, memory philosophy, extensibility model, and security posture. OpenClaw is a TypeScript/Node.js system built around a WebSocket gateway daemon with file-based memory and Docker sandboxing. Hermes is a Python system built around a synchronous agent loop with SQLite-backed sessions, bounded tool-managed memory, and multiple terminal backends. Both support multi-channel messaging, skills, sub-agents, cron scheduling, and context compression — but the internals are architecturally distinct.
This document is a technical comparison. It assumes familiarity with both systems or the companion OpenClaw Architecture Deep Dive.
| Dimension | OpenClaw | Hermes Agent |
|---|---|---|
| Language | TypeScript / Node.js | Python 3.11+ |
| Runtime model | Single async daemon (Gateway) | Synchronous agent loop + async gateway |
| Primary storage | JSONL transcripts + Markdown memory files | SQLite (WAL mode) + Markdown memory files |
| Package | npm (openclaw) | pip/uv (hermes-agent[all]) |
| Binary distribution | Node.js required | Python required; Node.js optional (for gateway platforms) |
| License | Source-available (custom) | MIT |
| Origin | Independent project | Nous Research (Hermes is an explicit OpenClaw-inspired fork/reimplementation) |
OpenClaw's agent loop is mediated by the Gateway daemon. Messages flow through a lane-aware command queue before reaching the embedded pi-mono runtime:
```
Inbound → Channel Bridge → Session Resolution → Command Queue → pi-mono Runtime → LLM
                                    │
                    ┌───────────────┴───────────────┐
                    │  Lane-based FIFO              │
                    │  • Global: maxConcurrent=4    │
                    │  • Per-session: concurrency=1 │
                    │  • Sub-agent: concurrency=8   │
                    │  • Cron: parallel lane        │
                    └───────────────────────────────┘
```
Queue modes (collect, steer, followup, steer-backlog) control how inbound messages interact with active runs. The Gateway is the single writer for each session — no concurrent writes to transcripts, deterministic ordering guaranteed.
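The lane discipline can be sketched with per-lane semaphores. This is an illustrative model of the documented limits, not OpenClaw's actual implementation; `LaneQueue` and `demo` are hypothetical names.

```python
import asyncio

# Hypothetical lane limits, taken from the figures in the diagram above.
LANE_LIMITS = {"global": 4, "session": 1, "subagent": 8}

class LaneQueue:
    """Each lane is a FIFO with its own concurrency cap."""

    def __init__(self, limits):
        self._sems = {lane: asyncio.Semaphore(n) for lane, n in limits.items()}

    async def run(self, lane, coro_fn):
        # asyncio.Semaphore wakes waiters in acquisition order,
        # so each lane behaves as a FIFO queue.
        async with self._sems[lane]:
            return await coro_fn()

async def demo():
    q = LaneQueue(LANE_LIMITS)
    order = []

    async def task(i):
        await asyncio.sleep(0)
        order.append(i)
        return i

    # The per-session lane (concurrency=1) serializes these three runs.
    await asyncio.gather(*(q.run("session", lambda i=i: task(i)) for i in range(3)))
    return order

order = asyncio.run(demo())
```

With concurrency=1 on the session lane, the three tasks complete strictly in submission order; switching the same calls to the `subagent` lane would let them interleave.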
Hermes uses a direct synchronous orchestration pattern. `AIAgent.run_conversation()` is the core loop:
```
User message → AIAgent.run_conversation()
                 → Build system prompt (cached)
                 → Maybe preflight-compress
                 → Build api_messages
                 → Inject ephemeral layers
                 → Apply prompt caching
                 → Interruptible API call
                 → If tool_calls: execute → append results → loop
                 → If final text: persist → return
```
The loop is explicitly interruptible — the CLI or gateway can cancel mid-flight. Tool execution uses either sequential or concurrent strategies depending on the tool mix.
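A minimal sketch of that loop shape, not the actual Hermes implementation: `call_llm`, `execute_tool`, and `cancelled` are hypothetical stand-ins for the real client, tool executor, and interrupt flag.

```python
def run_conversation(messages, call_llm, execute_tool, cancelled=lambda: False):
    while True:
        if cancelled():                      # interruptible mid-flight
            return None
        reply = call_llm(messages)           # one API call per iteration
        if reply.get("tool_calls"):
            for call in reply["tool_calls"]:
                # Execute each tool and append its result, then loop.
                messages.append({"role": "tool", "name": call["name"],
                                 "content": execute_tool(call)})
            continue
        # Final text: persist (here, just append) and return.
        messages.append({"role": "assistant", "content": reply["text"]})
        return reply["text"]

# Hypothetical stand-ins: one tool round, then a final answer.
replies = iter([
    {"tool_calls": [{"name": "echo", "args": {"x": 1}}]},
    {"text": "done", "tool_calls": None},
])
final = run_conversation([], lambda msgs: next(replies), lambda call: "ok")
```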
OpenClaw separates orchestration (Gateway) from inference (pi-mono runtime). The Gateway owns all state and connection management; the runtime is an embedded component. Hermes combines both into a single AIAgent class — the gateway is a separate process that dispatches to AIAgent instances but doesn't own the agent loop itself.
Concurrency model: OpenClaw's lane-based queue is a formal concurrency primitive. Hermes handles write contention at the SQLite level with application-level retries, random jitter, and BEGIN IMMEDIATE transactions.
```
~/.openclaw/agents/<agentId>/sessions/
├── <sessionId>.jsonl    # Full transcript (append-only)
└── sessions.json        # Session key → {sessionId, updatedAt, ...} map
```
- Transcripts are append-only JSONL — each line is a message/event
- Compaction summaries are persisted in the JSONL (permanent)
- Session metadata stored in a flat JSON file
- No database dependency (SQLite used only for vector search)
- Migration = git clone + config update
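The append-only transcript pattern is straightforward to model; `append_event` and `read_transcript` below are illustrative helpers, not OpenClaw APIs.

```python
import json
import tempfile
from pathlib import Path

def append_event(session_dir, session_id, event):
    """Append one message/event as a single JSON line (append-only)."""
    path = Path(session_dir) / f"{session_id}.jsonl"
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

def read_transcript(session_dir, session_id):
    """Replay the transcript: one JSON object per line, in write order."""
    path = Path(session_dir) / f"{session_id}.jsonl"
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f]

sessions = tempfile.mkdtemp()
append_event(sessions, "abc123", {"role": "user", "content": "hi"})
append_event(sessions, "abc123", {"role": "assistant", "content": "hello"})
transcript = read_transcript(sessions, "abc123")
```

Because the Gateway is the single writer per session, plain `open(..., "a")` suffices; ordering is deterministic without any locking.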
```
~/.hermes/state.db (SQLite, WAL mode)
├── sessions         # Metadata, token counts, billing, lineage
├── messages         # Full message history per session
├── messages_fts     # FTS5 virtual table for full-text search
└── schema_version   # Migration tracking
```
- Full relational schema with token/cost accounting per session
- FTS5 full-text search across all sessions (no external vector DB)
- Session lineage via `parent_session_id` chains (compression-triggered splits)
- Schema migrations tracked and applied incrementally (currently v6)
- Write contention: 15 retries, 20-150ms random jitter, `BEGIN IMMEDIATE` transactions
OpenClaw's file approach is simpler, human-readable, and git-backable. Hermes' SQLite approach enables structured queries (cost analytics, session lineage, FTS5 search) without external tools. OpenClaw needs a separate vector index (sqlite-vec) for semantic search; Hermes gets full-text search for free via FTS5 triggers but uses a separate approach for semantic recall (session_search with LLM summarization vs. OpenClaw's embedding-based hybrid BM25+vector).
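The external-content FTS5 pattern described on the Hermes side looks roughly like this; the table and trigger names are illustrative, not Hermes' actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# A messages table plus an external-content FTS5 index kept in sync by a
# trigger, so full-text search comes "for free" on every write.
conn.executescript("""
CREATE TABLE messages(id INTEGER PRIMARY KEY, session_id TEXT, content TEXT);
CREATE VIRTUAL TABLE messages_fts USING fts5(
    content, content='messages', content_rowid='id');
CREATE TRIGGER messages_ai AFTER INSERT ON messages BEGIN
  INSERT INTO messages_fts(rowid, content) VALUES (new.id, new.content);
END;
""")
conn.execute("INSERT INTO messages(session_id, content) "
             "VALUES ('s1', 'deploy the staging gateway')")
conn.execute("INSERT INTO messages(session_id, content) "
             "VALUES ('s2', 'memory tool capacity check')")
hits = conn.execute(
    "SELECT m.session_id FROM messages_fts f "
    "JOIN messages m ON m.id = f.rowid "
    "WHERE messages_fts MATCH 'gateway'").fetchall()
```

A real schema would also need `AFTER UPDATE`/`AFTER DELETE` triggers to keep the index consistent; only the insert path is shown here.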
This is the most significant architectural divergence.
```
┌─────────────────────────────────────────┐
│ Layer 4: Semantic Vector Search         │ ← BM25 + vector hybrid (70/30)
│          (SQLite + embeddings)          │
├─────────────────────────────────────────┤
│ Layer 3: MEMORY.md (curated long-term)  │ ← No size limit, manually maintained
├─────────────────────────────────────────┤
│ Layer 2: memory/YYYY-MM-DD.md (daily)   │ ← Append-only daily notes
├─────────────────────────────────────────┤
│ Layer 1: Session context (JSONL)        │ ← Current context window
└─────────────────────────────────────────┘
```
- No hard size limits on MEMORY.md — grows organically
- Memory is plain Markdown the model reads/writes via `read`/`write`/`edit` tools
- Daily logs provide temporal structure
- Vector search (BM25 + embeddings) enables semantic recall across all files
- Pre-compaction memory flush: silent agentic turn writes durable notes before context is summarized
- Memory search exposed as `memory_search` + `memory_get` tools
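A toy version of hybrid ranking under the stated 70/30 split, assuming the 70 weight goes to the vector score (the text does not say which side gets which); the scores and document names are made up.

```python
def normalize(scores):
    """Min-max normalize a {doc: score} map into [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {k: (v - lo) / span for k, v in scores.items()}

def hybrid_rank(bm25, vector, w_vec=0.7, w_bm25=0.3):
    """Blend normalized BM25 and vector scores, missing docs score 0."""
    b, v = normalize(bm25), normalize(vector)
    blended = {d: w_vec * v.get(d, 0.0) + w_bm25 * b.get(d, 0.0)
               for d in set(bm25) | set(vector)}
    return sorted(blended, key=blended.get, reverse=True)

# Hypothetical scores: "b" ranks well in both lists and wins.
ranked = hybrid_rank({"a": 2.0, "b": 1.0}, {"b": 0.9, "c": 0.2})
```

The normalization step matters: raw BM25 and cosine scores live on different scales, so blending them directly would let one retriever dominate regardless of the weights.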
```
┌─────────────────────────────────────────┐
│ Layer 3: Session Search (FTS5 + LLM)    │ ← Cross-session recall with summarization
├─────────────────────────────────────────┤
│ Layer 2: MEMORY.md (~800 tokens, 2.2K)  │ ← Hard-capped, tool-managed
│          USER.md (~500 tokens, 1.4K)    │ ← Hard-capped, tool-managed
├─────────────────────────────────────────┤
│ Layer 1: Session context (SQLite)       │ ← Current context window
└─────────────────────────────────────────┘
        │
        + Optional: Honcho (dialectic user modeling)
```
- Hard character limits: MEMORY.md = 2,200 chars (~800 tokens), USER.md = 1,375 chars (~500 tokens)
- Memory managed via a dedicated `memory` tool (add/replace/remove actions)
- Frozen snapshot pattern: injected at session start, never mutated mid-session
- Capacity shown in prompt header (`67% — 1,474/2,200 chars`) so the model self-manages
- Cross-session recall via `session_search` (FTS5 + Gemini Flash summarization)
- Optional Honcho integration for dialectic user modeling across sessions
- Security scanning on memory writes (injection/exfiltration detection)
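The bounded-cache idea can be sketched as a hard-capped store with add/replace/remove semantics and a capacity header; `BoundedMemory` is a hypothetical model, with only the 2,200-char cap and header format taken from the text.

```python
class BoundedMemory:
    """Hard-capped memory block: the model must curate to stay under cap."""

    def __init__(self, cap=2200):
        self.cap, self.lines = cap, []

    def _size(self):
        return sum(len(l) + 1 for l in self.lines)  # +1 per newline

    def header(self):
        """Capacity line shown in the prompt so the model self-manages."""
        used = self._size()
        return f"{used * 100 // self.cap}% — {used:,}/{self.cap:,} chars"

    def add(self, text):
        if self._size() + len(text) + 1 > self.cap:
            raise ValueError("memory full — remove or replace an entry first")
        self.lines.append(text)

    def replace(self, old_substring, new_text):
        # Substring matching against existing entries, as described above.
        for i, l in enumerate(self.lines):
            if old_substring in l:
                self.lines[i] = new_text
                return
        raise KeyError(old_substring)

    def remove(self, substring):
        self.lines = [l for l in self.lines if substring not in l]

m = BoundedMemory(cap=2200)
m.add("User prefers concise answers")
m.replace("concise", "User prefers detailed answers")
m.remove("nonexistent")
```

Raising on a full `add` (rather than silently truncating) is the property that forces curation: the model must decide what to evict before it can write more.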
OpenClaw treats memory as an unbounded knowledge base — write anything, search semantically later. The model uses standard file tools. Hermes treats memory as a bounded cache — fixed capacity forces the model to curate aggressively. The model uses a dedicated memory tool with add/replace/remove semantics and substring matching.
OpenClaw's approach scales better for knowledge-heavy agents but risks prompt bloat. Hermes' approach guarantees bounded token cost (~1,300 tokens/session) but requires the model to make curation decisions. Both do pre-compaction memory flush; OpenClaw uses a silent agentic turn, Hermes uses gateway-level session hygiene.
For cross-session recall: OpenClaw uses vector embeddings (BM25+vector hybrid, 70/30 weighting); Hermes uses FTS5 keyword search + LLM summarization. Different tradeoffs — vector catches semantic matches but needs an embedding pipeline; FTS5 is zero-setup but misses paraphrased recall.
System prompt assembled from multiple sources every turn:
- Tooling descriptions + Safety guardrails + Skills metadata
- Workspace files: AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md
- Runtime metadata (host/OS/model/thinking level)
- Bootstrap files truncated at `bootstrapMaxChars` (default 20K)
- Time is timezone-only (no dynamic clock) for cache stability
Hermes deliberately separates cached and ephemeral prompt layers:
Cached layers (stable across turns):
- Agent identity (SOUL.md or default)
- Tool-aware behavior guidance
- Honcho static block
- Frozen MEMORY snapshot
- Frozen USER profile snapshot
- Skills index
- Context files (AGENTS.md, .cursorrules, etc.)
- Timestamp + platform hint
Ephemeral layers (API-call-time only, never persisted):
- `ephemeral_system_prompt`
- Prefill messages
- Gateway session context overlays
- Honcho recall injected into current-turn user message
Hermes implements Anthropic's cache_control breakpoints:
- System prompt = breakpoint 1 (stable)
- 3rd/2nd/last messages = rolling breakpoints 2-4
- ~75% input token cost reduction on multi-turn conversations
- TTL configurable: 5m default, 1h for long sessions
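A sketch of placing those breakpoints, following Anthropic's `cache_control` convention (one marker on the system prompt, rolling markers on the last three messages); the helper name and message shapes are illustrative, not Hermes' actual code.

```python
def apply_cache_breakpoints(system_text, messages, ttl="5m"):
    # Breakpoint 1: the stable system prompt, with configurable TTL.
    system = [{"type": "text", "text": system_text,
               "cache_control": {"type": "ephemeral", "ttl": ttl}}]
    marked = [dict(m) for m in messages]
    # Breakpoints 2-4: 3rd-from-last, 2nd-from-last, and last messages.
    for idx in (-3, -2, -1):
        if len(marked) >= -idx:
            content = marked[idx]["content"]
            if isinstance(content, str):
                content = [{"type": "text", "text": content}]
            content = content[:-1] + [dict(content[-1],
                                           cache_control={"type": "ephemeral"})]
            marked[idx] = dict(marked[idx], content=content)
    return system, marked

msgs = [{"role": "user", "content": f"turn {i}"} for i in range(5)]
system, marked = apply_cache_breakpoints("You are Hermes.", msgs)
```

Rolling the last three breakpoints forward each turn means the cache prefix grows with the conversation, which is where the quoted ~75% input-token saving comes from.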
OpenClaw's time-as-timezone-only design achieves similar cache stability but without explicit provider-level caching primitives.
- Compaction: summarizes older messages, persists summary in JSONL transcript (permanent)
- Pruning: trims old tool results in-memory per request (non-destructive)
- Pre-compaction memory flush via silent agentic turn
Two independent compression systems:
```
Gateway Session Hygiene (85% threshold) → Safety net, rough estimate
        ↓
Agent ContextCompressor (50% threshold) → Primary, real token counts
```
4-phase algorithm:
1. Prune old tool results (cheap, no LLM call — replaces >200 char outputs)
2. Determine boundaries (head/middle/tail with tool-group alignment)
3. Generate structured summary (using auxiliary LLM, template-based)
4. Assemble compressed messages (head + summary + tail, orphan cleanup)
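A toy rendering of the prune-and-reassemble shape, with the LLM summarizer stubbed out; the head/tail sizes are illustrative, and only the >200-char prune cutoff comes from the text.

```python
def prune_tool_results(messages, max_chars=200):
    """Phase 1: replace verbose tool outputs with a cheap placeholder."""
    out = []
    for m in messages:
        if m["role"] == "tool" and len(m["content"]) > max_chars:
            m = dict(m, content=f"[pruned {len(m['content'])} chars]")
        out.append(m)
    return out

def compress(messages, head=2, tail=4,
             summarize=lambda ms: "(structured summary)"):
    """Phases 2-4: split head/middle/tail, summarize the middle, reassemble.
    `summarize` stands in for the auxiliary LLM with its template."""
    msgs = prune_tool_results(messages)
    if len(msgs) <= head + tail:
        return msgs                      # nothing to compress
    middle = msgs[head:len(msgs) - tail]
    summary = {"role": "assistant", "content": summarize(middle)}
    return msgs[:head] + [summary] + msgs[len(msgs) - tail:]

history = [{"role": "user", "content": f"m{i}"} for i in range(9)]
history.append({"role": "tool", "content": "x" * 500})
compressed = compress(history)
```

Iterative re-compression would pass the previous summary into `summarize` alongside the new middle, which is how "In Progress" items migrate to "Done" across compressions.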
Iterative re-compression: previous summary passed to LLM with "update, don't re-summarize" instructions. Items move from "In Progress" to "Done" across compressions.
Session lineage: compression triggers a new session ID linked via parent_session_id, creating traceable compression chains.
Both use LLM-generated summaries, but Hermes' structured summary template (Goal/Progress/Decisions/Files/Next Steps) produces more consistently useful compressions than OpenClaw's freeform approach. Hermes' dual-layer (gateway safety net + agent compressor) provides defense in depth against sessions that escape normal compression. OpenClaw's pruning (trim tool results without LLM) is cheaper for the common case of verbose tool output.
Core tools built into the runtime:
- File ops (`read`/`write`/`edit`), Shell (exec/process), Browser (CDP), Web, Message, Cron, Memory, Sessions, Nodes, Canvas, TTS, Gateway self-management
Tool policy: layered allow/deny system:
Global deny → Per-agent deny → Global allow → Per-agent allow → Default
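The resolution order above can be sketched as a first-match-wins lookup; the key names and the configurable default are assumptions, not OpenClaw's actual config schema.

```python
def resolve_tool_policy(tool, policy, default=True):
    """First match wins: deny layers before allow layers,
    global before per-agent within each, then the default."""
    if tool in policy.get("global_deny", ()):
        return False
    if tool in policy.get("agent_deny", ()):
        return False
    if tool in policy.get("global_allow", ()):
        return True
    if tool in policy.get("agent_allow", ()):
        return True
    return default

# Hypothetical policy: exec denied globally, browser allowed per-agent.
policy = {"global_deny": {"exec"}, "agent_allow": {"browser"}}
```

Checking both deny layers before any allow layer is what makes the policy safe to compose: a per-agent allow can never override a global deny.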
Extensibility via plugins (in-process TypeScript modules with lifecycle hooks) and hooks (event-driven scripts).
Tools self-register via registry.register() at import time. Central dispatch through ToolRegistry:
```python
registry.register(
    name="terminal", toolset="terminal",
    schema={...}, handler=handle_terminal,
    check_fn=check_terminal,       # Availability gate
    requires_env=["SOME_VAR"],     # For UI display
    is_async=False,
)
```

Toolset system: named bundles resolved via platform presets (`hermes-cli`, `hermes-telegram`, etc.) + explicit enable/disable.
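A minimal registry consistent with that call shape; this is a sketch, not Hermes' actual `ToolRegistry`.

```python
class ToolRegistry:
    """Tools self-register at import time; dispatch is centralized here."""

    def __init__(self):
        self._tools = {}

    def register(self, name, toolset, schema, handler,
                 check_fn=lambda: True, requires_env=(), is_async=False):
        self._tools[name] = {"toolset": toolset, "schema": schema,
                             "handler": handler, "check_fn": check_fn,
                             "requires_env": list(requires_env),
                             "is_async": is_async}

    def available(self, toolsets):
        # Only tools whose toolset is enabled and whose availability
        # gate (check_fn) passes are offered to the model.
        return [n for n, t in self._tools.items()
                if t["toolset"] in toolsets and t["check_fn"]()]

    def dispatch(self, name, **kwargs):
        return self._tools[name]["handler"](**kwargs)

registry = ToolRegistry()
registry.register(name="terminal", toolset="terminal", schema={},
                  handler=lambda cmd: f"ran: {cmd}")
```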
40+ tools organized into toolsets:
- terminal, file, web, browser, vision, image_generation, skills, cron, tts, todo, memory, session_search, delegate, send_message, honcho, homeassistant, code_execution, clarify, etc.
DANGEROUS_PATTERNS approval system: regex-based detection of destructive commands → interactive approval (CLI) or async approval (gateway). Smart approval via auxiliary LLM for false positives. Per-session state + permanent allowlist.
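An illustrative version of pattern-based gating with a permanent allowlist; these three regexes are examples only, not the project's actual `DANGEROUS_PATTERNS` list.

```python
import re

# Example destructive-command patterns (illustrative, not Hermes' list).
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-rf\s+/"),
    re.compile(r"\bmkfs\b"),
    re.compile(r"\bdd\s+if=.*of=/dev/"),
]

def needs_approval(command, allowlist=frozenset()):
    """True if the command matches a dangerous pattern and is not
    on the permanent allowlist."""
    if command in allowlist:          # allowlisted commands skip approval
        return False
    return any(p.search(command) for p in DANGEROUS_PATTERNS)
```

In the real flow, a positive match routes to interactive approval (CLI) or async approval (gateway), with an auxiliary LLM available to dismiss false positives.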
OpenClaw's tool policy is declarative (JSON config) with a formal layered resolution order. Hermes' tool system is imperative (Python registration with check functions) with a toolset/preset resolution model. OpenClaw has richer extensibility (plugins with lifecycle hooks, in-process); Hermes has richer built-in tools (vision, image generation, Honcho, Home Assistant, code execution sandbox).
OpenClaw's exec approval is sender-based (authorized senders can approve); Hermes has a more sophisticated approval flow with pattern-based detection, smart LLM approval, and permanent allowlisting.
Both implement the agentskills.io open standard with near-identical patterns:
| Aspect | OpenClaw | Hermes |
|---|---|---|
| Format | SKILL.md with YAML frontmatter | SKILL.md with YAML frontmatter |
| Loading | Lazy (model reads SKILL.md on demand) | Progressive disclosure (list → view → reference) |
| Discovery | Workspace → Managed → Bundled | Local ~/.hermes/skills/ (single source) + external dirs |
| Gating | requires.bins, requires.env, requires.config, os | platforms, fallback_for_toolsets, requires_toolsets |
| Self-creation | Agent can write skills via file tools | Agent creates skills autonomously after complex tasks |
| Self-improvement | Manual updates | Skills self-improve during use (learning loop) |
| Hub | ClawHub (clawhub.ai) | Skills Hub (agentskills.io) |
| Slash commands | No | Yes (/skill-name triggers skill loading) |
Hermes has autonomous skill creation — after solving a complex task, it can automatically create a skill from the experience. It also has a learning loop where skills self-improve during use. OpenClaw's skills are static unless manually edited. Hermes also has conditional activation (fallback skills that appear only when premium tools are unavailable) and secure setup on load (prompts for API keys only when the skill is accessed).
- Single long-lived Node.js daemon
- Typed WebSocket API (JSON frames with req/res/event protocol)
- All clients connect over one WebSocket: macOS app, CLI, web UI, mobile nodes
- Idempotency keys for side-effecting methods
- Channel bridges: WhatsApp (Baileys), Telegram (grammY), Discord, Slack, Signal, iMessage, WebChat
- Device pairing with challenge-nonce trust model
- DM scope options: `main`, `per-peer`, `per-channel-peer`
- Long-running Python process (`hermes gateway`)
- Platform adapters (Telegram, Discord, Slack, WhatsApp, Signal, Email, Home Assistant)
- Multi-source config: env vars + `gateway.json` + bridged values from `config.yaml`
- DM pairing flows with platform-specific authorization
- Session routing by platform + user/chat identity + thread/topic
- Delivery routing: home channel, explicit targets, mirroring
OpenClaw's WebSocket-first protocol enables richer client integration (real-time streaming, device pairing, node system). Hermes' gateway is simpler — it's a dispatch layer that routes to AIAgent instances. OpenClaw's node system (camera, screen, location, commands on paired devices) has no Hermes equivalent.
| Aspect | OpenClaw | Hermes |
|---|---|---|
| Isolation | Dedicated session (subagent:<uuid>) | Spawned via delegate_task |
| Nesting | Prohibited (no fan-out) | Shared iteration budget across parent/child |
| Tool restrictions | No session tools by default | Budget pressure hints near limit |
| Result delivery | Announce pattern (post back to requester) | Return to parent context |
| Concurrency | Dedicated lane (max 8) | Thread-based parallel execution |
| Model selection | Can use cheaper models | Can use different providers |
| ACP integration | Spawn coding agents (Codex, Claude Code) via sessions_spawn | ACP editor integration (stdio/JSON-RPC) |
- Modes: `off` (host), `non-main` (sandbox non-main sessions), `all`
- Scope: per-session, per-agent, or shared container
- Workspace access: none, read-only, or read-write mount
- Elevated execution bypass for authorized senders
- Local: direct host execution
- Docker: containerized execution
- SSH: remote host execution
- Daytona: serverless persistence (hibernates when idle)
- Modal: serverless cloud execution
- Singularity: HPC container execution
Hermes offers significantly more execution environments. Daytona and Modal provide serverless persistence — the agent's environment hibernates between sessions, costing nothing when idle. OpenClaw's sandboxing is focused on security isolation; Hermes' backends are focused on deployment flexibility.
| Layer | OpenClaw | Hermes |
|---|---|---|
| Auth | Gateway token (all connections) | Platform allowlists + DM pairing |
| Device trust | Challenge-nonce pairing, device tokens | Platform-specific pairing flows |
| Tool control | Layered allow/deny policy (JSON) | Toolset enable/disable + per-tool check_fn |
| Exec approval | Sender-based authorization | Pattern-based detection + smart LLM approval + permanent allowlist |
| Sandbox | Docker (off/non-main/all) | Six backends (Docker, Daytona, Modal, etc.) |
| Outbound | Send policy (outbound gates) | Not explicitly documented |
| Prompt safety | Advisory guardrails in system prompt | Advisory guardrails in system prompt |
| Memory safety | MEMORY.md only in private sessions | Security scanning on memory writes (injection/exfiltration) |
OpenClaw has a 7-layer formal security model with explicit outbound send policy. Hermes has more sophisticated command approval (regex patterns, smart LLM approval, session state, permanent allowlist). Hermes scans memory writes for injection attempts; OpenClaw protects memory by scope (private sessions only).
Hermes includes infrastructure that has no OpenClaw equivalent:
- Batch trajectory generation for SFT data
- Atropos RL environments for reinforcement learning
- Trajectory compression for training tool-calling models
- Environment framework for benchmarks and evaluation
This reflects Hermes' origin at Nous Research — the agent is both a product and a research platform for improving the next generation of models.
| Capability | OpenClaw | Hermes |
|---|---|---|
| Language/Runtime | TypeScript / Node.js | Python |
| Session Storage | JSONL files | SQLite (WAL) |
| Memory Model | Unbounded files + vector search | Bounded tool-managed + FTS5 |
| Memory Recall | BM25 + vector embeddings (hybrid) | FTS5 + LLM summarization |
| Prompt Caching | Cache-stable timestamps | Anthropic cache_control breakpoints |
| Compression | Compaction + pruning | Dual-layer (gateway + agent) with structured templates |
| Tool Extension | Plugins (in-process) + Hooks (event) | Registry (self-registering) + Toolsets |
| Exec Approval | Sender-based | Pattern + smart LLM + allowlist |
| Terminal Backends | Local + Docker | Local + Docker + SSH + Daytona + Modal + Singularity |
| Skills Self-Creation | No | Yes (autonomous + self-improving) |
| Node/Device Control | Yes (camera, screen, location, run) | No |
| User Modeling | USER.md (manual) | USER.md + Honcho (AI dialectic) |
| RL/Training | No | Atropos + trajectories + benchmarks |
| OpenClaw Migration | N/A | Built-in (hermes claw migrate) |
| Gateway Protocol | Typed WebSocket (req/res/event) | Platform adapters (dispatch) |
| Concurrent Writes | Lane-based command queue | SQLite WAL + application-level retries |
OpenClaw is infrastructure-first: a robust daemon that owns all state, with formal concurrency primitives, a typed wire protocol, and defense-in-depth security. It's optimized for always-on deployment with rich client integration (desktop apps, mobile nodes, device pairing). The workspace-as-git-repo philosophy makes everything human-readable and version-controllable.
Hermes is research-first: a flexible agent loop designed for experimentation, with multiple execution backends, autonomous skill creation, RL training infrastructure, and user modeling. It's optimized for the developer who wants to customize everything — switch models, add terminal backends, generate training data, run benchmarks. The bounded memory system forces efficient curation.
Both are converging: Hermes has an explicit OpenClaw migration path, compatible skills format, and similar prompt assembly patterns. The architectures are different enough to represent genuine alternative approaches to the same problem — not just language ports.
Comparison authored by Loki@FastStart ⚡ — April 2026
Sources: OpenClaw Architecture Deep Dive; Hermes Agent developer documentation (hermes-agent.nousresearch.com/docs)