
Hermes Agent vs OpenClaw: Technical Architecture Comparison

A Side-by-Side Breakdown for Software Architects


Source Metadata

| Agent | Repo | Commit | Branch | Clone date |
|---|---|---|---|---|
| OpenClaw | github.com/openclaw/openclaw | 8b2d24b | main | 2026-04-01 |
| Hermes Agent | github.com/NousResearch/hermes-agent | HEAD at review | main | 2026-04-01 |

1. Executive Summary

OpenClaw and Hermes Agent solve the same fundamental problem — making stateless LLMs behave like persistent, tool-using assistants — but they diverge sharply in implementation language, storage architecture, memory philosophy, extensibility model, and security posture. OpenClaw is a TypeScript/Node.js system built around a WebSocket gateway daemon with file-based memory and Docker sandboxing. Hermes is a Python system built around a synchronous agent loop with SQLite-backed sessions, bounded tool-managed memory, and multiple terminal backends. Both support multi-channel messaging, skills, sub-agents, cron scheduling, and context compression — but the internals are architecturally distinct.

This document is a technical comparison. It assumes familiarity with both systems or the companion OpenClaw Architecture Deep Dive.


2. Foundational Differences

| Dimension | OpenClaw | Hermes Agent |
|---|---|---|
| Language | TypeScript / Node.js | Python 3.11+ |
| Runtime model | Single async daemon (Gateway) | Synchronous agent loop + async gateway |
| Primary storage | Markdown files (JSONL transcripts) | SQLite (WAL mode) + Markdown memory files |
| Package | npm (openclaw) | pip/uv (hermes-agent[all]) |
| Binary distribution | Node.js required | Python required; Node.js optional (for gateway platforms) |
| License | Source-available (custom) | MIT |
| Origin | Independent project | Nous Research (Hermes is an explicit OpenClaw-inspired fork/reimplementation) |

3. Agent Loop Architecture

OpenClaw: Gateway-Mediated, Queue-Serialized

OpenClaw's agent loop is mediated by the Gateway daemon. Messages flow through a lane-aware command queue before reaching the embedded pi-mono runtime:

Inbound → Channel Bridge → Session Resolution → Command Queue → pi-mono Runtime → LLM
                                                    │
                                    ┌───────────────┴───────────────┐
                                    │ Lane-based FIFO               │
                                    │  • Global: maxConcurrent=4    │
                                    │  • Per-session: concurrency=1 │
                                    │  • Sub-agent: concurrency=8   │
                                    │  • Cron: parallel lane        │
                                    └───────────────────────────────┘

Queue modes (collect, steer, followup, steer-backlog) control how inbound messages interact with active runs. The Gateway is the single writer for each session — no concurrent writes to transcripts, deterministic ordering guaranteed.
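The lane model can be sketched as per-lane FIFO admission with independent concurrency caps. This is a minimal illustration in Python, not OpenClaw's actual TypeScript implementation; the lane names and limits are taken from the diagram above:

```python
import asyncio

# Per-lane concurrency caps, mirroring the diagram above.
LANE_LIMITS = {"global": 4, "session": 1, "subagent": 8}

class LaneQueue:
    """FIFO admission per lane; each lane runs at most `limit` jobs at once."""
    def __init__(self, limits):
        self._sems = {lane: asyncio.Semaphore(n) for lane, n in limits.items()}

    async def run(self, lane, coro_fn, *args):
        async with self._sems[lane]:   # blocks when the lane is saturated
            return await coro_fn(*args)

async def demo():
    q = LaneQueue(LANE_LIMITS)

    async def job(name):
        await asyncio.sleep(0)
        return name

    # Two jobs on the per-session lane (concurrency=1) never overlap,
    # which is how a single-writer guarantee per session falls out.
    return await asyncio.gather(
        q.run("session", job, "first"),
        q.run("session", job, "second"),
    )

print(asyncio.run(demo()))  # ['first', 'second']
```

The per-session semaphore of size 1 is what makes "single writer per session" a structural property rather than a convention.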

Hermes: Direct Synchronous Loop

Hermes uses a direct synchronous orchestration pattern. AIAgent.run_conversation() is the core loop:

User message → AIAgent.run_conversation()
                → Build system prompt (cached)
                → Maybe preflight-compress
                → Build api_messages
                → Inject ephemeral layers
                → Apply prompt caching
                → Interruptible API call
                → If tool_calls: execute → append results → loop
                → If final text: persist → return

The loop is explicitly interruptible — the CLI or gateway can cancel mid-flight. Tool execution uses either sequential or concurrent strategies depending on the tool mix.
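The loop above condenses to a familiar tool-calling pattern. A structural sketch only — `call_llm` and `execute_tool` stand in for Hermes' actual internals, and the message shapes are illustrative:

```python
def run_conversation(messages, call_llm, execute_tool, max_iterations=10):
    """Structural sketch of a tool-calling agent loop: call the model,
    execute any requested tools, append their results, and repeat until
    the model returns plain text (or the iteration budget runs out)."""
    for _ in range(max_iterations):
        reply = call_llm(messages)                      # interruptible API call
        if not reply.get("tool_calls"):
            messages.append({"role": "assistant", "content": reply["content"]})
            return reply["content"]                     # final text: persist & return
        messages.append({"role": "assistant", "tool_calls": reply["tool_calls"]})
        for call in reply["tool_calls"]:                # sequential strategy
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": execute_tool(call["name"], call.get("args", {})),
            })
    raise RuntimeError("iteration budget exhausted")

# Toy stubs: one tool round, then a final answer.
script = iter([
    {"tool_calls": [{"id": "1", "name": "add", "args": {"a": 2, "b": 3}}]},
    {"content": "2 + 3 = 5", "tool_calls": None},
])
answer = run_conversation(
    [{"role": "user", "content": "what is 2+3?"}],
    call_llm=lambda msgs: next(script),
    execute_tool=lambda name, args: str(args["a"] + args["b"]),
)
print(answer)  # 2 + 3 = 5
```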

Key Difference

OpenClaw separates orchestration (Gateway) from inference (pi-mono runtime). The Gateway owns all state and connection management; the runtime is an embedded component. Hermes combines both into a single AIAgent class — the gateway is a separate process that dispatches to AIAgent instances but doesn't own the agent loop itself.

Concurrency model: OpenClaw's lane-based queue is a formal concurrency primitive. Hermes handles write contention at the SQLite level with application-level retries, random jitter, and BEGIN IMMEDIATE transactions.


4. Session Storage

OpenClaw: File-Based (JSONL + JSON)

~/.openclaw/agents/<agentId>/sessions/
├── <sessionId>.jsonl          # Full transcript (append-only)
└── sessions.json              # Session key → {sessionId, updatedAt, ...} map
  • Transcripts are append-only JSONL — each line is a message/event
  • Compaction summaries are persisted in the JSONL (permanent)
  • Session metadata stored in a flat JSON file
  • No database dependency (SQLite used only for vector search)
  • Migration = git clone + config update
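The append-only layout is simple to reproduce: each event is one JSON object per line, and reading a transcript is a line-by-line parse. A minimal sketch of the pattern, not OpenClaw's code:

```python
import json
import tempfile
from pathlib import Path

def append_event(transcript: Path, event: dict) -> None:
    # Append-only: one JSON object per line, never rewritten in place.
    with transcript.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

def read_transcript(transcript: Path) -> list:
    with transcript.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

sessions_dir = Path(tempfile.mkdtemp()) / "sessions"
sessions_dir.mkdir(parents=True)
t = sessions_dir / "abc123.jsonl"
append_event(t, {"role": "user", "content": "hi"})
append_event(t, {"role": "assistant", "content": "hello"})
print(len(read_transcript(t)))  # 2
```

Because writes only ever append whole lines, a crash mid-write corrupts at most the final line, and transcripts diff cleanly under git.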

Hermes: SQLite Database (WAL Mode)

~/.hermes/state.db (SQLite, WAL mode)
├── sessions          # Metadata, token counts, billing, lineage
├── messages          # Full message history per session
├── messages_fts      # FTS5 virtual table for full-text search
└── schema_version    # Migration tracking
  • Full relational schema with token/cost accounting per session
  • FTS5 full-text search across all sessions (no external vector DB)
  • Session lineage via parent_session_id chains (compression-triggered splits)
  • Schema migrations tracked and applied incrementally (currently v6)
  • Write contention: 15 retries, 20-150ms random jitter, BEGIN IMMEDIATE
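The retry discipline in the last bullet can be sketched as follows. The retry count and jitter window come from the list above; the function itself is illustrative, not Hermes' actual code:

```python
import os
import random
import sqlite3
import tempfile
import time

def write_with_retry(db_path, sql, params=(), retries=15):
    """BEGIN IMMEDIATE acquires the write lock up front; if another
    writer holds it, back off with 20-150 ms random jitter and retry."""
    for attempt in range(retries):
        # isolation_level=None puts the connection in autocommit mode,
        # so the BEGIN/COMMIT below are fully manual.
        conn = sqlite3.connect(db_path, timeout=0, isolation_level=None)
        try:
            conn.execute("BEGIN IMMEDIATE")   # fail fast on a busy database
            conn.execute(sql, params)
            conn.execute("COMMIT")
            return True
        except sqlite3.OperationalError:
            if attempt == retries - 1:
                raise
            time.sleep(random.uniform(0.020, 0.150))  # jitter, then retry
        finally:
            conn.close()
    return False

db = os.path.join(tempfile.mkdtemp(), "state.db")
init = sqlite3.connect(db)
init.execute("PRAGMA journal_mode=WAL")  # WAL: readers never block the writer
init.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
init.commit()
init.close()

write_with_retry(db, "INSERT INTO messages (body) VALUES (?)", ("hello",))
count = sqlite3.connect(db).execute("SELECT COUNT(*) FROM messages").fetchone()[0]
print(count)  # 1
```

BEGIN IMMEDIATE matters here: a deferred transaction would only discover lock contention at commit time, whereas IMMEDIATE surfaces it at the start, where the retry loop can handle it cheaply.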

Key Difference

OpenClaw's file approach is simpler, human-readable, and git-backable. Hermes' SQLite approach enables structured queries (cost analytics, session lineage, FTS5 search) without external tools. OpenClaw needs a separate vector index (sqlite-vec) for semantic search, while Hermes gets full-text search for free via FTS5 triggers. For semantic recall the two diverge again: Hermes uses session_search with LLM summarization, while OpenClaw uses an embedding-based hybrid (BM25 + vector).


5. Memory Architecture

This is the most significant architectural divergence.

OpenClaw: Unbounded File-Based Memory

┌─────────────────────────────────────────┐
│ Layer 4: Semantic Vector Search         │ ← BM25 + vector hybrid (70/30)
│          (SQLite + embeddings)          │
├─────────────────────────────────────────┤
│ Layer 3: MEMORY.md (curated long-term)  │ ← No size limit, manually maintained
├─────────────────────────────────────────┤
│ Layer 2: memory/YYYY-MM-DD.md (daily)   │ ← Append-only daily notes
├─────────────────────────────────────────┤
│ Layer 1: Session context (JSONL)        │ ← Current context window
└─────────────────────────────────────────┘
  • No hard size limits on MEMORY.md — grows organically
  • Memory is plain Markdown the model reads/writes via read/write/edit tools
  • Daily logs provide temporal structure
  • Vector search (BM25 + embeddings) enables semantic recall across all files
  • Pre-compaction memory flush: silent agentic turn writes durable notes before context is summarized
  • Memory search exposed as memory_search + memory_get tools

Hermes: Bounded Tool-Managed Memory

┌─────────────────────────────────────────┐
│ Layer 3: Session Search (FTS5 + LLM)    │ ← Cross-session recall with summarization
├─────────────────────────────────────────┤
│ Layer 2: MEMORY.md (~800 tokens, 2.2K)  │ ← Hard-capped, tool-managed
│          USER.md (~500 tokens, 1.4K)    │ ← Hard-capped, tool-managed
├─────────────────────────────────────────┤
│ Layer 1: Session context (SQLite)       │ ← Current context window
└─────────────────────────────────────────┘
  + Optional: Honcho (dialectic user modeling)
  • Hard character limits: MEMORY.md = 2,200 chars (~800 tokens), USER.md = 1,375 chars (~500 tokens)
  • Memory managed via a dedicated memory tool (add/replace/remove actions)
  • Frozen snapshot pattern: injected at session start, never mutated mid-session
  • Capacity shown in prompt header (67% — 1,474/2,200 chars) so the model self-manages
  • Cross-session recall via session_search (FTS5 + Gemini Flash summarization)
  • Optional Honcho integration for dialectic user modeling across sessions
  • Security scanning on memory writes (injection/exfiltration detection)
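The hard-cap behavior can be illustrated with a tiny bounded store. The capacity numbers come from the list above; the add/replace/remove semantics are a sketch, not Hermes' exact tool contract:

```python
MEMORY_CAP = 2200  # chars (~800 tokens), per the limits listed above

class BoundedMemory:
    """Fixed-capacity memory: writes that would exceed the cap are
    rejected, forcing curation (replace/remove) instead of growth."""
    def __init__(self, cap=MEMORY_CAP):
        self.cap, self.entries = cap, []

    def _used(self):
        return sum(len(e) for e in self.entries)

    def header(self):
        # The capacity line surfaced in the prompt so the model self-manages.
        used = self._used()
        return f"Memory capacity: {used / self.cap:.0%} ({used:,}/{self.cap:,} chars)"

    def add(self, text):
        if self._used() + len(text) > self.cap:
            return False                      # over cap: curate first
        self.entries.append(text)
        return True

    def replace(self, old_substring, new_text):
        for i, e in enumerate(self.entries):  # substring matching, first hit wins
            if old_substring in e:
                candidate = e.replace(old_substring, new_text)
                if self._used() - len(e) + len(candidate) <= self.cap:
                    self.entries[i] = candidate
                    return True
        return False

    def remove(self, substring):
        before = len(self.entries)
        self.entries = [e for e in self.entries if substring not in e]
        return len(self.entries) < before

mem = BoundedMemory()
mem.add("User prefers dark mode.")
print(mem.header())  # Memory capacity: 1% (23/2,200 chars)
```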

Key Difference

OpenClaw treats memory as an unbounded knowledge base — write anything, search semantically later. The model uses standard file tools. Hermes treats memory as a bounded cache — fixed capacity forces the model to curate aggressively. The model uses a dedicated memory tool with add/replace/remove semantics and substring matching.

OpenClaw's approach scales better for knowledge-heavy agents but risks prompt bloat. Hermes' approach guarantees bounded token cost (~1,300 tokens/session) but requires the model to make curation decisions. Both do pre-compaction memory flush; OpenClaw uses a silent agentic turn, Hermes uses gateway-level session hygiene.

For cross-session recall: OpenClaw uses vector embeddings (BM25+vector hybrid, 70/30 weighting); Hermes uses FTS5 keyword search + LLM summarization. Different tradeoffs — vector catches semantic matches but needs an embedding pipeline; FTS5 is zero-setup but misses paraphrased recall.
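OpenClaw's hybrid score is a straightforward weighted blend. A one-line sketch, assuming the 70/30 split assigns 0.7 to BM25 and 0.3 to vector similarity (the document gives the ratio but not the assignment) and that both scores are pre-normalized to [0, 1]:

```python
def hybrid_score(bm25: float, cosine: float, bm25_weight: float = 0.7) -> float:
    """Blend lexical (BM25) and semantic (cosine) relevance.
    Assumes both inputs are already normalized to [0, 1]."""
    return bm25_weight * bm25 + (1 - bm25_weight) * cosine

# A document with strong keyword overlap but weak semantic similarity
# still ranks highly under a lexically-weighted blend:
print(round(hybrid_score(0.9, 0.2), 2))  # 0.69
```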


6. Prompt Assembly

OpenClaw: Dynamic Assembly Per-Turn

System prompt assembled from multiple sources every turn:

  • Tooling descriptions + Safety guardrails + Skills metadata
  • Workspace files: AGENTS.md, SOUL.md, TOOLS.md, IDENTITY.md, USER.md, HEARTBEAT.md
  • Runtime metadata (host/OS/model/thinking level)
  • Bootstrap files truncated at bootstrapMaxChars (default 20K)
  • Time is timezone-only (no dynamic clock) for cache stability

Hermes: Cached + Ephemeral Split

Hermes deliberately separates cached and ephemeral prompt layers:

Cached layers (stable across turns):

  1. Agent identity (SOUL.md or default)
  2. Tool-aware behavior guidance
  3. Honcho static block
  4. Frozen MEMORY snapshot
  5. Frozen USER profile snapshot
  6. Skills index
  7. Context files (AGENTS.md, .cursorrules, etc.)
  8. Timestamp + platform hint

Ephemeral layers (API-call-time only, never persisted):

  • ephemeral_system_prompt
  • Prefill messages
  • Gateway session context overlays
  • Honcho recall injected into current-turn user message

Anthropic Prompt Caching (Hermes)

Hermes implements Anthropic's cache_control breakpoints:

  • System prompt = breakpoint 1 (stable)
  • 3rd/2nd/last messages = rolling breakpoints 2-4
  • ~75% input token cost reduction on multi-turn conversations
  • TTL configurable: 5m default, 1h for long sessions
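The breakpoint placement can be sketched as request shaping. The `cache_control: {"type": "ephemeral"}` marker is Anthropic's documented caching primitive, but the helper below (`apply_cache_breakpoints`) and its exact message shapes are illustrative, not Hermes' code, and no API call is made:

```python
def apply_cache_breakpoints(system_text: str, messages: list) -> dict:
    """Mark the system prompt (breakpoint 1) plus up to three trailing
    message positions (rolling breakpoints 2-4) with cache_control.
    Assumes message contents are plain strings."""
    system = [{"type": "text", "text": system_text,
               "cache_control": {"type": "ephemeral"}}]       # breakpoint 1
    marked = [dict(m) for m in messages]
    for idx in (-1, -2, -3):          # last / 2nd-last / 3rd-last messages
        if len(marked) >= -idx:
            m = marked[idx]
            m["content"] = [{"type": "text", "text": m["content"],
                             "cache_control": {"type": "ephemeral"}}]
    return {"system": system, "messages": marked}

req = apply_cache_breakpoints(
    "You are Hermes.",
    [{"role": "user", "content": "hi"},
     {"role": "assistant", "content": "hello"},
     {"role": "user", "content": "what's new?"}],
)
marked_count = len([m for m in req["messages"] if isinstance(m["content"], list)])
print(marked_count)  # 3
```

The rolling breakpoints are what make the scheme work across turns: the prefix up to the previous turn's breakpoint stays byte-identical, so the provider can reuse the cached prefix even as the conversation grows.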

OpenClaw's time-as-timezone-only design achieves similar cache stability but without explicit provider-level caching primitives.


7. Context Compression

OpenClaw: Compaction + Pruning (Two Mechanisms)

  • Compaction: summarizes older messages, persists summary in JSONL transcript (permanent)
  • Pruning: trims old tool results in-memory per request (non-destructive)
  • Pre-compaction memory flush via silent agentic turn

Hermes: Dual-Layer Compression

Two independent compression systems:

Gateway Session Hygiene (85% threshold) → Safety net, rough estimate
         ↓
Agent ContextCompressor (50% threshold) → Primary, real token counts

4-phase algorithm:

  1. Prune old tool results (cheap, no LLM call — replaces >200 char outputs)
  2. Determine boundaries (head/middle/tail with tool-group alignment)
  3. Generate structured summary (using auxiliary LLM, template-based)
  4. Assemble compressed messages (head + summary + tail, orphan cleanup)
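Phase 4 can be sketched as follows. The head/tail sizes are arbitrary illustration values, and the real boundary logic additionally aligns to tool-call groups (phase 2), which this sketch omits:

```python
def assemble_compressed(messages: list, summary_text: str,
                        head: int = 2, tail: int = 4) -> list:
    """Keep the first `head` and last `tail` messages verbatim and
    replace everything in between with one structured-summary message."""
    if len(messages) <= head + tail:
        return list(messages)                 # nothing worth compressing
    summary = {"role": "user",
               "content": f"[Conversation summary]\n{summary_text}"}
    return messages[:head] + [summary] + messages[-tail:]

msgs = [{"role": "user", "content": f"m{i}"} for i in range(10)]
out = assemble_compressed(msgs, "Goal: demo. Progress: middle steps done.")
print(len(out))  # 7  (2 head + 1 summary + 4 tail)
```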

Iterative re-compression: previous summary passed to LLM with "update, don't re-summarize" instructions. Items move from "In Progress" to "Done" across compressions.

Session lineage: compression triggers a new session ID linked via parent_session_id, creating traceable compression chains.

Key Difference

Both use LLM-generated summaries, but Hermes' structured summary template (Goal/Progress/Decisions/Files/Next Steps) produces more consistently useful compressions than OpenClaw's freeform approach. Hermes' dual-layer (gateway safety net + agent compressor) provides defense in depth against sessions that escape normal compression. OpenClaw's pruning (trim tool results without LLM) is cheaper for the common case of verbose tool output.


8. Tool Systems

OpenClaw: Policy-Layered, Plugin-Extensible

Core tools built into the runtime:

  • File ops (read/write/edit), Shell (exec/process), Browser (CDP), Web, Message, Cron, Memory, Sessions, Nodes, Canvas, TTS, Gateway self-management

Tool policy: layered allow/deny system:

Global deny → Per-agent deny → Global allow → Per-agent allow → Default

Extensibility via plugins (in-process TypeScript modules with lifecycle hooks) and hooks (event-driven scripts).

Hermes: Registry-Based, Self-Registering

Tools self-register via registry.register() at import time. Central dispatch through ToolRegistry:

registry.register(
    name="terminal", toolset="terminal",
    schema={...}, handler=handle_terminal,
    check_fn=check_terminal,        # Availability gate
    requires_env=["SOME_VAR"],      # For UI display
    is_async=False,
)

Toolset system: named bundles resolved via platform presets (hermes-cli, hermes-telegram, etc.) + explicit enable/disable.

40+ tools organized into toolsets:

  • terminal, file, web, browser, vision, image_generation, skills, cron, tts, todo, memory, session_search, delegate, send_message, honcho, homeassistant, code_execution, clarify, etc.

DANGEROUS_PATTERNS approval system: regex-based detection of destructive commands → interactive approval (CLI) or async approval (gateway). Smart approval via auxiliary LLM for false positives. Per-session state + permanent allowlist.
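The detection side of that flow reduces to a regex sweep. The patterns below are illustrative examples only — Hermes' real DANGEROUS_PATTERNS list is more extensive and tuned against false positives (which is what the smart LLM approval layer exists to absorb):

```python
import re

# Illustrative patterns only; not Hermes' actual list.
DANGEROUS_PATTERNS = [
    re.compile(r"\brm\s+-[a-z]*r[a-z]*f"),    # recursive force delete
    re.compile(r"\bmkfs\b"),                  # filesystem format
    re.compile(r"\bdd\b.*\bof=/dev/"),        # raw writes to a device
    re.compile(r">\s*/dev/sd[a-z]\b"),        # redirect onto a disk
]

def needs_approval(command: str) -> bool:
    """True when a shell command matches a destructive pattern and should
    be routed to interactive (CLI) or async (gateway) approval."""
    return any(p.search(command) for p in DANGEROUS_PATTERNS)

print(needs_approval("rm -rf /tmp/build"))  # True
print(needs_approval("ls -la"))             # False
```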

Key Difference

OpenClaw's tool policy is declarative (JSON config) with a formal layered resolution order. Hermes' tool system is imperative (Python registration with check functions) with a toolset/preset resolution model. OpenClaw has richer extensibility (plugins with lifecycle hooks, in-process); Hermes has richer built-in tools (vision, image generation, Honcho, Home Assistant, code execution sandbox).

OpenClaw's exec approval is sender-based (authorized senders can approve); Hermes has a more sophisticated approval flow with pattern-based detection, smart LLM approval, and permanent allowlisting.


9. Skills System

Both implement the agentskills.io open standard with near-identical patterns:

| Aspect | OpenClaw | Hermes |
|---|---|---|
| Format | SKILL.md with YAML frontmatter | SKILL.md with YAML frontmatter |
| Loading | Lazy (model reads SKILL.md on demand) | Progressive disclosure (list → view → reference) |
| Discovery | Workspace → Managed → Bundled | Local ~/.hermes/skills/ (single source) + external dirs |
| Gating | requires.bins, requires.env, requires.config, os | platforms, fallback_for_toolsets, requires_toolsets |
| Self-creation | Agent can write skills via file tools | Agent creates skills autonomously after complex tasks |
| Self-improvement | Manual updates | Skills self-improve during use (learning loop) |
| Hub | ClawHub (clawhub.ai) | Skills Hub (agentskills.io) |
| Slash commands | No | Yes (/skill-name triggers skill loading) |

Key Difference

Hermes has autonomous skill creation — after solving a complex task, it can automatically create a skill from the experience. It also has a learning loop where skills self-improve during use. OpenClaw's skills are static unless manually edited. Hermes also has conditional activation (fallback skills that appear only when premium tools are unavailable) and secure setup on load (prompts for API keys only when the skill is accessed).


10. Messaging & Gateway

OpenClaw: WebSocket-First Daemon

  • Single long-lived Node.js daemon
  • Typed WebSocket API (JSON frames with req/res/event protocol)
  • All clients connect over one WebSocket: macOS app, CLI, web UI, mobile nodes
  • Idempotency keys for side-effecting methods
  • Channel bridges: WhatsApp (Baileys), Telegram (grammY), Discord, Slack, Signal, iMessage, WebChat
  • Device pairing with challenge-nonce trust model
  • DM scope options: main, per-peer, per-channel-peer

Hermes: Orchestration Process

  • Long-running Python process (hermes gateway)
  • Platform adapters (Telegram, Discord, Slack, WhatsApp, Signal, Email, Home Assistant)
  • Multi-source config: env vars + gateway.json + bridged values from config.yaml
  • DM pairing flows with platform-specific authorization
  • Session routing by platform + user/chat identity + thread/topic
  • Delivery routing: home channel, explicit targets, mirroring

Key Difference

OpenClaw's WebSocket-first protocol enables richer client integration (real-time streaming, device pairing, node system). Hermes' gateway is simpler — it's a dispatch layer that routes to AIAgent instances. OpenClaw's node system (camera, screen, location, commands on paired devices) has no Hermes equivalent.


11. Sub-Agent Architecture

| Aspect | OpenClaw | Hermes |
|---|---|---|
| Isolation | Dedicated session (subagent:<uuid>) | Spawned via delegate_task |
| Nesting | Prohibited (no fan-out) | Shared iteration budget across parent/child |
| Tool restrictions | No session tools by default | Budget pressure hints near limit |
| Result delivery | Announce pattern (post back to requester) | Return to parent context |
| Concurrency | Dedicated lane (max 8) | Thread-based parallel execution |
| Model selection | Can use cheaper models | Can use different providers |
| ACP integration | Spawn coding agents (Codex, Claude Code) via sessions_spawn | ACP editor integration (stdio/JSON-RPC) |

12. Terminal / Execution Environments

OpenClaw: Host + Docker Sandbox

  • Modes: off (host), non-main (sandbox non-main sessions), all
  • Scope: per-session, per-agent, or shared container
  • Workspace access: none, read-only, or read-write mount
  • Elevated execution bypass for authorized senders

Hermes: Six Terminal Backends

  • Local: direct host execution
  • Docker: containerized execution
  • SSH: remote host execution
  • Daytona: serverless persistence (hibernates when idle)
  • Modal: serverless cloud execution
  • Singularity: HPC container execution

Key Difference

Hermes offers significantly more execution environments. Daytona and Modal provide serverless persistence — the agent's environment hibernates between sessions, costing nothing when idle. OpenClaw's sandboxing is focused on security isolation; Hermes' backends are focused on deployment flexibility.


13. Security Model

| Layer | OpenClaw | Hermes |
|---|---|---|
| Auth | Gateway token (all connections) | Platform allowlists + DM pairing |
| Device trust | Challenge-nonce pairing, device tokens | Platform-specific pairing flows |
| Tool control | Layered allow/deny policy (JSON) | Toolset enable/disable + per-tool check_fn |
| Exec approval | Sender-based authorization | Pattern-based detection + smart LLM approval + permanent allowlist |
| Sandbox | Docker (off/non-main/all) | Six backends (Docker, Daytona, Modal, etc.) |
| Outbound | Send policy (outbound gates) | Not explicitly documented |
| Prompt safety | Advisory guardrails in system prompt | Advisory guardrails in system prompt |
| Memory safety | MEMORY.md only in private sessions | Security scanning on memory writes (injection/exfiltration) |

Key Difference

OpenClaw has a 7-layer formal security model with explicit outbound send policy. Hermes has more sophisticated command approval (regex patterns, smart LLM approval, session state, permanent allowlist). Hermes scans memory writes for injection attempts; OpenClaw protects memory by scope (private sessions only).


14. Research & Training (Hermes Only)

Hermes includes infrastructure that has no OpenClaw equivalent:

  • Batch trajectory generation for SFT data
  • Atropos RL environments for reinforcement learning
  • Trajectory compression for training tool-calling models
  • Environment framework for benchmarks and evaluation

This reflects Hermes' origin at Nous Research — the agent is both a product and a research platform for improving the next generation of models.


15. Summary Matrix

| Capability | OpenClaw | Hermes |
|---|---|---|
| Language/Runtime | TypeScript / Node.js | Python |
| Session Storage | JSONL files | SQLite (WAL) |
| Memory Model | Unbounded files + vector search | Bounded tool-managed + FTS5 |
| Memory Recall | BM25 + vector embeddings (hybrid) | FTS5 + LLM summarization |
| Prompt Caching | Cache-stable timestamps | Anthropic cache_control breakpoints |
| Compression | Compaction + pruning | Dual-layer (gateway + agent) with structured templates |
| Tool Extension | Plugins (in-process) + Hooks (event) | Registry (self-registering) + Toolsets |
| Exec Approval | Sender-based | Pattern + smart LLM + allowlist |
| Terminal Backends | Local + Docker | Local + Docker + SSH + Daytona + Modal + Singularity |
| Skills Self-Creation | No | Yes (autonomous + self-improving) |
| Node/Device Control | Yes (camera, screen, location, run) | No |
| User Modeling | USER.md (manual) | USER.md + Honcho (AI dialectic) |
| RL/Training | No | Atropos + trajectories + benchmarks |
| OpenClaw Migration | N/A | Built-in (hermes claw migrate) |
| Gateway Protocol | Typed WebSocket (req/res/event) | Platform adapters (dispatch) |
| Concurrent Writes | Lane-based command queue | SQLite WAL + application-level retries |

16. Architectural Philosophy

OpenClaw is infrastructure-first: a robust daemon that owns all state, with formal concurrency primitives, a typed wire protocol, and defense-in-depth security. It's optimized for always-on deployment with rich client integration (desktop apps, mobile nodes, device pairing). The workspace-as-git-repo philosophy makes everything human-readable and version-controllable.

Hermes is research-first: a flexible agent loop designed for experimentation, with multiple execution backends, autonomous skill creation, RL training infrastructure, and user modeling. It's optimized for the developer who wants to customize everything — switch models, add terminal backends, generate training data, run benchmarks. The bounded memory system forces efficient curation.

Both are converging: Hermes has an explicit OpenClaw migration path, compatible skills format, and similar prompt assembly patterns. The architectures are different enough to represent genuine alternative approaches to the same problem — not just language ports.


Comparison authored by Loki@FastStart ⚡ — April 2026

Sources: OpenClaw Architecture Deep Dive, Hermes Agent developer documentation (hermes-agent.nousresearch.com/docs)