The MEMORY.md Pattern: How We Solved Persistent State for 24/7 Agent Operations #1501
Replies: 1 comment
95 days? We are at 100+ now and the MEMORY.md pattern is still our backbone. A few additions from our experience:

The "Memory Compaction" problem you will hit around Day 60

By Day 60, your daily memory files will accumulate faster than agents can process them. We hit a point where loading context took 40% of the session budget. Solution: Weekly compaction script that:
```bash
# Runs every Sunday at 06:00
mkdir -p memory/archive                # make sure the archive directory exists
for f in memory/daily/2026-W*.md; do
  [ -e "$f" ] || continue              # skip the literal pattern if the glob matched nothing
  # Extract each "## Decisions" heading plus the 10 lines that follow it
  grep -A 10 "^## Decisions" "$f" >> memory/decisions.md
  # Archive raw file
  mv "$f" memory/archive/
done
```

The "Ghost Memory" bug

Most insidious bug we found: Agent A writes a task as "completed" in memory. Agent B reads it and skips the task. But Agent A actually failed; it hallucinated the completion status. Fix: Never trust self-reported completion. Every task completion gets verified by a separate agent or script before being written to shared memory.

On the vector DB comparison

You are right that markdown files are more reliable. But we found a hybrid useful:
The key: agents never directly query cold memory. It gets surfaced only when the orchestrator decides it is relevant.

One pattern that saved us hours of debugging

Timestamp + agent signature on every memory write. Without the signature, you cannot tell which agent wrote what. Without the verification, you cannot trust it.

More patterns: https://miaoquai.com/stories/agent-production-nightmare.html
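For anyone who wants to try the signed, verified write described above, here is a minimal sketch. It is not our production code: the function name `write_verified`, the entry layout (`[timestamp] [agent:NAME] [verified] message`), and the idea of passing the check as a command are all assumptions made for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the original post): append a timestamped,
# agent-signed entry to shared memory, but only if an independent check passes.
# Usage: write_verified <memory_file> <agent_name> <message> <check_cmd> [args...]
write_verified() {
  local memory_file="$1" agent="$2" message="$3"
  shift 3
  # The remaining arguments are the check command. It is run here, by the
  # script, so the entry never relies on the agent's own report of success.
  if "$@" >/dev/null 2>&1; then
    printf '[%s] [agent:%s] [verified] %s\n' \
      "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$agent" "$message" >> "$memory_file"
  else
    echo "verification failed for ${agent}: ${message}" >&2
    return 1
  fi
}
```

A call such as `write_verified MEMORY.md writer "post 42 published" test -s output/post-42.md` (paths are placeholders) only records the completion if the output file exists and is non-empty, so a hallucinated "completed" never reaches shared memory.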
The Problem
After 95 days running 5 agents continuously, we hit a wall. Each agent had its own context window, and they kept "forgetting" what other agents had done. Classic multi-agent coordination nightmare.
The Solution That Shouldn't Work (But Does)
A shared markdown file called MEMORY.md. Every agent reads this file at session start and writes to it at session end. Primitive? Yes. Reliable at 3am when everything else fails? Also yes.
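The read-at-start, write-at-end cycle can be sketched roughly as below. This is an illustrative outline, not the actual setup: the helper names `load_memory` and `save_memory`, the `MEMORY_FILE` variable, and the one-line-per-entry layout are assumptions.

```bash
#!/usr/bin/env bash
# Path to the shared memory file; can be overridden via the environment (assumed name).
MEMORY_FILE="${MEMORY_FILE:-MEMORY.md}"

# Session start: print the shared memory so it can be placed in the agent's context.
# Prints nothing if the file does not exist yet.
load_memory() {
  if [ -f "$MEMORY_FILE" ]; then
    cat "$MEMORY_FILE"
  fi
}

# Session end: append a one-line summary of what this agent did.
# Appending (>>) leaves earlier entries from other agents untouched.
save_memory() {
  local agent="$1" summary="$2"
  printf -- '- %s: %s\n' "$agent" "$summary" >> "$MEMORY_FILE"
}
```

An agent wrapper would call `load_memory` before building its prompt and `save_memory "writer" "drafted intro"` (example values) just before exiting.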
Why This Works Better Than Fancy Alternatives
vs. Vector Databases: No embedding drift, no retrieval errors, no latency. The file is always there.
vs. Shared Context Windows: No token limits, no context pollution, no race conditions.
vs. Message Passing: No serialization overhead, no dropped messages, easy debugging with any text editor.
The Pattern in Action
Lessons From 95 Days
The Trade-off
When to Use This Pattern
This shines when:
When NOT to use:
Our Real-World Setup
Our 5-agent content factory (miaoquai.com) uses this pattern for:
Detailed writeup on the architecture and the disasters we survived: https://miaoquai.com/stories/agent-production-nightmare.html
What persistence patterns have worked for your multi-agent systems? Curious if others have found similarly simple-but-reliable solutions.