memU-server is the backend management service for MemU. It provides API endpoints, data storage, and management capabilities, and integrates deeply with the core memU framework. It backs the memU-ui frontend with reliable data support, ensuring efficient reading, writing, and maintenance of agent memories. memU-server can be deployed locally or in private environments and supports quick startup and configuration via Docker, so developers can manage their AI memory system in a secure environment.
- Core Algorithm 👉 memU: https://github.com/NevaMind-AI/memU
- One call = response + memory 👉 memU Response API: https://memu.pro/docs#responseapi
- Try it instantly 👉 https://app.memu.so/quick-start
Star memU-server to get notified about new releases and join our growing community of AI developers building intelligent agents with persistent memory capabilities. 💬 Join our Discord community: https://discord.gg/memu
memU-server runs as two cooperating processes backed by shared infrastructure:
```
                 ┌─────────────────────────────────────┐
Client ──HTTP──► │ FastAPI API Server (port 8000)      │
                 │ POST /memorize → start workflow     │
                 │ GET /memorize/status/{task_id}      │
                 │ POST /retrieve, /clear, /categories │
                 └──────────────┬──────────────────────┘
                                │ gRPC
                 ┌──────────────▼──────────────────────┐
                 │ Temporal Server (port 7233)         │
                 └──────────────┬──────────────────────┘
                                │ poll
                 ┌──────────────▼──────────────────────┐
                 │ Temporal Worker Process             │
                 │ MemorizeWorkflow → task_memorize    │
                 │ (calls memu-py MemoryService)       │
                 └──────────────┬──────────────────────┘
                                │ SQL
                 ┌──────────────▼──────────────────────┐
                 │ PostgreSQL + pgvector (port 5432)   │
                 │ app db: memu | temporal db: temporal│
                 └─────────────────────────────────────┘
```
| Component | Technology | Purpose |
|---|---|---|
| API Server | FastAPI 0.122+ / Python 3.13 | HTTP endpoints, request validation, workflow dispatch |
| Workflow Engine | Temporal 1.25 / temporalio SDK 1.16 | Durable async task orchestration for /memorize |
| Worker | Temporal Worker (same codebase) | Executes MemorizeWorkflow → task_memorize activity |
| Database | PostgreSQL 16 + pgvector | Vector storage for memories, Temporal persistence |
| Memory Core | memu-py 1.2+ | Three-layer memory algorithm (Resource → Item → Category) |
- Client POSTs the conversation payload to `/memorize`.
- The API server saves the conversation to local storage, starts a Temporal workflow, and returns immediately with a `task_id`.
- Temporal dispatches the `MemorizeWorkflow` to the worker process.
- The worker executes the `task_memorize` activity (which calls memu-py `MemoryService.memorize()`), writing results to PostgreSQL.
- The client polls `GET /memorize/status/{task_id}` to track progress (`RUNNING` → `COMPLETED`/`FAILED`).
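The non-blocking dispatch hinges on the task ID handed back in the second step. As a rough sketch (the `memorize-<32 hex chars>` format is documented by the status endpoint; the helper names here are illustrative, not memu-server's actual internals), such an ID could be generated and validated like this:

```python
import re
import uuid

# Matches the documented task-ID format: "memorize-" followed by 32 hex chars.
TASK_ID_RE = re.compile(r"^memorize-[0-9a-f]{32}$")

def new_task_id() -> str:
    """Generate a workflow ID in the memorize-<32 hex chars> format."""
    return f"memorize-{uuid.uuid4().hex}"

def is_valid_task_id(task_id: str) -> bool:
    """Validate a task_id before using it to query task status."""
    return TASK_ID_RE.fullmatch(task_id) is not None
```

Using a UUID here also doubles as the Temporal workflow ID, which is one common way to make the dispatch idempotent per request.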
- Python 3.13+ and uv package manager
- Docker & Docker Compose (for infrastructure services)
- OpenAI API key (required for LLM and embedding operations)
Launch PostgreSQL (with pgvector), Temporal Server, and Temporal UI:
```shell
docker compose up -d
```

| Service | Port | Description |
|---|---|---|
| PostgreSQL | 5432 | Database with pgvector extension |
| Temporal | 7233 | Workflow engine gRPC API |
| Temporal UI | 8088 | Web management interface |
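Before starting the API server, it can be handy to confirm these ports are actually reachable. A small, generic standard-library sketch (not part of memu-server):

```python
import socket

def port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    # Ports from the table above.
    for name, port in [("PostgreSQL", 5432), ("Temporal", 7233), ("Temporal UI", 8088)]:
        state = "up" if port_open("localhost", port) else "DOWN"
        print(f"{name:12} :{port} {state}")
```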
```shell
# Clone the repository
git clone https://github.com/NevaMind-AI/memU-server.git
cd memU-server

# Install dependencies
make install
# or: uv sync

# Configure environment (create .env or export)
export OPENAI_API_KEY=your_api_key_here

# Start the API server (terminal 1)
make run
# or: uv run fastapi dev

# Start the Temporal worker (terminal 2)
uv run python -m app.workers.worker
```

The API server runs on http://127.0.0.1:8000.
```shell
export OPENAI_API_KEY=your_api_key_here

# Pull and run the API server
docker pull nevamindai/memu-server:latest
docker run --rm -p 8000:8000 \
  --network memu-network \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e POSTGRES_HOST=postgres \
  -e TEMPORAL_HOST=temporal \
  nevamindai/memu-server:latest
```

Note: Both the API server and Temporal worker share the same Docker image. Override the entrypoint to run the worker:

```shell
docker run --rm \
  --network memu-network \
  -e OPENAI_API_KEY=$OPENAI_API_KEY \
  -e POSTGRES_HOST=postgres \
  -e TEMPORAL_HOST=temporal \
  nevamindai/memu-server:latest \
  uv run python -m app.workers.worker
```
The memU-server API and worker processes load their configuration from environment variables or a `.env` file. Key application-level variables:

Docker Compose may define additional infrastructure-specific environment variables (for example, `TEMPORAL_DB`); refer to `docker-compose.yml` for the complete list used by the containers.
| Variable | Default | Description |
|---|---|---|
| `OPENAI_API_KEY` | (required) | OpenAI API key |
| `OPENAI_BASE_URL` | `https://api.openai.com/v1` | OpenAI-compatible API base URL |
| `DEFAULT_LLM_MODEL` | `gpt-4o-mini` | Chat model for memorization |
| `EMBEDDING_API_KEY` | Falls back to `OPENAI_API_KEY` | Embedding provider API key |
| `EMBEDDING_BASE_URL` | `https://api.voyageai.com/v1` | Embedding API base URL |
| `EMBEDDING_MODEL` | `voyage-3.5-lite` | Embedding model name |
| `POSTGRES_USER` | `postgres` | PostgreSQL user |
| `POSTGRES_PASSWORD` | `postgres` | PostgreSQL password |
| `POSTGRES_HOST` | `localhost` | PostgreSQL host |
| `POSTGRES_PORT` | `5432` | PostgreSQL port |
| `POSTGRES_DB` | `memu` | Application database name |
| `DATABASE_URL` | (auto-assembled) | Full DSN (overrides individual PG vars) |
| `TEMPORAL_HOST` | `localhost` | Temporal server host |
| `TEMPORAL_PORT` | `7233` | Temporal server gRPC port |
| `TEMPORAL_NAMESPACE` | `default` | Temporal namespace |
| `STORAGE_PATH` | `./data/storage` | Local directory for conversation files |
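The `(auto-assembled)` default for `DATABASE_URL` can be sketched as follows. This mirrors the defaults in the table above; the function itself is illustrative, not memu-server's actual settings code:

```python
import os

def database_url() -> str:
    """Assemble a PostgreSQL DSN, unless DATABASE_URL is set explicitly."""
    explicit = os.getenv("DATABASE_URL")
    if explicit:  # a full DSN overrides the individual PG vars
        return explicit
    user = os.getenv("POSTGRES_USER", "postgres")
    password = os.getenv("POSTGRES_PASSWORD", "postgres")
    host = os.getenv("POSTGRES_HOST", "localhost")
    port = os.getenv("POSTGRES_PORT", "5432")
    db = os.getenv("POSTGRES_DB", "memu")
    return f"postgresql://{user}:{password}@{host}:{port}/{db}"
```

With no variables set, this yields `postgresql://postgres:postgres@localhost:5432/memu`, matching the table's defaults.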
```shell
make install      # Install dependencies & pre-commit hooks
make run          # Start FastAPI dev server
make check        # Lint + type check + dependency check (CI)
make test         # Run tests with coverage
make clean        # Clean __pycache__, .pyc, build artifacts
make docker-up    # Start Docker infrastructure services
make docker-down  # Stop Docker infrastructure services
```

Verify the server is up:

```shell
curl http://localhost:8000/
```

Response: `{"message": "Hello MemU user!"}`
### `POST /memorize`

Saves conversation data and starts an async Temporal workflow. Returns immediately with a `task_id` for status polling.
Request:
```json
{
  "conversation": [
    {"role": "user", "content": {"text": "I prefer dark mode"}, "created_at": "2025-03-20 10:00:00"},
    {"role": "assistant", "content": {"text": "Noted!"}, "created_at": "2025-03-20 10:00:01"}
  ],
  "user_id": "user-001",
  "agent_id": "agent-001",
  "override_config": null
}
```

Response:
```json
{
  "status": "success",
  "result": {
    "task_id": "memorize-a1b2c3d4e5f60718293a4b5c6d7e8f90",
    "status": "PENDING",
    "message": "Memorization task submitted for user user-001"
  }
}
```

### `GET /memorize/status/{task_id}`

Track a memorization task. The `task_id` must match the format `memorize-<32 hex chars>` (as returned by `POST /memorize`).
```shell
curl http://localhost:8000/memorize/status/memorize-a1b2c3d4e5f60718293a4b5c6d7e8f90
```

Response:
```json
{
  "status": "success",
  "result": {
    "task_id": "memorize-a1b2c3d4e5f60718293a4b5c6d7e8f90",
    "status": "COMPLETED",
    "detail": "SUCCESS"
  }
}
```

Status values: `RUNNING`, `COMPLETED`, `FAILED`, `CANCELED`, `TERMINATED`, `UNKNOWN`.
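Only some of these statuses are terminal, so a client-side polling loop needs to know when to stop. A minimal sketch with an injectable status fetcher (treating `UNKNOWN` as non-terminal is an assumption here, not documented behavior):

```python
import time
from typing import Callable

# Statuses after which polling can stop.
TERMINAL_STATUSES = {"COMPLETED", "FAILED", "CANCELED", "TERMINATED"}

def wait_for_task(fetch_status: Callable[[], str],
                  interval: float = 2.0,
                  max_polls: int = 150) -> str:
    """Poll fetch_status() until it reports a terminal status, or give up."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval)
    raise TimeoutError("memorize task did not reach a terminal status")
```

In practice `fetch_status` would issue `GET /memorize/status/{task_id}` and return `result.status` from the response body.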
{"query": "What are the user's UI preferences?"}Response:
```json
{
  "status": "success",
  "result": { ... }
}
```

### `POST /clear`

Delete memories for a specific user and/or agent. At least one of `user_id` or `agent_id` must be provided.
{"user_id": "user-001", "agent_id": "agent-001"}Response:
```json
{
  "status": "success",
  "result": {
    "purged_categories": 3,
    "purged_items": 15,
    "purged_resources": 2
  }
}
```

### `POST /categories`

List all memory categories for a user.
{"user_id": "user-001"}Response:
```json
{
  "status": "success",
  "result": {
    "categories": [
      {
        "name": "UI Preferences",
        "description": "User interface preferences",
        "user_id": "user-001",
        "agent_id": "agent-001",
        "summary": "User prefers dark mode..."
      }
    ]
  }
}
```

Example client usage with `httpx`:

```python
import time

import httpx

BASE = "http://localhost:8000"

# Memorize a conversation
resp = httpx.post(f"{BASE}/memorize", json={
    "conversation": [
        {"role": "user", "content": {"text": "I like Python"}, "created_at": "2025-03-20 10:00:00"},
        {"role": "assistant", "content": {"text": "Great choice!"}, "created_at": "2025-03-20 10:00:01"},
    ],
    "user_id": "user-001",
})
task_id = resp.json()["result"]["task_id"]

# Poll until complete
while True:
    status = httpx.get(f"{BASE}/memorize/status/{task_id}").json()
    if status["result"]["status"] in ("COMPLETED", "FAILED"):
        break
    time.sleep(2)

# Retrieve memories
result = httpx.post(f"{BASE}/retrieve", json={"query": "What languages does the user like?"})
print(result.json())
```

Equivalent calls with curl:

```shell
# Submit memorization
curl -X POST http://localhost:8000/memorize \
  -H "Content-Type: application/json" \
  -d '{"conversation": [{"role":"user","content":{"text":"hello"},"created_at":"2025-01-01 00:00:00"}], "user_id":"u1"}'

# Check status (use the task_id returned by POST /memorize)
curl http://localhost:8000/memorize/status/<task_id>

# Retrieve
curl -X POST http://localhost:8000/retrieve \
  -H "Content-Type: application/json" \
  -d '{"query": "user preferences"}'

# List categories
curl -X POST http://localhost:8000/categories \
  -H "Content-Type: application/json" \
  -d '{"user_id": "u1"}'

# Clear memories
curl -X POST http://localhost:8000/clear \
  -H "Content-Type: application/json" \
  -d '{"user_id": "u1"}'
```

- Non-blocking `/memorize` endpoint returns immediately with a task ID
- Durable workflow execution — tasks survive server restarts
- Status tracking via `/memorize/status/{task_id}`
- 10-minute activity timeout with automatic retry support
- Docker image for both API server and worker
- Docker Compose for infrastructure (PostgreSQL + Temporal)
- Single `make install && make run` to start development
- Memorize: Async conversation ingestion via Temporal workflows
- Retrieve: Semantic search over stored memories (RAG-based or LLM-based)
- Clear: Targeted memory deletion by user/agent
- Categories: Browse and manage memory categories
Most memory systems in current LLM pipelines rely heavily on explicit modeling, requiring manual definition and annotation of memory categories. This limits AI’s ability to truly understand memory and makes it difficult to support diverse usage scenarios.
MemU offers a flexible and robust alternative, inspired by hierarchical storage architecture in computer systems. It progressively transforms heterogeneous input data into queryable and interpretable textual memory.
Its core architecture consists of three layers: Resource Layer → Memory Item Layer → MemoryCategory Layer.
- Resource Layer: Multimodal raw data warehouse
- Memory Item Layer: Discrete extracted memory units
- MemoryCategory Layer: Aggregated textual memory units
- Full Traceability: Track from raw data → items → documents and back
- Memory Lifecycle: Memorization → Retrieval → Self-evolution
- Two Retrieval Methods:
- RAG-based: Fast embedding vector search
- LLM-based: Direct file reading with deep semantic understanding
- Self-Evolving: Adapts memory structure based on usage patterns
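The three layers and their traceability links can be pictured with plain dataclasses. This is an illustrative sketch only; these are not memu-py's actual classes:

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    """Resource Layer: raw multimodal input data (text, audio, image, ...)."""
    resource_id: str
    modality: str
    raw: bytes

@dataclass
class MemoryItem:
    """Memory Item Layer: one discrete memory unit extracted from a resource."""
    item_id: str
    text: str
    resource_id: str  # back-link to the raw data, for traceability

@dataclass
class MemoryCategory:
    """MemoryCategory Layer: aggregated textual memory over many items."""
    name: str
    summary: str
    item_ids: list = field(default_factory=list)  # traceability to items

# Walking the links in either direction gives full traceability:
# category.item_ids -> items -> item.resource_id -> raw resource, and back.
```

Keeping explicit back-links at each layer is what makes the "raw data → items → documents and back" traceability claim above mechanically checkable.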
By contributing to memU-server, you agree that your contributions will be licensed under the AGPL-3.0 License.
For more information please contact info@nevamind.ai
- GitHub Issues: Report bugs, request features, and track development. Submit an issue
- Discord: Get real-time support, chat with the community, and stay updated. Join us
- X (Twitter): Follow for updates, AI insights, and key announcements. Follow us