Mnemos is a persistent memory engine for AI coding agents that stores, retrieves, and manages project knowledge across sessions using SQLite with full-text and semantic search.
Core MCP Tools:
mnemos_store: Save memories with optional metadata — type (short_term, long_term, episodic, semantic, working), tags, category, summary, source, and project scope. Automatically deduplicates and indexes content.
mnemos_search: Hybrid FTS5 + semantic search with configurable mode (text, semantic, or hybrid) and RRF ranking, optionally scoped to a project.
mnemos_context: Assemble the most relevant memories for a query within a token budget — ideal for context injection at session start.
mnemos_get: Fetch a specific memory by ID.
mnemos_update: Modify an existing memory's content, summary, or tags (PATCH semantics).
mnemos_delete: Soft-delete a memory by ID (recoverable).
mnemos_relate: Create typed relationships between memories with optional strength, building a knowledge graph.
mnemos_maintain: Run decay scoring, archival, and garbage collection to keep the memory store efficient.
Deployment & Integration:
Runs as an MCP server (stdio) or optional REST API server
One-command autopilot setup for Claude Code, Kiro, Cursor, and Windsurf
Single Go binary with embedded SQLite — no cloud, no runtime dependencies
Optional semantic search via Ollama or OpenAI embeddings
Optional Markdown mirror for human-readable memory export
Sub-60ms operations regardless of dataset size
mnemos
Your AI agent has the memory of a goldfish. Mnemos fixes that.
A persistent memory engine for AI coding agents.
Mnemos gives Claude Code, Kiro, Cursor, Windsurf, and other MCP clients a memory that survives across sessions: architecture decisions, bug root causes, project conventions, and non-obvious implementation details.
Single Go binary. Embedded SQLite. Zero runtime dependencies. No Docker. No cloud. No Python. No Node.
Agent (Claude Code / Kiro / Cursor / Windsurf / ...)
↓ MCP stdio
mnemos serve
↓
SQLite + FTS5 (~/.mnemos/mnemos.db)

What does it actually do?
Every time your agent learns something worth keeping, it stores it in Mnemos. Next session, it can pull that context back before it starts coding.
That means:
fewer repeated explanations
less re-discovery of old bugs and decisions
more continuity across sessions
better context for long-running projects
No more re-explaining your project structure every Monday morning. No more rediscovering the same environment quirk three times in one week.
The memory lifecycle:
Agent finishes something meaningful (fixed a bug, made a decision, learned a pattern)
Agent calls mnemos_store with the content
Mnemos deduplicates, classifies, and indexes it
Next session, mnemos_context assembles relevant memories within a token budget
Agent picks up right where it left off
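At its core, context assembly is a packing problem: fit the highest-value memories into a fixed token budget. Below is a deliberately simplified Python sketch of that idea. Mnemos actually uses MMR for diversity (see the performance table), and the 4-characters-per-token estimate here is a rough assumption, not the real tokenizer:

```python
def assemble_context(candidates, budget_tokens):
    """Greedily pack the best-scoring memories into a token budget.

    candidates: (relevance_score, text) pairs. Token cost is estimated
    crudely as ~1 token per 4 characters (an assumption for illustration).
    """
    picked, used = [], 0
    for score, text in sorted(candidates, reverse=True):
        cost = max(1, len(text) // 4)
        if used + cost <= budget_tokens:
            picked.append(text)
            used += cost
    return picked

memories = [
    (0.9, "Auth uses JWT RS256; tokens expire after 1h."),
    (0.8, "CI runs golangci-lint before tests."),
    (0.3, "Old note: " + "x" * 800),  # low score and far too big for the budget
]
context = assemble_context(memories, budget_tokens=40)
```

The real implementation also has to balance relevance against redundancy (that is what MMR adds); this sketch only shows the budget constraint.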
Why it feels different
Mnemos is built for real coding workflows, not just generic note storage.
MCP-native: designed to be called directly by coding agents
Fast to install: one binary, one local database
Actually useful retrieval: FTS + optional semantic search + context assembly
Lifecycle aware: deduplication, relevance decay, archive/GC
Readable by humans too: optional Markdown mirror
Autopilot ready: one setup command wires hooks + steering for Claude, Kiro, or Cursor
Quick Start
# install
curl -fsSL https://raw.githubusercontent.com/mnemos-dev/mnemos/main/install.sh | bash
# first-time setup
mnemos init
# run as MCP server
mnemos serve

Then wire it to your AI client — or use autopilot setup to make it fully automatic.
Autopilot Setup
Mnemos is an MCP server, not an agent controller. By default, the agent only uses memory if it's instructed to. Autopilot closes that gap: one command injects hook config, steering files, and MCP config so the agent uses memory automatically on every session — no reminding needed.
# Claude Code
mnemos setup claude
# Kiro
mnemos setup kiro
# Cursor
mnemos setup cursor

That's it. From that point:
Session start — Mnemos automatically loads relevant context into the agent's window
During work — Mnemos searches memory when the topic changes
Session end — Mnemos verifies memory was captured and cleans up state
Use --global to install for all projects instead of just the current one:
mnemos setup claude --global

Use --force to overwrite existing config files without prompting:

mnemos setup claude --force

What mnemos setup writes
Claude Code (mnemos setup claude):
| File | Purpose |
| --- | --- |
| CLAUDE.md | Steering instructions — tells Claude when and what to store |
| .claude/hooks.json | Hook config — wires session-start, prompt-submit, session-end |
| .mcp.json | MCP server config — registers the mnemos server |
Kiro (mnemos setup kiro):
| File | Purpose |
| --- | --- |
| .kiro/steering/mnemos.md | Steering file — auto-loaded by Kiro on every session |
| .kiro/settings/mcp.json | MCP server config |
Cursor (mnemos setup cursor):
| File | Purpose |
| --- | --- |
| .cursorrules | Steering instructions for Cursor |
| | MCP server config |
How autopilot works under the hood
Autopilot uses two complementary systems:
Hooks (deterministic) — mnemos hook session-start/prompt-submit/session-end
These are short-lived processes called by the AI client at specific lifecycle events. They run in InitLight mode (no background workers, cold start < 50ms) and always exit 0 — they never interrupt the agent session.
session-start: assembles relevant context from Mnemos and injects it into the agent's window
prompt-submit: detects topic changes using Jaccard similarity; searches memory when the topic shifts; respects a 5-minute cooldown per topic to avoid noise
session-end: counts memories stored during the session; optionally leaves a breadcrumb; cleans up session state
Steering (LLM-guided) — CLAUDE.md / .kiro/steering/mnemos.md / .cursorrules
These files instruct the agent on what to store and when. The agent makes the semantic judgment — hooks handle the mechanical retrieval.
Session start → hook injects context → agent reads it
During work → hook searches on topic change → agent receives results
→ agent discovers durable learning → agent calls mnemos_store via MCP
Session end → hook verifies coverage → state cleanup

Session state
Each session gets its own state file at .mnemos/sessions/session-<id>.json (falls back to ~/.mnemos/sessions/ if no local .mnemos/ dir exists). Stale and orphaned sessions are cleaned up automatically.
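The topic-change detection that prompt-submit relies on can be sketched as word-set overlap. The tokenization and the cutoff value below are illustrative assumptions; only the use of Jaccard similarity and the 5-minute cooldown are stated behavior:

```python
def jaccard(a, b):
    """Jaccard similarity of two token sets: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a | b) else 1.0

def topic_changed(prev_prompt, new_prompt, threshold=0.2):
    # Low word overlap between consecutive prompts suggests a topic shift,
    # which would trigger a memory search (subject to the cooldown).
    sim = jaccard(prev_prompt.lower().split(), new_prompt.lower().split())
    return sim < threshold

print(topic_changed("fix the jwt refresh bug",
                    "why does the jwt refresh fail"))   # same topic: False
print(topic_changed("fix the jwt refresh bug",
                    "update the readme install steps")) # new topic: True
```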
Why mnemos?
| | claude-mem | engram | neural-memory | mnemos |
| --- | --- | --- | --- | --- |
| MCP native | ✅ | ✅ | ✅ | ✅ |
| Single binary / zero install | ❌ | ✅ | ❌ (pip) | ✅ |
| Zero config to start | ✅ | ✅ | ❌ | ✅ |
| Hybrid search (FTS + semantic RRF) | ❌ | ❌ | ❌ | ✅ |
| Memory decay / lifecycle | ❌ | ❌ | ✅ | ✅ |
| Deduplication | ❌ | ❌ | ❌ | ✅ (3-tier) |
| Relation graph | ❌ | ❌ | ✅ (spreading activation) | ✅ |
| Token-budget context assembly | ❌ | partial | ❌ | ✅ |
| Human-readable Markdown mirror | ✅ | ❌ | ❌ | ✅ |
| Works with Kiro / Cursor / Windsurf | ❌ | ✅ | ✅ | ✅ |
| No Python / Node runtime required | ✅ | ✅ | ❌ | ✅ |
| One-command autopilot setup | ❌ | ❌ | ❌ | ✅ |
| Written in Go | ❌ | ✅ | ❌ (Python) | ✅ |
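The 3-tier deduplication could plausibly look like the sketch below: an exact content-hash check first, then a fuzzy string match against the config's fuzzy_threshold (0.85). This is an illustration, not Mnemos's actual code; the third tier, an embedding comparison against semantic_threshold (0.92), is omitted because it needs an embedding provider:

```python
import hashlib
from difflib import SequenceMatcher

def dedup_tier(new_text, existing, fuzzy_threshold=0.85):
    """Return which tier flagged the new memory as a duplicate, or None."""
    new_hash = hashlib.sha256(new_text.encode()).hexdigest()
    for text in existing:
        # Tier 1: byte-identical content, caught by a hash comparison.
        if hashlib.sha256(text.encode()).hexdigest() == new_hash:
            return "exact"
        # Tier 2: near-identical wording, caught by a similarity ratio.
        if SequenceMatcher(None, new_text, text).ratio() >= fuzzy_threshold:
            return "fuzzy"
    return None  # no duplicate found; safe to store

print(dedup_tier("JWT tokens expire in 1h",
                 ["JWT tokens expire in 1h."]))  # "fuzzy"
```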
Install
# curl (macOS / Linux)
curl -fsSL https://raw.githubusercontent.com/mnemos-dev/mnemos/main/install.sh | bash
# Homebrew
brew install s60yucca/tap/mnemos
# Build from source (requires Go 1.23+)
git clone https://github.com/mnemos-dev/mnemos
cd mnemos && make build
# binary at: bin/mnemos

Initialize on first run:
mnemos init
# Creates ~/.mnemos/mnemos.db and ~/.mnemos/config.yaml

Use with Claude Code
Autopilot (recommended):
mnemos setup claude

This writes CLAUDE.md, .claude/hooks.json, and .mcp.json in one shot. Restart Claude Code and memory is fully automatic.
Manual setup:
Add to .mcp.json in your project root:
{
"mcpServers": {
"mnemos": {
"command": "mnemos",
"args": ["serve"],
"env": {
"MNEMOS_PROJECT_ID": "my-project"
}
}
}
}

Then add a session instruction based on templates/claude/CLAUDE.md.
Use with Kiro
Autopilot (recommended):
mnemos setup kiro

This writes .kiro/steering/mnemos.md and .kiro/settings/mcp.json. Kiro picks up the steering file automatically on every session.
Manual setup:
Add to .kiro/settings/mcp.json:
{
"mcpServers": {
"mnemos": {
"command": "mnemos",
"args": ["serve"],
"env": {
"MNEMOS_PROJECT_ID": "my-project"
},
"disabled": false,
"autoApprove": ["mnemos_search", "mnemos_get", "mnemos_context"]
}
}
}

Copy templates/kiro/steering/mnemos.md to .kiro/steering/mnemos.md.
Use with Cursor / Windsurf / any MCP client
Autopilot:
mnemos setup cursor

Manual: Same JSON MCP config as above — mnemos speaks standard MCP over stdio.
MCP Tools
| Tool | What it does |
| --- | --- |
| mnemos_store | Store a memory with optional type, tags, project scope |
| mnemos_search | Hybrid FTS + semantic search with RRF ranking |
| mnemos_get | Fetch a memory by ID |
| mnemos_update | Update content, summary, or tags |
| mnemos_delete | Soft-delete (recoverable via maintain) |
| mnemos_relate | Link two memories with a typed relation |
| mnemos_context | Assemble relevant memories within a token budget |
| mnemos_maintain | Run decay, archival, and garbage collection |
Resources: mnemos://memories/{project_id}, mnemos://stats
Prompts: load_context (session start), save_session (session end)
CLI
mnemos init # first-time setup
mnemos store "JWT uses RS256, tokens expire in 1h" # store a memory
mnemos search "authentication" # hybrid search
mnemos search "auth" --mode text # text-only search
mnemos list --project myapp # list memories
mnemos get <id> # fetch by id
mnemos update <id> --content "updated text" # update
mnemos delete <id> # soft delete
mnemos delete <id> --hard # permanent delete
mnemos relate <src-id> <tgt-id> --type depends_on # create relation
mnemos stats --project myapp # storage stats
mnemos maintain # decay + GC
mnemos serve # start MCP server (stdio)
mnemos serve --rest --port 8080 # start REST server
mnemos version # print version
# Autopilot setup
mnemos setup claude # setup for Claude Code
mnemos setup kiro # setup for Kiro
mnemos setup cursor # setup for Cursor
mnemos setup claude --global # install globally (home dir)
mnemos setup claude --force # overwrite without prompting
# Hook subcommands (called automatically by AI clients — not for manual use)
mnemos hook session-start
mnemos hook prompt-submit
mnemos hook session-end

Global flags: --project <id>, --config <path>, --log-level debug|info|warn|error
Memory Types
Mnemos auto-classifies memories based on content. You can override manually.
| Type | Decay rate | Use for |
| --- | --- | --- |
| short_term | fast (~1 day) | todos, temp notes, WIP |
| episodic | medium (~1 month) | session events, bug fixes |
| long_term | slow (~6 months) | architecture decisions |
| semantic | very slow | facts, definitions, knowledge |
| working | fast | active task context |
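One plausible reading of these decay rates is an exponential relevance score with a per-type half-life. The formula and half-life values below are assumptions for illustration, not Mnemos's actual scoring; what is grounded in the default config is that memories fall out of active rotation once their score drops under archive_threshold (0.1):

```python
# Hypothetical half-lives, loosely matching the decay rates in the table above.
HALF_LIFE_DAYS = {
    "short_term": 1,
    "working": 1,
    "episodic": 30,
    "long_term": 180,
    "semantic": 3650,  # "very slow"
}

def relevance(memory_type, age_days):
    # Exponential decay: the score halves once per half-life.
    return 0.5 ** (age_days / HALF_LIFE_DAYS[memory_type])

print(relevance("episodic", 30))   # 0.5 after one month
print(relevance("short_term", 7))  # well below a 0.1 archive threshold
```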
How search works
Mnemos uses Reciprocal Rank Fusion (RRF) to combine two search signals:
FTS5 — SQLite full-text search with BM25 ranking. Fast, offline, no setup.
Semantic — vector cosine similarity via embeddings. Optional, requires Ollama or OpenAI.
With only FTS5 (default), search is keyword-based but still very good. Enable embeddings to find memories by meaning — e.g. query "token expiry" finds a memory about "JWT RS256 1h lifetime".
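The fusion step can be sketched in a few lines of Python. Note that k=60 is the conventional RRF constant; Mnemos's actual constant and any per-signal weighting are not specified here:

```python
def rrf(rankings, k=60):
    """Reciprocal Rank Fusion: merge several ranked lists of memory IDs.

    Each memory scores sum(1 / (k + rank)) over the lists it appears in,
    so items ranked well by both FTS5 and the semantic search rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, mem_id in enumerate(ranking, start=1):
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fts_hits = ["m3", "m1", "m7"]        # BM25 order from FTS5
vector_hits = ["m1", "m9", "m3"]     # cosine-similarity order from embeddings
print(rrf([fts_hits, vector_hits]))  # m1 first: strong in both lists
```

RRF needs only ranks, not raw scores, which is why it can fuse BM25 values and cosine similarities without normalizing them onto a common scale.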
Configuration
~/.mnemos/config.yaml:
data_dir: ~/.mnemos
log_level: info
log_format: text   # text or json

embeddings:
  provider: noop   # noop (default) | ollama | openai
  base_url: http://localhost:11434
  model: nomic-embed-text
  dims: 384
  api_key: ""

dedup:
  fuzzy_threshold: 0.85
  semantic_threshold: 0.92

lifecycle:
  decay_interval: 24h
  gc_retention_days: 30
  archive_threshold: 0.1

mirror:
  enabled: false   # set true to write human-readable Markdown files
  base_dir: ~/.mnemos/mirror

hook:
  enabled: true
  search_cooldown: 5m
  session_start_max_tokens: 2000
  prompt_search_limit: 5
  stale_timeout: 1h
  log_level: warn

Environment variables override config — prefix with MNEMOS_:
MNEMOS_PROJECT_ID=myapp # scope memories to a project
MNEMOS_LOG_LEVEL=debug
# Only needed if using semantic embeddings (optional):
MNEMOS_EMBEDDINGS_PROVIDER=ollama
MNEMOS_EMBEDDINGS_API_KEY=sk-...

Embedding Providers
Embeddings are optional. By default mnemos uses noop — pure FTS5 text search, zero config, works fully offline.
Enable embeddings only if you want semantic similarity search (find memories by meaning, not just keywords).
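"Finding by meaning" reduces to cosine similarity between the embedding vectors the provider returns. A minimal sketch of the comparison step (the vectors here are toy 3-dim values; real embeddings have the dims configured below, e.g. 384, 768, or 1536):

```python
import math

def cosine(a, b):
    """Cosine similarity: 1.0 for parallel vectors, 0.0 for orthogonal ones."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

query_vec = [0.9, 0.1, 0.0]   # e.g. embedding of "token expiry"
memory_vec = [0.8, 0.2, 0.1]  # e.g. embedding of "JWT RS256 1h lifetime"
print(cosine(query_vec, memory_vec))  # close to 1.0: a semantic match
```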
Ollama (local, free, no API key):
embeddings:
  provider: ollama
  base_url: http://localhost:11434
  model: nomic-embed-text
  dims: 768

OpenAI:
embeddings:
  provider: openai
  model: text-embedding-3-small
  dims: 1536
  api_key: sk-...

Performance
Benchmarked on macOS (Apple M-series), SQLite WAL mode, embeddings disabled (noop), cold process start per operation.
| Operation | 350 memories | 1500 memories | Notes |
| --- | --- | --- | --- |
| store | 57 ms | 24 ms | includes dedup check |
| store (duplicate) | 55 ms | 22 ms | hash match, no write |
| search (text) | 60 ms | 54 ms | BM25 ranking |
| search (hybrid) | 42 ms | 39 ms | FTS + noop vector |
| list | 34 ms | 26 ms | sorted by created_at |
| stats | 27 ms | 108 ms | full table scan |
| context assembly (MMR, 20 candidates) | ~147 µs | — | in-process, token budget 2000 |
| hook session-start | < 200 ms | — | cold start, InitLight |
| hook prompt-submit | < 100 ms | — | cold start, InitLight |
| hook session-end | < 100 ms | — | cold start, InitLight |
| binary size | 12 MB | — | single static binary |
| startup time | ~50 ms | — | cold start |
Most operations stay under 60 ms regardless of dataset size. Hook subcommands use InitLight mode — no background workers start, keeping latency low enough to not interrupt agent sessions.
REST API
mnemos serve --rest --port 8080

POST   /memories             store
GET    /memories/{id}        get
PATCH  /memories/{id}        update
DELETE /memories/{id}        soft-delete
GET    /memories             list
POST   /memories/search      search
POST   /memories/{id}/relate relate
GET    /stats                stats
POST   /maintain             maintenance

Build
make build # → bin/mnemos
make test # all tests
make lint # golangci-lint
make release   # goreleaser snapshot

License
MIT