MCP Memory Graph
A memory server for Claude Code and any other MCP client. It gives your AI assistant a permanent, searchable memory that lives in one SQLite file on your machine. Store a decision today, ask about it next month, and the answer comes back. Everything runs locally: the embedding model, the search index, the knowledge graph. No cloud account, no API key, no per-token cost.
License: source-available and free for noncommercial use (PolyForm Noncommercial 1.0.0): personal projects, hobby, study, research, charity, education, and government. Commercial use requires a paid license (COMMERCIAL.md).
Who it's for: developers who want Claude (or Cursor, Codex, any MCP client) to remember decisions across sessions. Solo builders and hobbyists use it free. Teams share a knowledge base over git. And anyone who wants to replace a cloud memory service (mem0, Zep, Letta, Supermemory) with something that runs entirely on their own machine.
What it looks like
Run npx mcp-memory-graph serve and you get a local web dashboard for browsing and searching your memory outside Claude.

Search works by meaning, not keywords. The query below ("how do we handle payments") finds the Stripe, GDPR, and Postgres notes even though none of them contains that phrase — each result carries a confidence score and a match-type badge:

Browse and sort the whole store in one table — scope, type, tags, quality score, and how often each memory has been read:

How it compares
mem0, Zep, Letta, and Supermemory are the usual names for AI memory, and several of them have open-source cores. This one is built around a different default: nothing leaves your machine and there's no infrastructure to run.
Aspect | MCP Memory Graph | Typical hosted memory service |
Where it runs | One SQLite file on your machine | A managed cloud service (some also self-host) |
Embeddings | Local model in Node (MiniLM), no API key | Usually a cloud embedding API |
Cost per token | $0 — nothing is metered | Usage-based, or a server you operate |
Extra infrastructure | None | Often Postgres/pgvector, Redis, or a Python service |
Claude Code integration | First-class: hooks auto-capture and recall | Manual wiring |
Benchmarks | Committed corpus + runner, reproducible locally | Mostly self-reported |
The trade-off is honest: a single-process SQLite server tops out in the low hundreds of thousands of vectors (see Limitations), and a hosted service will scale past that without you thinking about it. If you're a solo developer or a small team who wants memory that's private, free, and zero-ops, that ceiling is rarely the thing you hit first.
Why this exists
AI assistants forget everything between sessions. Your decisions, your patterns, the bug you fixed last Tuesday: all gone when the conversation ends. This server fixes that.
Knowledge stored today is searchable tomorrow, next week, next year.
Search works by meaning, not just keywords. "contract notice period" finds "90-day renewal clause".
It improves itself. It tracks what gets used, scores quality, extracts learnings from your sessions, and cleans itself up on a schedule.
It stays private. Local embeddings, no cloud APIs, no telemetry. The one exception is the optional Stop hook, which sends your session transcript to your own locally installed Claude Code (claude -p) for learning extraction. You can turn that off with review_on_stop: false.
It works for any kind of knowledge. Engineers store architecture decisions, lawyers store contract patterns, accountants store audit procedures.
Quick start (about 5 minutes)
You need Node.js 20 or newer and Claude Code installed.
1. Get the server. From npm (easiest):
npm install -g mcp-memory-graph
Or from source:
git clone https://github.com/YonasValentin/mcp-memory-graph.git
cd mcp-memory-graph
npm install
npm run build
2. Register the server with Claude Code (optional; init in step 3 does this for you at user scope):
# npm install:
claude mcp add memory-server -- npx -y mcp-memory-graph
# from source:
claude mcp add memory-server node /path/to/mcp-memory-graph/dist/index.js
3. Install the hooks (recommended):
npx mcp-memory-graph init
This is the one command that wires everything up: it registers the MCP server (user scope), installs the auto-capture/recall hooks and the usage skill, writes config, and schedules a nightly cleanup. Answer the prompts, or pass --yes to accept the defaults. (Skip the auto-registration with --no-register if you manage claude mcp yourself.)
4. Try it. Open a Claude Code session and say:
Remember this: we use Postgres for the main app database. Decided 2026-06-01,
because we need JSONB and full-text search in one place.
Then, in a later session:
What database did we decide to use, and why?
Claude searches its memory and answers with the stored decision. That's the whole loop.
5. Verify the install. Ask Claude:
What memory tools do you have available?
It should list all 50 tools (44 memory_*, 3 vault_*, 3 core_memory_*).
The first time a memory tool runs, the embedding model (about 30 MB) downloads from HuggingFace and is cached at ~/.cache/huggingface/. Every start after that is instant.
To undo everything: npx mcp-memory-graph uninstall.
Upgrading
npm install -g mcp-memory-graph@latest # or just let `npx -y mcp-memory-graph` pull it
npx mcp-memory-graph init # re-run to refresh on-disk hooks + the nightly schedule
Upgrading the package updates the code that runs each session (hooks, tools, the server), so server-side fixes apply the next time a tool runs; nothing else is needed for those.
But files that init wrote earlier are not rewritten by a package upgrade: the Claude Code hook registrations in settings.json and the macOS launchd plist at ~/Library/LaunchAgents/com.mcp-memory.consolidate.plist. If you installed before 2.6.3, that plist used a bare node that launchd (whose minimal PATH excludes nvm) could not run — so the nightly consolidation silently never fired. Re-run npx mcp-memory-graph init once after upgrading to regenerate it with an absolute node path and an output log. Verify it then runs:
launchctl start com.mcp-memory.consolidate
cat ~/.mcp-memory/consolidation.log # should show a "Consolidation complete" report
To clear conflict noise that accumulated while the job wasn't running: npx mcp-memory-graph consolidate.
How it works, in plain terms
When you store a memory, the server turns the text into a vector (a list of 384 numbers that captures its meaning) using a small model that runs inside Node.js. It also indexes the text for keyword search. Both live in one SQLite file, by default at ~/.mcp-memory/memory.db.
When you search, the server runs both kinds of search at once, merges the rankings, and returns the best matches with a confidence label. A second model can then re-sort the top results for better precision (this is the reranker, on by default for MCP clients, and it costs about 200 ms).
On top of that sits a knowledge graph: memories link to entities and to each other, so the server can answer questions that need more than one hop, like "what does the payment service depend on?". A nightly "dream cycle" deduplicates, re-scores, prunes, and reports gaps.
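The "merges the rankings" step can be pictured with Reciprocal Rank Fusion, which the Features section names as the merge method. The sketch below is a minimal illustration, not the server's code: the function name, the memory ids, and the constant k = 60 (a common choice for RRF) are assumptions, since the README does not state the value the server uses.

```typescript
// Reciprocal Rank Fusion: every ranked list contributes 1 / (k + rank) for
// each id it contains, and ids are then sorted by their summed score.
// k = 60 is an assumed value; the server's constant is not documented here.
function reciprocalRankFusion(
  rankings: string[][],
  k: number = 60
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical ids: one ranking from vector search, one from keyword search.
const vectorHits = ["stripe-note", "gdpr-note", "postgres-note"];
const keywordHits = ["postgres-note", "stripe-note", "redis-note"];
const fused = reciprocalRankFusion([vectorHits, keywordHits]);
// Ids that appear in both lists (stripe-note, postgres-note) rise to the top.
```

Because RRF only looks at rank positions, it never has to reconcile cosine similarities with BM25 scores, which live on different scales.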
The benchmarks, and how to read them
Every number below was produced locally: real embedding model, real production handlers, no network. You can rerun all of them on your own machine.
A quick primer if benchmarks are new to you. A gold set is a list of questions where the right answer is known in advance. Precision@1 asks: was the top result the right one? Recall@5 asks: was the right answer anywhere in the top 5? MRR (mean reciprocal rank) rewards putting the right answer near the top. The reranker is a second model that re-sorts the top 50 results; it is slower but noticeably more accurate.
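For readers who prefer code to definitions, here is a small sketch of how these metrics can be computed over a gold set with one known-correct answer per question. The type and function names are illustrative assumptions, not the benchmark runner's actual API. With a single correct answer, "was it in the top k" covers both precision@1 (k = 1) and recall@5 (k = 5).

```typescript
interface GoldQuery {
  expectedId: string;  // the known-correct memory for this question
  rankedIds: string[]; // what search returned, best first
}

// Fraction of questions whose correct answer appears in the top k results.
function hitRateAtK(queries: GoldQuery[], k: number): number {
  const hits = queries.filter(
    (q) => q.rankedIds.slice(0, k).indexOf(q.expectedId) >= 0
  ).length;
  return hits / queries.length;
}

// Mean reciprocal rank: 1/rank of the correct answer, 0 if it is missing.
function meanReciprocalRank(queries: GoldQuery[]): number {
  let total = 0;
  for (const q of queries) {
    const index = q.rankedIds.indexOf(q.expectedId);
    if (index >= 0) total += 1 / (index + 1);
  }
  return total / queries.length;
}

// Hypothetical gold set: correct answer at rank 1, rank 2, missing, rank 4.
const goldSet: GoldQuery[] = [
  { expectedId: "a", rankedIds: ["a", "x", "y"] },
  { expectedId: "b", rankedIds: ["x", "b", "y"] },
  { expectedId: "c", rankedIds: ["x", "y", "z"] },
  { expectedId: "d", rankedIds: ["x", "y", "z", "d"] },
];
```

On this toy set, one of four questions is right at rank 1 and three of four land in the top 5, while MRR sits between the two because it gives partial credit for near misses.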
Local gold set
Mode | precision@1 | precision@3 | MRR | search p95 |
Hybrid (RRF) | 0.563 | 0.750 | 0.704 | ~4 ms |
+ cross-encoder rerank (MCP default) | 0.813 | 0.875 | 0.867 | ~230 ms |
Reproduce with npm run bench. Full methodology, the gold set itself, and every miss are printed and documented in docs/BENCHMARKS.md.
Scale
With the real embedder and a file-backed SQLite database, retrieval p95 is 9.1 ms at 10,000 vectors and 30 ms at 50,000. The rerank pass adds a roughly constant 200 ms on top. Most memory products publish self-reported, cloud-hosted numbers; these are measured locally and reproducible from a committed corpus and runner.
Public benchmarks
Four public memory benchmarks, run untuned (stock MiniLM embedder, production handlers, zero benchmark-specific tweaks), matching or beating MemPalace on all four:
Benchmark | Our result | Comparison |
R@5 = 97.8% | vs 96.6% published | |
R@10 = 93.5% | vs 92.9% | |
session R@10 = 82.2%, R@50 = 100% | vs 60.3% baseline | |
hit@5 = 78.7% | vs their 80.3% tuned |
Run them yourself: npm run bench:longmemeval, bench:locomo, bench:convomem, bench:membench. The honest notes (where the reranker helps and where it hurts, the dedup floor on MemBench, gold-set size caveats) are in docs/BENCHMARKS.md.
Features
Core
50 MCP tools: CRUD and retrieval, a confidence-tagged knowledge graph, a self-correcting write gate, signed provenance and verification, an event bus with SSRF-guarded webhooks, change propagation and advisor surfaces, resumable session state, expertise profiles, memory tiers, Obsidian vault round-tripping, and GDPR-grade forget and history. Full list below.
Hybrid search: vector similarity (meaning) plus keyword matching (exact terms), merged with Reciprocal Rank Fusion. rerank: true adds the cross-encoder pass. use_graph: true blends in HippoRAG Personalized PageRank multi-hop scores. as_of: <timestamp> searches the graph as it stood at a past moment.
Local embeddings: Transformers.js running all-MiniLM-L6-v2 (384 dimensions) inside Node.js. No Python, no cloud API, no GPU.
SQLite storage: one file, using better-sqlite3 with two extensions: sqlite-vec for vector nearest-neighbor search, FTS5 for keyword search with BM25 ranking.
Structure-aware chunking: text splits on paragraphs, markdown on headings (heading context preserved in each chunk), code on function and class boundaries, legal on sentences.
Scopes: organize memories into global, project, user, team, department.
Version history: every update saves the previous version. Full audit trail of who changed what, when.
Temporal decay: optional time-based scoring that favors recent memories (exponential or linear).
Confidence scoring: every result carries a 0 to 1 confidence and a plain label (high, medium, low).
Expiration: time-sensitive memories can carry an expiry date and drop out of search automatically.
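The temporal decay item above mentions exponential and linear modes. A rough sketch of what those two shapes look like follows. The function name and both parameters (halfLifeDays, maxAgeDays) are hypothetical, chosen only to make the curves concrete; the server's actual parameters and defaults are not given in this README.

```typescript
type DecayMode = "exponential" | "linear";

// Returns a multiplier in [0, 1] that shrinks as a memory gets older.
// halfLifeDays and maxAgeDays are illustrative assumptions.
function decayFactor(
  ageDays: number,
  mode: DecayMode,
  halfLifeDays: number = 30,
  maxAgeDays: number = 365
): number {
  const age = Math.max(0, ageDays);
  if (mode === "exponential") {
    // Halves every halfLifeDays and never quite reaches zero.
    return Math.pow(0.5, age / halfLifeDays);
  }
  // Linear: falls in a straight line and bottoms out at zero.
  return Math.max(0, 1 - age / maxAgeDays);
}
```

The practical difference: exponential decay penalizes the first few weeks most and leaves old memories a small residual score, while linear decay treats every day equally and eventually zeroes out.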
Self-improvement
Access tracking: every search, get, and related-memory call records which memories were touched.
Quality scoring: automatic importance_score and confidence_score on every memory, from access frequency, recency, and content signals.
Learning extraction: at session end, a headless claude -p reviews the transcript and stores zero to five curated learnings. (This replaces the older type: "agent" Stop hook, which is silently broken on macOS; see anthropics/claude-code#39184.)
Dream cycle: scheduled or on-demand deduplication, re-scoring, pruning, expiry enforcement, and knowledge-gap detection.
Gap detection: searches that return nothing are logged, so you can see what knowledge is missing.
Claude Code hooks
Four opt-in hooks, installed by init:
Hook | When it fires | What it does |
SessionStart | session begins | Fast status check (memory count, expired, stale docs) |
PostToolUse | after a memory search | Tracks hits and misses to |
PreCompact | before context compression | Optional learning extraction (off by default) |
Stop | session ends | Spawns headless |
The Stop hook detaches in about 30 ms and reviews in the background for 10 to 60 seconds. It needs the claude CLI on $PATH (or $CLAUDE_BIN), authenticated. Turn it off with review_on_stop: false in ~/.mcp-memory/config.json.
Metadata on every memory
Field | Purpose | Examples |
| Isolation level | global, project, user, team, department |
| Sub-scope grouping | "my-project", "legal-team", "q4-audit" |
| Organizational unit | legal, engineering, hr, sales, finance |
| Content classification | contract, policy, code, incident, decision, report |
| Data sensitivity | public, internal, confidential, restricted |
| Flexible categorization | ["renewal", "notice-period", "compliance"] |
| Content language (ISO 639-1) | "en", "da", "de" |
| Origin | file path, URL, system name |
| Creator | person or system name |
| Domain-specific JSON |
|
| Auto-expiration date | ISO 8601 timestamp |
scope and namespace group content within one database. A shared-database MCP_API_NAMESPACE pin gives supported per-namespace multi-tenant isolation (schema v14); a separate database file per tenant is the strongest boundary. See docs/MULTI-TENANCY.md.
Knowledge graph and bi-temporal model
Bi-temporal validity: every memory carries valid-time (valid_from, valid_to) alongside transaction-time. Updates invalidate rather than delete: the prior fact gets a valid_to stamp instead of being overwritten, so history is never lost. Reads default to currently valid rows but accept as_of: <timestamp> for point-in-time recall. memory_history returns one memory's full timeline.
Confidence-tagged links: memories connect via wikilink, co-occurrence, and similarity edges, each with a confidence weight. memory_graph traverses entities and relationships up to 3 hops. memory_extract_entities stores LLM-extracted entities and relationships.
HippoRAG multi-hop: use_graph: true on search runs Personalized PageRank over the entity and link graph for associative retrieval.
Token-budgeted answers: memory_query answers a question with a tight subgraph. It seeds from hybrid search, walks the graph up to max_hops while avoiding hubs, and returns a token-budgeted context string instead of flooding the window.
Communities: memory_communities finds densely connected entity clusters, for "what are the main themes in here?" questions.
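The valid-time behavior described above (updates close the old row instead of deleting it, and reads can ask for a past moment) can be sketched in a few lines. This is an illustration under assumptions: the camelCase field names, the sample rows, and the half-open interval (valid from validFrom up to but not including validTo) are not taken from the server's schema.

```typescript
interface MemoryVersion {
  id: string;
  content: string;
  validFrom: string;      // ISO 8601 timestamp
  validTo: string | null; // null means the row is still valid
}

// Returns the rows that were valid at the given moment (default: now).
function visibleAt(rows: MemoryVersion[], asOf: Date = new Date()): MemoryVersion[] {
  const t = asOf.getTime();
  return rows.filter((row) => {
    const from = Date.parse(row.validFrom);
    const to = row.validTo === null ? Infinity : Date.parse(row.validTo);
    return from <= t && t < to; // half-open interval is an assumption
  });
}

// Hypothetical history: the update stamped validTo on the old row and
// added a new row, so both facts remain queryable.
const history: MemoryVersion[] = [
  { id: "db-choice", content: "We use MySQL", validFrom: "2026-01-01T00:00:00Z", validTo: "2026-06-01T00:00:00Z" },
  { id: "db-choice", content: "We use Postgres", validFrom: "2026-06-01T00:00:00Z", validTo: null },
];

const inMarch = visibleAt(history, new Date("2026-03-01T00:00:00Z"));
const inJuly = visibleAt(history, new Date("2026-07-01T00:00:00Z"));
```

A read with no as_of corresponds to calling visibleAt with the current time, which returns only the rows whose validTo is still open.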
Self-correcting writes
Write gate: stores route through an ADD, UPDATE, DELETE, or NOOP decision (on_conflict), so new facts reconcile with existing ones instead of piling up duplicates.
Contradiction detection: a cross-encoder NLI model flags when an incoming memory contradicts something already stored.
Forgetting curve: memories carry a stability signal, so rarely reinforced knowledge slowly sinks in ranking, the way human memory fades.
Agent-OS memory
Core memory block: a small, bounded, always-in-context note per (scope, namespace) that the agent maintains itself (core_memory_get, core_memory_append, core_memory_replace). Appends that would overflow are refused, which forces deliberate compaction.
Tiers: memory_tiers reports a MemGPT-style hot / recall / archival distribution and lists the hot working set.
Reflection: memory_reflect gathers the most reflection-worthy memories and, in store mode, persists synthesized insights linked back to their sources.
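The "refuse instead of truncate" rule for the core memory block is easiest to see in code. The class below is a sketch of that idea only: the class name, the character-based limit, and the boolean return values are assumptions, and the real core_memory_* tools may measure size and report errors differently.

```typescript
class CoreMemoryBlock {
  private text: string = "";
  private readonly maxChars: number;

  constructor(maxChars: number) {
    this.maxChars = maxChars;
  }

  get(): string {
    return this.text;
  }

  // Returns false (and changes nothing) when the note would overflow.
  append(line: string): boolean {
    const next = this.text.length === 0 ? line : this.text + "\n" + line;
    if (next.length > this.maxChars) return false;
    this.text = next;
    return true;
  }

  // Swaps the first occurrence of oldText; this is how the note gets compacted.
  replace(oldText: string, newText: string): boolean {
    const start = this.text.indexOf(oldText);
    if (start < 0) return false;
    const next =
      this.text.slice(0, start) + newText + this.text.slice(start + oldText.length);
    if (next.length > this.maxChars) return false;
    this.text = next;
    return true;
  }
}
```

A refused append leaves the caller with two choices: shorten an existing passage with replace, or decide the new line does not belong in always-in-context memory at all.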
Obsidian vault
Bidirectional sync: vault_sync reads a vault in. memory_export_vault writes memories out as .md files with YAML frontmatter that round-trips losslessly for every authored field (id, scope, namespace, tags, access_level, importance, timestamps). Two derived scores are not in the frontmatter and reset on re-import: confidence_score (to 0.6) and stability (to 1.0). Use memory_export (JSON) for a byte-perfect backup. One metadata key is reserved: metadata._vault holds internal sync bookkeeping and never appears in tool output or exported files.
JSON Canvas: memory_canvas exports the graph as a JSON Canvas 1.0 .canvas file that opens as a spatial board in Obsidian.
Read-only wiki: serve exposes /publish/:namespace (index, page, search, graph) as a read-only wiki. It is deliberately not behind bearer auth, but is hard-scoped to published access levels (MCP_PUBLISH_ACCESS_LEVELS, default public).
Session notes and templates: memory_session_note appends to one "daily note" per session. memory_template returns structured note scaffolds per document type.
Team and solo sharing (git)
memory init wizard: interactive setup (or --yes for defaults) that writes ~/.mcp-memory/config.json (or project-scoped config) plus the Claude Code wiring.
Committable graph artifact: memory export-graph writes a deterministic memory-graph.json you can commit and share. memory git-setup installs a .gitattributes entry and the memory-union merge driver so parallel commits merge instead of conflict.
Attribution: set MCP_AGENT_ID (or pass agent_id per store) and memory_attribution reports how many valid memories each agent wrote.
Trust and governance
Questions to ask: memory_questions surfaces what the graph is well placed to find: ambiguous links to confirm, frequently mentioned but under-documented entities, orphaned and stale memories.
GDPR-grade forget: memory_forget soft-deletes by default (a tombstone via valid_to, recoverable, still visible via as_of). With hard: true it returns a portability export first, then permanently erases. memory_delete is unchanged.
Output sanitization: every tool result passes through one chokepoint that strips ANSI and VT escapes, control characters, and zero-width or BiDi Trojan-Source spoofing before it leaves the server. Stored content stays raw at rest.
Hot reload: config changes apply without a restart.
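To make the output sanitization item above concrete, here is a minimal sketch of a single-chokepoint filter. It is not the server's implementation: the function name, the exact character ranges, and the choice to keep tabs and newlines are all assumptions. Note that stripping U+200D (zero-width joiner) also breaks joined emoji sequences, which a production filter might handle more carefully.

```typescript
// OSC sequences: ESC ] ... terminated by BEL or ESC \
const OSC_SEQUENCE = /\x1b\][\s\S]*?(?:\x07|\x1b\\)/g;
// CSI sequences: ESC [ parameters, intermediates, one final byte (colors, cursor moves)
const CSI_SEQUENCE = /\x1b\[[0-?]*[ -\/]*[@-~]/g;
// Remaining C0/C1 control characters, keeping tab (0x09) and newline (0x0a)
const CONTROL_CHARS = /[\x00-\x08\x0b-\x1f\x7f-\x9f]/g;
// Zero-width characters plus BiDi embedding, override, and isolate controls
const INVISIBLE_OR_BIDI = /[\u200b-\u200f\u202a-\u202e\u2060-\u2064\u2066-\u2069\ufeff]/g;

// Escape sequences go first so their payload bytes are removed as a unit;
// any stray ESC byte left over is then caught by CONTROL_CHARS.
function sanitizeOutput(text: string): string {
  return text
    .replace(OSC_SEQUENCE, "")
    .replace(CSI_SEQUENCE, "")
    .replace(CONTROL_CHARS, "")
    .replace(INVISIBLE_OR_BIDI, "");
}
```

Filtering on the way out rather than on the way in matches the README's point that stored content stays raw at rest: the filter can be tightened later without rewriting the database.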
Web dashboard
The server ships a browser dashboard for viewing and managing memories outside Claude. It runs on the same Express server as the MCP HTTP transport, so there is no separate process.
Six pages:
Dashboard: memory counts, content size, breakdowns by scope, department, and type, plus the 10 most recently updated memories.
Search: hybrid search with confidence and match-type badges, and instant fuzzy suggestions as you type.
Browse: sortable, paginated table of all memories with scope filtering and quality indicators.
Memory detail: full content, metadata, version history, related memories, inline edit and delete.
Knowledge graph: D3 force-directed view. Nodes sized by importance, colored by scope. Zoom, pan, drag, double-click to navigate.
Tools: a console for the full tool surface. It lists every tool the server advertises, renders a form from each schema, and runs it over the authenticated MCP endpoint. Destructive tools ask for confirmation first.
Tech: React 19, Vite, Tailwind CSS v4, shadcn/ui, Fuse.js, D3, Recharts.
Run it:
# Development (hot reload)
npm run build && npm run serve # Terminal 1: server on :3100
npm run dev:web # Terminal 2: Vite on :5173 (proxies /api to :3100)
# Production (single process)
npm run build:all # Builds server + frontend
npm run serve # http://localhost:3100 serves both API and UI
Docker: the image includes the built frontend. After docker compose up, the dashboard is at http://<host>:3200 alongside the MCP endpoint. Team members can browse the shared store from any browser, no Claude Code required.
REST API (16 endpoints)
The REST surface is for reading and managing. Creating memories goes through MCP (memory_store over POST /mcp); there is deliberately no POST /api/memories.
Method | Path | Description |
|
| Memory counts and breakdowns |
|
| Hybrid search with filters |
|
| List with pagination and sorting |
|
| Single memory with metadata |
|
| Version history |
|
| Semantically related memories |
|
| Update content or metadata |
|
| Delete a memory |
|
| Nodes and edges for graph visualization |
|
| Integrity manifest (merkle root plus per-memory hashes) |
|
| Trends and themes summary |
|
| Knowledge-gap report (recurring zero-result searches) |
|
| List webhook targets (gated by |
|
| Register an SSRF-validated outbound target |
|
| Remove a webhook target |
|
| Drain the durable, HMAC-signed delivery queue |
The first nine are what the dashboard uses. All REST endpoints call the same handlers as the MCP tools; no business logic is duplicated.
Self-improvement in detail
The server tracks how knowledge is used, scores quality, learns from sessions, and consolidates itself over time.
The learning loop
┌──────────────────────────────────────────────────────────┐
│ SESSION │
│ Claude searches → access_count++ on matched memories │
│ Claude stores → new memory with initial scores │
│ Zero results → knowledge gap recorded │
└─────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ SESSION END (Stop command hook) │
│ Hook spawns detached `claude -p` headless review │
│ --allowedTools restricts to memory_store only │
│ Claude judges → 0-5 curated entries via memory_store │
│ Deduplicates against existing memories │
└─────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────┐
│ DREAM CYCLE (nightly or manual) │
│ 1. Score : Recalculate importance from access data │
│ 2. Expire : Enforce expiration dates │
│ 3. Prune : Remove low-quality, never-accessed items │
│ 4. Dedup : Merge near-duplicate memories │
│ 5. Gaps : Surface zero-result search patterns │
└──────────────────────────────────────────────────────────┘
Quality scoring
Every memory gets an importance_score between 0 and 1:
importance = 0.3 * current_score + 0.4 * normalized_access_frequency + 0.3 * recency_factor
Recency factor:
Age | Factor |
< 7 days | 1.0 |
< 30 days | 0.7 |
< 90 days | 0.4 |
> 90 days | 0.1 |
Memories that are never accessed gradually lose importance. Auto-extracted memories start lower and get pruned if they never prove useful.
Note on access reinforcement. The formula above is the periodic recompute run by the consolidate Score stage. Each read (memory_get, memory_search, memory_related) also applies a small immediate boost (importance_score += 0.03, capped at 1.0), and search uses importance as a mild rank multiplier (1 + importance * 0.5). A memory read 20 or more times approaches the ceiling from reads alone, and consolidate re-baselines it on the next run. This popularity weighting is intentional. If you want a fixed value that reads don't drift, set an explicit importance_score on memory_store or memory_update.
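The scoring rules in this section fit in a few small functions. The sketch below transcribes the formula, the recency table, the per-read boost, and the rank multiplier as stated above; the function names are illustrative, the access frequency is assumed to be already normalized to the 0 to 1 range (the README does not say how), and treating an age of exactly 90 days like "> 90 days" is an assumption.

```typescript
// Recency table from above. Exactly 90 days falls through to 0.1 (assumption).
function recencyFactor(ageDays: number): number {
  if (ageDays < 7) return 1.0;
  if (ageDays < 30) return 0.7;
  if (ageDays < 90) return 0.4;
  return 0.1;
}

// Periodic recompute, as run by the consolidate Score stage.
function recomputeImportance(
  currentScore: number,
  normalizedAccessFrequency: number, // assumed to be in [0, 1]
  ageDays: number
): number {
  return (
    0.3 * currentScore +
    0.4 * normalizedAccessFrequency +
    0.3 * recencyFactor(ageDays)
  );
}

// Immediate boost applied on each read, capped at 1.0.
function boostOnRead(importance: number): number {
  return Math.min(1.0, importance + 0.03);
}

// Mild multiplier applied to a search result's rank score.
function rankMultiplier(importance: number): number {
  return 1 + importance * 0.5;
}
```

For example, a three-day-old memory with a current score of 0.5 and a normalized access frequency of 0.5 recomputes to 0.3 × 0.5 + 0.4 × 0.5 + 0.3 × 1.0 = 0.65, and a fully important memory gets at most a 1.5× lift in search ranking.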
Knowledge gap detection
When a search returns nothing, the query is logged. The dream cycle's gap stage surfaces these, so you can see what's missing from the store.
Installation reference
Prerequisites
Node.js 20+, for any client.
An MCP client. Claude Code is the first-class experience; the automatic capture and recall hooks are Claude-Code-only. Other MCP clients (Codex, Cursor, and the rest) get all 50 tools but drive them manually. See "Other MCP clients" below.
For the Stop hook only: the claude binary on $PATH (or $CLAUDE_BIN), authenticated without prompting. Optional; disable with review_on_stop: false.
What init does
npx mcp-memory-graph init # user scope: hooks apply to all projects
npx mcp-memory-graph init --scope project # this project only
User scope writes hooks to ~/.claude/settings.json, so they fire in every Claude Code session. Project scope writes hooks to .claude/settings.json in the current directory and creates .mcp.json for automatic server discovery; collaborators who clone the project get the memory server registered automatically.
Init does seven things:
1. Verifies the hook scripts exist in dist/hooks/.
2. Registers the four hooks in settings.json.
3. Creates the config file with sensible defaults: ~/.mcp-memory/config.json (user scope) or <project>/.mcp-memory/config.json (project scope; the generated .mcp.json pins it via MCP_MEMORY_CONFIG_PATH).
4. Writes memory usage instructions to .claude/CLAUDE.md (project scope) or prints a snippet (user scope).
5. Registers the MCP server with Claude Code: user scope runs claude mcp add -s user memory-server -- npx -y mcp-memory-graph for you (idempotent and best-effort; it warns with the manual command if the claude CLI isn't on PATH; skip with --no-register). Project scope is registered via the committable .mcp.json instead. This makes step 2 of the Quick Start optional.
6. Installs the mcp-memory-graph usage skill into ~/.claude/skills/ so Claude Code has inline guidance for all 50 tools, gotchas, and workflows. Skip with --no-skill.
7. Sets up the nightly consolidation schedule (macOS: launchd, loaded immediately so it runs without a relogin; Linux: prints a cron suggestion; skipped for project scope).
Under a non-interactive shell (agent/CI) the wizard is bypassed: defaults are applied and a report is printed showing what was set and how to change each value. Passing --yes applies the defaults silently (no report).
Key flags: --scope user|project, --schedule HH:MM[,HH:MM] (nightly consolidation time, default 03:00), --vault <path> (enable Obsidian vault round-trip), --no-review-on-stop (disable the end-of-session learning review), --no-skill (skip skill install), --no-register (skip the user-scope claude mcp add), --remote <url> (team server mode).
npx mcp-memory-graph uninstall reverses everything init did: removes hooks, the nightly schedule, the CLAUDE.md block, and the installed skill.
Unattended setup (CI, provisioning, agents)
Every step is scriptable. There is no interactive-only path:
git clone https://github.com/YonasValentin/mcp-memory-graph.git
cd mcp-memory-graph
npm install && npm run build
npx mcp-memory-graph init --scope project --yes # local: hooks + .mcp.json, no prompts
# or point at a shared self-hosted server instead:
# npx mcp-memory-graph init --remote https://memory.example.com --token-env MEMORY_MCP_TOKEN
Other MCP clients (Codex, Cursor, and more)
Claude Code gets the hooks; everyone else gets the same 50 tools, driven manually. The server is a standard MCP server, so any client works. A line in the client's rules file makes usage near-automatic.
Register the server. Example for Codex, in ~/.codex/config.toml (global) or .codex/config.toml (project, trusted only):
[mcp_servers.memory-graph]
command = "node"
args = ["/abs/path/to/mcp-memory-graph/dist/index.js"]
tool_timeout_sec = 180 # the first call downloads the ~30 MB model once; the 60s default can be tight
[mcp_servers.memory-graph.env]
MCP_MEMORY_DB_PATH = "/abs/path/to/.mcp-memory/memory.db"
# or a shared self-hosted server over HTTP (see Self-hosting below):
# url = "https://memory.example.com/mcp"
# bearer_token_env_var = "MEMORY_MCP_TOKEN"
Or codex mcp add memory-graph -- node /abs/path/to/mcp-memory-graph/dist/index.js. Cursor, Windsurf, and other clients use their own MCP config format, but the server command (node .../dist/index.js) and the HTTP option are the same.
Then nudge the agent in its instructions file (Codex: AGENTS.md; Cursor: project rules):
Before answering questions about architecture, decisions, patterns, or past fixes, call memory_search on the memory-graph server first; store new decisions, patterns, and fixes with memory_store.
Self-hosting and sharing a memory base
The server runs three ways, from a single-user cache to a knowledge base shared across many machines. All three are local-first: nothing leaves the machines you choose to run it on.
1. Local (single user), the default
npx mcp-memory-graph init registers a local stdio server plus the hooks. Memory lives in one SQLite file on your machine. Nothing else to run. Right choice for solo use.
2. Shared server (multiple machines or a group)
Run one server that many clients connect to over HTTP. Everyone shares the same memory base, live.
Start the server (pick one):
# From source: build the server (and the dashboard, if you want it) first
npm run build:all
MCP_AUTH_TOKEN=$(openssl rand -hex 32) MCP_BIND=0.0.0.0 npm run serve
# MCP at /mcp, REST at /api, dashboard at /, all on :3100
# Or with Docker (frontend included; publishes host port 3200 by default)
MCP_AUTH_TOKEN=$(openssl rand -hex 32) docker compose up -d
Set MCP_AUTH_TOKEN whenever the server is reachable beyond loopback. It is a shared bearer token, one secret for all clients. The server refuses to start unauthenticated on a non-loopback bind unless you set MCP_AUTH_OPTIONAL=1. Terminate TLS at a reverse proxy or tunnel for anything off-host.
Connect a client, one command per machine:
npx mcp-memory-graph init --remote https://memory.example.com --token-env MEMORY_MCP_TOKEN
export MEMORY_MCP_TOKEN=<the server token> # in your shell or .env
For Claude Code this writes a project .mcp.json pointing at the shared server. The token is stored as an env-var reference ("Authorization": "Bearer ${MEMORY_MCP_TOKEN}"), so the committed .mcp.json never contains the secret. Non-Claude clients point at the same server through their own MCP config.
Flag | Effect |
| Reference this env var for the token (default |
| Inline a literal token instead (avoid committing it) |
| Omit the auth header (loopback or trusted network only) |
In remote mode the local capture and recall hooks are not installed. The memory lives on the server, not in a local file the hooks could read. The agent uses memory_search and memory_store directly (the CLAUDE.md guidance is still written).
3. Git vault (async, version-controlled sharing)
Prefer your knowledge base in git, reviewed through pull requests, with no server to run? Export memories to plain Markdown and share the folder as a git repo:
npx mcp-memory-graph vault-init # make the vault a git repo (union merge driver + rebuild hook)
git add -A && git commit -m "memory snapshot" && git push
# collaborators, once after cloning:
# npx mcp-memory-graph vault-init # registers the union merge driver + post-merge hook in THEIR clone
# collaborators, thereafter: git pull && npx mcp-memory-graph rebuild
Each collaborator must run vault-init once in their own clone. The merge driver and post-merge rebuild hook live in local git config (.git/), not in the repo. A fresh clone without vault-init will hit raw conflict markers in .memory/graph.json on its first concurrent pull. Re-running vault-init is idempotent and does not clobber the committed sidecar.
Two recovery notes for team vaults:
After a merge you resolved by hand (the post-merge hook only fires on clean merges), memory rebuild can refuse with VaultIntegrityError because .memory/manifest.json is stale. Delete that file and re-run rebuild; it is derived state and regenerates.
Hand-edited a .md while your database has newer state? Import first (vault_sync or rebuild), then export (memory sync). A full export from a stale database overwrites vault files, including your hand edit.
Security notes
MCP_AUTH_TOKEN is a single shared secret, fine for a trusted group; rotate it by restarting the server with a new value. For per-key RBAC (one server, N keys, each pinned to a namespace set and an access-level ceiling) use memory keys create|list|revoke (schema v16). The legacy shared token still works and is checked first. See docs/MULTI-TENANCY.md.
Never commit a token. The --remote default keeps it in an env var by design.
Bind to 127.0.0.1 (the default) unless you front the server with a proxy that terminates TLS; then set MCP_BIND=0.0.0.0.
Building an org-wide AI brain? One server, a key per employee, an org chart the AI can traverse (people, teams, SOPs, and tools as typed graph nodes), with enforced who-sees-what. The recipe, built on existing primitives, is in docs/ENTERPRISE-BRAIN.md.
Configuration
Environment variables
Variable | Default | Description |
|
| Database file location. The directory is created automatically. |
|
| HuggingFace embedding model name. Must be an ONNX model compatible with Transformers.js. |
|
| Embedding vector dimensions. Must match the model's output. |
|
| Override location for the configuration file. |
The full env reference (auth, rate limits, webhooks, vault, publish) is in docs/ENV.md.
Custom database location
claude mcp add memory-server --env MCP_MEMORY_DB_PATH=/path/to/project/.memory.db node /path/to/dist/index.js
Alternative embedding models
# Swap the embedding model (same 384 dimensions; drop-in for the existing index
# AFTER a re-embed; see the warning below)
claude mcp add memory-server \
--env MCP_MEMORY_MODEL=Xenova/bge-small-en-v1.5 \
--env MCP_MEMORY_DIMENSIONS=384 \
node /path/to/dist/index.js
Model identity is recorded and enforced. The database remembers which embedding model built it (schema_meta.embedding_model). Starting the server with a different MCP_MEMORY_MODEL fails loudly instead of silently degrading every search (same dimension does not mean same vector space). To switch models: set the new model and run memory rebuild (re-embeds from the vault), or export and re-import.
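The guard described above can be sketched in a few lines. This is a minimal illustration, not the server's actual code: only schema_meta.embedding_model and MCP_MEMORY_MODEL come from this README, while the SchemaMeta shape, the function name, and the error text are assumptions made for the example.

```typescript
// Hypothetical shape of the recorded metadata; only embedding_model is named in the README.
interface SchemaMeta {
  embedding_model: string | null; // null stands in for a brand-new database
}

// Returns the model name to record, or throws when the configured model
// differs from the one that built the existing index.
function checkModelIdentity(meta: SchemaMeta, configuredModel: string): string {
  if (meta.embedding_model === null) {
    return configuredModel; // first run: nothing to compare against yet
  }
  if (meta.embedding_model !== configuredModel) {
    // Fail loudly: same dimension does not mean same vector space.
    throw new Error(
      `Index was built with "${meta.embedding_model}" but MCP_MEMORY_MODEL is ` +
        `"${configuredModel}". Re-embed before switching models.`
    );
  }
  return configuredModel;
}
```

Failing at startup rather than at query time means a mismatched model can never produce a silently degraded search result.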
Configuration file
The config file controls self-improvement behavior, hook settings, and per-project overrides. Resolution order: MCP_MEMORY_CONFIG_PATH env, then <cwd>/.mcp-memory/config.json (project-scope init writes this), then ~/.mcp-memory/config.json. Created by npx mcp-memory-graph init, or write it by hand:
{
"defaults": {
"scope": "project",
"namespace": "auto"
},
"projects": [
{
"path": "~/Documents/MyApp",
"namespace": "my-app",
"watch": ["README.md", "docs/**/*.md"]
}
],
"consolidation": {
"similarity_threshold": 0.85,
"prune_after_days": 30,
"min_importance_to_keep": 0.1,
"max_operations": 100,
"schedule": [
{ "hour": 11, "minute": 30 },
{ "hour": 16, "minute": 0 }
]
},
"hooks": {
"extract_on_compact": false,
"extract_on_session_end": false,
"track_searches": true
},
"extraction": {
"categories": ["decision", "pattern", "error_fix", "convention"],
"min_confidence": 0.4
}
}
Section | Key | Default | Description |
|
|
| Default scope for new memories |
|
|
| Default namespace ( |
|
| Project root directory | |
|
| Namespace override for this project | |
|
| Glob patterns for files to track for changes | |
|
|
| Cosine similarity threshold for deduplication (0.5-1.0) |
|
|
| Days before pruning low-quality memories |
|
|
| Minimum importance score to survive pruning |
|
|
| Max operations per consolidation run |
|
|
| One or more |
|
|
| Mine transcript before context compression (regex-based, off by default) |
|
|
| Extract learnings when session ends (regex-based, off by default) |
|
|
| Log search hits and misses to |
|
|
| Spawn headless |
|
|
| Learning categories to extract |
|
|
| Minimum confidence for extracted learnings |
|
| scope-dependent | SQLite file location ( |
|
| unset | Obsidian vault root used by |
|
|
| Mirror memory writes out to the vault as |
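The config resolution order given earlier in this section (MCP_MEMORY_CONFIG_PATH, then the project file, then the user file) can be sketched as a pure function. This is an illustration under stated assumptions: the ResolveInput shape and the function name are hypothetical, and returning the env override without checking that the file exists is a guess about behavior, not something the README states.

```typescript
// The caller supplies env, cwd, home, and an existence check,
// so this sketch never touches the real filesystem.
interface ResolveInput {
  env: Record<string, string | undefined>;
  cwd: string;
  home: string;
  exists: (path: string) => boolean;
}

function resolveConfigPath(input: ResolveInput): string | null {
  // 1. Explicit env override wins (assumed to be returned as-is).
  const override = input.env["MCP_MEMORY_CONFIG_PATH"];
  if (override) return override;

  const candidates = [
    `${input.cwd}/.mcp-memory/config.json`, // 2. project-scope config
    `${input.home}/.mcp-memory/config.json`, // 3. user-scope config
  ];
  for (const candidate of candidates) {
    if (input.exists(candidate)) return candidate;
  }
  return null; // no config file found
}
```

Passing exists in as a parameter keeps the lookup order testable without creating files on disk.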
CLI commands
Command | Description |
| Start the MCP server on stdio (default) |
| Start the HTTP server: MCP transport, REST API, web dashboard |
| Interactive setup wizard: hooks, config, nightly schedule (user scope). Add |
| Setup for the current project only (creates |
| Reverse init: remove hooks and schedule |
| Run the dream cycle manually |
| Write a committable, deterministic |
| Install the |
| Git union merge driver for |
| Make the vault a git repo: union merge driver, |
| Export all valid memories plus the graph sidecar to the vault ( |
| Rebuild the SQLite index from the vault's |
| Upgrade the database to the current schema version |
| WAL-safe online snapshot (retention: |
| Per-key RBAC: mint, inspect, revoke API keys (namespace set plus access ceiling) |
Tools reference
1. memory_store
Store a new memory. The vector embedding is generated automatically.
Parameter | Type | Required | Default | Description |
| string | Yes | The text content to store | |
| string | No | Short title for the memory | |
| enum | No |
| global, project, user, team, department |
| string | No | ¹ | Sub-scope (e.g., project name) |
| number | No | computed | 0-1 manual importance override |
| string | No |
| Attribution for memory_attribution rollups |
| enum | No |
| add, supersede, skip: write-gate behavior on near-duplicates |
| string | No | contract, policy, code, incident, decision, etc. | |
| string | No | Where this content came from | |
| string | No | Who created it | |
| string | No | legal, engineering, hr, sales, finance | |
| string[] | No | Tags for categorization | |
| enum | No |
| public, internal, confidential, restricted |
| string | No |
| ISO 639-1 language code |
| object | No | Domain-specific key-value pairs | |
| string | No | ISO 8601 expiration date |
¹ When omitted, a loaded config file's defaults.scope and defaults.namespace ("auto" = project directory name) apply first; the hardcoded fallback is global with no namespace.
Example prompt:
Store this memory with department=legal and tags=["compliance","gdpr"]:
"All customer data processing agreements must include a GDPR Article 28 addendum effective January 2025."
2. memory_search
Hybrid vector plus keyword search across stored memories.
How it works:
Your query is embedded and compared against all stored vectors (semantic similarity).
Your keywords are matched against memory text via FTS5 (exact matching).
Both result lists merge using Reciprocal Rank Fusion.
Optional temporal decay favors recent memories.
Results get a confidence score and label.
The access is recorded for quality scoring.
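The fusion and decay steps in the list above can be sketched as follows. This is a minimal illustration rather than the server's implementation: the constant 60 is the one in the README's RRF formula, the function names are hypothetical, and the half-life form of exponential decay is an assumption about what half_life_days means. How the server combines the decay weight with the fused score is not stated here.

```typescript
const RRF_K = 60; // constant from the README's formula RRF(d) = Σ 1/(60 + rank)

// Fuse ranked ID lists: each list contributes 1 / (k + rank), with rank starting at 1.
function reciprocalRankFusion(rankings: string[][], k: number = RRF_K): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  return scores;
}

// Assumed decay form: the weight halves every halfLifeDays.
function exponentialDecay(ageDays: number, halfLifeDays: number): number {
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Example rankings: A tops both lists, C and D each appear in only one.
const vectorRanking = ["A", "C", "B"];
const keywordRanking = ["A", "B", "D"];
const fused = reciprocalRankFusion([vectorRanking, keywordRanking]);

// A = 2/61 ≈ 0.0328, B = 1/63 + 1/62 ≈ 0.0320, C = 1/62 ≈ 0.0161, D = 1/63 ≈ 0.0159
const ordered = Array.from(fused.entries())
  .sort((a, b) => b[1] - a[1])
  .map(([id]) => id);
```

A document that appears in both lists (A, B) outranks one that appears in only one (C, D), which is the point of fusing on rank instead of raw scores.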
Parameter | Type | Required | Default | Description |
| string | Yes | Natural language query or keywords | |
| enum | No | Filter by scope | |
| string | No | Filter by namespace | |
| string | No | Filter by department | |
| string | No | Filter by document type | |
| string[] | No | Filter: must contain ALL specified tags | |
| enum | No | Filter by access level | |
| string | No | Filter by language | |
| number | No |
| Max results (1-100) |
| number | No |
| Pagination offset |
| enum | No |
|
|
| object | No |
| |
| string | No | Only memories after this date | |
| string | No | Only memories before this date | |
| number | No | Minimum confidence threshold (0-1) |
Example prompts:
Search memories for "contract renewal notice requirements" in the legal department
Search memories for "authentication" with search_mode=keyword
Search memories for "deployment patterns" with temporal_decay={type:"exponential", half_life_days:60}
Each result includes the memory content and metadata, the combined RRF score, a normalized confidence (0-1), a confidence_level label (high at 0.7 and above, medium at 0.4 and above, low below that), and a match_type (hybrid, vector, or keyword).
The default detail_level: "summary" projection returns confidence_level but omits the numeric confidence and the full content, to save tokens. Pass detail_level: "full" when you need them.
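The confidence_level thresholds quoted above map onto a small function. This is a sketch of the stated cutoffs only; the type and function names are hypothetical.

```typescript
type ConfidenceLevel = "high" | "medium" | "low";

// Thresholds as stated in this README: high at 0.7 and above,
// medium at 0.4 and above, low below that.
function confidenceLevel(confidence: number): ConfidenceLevel {
  if (confidence >= 0.7) return "high";
  if (confidence >= 0.4) return "medium";
  return "low";
}
```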
3. memory_get
Retrieve a specific memory by ID. For ingested documents, optionally include all child chunks.
Parameter | Type | Required | Default | Description |
| string | Yes | Memory UUID | |
| boolean | No |
| Include child chunks for ingested documents |
4. memory_update
Update an existing memory. If content changes, the embedding regenerates automatically. The previous version is saved to history.
Parameter | Type | Required | Default | Description |
| string | Yes | Memory ID to update | |
| string | No | New content (triggers re-embedding) | |
| string | No | New title | |
| object | No | Replacement metadata | |
| string[] | No | Replacement tags | |
| string/null | No | New expiry, or null to remove | |
| string | No | Who made this change |
5. memory_delete
Delete memories by ID or by filter. At least one of id or filter is required.
Parameter | Type | Required | Description |
| string | No | Delete a specific memory |
| enum | No | Delete all in scope |
| string | No | Delete all in namespace |
| string | No | Delete all in department |
| string | No | Delete older than date |
| boolean | No | Only delete expired memories |
6. memory_list
Browse memories with filtering, pagination, and sorting.
Parameter | Type | Default | Description |
| enum | Filter by scope | |
| string | Filter by namespace | |
| string | Filter by department | |
| string | Filter by type | |
| number |
| Max results (1-100) |
| number |
| Pagination offset |
| enum |
|
|
| enum |
|
|
7. memory_ingest
Ingest a full document: it is chunked by content type, each chunk is embedded, and everything is stored with parent-child relationships. Use this for large documents.
Parameter | Type | Default | Description |
| string | Full document text (required) | |
| string | Document title | |
| enum |
| Chunking strategy: |
| number |
| Target chunk size in characters (~4 chars per token) |
| number |
| Overlap between chunks for context |
| string | Origin file or URL | |
| string | Document classification | |
| string | Department | |
| string | Author | |
| string[] | Tags | |
| object | Domain-specific metadata |
Chunking by content type:
Type | Strategy | Splits on |
| Paragraph | Double newlines ( |
| Heading-aware |
|
| Function-aware |
|
| Sentence | Period, exclamation, question marks |
| Paragraph | Double newlines (same as text) |
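To make the chunk size and overlap parameters concrete, here is a minimal sketch of the paragraph strategy: split on double newlines, pack paragraphs up to a target size in characters, and carry a short tail forward as overlap. It is an illustration only; the function name and the tail-copy approach to overlap are assumptions, and the server's own chunkers may differ.

```typescript
// Pack paragraphs into chunks of roughly chunkSize characters.
// overlap is the number of trailing characters repeated at the start of the next chunk.
function chunkParagraphs(text: string, chunkSize: number, overlap: number): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunkSize");

  const paragraphs = text
    .split(/\n{2,}/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    const candidate = current ? `${current}\n\n${paragraph}` : paragraph;
    if (candidate.length <= chunkSize || current === "") {
      // Keep packing; a single oversized paragraph stays whole in this sketch.
      current = candidate;
    } else {
      chunks.push(current);
      // Carry the tail of the finished chunk forward as overlapping context.
      const tail = overlap > 0 ? current.slice(-overlap) : "";
      current = tail ? `${tail}\n\n${paragraph}` : paragraph;
    }
  }
  if (current) chunks.push(current);
  return chunks;
}
```

Because sizes are in characters, a target of N characters corresponds to roughly N/4 tokens under the approximation given in the parameter table.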
8. memory_related
Find memories semantically related to a given one. Uses vector similarity, so it finds connections keyword search misses.
Parameter | Type | Default | Description |
| string | Memory ID to find related for (required) | |
| number |
| Max results (1-50) |
| number | Minimum similarity threshold (0-1) |
9. memory_versions
View a memory's version history. Every update creates a version record.
Parameter | Type | Default | Description |
| string | Memory ID (required) | |
| number |
| Max versions (1-50) |
10. memory_stats
Usage statistics about stored memories.
Parameter | Type | Description |
| enum | Filter stats by scope |
| string | Filter stats by namespace |
| string | Filter stats by department |
Returns totals for memories, documents, and chunks, breakdowns by scope, department, and type, storage size, and the expired count.
11. memory_export
Export current memory content as JSON for portability or migration. This is not a full backup: it serializes only currently live, top-level memories. It omits edit history, the knowledge graph, condense-undo originals, ingested child chunks, and soft-forgotten rows. For disaster recovery, copy the SQLite file (cp ~/.mcp-memory/memory.db ..., see the RUNBOOK); embeddings recompute deterministically on import.
Parameter | Type | Default | Description |
| enum | Filter export | |
| string | Filter export | |
| string | Filter export |
Max 1000 records per export.
12. memory_import
Import memories from JSON. Each item is embedded and stored.
Parameter | Type | Default | Description |
| array | Array of memory objects (required) | |
| boolean |
| Overwrite existing IDs |
13. vault_sync
Scan an Obsidian vault, parse the markdown, embed and store. See Obsidian Vault Integration below.
14. vault_status
Sync status for a vault: files synced, pending, changed, and the last sync time.
15. vault_search
Hybrid search scoped to one vault's memories.
By default this searches the namespace named after the vault's folder name. Memories exported from another namespace keep their original namespace in frontmatter. If a search over a freshly synced vault returns nothing, pass an explicit namespace (and/or scope) override.
16. memory_consolidate
The dream cycle: deduplicate, score, prune, expire, and detect knowledge gaps.
Parameter | Type | Required | Default | Description |
| enum | No | Limit consolidation to a scope | |
| string | No | Limit consolidation to a namespace | |
| number | No |
| Cosine similarity for dedup (0.5-1.0) |
| boolean | No |
| Remove expired memories |
| boolean | No |
| Remove memories below min importance |
| boolean | No |
| Preview changes without applying |
| number | No |
| Cap on total operations per run |
Five stages run in order: Score (recalculate importance), Expire (enforce expires_at), Prune (drop low-quality when enabled), Dedup (merge near-duplicates), Gaps (surface zero-result searches). Returns a report with counts per stage.
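The Dedup stage's similarity threshold can be illustrated with a brute-force pass over embeddings. This is a sketch under assumptions: the EmbeddedMemory shape and function names are hypothetical, 0.85 is the value from the example config file rather than a confirmed default, and the real stage merges duplicates where this sketch only finds the candidate pairs.

```typescript
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("vectors must have the same dimension");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // a zero vector has no direction
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical record shape for illustration.
interface EmbeddedMemory {
  id: string;
  embedding: number[];
}

// Returns ID pairs whose similarity meets the threshold (O(n^2), fine for a sketch).
function findNearDuplicates(
  memories: EmbeddedMemory[],
  threshold: number = 0.85
): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (let i = 0; i < memories.length; i++) {
    for (let j = i + 1; j < memories.length; j++) {
      if (cosineSimilarity(memories[i].embedding, memories[j].embedding) >= threshold) {
        pairs.push([memories[i].id, memories[j].id]);
      }
    }
  }
  return pairs;
}
```

Raising the threshold toward 1.0 merges only near-identical memories; lowering it toward 0.5 merges more aggressively, which is why a dry run is worth doing first.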
Example prompts:
Run a dream cycle consolidation with dry_run=true to preview what would change
Consolidate memories in namespace=my-project with similarity_threshold=0.9
Run consolidation with prune_low_quality=true to clean up unused memories
17. memory_extract_learnings
Mine a session transcript for decisions, patterns, error fixes, and conventions using heuristic pattern matching. No external LLM needed.
Parameter | Type | Required | Default | Description |
| string | Yes | Session transcript text to mine | |
| enum | No | Scope for extracted memories | |
| string | No | Namespace for extracted memories | |
| string | No | Department for extracted memories | |
| string[] | No | Additional tags | |
| string | No | Source attribution | |
| enum[] | No | all | Filter to |
| boolean | No |
| Automatically store extracted learnings |
Extraction looks for decision language ("we decided", "the fix was"), pattern language ("always use", "never do"), error fixes ("the problem was", "solved by"), and conventions ("our convention is", "standard practice"). Each hit is deduplicated against existing memories and stored with a lower initial confidence.
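The heuristic can be pictured as a table of trigger phrases checked sentence by sentence. The sketch below uses only the phrases quoted above and the four category names from the example config; the regexes, the sentence splitter, and the function name are assumptions, and the real extractor also deduplicates and assigns confidence, which this omits.

```typescript
type LearningCategory = "decision" | "pattern" | "error_fix" | "convention";

// Trigger phrases quoted in the README; the regexes themselves are illustrative.
const TRIGGERS: Array<{ category: LearningCategory; pattern: RegExp }> = [
  { category: "decision", pattern: /\b(we decided|the fix was)\b/i },
  { category: "pattern", pattern: /\b(always use|never do)\b/i },
  { category: "error_fix", pattern: /\b(the problem was|solved by)\b/i },
  { category: "convention", pattern: /\b(our convention is|standard practice)\b/i },
];

interface Learning {
  category: LearningCategory;
  text: string;
}

function extractLearnings(transcript: string): Learning[] {
  // Naive sentence split: runs of text ending in ., !, ?, or a newline.
  const sentences = (transcript.match(/[^.!?\n]+[.!?]?/g) ?? [])
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  const learnings: Learning[] = [];
  for (const sentence of sentences) {
    // First matching trigger wins, so each sentence yields at most one learning.
    const hit = TRIGGERS.find((t) => t.pattern.test(sentence));
    if (hit) learnings.push({ category: hit.category, text: sentence });
  }
  return learnings;
}
```

A phrase-table approach like this is cheap and runs offline, at the cost of missing anything phrased outside the table.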
18-42. Graph, Agent-OS, vault round-trip, and governance tools
Parameters for the remaining tools are validated by Zod schemas in src/schemas/; each registration's full description lives in src/server.ts.
# | Tool | Purpose |
18 |
| MemGPT-style hot / recall / archival tier distribution plus the hot working set |
19 |
| Write memories out to an Obsidian vault as |
20 |
| Export the graph as a JSON Canvas 1.0 |
21 |
| Lightweight content-free index (titles, types, tags, scores) to discover what exists |
22 |
| Query the knowledge graph: entities, relationships, linked memories, multi-hop traversal (depth 1-3) |
23 |
| Store LLM-extracted entities and relationships for a memory |
24 |
| Apply agent-generated summaries to condense old memories (original preserved) |
25 |
| Restore a condensed memory to its original content and re-embed |
26 |
| Answer a question with a tight, token-budgeted subgraph instead of flooding context |
27 |
| Read the pinned, always-in-context core-memory block for a |
28 |
| Append to the core-memory block (refused if it would overflow |
29 |
| Replace text in the core-memory block (used to update or compact it) |
30 |
| Generative-Agents-style reflection: gather material, or store a synthesized insight |
31 |
| GraphRAG community detection over the entity graph for corpus-level themes |
32 |
| Fetch a structured note scaffold per document type |
33 |
| Per-session "daily note" (appends to one memory per |
34 |
| Roll up how many valid memories each |
35 |
| "Questions to ask" digest: ambiguous links, under-documented entities, orphans |
36 |
| GDPR-grade forget: soft-delete (recoverable) by default, or |
37 |
| Point-in-time bi-temporal timeline plus edit-version history for one memory |
38 |
| Entity names mentioned in memory text with no graph edge yet (suggested links) |
39 |
| Exact metadata filter query over top-level memories (no semantic ranking) |
40 |
| Line-level diff between two stored versions of a memory |
41 |
| Roll a memory back to a previous version (snapshots the current one first) |
42 |
| Verify the signed provenance envelope of memories (ed25519 over content_hash plus origin): per-memory |
43-50. Active infrastructure and typed shapes
# | Tool | Purpose |
43 |
| Manage the event bus (gated by |
44 |
| Advisor digest: unresolved conflicts, stale memories, most-contradicted facts, evidence-less decisions |
45 |
| Store health roll-up: live/retired/stale counts, aging buckets, unresolved conflicts, webhook delivery health |
46 |
| Change propagation: list stale memories, preview a change's blast radius (dry run), or confirm a memory is current |
47 |
| Resumable "where was I" session state, save and resume (versioned) |
48 |
| Per-user expertise profile: observe a topic, get the profile |
49 |
| Export learnings and reflections as JSONL training pairs (pairs/chatml/alpaca) for fine-tuning |
50 |
| Capture a structured lesson or incident in one call: fills the matching section template (incident → Symptom/Root Cause/Fix/Prevention; lesson → What/Why it matters/How to apply) from your field values and stores it through the normal deduped write path |
Architecture
System overview
Claude Code ──stdio──> MCP Memory Graph
│
┌───────┴───────┐
│ │
Transformers.js SQLite DB
(embeddings) (~/.mcp-memory/memory.db)
│
┌────────────┼────────────┐
│ │ │
memories memories_fts memories_vec
(data + (FTS5 index) (vec0 index)
scores)
│
┌────────┼────────┐
│ │ │
memory_ memory_ ingest_
versions access_ source_
log tracking
Claude Code Hooks (opt-in)
│
├── SessionStart ──> memory_stats (status check)
├── PostToolUse ───> search-log.jsonl (hit/miss tracking)
├── PreCompact ────> learning extraction (disabled by default)
└── Stop ──────────> spawn detached `claude -p` headless review
│
└─> --allowedTools mcp__memory-server__memory_store
Claude reviews transcript → memory_store calls
Nightly Schedule (opt-in)
└── 3:00 AM ───────> memory_consolidate (dream cycle)
How hybrid search works
Query: "contract renewal notice"
│
┌────┴────┐
│ │
Embed Tokenize
│ │
▼ ▼
sqlite-vec FTS5
(semantic) (keyword)
│ │
│ rank │ rank
│ 1: A │ 1: A
│ 2: C │ 2: B
│ 3: B │ 3: D
│ │
└────┬────┘
│
Reciprocal Rank Fusion
RRF(d) = Σ 1/(60 + rank)
│
▼
[A: 0.033, B: 0.032, C: 0.016, D: 0.016]
│
Temporal Decay (optional)
│
Confidence Scoring
│
Access Tracking (record hit)
│
▼
Final ranked results
Database schema
The SQLite database is at schema version 18, with automatic forward migration from any earlier version. The core tables:
memories: all memory data, TEXT primary key (UUIDs), parent-child support for document chunks, plus access_count, last_accessed_at, importance_score, and confidence_score.
memories_fts: FTS5 virtual table for keyword search with BM25 ranking, synced with the memories table.
memories_vec: vec0 virtual table for vector search. 384-dimension float32 embeddings with scope and namespace metadata for pre-filtering.
memory_versions: version history for every change.
memory_access_log: every search, get, and related-memory access, with timestamps and query context.
ingest_source_tracking: ingested files, for change detection on re-ingestion.
Later schema versions add the knowledge-graph tables (entities, links, conflicts, communities), webhooks, session state, and the RBAC api_keys table. Every mutation keeps the three core tables in sync atomically inside a SQLite transaction; the repository.ts layer enforces this, and nothing else touches the tables directly.
Project layout
src/
├── index.ts # Entry point (stdio transport)
├── server.ts # All 50 tool registrations
├── config/ # Config file loading + validation
├── db/ # Connection, schema, migrations, repository (three-table sync)
├── embeddings/ # Embedding providers (Transformers.js, registry, Ollama)
├── search/ # Hybrid search, reranker, temporal decay, scoring
├── chunking/ # Per-content-type chunking strategies
├── graph/ # Entities, links, PageRank, communities
├── vault/ # Obsidian round-trip, write-through, bookkeeping
├── tools/ # One handler per MCP tool
├── api/ # REST routes + security middleware
├── events/ # Webhook bus (SSRF guard, HMAC, retry)
├── cli/ # init, serve, vault, backup, keys, migrate, ...
├── hooks/ # Claude Code lifecycle hooks
└── schemas/ # Zod schemas for every tool input
Use cases by department
Engineering:
Store memory: "We chose event sourcing over CRUD for the order service because
we need full audit trail and the ability to replay events for debugging.
ADR-042, decided 2026-03-15."
department=engineering, document_type=decision, tags=["architecture","event-sourcing"]
Legal:
Ingest this contract template with content_type=legal, department=legal,
document_type=contract, tags=["template","nda","standard"]
Finance:
Store memory: "Q4 2025 revenue recognition policy change: SaaS contracts
over 12 months now recognized ratably per ASC 606 guidance."
department=finance, document_type=policy, tags=["revenue-recognition","asc-606"]
HR:
Ingest the employee handbook with department=hr, content_type=text,
document_type=policy, tags=["handbook","onboarding"]
Sales:
Store memory: "When prospect objects on price vs CompetitorX, lead with
our 99.9% uptime SLA and dedicated support. This converted 3 deals in Q1."
department=sales, document_type=pattern, tags=["objection-handling","pricing","competitorx"]
Obsidian Vault Integration
Point the server at a vault folder and every markdown file becomes a searchable memory, with frontmatter, tags, and wiki-links extracted. No Obsidian app needed; it reads the files straight from disk.
Tool | Description |
| Scan vault, parse files, embed and store. Incremental (mtime-based). |
| Sync status: files synced, pending, changed, last sync time. |
| Hybrid search scoped to a vault's memories. |
What gets extracted:
Obsidian feature | Memory field |
YAML frontmatter |
|
YAML frontmatter |
|
YAML frontmatter |
|
YAML frontmatter (all fields) |
|
Inline |
|
|
|
File path relative to vault |
|
Vault directory name |
|
Usage examples:
Sync my Obsidian vault at ~/Documents/my-vault
Check vault sync status for ~/Documents/my-vault
Search my vault for "meeting action items about hiring"
Sync vault but only the notes/ and projects/ folders:
vault_sync with include_patterns=["notes/**", "projects/**"]
Force re-sync everything (ignore modification times):
vault_sync with force=true
vault_sync parameters:
Parameter | Type | Default | Description |
| string | Absolute path to vault directory (required) | |
| number |
| Target chunk size for large files |
| number |
| Overlap between chunks |
| boolean |
| Re-sync all files regardless of mtime |
| string[] | Only sync matching globs (e.g., | |
| string[] | Skip matching globs (e.g., |
How sync works: it scans the vault recursively for .md files (skipping .obsidian/, .trash/, .git/), compares modification times against the last sync, extracts frontmatter, wiki-links, and tags from new or changed files, embeds, and stores. Deleted files have their memories removed. Files larger than the chunk size are split with markdown-aware chunking. A second sync of an unchanged vault takes under a millisecond.
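The incremental decision described above (skip unchanged files, re-sync new or modified ones, drop memories for deleted files) can be sketched as a pure planning step. The type names, the function name, and the use of millisecond mtimes are assumptions for illustration; the skipped directories and the force flag come from this README.

```typescript
interface VaultFile {
  path: string; // path relative to the vault root
  mtimeMs: number; // last modification time
}

interface SyncPlan {
  toSync: string[]; // new or changed files to parse, embed, and store
  toRemove: string[]; // previously synced files that no longer exist
  unchanged: string[]; // files skipped because their mtime has not advanced
}

const SKIPPED_DIRS = [".obsidian/", ".trash/", ".git/"];

function planVaultSync(
  files: VaultFile[],
  lastSynced: Map<string, number>,
  force: boolean = false
): SyncPlan {
  const plan: SyncPlan = { toSync: [], toRemove: [], unchanged: [] };
  const seen = new Set<string>();

  for (const file of files) {
    if (!file.path.endsWith(".md")) continue;
    const skipped = SKIPPED_DIRS.some(
      (dir) => file.path.startsWith(dir) || file.path.includes(`/${dir}`)
    );
    if (skipped) continue;

    seen.add(file.path);
    const previous = lastSynced.get(file.path);
    if (force || previous === undefined || file.mtimeMs > previous) {
      plan.toSync.push(file.path);
    } else {
      plan.unchanged.push(file.path);
    }
  }

  // Anything synced before but missing from this scan was deleted from the vault.
  lastSynced.forEach((_mtime, path) => {
    if (!seen.has(path)) plan.toRemove.push(path);
  });
  return plan;
}
```

On an unchanged vault the plan contains only unchanged entries, so no parsing or embedding work is needed on a repeat sync.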
Security and privacy
No network calls after the one-time model download (cached locally).
No telemetry, no analytics, no tracking.
Hooks are opt-in. They are only installed when you run npx mcp-memory-graph init.
The nightly schedule is opt-in too, and removed by npx mcp-memory-graph uninstall.
Everything is one SQLite file: easy to back up, move, or delete.
access_level metadata (public, internal, confidential, restricted) for organizational awareness.
Data never leaves your machine.
Backup:
# WAL-safe online snapshot with retention
npx mcp-memory-graph backup
# Or a simple file copy
cp ~/.mcp-memory/memory.db ~/.mcp-memory/memory.db.backup
Reset:
# Delete the database to start fresh
rm ~/.mcp-memory/memory.db
Nightly consolidation
When installed via npx mcp-memory-graph init, a nightly job runs all five dream-cycle stages plus access-log rotation (entries older than 90 days are dropped).
On macOS, a launchd plist is created at ~/Library/LaunchAgents/com.mcp-memory.consolidate.plist, scheduled for 3:00 AM. On Linux, init prints a cron suggestion:
# Add to crontab -e
0 3 * * * /usr/local/bin/npx mcp-memory-graph consolidate
Run it manually any time:
npx mcp-memory-graph consolidate
Limitations
Scale ceiling. Vector search is an exact scan: 9.1 ms p95 at 10K vectors, about 30 ms at 50K, and it degrades linearly from there. Comfortable into the low hundreds of thousands; past that you want a dedicated ANN index, which this server does not have yet.
English-optimized. The default MiniLM model is English-only in practice; cross-language matching is weak. A multilingual model can be configured via MCP_MEMORY_MODEL (with a rebuild), but the shipped benchmarks only validate the default.
First-call cold start. Three to five seconds on first use while the embedding model loads. Cached after that.
Heuristic extraction. memory_extract_learnings uses pattern matching, not an LLM. It catches common phrasings and misses subtle ones. (The Stop hook's claude -p review is the LLM-quality path.)
One process. RBAC keys and revocation live in the server process. For horizontal scale you shard tenants across processes or give each tenant their own database file.
Roadmap
What's actually next, in rough order:
Multilingual embeddings, opt-in. Ship a multilingual ONNX model option (the embedder registry and the model-identity guard already exist, so a swap is safe and loud).
Office document ingestion. PDF, DOCX, XLSX, and friends as an ingest mode, with local extraction only.
Vault file watcher. Auto-rebuild on .md changes instead of manual rebuild.
as_of content reconstruction. Point-in-time queries currently reconstruct validity (which facts were live); reconstructing the content of edited memories at that instant is the remaining half.
ANN index for corpora past a few hundred thousand vectors.
Windows test suite port. The server runs on Windows, but the test suite carries POSIX path assumptions; the Windows CI leg is non-blocking until that's done.
Tech stack
Component | Package | Purpose |
MCP SDK |
| Model Context Protocol server framework |
Embeddings |
| Local ONNX model inference in Node.js |
Database |
| Synchronous SQLite with native bindings |
Vector search |
| vec0 virtual table for KNN search |
Validation |
| Schema validation for tool inputs |
TypeScript |
| Strict mode, ES2022 target |
Frontend | React 19, Vite, Tailwind CSS v4 | Web dashboard SPA |
UI components | shadcn/ui | Accessible component primitives |
Fuzzy search |
| Client-side autocomplete suggestions |
Graph viz |
| Knowledge graph layout |
License
Source-available, not open source. Licensed under the PolyForm Noncommercial License 1.0.0: free for any noncommercial purpose (personal projects, hobby, study, research, charitable, educational, public-research, and government use). Commercial use requires a paid license; see COMMERCIAL.md.
If you're unsure whether your use counts as commercial, check the safe harbors in the license or just ask: yonasmougaard@gmail.com.
Keywords
MCP memory server · Model Context Protocol · Claude Code memory · persistent AI memory · LLM long-term memory · AI agent memory · local-first memory · $0/token memory · hybrid vector + keyword search · semantic search · knowledge graph · bi-temporal memory · HippoRAG / Personalized PageRank · cross-encoder reranking · RAG memory · SQLite vector database · sqlite-vec · FTS5 / BM25 · local embeddings (all-MiniLM-L6-v2, Transformers.js) · Obsidian vault sync · JSON Canvas · GDPR forget · signed provenance · self-hosted memory.
Also searched as: a self-hosted, privacy-first alternative to mem0, Zep, Letta, Cognee, and Supermemory · long-term memory for Claude / Cursor / Codex · an Obsidian-backed knowledge base for AI agents · a local knowledge-graph memory that never leaves your machine.