Mnemo MCP Server
mcp-name: io.github.n24q02m/mnemo-mcp
Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.
Roadmap
All three phases below have shipped. The temporal knowledge graph (Phase 3) is the current major line (v2.x).
Phase | Version | Status | Highlights |
Phase 1 | v1.x | Shipped | Typed |
Phase 2 | v1.x | Shipped | LLM-driven compression of older memories + Passport sync (encrypted import/export bundle for cross-machine bootstrap) -- AES-256-GCM + Argon2id, S3 / R2 / B2 / MinIO + GDrive backends, delta-sync with LWW per row |
Phase 3 | v2.x | Shipped (BREAKING) | Temporal knowledge graph -- bitemporal |
Features
Hybrid retrieval -- FTS5 + sqlite-vec, fused via Reciprocal Rank Fusion (k=60), then re-ranked by a configurable rerank chain (RERANK_MODELS, order = litellm fallback; empty -> local qwen3-reranker) with temporal decay and importance boost
Typed capture -- memory(action="capture") with 6 context_types (conversation/fact/preference/skill/task/decision), embedding-based dedup, and a configurable LLM chain (LLM_MODELS, order = litellm fallback)
Knowledge graph -- Automatic entity extraction and relation tracking; top results boosted by graph proximity
Importance scoring + archive policy -- LLM-scored 0.0-1.0 importance; soft-archive when recency_factor * (1 - importance) > 1.0; restore action available
Auto-archive trigger -- Background sweep every Nth capture (default 100) -- no cron required
STM-to-LTM consolidation -- LLM summarization of related memories in a category
Duplicate detection -- Warns before adding semantically similar memories
Zero config -- Built-in local Qwen3 ONNX embedding + reranking, no API keys needed. Optional cloud providers (Jina AI, Gemini, OpenAI, Cohere)
Multi-machine sync -- JSONL-based merge sync via Google Drive (bundled Desktop OAuth public client)
Plugin trinity -- Ships /recall-context + /memory-commit skills and SessionStart + opt-in PostToolUse hooks (see docs/ARCHITECTURE.md)
Proactive memory -- Tool descriptions and skills guide AI to save preferences, decisions, facts at the right moment
LLM compression -- Per-turn compression via the multi-provider dispatcher targets ~3x token reduction at >=0.90 fact retention; graceful skip when no provider configured (see docs/compression.md)
Encrypted passport sync -- AES-256-GCM bundles + Argon2id KDF, S3 (R2 / B2 / MinIO) and Google Drive backends, delta-sync with last-write-wins per row (see docs/passport.md). Bootstrap via the passport-bootstrap skill.
Temporal knowledge graph -- Bitemporal columns (valid_from/valid_to/superseded_by) on every memory + entity-resolution dedup (embedding KNN at default 0.85 cosine threshold) + audit trail (memory_audit table with prev/new state hashes) + new actions (entity_search/entity_graph/history) + opt-in KG_AUTO_ENABLED auto-extract on capture. BREAKING for clients that called memory.get expecting historical-inclusive results: pass as_of for time-travel; default now filters to current-state (valid_to IS NULL).
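To make the hybrid retrieval step concrete, here is a minimal sketch of Reciprocal Rank Fusion with k=60 over two ranked lists. It is an illustration of the general technique, not mnemo's actual code: the function name rrf_fuse and the memory ids are made up, and the real pipeline goes on to rerank, decay, and boost the fused list.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse several ranked id lists with Reciprocal Rank Fusion.

    Each list contributes 1 / (k + rank) for every id it contains,
    where rank starts at 1 for the best hit.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the order is stable.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result orders from the keyword and vector searches.
fts_hits = ["m3", "m1", "m7"]
vec_hits = ["m1", "m9", "m3"]

fused = rrf_fuse([fts_hits, vec_hits])
print(fused)
```

Ids that appear in both lists (m1, m3) rise above ids found by only one search, which is the property the fusion step relies on.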
Quick install
# Method 1 (default): plugin install via Claude Code
/plugin marketplace add n24q02m/claude-plugins
/plugin install mnemo-mcp@n24q02m-plugins
# Method 2 (CLI): direct uvx invocation (zero config -- runs on the built-in local model)
claude mcp add mnemo -- uvx mnemo-mcp
# Method 3 (HTTP / multi-device / multi-user)
docker run -d --name mnemo-mcp-http -p 8085:8080 \
-v mnemo-data:/data -e MCP_TRANSPORT=http \
-e PUBLIC_URL=https://mnemo.example.com \
n24q02m/mnemo-mcp:latest
No API keys are required: with no provider keys set, mnemo runs fully offline on the bundled local Qwen3 ONNX embedding + reranker. Add cloud provider keys only to switch embedding / rerank / LLM onto a hosted model (see Configuration).
Full setup matrices live at the canonical docs site mcp.n24q02m.com/servers/mnemo-mcp/setup/ and the paste-to-agent snippet at claude-plugins/plugins/mnemo-mcp/setup-with-agent.md.
Configuration
All settings are plain environment variables (no prefix). Everything is optional -- mnemo runs zero-config on the local model. The most common knobs:
Model selection (per-task chains)
Embedding, reranking, and LLM features each take an ordered, comma-separated chain of
provider/model entries (tried in order, litellm fallback). Leave a chain empty to use
the bundled local model (embedding / rerank) or to disable the feature (LLM).
Env var | Default | Purpose |
EMBEDDING_MODELS | (empty -> local Qwen3 ONNX) | Embedding chain, e.g. |
RERANK_MODELS | (empty -> local Qwen3 cross-encoder) | Rerank chain, e.g. |
LLM_MODELS | (built-in cloud chain) | LLM chain for graph extraction / importance / compression; empty disables those features |
| | Embedding dimensions ( |
Provider is inferred from the model prefix; supply each provider's key via the litellm
<PROVIDER>_API_KEY convention:
model prefix | key env var | get it at |
jina_ai/ | JINA_AI_API_KEY | jina.ai/api-dashboard |
gemini/ | GEMINI_API_KEY | aistudio.google.com/apikey |
openai/ | OPENAI_API_KEY | platform.openai.com/api-keys |
cohere/ | COHERE_API_KEY | dashboard.cohere.com/api-keys |
Any other litellm provider works via env passthrough; see
https://docs.litellm.ai/docs/providers/<provider> for its <PROVIDER>_API_KEY name.
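As a rough illustration of how a chain value and the key convention fit together, the sketch below splits a comma-separated chain into ordered (provider, model) pairs and derives the <PROVIDER>_API_KEY name from the prefix. The helper names are hypothetical, and filtering entries by whether a key is present is an assumption made for the example; it is not a description of how mnemo or litellm resolves providers.

```python
import os


def parse_chain(value):
    """Split 'provider/model,provider/model' into ordered (provider, model) pairs."""
    entries = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty chains and trailing commas
        provider, _, model = item.partition("/")
        entries.append((provider, model))
    return entries


def key_env_var(provider):
    """Derive the key variable name from the model prefix (<PROVIDER>_API_KEY)."""
    return f"{provider.upper()}_API_KEY"


def usable_entries(value, env=os.environ):
    """Keep only entries whose provider key is set (assumed behaviour, for illustration)."""
    return [(p, m) for p, m in parse_chain(value) if env.get(key_env_var(p))]


chain = "jina_ai/jina-embeddings-v5-text-small,gemini/gemini-embedding-001"
example_env = {"GEMINI_API_KEY": "AIza_xxx"}  # placeholder value, not a real key

print(parse_chain(chain))
print(usable_entries(chain, example_env))
```

An empty chain parses to an empty list, which corresponds to the documented fallback to the bundled local model for embedding and rerank.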
Custom OpenAI-compatible endpoints (SSRF-guarded): LLM_API_BASE, EMBEDDING_API_BASE,
RERANK_API_BASE.
Changing the embedding model changes the vector space. A safe-by-default guard blocks boot on mismatch; set REINDEX_ON_MODEL_CHANGE=true to re-embed.
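The guard can be pictured roughly as follows. This is a sketch under assumptions: the function name, the exception class, and the idea that the database records the name of the model that produced its vectors are all hypothetical; only the REINDEX_ON_MODEL_CHANGE variable comes from the text above.

```python
import os


class EmbeddingModelMismatch(RuntimeError):
    """Raised when stored vectors come from a different embedding model."""


def needs_reindex(stored_model, configured_model, env=os.environ):
    """Return True if a re-embed should run; raise if one is needed but not allowed."""
    if stored_model is None or stored_model == configured_model:
        return False  # fresh database or unchanged model: nothing to do
    if env.get("REINDEX_ON_MODEL_CHANGE", "").lower() == "true":
        return True  # operator opted in to re-embedding
    raise EmbeddingModelMismatch(
        f"stored vectors use {stored_model!r} but {configured_model!r} is configured; "
        "set REINDEX_ON_MODEL_CHANGE=true to re-embed"
    )


print(needs_reindex("local-qwen3", "local-qwen3", env={}))
print(needs_reindex("local-qwen3", "gemini/gemini-embedding-001",
                    env={"REINDEX_ON_MODEL_CHANGE": "true"}))
```

Failing at boot rather than silently mixing vectors from two models keeps similarity scores comparable across the whole store.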
Storage, sync, retrieval, and archive
Env var | Default | Purpose |
| | SQLite database path (also accepts |
| | Enable Google Drive multi-machine sync |
| (none) | OAuth client ID required for sync |
| | Google Drive folder name |
| | Auto-sync interval in seconds ( |
| | Enable reranking of fused results |
| | Number of reranked results to keep |
| | Enable importance x recency soft-archive sweeps |
| | Age before a memory is eligible for archive |
| | Similarity above which a new memory is a duplicate |
| | Half-life for temporal decay scoring |
| | Auto-extract entities + relations on capture |
| | Log verbosity |
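The half-life and soft-archive settings above can be read against the archive rule stated under Features, recency_factor * (1 - importance) > 1.0. The sketch below shows one plausible reading. How recency_factor is computed is not stated in this README, so defining it as age divided by the minimum archive age is an assumption made only for the example, as are the function and parameter names.

```python
def decayed_score(score, age_days, half_life_days):
    """Exponential temporal decay: the score halves every half_life_days."""
    return score * 0.5 ** (age_days / half_life_days)


def should_archive(age_days, min_age_days, importance):
    """Apply the documented rule recency_factor * (1 - importance) > 1.0.

    ASSUMPTION: recency_factor = age_days / min_age_days. The real
    definition may differ.
    """
    recency_factor = age_days / min_age_days
    return recency_factor * (1.0 - importance) > 1.0


print(decayed_score(1.0, age_days=30, half_life_days=30))
print(should_archive(age_days=180, min_age_days=90, importance=0.2))
print(should_archive(age_days=180, min_age_days=90, importance=0.9))
```

Under this reading, an old memory with low importance crosses the threshold quickly, while a high-importance memory of the same age stays active.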
Manual config example
{
"mcpServers": {
"mnemo": {
"command": "uvx",
"args": ["mnemo-mcp"],
"env": {
"EMBEDDING_MODELS": "jina_ai/jina-embeddings-v5-text-small,gemini/gemini-embedding-001",
"RERANK_MODELS": "jina_ai/jina-reranker-v3",
"LLM_MODELS": "gemini/gemini-3-flash-preview",
"JINA_AI_API_KEY": "jina_xxx",
"GEMINI_API_KEY": "AIza_xxx"
}
}
}
}
Comparison vs. peers
Feature | mnemo-mcp | Mem0 | Letta | OpenMemory |
Hybrid retrieval (FTS + vec) | yes (FTS5 + sqlite-vec + RRF) | yes | partial | yes |
Cross-encoder rerank chain | yes (qwen3 local + Jina + Cohere) | partial (Cohere only) | no | no |
Temporal decay scoring | yes (exp half-life) | no | no | no |
Importance boost in rank | yes (LLM 0.0-1.0) | no | no | no |
Soft-archive + restore policy | yes (importance x recency) | no | no | no |
Self-hostable (single SQLite file) | yes (zero ext deps) | partial (cloud-first) | yes (Postgres) | yes (Postgres + Qdrant) |
Multi-provider LLM dispatch | yes ( | partial | yes | partial |
Plugin trinity (skills + hooks) | yes (recall-context + memory-commit) | n/a | n/a | n/a |
Multi-machine sync | yes (GDrive bundled OAuth) | yes (cloud) | n/a | n/a |
E2E-encrypted passport sync | yes (AES-256-GCM + Argon2id, S3 + GDrive) | no | no | no |
LLM compression on capture | yes (multi-provider, ~3x at >=0.90 retention) | no | no | no |
Backend-pluggable sync architecture | yes (S3 / R2 / B2 / MinIO + GDrive) | no | no | no |
Bitemporal | yes ( | no | partial (events only) | no |
Entity resolution via embedding KNN | yes (cosine threshold tunable) | no | no | no |
Audit trail with state hashes | yes ( | no | no | no |
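The bitemporal row in the table refers to the valid_from / valid_to / superseded_by columns described under Features. The sketch below shows how a current-state filter (valid_to IS NULL) and an as_of time-travel query behave on such columns. The table name, the remaining schema, and the sample rows are hypothetical; only the three column names and the current-state filter come from this README.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE memories (
           id INTEGER PRIMARY KEY,
           content TEXT NOT NULL,
           valid_from TEXT NOT NULL,   -- ISO date the fact became true
           valid_to TEXT,              -- NULL while the fact is current
           superseded_by INTEGER       -- id of the replacing row, if any
       )"""
)
conn.executemany(
    "INSERT INTO memories VALUES (?, ?, ?, ?, ?)",
    [
        (1, "prefers light mode", "2025-01-01", "2025-06-01", 2),
        (2, "prefers dark mode", "2025-06-01", None, None),
    ],
)


def current_state(conn):
    """Default view: only rows that are still valid."""
    rows = conn.execute("SELECT content FROM memories WHERE valid_to IS NULL ORDER BY id")
    return [r[0] for r in rows]


def as_of(conn, ts):
    """Time travel: rows whose validity interval [valid_from, valid_to) contains ts."""
    rows = conn.execute(
        "SELECT content FROM memories "
        "WHERE valid_from <= ? AND (valid_to IS NULL OR valid_to > ?) ORDER BY id",
        (ts, ts),
    )
    return [r[0] for r in rows]


print(current_state(conn))
print(as_of(conn, "2025-03-15"))
```

ISO-8601 date strings compare correctly as text, which is why plain string comparison is enough in this example.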
Status
2026-05-02 -- Architecture stabilization update
Past months saw significant churn around credential handling and the daemon-bridge auto-spawn pattern. This caused multi-process races, browser tab spam, and inconsistent setup UX across plugins. The architecture is now stable: 2 clean modes (stdio + HTTP), no daemon-bridge layer, no auto-spawn from stdio.
Apologies for the instability period. If you encountered issues with prior versions, please update to the latest release and follow the current setup docs -- most prior workarounds are no longer needed.
Related plugins from the same author:
wet-mcp -- Web search + content extraction
imagine-mcp -- Image/video understanding + generation
better-notion-mcp -- Notion API
better-email-mcp -- Email management
better-telegram-mcp -- Telegram
better-godot-mcp -- Godot Engine
better-code-review-graph -- Code review knowledge graph
All plugins share the same architecture -- learn the pattern once and it transfers to the others.
Documentation
Full docs at mcp.n24q02m.com/servers/mnemo-mcp/setup/:
Setup -- install methods for Claude Code, Codex, Gemini CLI, Cursor, Windsurf, mcp.json
Modes overview -- stdio / local-relay / remote-relay / remote-oauth
Multi-user setup -- per-JWT-sub credential model
In-repo references:
docs/ARCHITECTURE.md -- storage layout, embedding / rerank dispatch, knowledge graph, plugin trinity
docs/compression.md -- LLM compression pipeline
docs/passport.md -- encrypted passport sync (S3 / GDrive backends)
docs/BENCHMARKS.md -- retrieval quality + latency metrics
Install with AI agent -- paste this to your AI coding agent:
Install MCP server mnemo-mcp following the steps at https://raw.githubusercontent.com/n24q02m/claude-plugins/main/plugins/mnemo-mcp/setup-with-agent.md
Tools
15 MCP tools, 17 memory actions. The memory surface is exposed both as 11 specialized single-purpose tools and a legacy memory dispatcher (same actions), plus config, help, and config__open_relay:
Tool | Actions | Description |
(11 specialized tools) | (one action each) | Specialized single-purpose memory tools -- the recommended surface |
memory | | Core CRUD + typed capture (6 context_types) + hybrid search (RRF + rerank + temporal decay) + import/export + soft-archive + restore + on-demand archive sweep + LLM consolidation + LLM compression + temporal KG (entity search / graph / history) |
config | | Server status, trigger sync, update settings, pre-download embedding model, authenticate sync provider, manage HTTP setup form lifecycle, passport export/import |
help | | Full documentation for any tool |
config__open_relay | (HTTP relay mode) | Open the zero-config relay setup form (registered via mcp-core) |
Plugin trinity (Claude Code marketplace install):
Component | Trigger | Purpose |
| session start, before significant decisions, "what do I know about X?" | Pulls cwd / topic-relevant memories with |
| "remember this" / "save this" / "ghi nho" / "luu lai" | Typed manual capture with |
| periodic / "audit memory" | Find duplicates, contradictions, stale entries; consolidate |
| end of session | Capture decisions / preferences / corrections / conventions / open questions |
SessionStart hook | every session init | Non-blocking nudge to invoke |
PostToolUse hook (opt-in) | | Hint |
MCP Resources
URI | Description |
| Database statistics and server status |
MCP Prompts
Prompt | Parameters | Description |
| | Generate prompt to save a conversation summary as memory |
| | Generate prompt to recall relevant memories about a topic |
Security
Graceful fallbacks -- Cloud → Local embedding, no cross-mode fallback
Sync token security -- OAuth tokens stored at ~/.mnemo-mcp/tokens/ with 600 permissions
Input validation -- Sync provider, folder, remote validated against allowlists
Error sanitization -- No credentials in error messages
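For readers unfamiliar with the 600-permission convention mentioned above, the sketch below writes a token file that only the owning OS user can read or write. It assumes a POSIX system, the function name and the example file name are hypothetical, and it writes into a temporary directory rather than ~/.mnemo-mcp/tokens/ so it can be run safely. It is not taken from mnemo's source.

```python
import os
import stat
import tempfile
from pathlib import Path


def write_token(path, data):
    """Write data to path so that only the owner can read or write it (mode 600)."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the file with 0o600 from the start so it is never world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        # Re-apply the mode in case the file already existed with looser bits.
        os.fchmod(handle.fileno(), 0o600)
        handle.write(data)
    return path


with tempfile.TemporaryDirectory() as tmp:
    token_path = write_token(Path(tmp) / "tokens" / "example.json", '{"access_token": "xxx"}')
    mode = stat.S_IMODE(token_path.stat().st_mode)
    content = token_path.read_text()

print(oct(mode))
```

Passing the mode to os.open, instead of calling chmod after a normal open, avoids a short window in which the file exists with default permissions.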
Build from Source
git clone https://github.com/n24q02m/mnemo-mcp.git
cd mnemo-mcp
uv sync
uv run mnemo-mcp
Trust Model
This plugin implements TC-Local (machine-bound, single trust principal). The mode/storage/encryption breakdown below is the full classification.
Mode | Credentials | Memory data | Who can read your data? |
stdio (default) | Read from environment variables (no credential file written) | Local SQLite at | Only your OS user |
HTTP self-host (single user) | Encrypted | Local SQLite (same host) | Only you (admin = user) |
HTTP multi-user remote ( | Per-JWT- | Per- | Only the authenticated user (per- |
Passport sync bundles are always end-to-end encrypted (AES-256-GCM + Argon2id); backends never see plaintext.
License
MIT -- See LICENSE.