# Mnemo MCP Server

mcp-name: io.github.n24q02m/mnemo-mcp

Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.
## Features
- **Hybrid search**: FTS5 full-text + sqlite-vec semantic + Qwen3-Embedding-0.6B (built-in)
- **Zero config mode**: works out of the box — local embedding, no API keys needed
- **Auto-detect embedding**: set `API_KEYS` for cloud embedding, auto-fallback to local
- **Embedded sync**: rclone auto-downloaded and managed as a subprocess
- **Multi-machine**: JSONL-based merge sync via rclone (Google Drive, S3, etc.)
- **Proactive memory**: tool descriptions guide the AI to save preferences, decisions, facts
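Hybrid search merges a keyword (FTS5) ranking with a semantic (sqlite-vec) ranking. One common way to fuse two ranked result lists is reciprocal rank fusion; the sketch below illustrates the idea and is not necessarily mnemo-mcp's exact scoring:

```python
# Reciprocal rank fusion: merge keyword (FTS) and semantic (vector)
# result lists into one ranking. Illustrative only.

def rrf_merge(keyword_ids, semantic_ids, k=60):
    """Merge two ranked lists of memory ids; higher score ranks first."""
    scores = {}
    for ranked in (keyword_ids, semantic_ids):
        for rank, mem_id in enumerate(ranked):
            # 1/(k + rank) damping: top ranks dominate, but a hit in
            # both lists beats a top hit in only one
            scores[mem_id] = scores.get(mem_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# "b" appears in both lists, so it outranks each list's own top hit:
print(rrf_merge(["a", "b"], ["b", "c"]))  # ['b', 'a', 'c']
```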
## Quick Start
The recommended way to run this server is via uvx:

```bash
uvx mnemo-mcp@latest
```

Alternatively, you can use `pipx run mnemo-mcp`.
### Option 1: uvx (Recommended)
```jsonc
{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        // -- first run downloads ~570MB model, cached for subsequent runs
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        // -- on first sync, a browser opens for OAuth (auto, no manual setup)
        "SYNC_ENABLED": "true", // optional, default: false
        "SYNC_INTERVAL": "300" // optional, auto-sync every 5min (0 = manual only)
        // "SYNC_REMOTE": "gdrive", // optional, default: gdrive
        // "SYNC_PROVIDER": "drive", // optional, default: drive (Google Drive)
      }
    }
  }
}
```

### Option 2: Docker
```jsonc
{
  "mcpServers": {
    "mnemo": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "--name", "mcp-mnemo",
        "-v", "mnemo-data:/data", // persists memories across restarts
        "-e", "LITELLM_PROXY_URL", // optional: pass-through from env below
        "-e", "LITELLM_PROXY_KEY", // optional: pass-through from env below
        "-e", "API_KEYS", // optional: pass-through from env below
        "-e", "EMBEDDING_API_BASE", // optional: pass-through from env below
        "-e", "EMBEDDING_API_KEY", // optional: pass-through from env below
        "-e", "SYNC_ENABLED", // optional: pass-through from env below
        "-e", "SYNC_INTERVAL", // optional: pass-through from env below
        "n24q02m/mnemo-mcp:latest"
      ],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true", // optional, default: false
        "SYNC_INTERVAL": "300" // optional, auto-sync every 5min (0 = manual only)
      }
    }
  }
}
```

### Pre-install (optional)
Pre-download dependencies before adding to your MCP client config. This avoids slow first-run startup:
```bash
# Pre-download embedding model (~570MB) and validate API keys
uvx mnemo-mcp warmup

# With cloud embedding (validates API key, skips local download if cloud works)
API_KEYS="GOOGLE_API_KEY:AIza..." uvx mnemo-mcp warmup
```

### Sync setup
Sync is fully automatic. Just set `SYNC_ENABLED=true` and the server handles everything:

1. **First sync**: rclone is auto-downloaded and a browser opens for OAuth authentication
2. **Token saved**: the OAuth token is stored locally at `~/.mnemo-mcp/tokens/` (600 permissions)
3. **Subsequent runs**: the token is loaded automatically — no manual steps needed
For non-Google Drive providers, set `SYNC_PROVIDER` and `SYNC_REMOTE`:

```jsonc
{
  "SYNC_ENABLED": "true",
  "SYNC_PROVIDER": "dropbox", // rclone provider type
  "SYNC_REMOTE": "dropbox" // rclone remote name
}
```

Advanced: you can also run `uvx mnemo-mcp setup-sync drive` to pre-authenticate before first use, but this is optional.
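The features list describes sync as a JSONL-based merge. A rough illustration of last-write-wins merging of two JSONL memory dumps (the `id` and `updated_at` field names are hypothetical, not the server's actual schema):

```python
import json

def merge_jsonl(local_lines, remote_lines):
    """Last-write-wins merge of two JSONL dumps, keyed by record id.

    Assumes each line is a JSON object with hypothetical 'id' and
    'updated_at' fields; illustrative only.
    """
    merged = {}
    for line in list(local_lines) + list(remote_lines):
        rec = json.loads(line)
        cur = merged.get(rec["id"])
        if cur is None or rec["updated_at"] > cur["updated_at"]:
            merged[rec["id"]] = rec  # newer record wins
    return [json.dumps(r, sort_keys=True) for r in merged.values()]

local = ['{"id": "m1", "updated_at": 1, "text": "old"}']
remote = ['{"id": "m1", "updated_at": 2, "text": "new"}',
          '{"id": "m2", "updated_at": 1, "text": "other"}']
print(merge_jsonl(local, remote))  # m1 keeps the newer "new" text
```

Because the merge is per-record rather than whole-file, two machines can append memories independently and converge on the union.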
## Configuration
| Variable | Default | Description |
|---|---|---|
| | | Database location |
| `LITELLM_PROXY_URL` | — | LiteLLM Proxy URL (e.g. `http://10.0.0.20:4000`) |
| `LITELLM_PROXY_KEY` | — | LiteLLM Proxy virtual key (e.g. `sk-your-virtual-key`) |
| `API_KEYS` | — | API keys (comma-separated `VAR:key` pairs) |
| `EMBEDDING_API_BASE` | — | Custom embedding endpoint URL (optional, for SDK mode) |
| `EMBEDDING_API_KEY` | — | Custom embedding endpoint key (optional) |
| `EMBEDDING_BACKEND` | (auto-detect) | Embedding backend (`local` forces local mode) |
| `EMBEDDING_MODEL` | auto-detect | LiteLLM model name (optional) |
| `EMBEDDING_DIMS` | | Embedding dimensions (0 = auto-detect, default 768) |
| `SYNC_ENABLED` | `false` | Enable rclone sync |
| `SYNC_PROVIDER` | `drive` | rclone provider type (drive, dropbox, s3, etc.) |
| `SYNC_REMOTE` | `gdrive` | rclone remote name |
| | | Remote folder |
| `SYNC_INTERVAL` | `300` | Auto-sync seconds (0 = manual) |
| | | Log level |
## Embedding (3-Mode Architecture)
Embedding is always available — a local model is built-in and requires no configuration.
Embedding access supports 3 modes, resolved by priority:
| Priority | Mode | Config | Use case |
|---|---|---|---|
| 1 | Proxy | `LITELLM_PROXY_URL` | Production (OCI VM, self-hosted gateway) |
| 2 | SDK | `API_KEYS` | Dev/local with direct API access |
| 3 | Local | Nothing needed | Offline, always available as fallback |
No cross-mode fallback — if proxy is configured but unreachable, calls fail (no silent fallback to direct API).
Local mode: Qwen3-Embedding-0.6B, always available with zero config.
GPU auto-detection: if a GPU is available (CUDA/DirectML) and `llama-cpp-python` is installed, the GGUF model (~480MB) is used automatically instead of ONNX (~570MB) for better performance.

All embeddings are stored at 768 dims (default), so switching providers never breaks the vector table. Override with `EMBEDDING_BACKEND=local` to force local mode even when API keys are set.
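Putting the three modes together, the priority resolution can be sketched as follows. This illustrates the documented precedence using the env var names defined above; it is not the server's actual code:

```python
import os

def resolve_embedding_mode(env=os.environ):
    """Pick an embedding mode by the documented priority. Sketch only."""
    if env.get("EMBEDDING_BACKEND") == "local":
        return "local"  # explicit override beats everything
    if env.get("LITELLM_PROXY_URL"):
        return "proxy"  # 1: LiteLLM Proxy (no fallback if unreachable)
    if env.get("API_KEYS") or env.get("EMBEDDING_API_BASE"):
        return "sdk"    # 2: direct API keys or custom endpoint
    return "local"      # 3: built-in Qwen3 model, zero config

print(resolve_embedding_mode({"LITELLM_PROXY_URL": "http://10.0.0.20:4000"}))  # proxy
```

Note the "no cross-mode fallback" rule above: once `proxy` is selected, a dead proxy is an error, not a reason to fall through to `sdk`.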
`API_KEYS` supports multiple providers in a single string:

```bash
API_KEYS=GOOGLE_API_KEY:AIza...,OPENAI_API_KEY:sk-...,COHERE_API_KEY:co-...
```

Cloud embedding providers (auto-detected from `API_KEYS`, priority order):
| Priority | Env Var (LiteLLM) | Model | Native Dims | Stored |
|---|---|---|---|---|
| 1 | `GEMINI_API_KEY` | | 3072 | 768 |
| 2 | `OPENAI_API_KEY` | | 3072 | 768 |
| 3 | `COHERE_API_KEY` | | 1024 | 768 |
All embeddings are truncated to 768 dims (default) for storage. This ensures switching models never breaks the vector table. Override with `EMBEDDING_DIMS` if needed.
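A minimal sketch of the truncate-to-768 storage step. Whether the server renormalizes after truncation is an assumption made here for illustration:

```python
import math

def truncate_embedding(vec, dims=768):
    """Truncate a vector to `dims` and L2-renormalize.

    Shows how provider embeddings of different native sizes can share
    one fixed-width vector table; the renormalize step is an assumption.
    """
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0  # avoid div by zero
    return [x / norm for x in head]

# Keep the first 2 of 3 dims, then renormalize to unit length:
print(truncate_embedding([3.0, 4.0, 12.0], dims=2))  # [0.6, 0.8]
```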
The `API_KEYS` format maps your env var to LiteLLM's expected var (e.g., `GOOGLE_API_KEY:key` auto-sets `GEMINI_API_KEY`). Set `EMBEDDING_MODEL` explicitly for other providers.
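As an illustration, the `API_KEYS` string and the documented `GOOGLE_API_KEY` to `GEMINI_API_KEY` remap could be parsed like this (hypothetical parser, not the server's implementation):

```python
def parse_api_keys(spec):
    """Parse the API_KEYS string into {env_var: key} pairs, applying
    the documented GOOGLE_API_KEY -> GEMINI_API_KEY remap for LiteLLM.
    """
    remap = {"GOOGLE_API_KEY": "GEMINI_API_KEY"}
    out = {}
    for pair in spec.split(","):
        var, _, key = pair.partition(":")  # split on first ':' only
        var = var.strip()
        out[remap.get(var, var)] = key
    return out

print(parse_api_keys("GOOGLE_API_KEY:AIza-x,OPENAI_API_KEY:sk-y"))
# {'GEMINI_API_KEY': 'AIza-x', 'OPENAI_API_KEY': 'sk-y'}
```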
## MCP Tools
### `memory` — Core memory operations

Each action takes required and optional parameters; run `help(topic="memory")` for the full list.
### `config` — Server configuration

Run `help(topic="config")` for the available actions and parameters.
### `help` — Full documentation

```python
help(topic="memory")  # or "config"
```

## MCP Resources
Two resources are exposed: database statistics and server status, and the 10 most recently updated memories.
## MCP Prompts

Two prompts are provided: one generates a prompt to save a conversation summary as memory, the other generates a prompt to recall relevant memories about a topic.
## Architecture

```text
MCP Client (Claude, Cursor, etc.)
        |
  FastMCP Server
   /    |    \
memory config help
  |      |     |
MemoryDB Settings docs/
  /   \
FTS5  sqlite-vec
        |
EmbeddingBackend
   /         \
LiteLLM   Qwen3 ONNX
   |      (local CPU)
Gemini / OpenAI / Cohere

Sync: rclone (embedded) -> Google Drive / S3 / ...
```

## Development
```bash
# Install
uv sync

# Run
uv run mnemo-mcp

# Lint
uv run ruff check src/
uv run ty check src/

# Test
uv run pytest
```

## Compatible With
## Also by n24q02m
- Notion API for AI agents
- Web search, content extraction, library docs
- Email (IMAP/SMTP) for AI agents
- Godot Engine for AI agents
## Related Projects

- **modalcom-ai-workers** — GPU-accelerated AI workers on Modal.com (embedding, reranking)
- **qwen3-embed** — Local embedding/reranking library used by mnemo-mcp
## Contributing

See CONTRIBUTING.md

## License

MIT - See LICENSE