# Mnemo MCP Server
mcp-name: io.github.n24q02m/mnemo-mcp
Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.
## Features

- **Hybrid search**: FTS5 full-text + sqlite-vec semantic + Qwen3-Embedding-0.6B (built-in)
- **Zero config mode**: Works out of the box — local embedding, no API keys needed
- **Auto-detect embedding**: Set `API_KEYS` for cloud embedding, auto-fallback to local
- **Embedded sync**: rclone auto-downloaded and managed as a subprocess
- **Multi-machine**: JSONL-based merge sync via rclone (Google Drive, S3, etc.)
- **Proactive memory**: Tool descriptions guide the AI to save preferences, decisions, and facts
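Hybrid search means merging two ranked result lists — one from FTS5 keyword matching, one from sqlite-vec similarity. The fusion method mnemo uses is not documented here; reciprocal rank fusion (RRF) is a common approach and makes a useful mental model. A minimal illustrative sketch (not mnemo's actual code):

```python
def rrf_merge(fts_ranked, vec_ranked, k=60):
    """Merge two ranked ID lists with reciprocal rank fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in;
    k=60 is the constant commonly used with RRF.
    """
    scores = {}
    for ranked in (fts_ranked, vec_ranked):
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Doc "b" ranks high in both lists, so it wins overall.
print(rrf_merge(["a", "b", "c"], ["b", "d", "a"]))  # → ['b', 'a', 'd', 'c']
```

Documents appearing in both lists accumulate score from each, which is why a result that is merely decent in both rankings can beat one that tops only a single list.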
## Quick Start

The recommended way to run this server is via `uvx`:

```shell
uvx mnemo-mcp@latest
```

Alternatively, you can use `pipx run mnemo-mcp`.
### Option 1: uvx (Recommended)

```jsonc
{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        // -- first run downloads ~570MB model, cached for subsequent runs
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",                  // optional, default: false
        "SYNC_REMOTE": "gdrive",                 // required when SYNC_ENABLED=true
        "SYNC_INTERVAL": "300",                  // optional, auto-sync every 5min (0 = manual only)
        "RCLONE_CONFIG_GDRIVE_TYPE": "drive",    // required when SYNC_ENABLED=true
        "RCLONE_CONFIG_GDRIVE_TOKEN": "<base64>" // required when SYNC_ENABLED=true, from: uvx mnemo-mcp setup-sync drive
      }
    }
  }
}
```

### Option 2: Docker
```jsonc
{
  "mcpServers": {
    "mnemo": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "--name", "mcp-mnemo",
        "-v", "mnemo-data:/data",            // persists memories across restarts
        "-e", "LITELLM_PROXY_URL",           // optional: pass-through from env below
        "-e", "LITELLM_PROXY_KEY",           // optional: pass-through from env below
        "-e", "API_KEYS",                    // optional: pass-through from env below
        "-e", "EMBEDDING_API_BASE",          // optional: pass-through from env below
        "-e", "EMBEDDING_API_KEY",           // optional: pass-through from env below
        "-e", "SYNC_ENABLED",                // optional: pass-through from env below
        "-e", "SYNC_REMOTE",                 // required when SYNC_ENABLED=true: pass-through
        "-e", "SYNC_INTERVAL",               // optional: pass-through from env below
        "-e", "RCLONE_CONFIG_GDRIVE_TYPE",   // required when SYNC_ENABLED=true: pass-through
        "-e", "RCLONE_CONFIG_GDRIVE_TOKEN",  // required when SYNC_ENABLED=true: pass-through
        "n24q02m/mnemo-mcp:latest"
      ],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",                  // optional, default: false
        "SYNC_REMOTE": "gdrive",                 // required when SYNC_ENABLED=true
        "SYNC_INTERVAL": "300",                  // optional, auto-sync every 5min (0 = manual only)
        "RCLONE_CONFIG_GDRIVE_TYPE": "drive",    // required when SYNC_ENABLED=true
        "RCLONE_CONFIG_GDRIVE_TOKEN": "<base64>" // required when SYNC_ENABLED=true, from: uvx mnemo-mcp setup-sync drive
      }
    }
  }
}
```

### Pre-install (optional)
Pre-download dependencies before adding to your MCP client config. This avoids slow first-run startup:
```shell
# Pre-download embedding model (~570MB) and validate API keys
uvx mnemo-mcp warmup

# With cloud embedding (validates API key, skips local download if cloud works)
API_KEYS="GOOGLE_API_KEY:AIza..." uvx mnemo-mcp warmup
```

### Sync setup (one-time)
```shell
# Google Drive
uvx mnemo-mcp setup-sync drive

# Other providers (any rclone remote type)
uvx mnemo-mcp setup-sync dropbox
uvx mnemo-mcp setup-sync onedrive
uvx mnemo-mcp setup-sync s3
```

This opens a browser for OAuth and prints the env vars (`RCLONE_CONFIG_*`) to set. Both raw JSON and base64 tokens are supported.
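The `setup-sync` command prints the token for you, but since both raw JSON and base64 tokens are accepted, you can also encode an existing rclone token yourself. A sketch (the token fields shown are hypothetical; real values come from your rclone OAuth flow):

```python
import base64
import json

# Hypothetical rclone OAuth token -- real values come from `setup-sync`.
token = {"access_token": "ya29.example", "token_type": "Bearer"}

# RCLONE_CONFIG_GDRIVE_TOKEN accepts either the raw JSON or this base64 form.
encoded = base64.b64encode(json.dumps(token).encode()).decode()

# Round-trip check: decoding recovers the original token.
decoded = json.loads(base64.b64decode(encoded))
assert decoded == token
```

The base64 form avoids quoting problems when the JSON token is embedded in an MCP client config file.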
## Configuration

| Variable | Default | Description |
|---|---|---|
| — | — | Database location |
| `LITELLM_PROXY_URL` | — | LiteLLM Proxy URL (e.g. `http://10.0.0.20:4000`) |
| `LITELLM_PROXY_KEY` | — | LiteLLM Proxy virtual key (e.g. `sk-your-virtual-key`) |
| `API_KEYS` | — | API keys (`NAME:key`, comma-separated) |
| `EMBEDDING_API_BASE` | — | Custom embedding endpoint URL (optional, for SDK mode) |
| `EMBEDDING_API_KEY` | — | Custom embedding endpoint key (optional) |
| `EMBEDDING_BACKEND` | (auto-detect) | Embedding backend (`local` forces the built-in model) |
| `EMBEDDING_MODEL` | auto-detect | LiteLLM model name (optional) |
| `EMBEDDING_DIMS` | `768` | Embedding dimensions (0 = auto-detect, default 768) |
| `SYNC_ENABLED` | `false` | Enable rclone sync |
| `SYNC_REMOTE` | — | rclone remote name (required when sync enabled) |
| — | — | Remote folder (optional) |
| `SYNC_INTERVAL` | — | Auto-sync seconds (optional, 0 = manual) |
| — | — | Log level (optional) |
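For reference, the sync-related settings above resolve roughly like this. This is an illustrative sketch, not mnemo's actual settings code; the fallback interval of 300 is an assumption taken from the Quick Start example:

```python
import os

def load_sync_settings(env=None):
    """Resolve sync settings per the documented behavior (sketch)."""
    env = os.environ if env is None else env
    enabled = env.get("SYNC_ENABLED", "false").lower() == "true"
    remote = env.get("SYNC_REMOTE")  # required when sync is enabled
    # 0 = manual only; 300 (5 min) assumed as fallback from the example config
    interval = int(env.get("SYNC_INTERVAL", "300"))
    if enabled and not remote:
        raise ValueError("SYNC_REMOTE is required when SYNC_ENABLED=true")
    return {"enabled": enabled, "remote": remote, "interval": interval}

settings = load_sync_settings({"SYNC_ENABLED": "true", "SYNC_REMOTE": "gdrive"})
assert settings == {"enabled": True, "remote": "gdrive", "interval": 300}
```

Note that sync is off by default, so omitting all `SYNC_*` variables leaves the server fully local.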
## Embedding (3-Mode Architecture)
Embedding is always available — a local model is built-in and requires no configuration.
Embedding access supports 3 modes, resolved by priority:
| Priority | Mode | Config | Use case |
|---|---|---|---|
| 1 | Proxy | `LITELLM_PROXY_URL` | Production (OCI VM, self-hosted gateway) |
| 2 | SDK | `API_KEYS` | Dev/local with direct API access |
| 3 | Local | Nothing needed | Offline, always available as fallback |
No cross-mode fallback — if proxy is configured but unreachable, calls fail (no silent fallback to direct API).
**Local mode**: Qwen3-Embedding-0.6B, always available with zero config.

**GPU auto-detection**: If a GPU is available (CUDA/DirectML) and `llama-cpp-python` is installed, the GGUF model (~480MB) is used automatically instead of ONNX (~570MB) for better performance.

All embeddings are stored at 768 dims (default), so switching providers never breaks the vector table.

Override with `EMBEDDING_BACKEND=local` to force local even when API keys are set.
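The priority resolution described above can be sketched as follows. The function name is hypothetical, not mnemo's internals; the rules (explicit override first, then proxy, then SDK, then local, with no cross-mode fallback at call time) come from this section:

```python
def resolve_embedding_mode(env):
    """Pick the embedding mode by documented priority (sketch)."""
    if env.get("EMBEDDING_BACKEND") == "local":
        return "local"              # explicit override wins
    if env.get("LITELLM_PROXY_URL"):
        return "proxy"              # priority 1: LiteLLM Proxy gateway
    if env.get("API_KEYS"):
        return "sdk"                # priority 2: direct cloud API via SDK
    return "local"                  # priority 3: built-in Qwen3 model

assert resolve_embedding_mode({}) == "local"
assert resolve_embedding_mode({"API_KEYS": "GOOGLE_API_KEY:x"}) == "sdk"
assert resolve_embedding_mode({"LITELLM_PROXY_URL": "http://gw:4000"}) == "proxy"
assert resolve_embedding_mode({"API_KEYS": "x", "EMBEDDING_BACKEND": "local"}) == "local"
```

Resolution happens once at startup; if the chosen mode then fails at call time, the call fails rather than silently falling through to a lower-priority mode.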
`API_KEYS` supports multiple providers in a single string:

```shell
API_KEYS=GOOGLE_API_KEY:AIza...,OPENAI_API_KEY:sk-...,COHERE_API_KEY:co-...
```

Cloud embedding providers (auto-detected from `API_KEYS`, priority order):
| Priority | Env Var (LiteLLM) | Model | Native Dims | Stored |
|---|---|---|---|---|
| 1 | `GEMINI_API_KEY` | — | 3072 | 768 |
| 2 | `OPENAI_API_KEY` | — | 3072 | 768 |
| 3 | `COHERE_API_KEY` | — | 1024 | 768 |
All embeddings are truncated to 768 dims (default) for storage. This ensures switching models never breaks the vector table. Override with `EMBEDDING_DIMS` if needed.
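Truncation to the stored dimension is straightforward to picture. A sketch (the function name is hypothetical; whether mnemo re-normalizes after truncation is not stated here, so this simply slices):

```python
def truncate_dims(vector, dims=768):
    """Keep only the first `dims` components so all providers share one vector table."""
    if len(vector) < dims:
        raise ValueError(f"embedding has {len(vector)} dims, expected >= {dims}")
    return vector[:dims]

native = [0.01 * i for i in range(3072)]  # e.g. a 3072-dim cloud embedding
stored = truncate_dims(native)
assert len(stored) == 768
```

Because every provider's output is cut to the same width, a 3072-dim Gemini vector and a 1024-dim Cohere vector end up row-compatible in sqlite-vec.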
The `API_KEYS` format maps your env var to LiteLLM's expected var (e.g., `GOOGLE_API_KEY:key` auto-sets `GEMINI_API_KEY`). Set `EMBEDDING_MODEL` explicitly for other providers.
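Parsing the `API_KEYS` string and applying that mapping can be sketched as follows. Illustrative only: of the aliases, only `GOOGLE_API_KEY` → `GEMINI_API_KEY` is documented above, and the function name is hypothetical:

```python
# Documented alias: GOOGLE_API_KEY auto-sets LiteLLM's GEMINI_API_KEY.
ALIASES = {"GOOGLE_API_KEY": "GEMINI_API_KEY"}

def parse_api_keys(raw):
    """Split comma-separated `NAME:key` pairs and apply LiteLLM env-var aliases."""
    keys = {}
    for pair in raw.split(","):
        name, _, value = pair.partition(":")  # value may itself contain ':'
        keys[ALIASES.get(name, name)] = value
    return keys

parsed = parse_api_keys("GOOGLE_API_KEY:AIza-x,OPENAI_API_KEY:sk-y")
assert parsed == {"GEMINI_API_KEY": "AIza-x", "OPENAI_API_KEY": "sk-y"}
```

Using `partition` rather than `split(":")` keeps keys intact even if a provider key contains a colon.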
## MCP Tools

### memory — Core memory operations
| Action | Required | Optional |
|---|---|---|
### config — Server configuration

| Action | Required | Optional |
|---|---|---|
### help — Full documentation

```python
help(topic="memory")  # or "config"
```

## MCP Resources
| URI | Description |
|---|---|
| — | Database statistics and server status |
| — | 10 most recently updated memories |
## MCP Prompts

| Prompt | Parameters | Description |
|---|---|---|
| — | — | Generate prompt to save a conversation summary as memory |
| — | — | Generate prompt to recall relevant memories about a topic |
## Architecture

```
MCP Client (Claude, Cursor, etc.)
        |
  FastMCP Server
   /    |    \
memory config help
  |     |     |
MemoryDB Settings docs/
  /  \
FTS5 sqlite-vec
    |
EmbeddingBackend
  /        \
LiteLLM   Qwen3 ONNX
  |       (local CPU)
Gemini / OpenAI / Cohere

Sync: rclone (embedded) -> Google Drive / S3 / ...
```

## Development
```shell
# Install
uv sync

# Run
uv run mnemo-mcp

# Lint
uv run ruff check src/
uv run ty check src/

# Test
uv run pytest
```

## Compatible With
## Also by n24q02m

| Server | Description | Install |
|---|---|---|
| — | Notion API for AI agents | — |
| — | Web search, content extraction, library docs | — |
| — | Email (IMAP/SMTP) for AI agents | — |
| — | Godot Engine for AI agents | — |
## Related Projects

- **modalcom-ai-workers** — GPU-accelerated AI workers on Modal.com (embedding, reranking)
- **qwen3-embed** — Local embedding/reranking library used by mnemo-mcp
## Contributing

See CONTRIBUTING.md

## License

MIT - See LICENSE