
Mnemo MCP Server

mcp-name: io.github.n24q02m/mnemo-mcp

Persistent AI memory with hybrid search and embedded sync. Open, free, unlimited.


Features

  • Hybrid search: FTS5 full-text + sqlite-vec semantic + Qwen3-Embedding-0.6B (built-in)

  • Zero config mode: Works out of the box — local embedding, no API keys needed

  • Auto-detect embedding: Set API_KEYS for cloud embedding, auto-fallback to local

  • Embedded sync: rclone auto-downloaded and managed as subprocess

  • Multi-machine: JSONL-based merge sync via rclone (Google Drive, S3, etc.)

  • Proactive memory: Tool descriptions guide AI to save preferences, decisions, facts

Quick Start

Option 1: uvx (recommended)

Run the server with uvx:

uvx mnemo-mcp@latest

Alternatively, use pipx run mnemo-mcp. Then add the server to your MCP client configuration:

{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        // -- first run downloads ~570MB model, cached for subsequent runs
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",                    // optional, default: false
        "SYNC_REMOTE": "gdrive",                   // required when SYNC_ENABLED=true
        "SYNC_INTERVAL": "300",                    // optional, auto-sync every 5min (0 = manual only)
        "RCLONE_CONFIG_GDRIVE_TYPE": "drive",      // required when SYNC_ENABLED=true
        "RCLONE_CONFIG_GDRIVE_TOKEN": "<base64>"   // required when SYNC_ENABLED=true, from: uvx mnemo-mcp setup-sync drive
      }
    }
  }
}
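Note that the // comments above are illustrative; strict JSON parsers reject them, so remove any comments in your actual config. A minimal zero-config entry (assuming you omit all the optional variables) needs nothing beyond the command:

```json
{
  "mcpServers": {
    "mnemo": {
      "command": "uvx",
      "args": ["mnemo-mcp@latest"]
    }
  }
}
```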

Option 2: Docker

{
  "mcpServers": {
    "mnemo": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm",
        "--name", "mcp-mnemo",
        "-v", "mnemo-data:/data",                  // persists memories across restarts
        "-e", "LITELLM_PROXY_URL",                 // optional: pass-through from env below
        "-e", "LITELLM_PROXY_KEY",                 // optional: pass-through from env below
        "-e", "API_KEYS",                          // optional: pass-through from env below
        "-e", "EMBEDDING_API_BASE",                // optional: pass-through from env below
        "-e", "EMBEDDING_API_KEY",                 // optional: pass-through from env below
        "-e", "SYNC_ENABLED",                      // optional: pass-through from env below
        "-e", "SYNC_REMOTE",                       // required when SYNC_ENABLED=true: pass-through
        "-e", "SYNC_INTERVAL",                     // optional: pass-through from env below
        "-e", "RCLONE_CONFIG_GDRIVE_TYPE",         // required when SYNC_ENABLED=true: pass-through
        "-e", "RCLONE_CONFIG_GDRIVE_TOKEN",        // required when SYNC_ENABLED=true: pass-through
        "n24q02m/mnemo-mcp:latest"
      ],
      "env": {
        // -- optional: LiteLLM Proxy (production, self-hosted gateway)
        // "LITELLM_PROXY_URL": "http://10.0.0.20:4000",
        // "LITELLM_PROXY_KEY": "sk-your-virtual-key",
        // -- optional: cloud embedding (Gemini > OpenAI > Cohere) for semantic search
        // -- without this, uses built-in local Qwen3-Embedding-0.6B (ONNX, CPU)
        "API_KEYS": "GOOGLE_API_KEY:AIza...",
        // -- optional: custom embedding endpoint (e.g. modalcom-ai-workers on Modal.com)
        // "EMBEDDING_API_BASE": "https://your-worker.modal.run",
        // "EMBEDDING_API_KEY": "your-key",
        // -- optional: sync memories across machines via rclone
        "SYNC_ENABLED": "true",                    // optional, default: false
        "SYNC_REMOTE": "gdrive",                   // required when SYNC_ENABLED=true
        "SYNC_INTERVAL": "300",                    // optional, auto-sync every 5min (0 = manual only)
        "RCLONE_CONFIG_GDRIVE_TYPE": "drive",      // required when SYNC_ENABLED=true
        "RCLONE_CONFIG_GDRIVE_TOKEN": "<base64>"   // required when SYNC_ENABLED=true, from: uvx mnemo-mcp setup-sync drive
      }
    }
  }
}

Pre-install (optional)

Pre-download dependencies before adding the server to your MCP client config, to avoid a slow first-run startup:

# Pre-download embedding model (~570MB) and validate API keys
uvx mnemo-mcp warmup

# With cloud embedding (validates API key, skips local download if cloud works)
API_KEYS="GOOGLE_API_KEY:AIza..." uvx mnemo-mcp warmup

Sync setup (one-time)

# Google Drive
uvx mnemo-mcp setup-sync drive

# Other providers (any rclone remote type)
uvx mnemo-mcp setup-sync dropbox
uvx mnemo-mcp setup-sync onedrive
uvx mnemo-mcp setup-sync s3

The command opens a browser for OAuth and prints the env vars (RCLONE_CONFIG_*) to set. Both raw JSON and base64-encoded tokens are supported.
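Under the hood, sync is a JSONL-based merge (per the Features list). A minimal last-write-wins merge can be sketched as follows; the id and updated_at field names are assumptions for illustration, not the actual mnemo-mcp schema:

```python
import json

def merge_jsonl(local_lines, remote_lines):
    # Last-write-wins merge of two JSONL memory dumps, keyed by id.
    # Illustrative sketch only; mnemo-mcp's real merge logic may differ.
    merged = {}
    for line in list(local_lines) + list(remote_lines):
        rec = json.loads(line)
        cur = merged.get(rec["id"])
        if cur is None or rec["updated_at"] > cur["updated_at"]:
            merged[rec["id"]] = rec
    return [json.dumps(r, sort_keys=True) for r in merged.values()]
```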

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| DB_PATH | ~/.mnemo-mcp/memories.db | Database location |
| LITELLM_PROXY_URL | | LiteLLM Proxy URL (e.g. http://10.0.0.20:4000). Enables proxy mode |
| LITELLM_PROXY_KEY | | LiteLLM Proxy virtual key (e.g. sk-...) |
| API_KEYS | | API keys (ENV:key,ENV:key). Optional: enables semantic search (SDK mode) |
| EMBEDDING_API_BASE | | Custom embedding endpoint URL (optional, for SDK mode) |
| EMBEDDING_API_KEY | | Custom embedding endpoint key (optional) |
| EMBEDDING_BACKEND | (auto-detect) | litellm (cloud API) or local (Qwen3). Auto: API_KEYS -> litellm, else local (always available) |
| EMBEDDING_MODEL | (auto-detect) | LiteLLM model name (optional) |
| EMBEDDING_DIMS | 0 (auto = 768) | Embedding dimensions (0 = auto-detect, default 768) |
| SYNC_ENABLED | false | Enable rclone sync |
| SYNC_REMOTE | | rclone remote name (required when sync enabled) |
| SYNC_FOLDER | mnemo-mcp | Remote folder (optional) |
| SYNC_INTERVAL | 0 | Auto-sync interval in seconds (0 = manual only) |
| LOG_LEVEL | INFO | Log level (optional) |

Embedding (3-Mode Architecture)

Embedding is always available — a local model is built-in and requires no configuration.

Embedding access supports 3 modes, resolved by priority:

| Priority | Mode | Config | Use case |
| --- | --- | --- | --- |
| 1 | Proxy | LITELLM_PROXY_URL + LITELLM_PROXY_KEY | Production (OCI VM, self-hosted gateway) |
| 2 | SDK | API_KEYS or EMBEDDING_API_BASE | Dev/local with direct API access |
| 3 | Local | (nothing needed) | Offline, always available as fallback |

No cross-mode fallback: if the proxy is configured but unreachable, calls fail (there is no silent fallback to the direct API).
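The resolution order can be sketched in a few lines (resolve_embedding_mode is a hypothetical helper, not the package's actual API):

```python
def resolve_embedding_mode(env):
    # Sketch of the 3-mode priority resolution described above.
    if env.get("EMBEDDING_BACKEND") == "local":
        return "local"  # explicit override, even with API keys set
    if env.get("LITELLM_PROXY_URL") and env.get("LITELLM_PROXY_KEY"):
        return "proxy"  # no cross-mode fallback: if unreachable, calls fail
    if env.get("API_KEYS") or env.get("EMBEDDING_API_BASE"):
        return "sdk"
    return "local"      # built-in Qwen3 model, always available
```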

  • Local mode: Qwen3-Embedding-0.6B, always available with zero config.

  • GPU auto-detection: If GPU is available (CUDA/DirectML) and llama-cpp-python is installed, automatically uses GGUF model (~480MB) instead of ONNX (~570MB) for better performance.

  • All embeddings stored at 768 dims (default). Switching providers never breaks the vector table.

  • Override with EMBEDDING_BACKEND=local to force local even with API keys.

API_KEYS supports multiple providers in a single string:

API_KEYS=GOOGLE_API_KEY:AIza...,OPENAI_API_KEY:sk-...,COHERE_API_KEY:co-...
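The format is a comma-separated list of ENV:value pairs; a parsing sketch (parse_api_keys is a hypothetical helper, not part of the package API):

```python
def parse_api_keys(raw):
    # Split "ENV:key,ENV:key" into {env_var: key}; key values may themselves
    # contain ":" since only the first colon is treated as the separator.
    pairs = {}
    for item in raw.split(","):
        env_var, _, key = item.partition(":")
        if env_var and key:
            pairs[env_var.strip()] = key.strip()
    return pairs
```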

Cloud embedding providers (auto-detected from API_KEYS, priority order):

| Priority | Env Var (LiteLLM) | Model | Native Dims | Stored |
| --- | --- | --- | --- | --- |
| 1 | GEMINI_API_KEY | gemini/gemini-embedding-001 | 3072 | 768 |
| 2 | OPENAI_API_KEY | text-embedding-3-large | 3072 | 768 |
| 3 | COHERE_API_KEY | embed-multilingual-v3.0 | 1024 | 768 |

All embeddings are truncated to 768 dims (default) for storage. This ensures switching models never breaks the vector table. Override with EMBEDDING_DIMS if needed.
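Assuming first-N truncation with re-normalization (the usual way Matryoshka-style embeddings such as gemini-embedding-001 and text-embedding-3-large are shortened; the package's exact method is not documented here), the reduction looks like:

```python
import math

def truncate_embedding(vec, dims=768):
    # Keep the first `dims` components and re-normalize to unit length,
    # so a 3072-dim cloud embedding is stored as a 768-dim vector.
    head = vec[:dims]
    norm = math.sqrt(sum(x * x for x in head)) or 1.0
    return [x / norm for x in head]
```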

API_KEYS format maps your env var to LiteLLM's expected var (e.g., GOOGLE_API_KEY:key auto-sets GEMINI_API_KEY). Set EMBEDDING_MODEL explicitly for other providers.

MCP Tools

memory — Core memory operations

| Action | Required | Optional |
| --- | --- | --- |
| add | content | category, tags |
| search | query | category, tags, limit |
| list | | category, limit |
| update | memory_id | content, category, tags |
| delete | memory_id | |
| export | | |
| import | data (JSONL) | mode (merge/replace) |
| stats | | |
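For example, an add call would carry arguments shaped roughly like this (field names follow the table above; the exact argument schema may differ):

```json
{
  "action": "add",
  "content": "User prefers concise answers with code examples",
  "category": "preferences",
  "tags": ["style"]
}
```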

config — Server configuration

| Action | Required | Optional |
| --- | --- | --- |
| status | | |
| sync | | |
| set | key, value | |

help — Full documentation

help(topic="memory")  # or "config"

MCP Resources

| URI | Description |
| --- | --- |
| mnemo://stats | Database statistics and server status |
| mnemo://recent | 10 most recently updated memories |

MCP Prompts

| Prompt | Parameters | Description |
| --- | --- | --- |
| save_summary | summary | Generate prompt to save a conversation summary as memory |
| recall_context | topic | Generate prompt to recall relevant memories about a topic |

Architecture

                  MCP Client (Claude, Cursor, etc.)
                         |
                    FastMCP Server
                   /      |       \
             memory    config    help
                |         |        |
            MemoryDB   Settings  docs/
            /     \
        FTS5    sqlite-vec
                    |
              EmbeddingBackend
              /            \
         LiteLLM        Qwen3 ONNX
            |           (local CPU)
  Gemini / OpenAI / Cohere

        Sync: rclone (embedded) -> Google Drive / S3 / ...

Development

# Install
uv sync

# Run
uv run mnemo-mcp

# Lint
uv run ruff check src/
uv run ty check src/

# Test
uv run pytest

Compatible With

Claude Desktop Claude Code Cursor VS Code Copilot Antigravity Gemini CLI OpenAI Codex OpenCode

Also by n24q02m

| Server | Description | Install |
| --- | --- | --- |
| better-notion-mcp | Notion API for AI agents | npx -y @n24q02m/better-notion-mcp@latest |
| wet-mcp | Web search, content extraction, library docs | uvx --python 3.13 wet-mcp@latest |
| better-email-mcp | Email (IMAP/SMTP) for AI agents | npx -y @n24q02m/better-email-mcp@latest |
| better-godot-mcp | Godot Engine for AI agents | npx -y @n24q02m/better-godot-mcp@latest |

  • modalcom-ai-workers — GPU-accelerated AI workers on Modal.com (embedding, reranking)

  • qwen3-embed — Local embedding/reranking library used by mnemo-mcp

Contributing

See CONTRIBUTING.md

License

MIT - See LICENSE
