recall-mcp

by Dhari-Q

One shared, layered, local-first brain for every AI CLI you use. Claude Code, Gemini CLI, Cursor, Continue, Zed — they all forget. recall-mcp is the memory they share.

License: MIT Python 3.10+ MCP Local-first

Built on the layered memory engine from Hermes Agent by Nous Research (MIT). recall-mcp packages that engine as a standalone MCP server so any AI client — not just Hermes — can plug into the same brain. Original architecture: theirs. Packaging, MCP surface, cross-CLI integration: this project. See Credits.

Quick start

# Install
pipx install recall-mcp

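# Wire it into Claude Code (one-time).
# Create the file only if it does not exist yet: appending with >> to an existing
# ~/.claude.json would leave invalid JSON. If the file exists, merge the entry by hand
# as described under "Configure your AI client".
[ -e ~/.claude.json ] || echo '{"mcpServers":{"recall-mcp":{"type":"stdio","command":"recall-mcp"}}}' > ~/.claude.json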

# Restart Claude Code. Done.

That's it. Every conversation now writes to and reads from the same persistent brain — and so do Gemini CLI, Cursor, and any other MCP-aware client you wire up the same way.

What it does

flowchart TD
    A[Claude Code] -- MCP --> M[recall-mcp]
    B[Gemini CLI] -- MCP --> M
    C[Cursor / Continue / Zed] -- MCP --> M
    M --> S[(SQLite<br/>facts)]
    M --> V[(ChromaDB<br/>vectors)]
    M --> E[(Entity<br/>graph)]
    M --> T[(Temporal<br/>lineage)]
    M --> F[(FTS5<br/>keyword)]
    classDef client fill:#1f6feb,stroke:#1f6feb,color:#fff,stroke-width:0
    classDef brain fill:#a371f7,stroke:#a371f7,color:#fff,stroke-width:0
    classDef store fill:#0d1117,stroke:#30363d,color:#7d8590
    class A,B,C client
    class M brain
    class S,V,E,T,F store

Every AI CLI has the same blind spot: each new session starts with amnesia. Native save_memory tools store flat lists that bloat the system prompt over time. Cloud memory services require accounts and paid tiers, and they put your data in a vendor's hands.

recall-mcp gives you one brain shared by every MCP-aware AI client:

🧠 7 memory layers — vector similarity, BM25 keyword, entity graph, temporal lineage, importance scoring, forgetting engine, hybrid retrieval
🔌 Drop-in via MCP — works with Claude Code, Gemini CLI, Cursor, Continue, Zed, any client speaking Model Context Protocol
🏠 Local-first — SQLite + ChromaDB on your machine. No accounts, no Docker, no cloud lock-in
🔄 Brain-swappable — switch between Claude, Gemini, MiniMax, Qwen — they all share the same memory
🛡️ Graceful degradation — when embeddings hit rate limits, BM25 + entity + temporal carry the load. Never poisons the index

Install

pipx install recall-mcp

Or with uv:

uv tool install recall-mcp

Or from source:

git clone https://github.com/Dhari-Q/recall-mcp
cd recall-mcp
pip install -e .

Configure your AI client

Claude Code

Add to ~/.claude.json under your project's mcpServers:

{
  "mcpServers": {
    "recall-mcp": {
      "type": "stdio",
      "command": "recall-mcp"
    }
  }
}

Gemini CLI

Add to ~/.gemini/settings.json:

{
  "mcpServers": {
    "recall-mcp": {
      "command": "recall-mcp",
      "trust": true
    }
  }
}

Cursor

Add to ~/.cursor/mcp.json:

{
  "mcpServers": {
    "recall-mcp": {
      "command": "recall-mcp"
    }
  }
}

Restart your client. Done.
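
If the config file already exists and holds other settings, the entry has to be merged into it rather than pasted over it. The following is a minimal sketch of such a merge; the helper name `add_mcp_server` is hypothetical and not part of recall-mcp, and it assumes the config file is plain JSON with a top-level `mcpServers` object, as in the snippets above.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry into the mcpServers object of a JSON config file.

    Hypothetical helper for illustration; existing keys and servers are kept.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # setdefault keeps any servers that are already registered
    config.setdefault("mcpServers", {})[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For Cursor, for example, this could be called as `add_mcp_server(Path.home() / ".cursor" / "mcp.json", "recall-mcp", {"command": "recall-mcp"})`. Back up the file first if it contains anything you care about.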

Five tools you'll use

| Tool | Purpose |
| --- | --- |
| `memory_recall(query, top_k)` | Hybrid search across all layers: vector + BM25 + entity + temporal |
| `memory_remember(content, type, confidence, tags)` | Store a fact, decision, preference, or gotcha |
| `memory_recent_sessions(limit)` | List recent session summaries with decisions and bug fixes |
| `memory_search_entity(name, limit)` | Find memories tied to a specific file, project, person, or tool |
| `memory_stats()` | Sanity-check counts across every layer |
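
Your AI client normally issues these calls for you. For orientation, here is a rough sketch of what a tool invocation looks like on the wire: MCP wraps each call in a JSON-RPC 2.0 `tools/call` request. The helper `tool_call` is hypothetical, and the argument value formats (a float for `confidence`, a list for `tags`) are assumptions, since only the parameter names are listed above.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids must be unique per session


def tool_call(name: str, **arguments) -> str:
    """Serialize one MCP tools/call request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Value formats below are assumed for illustration.
remember = tool_call(
    "memory_remember",
    content="We chose MIT over GPL",
    type="decision",
    confidence=0.9,
    tags=["license"],
)
recall = tool_call("memory_recall", query="which license did we pick", top_k=5)
```

A real client sends these messages over the server's stdio transport only after the MCP initialize handshake has completed.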

Optional: real semantic search

By default, recall-mcp ships with BM25 keyword + entity graph + temporal retrieval — those work without any API key.

To enable vector / semantic search (queries like "how do I swap the AI" finding "switchable via /model" without shared keywords), point recall-mcp at an embeddings provider:

Create ~/.recall-mcp/.env (or export in your shell):

# MiniMax (global) — fastest path
MINIMAX_API_KEY=sk-...

# Or OpenAI
OPENAI_API_KEY=sk-...

# Or OpenRouter
OPENROUTER_API_KEY=sk-...

The vector layer activates automatically on the next start.
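
To make the activation logic concrete, the sketch below shows one way a server could decide which embeddings provider to use from these variables. It is an illustration only: the variable names come from the section above, but the precedence order and the `detect_provider` function are assumptions, not recall-mcp's documented behavior.

```python
import os

# Variable names are from the .env example; the order is an assumed precedence.
PROVIDER_ENV_VARS = (
    ("minimax", "MINIMAX_API_KEY"),
    ("openai", "OPENAI_API_KEY"),
    ("openrouter", "OPENROUTER_API_KEY"),
)


def detect_provider(env=None):
    """Return the first provider whose API key is set, or None if none is."""
    env = os.environ if env is None else env
    for provider, var in PROVIDER_ENV_VARS:
        if env.get(var, "").strip():
            return provider
    # No key found: only the keyword, entity, and temporal layers are usable.
    return None
```

Passing a plain dict as `env` makes the function easy to try without touching your real environment.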

Optional: auto-prefetch hook for Claude Code

The MCP tools above are deliberate — the model has to call them. For silent automatic recall on every prompt (like Claude Code's native memory but layered), add a UserPromptSubmit hook. See examples/claude_code_hook.md for the recipe.

Memory types

When you ask the model to remember something, it picks one of:

| Type | Decay | Examples |
| --- | --- | --- |
| `architecture` | Permanent | "We use ChromaDB for vectors" |
| `decision` | Permanent | "We chose MIT over GPL" |
| `convention` | Permanent | "All API calls go through retry_utils" |
| `pattern` | Permanent | "Use `with` statements for sqlite connections" |
| `gotcha` | Permanent | "MiniMax embeddings are NOT OpenAI-compatible" |
| `preference` | Permanent | "User prefers terse responses" |
| `progress` | 7 days | "Finished MCP wiring on 2026-04-28" |
| `context` | 30 days | Misc. background facts |

Storage location

All data lives in $RECALL_MCP_HOME (defaults to ~/.recall-mcp/):

~/.recall-mcp/
├── memory/          # SQLite — facts + entity graph + temporal lineage
├── episodic/        # SQLite — session summaries
└── chroma/          # ChromaDB — vector embeddings

Set RECALL_MCP_HOME to point multiple machines at a synced folder (e.g., Syncthing) and your AI's memory follows you.
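
If you want to inspect or back up the stores from a script, the layout above can be resolved like this. This is a small sketch under the assumption that an unset or empty `RECALL_MCP_HOME` falls back to `~/.recall-mcp`; the function names are hypothetical.

```python
import os
from pathlib import Path

SUBDIRS = ("memory", "episodic", "chroma")  # directory names from the tree above


def recall_home(env=None) -> Path:
    """Resolve the data directory, falling back to ~/.recall-mcp."""
    env = os.environ if env is None else env
    raw = env.get("RECALL_MCP_HOME") or "~/.recall-mcp"
    return Path(raw).expanduser()


def storage_paths(env=None) -> dict:
    """Map each store name to its directory under the data directory."""
    home = recall_home(env)
    return {name: home / name for name in SUBDIRS}
```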

Architecture

recall-mcp exposes seven memory layers (originally designed in Hermes Agent), each backed by a focused storage engine:

Episodic (per-turn / per-session events) — SQLite
Semantic (extracted facts, decisions) — SQLite + ChromaDB
Entity graph (who/what/why, dependencies) — SQLite
Temporal lineage (millisecond timestamps, before/after queries) — SQLite
Importance scoring (not all memories equal) — derived
Forgetting engine (decay + Jaccard dedup) — derived
Hybrid retrieval (BM25 + vector + entity + temporal, fused with optional LLM re-rank) — runtime
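
As an illustration of the Jaccard dedup mentioned for the forgetting engine, the sketch below compares memories by token overlap and drops near-duplicates. The whitespace tokenization, the 0.8 threshold, and the function names are assumptions chosen for the example, not values taken from recall-mcp.

```python
def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of two texts over lowercase whitespace tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty texts count as identical
    return len(ta & tb) / len(ta | tb)


def dedupe(memories: list[str], threshold: float = 0.8) -> list[str]:
    """Keep a memory only if it is not too similar to one already kept."""
    kept: list[str] = []
    for text in memories:
        if all(jaccard(text, k) < threshold for k in kept):
            kept.append(text)
    return kept
```

This pairwise check is quadratic in the number of memories, which is fine for a sketch but would need an index or batching on a large store.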

When you call memory_recall, all four retrieval paths run in parallel, results are deduplicated, scored by source quality + importance, and returned ranked.
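
The flow just described can be sketched roughly as follows. This is not the project's implementation: the per-source weights, the reciprocal-rank scoring, and the additive importance bonus are all assumptions made to give the description a concrete shape, and `fuse` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical "source quality" weights, one per retrieval path.
SOURCE_WEIGHT = {"vector": 1.0, "bm25": 0.8, "entity": 0.7, "temporal": 0.5}


def fuse(retrievers, query, importance, top_k=5):
    """Run every retriever in parallel and merge their rankings.

    retrievers: dict mapping source name -> callable(query) -> list of ids, best first
    importance: dict mapping memory id -> bonus added to the fused score
    """
    with ThreadPoolExecutor() as pool:
        futures = {src: pool.submit(fn, query) for src, fn in retrievers.items()}
        results = {src: fut.result() for src, fut in futures.items()}

    scores = {}
    for src, ids in results.items():
        for rank, mem_id in enumerate(ids):
            # Keying by id deduplicates: a memory found by several paths
            # accumulates one combined score instead of appearing twice.
            scores[mem_id] = scores.get(mem_id, 0.0) + SOURCE_WEIGHT[src] / (rank + 1)

    for mem_id in scores:
        scores[mem_id] += importance.get(mem_id, 0.0)

    # Highest score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda m: (-scores[m], m))[:top_k]
```

One consequence of this shape is that a failing or empty path (for instance, vectors during a rate limit) simply contributes nothing, and the remaining paths still produce a ranking.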

Credits

Memory architecture derived from Hermes Agent by Nous Research (MIT). recall-mcp generalizes the layered memory + retrieval engine into a standalone MCP server that any AI client can plug into.

License

MIT — see LICENSE.
