What can you do with this server?

The recall-mcp server is a local, private knowledge base that lets AI assistants search, read, and write to your personal notes and documents. All processing runs locally; no data leaves your device.

* Search documents (search_documents): Find relevant passages using natural language or keywords. Supports semantic (meaning-based, via local embeddings), keyword (exact matching), hybrid (Reciprocal Rank Fusion), or auto (semantic if available, otherwise keyword) modes. Results include source and relevance score.
* Read a full document (get_document): Retrieve the complete text of a specific document by its source name.
* List available documents (list_sources): See all documents in the knowledge base and the active search mode.
* Add notes (add_note): Save a new plain-text or Markdown note with a title; it is indexed and searchable immediately.

How do I use recall-mcp?

1. Click "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it shows a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@recall-mcp search my notes for how to undo a git commit".

The server responds to your query, and you can keep using it as needed.

recall-mcp

by jaswanthsurya007-source


🧠 Recall — a local, private knowledge-base MCP server


Recall turns a folder of your own notes and documents into a searchable knowledge base that any AI assistant can use. It is a Model Context Protocol (MCP) server: connect it to Claude Desktop, Claude Code, or any MCP client, and the assistant can search, read, and add to your notes through well-defined tools.

It uses semantic search powered by local embeddings, so it finds passages by meaning rather than only by matching keywords, and it runs entirely on your machine. No API key and no cloud: your documents never leave your device.

Why this project is interesting

Retrieval-Augmented Generation (RAG) done locally — chunking, embeddings, and cosine-similarity retrieval, the core of modern AI knowledge systems.
Hybrid retrieval — fuses semantic and keyword results with Reciprocal Rank Fusion (RRF), the technique production search systems use.
Model Context Protocol — exposes capabilities as tools an LLM can call, the emerging standard for connecting AI assistants to real systems.
Privacy-first — semantic search runs on-device with a small embedding model; nothing is sent to a third party.
Graceful degradation — if the embedding model can't load, it automatically falls back to keyword search instead of breaking.
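The hybrid mode mentioned above can be illustrated with a minimal sketch of Reciprocal Rank Fusion as it is usually described. This is not Recall's own code: the function name, the constant k=60, the chunk ids, and every file name other than git-cheatsheet.md are assumptions made for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk ids into one ranked list.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default; whether
    Recall uses the same constant is an assumption.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id so the order is stable.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical result lists from the two retrievers (best match first).
semantic = ["git-cheatsheet.md#3", "python-tips.md#1", "docker-notes.md#2"]
keyword = ["docker-notes.md#2", "git-cheatsheet.md#3", "sql-notes.md#4"]

fused = reciprocal_rank_fusion([semantic, keyword])
print(fused)
```

Chunks that rank well in both lists rise to the top, while a chunk found by only one retriever still appears further down. Because only ranks are used, the two retrievers' raw scores never need to be on the same scale.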


See it in action

Ask Claude (with Recall connected) "search my notes for how to undo a git commit" — it calls the search_documents tool and answers grounded in git-cheatsheet.md, entirely on your machine.

See the difference: keyword vs. semantic

Ask "how do I undo a commit?" against a small dev knowledge base:

Search mode	Top result	Why
Keyword	the doc that literally contains the words "undo a commit"	matches exact words
Semantic	`git-cheatsheet.md` → `git revert` makes a new commit that undoes an earlier one	matches meaning

Semantic search finds the genuinely useful answer even though the words don't overlap. That is the whole point of embeddings.

Retrieval quality (measured)

A small labelled eval (10 paraphrased queries over the sample docs) compares the three search modes. Semantic beats keyword clearly, especially at recall@1:

Mode	recall@1	recall@3
Keyword	40%	80%
Semantic	80%	90%
Hybrid	60%	90%

Reproduce it with python eval/run_eval.py. The corpus is small and topically overlapping, so treat the numbers as illustrative. (Pure semantic edges out hybrid here; hybrid tends to win when exact keyword matches matter — codes, names, error strings.) The harness is the real point: retrieval quality is measured, not assumed.
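For readers unfamiliar with the metric, here is a minimal sketch of how recall@k could be computed. It assumes each query is labelled with exactly one expected source, and it is not the contents of eval/run_eval.py; the function name, queries, and every file name other than git-cheatsheet.md are made up for illustration.

```python
def recall_at_k(results, expected, k):
    """Fraction of queries whose expected source appears in the top k results.

    `results` maps each query to its ranked list of sources (best first);
    `expected` maps each query to the single source labelled as correct.
    """
    hits = sum(1 for query, want in expected.items() if want in results[query][:k])
    return hits / len(expected)


# Hypothetical labels and retrieval output for two queries.
expected = {
    "undo a commit": "git-cheatsheet.md",
    "list containers": "docker-notes.md",
}
results = {
    "undo a commit": ["git-cheatsheet.md", "docker-notes.md"],
    "list containers": ["git-cheatsheet.md", "docker-notes.md"],
}

print(recall_at_k(results, expected, 1))  # 0.5: only the first query hits at rank 1
print(recall_at_k(results, expected, 3))  # 1.0: both expected sources are in the top 3
```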

What the AI can do (the MCP tools)

Tool	What it does
`search_documents(query, limit, mode)`	Find the most relevant passages. `mode` can be `auto`, `semantic`, `keyword`, or `hybrid`.
`get_document(source)`	Return the full text of one document so the assistant can read or summarise it.
`list_sources()`	List the documents currently loaded and the active search mode.
`add_note(title, content)`	Save a new note into the knowledge base; it becomes searchable immediately.
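MCP clients normally build tool calls for you, but it can help to see the shape on the wire. The sketch below builds a JSON-RPC 2.0 `tools/call` request for `search_documents` using the argument names from the table above; the helper name and the argument values are illustrative assumptions.

```python
import json


def tool_call_request(request_id, name, arguments):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument names come from the tool table; the values are examples only.
request = tool_call_request(
    1,
    "search_documents",
    {"query": "how do I undo a commit", "limit": 3, "mode": "hybrid"},
)
print(json.dumps(request, indent=2))
```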

How it works

        Your documents (.md / .txt / .pdf)
                 │
                 ▼
        ┌───────────────────┐
        │  DocumentStore     │   1. split each file into paragraph "chunks"
        │  (recall/store.py) │   2. embed every chunk into a vector (local model)
        └───────────────────┘
                 │  query
                 ▼
        ┌───────────────────┐
        │  Semantic search   │   embed the query, rank chunks by cosine similarity
        │  (or keyword)      │   (falls back to keyword search if no model)
        └───────────────────┘
                 │  tools
                 ▼
        ┌───────────────────┐        MCP (stdio / JSON-RPC)
        │  FastMCP server    │ ◀───────────────────────────▶  Claude Desktop,
        │  (recall/server.py)│                                 Claude Code, ...
        └───────────────────┘

Chunk — documents (Markdown, plain text, or PDF) are split on blank lines into passages, with each Markdown heading kept attached to the text it introduces, so results land on a precise, self-contained passage.
Embed — each chunk is turned into a vector with a local fastembed model (bge-small-en-v1.5, 384-dimensional vectors).
Retrieve — a query is embedded and compared to every chunk by cosine similarity; the closest chunks win.
Serve — the FastMCP server exposes search/read/write as MCP tools over stdio, so any MCP client can use them.
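The Chunk and Retrieve steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in recall/store.py: the function names are hypothetical, and the 3-dimensional vectors stand in for the 384-dimensional embeddings a real model would produce.

```python
import math
import re


def chunk_markdown(text):
    """Split on blank lines, gluing each heading onto the paragraph after it."""
    chunks, heading = [], None
    for block in re.split(r"\n\s*\n", text.strip()):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#") and "\n" not in block:
            # A lone heading: hold it until the next paragraph arrives.
            heading = block if heading is None else heading + "\n" + block
            continue
        if heading is not None:
            block = heading + "\n" + block
            heading = None
        chunks.append(block)
    if heading is not None:  # a trailing heading with no text after it
        chunks.append(heading)
    return chunks


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query_vec, chunk_vecs, limit=3):
    """Return (score, chunk_index) pairs, most similar first."""
    scored = [(cosine(query_vec, vec), i) for i, vec in enumerate(chunk_vecs)]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]


doc = "# Git\n\nUse git revert to undo a commit.\n\nUse git log to view history."
chunks = chunk_markdown(doc)

# Toy vectors standing in for real embeddings of the two chunks and a query.
chunk_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
query_vec = [0.9, 0.1, 0.0]

best_score, best_index = rank(query_vec, chunk_vecs)[0]
print(chunks[best_index])
```

Comparing the query against every chunk is a linear scan, which is simple and fast enough for a personal notes folder.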

Quickstart

Requires Python 3.10+.

# 1. Clone and enter the project
git clone https://github.com/jaswanthsurya007-source/recall-mcp.git
cd recall-mcp

# 2. Create and activate a virtual environment
python -m venv .venv
# Windows (PowerShell):
.venv\Scripts\Activate.ps1
# macOS / Linux:
source .venv/bin/activate

# 3. Install
pip install -e .

# 4. Try a search from Python
python -c "from recall.store import DocumentStore; s=DocumentStore('data/documents'); print([r.chunk.source for r in s.search('how do I undo a commit', 1)])"

The first run downloads the embedding model (~66 MB) once, then caches it.

Behind a corporate proxy?

Recall uses truststore to trust your operating system's certificates automatically, so it works on networks that inspect TLS traffic (common at large companies) without extra configuration.

Connect it to Claude Desktop

Add Recall to your claude_desktop_config.json (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "recall": {
      "command": "/absolute/path/to/recall-mcp/.venv/bin/python",
      "args": ["-m", "recall.server"],
      "env": {
        "RECALL_DOCS_DIR": "/absolute/path/to/recall-mcp/data/documents"
      }
    }
  }
}

On Windows, use the full path to python.exe and escape backslashes, e.g. "C:\\path\\to\\recall-mcp\\.venv\\Scripts\\python.exe".
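If escaping backslashes by hand is error-prone, one option is to generate the entry instead. The sketch below is a hypothetical helper, not part of Recall: it builds the same structure as the config above for a Windows checkout and lets `json.dumps` do the escaping. The project path passed in is a placeholder.

```python
import json
from pathlib import PureWindowsPath


def claude_config(project_dir):
    """Return the mcpServers entry as JSON text for a Windows checkout."""
    root = PureWindowsPath(project_dir)
    config = {
        "mcpServers": {
            "recall": {
                "command": str(root / ".venv" / "Scripts" / "python.exe"),
                "args": ["-m", "recall.server"],
                "env": {"RECALL_DOCS_DIR": str(root / "data" / "documents")},
            }
        }
    }
    # json.dumps writes each backslash as a doubled backslash, as JSON requires.
    return json.dumps(config, indent=2)


config_text = claude_config(r"C:\path\to\recall-mcp")
print(config_text)
```

Merge the printed `recall` entry into any `mcpServers` block you already have rather than overwriting the file.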

Restart Claude Desktop, and you'll see Recall's tools available. Ask it things like "Search my notes for how to undo a git commit" or "Save a note titled 'Meeting' with these action items…".

Use your own documents

Point Recall at any folder of .md, .txt, or .pdf files:

# macOS / Linux: set RECALL_DOCS_DIR to your own notes folder
RECALL_DOCS_DIR="/path/to/my/notes" python -m recall.server

# Windows (PowerShell)
$env:RECALL_DOCS_DIR = "C:\path\to\my\notes"; python -m recall.server

The data/documents/ folder ships with a few sample notes so you can try it immediately.

Running the tests

pip install -e ".[dev]"
pytest -q

The test suite runs fully offline (keyword mode), so it needs no model download.

Project structure

recall-mcp/
├── .github/workflows/ # CI: ruff + pytest on every push
├── recall/
│   ├── server.py      # FastMCP server: defines the MCP tools
│   ├── store.py       # load → chunk → search (semantic, keyword, hybrid)
│   └── embeddings.py  # local embedding model wrapper (fastembed)
├── data/documents/    # sample knowledge base (.md and .pdf)
├── tests/             # offline pytest suite (+ fixtures/)
├── eval/              # retrieval-quality eval (recall@k)
├── pyproject.toml     # packaging + tooling config
├── requirements.txt
└── LICENSE

Design notes

Why local embeddings? Privacy and zero cost. fastembed uses ONNX Runtime rather than PyTorch, so installs are small and inference is fast on CPU.
Why chunk by paragraph? It is simple and transparent, and it makes results land on a focused passage. A future version could use overlapping token windows.
Why a fallback to keyword search? A tool should never hard-fail. If the model can't be downloaded, search still works — just less cleverly.
Re-indexing on write is a full reload for clarity; at larger scale you would embed only the newly added chunks.
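The keyword fallback described above might look roughly like the following. This is a sketch of the pattern only, not Recall's implementation: `resolve_mode` and both loader functions are hypothetical names, and the fake embedder is a stand-in for a real model.

```python
def resolve_mode(requested, load_embedder):
    """Pick the search mode to use, never raising if the model is unavailable.

    `load_embedder` is a hypothetical zero-argument callable that returns an
    embedding function, or raises (for example, when the download fails).
    """
    try:
        embedder = load_embedder()
    except Exception:
        embedder = None
    if embedder is None:
        return "keyword", None  # degrade to keyword search instead of failing
    if requested == "auto":
        return "semantic", embedder
    return requested, embedder


def broken_loader():
    """Simulates a machine where the model cannot be downloaded."""
    raise OSError("model download failed")


def working_loader():
    """Returns a stand-in embedding function, not a real model."""
    return lambda text: [float(len(text))]


print(resolve_mode("auto", broken_loader)[0])     # keyword
print(resolve_mode("auto", working_loader)[0])    # semantic
print(resolve_mode("hybrid", working_loader)[0])  # hybrid
```

Resolving the mode once at startup keeps each tool call simple: by the time a search runs, it already knows whether an embedder is available.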

Roadmap

[x] Retrieval-quality eval harness (recall@k)
[x] Hybrid search (Reciprocal Rank Fusion of semantic + keyword)
[x] PDF document support
[ ] Persist embeddings to disk so startup is instant on large corpora
[ ] Support HTML documents
[ ] Optional LLM-generated summaries via the Claude API
[ ] Expose documents as MCP resources, not just tools

License

MIT
