🧠 Recall: a local, private knowledge-base MCP server

Python MCP License: MIT

Recall turns a folder of your own notes and documents into a searchable knowledge base that any AI assistant can use. It is a Model Context Protocol (MCP) server: connect it to Claude Desktop, Claude Code, or any MCP client, and the assistant can search, read, and add to your notes through well-defined tools.

It uses semantic search powered by local embeddings, so it finds passages by meaning rather than by matching keywords, and it runs entirely on your machine: no API key, no cloud, and your documents never leave your device.


Why this project is interesting

  • Retrieval-Augmented Generation (RAG) done locally: chunking, embeddings, and cosine-similarity retrieval, the core of modern AI knowledge systems.

  • Model Context Protocol: exposes capabilities as tools an LLM can call, the emerging standard for connecting AI assistants to real systems.

  • Privacy-first: semantic search runs on-device with a small embedding model; nothing is sent to a third party.

  • Graceful degradation: if the embedding model can't load, it automatically falls back to keyword search instead of breaking.

See it in action

Ask Claude (with Recall connected) to "search my notes for how to undo a git commit". It calls the search_documents tool and gives an answer grounded in git-cheatsheet.md, entirely on your machine.

See the difference: keyword vs. semantic

Ask "how do I undo a commit?" against a small dev knowledge base:

| Search mode | Top result | Why |
|---|---|---|
| Keyword | the doc that literally contains the words "undo a commit" | matches exact words |
| Semantic | git-cheatsheet.md → git revert makes a new commit that undoes an earlier one | matches meaning |

Semantic search finds the genuinely useful answer even though the words don't overlap. That is the whole point of embeddings.

Retrieval quality (measured)

A small labelled eval (10 paraphrased queries over the 4 sample docs) shows semantic search recovering the right document more often as the window widens:

| Mode | recall@1 | recall@3 |
|---|---|---|
| Keyword | 80% | 80% |
| Semantic | 80% | 100% |

Reproduce it with python eval/run_eval.py. The corpus is small, so treat the numbers as illustrative. The harness is the point: retrieval quality here is measured, not assumed.
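For readers unfamiliar with the metric, recall@k here means the share of queries whose expected document appears among the top k results. The sketch below is a minimal illustration of that definition, not the code in eval/run_eval.py; the function name recall_at_k, the sample rankings, and every file name other than git-cheatsheet.md are hypothetical.

```python
def recall_at_k(ranked_sources, expected, k):
    """Fraction of queries whose expected source appears in the top k results."""
    hits = sum(
        1 for ranked, want in zip(ranked_sources, expected) if want in ranked[:k]
    )
    return hits / len(expected)


# Hypothetical ranked results for three queries (made-up data for illustration).
ranked = [
    ["git-cheatsheet.md", "python-tips.md"],
    ["python-tips.md", "git-cheatsheet.md"],
    ["docker-notes.md", "python-tips.md"],
]
expected = ["git-cheatsheet.md", "git-cheatsheet.md", "docker-notes.md"]

print("recall@1:", recall_at_k(ranked, expected, 1))
print("recall@2:", recall_at_k(ranked, expected, 2))
```

Widening k can only keep or raise the score, which is why the table reports more than one cutoff.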

What the AI can do (the MCP tools)

| Tool | What it does |
|---|---|
| search_documents(query, limit, mode) | Find the most relevant passages. mode can be auto, semantic, or keyword. |
| get_document(source) | Return the full text of one document so the assistant can read or summarise it. |
| list_sources() | List the documents currently loaded and the active search mode. |
| add_note(title, content) | Save a new note into the knowledge base; it becomes searchable immediately. |
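To make the tool interface concrete, here is a rough sketch of the JSON-RPC message an MCP client sends when it invokes one of these tools. The "tools/call" method and the name/arguments layout come from the MCP specification; the helper tool_call_request and the argument values (limit 3, mode auto) are only illustrative assumptions.

```python
import json


def tool_call_request(request_id, name, arguments):
    """Build a JSON-RPC 2.0 'tools/call' request of the kind an MCP client sends."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Illustrative argument values; the parameter names match the tool signature above.
request = tool_call_request(
    1,
    "search_documents",
    {"query": "how do I undo a commit", "limit": 3, "mode": "auto"},
)
print(json.dumps(request, indent=2))
```

In practice the MCP client builds and sends this message for you; the sketch only shows what travels over the wire.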

How it works

        Your documents (.md / .txt)
                 │
                 ▼
        ┌────────────────────┐
        │  DocumentStore     │   1. split each file into paragraph "chunks"
        │  (recall/store.py) │   2. embed every chunk into a vector (local model)
        └────────────────────┘
                 │  query
                 ▼
        ┌────────────────────┐
        │  Semantic search   │   embed the query, rank chunks by cosine similarity
        │  (or keyword)      │   (falls back to keyword search if no model)
        └────────────────────┘
                 │  tools
                 ▼
        ┌────────────────────┐        MCP (stdio / JSON-RPC)
        │  FastMCP server    │ ◀───────────────────────────▶  Claude Desktop,
        │  (recall/server.py)│                                 Claude Code, ...
        └────────────────────┘
  1. Chunk: documents are split on blank lines into passages, with each Markdown heading kept attached to the text it introduces, so results land on a precise, self-contained passage rather than a whole file.

  2. Embed: each chunk is turned into a vector with a local fastembed model (bge-small-en-v1.5, 384-dimensional vectors).

  3. Retrieve: a query is embedded and compared to every chunk by cosine similarity; the closest chunks win.

  4. Serve: the FastMCP server exposes search/read/write as MCP tools over stdio, so any MCP client can use them.
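The retrieve step and the keyword fallback can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Recall's actual code: toy_embed and CONCEPTS are hypothetical stand-ins for the real 384-dimensional fastembed model, and search is a hypothetical helper rather than the DocumentStore method.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, text):
    """Count how many distinct query words occur in the chunk text."""
    words = set(text.lower().split())
    return sum(1 for w in set(query.lower().split()) if w in words)


# Hypothetical stand-in for a real embedding model: one dimension per concept.
CONCEPTS = [
    {"undo", "undoes", "revert", "reverts"},
    {"commit", "commits"},
    {"container", "docker", "image"},
]


def toy_embed(text):
    """Map text to a tiny vector by counting words that belong to each concept."""
    words = text.lower().split()
    return [float(sum(w in group for w in words)) for group in CONCEPTS]


def search(query, chunks, embed=None, limit=3):
    """Rank chunks by cosine similarity, or by keyword overlap if embedding fails."""
    mode, scored = "keyword", None
    if embed is not None:
        try:
            q_vec = embed(query)
            scored = [(cosine(q_vec, embed(c)), c) for c in chunks]
            mode = "semantic"
        except Exception:
            scored = None  # model unavailable: fall through to keyword scoring
    if scored is None:
        scored = [(float(keyword_score(query, c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return mode, [c for _, c in scored[:limit]]


CHUNKS = [
    "git revert makes a new commit that undoes an earlier one",
    "docker run starts a container from an image",
]
print(search("how do I undo a commit", CHUNKS, embed=toy_embed, limit=1))
print(search("how do I undo a commit", CHUNKS, embed=None, limit=1))
```

A real index would embed each chunk once at load time instead of on every query; the sketch re-embeds for brevity.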

Quickstart

Requires Python 3.10+.

# 1. Clone and enter the project
git clone https://github.com/jaswanthsurya007-source/recall-mcp.git
cd recall-mcp

# 2. Create and activate a virtual environment
python -m venv .venv
# Windows (PowerShell):
.venv\Scripts\Activate.ps1
# macOS / Linux:
source .venv/bin/activate

# 3. Install
pip install -e .

# 4. Try a search from Python
python -c "from recall.store import DocumentStore; s=DocumentStore('data/documents'); print([r.chunk.source for r in s.search('how do I undo a commit', 1)])"

The first run downloads the embedding model (~66 MB) once, then caches it.

Behind a corporate proxy?

Recall uses truststore to trust your operating system's certificates automatically, so it works on networks that inspect TLS traffic (common at large companies) without extra configuration.

Connect it to Claude Desktop

Add Recall to your claude_desktop_config.json (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "recall": {
      "command": "/absolute/path/to/recall-mcp/.venv/bin/python",
      "args": ["-m", "recall.server"],
      "env": {
        "RECALL_DOCS_DIR": "/absolute/path/to/recall-mcp/data/documents"
      }
    }
  }
}

On Windows, use the full path to python.exe and escape backslashes, e.g. "C:\\path\\to\\recall-mcp\\.venv\\Scripts\\python.exe".
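If hand-escaping backslashes feels error-prone, one option is to let Python's json module produce the entry. The sketch below assumes nothing beyond the config shape shown above; recall_config is a hypothetical helper, and the paths are the same placeholders used in this README.

```python
import json


def recall_config(python_exe, docs_dir):
    """Return the mcpServers entry for Recall as a JSON string."""
    config = {
        "mcpServers": {
            "recall": {
                "command": python_exe,
                "args": ["-m", "recall.server"],
                "env": {"RECALL_DOCS_DIR": docs_dir},
            }
        }
    }
    # json.dumps escapes each backslash as \\ so the output is valid JSON.
    return json.dumps(config, indent=2)


print(
    recall_config(
        r"C:\path\to\recall-mcp\.venv\Scripts\python.exe",
        r"C:\path\to\recall-mcp\data\documents",
    )
)
```

Paste the printed block into claude_desktop_config.json, merging it with any mcpServers entries you already have.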

Restart Claude Desktop, and you'll see Recall's tools available. Ask it things like "Search my notes for how to undo a git commit" or "Save a note titled 'Meeting' with these action items…".

Use your own documents

Point Recall at any folder of .md / .txt files:

# macOS / Linux: set RECALL_DOCS_DIR to your own notes folder
RECALL_DOCS_DIR="/path/to/my/notes" python -m recall.server
# Windows (PowerShell)
$env:RECALL_DOCS_DIR = "C:\path\to\my\notes"; python -m recall.server

The data/documents/ folder ships with a few sample notes so you can try it immediately.

Running the tests

pip install -e ".[dev]"
pytest -q

The test suite runs fully offline (keyword mode), so it needs no model download.

Project structure

recall-mcp/
├── recall/
│   ├── server.py      # FastMCP server: defines the MCP tools
│   ├── store.py       # load → chunk → search documents (semantic + keyword)
│   └── embeddings.py  # local embedding model wrapper (fastembed)
├── data/documents/    # sample knowledge base (.md notes)
├── tests/             # offline pytest suite
├── eval/              # retrieval-quality eval (recall@k)
├── pyproject.toml     # packaging + tooling config
├── requirements.txt
└── LICENSE

Design notes

  • Why local embeddings? Privacy and zero cost. fastembed uses ONNX runtime rather than PyTorch, so installs are small and inference is fast on CPU.

  • Why chunk by paragraph? It is simple and transparent, and it makes results land on a focused passage. A future version could use overlapping token windows.

  • Why a fallback to keyword search? A tool should never hard-fail. If the model can't be downloaded, search still works, just less cleverly.

  • Re-indexing on write is a full reload for clarity; at larger scale you would embed only the newly added chunks.
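As an illustration of the paragraph chunking discussed in these notes, the sketch below splits text on blank lines and merges a standalone Markdown heading into the paragraph that follows it. It is one plausible reading of the described behaviour, not the code in recall/store.py; chunk_markdown and the sample text are hypothetical.

```python
import re


def chunk_markdown(text):
    """Split text on blank lines, attaching each heading to the block after it."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", text) if b.strip()]
    chunks = []
    pending_heading = None
    for block in blocks:
        is_heading_only = block.startswith("#") and "\n" not in block
        if is_heading_only:
            # Hold the heading until its first paragraph arrives.
            if pending_heading is None:
                pending_heading = block
            else:
                pending_heading += "\n" + block
        elif pending_heading is not None:
            chunks.append(pending_heading + "\n\n" + block)
            pending_heading = None
        else:
            chunks.append(block)
    if pending_heading is not None:
        chunks.append(pending_heading)  # trailing heading with no body
    return chunks


doc = (
    "# Git\n\nUse git revert to undo a commit.\n\n"
    "Use git stash to park changes.\n\n"
    "## Branches\n\nCreate one with git switch -c."
)
for chunk in chunk_markdown(doc):
    print(repr(chunk))
```

In this sketch only the first paragraph under a heading carries the heading text; later paragraphs in the same section stand alone.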

Roadmap

  • Retrieval-quality eval harness (recall@k)

  • Persist embeddings to disk so startup is instant on large corpora

  • Support PDF and HTML documents

  • Hybrid search (combine semantic + keyword scores)

  • Optional LLM-generated summaries via the Claude API

  • Expose documents as MCP resources, not just tools

License

MIT
