local-recall-mcp
Indexes an Obsidian vault (or any folder of markdown notes) for semantic search, allowing retrieval of notes and headings based on content similarity.
Fully local long-term memory for AI agents. Semantic search over your notes and session logs from any MCP client, with embeddings served by Ollama, so nothing ever leaves your machine.
Your agent forgets everything between sessions. Your session logs and notes already contain the answers: what worked, what failed, what you decided and why. local-recall-mcp turns those files into a searchable memory the agent can query before repeating old mistakes.
100% local: no cloud APIs, no keys, no telemetry. Ollama does the embeddings.
One tool, tiny footprint: a single search_memory tool, so it barely costs any agent context.
Incremental indexing: a SHA-256 manifest re-embeds only changed files, purges deleted ones, and self-heals from a corrupted index.
Section-type filtering: map your headings (e.g. What Did NOT Work) to types like failed, then search only past failures.
No database: the whole index is three flat files (manifest.json, chunks.json, vectors.npy).
Quickstart
1. Get Ollama and the embedding model (~1.2 GB, multilingual):
ollama pull bge-m3
2. Create a config at ~/.local-recall/config.yaml:
ollama:
  base_url: http://localhost:11434
  embed_model: bge-m3
  embed_timeout: 300
index_dir: ~/.local-recall/index
sources:
  - path: ~/notes
    pattern: "**/*.md"
3. Register the server with your MCP client. For Claude Code:
claude mcp add recall -- uvx local-recall-mcp
For Claude Desktop (claude_desktop_config.json):
{
  "mcpServers": {
    "recall": {
      "command": "uvx",
      "args": ["local-recall-mcp"]
    }
  }
}
4. Ask your agent things like "search memory for how we fixed the MCP connection issue". The first query builds the index; later queries re-embed only what changed.
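Before registering the server, it can help to confirm that Ollama answers embedding requests for the configured model. The following is a minimal sketch using only the standard library, not code from this project; the helper names `build_embed_request` and `embed` are hypothetical. It posts a batch of texts to Ollama's `/api/embed` endpoint and reads the `embeddings` field of the response.

```python
import json
import urllib.request


def build_embed_request(base_url: str, model: str, texts: list) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embed endpoint (batched input)."""
    payload = json.dumps({"model": model, "input": texts}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(base_url: str, model: str, texts: list, timeout: float = 300) -> list:
    """Send the request and return one vector per input text.

    Requires a running Ollama instance with the model already pulled.
    """
    request = build_embed_request(base_url, model, texts)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["embeddings"]
```

With Ollama running, `embed("http://localhost:11434", "bge-m3", ["hello", "world"])` should return two vectors. A connection error at this point usually means Ollama is not listening on the `base_url` from the config.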
Presets
Ready-made configs in configs/:
Preset | What it indexes
claude-code.yaml | Claude Code session logs
 | An Obsidian vault (or any folder of markdown notes)
Copy one to ~/.local-recall/config.yaml, or point the server at it directly:
claude mcp add recall -- uvx local-recall-mcp --config /path/to/claude-code.yaml
The config path can also be set via the LOCAL_RECALL_CONFIG environment variable.
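The three ways of locating the config (the `--config` flag, the `LOCAL_RECALL_CONFIG` variable, and the default path from the Quickstart) can be pictured as a simple fallback chain. The sketch below is illustrative only: the function name `resolve_config_path` is hypothetical, and the precedence order (flag first, then environment, then default) is an assumption rather than documented behavior.

```python
from pathlib import Path
from typing import Mapping, Optional

DEFAULT_CONFIG = "~/.local-recall/config.yaml"


def resolve_config_path(cli_value: Optional[str], env: Mapping[str, str]) -> Path:
    """Pick a config file: --config value, then LOCAL_RECALL_CONFIG, then the default.

    The precedence order is an assumption made for this illustration.
    """
    candidate = cli_value or env.get("LOCAL_RECALL_CONFIG") or DEFAULT_CONFIG
    return Path(candidate).expanduser()
```

In real use the second argument would be `os.environ`; passing it in explicitly keeps the function easy to test.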
Configuration reference
ollama:
  base_url: http://localhost:11434   # your Ollama endpoint
  embed_model: bge-m3                # any Ollama embedding model
  embed_timeout: 300                 # seconds; first full build is the slow one
index_dir: ~/.local-recall/index     # where the three index files live
sources:                             # any number of directories
  - path: ~/notes
    pattern: "**/*.md"               # glob, relative to path
  - path: ~/.claude/sessions
    pattern: "*.tmp"
section_rules:                       # optional heading -> type mapping
  - contains: "what worked"          # case-insensitive substring of a ##/### heading
    type: worked
  - contains: "what did not work"
    type: failed
Files are chunked on ##/### headings; files without headings become a single chunk. Each chunk gets a section_type from the first matching rule (other if none match), and the search_memory tool accepts a section_filter to narrow results to one type. The killer use case: "only show me past failures before I try this again."
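The chunking and typing rules described above can be sketched in a few lines. This is an illustration of the stated behavior, not the project's code: the names `Chunk`, `classify`, and `chunk_markdown` are hypothetical, and giving a heading-less chunk an empty title is an assumption.

```python
import re
from dataclasses import dataclass

# Matches "## Title" and "### Title", but not "# Title" or "#### Title".
HEADING = re.compile(r"^#{2,3}\s+(.*\S)\s*$")


@dataclass
class Chunk:
    title: str
    content: str
    section_type: str


def classify(title: str, rules: list) -> str:
    """Return the type of the first rule whose substring occurs in the title."""
    lowered = title.lower()
    for rule in rules:
        if rule["contains"].lower() in lowered:
            return rule["type"]
    return "other"


def chunk_markdown(text: str, rules: list) -> list:
    """Split on ##/### headings; a file without headings becomes one chunk."""
    chunks = []
    title, lines = "", []

    def flush() -> None:
        body = "\n".join(lines).strip()
        if title or body:
            chunks.append(Chunk(title, body, classify(title, rules)))

    for line in text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section before starting a new one
            title, lines = match.group(1), []
        else:
            lines.append(line)
    flush()
    return chunks
```

Filtering by type is then a one-line comprehension, for example `[c for c in chunks if c.section_type == "failed"]`, which is roughly what a section_filter of failed would select.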
How it works
sources (*.md, *.tmp, ...)                  ~/.local-recall/index/
      │  SHA-256 per file                   ├── manifest.json   path -> hash
      ▼                                     ├── chunks.json     title/content/type
diff vs manifest ──▶ re-embed ──▶           └── vectors.npy     float32 matrix
(changed files only)  (Ollama /api/embed, batched)
query ──▶ embed ──▶ cosine top-k over vectors ──▶ chunks, capped at 600 chars each
No vector database, no background daemon. Sync happens lazily on each search call and is a no-op when nothing changed. A corrupted or misaligned index triggers a full rebuild automatically.
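The "diff vs manifest" step can be sketched as follows. This is a simplified illustration under stated assumptions, not the project's implementation: the function names are hypothetical, and the manifest is modelled as a plain dict from path string to SHA-256 hex digest, matching the "path -> hash" description above.

```python
import hashlib
from pathlib import Path


def file_hash(path: Path) -> str:
    """SHA-256 of the file's bytes, as a hex string."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def diff_against_manifest(root: Path, pattern: str, manifest: dict):
    """Compare files on disk with a {path: hash} manifest.

    Returns (changed, deleted, new_manifest): paths that need re-embedding,
    paths whose chunks should be purged, and the manifest to save afterwards.
    """
    current = {str(p): file_hash(p) for p in sorted(root.glob(pattern)) if p.is_file()}
    changed = [path for path, digest in current.items() if manifest.get(path) != digest]
    deleted = [path for path in manifest if path not in current]
    return changed, deleted, current
```

When nothing has changed, both lists come back empty and the sync has no embedding work to do, which is the no-op case described above.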
Non-goals
Kept deliberately small. These are out of scope for v0.x:
Embedding providers other than Ollama (local-first is the point)
External vector databases (flat files comfortably handle tens of thousands of chunks)
Reranking or hybrid search (cosine similarity only)
Parsers beyond markdown/plain text (no PDF, no HTML)
Any GUI
If you need one of these, open an issue describing the use case; real demand is what would justify a v0.2.
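With vector databases and reranking out of scope, search reduces to a brute-force cosine scan over every stored vector. The sketch below shows that idea in pure Python so it runs without dependencies; the real index is described as a float32 matrix in vectors.npy, where the same scan would more naturally be a matrix product. The names `cosine` and `top_k`, and the default `k`, are assumptions for illustration; the 600-character cap comes from the pipeline description.

```python
import math


def cosine(a: list, b: list) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list, vectors: list, chunks: list, k: int = 5, max_chars: int = 600) -> list:
    """Score every stored vector against the query and return the best k chunks.

    Each result is a (score, text) pair with the text truncated to max_chars.
    """
    scored = sorted(
        ((cosine(query, vec), idx) for idx, vec in enumerate(vectors)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return [(score, chunks[idx][:max_chars]) for score, idx in scored[:k]]
```

A linear scan like this costs one dot product per chunk, which is why flat files remain practical at the scale of tens of thousands of chunks.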
Development
git clone https://github.com/Chikoku-NEKO/local-recall-mcp
cd local-recall-mcp
pip install -e .
python -m unittest discover -s tests
Tests run offline against a deterministic fake embedding function.
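The fake embedding function used by the test suite is not shown here, but one common way to build such a stand-in is to derive a vector from a hash of the text. The sketch below is a hypothetical example of that approach (`fake_embed` and its `dim` parameter are assumed names, not the project's): the same text always yields the same unit-length vector, so tests need neither Ollama nor a network.

```python
import hashlib
import math


def fake_embed(text: str, dim: int = 8) -> list:
    """Map text to a unit-length vector derived from its SHA-256 digest.

    dim must be at most 32, the number of bytes in a SHA-256 digest.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    raw = [digest[i] - 127.5 for i in range(dim)]  # centre byte values around zero
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]
```

Hash-derived vectors carry no semantic meaning, so a stand-in like this is suited to testing indexing, syncing, and ranking mechanics rather than search quality.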
License