local-memory-mcp
Uses Ollama's /api/embed endpoint to generate embeddings for semantic memory search, storage, and retrieval.
Local Memory MCP Server for Coding/AI Agents
This project is a local MCP (Model Context Protocol) server that exposes a small set of tools:
memory.search – semantic search over stored memories
memory.save – store a new memory
memory.supersede – mark an old memory as superseded
memory.delete – permanently remove a memory by id
memory.ping – sanity check / version output
What you can do with this project
Keep durable coding context across chat sessions (decisions, preferences, gotchas, API contracts).
Retrieve relevant past context semantically (not only keyword matching).
Scope memory per project using WORKSPACE_KEY while keeping one shared local database.
Correct memory over time by superseding outdated entries or deleting irrelevant ones.
Run everything locally (no external vector DB required).
Typical workflow
User asks a question in chat.
Agent calls memory.search to fetch relevant context.
Agent answers using retrieved memory + current codebase context.
New durable insight is stored via memory.save.
Old memory is updated via memory.supersede or removed via memory.delete.
It uses:
Bun + TypeScript
Zvec (@zvec/zvec) as an embedded in-process vector database (docs: https://zvec.org/en/docs/)
Ollama /api/embed with embeddinggemma for embeddings (docs: https://docs.ollama.com/capabilities/embeddings, model: https://ollama.com/library/embeddinggemma)
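As a rough illustration of the embedding step, the sketch below posts text to Ollama's /api/embed endpoint and converts the returned `embeddings` arrays into `Float32Array` values. The function names `parseEmbedResponse` and `embedText` are hypothetical and are not taken from this repository; the actual implementation may differ.

```typescript
// Shape of a successful /api/embed response (only the field used here).
interface EmbedResponse {
  embeddings: number[][];
}

// Convert the parsed JSON body into one Float32Array per input string.
// Hypothetical helper, not the project's own function.
function parseEmbedResponse(body: unknown): Float32Array[] {
  const embeddings = (body as Partial<EmbedResponse> | null)?.embeddings;
  if (!Array.isArray(embeddings) || embeddings.length === 0) {
    throw new Error("Ollama response has no embeddings");
  }
  return embeddings.map((row) => Float32Array.from(row));
}

// Hypothetical helper: POST one text to Ollama and return its vector.
// Defaults mirror the documented OLLAMA_BASE_URL and OLLAMA_EMBED_MODEL.
async function embedText(
  text: string,
  baseUrl = "http://localhost:11434",
  model = "embeddinggemma",
): Promise<Float32Array> {
  const res = await fetch(`${baseUrl}/api/embed`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model, input: text }),
  });
  if (!res.ok) {
    // Surface non-2xx responses instead of parsing an error body.
    throw new Error(`Ollama embed failed: HTTP ${res.status}`);
  }
  return parseEmbedResponse(await res.json())[0];
}
```

Keeping the parsing separate from the network call makes it possible to test response handling without a running Ollama instance.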
Prerequisites
Bun installed
Ollama installed and running locally
Pull the embedding model:
ollama pull embeddinggemma
Install
bun install
Run
bun run start
This runs an MCP server over stdio.
Test
bun run test
Current tests include:
tests/embed.test.ts – validates Ollama embedding response parsing and error handling
tests/memory-db.test.ts – validates save, search, supersede, and delete on the Zvec-backed store
Environment variables
MEMORY_DB_PATH (default ./data/memory.zvec)
OLLAMA_BASE_URL (default http://localhost:11434)
OLLAMA_EMBED_MODEL (default embeddinggemma)
EMBEDDING_DIM (default 768, must match your embedding model)
WORKSPACE_KEY (default default)
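One way to read these variables with their documented defaults is sketched below. The `MemoryConfig` type and `loadConfig` function are hypothetical names used for illustration only; the validation of EMBEDDING_DIM is an assumption about sensible behavior, not a description of what the server does.

```typescript
// Hypothetical config shape; field names are illustrative.
interface MemoryConfig {
  dbPath: string;
  ollamaBaseUrl: string;
  embedModel: string;
  embeddingDim: number;
  workspaceKey: string;
}

// Build a config from an environment-like object, applying the
// defaults listed in the README.
function loadConfig(env: Record<string, string | undefined>): MemoryConfig {
  const embeddingDim = Number(env.EMBEDDING_DIM ?? "768");
  if (!Number.isInteger(embeddingDim) || embeddingDim <= 0) {
    // Assumed check: a bad dimension would otherwise fail later, at insert time.
    throw new Error(
      `EMBEDDING_DIM must be a positive integer, got "${env.EMBEDDING_DIM}"`,
    );
  }
  return {
    dbPath: env.MEMORY_DB_PATH ?? "./data/memory.zvec",
    ollamaBaseUrl: env.OLLAMA_BASE_URL ?? "http://localhost:11434",
    embedModel: env.OLLAMA_EMBED_MODEL ?? "embeddinggemma",
    embeddingDim,
    workspaceKey: env.WORKSPACE_KEY ?? "default",
  };
}
```

In a Bun process this could be called as `loadConfig(process.env)`.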
VS Code
Workspace setup
This repo includes .vscode/mcp.json that registers this server:
command: bun
args: run start
You can adjust environment variables in that file.
Always-on across all projects
If you want this MCP server available in all workspaces, add it to your User MCP configuration instead of only .vscode/mcp.json:
Open Command Palette: MCP: Open User Configuration
Add a server entry that starts this repo from a fixed directory.
Example (Linux):
{
"servers": {
"local-memory-mcp": {
"type": "stdio",
"command": "bun",
"args": ["--cwd", "/path/to/local-memory-mcp", "run", "start"],
"env": {
"MEMORY_DB_PATH": "/path/to/local-memory-mcp/data/memory.zvec",
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_EMBED_MODEL": "embeddinggemma",
"EMBEDDING_DIM": "768",
"WORKSPACE_KEY": "${workspaceFolderBasename}"
}
}
}
}
Notes:
Use an absolute MEMORY_DB_PATH so all projects use the same database.
WORKSPACE_KEY=${workspaceFolderBasename} keeps memories separated per project automatically.
Enable VS Code setting chat.mcp.autoStart (Experimental) to auto-start/restart MCP servers when needed.
Docs:
https://code.visualstudio.com/docs/copilot/customization/mcp-servers
Claude Code
Add this server to Claude Code as a local stdio MCP server.
This repository already includes:
.mcp.json for project-scoped Claude MCP configuration
CLAUDE.md for memory-first agent behavior guidelines
User scope (all projects)
claude mcp add --transport stdio --scope user \
--env MEMORY_DB_PATH=/absolute/path/to/local-memory-mcp/data/memory.zvec \
--env OLLAMA_BASE_URL=http://localhost:11434 \
--env OLLAMA_EMBED_MODEL=embeddinggemma \
--env EMBEDDING_DIM=768 \
--env WORKSPACE_KEY=default \
local-memory-mcp -- bun --cwd /absolute/path/to/local-memory-mcp run start
Project scope (shared in repository)
claude mcp add --transport stdio --scope project \
--env MEMORY_DB_PATH=./data/memory.zvec \
--env OLLAMA_BASE_URL=http://localhost:11434 \
--env OLLAMA_EMBED_MODEL=embeddinggemma \
--env EMBEDDING_DIM=768 \
--env WORKSPACE_KEY=${PWD##*/} \
local-memory-mcp -- bun run start
Project .mcp.json example:
{
"mcpServers": {
"local-memory-mcp": {
"type": "stdio",
"command": "bun",
"args": ["run", "start"],
"env": {
"MEMORY_DB_PATH": "./data/memory.zvec",
"OLLAMA_BASE_URL": "http://localhost:11434",
"OLLAMA_EMBED_MODEL": "embeddinggemma",
"EMBEDDING_DIM": "768",
"WORKSPACE_KEY": "${PWD##*/}"
}
}
}
}
Notes:
--scope project writes to .mcp.json in the project root.
--scope user stores the server in your user Claude configuration.
Keep all Claude flags before the server name, and put -- before the server command.
Useful commands:
claude mcp list
claude mcp get local-memory-mcp
claude mcp remove local-memory-mcp
Docs:
https://code.claude.com/docs/en/mcp
Tool usage (examples)
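The examples in this section are written as tool/arguments pairs. On the wire, an MCP client wraps such a call in a JSON-RPC 2.0 `tools/call` request and, for the stdio transport, sends it as a single newline-terminated JSON message. The sketch below shows that envelope; `ToolCallRequest` and `buildToolCall` are hypothetical names, and most clients build this message for you.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool call.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "memory.search", {
  query: "What is our policy for multi-session memory?",
  topK: 8,
  workspaceKey: "my-repo",
});

// The stdio transport carries one JSON message per line.
const line = JSON.stringify(request) + "\n";
```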
Search
{
"tool": "memory.search",
"arguments": {
"query": "What is our policy for multi-session memory?",
"topK": 8,
"workspaceKey": "my-repo"
}
}
Save
{
"tool": "memory.save",
"arguments": {
"workspaceKey": "my-repo",
"type": "decision",
"summary": "We use zvec with Ollama embeddinggemma for long-term memory.",
"text": "Decision: The Copilot/agent memory sidecar stores vectors in zvec and generates embeddings via Ollama /api/embed using embeddinggemma.",
"tags": ["memory", "zvec", "ollama", "embeddinggemma"],
"importance": 0.8
}
}
Delete
{
"tool": "memory.delete",
"arguments": {
"workspaceKey": "my-repo",
"id": 42
}
}
Implementation notes
The DB uses one Zvec collection with:
dense vector field embedding
scalar fields for metadata (workspaceKey, type, summary, etc.)
KNN queries are executed through Zvec querySync with metadata filters.
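To make the idea of a filtered KNN query concrete, here is a brute-force sketch in plain TypeScript. It does not use the Zvec API: it filters an in-memory array by metadata and ranks the remainder by cosine similarity. The `MemoryRecord` shape, the `superseded` flag, and the choice of cosine similarity are assumptions for illustration; the real collection schema and distance metric may differ.

```typescript
// Hypothetical record shape; the real Zvec schema may differ.
interface MemoryRecord {
  id: number;
  workspaceKey: string;
  type: string;
  summary: string;
  superseded: boolean;
  embedding: Float32Array;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Apply the metadata filter first, then return the topK nearest records.
function searchMemories(
  records: MemoryRecord[],
  query: Float32Array,
  workspaceKey: string,
  topK = 8,
): MemoryRecord[] {
  return records
    .filter((r) => r.workspaceKey === workspaceKey && !r.superseded)
    .map((r) => ({ record: r, score: cosine(query, r.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, topK)
    .map((x) => x.record);
}
```

An embedded vector database replaces the linear scan with an index, but the contract is the same: only records that pass the metadata filter compete for the top-K slots.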
Tests
Run all tests:
bun run test
Current test coverage:
tests/embed.test.ts
parses successful Ollama /api/embed responses into Float32Array
verifies error handling when Ollama returns non-2xx responses
tests/memory-db.test.ts
validates save + search behavior with workspace/type filtering
validates supersede behavior (superseded items are excluded from search)
validates delete behavior and returned payload semantics