neuromcp
Semantic memory for AI agents — local-first MCP server with hybrid search, governance, and consolidation.
```
npx neuromcp
```

Why
AI agents forget everything between sessions. The default MCP memory server stores flat key-value pairs with keyword search — fine for "remember my name is Bob", useless for "what was the architectural decision we made about authentication last week?"
neuromcp solves this with hybrid search (vector embeddings + full-text), memory governance (namespaces, trust levels, lineage tracking), and automatic consolidation (dedup, decay, prune) — all running locally in a single SQLite file. No cloud, no API keys, no infrastructure.
Before & After
| | Without neuromcp | With neuromcp |
|---|---|---|
| Session memory | Gone when you close the terminal | Persisted, searchable, ranked by relevance |
| Search | Exact keyword match | Semantic — "auth architecture" finds "JWT validation middleware" |
| Duplicates | Same fact stored 50 times | Content-hash dedup + similarity-based merge |
| Stale memories | Accumulate forever | Automatic decay, pruning, and TTL sweeps |
| Multi-project | Everything in one pile | Namespace isolation per project |
| Trust | All memories equal | Trust levels (high/medium/low) + source tracking |
| Setup | API keys, cloud accounts, config files | `npx neuromcp` |
How It Works
```
Query: "how does auth work in this project?"
          │
          ▼
┌───────────────────────┐
│     Hybrid Search     │
│                       │
│  Vector (semantic)    │──→ "JWT tokens validated in middleware" (0.87)
│  FTS (keyword)        │──→ "auth module uses passport.js" (0.91)
│                       │
│      RRF Fusion       │──→ Combined ranking, best results first
└───────────────────────┘
          │
          ▼
Filters: namespace, category, trust, date range
          │
          ▼
Top results returned to agent
```

Embeddings: Auto-detects Ollama at localhost:11434 for real semantic search (nomic-embed-text). Falls back to the built-in ONNX model (bge-small-en-v1.5) for basic similarity. Zero config either way.
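The RRF fusion step above can be sketched as follows. This is a minimal illustration assuming the common k = 60 constant; neuromcp's actual constant and tie-breaking rules may differ.

```typescript
// Minimal sketch of Reciprocal Rank Fusion (RRF) over two ranked lists.
type Ranked = { id: string; score: number };

function rrfFuse(vectorHits: string[], ftsHits: string[], k = 60): Ranked[] {
  const scores = new Map<string, number>();
  for (const hits of [vectorHits, ftsHits]) {
    hits.forEach((id, rank) => {
      // Each list contributes 1 / (k + rank); results found by both
      // search modes accumulate a larger fused score.
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank + 1));
    });
  }
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A memory that appears in both the vector and FTS lists outranks one that tops only a single list, which is why fused results tend to surface the genuinely relevant hits first.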
Storage: Single SQLite file (~/.neuromcp/memory.db), WAL mode for concurrent access, content-hash deduplication at write time.
Consolidation: Plan-then-commit pipeline — preview what will change before applying. Deduplicates near-identical memories, decays old memories' importance over time, prunes below-threshold, sweeps tombstoned records past TTL.
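The decay and prune stages of the pipeline can be sketched like this. The exponential half-life model, the 90-day half-life, the trust multipliers, and the 0.1 prune threshold are all illustrative assumptions — neuromcp's real decay curve and thresholds are configurable and not specified here.

```typescript
// Illustrative sketch of decay + prune in a plan-then-commit pipeline.
type Trust = "high" | "medium" | "low";

function decayedImportance(
  importance: number,
  ageDays: number,
  trust: Trust = "medium",
  halfLifeDays = 90,
): number {
  // High-trust memories decay slower; low-trust memories decay faster.
  const trustFactor = trust === "high" ? 0.25 : trust === "low" ? 1.5 : 1;
  return importance * Math.pow(0.5, (ageDays * trustFactor) / halfLifeDays);
}

// Plan step: list prune candidates without deleting anything, so the
// change can be previewed before it is committed.
function pruneCandidates(
  memories: { id: string; importance: number; ageDays: number; trust: Trust }[],
  threshold = 0.1,
): string[] {
  return memories
    .filter((m) => decayedImportance(m.importance, m.ageDays, m.trust) < threshold)
    .map((m) => m.id);
}
```

Separating the plan (candidate list) from the commit (actual deletion) is what makes the preview step possible.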
Real-World Example
You: Where were we with the payment integration?
Claude: [searches neuromcp for "payment integration"]
Found 3 relevant memories:
- (0.92) "Decided on Stripe over PayPal — better webhook reliability,
session from 2026-03-28"
- (0.87) "Payment webhook handler at src/api/webhooks/stripe.ts needs
idempotency keys — known bug, not yet fixed"
- (0.81) "Test coverage for payment module is at 64%, target is 80%"
Based on our previous sessions: we chose Stripe, the webhook handler
needs idempotency keys (that's still open), and test coverage needs
work. Want me to start with the idempotency fix?

Quick Start
```
npx neuromcp
```

Creates ~/.neuromcp/memory.db on first run. Downloads the ONNX model automatically.
Recommended: Add Ollama for real semantic search
```
# Install Ollama from https://ollama.com, then:
ollama pull nomic-embed-text
```

neuromcp auto-detects it. No config needed.
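The auto-detection path can be sketched as follows. POST `/api/embeddings` with `{ model, prompt }` is Ollama's documented embeddings endpoint; the null-on-failure fallback (to the built-in ONNX model) is an illustrative assumption about how a client would degrade gracefully.

```typescript
// Sketch of the Ollama embedding call with a silent fallback path.
function buildEmbedRequest(prompt: string, model = "nomic-embed-text") {
  return {
    url: "http://localhost:11434/api/embeddings",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, prompt }),
    },
  };
}

async function embed(prompt: string): Promise<number[] | null> {
  const { url, init } = buildEmbedRequest(prompt);
  try {
    const res = await fetch(url, init);
    if (!res.ok) return null;          // caller falls back to ONNX
    const data = (await res.json()) as { embedding: number[] };
    return data.embedding;
  } catch {
    return null;                       // Ollama not running → fallback
  }
}
```

Because a missing Ollama server just yields `null`, the caller can swap in the ONNX embedder without any configuration.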
| Provider | Semantic Quality | Setup |
|---|---|---|
| Ollama + nomic-embed-text | Excellent — real semantic understanding, 8K context | `ollama pull nomic-embed-text` |
| ONNX (built-in fallback) | Basic — keyword overlap, no deep semantics | Zero config |
Installation
Claude Code
```json
// ~/.claude.json → mcpServers
{
  "neuromcp": {
    "type": "stdio",
    "command": "npx",
    "args": ["-y", "neuromcp"]
  }
}
```

Claude Desktop
```json
// ~/Library/Application Support/Claude/claude_desktop_config.json
{
  "mcpServers": {
    "neuromcp": {
      "command": "npx",
      "args": ["-y", "neuromcp"]
    }
  }
}
```

Cursor / Windsurf / Cline
Same format — add to your editor's MCP settings.
Per-project isolation
```json
// .mcp.json in project root
{
  "mcpServers": {
    "neuromcp": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "neuromcp"],
      "env": {
        "NEUROMCP_DB_PATH": ".neuromcp/memory.db",
        "NEUROMCP_NAMESPACE": "my-project"
      }
    }
  }
}
```

MCP Surface
Tools (8)
| Tool | Description |
|---|---|
| | Store with semantic dedup. Returns ID and match status. |
| | Hybrid vector + FTS search with RRF ranking. Filters by namespace, category, tags, trust, date. |
| | Retrieve by ID, namespace, category, or tags — no semantic search. |
| | Soft-delete (tombstone). Supports |
| | Dedup, decay, prune, sweep. |
| | Counts, categories, trust distribution, DB size. |
| | Export as JSONL or JSON. |
| | Import with content-hash dedup. |
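An agent invokes any of these tools with a standard MCP `tools/call` request over stdio. The JSON-RPC envelope below follows the MCP specification; `memory_search` is a hypothetical tool name used purely for illustration, not neuromcp's actual tool name.

```typescript
// Sketch of the JSON-RPC envelope an MCP client sends to call a tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// "memory_search" is a placeholder name; the filter fields mirror the
// search tool's description above.
const request = buildToolCall(1, "memory_search", {
  query: "payment integration",
  namespace: "my-project",
  limit: 3,
});
```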
Resources (13)
| URI | Description |
|---|---|
| | Global statistics |
| | Last 20 memories |
| | All namespaces with counts |
| | Server health + metrics |
| | Per-namespace stats |
| | Recent in namespace |
| | Single memory by ID |
| | Memories by tag |
| | Tag within namespace |
| | All in namespace (max 100) |
| | Recent consolidation entries |
| | Specific operation log |
| | Active/recent operations |
Prompts (3)
| Prompt | Description |
|---|---|
| | Search relevant memories and format as LLM context |
| | Show proposed memory alongside near-duplicates |
| | Preview consolidation without applying |
Memory Governance
Namespaces isolate memories by project, agent, or domain. Each memory belongs to exactly one namespace. Use NEUROMCP_NAMESPACE env var or specify per-operation.
Trust levels (high, medium, low, unverified) indicate confidence in the source. High-trust memories rank higher in search results and resist decay.
Soft delete tombstones memories instead of removing them. Tombstoned records survive for NEUROMCP_TOMBSTONE_TTL_DAYS (default 30) — recoverable until the next consolidation sweep.
Content hashing (SHA-256) deduplicates at write time. Identical content in the same namespace returns the existing memory instead of creating a duplicate.
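The write-time dedup described above can be sketched like this. The in-memory `Map` index is illustrative — neuromcp keeps hashes in SQLite — but the namespace-scoped SHA-256 key mirrors the documented behavior.

```typescript
import { createHash } from "node:crypto";

// Sketch of write-time deduplication via SHA-256 content hashing.
function contentKey(namespace: string, content: string): string {
  return createHash("sha256")
    .update(`${namespace}\u0000${content}`) // scope the hash to one namespace
    .digest("hex");
}

function storeOnce(
  index: Map<string, string>, // content hash → existing memory id
  namespace: string,
  content: string,
  newId: string,
): { id: string; deduped: boolean } {
  const key = contentKey(namespace, content);
  const existing = index.get(key);
  if (existing !== undefined) {
    return { id: existing, deduped: true }; // return the existing memory
  }
  index.set(key, newId);
  return { id: newId, deduped: false };
}
```

Scoping the hash to the namespace means identical content in two different projects still gets two memories, as the governance rules require.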
Lineage tracking records source (user, auto, consolidation, claude-code, error), project ID, and agent ID per memory. Full audit trail for governance.
Configuration
All via environment variables. Defaults work for most setups.
| Variable | Default | Description |
|---|---|---|
| `NEUROMCP_DB_PATH` | `~/.neuromcp/memory.db` | Database file path |
| | | Max database size |
| | | Model name (auto-detected) |
| | | Ollama server URL |
| `NEUROMCP_NAMESPACE` | | Default namespace |
| `NEUROMCP_TOMBSTONE_TTL_DAYS` | `30` | Days before permanent sweep |
| | | Enable periodic consolidation |
| | | Consolidation frequency |
| | | Importance decay rate |
| | | Cosine similarity for dedup |
| | | Prune threshold |
| | | Auto-merge threshold |
| | | TTL sweep frequency |
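Reading this configuration amounts to env-var lookups with defaults. Only `NEUROMCP_DB_PATH`, `NEUROMCP_NAMESPACE`, and `NEUROMCP_TOMBSTONE_TTL_DAYS` (default 30) are named in this README; the other fallback values in this sketch are illustrative assumptions.

```typescript
// Sketch of env-var configuration with defaults.
function loadConfig(env: Record<string, string | undefined>) {
  return {
    dbPath: env.NEUROMCP_DB_PATH ?? "~/.neuromcp/memory.db",
    namespace: env.NEUROMCP_NAMESPACE ?? "default", // fallback value assumed
    tombstoneTtlDays: Number(env.NEUROMCP_TOMBSTONE_TTL_DAYS ?? 30),
  };
}
```

A server would call `loadConfig(process.env)` once at startup, which is why unset variables simply fall through to the defaults.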
Comparison
| Feature | neuromcp | @modelcontextprotocol/server-memory | mem0 | cortex-mcp |
|---|---|---|---|---|
| Search | Hybrid (vector + FTS + RRF) | Keyword only | Vector only | Vector only |
| Embeddings | Built-in ONNX (zero config) | None | External API | External API |
| Governance | Namespaces, trust, soft delete | None | None | Basic |
| Consolidation | Plan-then-commit | None | None | Manual |
| Storage | SQLite (single file) | JSON file | Cloud / Postgres | SQLite |
| Infrastructure | Zero | Zero | Cloud account | Zero |
| MCP surface | 8 tools, 13 resources, 3 prompts | 5 tools | N/A | 4 tools |
Contributing
See CONTRIBUTING.md for development setup and guidelines.
License
MIT