# 🧠 Hippocampus Memory MCP Server

**Persistent, Semantic Memory for Large Language Models**

Features • Installation • Quick Start • Documentation • Architecture

## Overview
A Python-based Model Context Protocol (MCP) server that gives LLMs persistent, hippocampus-inspired memory across sessions. Store, retrieve, consolidate, and forget memories using semantic similarity search powered by vector embeddings.
**Why Hippocampus?** Just as the human brain's hippocampus consolidates short-term memories into long-term storage, this server manages LLM memory through biologically inspired patterns:

- **Consolidation**: merge similar memories to reduce redundancy
- **Forgetting**: remove outdated information based on age and importance
- **Semantic Retrieval**: find relevant memories by meaning, not keywords
## Features

| Feature | Description |
| --- | --- |
| Vector Storage | FAISS-powered semantic similarity search |
| MCP Compliant | Full MCP 1.2.0 spec compliance via FastMCP |
| Bio-Inspired | Hippocampus-style consolidation and forgetting |
| Security | Input validation, rate limiting, injection prevention |
| Semantic Search | Sentence transformer embeddings (CPU-optimized) |
| Unlimited Storage | No memory count limits, only per-item size limits |
| 100% Free | Local embedding model, no API costs |
## Quick Start

### 5 Core MCP Tools

```python
memory_read         # Retrieve memories by semantic similarity
memory_write        # Store new memories with tags & metadata
memory_consolidate  # Merge similar memories
memory_forget       # Remove memories by age/importance/tags
memory_stats        # Get system statistics
```
## Installation

### Quick Install (Recommended)

```shell
pip install hippocampus-memory-mcp
```

**Prerequisites:** Python 3.9+ • ~200MB disk space (for the embedding model)
### Claude Desktop Integration

Add to your Claude Desktop config (`claude_desktop_config.json`):

```json
{
  "mcpServers": {
    "memory": {
      "command": "python",
      "args": ["-m", "memory_mcp_server.server"]
    }
  }
}
```

That's it! Claude will now have persistent memory across conversations.
### Install from Source (Alternative)

```shell
# Clone the repository
git clone https://github.com/jameslovespancakes/Memory-MCP.git
cd Memory-MCP

# Install dependencies
pip install -r requirements.txt

# Run the server
python -m memory_mcp_server.server
```
## Documentation

### Memory Operations via MCP

Once connected to Claude, use natural language:

- "Remember that I prefer Python for backend development" → Claude calls `memory_write()`
- "What do you know about my programming preferences?" → Claude calls `memory_read()`
- "Consolidate similar memories to clean up storage" → Claude calls `memory_consolidate()`
### Direct API Usage

#### Writing Memories

```python
from memory_mcp_server.storage import MemoryStorage
from memory_mcp_server.tools import MemoryTools

storage = MemoryStorage(storage_path="my_memory")
await storage._ensure_initialized()
tools = MemoryTools(storage)

# Store with tags and importance
await tools.memory_write(
    text="User prefers dark mode UI",
    tags=["preference", "ui"],
    importance_score=3.0,
    metadata={"category": "settings"}
)
```
#### Reading Memories

```python
# Semantic search
result = await tools.memory_read(
    query_text="What are my UI preferences?",
    top_k=5,
    min_similarity=0.3
)

# Filter by tags and date
result = await tools.memory_read(
    query_text="Python learning",
    tags=["learning", "python"],
    date_range_start="2024-01-01"
)
```
#### Consolidating Memories

```python
# Merge similar memories (threshold: 0.85)
result = await tools.memory_consolidate(similarity_threshold=0.85)
print(f"Merged {result['consolidated_groups']} groups")
```
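The grouping behind consolidation can be sketched with plain numpy: memories whose embedding similarity exceeds the threshold are clustered greedily, and each cluster is a candidate for merging. This is a simplified illustration of the idea, not the package's actual algorithm:

```python
import numpy as np

def consolidate(vectors: np.ndarray, threshold: float = 0.85):
    """Greedily group vectors whose cosine similarity exceeds
    `threshold`; returns a list of index groups (simplified sketch)."""
    # Normalize rows so the dot product equals cosine similarity
    vecs = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    sims = vecs @ vecs.T
    unassigned = list(range(len(vecs)))
    groups = []
    while unassigned:
        seed = unassigned.pop(0)
        group = [seed] + [j for j in unassigned if sims[seed, j] > threshold]
        unassigned = [j for j in unassigned if j not in group]
        groups.append(group)
    return groups

# Two near-duplicate vectors and one distinct vector
v = np.array([[1.0, 0.0], [0.99, 0.1], [0.0, 1.0]])
print(consolidate(v))  # → [[0, 1], [2]]: the near-duplicates form one group
```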
#### Forgetting Memories

```python
# Remove by age
await tools.memory_forget(max_age_days=30)

# Remove by importance
await tools.memory_forget(min_importance_score=2.0)

# Remove by tags
await tools.memory_forget(tags_to_forget=["temporary"])
```
### Testing

Run the included test suite; it exercises all 5 operations with sample data.
## Architecture

```
+---------------------------------------------------+
|         MCP Client (Claude Desktop, etc.)         |
+------------------------+--------------------------+
                         | JSON-RPC over stdio
+------------------------v--------------------------+
|           FastMCP Server (server.py)              |
|   - memory_read                                   |
|   - memory_write                                  |
|   - memory_consolidate                            |
|   - memory_forget                                 |
|   - memory_stats                                  |
+------------------------+--------------------------+
                         |
+------------------------v--------------------------+
|            Memory Tools (tools.py)                |
|   - Input validation & sanitization               |
|   - Rate limiting (100 req/min)                   |
+------------------------+--------------------------+
                         |
+------------------------v--------------------------+
|           Storage Layer (storage.py)              |
|   - Sentence Transformers (all-MiniLM-L6-v2)      |
|   - FAISS Vector Index (cosine similarity)        |
|   - JSON persistence (memories.json)              |
+---------------------------------------------------+
```
## Memory Lifecycle

| Step | Process | Technology |
| --- | --- | --- |
| Write | Text → 384-dim vector embedding | Sentence Transformers (CPU) |
| Store | Normalized vector → FAISS index | FAISS IndexFlatIP |
| Search | Query → embedding → top-k similar | Cosine similarity |
| Consolidate | Group similar (>0.85) → merge | Vector clustering |
| Forget | Filter by age/importance/tags → delete | Metadata filtering |
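The write/store/search rows reduce to "normalize, then take inner products": on unit vectors, the inner product that FAISS `IndexFlatIP` computes *is* the cosine similarity. A numpy-only sketch of that search step, with toy 4-dim vectors standing in for the real 384-dim embeddings:

```python
import numpy as np

def normalize(x):
    # Scale each vector to unit length
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

# Toy "embeddings" standing in for real 384-dim model output
store = normalize(np.array([
    [0.9, 0.1, 0.0, 0.0],   # "user prefers dark mode"
    [0.0, 0.0, 1.0, 0.2],   # "project uses Python 3.9"
    [0.8, 0.2, 0.1, 0.0],   # "user likes high-contrast themes"
]))

query = normalize(np.array([1.0, 0.0, 0.0, 0.0]))

# On unit vectors, inner product == cosine similarity (as in IndexFlatIP)
scores = store @ query
top_k = np.argsort(scores)[::-1][:2]
print(top_k)  # → [0 2]: the two UI-related memories rank first
```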
## Security

| Protection | Implementation |
| --- | --- |
| Injection Prevention | Regex filtering of script tags, `eval()`, path traversal |
| Rate Limiting | 100 requests per 60-second window per client |
| Size Limits | 50KB text, 5KB metadata, 20 tags per memory |
| Input Validation | Pydantic models + custom sanitization |
| Safe Logging | stderr only (prevents JSON-RPC corruption) |
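A 100-requests-per-60-seconds limit like the one above is typically a sliding window over per-client timestamps. A minimal stdlib sketch (illustrative, not the server's actual code):

```python
import time
from collections import deque

class SlidingWindowLimiter:
    """Allow at most `max_requests` per `window` seconds per client."""

    def __init__(self, max_requests=100, window=60.0):
        self.max_requests = max_requests
        self.window = window
        self.hits = {}  # client_id -> deque of request timestamps

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.hits.setdefault(client_id, deque())
        # Evict timestamps that have fallen out of the window
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False  # window is full; reject this request
        q.append(now)
        return True

limiter = SlidingWindowLimiter(max_requests=2, window=60.0)
print(limiter.allow("a", now=0.0))   # → True
print(limiter.allow("a", now=1.0))   # → True
print(limiter.allow("a", now=2.0))   # → False (window full)
print(limiter.allow("a", now=61.0))  # → True (first hit expired)
```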
## Configuration

### Environment Variables

```shell
MEMORY_STORAGE_PATH="memory_data"   # Storage directory
EMBEDDING_MODEL="all-MiniLM-L6-v2"  # Model name
RATE_LIMIT_REQUESTS=100             # Max requests
RATE_LIMIT_WINDOW=60                # Time window (seconds)
```
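In Python these read as plain environment lookups with the defaults shown above; a sketch of how a config loader might consume them (the `load_config` helper is hypothetical, not part of the package):

```python
import os

def load_config(env=os.environ):
    """Read the documented variables, falling back to their defaults."""
    return {
        "storage_path": env.get("MEMORY_STORAGE_PATH", "memory_data"),
        "embedding_model": env.get("EMBEDDING_MODEL", "all-MiniLM-L6-v2"),
        "rate_limit_requests": int(env.get("RATE_LIMIT_REQUESTS", "100")),
        "rate_limit_window": int(env.get("RATE_LIMIT_WINDOW", "60")),
    }

cfg = load_config({"RATE_LIMIT_REQUESTS": "50"})
print(cfg["rate_limit_requests"])  # → 50 (override applied)
print(cfg["storage_path"])         # → memory_data (default)
```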
### Storage Limits

- Unlimited total memories (no count limit)
- Per-memory limits: 50KB text, 5KB metadata, 20 tags
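Enforcing the per-memory caps amounts to three checks before a write. A sketch assuming UTF-8 byte counts for text and JSON-encoded metadata (the package's actual validation may differ):

```python
import json

MAX_TEXT_BYTES = 50 * 1024      # 50KB text
MAX_METADATA_BYTES = 5 * 1024   # 5KB metadata
MAX_TAGS = 20                   # 20 tags per memory

def validate_memory(text, tags=(), metadata=None):
    """Raise ValueError if a memory exceeds the per-item limits."""
    if len(text.encode("utf-8")) > MAX_TEXT_BYTES:
        raise ValueError("text exceeds 50KB")
    if len(tags) > MAX_TAGS:
        raise ValueError("more than 20 tags")
    blob = json.dumps(metadata or {})
    if len(blob.encode("utf-8")) > MAX_METADATA_BYTES:
        raise ValueError("metadata exceeds 5KB")
    return True

print(validate_memory("User prefers dark mode", tags=["ui"]))  # → True
```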
## Troubleshooting

**Model download:** the first run downloads `all-MiniLM-L6-v2` (~90MB). Ensure an internet connection and write permissions on `~/.cache/`.

**Dependency conflicts:** reinstall pinned versions:

```shell
pip uninstall torch transformers sentence-transformers -y
pip install torch==2.1.0 transformers==4.35.2 sentence-transformers==2.2.2
```

**Performance:** the model runs on CPU. Ensure 2GB+ free RAM, and reduce `top_k` in read operations if needed.
## License

MIT License. Feel free to use it in your projects!

## Contributing

PRs welcome!

## Resources

Built with 🧠 for persistent LLM memory

Report Bug · Request Feature