Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Vector Memory MCP find my notes on how I resolved the CORS issue in the last project".
That's it! The server will respond to your query, and you can continue using it as needed.
Vector Memory MCP Server
A secure, vector-based memory server for Claude Desktop using sqlite-vec and sentence-transformers. This MCP server provides persistent semantic memory capabilities that enhance AI coding assistants by remembering and retrieving relevant coding experiences, solutions, and knowledge.
Features
Semantic Search: Vector-based similarity search using 384-dimensional embeddings
Persistent Storage: SQLite database with vector indexing via sqlite-vec
Smart Organization: Categories and tags for better memory organization
Security First: Input validation, path sanitization, and resource limits
High Performance: Fast embedding generation with sentence-transformers
Auto-Cleanup: Intelligent memory management and cleanup tools
Rich Statistics: Comprehensive memory database analytics
Automatic Deduplication: SHA-256 content hashing prevents storing duplicate memories
Access Tracking: Monitors memory usage with access counts and timestamps for optimization
Smart Cleanup Algorithm: Prioritizes memory retention based on recency, access patterns, and importance
Technical Stack
Component | Technology | Purpose |
Vector DB | sqlite-vec | Vector storage and similarity search |
Embeddings | sentence-transformers/all-MiniLM-L6-v2 | 384D text embeddings |
MCP Framework | FastMCP | High-level tools-only server |
Dependencies | uv script headers | Self-contained deployment |
Security | Custom validation | Path/input sanitization |
Testing | pytest + coverage | Comprehensive test suite |
Project Structure
Organization Guide
This project is organized for clarity and ease of use:
main.py - Start here! Main server entry point
src/ - Core implementation (security, embeddings, memory store)
claude-desktop-config.example.json - Configuration template
New here? Start with main.py and claude-desktop-config.example.json.
Quick Start
Prerequisites
Python 3.10 or higher (recommended: 3.11)
uv package manager
Claude Desktop app
Installing uv (if not already installed):
macOS and Linux:
Verify installation:
Installation
Option 1: Quick Install via uvx (Recommended)
The easiest way to use this MCP server - no cloning or setup required!
Once published to PyPI, you can use it directly:
Claude Desktop Configuration (using uvx):
Note: --memory-limit is optional. Omit it to use the default of 10,000 entries.
Note: Publishing to PyPI is in progress. See PUBLISHING.md for details.
Option 2: Install from Source (For Development)
Clone the project:
    git clone <repository-url>
    cd vector-memory-mcp
Install dependencies (automatic with uv): Dependencies are automatically managed via inline metadata in main.py. No manual installation needed.
To verify dependencies:
    uv pip list
Test the server:
    # Test with sample working directory
    uv run main.py --working-dir ./test-memory
Configure Claude Desktop:
Copy the example configuration:
    cp claude-desktop-config.example.json ~/path/to/your/config/
Open Claude Desktop Settings → Developer → Edit Config, and add (replace paths with absolute paths):
    {
      "mcpServers": {
        "vector-memory": {
          "command": "uv",
          "args": [
            "run",
            "/absolute/path/to/vector-memory-mcp/main.py",
            "--working-dir",
            "/your/project/path",
            "--memory-limit",
            "100000"
          ]
        }
      }
    }
Important:
Use absolute paths, not relative paths
--memory-limit is optional (default: 10,000)
For large projects, use 100,000-1,000,000
Restart Claude Desktop and look for the MCP integration icon.
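Because relative paths are the most common configuration mistake, it can help to generate the entry programmatically. The following is a minimal sketch and not part of the project: build_config is a hypothetical helper, and the key names simply mirror the JSON shown above.

```python
import json
from pathlib import Path
from typing import Optional


def build_config(main_py: str, working_dir: str, memory_limit: Optional[int] = None) -> dict:
    """Hypothetical helper: build the "mcpServers" entry for Claude Desktop."""
    args = [
        "run",
        str(Path(main_py).resolve()),  # resolve() yields an absolute path
        "--working-dir",
        str(Path(working_dir).resolve()),
    ]
    if memory_limit is not None:  # optional; the server default is 10,000
        args += ["--memory-limit", str(memory_limit)]
    return {"mcpServers": {"vector-memory": {"command": "uv", "args": args}}}


if __name__ == "__main__":
    # Print a config block that can be pasted into the Claude Desktop config file.
    print(json.dumps(build_config("main.py", ".", 100_000), indent=2))
```

Run it from the cloned repository directory so that main.py resolves to the right absolute path.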
Option 3: Install with pipx (Alternative)
Claude Desktop Configuration (using pipx):
Usage Guide
Available Tools
1. store_memory - Store Knowledge
Store coding experiences, solutions, and insights:
2. search_memories - Semantic Search
Find relevant memories using natural language:
3. list_recent_memories - Browse Recent
See what you've stored recently:
4. get_memory_stats - Database Health
View memory database statistics:
5. clear_old_memories - Cleanup
Clean up old, unused memories:
6. get_by_memory_id - Retrieve Specific Memory
Get full details of a specific memory by its ID:
Returns all fields including content, category, tags, timestamps, access count, and metadata.
7. delete_by_memory_id - Delete Memory
Permanently remove a specific memory from the database:
Removes the memory from both metadata and vector tables atomically.
Memory Categories
Category | Use Cases |
| Working code snippets, implementations |
| Bug fixes and debugging approaches |
| System design decisions and patterns |
| New concepts, tutorials, insights |
| Tool configurations, CLI commands |
| Debugging techniques and discoveries |
| Optimization strategies and results |
| Security considerations and fixes |
| Everything else |
Configuration
Command Line Arguments
The server supports the following arguments:
Available Options:
--working-dir (required): Directory where memory database will be stored
--memory-limit (optional): Maximum number of memory entries
Default: 10,000 entries
Minimum: 1,000 entries
Maximum: 10,000,000 entries
Recommended for large projects: 100,000-1,000,000
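The documented options and bounds can be expressed with argparse. This is an illustrative sketch only: the project's actual parsing code is not shown here, and the names parse_args and memory_limit are assumptions.

```python
import argparse

DEFAULT_LIMIT = 10_000
MIN_LIMIT = 1_000
MAX_LIMIT = 10_000_000


def memory_limit(value: str) -> int:
    """argparse type function: parse an integer and enforce the documented range."""
    number = int(value)  # a ValueError here is reported by argparse as an invalid value
    if not MIN_LIMIT <= number <= MAX_LIMIT:
        raise argparse.ArgumentTypeError(
            f"--memory-limit must be between {MIN_LIMIT:,} and {MAX_LIMIT:,}"
        )
    return number


def parse_args(argv=None) -> argparse.Namespace:
    parser = argparse.ArgumentParser(description="Vector memory server (illustrative)")
    parser.add_argument("--working-dir", required=True,
                        help="directory where the memory database is stored")
    parser.add_argument("--memory-limit", type=memory_limit, default=DEFAULT_LIMIT,
                        help="maximum number of memory entries")
    return parser.parse_args(argv)
```

With this sketch, an out-of-range value makes argparse print an error and exit instead of starting the server with an unsupported limit.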
Working Directory Structure
Security Limits
Max memory content: 10,000 characters
Max total memories: Configurable via --memory-limit (default: 10,000 entries)
Max search results: 50 per query
Max tags per memory: 10 tags
Path validation: Blocks suspicious characters
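The limits above translate into simple checks before a memory is stored or a search is run. The sketch below assumes that oversized input is rejected and that an oversized result count is clamped; the server may behave differently, and check_memory and clamp_result_limit are hypothetical names.

```python
from typing import List

MAX_CONTENT_CHARS = 10_000
MAX_TAGS = 10
MAX_SEARCH_RESULTS = 50


def check_memory(content: str, tags: List[str]) -> None:
    """Raise ValueError if a memory exceeds the documented limits."""
    if len(content) > MAX_CONTENT_CHARS:
        raise ValueError(f"content exceeds {MAX_CONTENT_CHARS:,} characters")
    if len(tags) > MAX_TAGS:
        raise ValueError(f"at most {MAX_TAGS} tags are allowed per memory")


def clamp_result_limit(requested: int) -> int:
    """Assumption: requests above the cap are clamped rather than rejected."""
    return max(1, min(requested, MAX_SEARCH_RESULTS))
```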
Use Cases
For Individual Developers
For Team Workflows
For Learning & Growth
How Semantic Search Works
The server uses sentence-transformers to convert your memories into 384-dimensional vectors that capture semantic meaning:
Example Searches
Query | Finds Memories About |
"authentication patterns" | JWT, OAuth, login systems, session management |
"database performance" | SQL optimization, indexing, query tuning, caching |
"React state management" | useState, Redux, Context API, state patterns |
"API error handling" | HTTP status codes, retry logic, error responses |
Similarity Scoring
0.9+ similarity: Extremely relevant, almost exact matches
0.8-0.9: Highly relevant, strong semantic similarity
0.7-0.8: Moderately relevant, good contextual match
0.6-0.7: Somewhat relevant, might be useful
<0.6: Low relevance, probably not helpful
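To make the score bands concrete, here is a small sketch that computes a similarity score and maps it to the bands listed above. It assumes cosine similarity, which this document does not state explicitly, and it uses toy 3-dimensional vectors in place of real 384-dimensional embeddings. Both function names are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def relevance_label(score: float) -> str:
    """Map a similarity score to the bands listed above."""
    if score >= 0.9:
        return "extremely relevant"
    if score >= 0.8:
        return "highly relevant"
    if score >= 0.7:
        return "moderately relevant"
    if score >= 0.6:
        return "somewhat relevant"
    return "low relevance"


if __name__ == "__main__":
    query = [0.9, 0.1, 0.3]    # toy stand-in for a query embedding
    memory = [0.8, 0.2, 0.4]   # toy stand-in for a stored memory embedding
    score = cosine_similarity(query, memory)
    print(f"{score:.3f} -> {relevance_label(score)}")
```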
Database Statistics
The get_memory_stats tool provides comprehensive insights:
Statistics Fields Explained
total_memories: Current number of memories stored in the database
memory_limit: Maximum allowed memories (configurable via --memory-limit, default: 10,000)
usage_percentage: Database capacity usage (total_memories / memory_limit * 100)
categories: Breakdown of memory count by category type
recent_week_count: Number of memories created in the last 7 days
database_size_mb: Physical size of the SQLite database file on disk
health_status: Overall database health indicator based on usage and performance metrics
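As a worked example of the usage_percentage formula, the sketch below computes the value for a given count and limit. Rounding to two decimal places is an assumption, and the function is illustrative rather than the server's own code.

```python
def usage_percentage(total_memories: int, memory_limit: int = 10_000) -> float:
    """total_memories / memory_limit * 100, rounded to two decimals (rounding is assumed)."""
    if memory_limit <= 0:
        raise ValueError("memory_limit must be positive")
    return round(total_memories / memory_limit * 100, 2)


if __name__ == "__main__":
    # 2,500 memories stored against the default limit of 10,000 entries.
    print(usage_percentage(2_500))
```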
Security Features
Input Validation
Sanitizes all user input to prevent injection attacks
Removes control characters and null bytes
Enforces length limits on all content
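One way to remove control characters and null bytes is to filter on the Unicode "Cc" category. This is a sketch under that assumption, not the project's actual sanitizer; keeping newlines and tabs is also an assumption, and sanitize_text is a hypothetical name.

```python
import unicodedata

KEEP = {"\n", "\t"}  # assumption: ordinary line breaks and tabs are preserved


def sanitize_text(text: str) -> str:
    """Drop control characters (Unicode category Cc, which includes the null byte)."""
    cleaned = "".join(
        ch for ch in text
        if ch in KEEP or unicodedata.category(ch) != "Cc"
    )
    return cleaned.strip()
```

Length limits would be checked separately, after sanitization, so that removed characters do not count toward the limit.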
Path Security
Validates and normalizes all file paths
Prevents directory traversal attacks
Blocks suspicious character patterns
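A common way to prevent directory traversal is to resolve the candidate path and confirm it still lies inside the base directory. The sketch below illustrates that idea; the set of suspicious characters and the name resolve_inside are assumptions, not the project's actual rules.

```python
from pathlib import Path

SUSPICIOUS_CHARS = set(";|&$`<>\0")  # assumed list, for illustration only


def resolve_inside(base_dir: str, relative: str) -> Path:
    """Return the normalized path, or raise ValueError if it is unsafe."""
    if any(ch in SUSPICIOUS_CHARS for ch in relative):
        raise ValueError("path contains suspicious characters")
    base = Path(base_dir).resolve()
    candidate = (base / relative).resolve()  # collapses ".." segments
    if not candidate.is_relative_to(base):   # available since Python 3.9
        raise ValueError("path escapes the base directory")
    return candidate
```

An absolute path passed as relative replaces the base when joined, so it fails the containment check as well.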
Resource Limits
Limits total memory count and individual memory size
Prevents database bloat and memory exhaustion
Implements cleanup mechanisms for old data
SQL Safety
Uses parameterized queries exclusively
No dynamic SQL construction from user input
SQLite WAL mode for safe concurrent access
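The following sketch shows both SQL safety points with Python's built-in sqlite3 module: WAL mode is enabled when the connection is opened, and user input is passed only as a bound parameter. The memories table and its columns are hypothetical, since the real schema is not shown in this document.

```python
import sqlite3
from typing import List, Tuple


def open_db(path: str) -> sqlite3.Connection:
    """Open a connection and request write-ahead logging."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA journal_mode=WAL")  # has no effect on in-memory databases
    return conn


def find_by_category(conn: sqlite3.Connection, category: str) -> List[Tuple[int, str]]:
    """User input is bound with '?', never interpolated into the SQL text."""
    cursor = conn.execute(
        "SELECT id, content FROM memories WHERE category = ?",  # hypothetical table
        (category,),
    )
    return cursor.fetchall()
```

Because the category is bound as data, a value such as x' OR '1'='1 is compared literally and matches nothing.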
Troubleshooting
Common Issues
Server Not Starting
Claude Desktop Not Connecting
Verify absolute paths in configuration
Check Claude Desktop logs: ~/Library/Logs/Claude/
Restart Claude Desktop after config changes
Test server manually before configuring Claude
Memory Search Not Working
Verify sentence-transformers model downloaded successfully
Check database file permissions in memory/ directory
Try broader search terms
Review memory content for relevance
Performance Issues
Run get_memory_stats to check database health
Use clear_old_memories to clean up old entries
Consider increasing hardware resources for embedding generation
Debug Mode
Run the server manually to see detailed logs:
Advanced Usage
Batch Memory Storage
To store multiple related memories, call store_memory once per memory through the Claude Desktop interface.
Memory Organization Strategies
By Project
Use tags to organize by project:
["project-alpha", "frontend", "react"]
["project-beta", "backend", "node"]
["project-gamma", "devops", "docker"]
By Technology Stack
["javascript", "react", "hooks"]
["python", "django", "orm"]
["aws", "lambda", "serverless"]
By Problem Domain
["authentication", "security", "jwt"]
["performance", "optimization", "caching"]
["testing", "unit-tests", "mocking"]
Integration with Development Workflow
Code Review Learnings
Sprint Retrospectives
Technical Debt Tracking
Performance Benchmarks
Based on testing with various dataset sizes:
Memory Count | Search Time | Storage Size | RAM Usage |
1,000 | <50ms | ~5MB | ~100MB |
5,000 | <100ms | ~20MB | ~200MB |
10,000 | <200ms | ~40MB | ~300MB |
Tested on MacBook Air M1 with sentence-transformers/all-MiniLM-L6-v2
Advanced Implementation Details
Database Indexes
The memory store uses 4 optimized indexes for performance:
idx_category: Speeds up category-based filtering and statistics
idx_created_at: Optimizes temporal queries and recent memory retrieval
idx_content_hash: Enables fast deduplication checks via SHA-256 hash lookups
idx_access_count: Improves cleanup algorithm efficiency by tracking usage patterns
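For illustration, the four indexes could be created as follows. The index names come from the list above; the memories table and its column definitions are assumptions made so the example is self-contained.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS memories (          -- hypothetical table layout
    id           INTEGER PRIMARY KEY,
    content      TEXT    NOT NULL,
    category     TEXT    NOT NULL,
    content_hash TEXT    NOT NULL,
    access_count INTEGER NOT NULL DEFAULT 0,
    created_at   TEXT    NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_category     ON memories(category);
CREATE INDEX IF NOT EXISTS idx_created_at   ON memories(created_at);
CREATE INDEX IF NOT EXISTS idx_content_hash ON memories(content_hash);
CREATE INDEX IF NOT EXISTS idx_access_count ON memories(access_count);
"""


def create_schema(conn: sqlite3.Connection) -> None:
    """Create the table and indexes; safe to call more than once."""
    conn.executescript(SCHEMA)
```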
Deduplication System
Content deduplication uses SHA-256 hashing to prevent storing identical memories:
Hash calculated on normalized content (trimmed, lowercased)
Check performed before insertion
Duplicate attempts return existing memory ID
Reduces storage overhead and maintains data quality
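The steps above can be sketched with hashlib. The normalization (trim, then lowercase) follows the description; the in-memory DedupIndex class is a hypothetical stand-in for the database lookup.

```python
import hashlib
from typing import Dict, Tuple


def content_hash(content: str) -> str:
    """SHA-256 hex digest of the trimmed, lowercased content."""
    normalized = content.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DedupIndex:
    """Hypothetical in-memory stand-in for the hash lookup in the database."""

    def __init__(self) -> None:
        self._ids_by_hash: Dict[str, int] = {}

    def store(self, content: str) -> Tuple[int, bool]:
        """Return (memory_id, created). Duplicates return the existing ID."""
        digest = content_hash(content)
        existing = self._ids_by_hash.get(digest)
        if existing is not None:
            return existing, False
        new_id = len(self._ids_by_hash) + 1
        self._ids_by_hash[digest] = new_id
        return new_id, True
```

Note that this normalization treats "Fix CORS" and "  fix cors " as the same memory, while any other difference in wording produces a different hash.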
Access Tracking
Each memory tracks usage statistics for intelligent management:
access_count: Number of times memory retrieved via search or direct access
last_accessed_at: Timestamp of most recent access
created_at: Original creation timestamp
Used by cleanup algorithm to identify valuable vs. stale memories
Cleanup Algorithm
Smart cleanup prioritizes memory retention based on multiple factors:
Recency: Newer memories are prioritized over older ones
Access patterns: Frequently accessed memories are protected
Age threshold: Configurable days_old parameter for hard cutoff
Count limit: Maintains max_memories cap by removing least valuable entries
Scoring system: Combines access_count and recency for retention decisions
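The exact scoring formula is not given here, so the following is only one possible reading of the factors above: memories older than days_old are removed outright, and if the remainder still exceeds max_memories, the lowest-scoring entries go next. The weight in retention_score and all names other than days_old and max_memories are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class Memory:
    id: int
    created_at: datetime
    access_count: int


def retention_score(memory: Memory, now: datetime) -> float:
    """Higher is more valuable. The 0.1 per-day penalty is an arbitrary, assumed weight."""
    age_days = (now - memory.created_at).total_seconds() / 86_400
    return memory.access_count - 0.1 * age_days


def select_for_removal(memories: List[Memory], now: datetime,
                       days_old: int, max_memories: int) -> List[int]:
    """Return the IDs to delete: expired entries, then overflow by lowest score."""
    cutoff = now - timedelta(days=days_old)
    expired = {m.id for m in memories if m.created_at < cutoff}  # hard age cutoff
    kept = sorted(
        (m for m in memories if m.id not in expired),
        key=lambda m: retention_score(m, now),
        reverse=True,
    )
    overflow = {m.id for m in kept[max_memories:]}  # least valuable beyond the cap
    return sorted(expired | overflow)
```

A design that protects frequently accessed memories even past the age cutoff is equally consistent with the description; that would mean filtering expired entries by access_count before removal.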
Contributing
This is a standalone MCP server designed for personal/team use. For improvements:
Fork the repository
Modify as needed for your use case
Test thoroughly with your specific requirements
Share improvements via pull requests
License
This project is released under the MIT License.
Acknowledgments
sqlite-vec: Alex Garcia's excellent SQLite vector extension
sentence-transformers: Nils Reimers' semantic embedding library
FastMCP: Anthropic's high-level MCP framework
Claude Desktop: For providing the MCP integration platform
Built for developers who want persistent AI memory without the complexity of dedicated vector databases.