local-only server
The server can only run on the client’s local machine because it depends on local resources.
MCP Memory Service
An MCP server providing semantic memory and persistent storage capabilities for Claude Desktop using ChromaDB and sentence transformers. This service enables long-term memory storage with semantic search capabilities, making it ideal for maintaining context across conversations and instances.
Features
- Semantic search using sentence transformers
- Natural language time-based recall (e.g., "last week", "yesterday morning")
- Tag-based memory retrieval system
- Persistent storage using ChromaDB
- Automatic database backups
- Memory optimization tools
- Exact match retrieval
- Debug mode for similarity analysis
- Database health monitoring
- Duplicate detection and cleanup
- Customizable embedding model
- Cross-platform compatibility (Apple Silicon, Intel, Windows, Linux)
- Hardware-aware optimizations for different environments
- Graceful fallbacks for limited hardware resources
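To make the natural language time-based recall feature concrete, here is a minimal sketch of how phrases such as "yesterday morning" or "last week" could be turned into a time window. It is not the service's actual parser. The function name resolve_time_expression, the Monday week start, and the midnight-to-noon definition of "morning" are all assumptions made for illustration.

```python
from datetime import datetime, timedelta
from typing import Tuple


def resolve_time_expression(expression: str, now: datetime) -> Tuple[datetime, datetime]:
    """Map a few phrases to a half-open (start, end) window. Hypothetical helper."""
    text = expression.strip().lower()
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)

    if text == "yesterday":
        return midnight - timedelta(days=1), midnight
    if text == "yesterday morning":
        # Assumption: "morning" means midnight to noon.
        start = midnight - timedelta(days=1)
        return start, start + timedelta(hours=12)
    if text == "last week":
        # Assumption: weeks start on Monday; "last week" is the full week
        # before the one that contains `now`.
        this_monday = midnight - timedelta(days=midnight.weekday())
        return this_monday - timedelta(days=7), this_monday
    raise ValueError(f"unsupported time expression: {expression!r}")
```

A window like this could then be used to filter stored memories by timestamp. A real parser would need to handle many more phrasings and time zones.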
Quick Start
For the fastest way to get started:
Docker and Smithery Integration
Docker Usage
The service can be run in a Docker container for better isolation and deployment:
To configure Docker's file sharing on macOS:
- Open Docker Desktop
- Go to Settings (Preferences)
- Navigate to Resources -> File Sharing
- Add any additional paths you need to share
- Click "Apply & Restart"
Smithery Integration
The service is configured for Smithery integration through smithery.yaml. This configuration enables stdio-based communication with MCP clients like Claude Desktop.
To use with Smithery:
- Ensure your claude_desktop_config.json points to the correct paths:
- The smithery.yaml configuration handles stdio communication and environment setup automatically.
Testing with Claude Desktop
To verify your Docker-based memory service is working correctly with Claude Desktop:
- Build the Docker image with docker build -t mcp-memory-service .
- Create the necessary directories for persistent storage:
- Update your Claude Desktop configuration file:
- On macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
- On Windows: %APPDATA%\Claude\claude_desktop_config.json
- On Linux: ~/.config/Claude/claude_desktop_config.json
- Restart Claude Desktop
- When Claude starts up, you should see the memory service initialize with a message:
- Test the memory feature:
- Ask Claude to remember something: "Please remember that my favorite color is blue"
- Later in the conversation or in a new conversation, ask: "What is my favorite color?"
- Claude should retrieve the information from the memory service
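The per-platform configuration file locations listed in the steps above can be resolved in a script. The sketch below only encodes those three paths. The helper name claude_config_path is an assumption, and it is not part of the memory service itself.

```python
import os
import platform
from pathlib import Path
from typing import Optional

CONFIG_NAME = "claude_desktop_config.json"


def claude_config_path(system: Optional[str] = None) -> Path:
    """Return the Claude Desktop config path for a platform. Hypothetical helper."""
    # platform.system() reports "Darwin" for macOS.
    system = system or platform.system()
    if system == "Darwin":
        return Path.home() / "Library" / "Application Support" / "Claude" / CONFIG_NAME
    if system == "Windows":
        # %APPDATA% is read from the environment; empty if it is not set.
        return Path(os.environ.get("APPDATA", "")) / "Claude" / CONFIG_NAME
    if system == "Linux":
        return Path.home() / ".config" / "Claude" / CONFIG_NAME
    raise ValueError(f"unsupported platform: {system}")
```

Calling claude_config_path() with no argument uses the current platform, which is convenient when checking that the file exists before editing it.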
If you experience any issues:
- Check the Claude Desktop console for error messages
- Verify Docker has the necessary permissions to access the mounted directories
- Ensure the Docker container is running with the correct parameters
- Try running the container manually to see any error output
For detailed installation instructions, platform-specific guides, and troubleshooting, see our documentation:
- Installation Guide - Comprehensive installation instructions for all platforms
- Troubleshooting Guide - Solutions for common issues
- Technical Documentation - Detailed technical procedures and specifications
- Scripts Documentation - Overview of available scripts and their usage
Configuration
Standard Configuration (Recommended)
Add the following to your claude_desktop_config.json file to use UV (recommended for best performance):
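As a rough illustration of what such an entry involves, the sketch below merges a UV-launched server into an existing configuration dictionary without touching other servers. The mcpServers key is the one Claude Desktop reads. The server name "memory", the uv arguments, and the repository path are assumptions here, so check the project's own instructions for the exact values.

```python
import json


def add_memory_server(config: dict, repo_dir: str) -> dict:
    """Return a copy of `config` with a memory server entry added. Hypothetical helper."""
    servers = dict(config.get("mcpServers", {}))
    servers["memory"] = {
        "command": "uv",
        # Assumed invocation: run the project's entry point from its checkout.
        "args": ["--directory", repo_dir, "run", "memory"],
    }
    return {**config, "mcpServers": servers}


if __name__ == "__main__":
    existing = {"mcpServers": {"other": {"command": "node", "args": ["server.js"]}}}
    # "/path/to/mcp-memory-service" is a placeholder, not a real location.
    print(json.dumps(add_memory_server(existing, "/path/to/mcp-memory-service"), indent=2))
```

Returning a new dictionary rather than mutating the input makes it easy to compare the old and new configuration before writing anything to disk.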
Windows-Specific Configuration (Recommended)
For Windows users, we recommend using the wrapper script to ensure PyTorch is properly installed. See our Windows Setup Guide for detailed instructions.
The wrapper script will:
- Check if PyTorch is installed and properly configured
- Install PyTorch with the correct index URL if needed
- Run the memory server with the appropriate configuration
Hardware Compatibility
Platform | Architecture | Accelerator | Status |
---|---|---|---|
macOS | Apple Silicon (M1/M2/M3) | MPS | ✅ Fully supported |
macOS | Apple Silicon under Rosetta 2 | CPU | ✅ Supported with fallbacks |
macOS | Intel | CPU | ✅ Fully supported |
Windows | x86_64 | CUDA | ✅ Fully supported |
Windows | x86_64 | DirectML | ✅ Supported |
Windows | x86_64 | CPU | ✅ Supported with fallbacks |
Linux | x86_64 | CUDA | ✅ Fully supported |
Linux | x86_64 | ROCm | ✅ Supported |
Linux | x86_64 | CPU | ✅ Supported with fallbacks |
Linux | ARM64 | CPU | ✅ Supported with fallbacks |
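The fallback behaviour in the table can be pictured as a simple preference order. The sketch below is an assumption about how such selection might look with PyTorch, not the service's actual detection code. It covers only CUDA, MPS, and CPU. ROCm builds of PyTorch are reached through the same torch.cuda interface, and DirectML needs a separate package that is left out here.

```python
def pick_device() -> str:
    """Choose an accelerator name, falling back to CPU. Hypothetical helper."""
    try:
        import torch
    except ImportError:
        # No PyTorch available: the only safe answer is CPU.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps is missing on older PyTorch builds, so look it up defensively.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"
```

The returned string can be passed as the device when loading an embedding model, so the same code path works on machines with and without an accelerator.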
Memory Operations
The memory service provides the following operations through the MCP server:
Core Memory Operations
- store_memory: Store new information with optional tags
- retrieve_memory: Perform semantic search for relevant memories
- recall_memory: Retrieve memories using natural language time expressions
- search_by_tag: Find memories using specific tags
- exact_match_retrieve: Find memories with exact content match
- debug_retrieve: Retrieve memories with similarity scores
For detailed information about tag storage and management, see our Tag Storage Documentation.
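To illustrate what semantic retrieval with similarity scores means, here is a small pure-Python sketch that ranks stored items by cosine similarity to a query vector. In the service, ChromaDB and a sentence-transformers model do this work. The function names, the toy vectors, and the choice of cosine similarity as the metric are assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_memories(
    query_vec: Sequence[float],
    memories: List[Tuple[str, Sequence[float]]],
    top_k: int = 3,
) -> List[Tuple[float, str]]:
    """Return up to `top_k` (score, text) pairs, highest score first."""
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in memories]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]
```

Returning the score next to each text mirrors the idea behind debug_retrieve: seeing how close each match is helps when tuning thresholds.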
Database Management
- create_backup: Create database backup
- get_stats: Get memory statistics
- optimize_db: Optimize database performance
- check_database_health: Get database health metrics
- check_embedding_model: Verify model status
Memory Management
- delete_memory: Delete specific memory by hash
- delete_by_tag: Delete all memories with specific tag
- cleanup_duplicates: Remove duplicate entries
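Deleting by hash and cleaning up duplicates both rely on a stable identifier derived from memory content. The sketch below shows one way this could work. The use of SHA-256, the whitespace stripping, and the helper names are assumptions, since the service's real hashing scheme is not described here.

```python
import hashlib
from typing import List


def content_hash(content: str) -> str:
    """Hash memory text after trimming surrounding whitespace. Assumed scheme."""
    return hashlib.sha256(content.strip().encode("utf-8")).hexdigest()


def find_duplicates(contents: List[str]) -> List[int]:
    """Return indexes of entries whose hash was already seen earlier in the list."""
    seen = set()
    duplicates = []
    for index, content in enumerate(contents):
        digest = content_hash(content)
        if digest in seen:
            duplicates.append(index)
        else:
            seen.add(digest)
    return duplicates
```

Keeping the first occurrence and reporting only later ones means a cleanup pass never removes the last remaining copy of a memory.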
Configuration Options
Configure through environment variables:
Getting Help
If you encounter any issues:
- Check our Troubleshooting Guide
- Review the Installation Guide
- For Windows-specific issues, see our Windows Setup Guide
- Contact the developer via Telegram: t.me/doobeedoo
Project Structure
Development Guidelines
- Python 3.10+ with type hints
- Use dataclasses for models
- Triple-quoted docstrings for modules and functions
- Async/await pattern for all I/O operations
- Follow PEP 8 style guidelines
- Include tests for new features
License
MIT License - See LICENSE file for details
Acknowledgments
- ChromaDB team for the vector database
- Sentence Transformers project for embedding models
- MCP project for the protocol specification
Contact
Cloudflare Worker Implementation
A serverless implementation of the MCP Memory Service is now available using Cloudflare Workers. This implementation:
- Uses Cloudflare D1 for storage (serverless SQLite)
- Uses Workers AI for embeddings generation
- Communicates via Server-Sent Events (SSE) for MCP protocol
- Requires no local installation or dependencies
- Scales automatically with usage
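Since the worker communicates over Server-Sent Events, it may help to see how an SSE stream is structured. The sketch below parses raw SSE text into events following the standard field rules (event and data fields, comment lines starting with a colon, blank line as separator). The function name is an assumption, and the parser ignores the id and retry fields for brevity.

```python
from typing import Dict, List


def parse_sse(stream: str) -> List[Dict[str, str]]:
    """Split raw SSE text into a list of {"event", "data"} dictionaries."""
    events = []
    for block in stream.replace("\r\n", "\n").split("\n\n"):
        event_type = "message"  # default event type per the SSE format
        data_lines = []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # blank line or comment (often used as keep-alive)
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "data":
                data_lines.append(value)
            elif field == "event":
                event_type = value
        if data_lines:
            # Multiple data lines in one event are joined with newlines.
            events.append({"event": event_type, "data": "\n".join(data_lines)})
    return events
```

An MCP client would typically decode each data payload as JSON after this step.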
Benefits of the Cloudflare Implementation
- Zero local installation: No Python, dependencies, or local storage needed
- Cross-platform compatibility: Works on any device that can connect to the internet
- Automatic scaling: Handles multiple users without configuration
- Global distribution: Low latency access from anywhere
- No maintenance: Updates and maintenance handled automatically
Available Tools in the Cloudflare Implementation
The Cloudflare Worker implementation supports all the same tools as the Python implementation:
Tool | Description |
---|---|
store_memory | Store new information with optional tags |
retrieve_memory | Find relevant memories based on query |
recall_memory | Retrieve memories using natural language time expressions |
search_by_tag | Search memories by tags |
delete_memory | Delete a specific memory by its hash |
delete_by_tag | Delete all memories with a specific tag |
cleanup_duplicates | Find and remove duplicate entries |
get_embedding | Get raw embedding vector for content |
check_embedding_model | Check if embedding model is loaded and working |
debug_retrieve | Retrieve memories with debug information |
exact_match_retrieve | Retrieve memories using exact content match |
check_database_health | Check database health and get statistics |
recall_by_timeframe | Retrieve memories within a specific timeframe |
delete_by_timeframe | Delete memories within a specific timeframe |
delete_before_date | Delete memories before a specific date |
Configuring Claude to Use the Cloudflare Memory Service
Add the following to your Claude configuration to use the Cloudflare-based memory service:
Replace your-worker-subdomain with your actual Cloudflare Worker subdomain.
Deploying Your Own Cloudflare Memory Service
- Clone the repository and navigate to the Cloudflare Worker directory:
- Install Wrangler (Cloudflare's CLI tool):
- Log in to your Cloudflare account:
- Create a D1 database:
- Update the wrangler.toml file with your database ID from the previous step.
- Initialize the database schema:
Where schema.sql contains:
- Deploy the worker:
- Update your Claude configuration to use your new worker URL.
Testing Your Cloudflare Memory Service
After deployment, you can test your memory service using curl:
- List available tools:
- Store a memory:
- Retrieve memories:
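The requests behind these three checks are JSON-RPC 2.0 messages, which is what MCP uses for tools/list and tools/call. The sketch below only builds the request bodies and does not send them, because the worker's URL and endpoint path are specific to your deployment. The argument names passed to store_memory and retrieve_memory (content, tags, query) are assumptions, so check the tool schemas returned by tools/list for the real ones.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_request_ids = count(1)  # simple incrementing request ids


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request body."""
    request: Dict[str, Any] = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        request["params"] = params
    return request


def tool_call(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build an MCP tools/call request for the named tool."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(json.dumps(jsonrpc_request("tools/list")))
    # Argument names below are assumed for illustration.
    print(json.dumps(tool_call("store_memory", {"content": "My favorite color is blue", "tags": ["preferences"]})))
    print(json.dumps(tool_call("retrieve_memory", {"query": "favorite color"})))
```

Each printed line can be used as the body of a POST request to the worker when testing by hand.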
Limitations
- Free tier limits on Cloudflare Workers and D1 may apply
- Workers AI embedding models may differ slightly from the local sentence-transformers models
- No direct access to the underlying database for manual operations
- Cloudflare Workers have a maximum execution time of 30 seconds on free plans