RAG Context MCP Server
A lightweight Model Context Protocol (MCP) server that provides persistent memory and context management using a local vector store and SQLite database. It enables AI assistants to store and retrieve contextual information efficiently through both semantic search and indexed retrieval.
Features
- Local Vector Storage: Uses Vectra for efficient vector similarity search
- Persistent Memory: SQLite database for reliable data persistence
- Semantic Search: Automatic text embedding using Xenova/all-MiniLM-L6-v2 model
- Hybrid Retrieval: Combines semantic search with indexed database queries
- Simple API: Just two tools, `setContext` and `getContext`
- Lightweight: Minimal dependencies, runs entirely locally
- Privacy-First: All data stored locally, no external API calls
Installation
Using npm
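The install command below assumes the package is published to npm under the name `rag-context-mcp`; verify the actual package name before installing.

```shell
# Global install (package name assumed)
npm install -g rag-context-mcp
```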
Using npx (no installation required)
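With npx you can run the server directly; again, the package name is assumed, and `-y` skips the one-time install prompt.

```shell
# Run without installing (package name assumed)
npx -y rag-context-mcp
```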
Configuration
For Claude Desktop
Add the following to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
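A configuration entry along these lines should work; the server key (`rag-context`) and package name (`rag-context-mcp`) are assumptions, so adjust them to match your setup:

```json
{
  "mcpServers": {
    "rag-context": {
      "command": "npx",
      "args": ["-y", "rag-context-mcp"],
      "env": {
        "RAG_CONTEXT_DATA_DIR": "~/.rag-context-mcp"
      }
    }
  }
}
```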
For Cursor
In Cursor settings, add the MCP server:
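Cursor reads MCP server definitions from an `mcp.json` file (globally in `~/.cursor/mcp.json`, or per project in `.cursor/mcp.json`). A sketch, again assuming the `rag-context-mcp` package name:

```json
{
  "mcpServers": {
    "rag-context": {
      "command": "npx",
      "args": ["-y", "rag-context-mcp"]
    }
  }
}
```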
Environment Variables
- `RAG_CONTEXT_DATA_DIR`: Directory where the database and vector index will be stored (default: `~/.rag-context-mcp`)
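For example, to point the server at a custom data directory before launching it (the path here is purely illustrative):

```shell
# Store the database and vector index in a custom location
export RAG_CONTEXT_DATA_DIR="$HOME/.my-assistant-memory"
```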
Usage
The server exposes two main tools:
setContext
Store information in memory with automatic vectorization:
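A hypothetical tool call might look like the following; the exact argument names (`key`, `value`, `metadata`) are assumed from the best-practices notes below, not taken from the server's schema:

```json
{
  "tool": "setContext",
  "arguments": {
    "key": "project-alpha-tech-stack",
    "value": "Project Alpha uses TypeScript, Fastify, and PostgreSQL.",
    "metadata": {
      "project": "alpha",
      "category": "tech-stack",
      "date": "2024-01-15"
    }
  }
}
```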
getContext
Retrieve relevant context using semantic search:
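A matching retrieval call could look like this sketch; `limit` and `threshold` follow the recommendations given later in this document, while the argument names themselves are assumptions:

```json
{
  "tool": "getContext",
  "arguments": {
    "query": "What technology stack does Project Alpha use?",
    "limit": 5,
    "threshold": 0.7
  }
}
```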
System Prompt for AI Assistants
To effectively use this MCP server, add the following to your AI assistant's system prompt:
When to Retrieve Context
Retrieve context when:
- Starting a new conversation about a previously discussed topic
- Users reference past discussions or decisions
- You need to recall specific technical details or preferences
- Building upon previous work or solutions
How to Retrieve Context
Use the `getContext` tool with:
- A natural language query describing what you're looking for
- Appropriate limit (usually 3-5 results)
- Threshold of 0.7 for balanced precision/recall
Example:
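One illustrative query, using the recommended limit and threshold (the argument names are assumed, not taken from the server's schema):

```json
{
  "tool": "getContext",
  "arguments": {
    "query": "What are the user's preferred coding conventions?",
    "limit": 3,
    "threshold": 0.7
  }
}
```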
Best Practices
- Be Selective: Store important, reusable information, not every detail
- Use Clear Keys: Make keys descriptive and searchable
- Add Metadata: Include project names, categories, and dates
- Update Existing: Use the same key to update information rather than creating duplicates
- Query Naturally: Write queries as you would ask a colleague
Remember: This memory persists across all conversations, making you more helpful over time by remembering important context and user preferences.
Data Storage
The data directory (see `RAG_CONTEXT_DATA_DIR`) contains:
```
~/.rag-context-mcp/
├── memories.db      # SQLite database
└── vectors.index    # Vectra vector index
```
Running Tests
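From a cloned checkout, the usual npm workflow should apply; the script names are assumptions, so check `package.json` for the project's actual commands:

```shell
# Install dependencies, then run the test suite (script names assumed)
npm install
npm test
```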
Privacy and Security
- All data is stored locally on your machine
- No external API calls for embeddings (uses local model)
- No telemetry or data collection
- You control where data is stored via `RAG_CONTEXT_DATA_DIR`
Troubleshooting
Common Issues
- "VectorStore not initialized" error
  - Ensure the data directory exists and has write permissions
  - Check that the `RAG_CONTEXT_DATA_DIR` path is valid
- Slow first startup
  - The embedding model is downloaded on first use (~30MB)
  - Subsequent starts will be much faster
- High memory usage
  - The embedding model requires ~200MB RAM
  - Consider limiting the number of stored contexts
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License - see LICENSE file for details
Acknowledgments
Inspired by the MCP memory server example from Anthropic, but enhanced with:
- Local vector storage for better retrieval
- SQLite for reliable persistence
- Hybrid search capabilities
- Privacy-focused design