RAG Context MCP Server
A lightweight Model Context Protocol (MCP) server that provides persistent memory and context management using a local vector store and database. It enables AI assistants to store and retrieve contextual information efficiently through both semantic search and indexed retrieval.
Features
Local Vector Storage: Uses Vectra for efficient vector similarity search
Persistent Memory: SQLite database for reliable data persistence
Semantic Search: Automatic text embedding using Xenova/all-MiniLM-L6-v2 model
Hybrid Retrieval: Combines semantic search with indexed database queries
Simple API: Just two tools, setContext and getContext
Lightweight: Minimal dependencies, runs entirely locally
Privacy-First: All data stored locally, no external API calls
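The semantic half of the hybrid retrieval above works by comparing embedding vectors. As an illustrative sketch (not the server's actual code), cosine similarity is the measure a threshold such as 0.7 is typically compared against:

```typescript
// Illustrative sketch: cosine similarity between two embedding vectors.
// Scores range from -1 to 1; a stored memory is returned when its score
// against the query embedding clears the caller-supplied threshold.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Identical directions score 1; orthogonal vectors score 0.
console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Because the Xenova/all-MiniLM-L6-v2 embeddings are normalized, this reduces to a dot product in practice, but the thresholding idea is the same.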
Installation
Using npm
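Assuming the package is published as `rag-context-mcp` (check the repository's package.json for the exact name), a global install would look like:

```shell
npm install -g rag-context-mcp
```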
Using npx (no installation required)
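With npx the server can be launched on demand without a global install (again assuming the package name `rag-context-mcp`):

```shell
npx -y rag-context-mcp
```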
Configuration
For Claude Desktop
Add the following to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
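A typical entry would look like the following; the server key and package name here are assumptions, and the `env` block is optional:

```json
{
  "mcpServers": {
    "rag-context": {
      "command": "npx",
      "args": ["-y", "rag-context-mcp"],
      "env": {
        "RAG_CONTEXT_DATA_DIR": "~/.rag-context-mcp"
      }
    }
  }
}
```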
For Cursor
In Cursor settings, add the MCP server:
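The entry mirrors the Claude Desktop configuration; the exact settings location varies by Cursor version, and the package name is an assumption:

```json
{
  "mcpServers": {
    "rag-context": {
      "command": "npx",
      "args": ["-y", "rag-context-mcp"]
    }
  }
}
```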
Environment Variables
RAG_CONTEXT_DATA_DIR: Directory where the database and vector index will be stored (default: ~/.rag-context-mcp)
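For example, to keep the data alongside a specific project (the path and package name here are illustrative):

```shell
RAG_CONTEXT_DATA_DIR="$HOME/projects/acme/.rag-context" npx -y rag-context-mcp
```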
Usage
The server exposes two main tools:
setContext
Store information in memory with automatic vectorization:
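A call might look like the following; the argument names (key, value, metadata) are assumptions based on typical memory servers, so verify them against the tool schema the server exposes:

```json
{
  "key": "user-preferences-testing",
  "value": "User prefers Vitest over Jest and wants coverage reports on every run.",
  "metadata": {
    "project": "acme-api",
    "category": "preferences"
  }
}
```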
getContext
Retrieve relevant context using semantic search:
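A retrieval call combines a natural-language query with the limit and threshold parameters discussed below (argument names assumed; check the server's tool schema):

```json
{
  "query": "What testing framework does the user prefer?",
  "limit": 5,
  "threshold": 0.7
}
```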
System Prompt for AI Assistants
To effectively use this MCP server, add the following to your AI assistant's system prompt:
When to Retrieve Context
Retrieve context when:
Starting a new conversation about a previously discussed topic
Users reference past discussions or decisions
You need to recall specific technical details or preferences
Building upon previous work or solutions
How to Retrieve Context
Use the getContext tool with:
A natural language query describing what you're looking for
Appropriate limit (usually 3-5 results)
Threshold of 0.7 for balanced precision/recall
Example:
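For instance, when a user returns to a previously discussed deployment topic, a reasonable getContext call would be (argument names assumed):

```json
{
  "query": "decisions about the deployment pipeline and hosting provider",
  "limit": 3,
  "threshold": 0.7
}
```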
Best Practices
Be Selective: Store important, reusable information, not every detail
Use Clear Keys: Make keys descriptive and searchable
Add Metadata: Include project names, categories, and dates
Update Existing: Use the same key to update information rather than creating duplicates
Query Naturally: Write queries as you would ask a colleague
Remember: This memory persists across all conversations, making you more helpful over time by remembering important context and user preferences.
The data directory has the following layout:

```
<RAG_CONTEXT_DATA_DIR>/
├── memories.db      # SQLite database
└── vectors.index    # Vectra vector index
```
Running Tests
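Assuming a standard Node.js project layout with a test script defined in package.json, from a clone of the repository:

```shell
npm install
npm test
```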
Privacy and Security
All data is stored locally on your machine
No external API calls for embeddings (uses local model)
No telemetry or data collection
You control where data is stored via RAG_CONTEXT_DATA_DIR
Troubleshooting
Common Issues
"VectorStore not initialized" error
Ensure the data directory exists and has write permissions
Check that the RAG_CONTEXT_DATA_DIR path is valid
Slow first startup
The embedding model is downloaded on first use (~30MB)
Subsequent starts will be much faster
High memory usage
The embedding model requires ~200MB RAM
Consider limiting the number of stored contexts
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
License
MIT License - see LICENSE file for details
Acknowledgments
Inspired by the MCP memory server example from Anthropic, but enhanced with:
Local vector storage for better retrieval
SQLite for reliable persistence
Hybrid search capabilities
Privacy-focused design