# MCP RAG Server

An MCP (Model Context Protocol) server that exposes RAG capabilities to Claude Code and other MCP clients. It uses OpenAI embeddings (text-embedding-3-large) to generate vectors for document chunks, enabling semantic search over your knowledge base.

This is a standalone extraction from my production portfolio site, where you can see it in action.
## The Problem

You're using Claude Code, but:

- **No access to your documents** — Claude can't search your knowledge base
- **Context is manual** — you're copy-pasting relevant docs into prompts
- **RAG is disconnected** — your vector database isn't accessible to AI tools
- **Integration is custom** — every project builds its own RAG bridge
## The Solution

MCP RAG Server provides:

- **Standard MCP interface** — works with Claude Code, Claude Desktop, and any MCP client
- **Full RAG pipeline** — hybrid search, query expansion, and semantic chunking built in
- **Simple tools** — `rag_query`, `rag_search`, `index_document`, `get_stats`
- **Zero config** — point it at ChromaDB and go
## Results

From production usage:

| Without MCP RAG | With MCP RAG |
| --- | --- |
| Manual context copy-paste | Automatic retrieval |
| No document search | Hybrid search built in |
| Static knowledge | Live vector database |
| Custom integration per project | Standard MCP protocol |
## Design Philosophy

### Why MCP?

MCP (Model Context Protocol) standardizes how AI applications connect to external tools. Instead of every project building its own custom integration, MCP provides a universal interface that any MCP-compatible client can use, as the client sketch below illustrates.
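To make that concrete, here is a minimal client sketch using the MCP TypeScript SDK. The server path and the `rag_query` argument name are assumptions, not confirmed signatures:

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

async function main() {
  // Spawn the server over stdio and perform the MCP handshake.
  const transport = new StdioClientTransport({
    command: "node",
    args: ["/path/to/mcp-rag-server/dist/index.js"], // placeholder path
  });
  const client = new Client({ name: "example-client", version: "1.0.0" });
  await client.connect(transport);

  // Discover the tools the server exposes, then call one of them.
  const { tools } = await client.listTools();
  console.log(tools.map((t) => t.name)); // expect rag_query, rag_search, ...

  const result = await client.callTool({
    name: "rag_query",
    arguments: { query: "What is hybrid search?" }, // argument name is illustrative
  });
  console.log(result.content);

  await client.close();
}

main().catch(console.error);
```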
### Tools Exposed

| Tool | Description |
| --- | --- |
| `rag_query` | Query with hybrid search, returns formatted context |
| `rag_search` | Raw similarity search, returns chunks with scores |
| `index_document` | Add a single document |
| | Batch index multiple documents |
| | Delete all docs from a source |
| `get_stats` | Collection statistics |
| | Clear all data (requires confirmation) |
## Quick Start

### 1. Prerequisites
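You'll need (the Node version is an assumption; check the repo's `package.json`):

- Node.js 18+ to run the server
- A running ChromaDB instance
- An OpenAI API key for embeddings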
### 2. Install & Build
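A typical Node workflow, assuming a standard npm setup with a `build` script (substitute your package manager or the repo's actual scripts):

```bash
# from a clone of this repository
npm install
npm run build
```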
### 3. Configure Claude Code

Add the server to your Claude Code MCP configuration (`~/.claude/mcp.json` or a project-level `.mcp.json`):
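A minimal sketch; the entry name, command, and build output path are assumptions:

```json
{
  "mcpServers": {
    "mcp-rag-server": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/index.js"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```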
### 4. Use in Claude Code
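In the chat, type `@` followed by the MCP server name and your instructions, e.g. (using the entry name from your config):

```
@mcp-rag-server search my knowledge base for articles about RAG architecture
```

The server responds to your query, and you can keep using it as needed.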
## API Reference

### rag_query

Query the knowledge base with hybrid search. Returns formatted context suitable for LLM prompts.
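A sketch of a call's arguments; the parameter names (`query`, `top_k`) are assumptions, so check the tool's input schema:

```json
{
  "name": "rag_query",
  "arguments": {
    "query": "How does hybrid search combine keyword and vector results?",
    "top_k": 5
  }
}
```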
### rag_search

Raw similarity search without context formatting.
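A sketch, with the same caveat on parameter names:

```json
{
  "name": "rag_search",
  "arguments": {
    "query": "semantic chunking strategies",
    "top_k": 10
  }
}
```

Each returned chunk carries a similarity score.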
### index_document

Add a document to the knowledge base.
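A sketch; the field names (`content`, `source`, `metadata`) are assumptions:

```json
{
  "name": "index_document",
  "arguments": {
    "content": "RAG combines retrieval with generation to ground LLM answers.",
    "source": "notes/rag.md",
    "metadata": { "topic": "rag" }
  }
}
```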
### get_stats

Get collection statistics.
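This one likely takes no arguments:

```json
{
  "name": "get_stats",
  "arguments": {}
}
```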
## Configuration

### Environment Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| `OPENAI_API_KEY` | Yes | - | OpenAI API key for embeddings |
| | No | | ChromaDB URL |
| | No | | Collection name |
| | No | `text-embedding-3-large` | Embedding model |
| | No | Native | Reduced embedding dimensions |
## Project Structure
## Advanced Usage

### Programmatic Server Creation
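The repo's actual exports aren't reproduced here, so this is a generic sketch of wiring one RAG tool with the MCP TypeScript SDK; `runQuery` is a hypothetical stand-in for the real pipeline:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Stand-in for the real pipeline: returns formatted context for a query.
async function runQuery(query: string, topK: number): Promise<string> {
  return `(${topK} chunks of context for: ${query})`;
}

const server = new McpServer({ name: "mcp-rag-server", version: "1.0.0" });

// Register a single RAG tool; the real server exposes the full tool set.
server.tool(
  "rag_query",
  { query: z.string(), top_k: z.number().optional() },
  async ({ query, top_k }) => ({
    content: [{ type: "text", text: await runQuery(query, top_k ?? 5) }],
  })
);

// Serve over stdio so MCP clients can spawn this process directly.
await server.connect(new StdioServerTransport());
```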
### Using with Claude Desktop

The same configuration works with Claude Desktop's MCP support:
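For example, in `claude_desktop_config.json` (same entry as above; the path is a placeholder):

```json
{
  "mcpServers": {
    "mcp-rag-server": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/index.js"],
      "env": { "OPENAI_API_KEY": "sk-..." }
    }
  }
}
```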
## Part of the Context Continuity Stack

This repo exposes context continuity as a protocol-level capability — giving any MCP client access to persistent semantic memory.

| Layer | Role | This Repo |
| --- | --- | --- |
| Intra-session | Short-term memory | — |
| Document-scoped | Injected content | — |
| Retrieved | Long-term semantic memory via MCP | `mcp-rag-server` |
| Progressive | Staged responses | — |
MCP RAG Server bridges the gap between vector databases and AI assistants. Instead of building custom integrations, any MCP-compatible tool (Claude Code, Claude Desktop, custom clients) gets instant access to your knowledge base.
Related repos:

- **rag-pipeline** — The underlying RAG implementation
- **mcp-client-example** — Reference client for connecting to this server
- **chatbot-widget** — Session cache, Research Mode, conversation export
- **ai-orchestrator** — Multi-model LLM routing
## Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b feat/add-new-tool`)
3. Make changes with semantic commits
4. Open a PR with a clear description
## License
MIT License - see LICENSE for details.
## Acknowledgments
Built with Claude Code.