# MCP RAG Server

An MCP (Model Context Protocol) server that exposes RAG capabilities to Claude Code and other MCP clients. It uses OpenAI's embedding models (`text-embedding-3-large`) to generate vector embeddings for document chunks in the RAG pipeline, enabling semantic search.
This is a standalone extraction from my production portfolio site. See it in action at
## The Problem

You're using Claude Code, but:

- **No access to your documents** — Claude can't search your knowledge base
- **Context is manual** — you're copy-pasting relevant docs into prompts
- **RAG is disconnected** — your vector database isn't accessible to AI tools
- **Integration is custom** — every project builds its own RAG bridge
## The Solution

MCP RAG Server provides:

- **Standard MCP interface** — works with Claude Code, Claude Desktop, and any MCP client
- **Full RAG pipeline** — hybrid search, query expansion, and semantic chunking built in
- **Simple tools** — `rag_query`, `rag_search`, `index_document`, `get_stats`
- **Zero config** — point it at ChromaDB and go
## Results

From production usage:

| Without MCP RAG | With MCP RAG |
|---|---|
| Manual context copy-paste | Automatic retrieval |
| No document search | Hybrid search built-in |
| Static knowledge | Live vector database |
| Custom integration per project | Standard MCP protocol |
## Design Philosophy

### Why MCP?

MCP (Model Context Protocol) standardizes how AI applications connect to external tools. Instead of building custom integrations, MCP provides a universal interface that any MCP-compatible client can use.
Tools Exposed
Tool | Description |
| Query with hybrid search, returns formatted context |
| Raw similarity search, returns chunks with scores |
| Add a single document |
| Batch index multiple documents |
| Delete all docs from a source |
| Collection statistics |
| Clear all data (requires confirmation) |
## Quick Start
### 1. Prerequisites
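The original prerequisites list did not survive this extraction. A plausible setup, inferred from the rest of this README (Node-based MCP server, ChromaDB backend, OpenAI embeddings), might look like:

```shell
# Assumed prerequisites, inferred from the Configuration section below:
# Node.js 18+, a running ChromaDB instance, and an OpenAI API key.
node --version                               # expect v18 or later
docker run -d -p 8000:8000 chromadb/chroma   # one common way to run ChromaDB locally
export OPENAI_API_KEY="sk-..."               # embeddings are generated via OpenAI
```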
### 2. Install & Build
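The build commands were lost in this extraction. Assuming a standard npm-based TypeScript project (the `dist/index.js` entry point is a guess, not confirmed by this README), the usual flow would be:

```shell
# From a clone of this repository (clone URL omitted):
npm install         # install dependencies
npm run build       # compile TypeScript to dist/
node dist/index.js  # assumed entry point; starts the server over stdio
```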
### 3. Configure Claude Code

Add to your Claude Code MCP configuration (`~/.claude/mcp.json` or a project-level `.mcp.json`):
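A sketch of the configuration entry, assuming the server is launched with `node` from a built `dist/index.js`; the server name, path, and environment variable name here are illustrative, not confirmed by this README:

```json
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/index.js"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```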
### 4. Use in Claude Code
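Once configured, you can simply ask Claude Code questions and let it retrieve context for you; an illustrative prompt:

```
> What do my notes say about vector indexing?
```

Claude Code can then call `rag_query` behind the scenes and ground its answer in the retrieved chunks instead of relying on pasted context.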
## API Reference
### `rag_query`

Query the knowledge base with hybrid search. Returns formatted context suitable for LLM prompts.
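The exact parameter schema isn't reproduced in this extraction; a typical call might pass arguments like the following (parameter names are assumptions):

```json
{
  "query": "How does hybrid search work?",
  "top_k": 5
}
```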
### `rag_search`

Raw similarity search without context formatting.
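An illustrative argument payload (names assumed, as above); the result is a list of chunks with similarity scores rather than a formatted prompt context:

```json
{
  "query": "hybrid search",
  "top_k": 10
}
```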
### `index_document`

Add a document to the knowledge base.
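A sketch of the arguments such a tool would take; the field names (`content`, `source`, `metadata`) are assumptions, not confirmed by this README:

```json
{
  "content": "ChromaDB is an open-source vector database...",
  "source": "notes/chromadb.md",
  "metadata": { "topic": "databases" }
}
```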
### `get_stats`

Get collection statistics.
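`get_stats` presumably takes no arguments; a response might look like the following (the shape is illustrative, though the embedding model name comes from this README's introduction):

```json
{
  "collection": "documents",
  "document_count": 1423,
  "embedding_model": "text-embedding-3-large"
}
```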
## Configuration

### Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
|  | Yes | - | OpenAI API key for embeddings |
|  | No |  | ChromaDB URL |
|  | No |  | Collection name |
|  | No |  | Embedding model |
|  | No | Native | Reduced dimensions |
## Project Structure
## Advanced Usage
### Programmatic Server Creation
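The original code sample is missing from this extraction. A minimal sketch of what programmatic creation could look like with the official TypeScript MCP SDK (`@modelcontextprotocol/sdk`); the tool wiring and the `ragPipeline` helper are illustrative assumptions, not this project's actual exports:

```typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

// Stand-in for the project's real RAG pipeline (hypothetical):
const ragPipeline = {
  query: async (q: string, opts: { topK: number }) =>
    `Top ${opts.topK} chunks for: ${q}`,
};

const server = new McpServer({ name: "mcp-rag-server", version: "0.1.0" });

// Register the rag_query tool; the handler delegates to the RAG pipeline.
server.tool(
  "rag_query",
  { query: z.string(), topK: z.number().optional() },
  async ({ query, topK }) => {
    const context = await ragPipeline.query(query, { topK: topK ?? 5 });
    return { content: [{ type: "text", text: context }] };
  }
);

// Serve over stdio so MCP clients (Claude Code, Claude Desktop) can connect.
await server.connect(new StdioServerTransport());
```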
### Using with Claude Desktop

The same configuration works with Claude Desktop's MCP support:
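For Claude Desktop the entry goes in `claude_desktop_config.json` (on macOS, under `~/Library/Application Support/Claude/`); the server name and path below are illustrative:

```json
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/index.js"],
      "env": { "OPENAI_API_KEY": "sk-..." }
    }
  }
}
```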
## Related Projects

- **rag-pipeline** - the underlying RAG implementation
- **topic-discovery** - multi-source topic aggregation
- **ai-orchestrator** - multi-model LLM routing
## Contributing

Contributions welcome! Please:

1. Fork the repository
2. Create a feature branch (`git checkout -b feat/add-new-tool`)
3. Make changes with semantic commits
4. Open a PR with a clear description
## License
MIT License - see LICENSE for details.
## Acknowledgments
Built with Claude Code.