MCP RAG Server

An MCP (Model Context Protocol) server that exposes RAG capabilities to Claude Code and other MCP clients.

This is a standalone extraction from my production portfolio site. See it in action at danmonteiro.com.


The Problem

You're using Claude Code but:

  • No access to your documents — Claude can't search your knowledge base

  • Context is manual — you're copy-pasting relevant docs into prompts

  • RAG is disconnected — your vector database isn't accessible to AI tools

  • Integration is custom — every project builds its own RAG bridge

The Solution

MCP RAG Server provides:

  • Standard MCP interface — works with Claude Code, Claude Desktop, and any MCP client

  • Full RAG pipeline — hybrid search, query expansion, semantic chunking built-in

  • Simple tools — rag_query, rag_search, index_document, get_stats

  • Zero config — point at ChromaDB and go

# In Claude Code, after configuring the server:
"Search my knowledge base for articles about RAG architecture"
# Claude automatically uses rag_query tool and gets relevant context

Results

From production usage:

| Without MCP RAG | With MCP RAG |
|---|---|
| Manual context copy-paste | Automatic retrieval |
| No document search | Hybrid search built-in |
| Static knowledge | Live vector database |
| Custom integration per project | Standard MCP protocol |


Design Philosophy

Why MCP?

MCP (Model Context Protocol) standardizes how AI applications connect to external tools:

┌──────────────┐     MCP Protocol     ┌──────────────┐
│  MCP Client  │◀────────────────────▶│  MCP Server  │
│ (Claude Code)│                      │ (This repo)  │
└──────────────┘                      └──────┬───────┘
                                             │
                                      ┌──────▼───────┐
                                      │ RAG Pipeline │
                                      │  (ChromaDB)  │
                                      └──────────────┘

Instead of building custom integrations, MCP provides a universal interface that any MCP-compatible client can use.
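
The server side of that diagram is small. Here is a hypothetical sketch of how a tool like rag_query gets registered using the official @modelcontextprotocol/sdk; the real handlers live in src/server.ts, and runRagQuery below is a placeholder for the actual pipeline call:

// Hypothetical sketch using @modelcontextprotocol/sdk; the real tool
// handlers live in src/server.ts. runRagQuery stands in for the pipeline.
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "rag", version: "1.0.0" });

// Any connected MCP client discovers this tool automatically
server.tool(
  "rag_query",
  { question: z.string(), topK: z.number().optional() },
  async ({ question, topK }) => {
    const context = await runRagQuery(question, topK ?? 5);
    return { content: [{ type: "text" as const, text: context }] };
  }
);

async function runRagQuery(question: string, topK: number): Promise<string> {
  // Placeholder: the real implementation runs hybrid search against ChromaDB
  return `(top ${topK} chunks for: ${question})`;
}

await server.connect(new StdioServerTransport());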

Tools Exposed

| Tool | Description |
|---|---|
| rag_query | Query with hybrid search, returns formatted context |
| rag_search | Raw similarity search, returns chunks with scores |
| index_document | Add a single document |
| index_documents_batch | Batch index multiple documents |
| delete_by_source | Delete all docs from a source |
| get_stats | Collection statistics |
| clear_collection | Clear all data (requires confirmation) |


Quick Start

1. Prerequisites

# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma

# Set OpenAI API key (for embeddings)
export OPENAI_API_KEY="sk-..."
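
Optionally, confirm ChromaDB is reachable before wiring anything up (the heartbeat endpoint path varies across ChromaDB versions):

# Optional: verify ChromaDB is up (endpoint path varies by version)
curl http://localhost:8000/api/v1/heartbeat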

2. Install & Build

git clone https://github.com/0xrdan/mcp-rag-server.git
cd mcp-rag-server
npm install
npm run build

3. Configure Claude Code

Add to your Claude Code MCP configuration (~/.claude/mcp.json or project .mcp.json):

{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"],
      "env": {
        "OPENAI_API_KEY": "sk-...",
        "CHROMA_URL": "http://localhost:8000",
        "CHROMA_COLLECTION": "my_knowledge_base"
      }
    }
  }
}
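
Alternatively, newer Claude Code builds can register the server from the command line (exact flags may vary by version, so check claude mcp --help):

# Alternative: register via the Claude Code CLI (flags may vary by version)
claude mcp add rag -e OPENAI_API_KEY=sk-... -e CHROMA_URL=http://localhost:8000 -- node /path/to/mcp-rag-server/dist/server.js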

4. Use in Claude Code

# Restart Claude Code to load the server
claude

# Now Claude has access to RAG tools:
"Index this document into my knowledge base: [paste content]"
"Search for information about transformer architectures"
"What do my docs say about error handling?"

API Reference

rag_query

Query the knowledge base with hybrid search. Returns formatted context suitable for LLM prompts.

// Input
{
  question: string;      // Required: the question to search for
  topK?: number;         // Optional: number of results (default: 5)
  threshold?: number;    // Optional: min similarity 0-1 (default: 0.5)
  filters?: object;      // Optional: metadata filters
}

// Output
{
  context: string;       // Formatted context for LLM
  chunks: [{
    content: string;
    score: number;
    metadata: object;
  }];
  stats: {
    totalChunks: number;
    avgSimilarity: number;
  };
}
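
To sanity-check the tool outside Claude Code, a minimal standalone client sketch might look like the following. It assumes the official @modelcontextprotocol/sdk client API; the tool name and argument shape come from the spec above:

// Hypothetical smoke test; assumes @modelcontextprotocol/sdk
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

const transport = new StdioClientTransport({
  command: "node",
  args: ["/path/to/mcp-rag-server/dist/server.js"],
  env: {
    OPENAI_API_KEY: process.env.OPENAI_API_KEY ?? "",
    CHROMA_URL: "http://localhost:8000",
  },
});

const client = new Client({ name: "rag-smoke-test", version: "0.1.0" });
await client.connect(transport);

// Call rag_query exactly as Claude Code would
const result = await client.callTool({
  name: "rag_query",
  arguments: { question: "What do my docs say about error handling?", topK: 3 },
});
console.log(result.content);

await client.close();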

rag_search

Raw similarity search without context formatting.

// Input
{
  query: string;         // Required: search query
  topK?: number;         // Optional: number of results (default: 10)
  filters?: object;      // Optional: metadata filters
}

// Output: Array of chunks with scores

index_document

Add a document to the knowledge base.

// Input
{
  id: string;            // Required: unique identifier
  title: string;         // Required: document title
  content: string;       // Required: document content
  source: string;        // Required: source identifier
  category?: string;     // Optional: category
  tags?: string[];       // Optional: tags array
}

// Output
{
  success: boolean;
  documentId: string;
  chunksIndexed: number;
}
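
For bulk ingestion there is also index_documents_batch. A hypothetical script could read local markdown files and hand them over through a connected client like the one in the rag_query example; the documents input shape (an array of index_document inputs) is an assumption:

// Hypothetical bulk ingest; `client` is a connected MCP client as in
// the rag_query example above. The `documents` input shape is assumed.
import { readdir, readFile } from "node:fs/promises";
import path from "node:path";
import type { Client } from "@modelcontextprotocol/sdk/client/index.js";

declare const client: Client; // connected elsewhere

const dir = "./docs";
const files = (await readdir(dir)).filter((f) => f.endsWith(".md"));

const documents = await Promise.all(
  files.map(async (file) => ({
    id: file,
    title: path.basename(file, ".md"),
    content: await readFile(path.join(dir, file), "utf8"),
    source: "local-docs",
  }))
);

await client.callTool({
  name: "index_documents_batch",
  arguments: { documents },
});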

get_stats

Get collection statistics.

// Output
{
  totalChunks: number;
  totalDocuments: number;
  // ... other stats from RAG pipeline
}

Configuration

Environment Variables

| Variable | Required | Default | Description |
|---|---|---|---|
| OPENAI_API_KEY | Yes | - | OpenAI API key for embeddings |
| CHROMA_URL | No | http://localhost:8000 | ChromaDB URL |
| CHROMA_COLLECTION | No | mcp_knowledge_base | Collection name |
| EMBEDDING_MODEL | No | text-embedding-3-large | Embedding model |
| EMBEDDING_DIMENSIONS | No | Native | Reduced embedding dimensions |
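
As a concrete example, text-embedding-3-large natively produces 3072-dimensional vectors, and OpenAI's embedding API accepts a dimensions parameter to shorten them; EMBEDDING_DIMENSIONS presumably passes that through:

# Example: trade a little recall for smaller vectors (assumed pass-through)
export EMBEDDING_MODEL="text-embedding-3-large"
export EMBEDDING_DIMENSIONS="1024"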


Project Structure

mcp-rag-server/
├── src/
│   ├── server.ts        # Main MCP server implementation
│   └── index.ts         # Exports
├── mcp-config.example.json  # Example Claude Code configuration
├── package.json
└── README.md

Advanced Usage

Programmatic Server Creation

import { createServer } from 'mcp-rag-server';

const server = await createServer({
  vectorDB: {
    host: 'http://custom-chroma:8000',
    collectionName: 'my_collection',
  },
  rag: {
    topK: 10,
    enableHybridSearch: true,
  },
});
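
Assuming createServer returns a standard SDK server instance, you would then attach a transport yourself. Continuing the snippet above with stdio (a sketch, not verified against the package's actual exports):

// Continues the snippet above; assumes `server` exposes the SDK's connect()
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";

await server.connect(new StdioServerTransport());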

Using with Claude Desktop

Same configuration works with Claude Desktop's MCP support:

// ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)
{
  "mcpServers": {
    "rag": {
      "command": "node",
      "args": ["/path/to/mcp-rag-server/dist/server.js"]
    }
  }
}
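
As with Claude Code, you will almost certainly need the same "env" block (OPENAI_API_KEY, CHROMA_URL, CHROMA_COLLECTION) here too, since desktop apps generally don't inherit your shell environment.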

Part of the Context Continuity Stack

This repo exposes context continuity as a protocol-level capability — giving any MCP client access to persistent semantic memory.

| Layer | Role | This Repo |
|---|---|---|
| Intra-session | Short-term memory | |
| Document-scoped | Injected content | |
| Retrieved | Long-term semantic memory via MCP | mcp-rag-server |
| Progressive | Staged responses | |

MCP RAG Server bridges the gap between vector databases and AI assistants. Instead of building custom integrations, any MCP-compatible tool (Claude Code, Claude Desktop, custom clients) gets instant access to your knowledge base.



Contributing

Contributions welcome! Please:

  1. Fork the repository

  2. Create a feature branch (git checkout -b feat/add-new-tool)

  3. Make changes with semantic commits

  4. Open a PR with clear description


License

MIT License - see LICENSE for details.


Acknowledgments

Built with Claude Code.

Co-Authored-By: Claude <noreply@anthropic.com>