MCP-RAG

by seanshin0214

Your Personal NotebookLM for Claude Desktop

Universal RAG (Retrieval-Augmented Generation) MCP server for Claude Desktop. Index documents via CLI, search them in Claude Desktop with 0% hallucination.



What is MCP-RAG?

Think of it as NotebookLM for Claude Desktop:

  • 📚 Index any documents: PDF, Word, PowerPoint, Excel, HWP (Hangul), TXT, MD

  • 🔍 Natural language search: Ask questions in Claude Desktop

  • ✅ 0% Hallucination: Answers based ONLY on your documents

  • 💻 100% Local: All data stays on your computer (ChromaDB)

  • 🎯 Simple workflow: CLI for indexing → Claude Desktop for searching


Architecture

┌─────────────────────┐
│   Your Documents    │
│  (PDF, DOCX, etc)   │
└──────────┬──────────┘
           │
           ▼
  [CLI: npm run cli add]
           │
           ▼
┌─────────────────────┐
│   ChromaDB Server   │ ◄─── Vector embeddings
│  (localhost:8000)   │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   MCP-RAG Server    │
└──────────┬──────────┘
           │
           ▼
┌─────────────────────┐
│   Claude Desktop    │ ◄─── You ask questions here!
└─────────────────────┘

Two-Part System:

  1. CLI = Document management (add, delete, list)

  2. Claude Desktop = Search and Q&A


Quick Start

1. Install

git clone https://github.com/seanshin0214/mcp-rag.git
cd mcp-rag
npm install
pip install chromadb

2. Start ChromaDB Server

Keep this running in a separate terminal:

chroma run --host localhost --port 8000

3. Add Documents (CLI)

# Add single document
npm run cli add school "path/to/regulations.pdf"

# Add multiple documents
npm run cli add research "paper1.pdf"
npm run cli add research "paper2.docx"
npm run cli add work "handbook.pptx"

Supported formats:

  • Documents: PDF, DOCX, HWP, TXT, MD

  • Presentations: PPTX

  • Spreadsheets: XLSX, XLS

4. Configure Claude Desktop

Windows: %APPDATA%\Claude\claude_desktop_config.json

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Add this:

{
  "mcpServers": {
    "mcp-rag": {
      "command": "node",
      "args": ["/absolute/path/to/mcp-rag/src/index.js"]
    }
  }
}

Important: Use your actual installation path!
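A malformed config file or a relative path is an easy mistake to make here. The following is a minimal sketch of a pre-flight check, assuming the config has the shape shown above; the helper name `checkMcpConfig` and its messages are hypothetical and not part of MCP-RAG.

```javascript
// Sketch: sanity-check a claude_desktop_config.json entry before restarting Claude Desktop.
// The helper name and the returned messages are illustrative, not part of MCP-RAG.
const path = require("node:path");

function checkMcpConfig(jsonText, serverName = "mcp-rag") {
  let config;
  try {
    config = JSON.parse(jsonText);
  } catch (err) {
    return [`Invalid JSON: ${err.message}`];
  }

  const entry = config.mcpServers && config.mcpServers[serverName];
  if (!entry) {
    return [`No "mcpServers.${serverName}" entry found`];
  }

  const problems = [];
  if (typeof entry.command !== "string" || entry.command.length === 0) {
    problems.push('"command" should be a non-empty string such as "node"');
  }
  const script = Array.isArray(entry.args) ? entry.args[0] : undefined;
  if (typeof script !== "string") {
    problems.push('"args" should be an array whose first item is the path to src/index.js');
  } else if (!path.posix.isAbsolute(script) && !path.win32.isAbsolute(script)) {
    // Accept both POSIX-style and Windows-style absolute paths.
    problems.push(`"${script}" is a relative path; use an absolute path`);
  }
  return problems;
}

// A deliberately wrong sample: the script path is relative.
const sample = JSON.stringify({
  mcpServers: { "mcp-rag": { command: "node", args: ["src/index.js"] } },
});
console.log(checkMcpConfig(sample));
```

An empty array means none of these basic problems were found; it does not prove that Claude Desktop will load the server.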

5. Restart Claude Desktop

6. Ask Questions!

In Claude Desktop:

"What does the school collection say about attendance?"
"Search the research collection for methodology"
"Show me all my collections"

CLI Commands

# Add document
npm run cli add <collection> <file> [-d "description"]

# List all collections
npm run cli list

# Get collection info
npm run cli info <collection>

# Search test
npm run cli search <collection> "query"

# Delete collection
npm run cli delete <collection>

Examples

# Add with description
npm run cli add school "regulations.pdf" -d "School regulations 2024"

# Add multiple files (PowerShell)
Get-ChildItem "*.docx" | ForEach-Object { npm run cli add MyCollection $_.FullName }

# Check what's indexed
npm run cli list
npm run cli info school

MCP Tools (Claude Desktop)

When you ask questions in Claude Desktop, these tools are automatically used:

| Tool | Description |
|---|---|
| search_documents | Search in specific collection or all collections |
| list_collections | List all available collections |
| get_collection_info | Get details about a collection |

Note: Document addition is CLI-only, not available in Claude Desktop.
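Under the hood, an MCP client invokes a tool by sending a JSON-RPC `tools/call` request. The sketch below builds such a request for `search_documents`. The argument names `query` and `collection` are assumptions for illustration; the actual input schema is whatever `src/index.js` declares.

```javascript
// Sketch: the shape of an MCP "tools/call" request for the search_documents tool.
// The argument names ("query", "collection") are assumptions for illustration;
// the real input schema is defined in src/index.js.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "search_documents", {
  query: "attendance policy",
  collection: "school",
});
console.log(JSON.stringify(request, null, 2));
```

Claude Desktop constructs these requests itself, so you never need to write one by hand; the example only shows what travels between the client and the server.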


How It Works

Indexing (CLI)

1. Read file (PDF/DOCX/PPTX/etc)
2. Extract text
3. Split into 500-token chunks (50-token overlap)
4. Generate embeddings (ChromaDB)
5. Store in collection
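Step 3 can be pictured as a sliding window over the token list. This is a minimal sketch, assuming tokens are approximated by whitespace-separated words; the tokenizer and the exact splitting logic in `src/indexer.js` may differ, and `chunkText` is a hypothetical name.

```javascript
// Sketch: split text into overlapping chunks with a sliding window.
// Assumption: "tokens" are approximated by whitespace-separated words;
// MCP-RAG's actual tokenizer in src/indexer.js may differ.
const CHUNK_SIZE = 500;
const CHUNK_OVERLAP = 50;

function chunkText(text, size = CHUNK_SIZE, overlap = CHUNK_OVERLAP) {
  if (overlap >= size) {
    throw new RangeError("overlap must be smaller than size");
  }
  const tokens = text.split(/\s+/).filter(Boolean);
  const chunks = [];
  const step = size - overlap; // each window starts this many tokens after the previous one
  for (let start = 0; start < tokens.length; start += step) {
    const end = Math.min(start + size, tokens.length);
    chunks.push(tokens.slice(start, end).join(" "));
    if (end === tokens.length) break; // the last window reached the end of the text
  }
  return chunks;
}

// A synthetic 1200-word document: w0 w1 ... w1199
const words = Array.from({ length: 1200 }, (_, i) => `w${i}`).join(" ");
const chunks = chunkText(words);
console.log(chunks.length);
```

With a 50-token overlap, the last 50 tokens of one chunk are repeated at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.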

Searching (Claude Desktop)

1. You ask: "What's the attendance policy?"
2. MCP-RAG searches ChromaDB
3. Returns top 5 most relevant chunks
4. Claude answers using ONLY those chunks
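To make step 3 concrete, here is a toy sketch of "top k most relevant chunks" using cosine similarity over in-memory vectors. ChromaDB does the real embedding and nearest-neighbour search, and its distance metric may not be cosine; the 3-dimensional vectors, the sample texts, and the function names are made up for illustration.

```javascript
// Sketch: rank stored chunks by cosine similarity to a query vector and keep the top k.
// ChromaDB performs the real embedding and nearest-neighbour search; the 3-dimensional
// vectors and chunk texts below are made up for illustration.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

function topK(queryVector, items, k = 5) {
  return items
    .map((item) => ({ ...item, score: cosineSimilarity(queryVector, item.vector) }))
    .sort((x, y) => y.score - x.score) // highest similarity first
    .slice(0, k);
}

const store = [
  { text: "Attendance is mandatory for all lectures.", vector: [0.9, 0.1, 0.0] },
  { text: "The cafeteria opens at 8 am.", vector: [0.0, 0.2, 0.9] },
  { text: "Three unexcused absences lower the grade.", vector: [0.8, 0.3, 0.1] },
];
const results = topK([1, 0, 0], store, 2);
for (const r of results) console.log(r.score.toFixed(3), r.text);
```

The two attendance-related chunks rank above the unrelated one because their vectors point in nearly the same direction as the query vector.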

Use Cases

📚 Students

npm run cli add math "calculus-textbook.pdf"
npm run cli add physics "lecture-notes.docx"

→ "Explain the concept of derivatives from my math collection"

🏢 Professionals

npm run cli add company "employee-handbook.pdf"
npm run cli add project "requirements.docx"

→ "What's our vacation policy?"

🔬 Researchers

npm run cli add literature "papers/*.pdf"
npm run cli add notes "research-notes.md"

→ "Summarize the methodology from the literature collection"


Features

  • ✅ Multi-collection support - Organize by topic

  • ✅ Semantic search - ChromaDB vector embeddings

  • ✅ Source attribution - See which document/chunk

  • ✅ Relevance scoring - Know how confident the match is

  • ✅ Multiple file formats - PDF, DOCX, PPTX, XLSX, HWP, TXT, MD

  • ✅ 100% local - No cloud, all on your machine

  • ✅ 0% hallucination - Only document-based answers


Comparison

| Feature | NotebookLM | MCP-RAG |
|---|---|---|
| Platform | Google Cloud | Local |
| AI Model | Gemini | Claude |
| Privacy | Cloud | 100% Local |
| Multi-collection | ❌ | ✅ |
| CLI | ❌ | ✅ |
| Cost | Free (limited) | Free (unlimited) |


Troubleshooting

ChromaDB Connection Error

Problem: Cannot connect to ChromaDB

Solution:

chroma run --host localhost --port 8000

Keep this terminal open!

Claude Desktop: MCP Server Not Showing

  1. Check claude_desktop_config.json syntax

  2. Use absolute path (not relative)

  3. Restart Claude Desktop completely

  4. Check ChromaDB is running

No Search Results

# Verify documents are indexed
npm run cli list
npm run cli info <collection>

# Re-index if needed
npm run cli add <collection> <file>

Advanced

Batch Add Files

PowerShell:

Get-ChildItem "C:\docs\*.pdf" | ForEach-Object { npm run cli add MyCollection $_.FullName }

Bash:

for f in /path/to/docs/*.pdf; do
  npm run cli add MyCollection "$f"
done

Custom Chunk Size

Edit src/indexer.js:

const CHUNK_SIZE = 500;    // Tokens per chunk
const CHUNK_OVERLAP = 50;  // Overlap between chunks

Larger chunks = more context, fewer chunks
Smaller chunks = more precise, more chunks
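To get a feel for the trade-off, the sketch below estimates how many chunks a document produces for different sizes. It assumes a sliding window that advances by (size - overlap) tokens per chunk, which may not match `src/indexer.js` exactly; `estimateChunkCount` is a hypothetical helper.

```javascript
// Sketch: estimate how many chunks a document yields for a given chunk size and overlap.
// Assumes a sliding window that advances by (size - overlap) tokens per chunk.
function estimateChunkCount(totalTokens, size, overlap) {
  if (overlap >= size) throw new RangeError("overlap must be smaller than size");
  if (totalTokens <= 0) return 0;
  if (totalTokens <= size) return 1;
  // One full first chunk, then one more chunk per step needed to cover the remainder.
  return 1 + Math.ceil((totalTokens - size) / (size - overlap));
}

// Compare three chunk sizes for a 10,000-token document with a 50-token overlap.
for (const size of [250, 500, 1000]) {
  console.log(`size=${size}: ${estimateChunkCount(10000, size, 50)} chunks`);
}
```

Remember that existing collections keep their old chunks; a changed chunk size only applies to documents indexed afterwards.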


Project Structure

mcp-rag/
├── src/
│   ├── index.js       # MCP server
│   ├── cli.js         # CLI tool
│   └── indexer.js     # Document processing
├── chroma/            # ChromaDB data (auto-created)
├── package.json
├── README.md
├── QUICK_START.md
└── HOW_TO_USE.md

Requirements

  • Node.js 18+

  • Python 3.8+ (for ChromaDB)

  • Claude Desktop (latest version)


Contributing

Contributions welcome! This is a universal tool that can benefit many users.


License

MIT License - see LICENSE


Credits

Built with:


MCP-RAG - Your documents, Claude's intelligence, zero hallucination.
