MCP RAG System

A comprehensive Retrieval-Augmented Generation (RAG) system built using the Model Context Protocol (MCP) for storing, processing, and searching PDF documents.

Features

🔧 Tools

  • upload_pdf: Upload and process PDF files with automatic text extraction and chunking

  • search_documents: Semantic search across all uploaded documents using vector embeddings

  • list_documents: View all uploaded documents and their metadata

  • delete_document: Remove documents and their associated chunks from the system

  • get_rag_stats: Get comprehensive statistics about the RAG system

📦 Resources

  • rag://documents: List all documents in the system

  • rag://document/{document_id}: Get full content of a specific document

  • rag://stats: Get system statistics

💬 Prompts

  • rag_query_prompt: Generate prompts for RAG-based question answering

  • document_summary_prompt: Create document summarization prompts

  • search_suggestions_prompt: Generate better search query suggestions

Installation

  1. Install dependencies:

    pip install -r requirements.txt
  2. Download required models: The system will automatically download the sentence-transformers model on first use.

Usage

Starting the Server

python mcp_server.py

The server will start on http://localhost:8000 with SSE (Server-Sent Events) transport.

Using the Client

Demo Mode

python mcp_client.py # Choose option 1 for demo mode

Interactive Mode

python mcp_client.py # Choose option 2 for interactive mode

Available commands in interactive mode:

  • upload - Upload a PDF file

  • search - Search documents with a query

  • list - List all uploaded documents

  • stats - Show system statistics

  • quit - Exit the client

Example Workflow

  1. Upload a PDF:

    # Via tool call
    result = await session.call_tool("upload_pdf", arguments={
        "file_path": "/path/to/document.pdf",
        "document_name": "My Research Paper"
    })
  2. Search documents:

    # Via tool call
    result = await session.call_tool("search_documents", arguments={
        "query": "machine learning applications",
        "top_k": 5
    })
  3. Use RAG prompt:

    # Get search results first, then use in prompt
    prompt = await session.get_prompt("rag_query_prompt", arguments={
        "query": "What are the key findings?",
        "context_chunks": search_results_text
    })

System Architecture

Document Processing Pipeline

  1. PDF Upload → Text extraction using PyMuPDF/PyPDF2

  2. Text Chunking → Split into overlapping chunks (1000 chars, 200 overlap)

  3. Embedding Generation → Create vector embeddings using SentenceTransformers

  4. Storage → Store in FAISS index with metadata
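The chunking step above can be sketched in a few lines. This is a simplified stand-in for the server's `_create_text_chunks`, shown only to illustrate how the 1000-char window and 200-char overlap interact:

```python
def create_text_chunks(text: str, chunk_size: int = 1000, overlap: int = 200):
    """Split text into overlapping chunks; simplified sketch of the pipeline step."""
    chunks = []
    start = 0
    step = chunk_size - overlap  # advance 800 chars per chunk by default
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks
```

The 200-char overlap means the tail of each chunk is repeated at the head of the next, so a sentence that straddles a boundary is fully contained in at least one chunk.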

Storage Structure

rag_storage/
├── documents/       # Original extracted text
├── chunks/          # Individual text chunks
├── embeddings/      # NumPy arrays of embeddings
├── faiss_index.bin  # FAISS vector index
└── metadata.json    # Document and chunk metadata

Vector Search

  • Model: all-MiniLM-L6-v2 (384-dimensional embeddings)

  • Index: FAISS IndexFlatIP (Inner Product similarity)

  • Search: Cosine similarity for semantic matching (embeddings are L2-normalized, so the inner product computed by IndexFlatIP equals cosine similarity)
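Since IndexFlatIP ranks by inner product, cosine ranking is obtained by L2-normalizing vectors first. A minimal NumPy sketch of that search step (the real system delegates this to FAISS; `cosine_top_k` is an illustration, not part of the codebase):

```python
import numpy as np

def cosine_top_k(query_vec, index_vecs, k=5):
    # L2-normalize so that inner product == cosine similarity
    q = query_vec / np.linalg.norm(query_vec)
    m = index_vecs / np.linalg.norm(index_vecs, axis=1, keepdims=True)
    scores = m @ q                 # one inner product per indexed chunk
    top = np.argsort(-scores)[:k]  # highest-scoring chunk indices first
    return top, scores[top]
```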

Configuration

Chunk Settings

Modify in mcp_server.py:

def _create_text_chunks(text: str, chunk_size: int = 1000, overlap: int = 200):

Embedding Model

Change the model in RAGSystem.__init__():

self.embedding_model = SentenceTransformer('all-MiniLM-L6-v2')

Storage Location

Set custom storage directory:

rag_system = RAGSystem(storage_dir="custom_rag_storage")

API Reference

Tools

upload_pdf

  • Parameters: file_path (str), document_name (optional str)

  • Returns: Document ID, chunk count, success status

search_documents

  • Parameters: query (str), top_k (optional int, default 5)

  • Returns: Ranked list of relevant chunks with scores

list_documents

  • Parameters: None

  • Returns: List of all documents with metadata

delete_document

  • Parameters: document_id (str)

  • Returns: Success status and confirmation message

get_rag_stats

  • Parameters: None

  • Returns: System statistics (documents, chunks, storage size)

Resources

rag://documents

Returns formatted list of all documents in the system.

rag://document/{document_id}

Returns full text content of specified document with metadata header.

rag://stats

Returns formatted system statistics.

Prompts

rag_query_prompt

  • Parameters: query (str), context_chunks (str)

  • Returns: Structured prompt for RAG-based QA

document_summary_prompt

  • Parameters: document_content (str)

  • Returns: Prompt for document summarization

search_suggestions_prompt

  • Parameters: query (str), available_documents (str)

  • Returns: Prompt for generating better search queries

Performance Considerations

Memory Usage

  • Embeddings: ~1.5KB per chunk (384 float32 values)

  • FAISS index: Scales linearly with number of chunks

  • Text storage: Depends on document size and chunking
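The ~1.5KB figure follows directly from the embedding shape (384 float32 values × 4 bytes = 1536 bytes), which lets you estimate raw index size before uploading. A back-of-the-envelope sketch:

```python
EMBED_DIM = 384          # all-MiniLM-L6-v2 output dimension
BYTES_PER_FLOAT32 = 4

def embedding_bytes(num_chunks: int) -> int:
    """Raw embedding storage only; excludes FAISS and metadata overhead."""
    return num_chunks * EMBED_DIM * BYTES_PER_FLOAT32

# One chunk: 384 * 4 = 1536 bytes, i.e. ~1.5 KB
# 100,000 chunks: ~154 MB of raw embeddings
```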

Search Speed

  • FAISS IndexFlatIP: O(n) search time

  • For large collections, consider IndexIVFFlat or IndexHNSW
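The idea behind IndexIVFFlat is to cluster the vectors and scan only the cluster(s) nearest the query rather than all n vectors. A toy NumPy sketch of that principle (not the FAISS API; single cluster probe for brevity):

```python
import numpy as np

def ivf_search(query, vecs, centroids, assignments, k=3):
    # Probe only the cluster whose centroid is nearest the query,
    # trading exact O(n) search for a faster approximate scan.
    nearest = np.argmin(np.linalg.norm(centroids - query, axis=1))
    candidates = np.where(assignments == nearest)[0]
    scores = vecs[candidates] @ query  # inner-product scoring, as in IndexFlatIP
    order = np.argsort(-scores)[:k]
    return candidates[order]
```

FAISS's real implementation probes a configurable number of clusters (`nprobe`) to trade recall against speed.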

Optimization Tips

  1. Batch uploads for multiple documents

  2. Adjust chunk size based on document type

  3. Use GPU with faiss-gpu for large datasets

  4. Implement caching for frequent queries
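Tip 4 can be implemented with the standard library by memoizing results keyed by the query. A minimal sketch, where `run_vector_search` is a hypothetical stand-in for the actual embedding + FAISS lookup:

```python
from functools import lru_cache

def run_vector_search(query: str, top_k: int):
    # Hypothetical stand-in for the real embedding + FAISS lookup
    return [f"chunk-{i}-for-{query}" for i in range(top_k)]

@lru_cache(maxsize=256)
def cached_search(query: str, top_k: int = 5):
    # Results must be hashable (tuple) for lru_cache to store them
    return tuple(run_vector_search(query, top_k))
```

Remember to call `cached_search.cache_clear()` after uploads or deletions, or cached results will go stale.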

Troubleshooting

Common Issues

  1. PDF text extraction fails:

    • Ensure PDF is not password-protected

    • Try different PDF files to isolate the issue

    • Check PyMuPDF and PyPDF2 installation

  2. Memory errors with large documents:

    • Reduce chunk size

    • Process documents in batches

    • Monitor system memory usage

  3. Search returns no results:

    • Verify documents are uploaded successfully

    • Check query similarity to document content

    • Try broader search terms

  4. Server connection issues:

    • Ensure server is running on correct port

    • Check firewall settings

    • Verify MCP client configuration

Debug Mode

Enable detailed logging by modifying the server:

import logging
logging.basicConfig(level=logging.DEBUG)

Contributing

  1. Fork the repository

  2. Create a feature branch

  3. Add tests for new functionality

  4. Submit a pull request

License

This project is licensed under the MIT License.
