MCP RAG System
A comprehensive Retrieval-Augmented Generation (RAG) system built using the Model Context Protocol (MCP) for storing, processing, and searching PDF documents.
Features
Tools
upload_pdf: Upload and process PDF files with automatic text extraction and chunking
search_documents: Semantic search across all uploaded documents using vector embeddings
list_documents: View all uploaded documents and their metadata
delete_document: Remove documents and their associated chunks from the system
get_rag_stats: Get comprehensive statistics about the RAG system
Resources
rag://documents: List all documents in the system
rag://document/{document_id}: Get full content of a specific document
rag://stats: Get system statistics
Prompts
rag_query_prompt: Generate prompts for RAG-based question answering
document_summary_prompt: Create document summarization prompts
search_suggestions_prompt: Generate better search query suggestions
Installation
Install dependencies:
```shell
pip install -r requirements.txt
```

Download required models: the system will automatically download the sentence-transformers model on first use.
Usage
Starting the Server
The server will start on http://localhost:8000 with SSE (Server-Sent Events) transport.
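Assuming the entry point is mcp_server.py (the file referenced in the Configuration section), the server can typically be launched with:

```shell
python mcp_server.py
```

The exact invocation may differ if the project defines a console script or wraps the server in another runner.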
Using the Client
Demo Mode
Interactive Mode
Available commands in interactive mode:
upload - Upload a PDF file
search - Search documents with a query
list - List all uploaded documents
stats - Show system statistics
quit - Exit the client
Example Workflow
Upload a PDF:

```python
# Via tool call
result = await session.call_tool("upload_pdf", arguments={
    "file_path": "/path/to/document.pdf",
    "document_name": "My Research Paper"
})
```

Search documents:

```python
# Via tool call
result = await session.call_tool("search_documents", arguments={
    "query": "machine learning applications",
    "top_k": 5
})
```

Use the RAG prompt:

```python
# Get search results first, then use them in the prompt
prompt = await session.get_prompt("rag_query_prompt", arguments={
    "query": "What are the key findings?",
    "context_chunks": search_results_text
})
```
System Architecture
Document Processing Pipeline
PDF Upload → Text extraction using PyMuPDF/PyPDF2
Text Chunking → Split into overlapping chunks (1000 chars, 200 overlap)
Embedding Generation → Create vector embeddings using SentenceTransformers
Storage → Store in FAISS index with metadata
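The chunking step above can be sketched as follows. This is a minimal illustration of 1000-character chunks with 200-character overlap; the function name and exact loop are assumptions, not the server's actual implementation:

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200) -> list[str]:
    """Split text into overlapping chunks (illustrative sketch)."""
    chunks = []
    step = chunk_size - overlap  # advance 800 characters per chunk
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap ensures that a sentence falling on a chunk boundary still appears whole in at least one chunk, which improves retrieval recall.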
Storage Structure
Vector Search
Model: all-MiniLM-L6-v2 (384-dimensional embeddings)
Index: FAISS IndexFlatIP (inner-product similarity)
Search: Cosine similarity for semantic matching
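Inner product and cosine similarity coincide when the embeddings are L2-normalized before indexing, which is why an IndexFlatIP can serve cosine search. A small pure-Python illustration (not the server's code):

```python
import math

def l2_normalize(v: list[float]) -> list[float]:
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a: list[float], b: list[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def cosine(a: list[float], b: list[float]) -> float:
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

a, b = [3.0, 4.0], [4.0, 3.0]
# Inner product of the normalized vectors equals cosine of the raw vectors.
ip = dot(l2_normalize(a), l2_normalize(b))
```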
Configuration
Chunk Settings
Modify in mcp_server.py:
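A sketch of what those settings might look like (the variable names are assumptions; check mcp_server.py for the actual identifiers, only the default values come from this document):

```python
# Chunking parameters (names illustrative; values are the documented defaults)
CHUNK_SIZE = 1000     # characters per chunk
CHUNK_OVERLAP = 200   # characters shared between adjacent chunks
```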
Embedding Model
Change the model in RAGSystem.__init__():
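For example, any sentence-transformers model can be swapped in (a sketch; note that changing models changes the embedding dimension, so the FAISS index must be rebuilt afterwards):

```python
from sentence_transformers import SentenceTransformer

# Default model (384-dim). A larger model such as "all-mpnet-base-v2"
# (768-dim) trades speed for quality; re-index after switching.
embedding_model = SentenceTransformer("all-MiniLM-L6-v2")
```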
Storage Location
Set custom storage directory:
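The parameter name below is purely hypothetical — check the RAGSystem constructor in mcp_server.py for the actual argument:

```python
from pathlib import Path

# Hypothetical constructor argument; adapt to the real RAGSystem signature.
rag = RAGSystem(storage_dir=Path.home() / "rag_storage")
```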
API Reference
Tools
upload_pdf
Parameters: file_path (str), document_name (optional str)
Returns: Document ID, chunk count, success status
search_documents
Parameters: query (str), top_k (optional int, default 5)
Returns: Ranked list of relevant chunks with scores
list_documents
Parameters: None
Returns: List of all documents with metadata
delete_document
Parameters: document_id (str)
Returns: Success status and confirmation message
get_rag_stats
Parameters: None
Returns: System statistics (documents, chunks, storage size)
Resources
rag://documents
Returns formatted list of all documents in the system.
rag://document/{document_id}
Returns full text content of specified document with metadata header.
rag://stats
Returns formatted system statistics.
Prompts
rag_query_prompt
Parameters: query (str), context_chunks (str)
Returns: Structured prompt for RAG-based QA
document_summary_prompt
Parameters: document_content (str)
Returns: Prompt for document summarization
search_suggestions_prompt
Parameters: query (str), available_documents (str)
Returns: Prompt for generating better search queries
Performance Considerations
Memory Usage
Embeddings: ~1.5KB per chunk (384 float32 values)
FAISS index: Scales linearly with number of chunks
Text storage: Depends on document size and chunking
Search Speed
FAISS IndexFlatIP: O(n) search time
For large collections, consider IndexIVFFlat or IndexHNSW
Optimization Tips
Batch uploads for multiple documents
Adjust chunk size based on document type
Use GPU with faiss-gpu for large datasets
Implement caching for frequent queries
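The caching tip can be as simple as memoizing the query-embedding step with the standard library. A hypothetical sketch — in the real system the function body would call the SentenceTransformer model rather than the placeholder below:

```python
from functools import lru_cache

@lru_cache(maxsize=256)
def embed_query(query: str) -> tuple[float, ...]:
    # Placeholder "embedding"; the real system would call its
    # SentenceTransformer here. A tuple keeps the result hashable/immutable.
    return tuple(float(ord(c)) for c in query)

embed_query("machine learning")   # computed
embed_query("machine learning")   # served from cache
```

Repeated identical queries then skip the (comparatively expensive) embedding step entirely.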
Troubleshooting
Common Issues
PDF text extraction fails:
Ensure PDF is not password-protected
Try different PDF files to isolate the issue
Check PyMuPDF and PyPDF2 installation
Memory errors with large documents:
Reduce chunk size
Process documents in batches
Monitor system memory usage
Search returns no results:
Verify documents are uploaded successfully
Check query similarity to document content
Try broader search terms
Server connection issues:
Ensure server is running on correct port
Check firewall settings
Verify MCP client configuration
Debug Mode
Enable detailed logging by modifying the server:
Contributing
Fork the repository
Create a feature branch
Add tests for new functionality
Submit a pull request
License
This project is licensed under the MIT License.