Performs RAG (Retrieval-Augmented Generation) lookups using vector embeddings and semantic search to query documents and code repositories, with support for PDF text extraction, ChromaDB vector storage, and relevance-scored results.
RooCode-RAG-Lookup
RooCode MCP Server for performing RAG (Retrieval-Augmented Generation) lookups in documents and code repositories using vector embeddings and semantic search.
Example Usage
Ask a question, e.g. "What is the maximum number of entries* in a Word document?", and add "use rag" to the prompt. The LLM is usually a decent judge of when to use a tool and may decide to call it on its own.
*This refers to the maximum number of XML properties and elements addressable in Word.
Features
Full RAG Implementation: Complete vector-based semantic search using ChromaDB and Haystack
Document Indexing: Automatic text extraction and chunking from PDF documents
Vector Embeddings: Sentence transformer embeddings for semantic similarity
RAG Lookup Tool: Search through documents and code repositories with relevance scoring
Test Tool: Simple hello world tool to verify MCP server connectivity
Async MCP Protocol: Full JSON-RPC 2.0 support via stdio
Installation
Install Python dependencies:
pip install -r requirements.txt
Configure RooCode to use this MCP server by adding the configuration from mcp_config.json to your RooCode settings.
Configuration
Add mcp_config.json to your RooCode MCP server settings, in the global settings editor of the MCP tools menu. When the tool is ready to use, it shows a green status.
Set the following environment variables:
RAG_LOOKUP_PATH: Path to this project directory
PYTHON_PATH: Path to your Python executable
Configure parameters in parameters.py:
EMBEDDING_MODEL: Sentence transformer model (default: all-mpnet-base-v2)
COLLECTION_NAME: ChromaDB collection name
CHUNK_SIZE: Text chunk size in words (default: 500)
CHUNK_OVERLAP: Overlap between chunks (default: 50)
DEFAULT_TOP_K: Number of results to return (default: 5)
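To make the interaction between CHUNK_SIZE and CHUNK_OVERLAP concrete, here is a minimal sketch of word-based chunking with overlap. It is an illustration only: the function name chunk_words is hypothetical, and the project itself relies on Haystack's DocumentSplitter rather than this code.

```python
CHUNK_SIZE = 500     # words per chunk (default listed for parameters.py)
CHUNK_OVERLAP = 50   # words shared between consecutive chunks


def chunk_words(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Split text into chunks of `size` words; each chunk repeats the
    last `overlap` words of the previous one. Hypothetical helper."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


# A synthetic 1,000-word document: "0 1 2 ... 999"
document = " ".join(str(i) for i in range(1000))
chunks = chunk_words(document)
```

With the defaults, the window advances 450 words at a time, so this 1,000-word document produces three chunks starting at words 0, 450, and 900. A larger overlap preserves more context across chunk boundaries at the cost of storing more near-duplicate text.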
Available Tools
1. rag_lookup
Perform semantic search using RAG in documents and code repositories. Returns relevant chunks with similarity scores and metadata.
Parameters:
query (required): The search query
source (optional): Where to search - "documents", "repos", or "both" (default: "both")
Returns:
Relevant text chunks with similarity scores
Source file information and metadata
Statistics on documents searched
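A client consuming these return values might look like the sketch below. It assumes the response is a JSON object with a results list whose items carry content, score, and metadata.file_name, as in the response format documented for this tool, and that a higher score means a closer match. The helper name top_matches and the 0.5 threshold are hypothetical.

```python
import json


def top_matches(response_text, min_score=0.5):
    """Return (file_name, score, content) tuples for results scoring at
    least min_score, best first. Assumes higher score = more relevant."""
    response = json.loads(response_text)
    if response.get("status") != "success":
        return []
    hits = [
        (r.get("metadata", {}).get("file_name", "unknown"), r["score"], r["content"])
        for r in response.get("results", [])
        if r["score"] >= min_score
    ]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Made-up sample response, shaped like the documented format
sample = json.dumps({
    "status": "success",
    "query": "authentication implementation",
    "results": [
        {"content": "low relevance text", "score": 0.30,
         "metadata": {"file_name": "notes.txt"}},
        {"content": "high relevance text", "score": 0.85,
         "metadata": {"file_name": "document.txt"}},
    ],
})

for file_name, score, content in top_matches(sample):
    print(f"{score:.2f} {file_name}: {content}")
```

Filtering on the score on the client side is a design choice, not something the server requires; the right threshold depends on the embedding model and the collection.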
Example:
{
  "query": "authentication implementation",
  "source": "both"
}
Response Format:
{
  "status": "success",
  "query": "authentication implementation",
  "results": [
    {
      "content": "...",
      "score": 0.85,
      "metadata": {
        "file_name": "document.txt",
        "source_file": "/path/to/document.txt"
      }
    }
  ],
  "metadata": {
    "documents_searched": 5,
    "repos_searched": 3,
    "total_matches": 5
  }
}
2. say_hello
Simple test tool that returns a greeting message with timestamp.
Parameters:
name (optional): Name to include in greeting (default: "World")
Example:
{
  "name": "RooCode"
}
Usage
1. Extract and Index Documents
Place PDF documents in the Documents/ or Repos/ folders, then run:
# Extract text from PDFs
python extraction/parse_pdf.py
# Populate the vector database
python extraction/populate_database.py
2. Query the RAG System
# Test RAG lookup directly
python query_rag.py
Or ask
3. Use via MCP Server
Once configured in RooCode, use the rag_lookup tool through the MCP interface. RooCode's settings include an MCP menu; editing the global settings there opens a JSON file containing {"mcpServers":{}}. Copy the contents of mcp_config.json into these global MCP settings.
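If you prefer to script the copy-and-paste step, the merge can be sketched as below. This assumes that mcp_config.json has a top-level "mcpServers" object like the global settings do. The server name "rag-lookup" and its command and args values are hypothetical placeholders, not the contents of the real mcp_config.json.

```python
import json


def merge_mcp_config(global_settings, server_config):
    """Return a copy of global_settings with every server listed under
    server_config["mcpServers"] added (or overwritten by name)."""
    merged = dict(global_settings)
    servers = dict(merged.get("mcpServers", {}))
    servers.update(server_config.get("mcpServers", {}))
    merged["mcpServers"] = servers
    return merged


# Empty global settings, as RooCode presents them initially
global_settings = {"mcpServers": {}}

# Hypothetical stand-in for the contents of mcp_config.json
server_config = {
    "mcpServers": {
        "rag-lookup": {"command": "python", "args": ["mcp_tool.py"]}
    }
}

merged = merge_mcp_config(global_settings, server_config)
print(json.dumps(merged, indent=2))
```

Merging by server name leaves any other MCP servers already present in the global settings untouched.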
Testing
Test the MCP server locally:
# Using MCP inspector
npx @modelcontextprotocol/inspector python mcp_tool.py
# Direct stdio test
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | python mcp_tool.py
Project Structure
RooCode-RAG-Lookup/
├── mcp_tool.py # Main MCP server implementation
├── query_rag.py # RAG query functions
├── parameters.py # Configuration parameters
├── run_rag_lookup.bat # Windows batch launcher
├── mcp_config.json # Example RooCode configuration
├── requirements.txt # Python dependencies
├── extraction/
│ ├── parse_pdf.py # PDF text extraction
│ └── populate_database.py # Database population and indexing
├── ExtractedText/ # Extracted text files (.txt + .meta.json)
├── chroma_db/ # ChromaDB vector database
└── README.md # This file
Technology Stack
MCP Python SDK: Protocol implementation for RooCode integration
Haystack: Document processing and RAG pipeline framework
ChromaDB: Vector database for embeddings storage
Sentence Transformers: Semantic embeddings (all-mpnet-base-v2)
PDFPlumber: PDF text extraction with layout preservation
Async/Await: Concurrent request handling
JSON-RPC 2.0: Communication protocol
Stdio Transport: RooCode integration
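To illustrate the JSON-RPC 2.0 messages exchanged over stdio, the sketch below builds a tools/list request and a tools/call request for rag_lookup. The helper make_request is hypothetical. The params layout (name plus arguments) follows the MCP tools/call convention, and a full MCP session normally begins with an initialize exchange, which is omitted here.

```python
import json


def make_request(request_id, method, params=None):
    """Serialize a JSON-RPC 2.0 request as a single line of JSON."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


# Ask the server which tools it exposes
list_tools = make_request(1, "tools/list")

# Invoke rag_lookup with the arguments documented for that tool
call_rag = make_request(2, "tools/call", {
    "name": "rag_lookup",
    "arguments": {"query": "authentication implementation", "source": "both"},
})

# Over the stdio transport, each message is written as one line
print(list_tools)
print(call_rag)
```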
How It Works
Document Extraction: PDFs are parsed using parse_pdf.py, which extracts text and metadata
Text Chunking: Documents are split into overlapping chunks using DocumentSplitter
Embedding Generation: Text chunks are converted to 768-dimensional vectors using sentence transformers
Vector Storage: Embeddings are stored in ChromaDB with metadata for retrieval
Semantic Search: Queries are embedded and matched against stored vectors using cosine similarity
Result Ranking: Top-K most relevant chunks are returned with scores and metadata
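The search and ranking steps above can be sketched in plain Python. This is a toy illustration, not the project's code: in the real pipeline ChromaDB performs the vector search, the vectors have 768 dimensions rather than 3, and the names cosine_similarity, top_k, and the stored chunk ids are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=5):
    """Return up to k (score, chunk_id) pairs, most similar first."""
    scored = [(cosine_similarity(query_vec, vec), chunk_id)
              for chunk_id, vec in stored.items()]
    scored.sort(reverse=True)
    return scored[:k]


# Toy 3-dimensional "embeddings" keyed by made-up chunk ids
stored = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.0, 1.0, 0.0],
    "chunk-c": [1.0, 1.0, 0.0],
}
query = [1.0, 0.0, 0.0]

ranked = top_k(query, stored, k=2)
for score, chunk_id in ranked:
    print(f"{chunk_id}: {score:.3f}")
```

Because cosine similarity depends only on the angle between vectors, chunk-c (which points partly in the query's direction) ranks above chunk-b (which is orthogonal to it) regardless of vector length.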
Requirements
See requirements.txt for full dependencies. Key packages:
mcp>=1.0.0 - MCP protocol support
haystack-ai - RAG framework
chroma-haystack - ChromaDB integration
sentence-transformers - Embedding models
pdfplumber - PDF extraction
License
MIT