
RooCode-RAG-Lookup

RooCode MCP Server for performing RAG (Retrieval-Augmented Generation) lookups in documents and code repositories using vector embeddings and semantic search.

Example Usage

Ask a question, e.g. "What is the maximum number of entries* in a Word document?", and tell the LLM to "use rag" in the prompt. The LLM is usually a decent judge of when to use a tool and may decide to call it on its own.

*This is related to the maximum number of XML properties and elements addressable in Word

Features

  • Full RAG Implementation: Complete vector-based semantic search using ChromaDB and Haystack

  • Document Indexing: Automatic text extraction and chunking from PDF documents

  • Vector Embeddings: Sentence transformer embeddings for semantic similarity

  • RAG Lookup Tool: Search through documents and code repositories with relevance scoring

  • Test Tool: Simple hello world tool to verify MCP server connectivity

  • Async MCP Protocol: Full JSON-RPC 2.0 support via stdio

Installation

  1. Install Python dependencies:

pip install -r requirements.txt
  2. Configure RooCode to use this MCP server by adding the configuration from mcp_config.json to your RooCode settings.

Configuration

  1. Add the contents of mcp_config.json to your RooCode MCP server settings, using the option to edit global settings in the MCP tools section. When the tool is ready to use, it shows a green status.

  2. Set the following environment variables:

    • RAG_LOOKUP_PATH: Path to this project directory

    • PYTHON_PATH: Path to your Python executable

  3. Configure parameters in parameters.py:

    • EMBEDDING_MODEL: Sentence transformer model (default: all-mpnet-base-v2)

    • COLLECTION_NAME: ChromaDB collection name

    • CHUNK_SIZE: Text chunk size in words (default: 500)

    • CHUNK_OVERLAP: Overlap between chunks (default: 50)

    • DEFAULT_TOP_K: Number of results to return (default: 5)
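As an illustration of how these settings interact, here is a minimal sketch of what parameters.py might contain, with a pure-Python stand-in for word-based chunking. The COLLECTION_NAME value and the chunk_words helper are hypothetical, and the overlap is assumed to be measured in words like the chunk size; the project itself does its splitting with Haystack, so this only shows the idea.

```python
# Hypothetical sketch of parameters.py; only the documented defaults are real.
EMBEDDING_MODEL = "all-mpnet-base-v2"
COLLECTION_NAME = "rag_documents"  # placeholder name, not taken from the project
CHUNK_SIZE = 500     # words per chunk
CHUNK_OVERLAP = 50   # words shared with the previous chunk (unit assumed)
DEFAULT_TOP_K = 5


def chunk_words(text, size=CHUNK_SIZE, overlap=CHUNK_OVERLAP):
    """Split text into chunks of `size` words that overlap by `overlap` words."""
    if not 0 <= overlap < size:
        raise ValueError("overlap must be non-negative and smaller than size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

With the defaults, each window advances by 450 words, so a larger CHUNK_OVERLAP gives more chunks and more repeated context per document.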

Available Tools

1. rag_lookup

Perform semantic search using RAG in documents and code repositories. Returns relevant chunks with similarity scores and metadata.

Parameters:

  • query (required): The search query

  • source (optional): Where to search - "documents", "repos", or "both" (default: "both")

Returns:

  • Relevant text chunks with similarity scores

  • Source file information and metadata

  • Statistics on documents searched

Example:

{ "query": "authentication implementation", "source": "both" }

Response Format:

{
  "status": "success",
  "query": "authentication implementation",
  "results": [
    {
      "content": "...",
      "score": 0.85,
      "metadata": {
        "file_name": "document.txt",
        "source_file": "/path/to/document.txt"
      }
    }
  ],
  "metadata": {
    "documents_searched": 5,
    "repos_searched": 3,
    "total_matches": 5
  }
}
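A client consuming this response might filter and order the results by score. The sketch below assumes the response arrives as a JSON string shaped like the sample above; best_matches and the min_score threshold are hypothetical names chosen for illustration, not part of the project.

```python
import json

# Sample payload copied from the documented response format.
SAMPLE_RESPONSE = """
{
  "status": "success",
  "query": "authentication implementation",
  "results": [
    {"content": "...", "score": 0.85,
     "metadata": {"file_name": "document.txt",
                  "source_file": "/path/to/document.txt"}}
  ],
  "metadata": {"documents_searched": 5, "repos_searched": 3, "total_matches": 5}
}
"""


def best_matches(response_text, min_score=0.5):
    """Return (score, file_name, content) tuples at or above min_score, best first."""
    payload = json.loads(response_text)
    if payload.get("status") != "success":
        return []
    hits = [r for r in payload.get("results", [])
            if r.get("score", 0.0) >= min_score]
    hits.sort(key=lambda r: r.get("score", 0.0), reverse=True)
    return [(r.get("score", 0.0),
             r.get("metadata", {}).get("file_name"),
             r.get("content", ""))
            for r in hits]


for score, file_name, content in best_matches(SAMPLE_RESPONSE):
    print(f"{score:.2f}  {file_name}  {content}")
```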

2. say_hello

Simple test tool that returns a greeting message with a timestamp.

Parameters:

  • name (optional): Name to include in greeting (default: "World")

Example:

{ "name": "RooCode" }

Usage

1. Extract and Index Documents

Place PDF documents in the Documents/ or Repos/ folders, then run:

# Extract text from PDFs
python extraction/parse_pdf.py

# Populate the vector database
python extraction/populate_database.py

2. Query the RAG System

# Test RAG lookup directly
python query_rag.py

Or ask

3. Use via MCP Server

Once configured in RooCode, use the rag_lookup tool through the MCP interface. The MCP menu in RooCode settings has an option to edit the global settings, which opens a JSON file containing {"mcpServers":{}}. Copy and paste the contents of mcp_config.json into these global MCP settings.
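For orientation, the following sketch builds a settings object in the common mcpServers shape (a command, its args, and an env map per server). The server name "rag-lookup", the build_mcp_settings helper, and the choice to launch mcp_tool.py directly are assumptions for illustration; the mcp_config.json shipped with the project is the authoritative version and may differ, for example by using run_rag_lookup.bat.

```python
import json
import os
from pathlib import Path


def build_mcp_settings(project_dir, python_exe, server_name="rag-lookup"):
    """Build a hypothetical global MCP settings dict for this server."""
    entry = {
        "command": python_exe,
        "args": [str(Path(project_dir) / "mcp_tool.py")],
        # The two environment variables named in the Configuration section.
        "env": {"RAG_LOOKUP_PATH": str(project_dir), "PYTHON_PATH": python_exe},
    }
    return {"mcpServers": {server_name: entry}}


settings = build_mcp_settings(
    os.environ.get("RAG_LOOKUP_PATH", "."),
    os.environ.get("PYTHON_PATH", "python"),
)
print(json.dumps(settings, indent=2))
```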

Testing

Test the MCP server locally:

# Using MCP inspector
npx @modelcontextprotocol/inspector python mcp_tool.py

# Direct stdio test
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list"}' | python mcp_tool.py
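The messages used in the stdio test are plain JSON-RPC 2.0 requests, so they can also be built programmatically. This is a small sketch with a hypothetical jsonrpc_request helper. It only constructs the message strings and does not start the server. The tools/call shape (a tool name plus an arguments object) follows the MCP protocol, while the argument values come from the rag_lookup example above.

```python
import json
from itertools import count

_ids = count(1)  # simple incrementing request ids


def jsonrpc_request(method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single-line string."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    return json.dumps(message)


list_tools = jsonrpc_request("tools/list")
call_lookup = jsonrpc_request(
    "tools/call",
    {
        "name": "rag_lookup",
        "arguments": {"query": "authentication implementation", "source": "both"},
    },
)

print(list_tools)
print(call_lookup)
```

Each line can be written to the server's stdin. MCP servers normally expect an initialize handshake before other requests, so a server may reject these messages if they are sent on their own.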

Project Structure

RooCode-RAG-Lookup/
├── mcp_tool.py                  # Main MCP server implementation
├── query_rag.py                 # RAG query functions
├── parameters.py                # Configuration parameters
├── run_rag_lookup.bat           # Windows batch launcher
├── mcp_config.json              # Example RooCode configuration
├── requirements.txt             # Python dependencies
├── extraction/
│   ├── parse_pdf.py             # PDF text extraction
│   └── populate_database.py     # Database population and indexing
├── ExtractedText/               # Extracted text files (.txt + .meta.json)
├── chroma_db/                   # ChromaDB vector database
└── README.md                    # This file

Technology Stack

  • MCP Python SDK: Protocol implementation for RooCode integration

  • Haystack: Document processing and RAG pipeline framework

  • ChromaDB: Vector database for embeddings storage

  • Sentence Transformers: Semantic embeddings (all-mpnet-base-v2)

  • PDFPlumber: PDF text extraction with layout preservation

  • Async/Await: Concurrent request handling

  • JSON-RPC 2.0: Communication protocol

  • Stdio Transport: RooCode integration

How It Works

  1. Document Extraction: PDFs are parsed by parse_pdf.py, which extracts text and metadata

  2. Text Chunking: Documents are split into overlapping chunks using DocumentSplitter

  3. Embedding Generation: Text chunks are converted to 768-dimensional vectors using sentence transformers

  4. Vector Storage: Embeddings are stored in ChromaDB with metadata for retrieval

  5. Semantic Search: Queries are embedded and matched against stored vectors using cosine similarity

  6. Result Ranking: Top-K most relevant chunks are returned with scores and metadata
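Steps 5 and 6 can be sketched in plain Python. The example below uses toy 3-dimensional vectors in place of the 768-dimensional embeddings and an in-memory list in place of ChromaDB. The cosine_similarity and top_k helpers are illustrative names only, since the project delegates this work to ChromaDB and Haystack.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=5):
    """Rank (vector, metadata) pairs by similarity to the query and keep the best k."""
    scored = [(cosine_similarity(query_vec, vec), meta) for vec, meta in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy "database": each entry is (embedding, source file name).
stored = [
    ([1.0, 0.0, 0.0], "a.txt"),
    ([0.0, 1.0, 0.0], "b.txt"),
    ([1.0, 1.0, 0.0], "c.txt"),
]

for score, name in top_k([1.0, 0.0, 0.0], stored, k=2):
    print(f"{score:.3f}  {name}")
```

Because cosine similarity ignores vector length, c.txt still ranks second here: it points partly in the query's direction even though its magnitude differs.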

Requirements

See requirements.txt for full dependencies. Key packages:

  • mcp>=1.0.0 - MCP protocol support

  • haystack-ai - RAG framework

  • chroma-haystack - ChromaDB integration

  • sentence-transformers - Embedding models

  • pdfplumber - PDF extraction

License

MIT

