retrieval-only-RAG

by rspraneeth

A PDF retrieval tool wrapped as an MCP server. It handles the R in RAG — your IDE agent (Cursor, Kiro, Claude Code) handles generation.

PDFs ──> [ load → chunk → embed → store → retrieve ]
                          │
                 returns matching chunks
                          │
              [ MCP server wraps the retriever ]
                          │
     IDE agent calls it ──┘  →  IDE agent writes the answer

No LLM inside this tool. Embeddings run locally (no cloud key needed).
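
The pipeline above can be illustrated without any external libraries. This is a toy sketch only: it swaps the real embedding model for a bag-of-words vector, splits on sentences instead of using the project's chunker, and keeps the store in memory. Every name in it (`chunk_text`, `embed`, `cosine`, `retrieve`) is hypothetical, not taken from this project.

```python
import math
import re
from collections import Counter


def chunk_text(text: str) -> list[str]:
    """Split text into one chunk per sentence (a simplification)."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(store, query: str, top_k: int = 2):
    """Return the top_k (score, chunk) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, vec), chunk) for chunk, vec in store]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)[:top_k]


# load -> chunk -> embed -> store (an in-memory list stands in for the index)
document = (
    "ArrayList stores elements in a resizable array. "
    "LinkedList stores elements as linked nodes."
)
store = [(chunk, embed(chunk)) for chunk in chunk_text(document)]

# retrieve: matching chunks go back to the caller; no answer is generated here
results = retrieve(store, "How does LinkedList store elements?", top_k=1)
for score, chunk in results:
    print(f"{score:.3f}  {chunk}")
```

The sketch stops at returning chunks, which mirrors the split described here: writing the answer is left to whoever called the retriever.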


Setup

python -m venv .venv
.venv\Scripts\activate        # Windows
source .venv/bin/activate     # macOS / Linux
pip install -r requirements.txt

Usage

Index your PDFs — drop PDF files into pdfs/ then run:

python -m pdf_rag.cli index

Search — retrieve the top-k chunks for a question:

python -m pdf_rag.cli search "What is the difference between ArrayList and LinkedList?"

Output includes source filename, page number, and similarity score for each chunk.
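
The exact output layout is not shown here, so the following is only a hypothetical sketch of how a result carrying those three fields could be rendered. The `Hit` dataclass, the `format_hit` helper, and the sample values are assumptions made for illustration, not the tool's actual code or output.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One retrieved chunk plus the metadata named above (hypothetical shape)."""
    source: str   # PDF filename
    page: int     # page number within that PDF
    score: float  # similarity score; higher means a closer match
    text: str     # the chunk itself


def format_hit(rank: int, hit: Hit) -> str:
    """Render one hit as a header line followed by the chunk text."""
    header = f"[{rank}] {hit.source} (page {hit.page}, score {hit.score:.3f})"
    return f"{header}\n{hit.text}"


# Placeholder values for illustration, not real output from the tool.
hits = [
    Hit("example.pdf", 3, 0.8123, "First placeholder chunk."),
    Hit("example.pdf", 7, 0.7011, "Second placeholder chunk."),
]
report = "\n\n".join(format_hit(i, h) for i, h in enumerate(hits, start=1))
print(report)
```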


MCP Server

Exposes one tool — search_pdfs(query) — that any MCP-compatible IDE agent can call.

python mcp_server.py

Cursor config (.cursor/mcp.json)

{
  "mcpServers": {
    "pdf-rag": {
      "command": "path/to/.venv/Scripts/python.exe",
      "args": ["path/to/mcp_server.py"],
      "cwd": "path/to/project"
    }
  }
}

Once connected, ask your IDE agent a question about your PDFs — it calls search_pdfs, gets the chunks, and writes the answer. You own retrieval; the agent owns generation.
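
On the wire, that call is a JSON-RPC 2.0 `tools/call` request as defined by the MCP specification. The sketch below builds such a request for `search_pdfs` and pulls the text out of a result message. The helper names (`build_tool_call`, `extract_text`) are hypothetical, and the sample response is a hand-written placeholder, not something captured from this server.

```python
import json


def build_tool_call(request_id: int, query: str) -> str:
    """Serialize an MCP `tools/call` request for the search_pdfs tool."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_pdfs", "arguments": {"query": query}},
    })


def extract_text(response_json: str) -> str:
    """Join the text items of a tool result; raise if the call failed."""
    message = json.loads(response_json)
    if "error" in message:
        raise RuntimeError(message["error"].get("message", "unknown error"))
    result = message["result"]
    if result.get("isError"):
        raise RuntimeError("tool reported an error")
    return "\n".join(
        item["text"] for item in result["content"] if item.get("type") == "text"
    )


request = build_tool_call(1, "ArrayList vs LinkedList")

# Hand-written placeholder response, shaped like an MCP tool result.
sample_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "placeholder chunk"}]},
})
chunks_text = extract_text(sample_response)
```

In normal use the IDE agent's MCP client produces and consumes these messages itself; the sketch is only meant to show what "calls search_pdfs, gets the chunks" amounts to at the protocol level.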


Configuration (config.yaml)

pdf_folder: pdfs              # folder to scan for PDFs
vector_store: vector_store    # where the index is persisted
embedding_model: BAAI/bge-small-en-v1.5   # local HuggingFace model
top_k: 5                      # chunks returned per query
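
As a rough illustration of what validating these four keys might involve, here is a dependency-free sketch. The `Config` dataclass and `validate_config` function are hypothetical names, and the rules (non-empty strings, positive integer `top_k`) are assumptions; the project's own config.py may check different things.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    pdf_folder: str
    vector_store: str
    embedding_model: str
    top_k: int


def validate_config(raw: dict) -> Config:
    """Check the four documented keys and return a typed Config."""
    for key in ("pdf_folder", "vector_store", "embedding_model"):
        value = raw.get(key)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"{key} must be a non-empty string")
    top_k = raw.get("top_k")
    # bool is a subclass of int, so reject it explicitly
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        raise ValueError("top_k must be a positive integer")
    return Config(
        raw["pdf_folder"], raw["vector_store"], raw["embedding_model"], top_k
    )


# In practice `raw` would come from parsing config.yaml (for example with
# PyYAML's yaml.safe_load); a literal dict keeps this sketch dependency-free.
raw = {
    "pdf_folder": "pdfs",
    "vector_store": "vector_store",
    "embedding_model": "BAAI/bge-small-en-v1.5",
    "top_k": 5,
}
config = validate_config(raw)
```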

Project structure

pdf_rag/
  config.py      # load + validate config.yaml
  indexer.py     # PDF loading, chunking, embedding, persistence
  retriever.py   # similarity search + result formatting
  cli.py         # index / search commands
mcp_server.py    # MCP wrapper exposing search_pdfs()
config.yaml
requirements.txt
pdfs/            # drop your PDFs here (not committed)
vector_store/    # persisted index (not committed)

How the RAG split works

Layer          Who does it     How
Retrieval      This tool       LlamaIndex + local embeddings
Augmentation   MCP protocol    Retrieved chunks injected into agent context
Generation     IDE agent       Cursor / Kiro / Claude Code answers from chunks

The MCP server is editor-agnostic — swap Cursor for Kiro (or any MCP client) by changing only the connection config, no code changes needed.
