retrieval-only-RAG
A PDF retrieval tool wrapped as an MCP server. It handles the R in RAG — your IDE agent (Cursor, Kiro, Claude Code) handles generation.
PDFs ──> [ load → chunk → embed → store → retrieve ]
                              │
                   returns matching chunks
                              │
             [ MCP server wraps the retriever ]
                              │
         IDE agent calls it ──┘ → IDE agent writes the answer

No LLM inside this tool. Embeddings run locally (no cloud key needed).
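The five stages in the diagram can be sketched in plain Python. This is only an illustration of the flow, not the project's code: the real tool uses LlamaIndex with a local embedding model, whereas this sketch swaps in a bag-of-words count vector and an in-memory list as the store. All function names here (chunk_text, embed, build_index, retrieve) are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, size=8, overlap=2):
    """Chunk stage: split text into overlapping windows of `size` words."""
    words = text.split()
    if not words:
        return []
    step = size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + size]) for i in range(0, last_start, step)]


def embed(text):
    """Embed stage (toy stand-in): lowercase token counts, not a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(pages):
    """Load + store stages: `pages` is an iterable of (source, page, text)."""
    index = []
    for source, page, text in pages:
        for chunk in chunk_text(text):
            index.append({"source": source, "page": page,
                          "text": chunk, "vector": embed(chunk)})
    return index


def retrieve(index, query, top_k=5):
    """Retrieve stage: return the top_k chunks ranked by similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, rec["vector"]), rec) for rec in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [{"source": rec["source"], "page": rec["page"],
             "text": rec["text"], "score": score}
            for score, rec in scored[:top_k]]


if __name__ == "__main__":
    pages = [
        ("manual.pdf", 1, "Install the package with pip before running the indexer"),
        ("manual.pdf", 2, "The retriever returns the chunks most similar to the query"),
    ]
    for hit in retrieve(build_index(pages), "how do I install the package", top_k=1):
        print(hit["source"], hit["page"], hit["text"])
```

Note that nothing in the sketch generates an answer. It only returns ranked chunks, which is the part of RAG this tool covers.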
Setup
python -m venv .venv
.venv\Scripts\activate # Windows
pip install -r requirements.txt
Usage
Index your PDFs. Drop PDF files into pdfs/, then run:
python -m pdf_rag.cli index
Search. Retrieve the top-k chunks for a question:
python -m pdf_rag.cli search "What is the difference between ArrayList and LinkedList?"
Output includes the source filename, page number, and similarity score for each chunk.
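As a rough illustration of what such output might look like, the sketch below renders one line per chunk with the three fields mentioned above. The exact layout the CLI prints is not shown in this README, so the format, the field names, and the format_hits helper are all assumptions.

```python
def format_hits(hits):
    """Render retrieved chunks as text: rank, source file, page, score, then the chunk.

    Each hit is assumed to be a dict with 'source', 'page', 'score' and 'text' keys.
    """
    lines = []
    for rank, hit in enumerate(hits, start=1):
        lines.append(
            f"[{rank}] {hit['source']} (page {hit['page']}) score={hit['score']:.3f}"
        )
        lines.append(f"    {hit['text']}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [{"source": "java.pdf", "page": 12, "score": 0.8123,
               "text": "ArrayList stores elements in a resizable array."}]
    print(format_hits(sample))
```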
MCP Server
Exposes one tool — search_pdfs(query) — that any MCP-compatible IDE agent can call.
python mcp_server.py
Cursor config (.cursor/mcp.json)
{
  "mcpServers": {
    "pdf-rag": {
      "command": "path/to/.venv/Scripts/python.exe",
      "args": ["path/to/mcp_server.py"],
      "cwd": "path/to/project"
    }
  }
}
Once connected, ask your IDE agent a question about your PDFs. It calls search_pdfs, gets the chunks, and writes the answer. You own retrieval; the agent owns generation.
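Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds such a request for search_pdfs and reads the text items out of a result. In practice the IDE agent and the MCP SDK produce and parse these messages for you, so this is only meant to show the shape of the exchange. The helper names build_tool_call and extract_text are hypothetical, and the sample response is made up for the demonstration.

```python
import json


def build_tool_call(query, request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' request for the search_pdfs tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_pdfs", "arguments": {"query": query}},
    }


def extract_text(response):
    """Join the text items of a tool result; raise if the server flagged an error."""
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("search_pdfs reported an error")
    return "\n".join(
        item["text"] for item in result["content"] if item.get("type") == "text"
    )


if __name__ == "__main__":
    request = build_tool_call("What does the manual say about installation steps?")
    print(json.dumps(request, indent=2))

    # Made-up response, shaped like an MCP tool result with one text item.
    response = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"content": [{"type": "text", "text": "manual.pdf p.3: Run the installer."}]},
    }
    print(extract_text(response))
```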
Configuration (config.yaml)
pdf_folder: pdfs # folder to scan for PDFs
vector_store: vector_store # where the index is persisted
embedding_model: BAAI/bge-small-en-v1.5 # local HuggingFace model
top_k: 5 # chunks returned per query
Project structure
pdf_rag/
  config.py       # load + validate config.yaml
  indexer.py      # PDF loading, chunking, embedding, persistence
  retriever.py    # similarity search + result formatting
  cli.py          # index / search commands
mcp_server.py     # MCP wrapper exposing search_pdfs()
config.yaml
requirements.txt
pdfs/             # drop your PDFs here (not committed)
vector_store/     # persisted index (not committed)
How the RAG split works
| Layer | Who does it | How |
| --- | --- | --- |
| Retrieval | This tool | LlamaIndex + local embeddings |
| Augmentation | MCP protocol | Retrieved chunks injected into agent context |
| Generation | IDE agent | Cursor / Kiro / Claude Code answers from chunks |
The MCP server is editor-agnostic: swap Cursor for Kiro (or any MCP client) by changing only the connection config, with no code changes.
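Since only the connection config changes between clients, that config can be generated from the project path. The sketch below produces a dict in the same shape as the Cursor example earlier in this README. The build_mcp_config helper is hypothetical, and whether another client accepts the identical "mcpServers" layout, or where it expects the file, is an assumption to check against that client's documentation.

```python
import json
from pathlib import PurePosixPath, PureWindowsPath


def build_mcp_config(project_dir, windows=True, server_name="pdf-rag"):
    """Return an mcpServers-style config pointing at the project's venv interpreter."""
    if windows:
        root = PureWindowsPath(project_dir)
        python = root / ".venv" / "Scripts" / "python.exe"  # Windows venv layout
    else:
        root = PurePosixPath(project_dir)
        python = root / ".venv" / "bin" / "python"  # POSIX venv layout
    return {
        "mcpServers": {
            server_name: {
                "command": python.as_posix(),
                "args": [(root / "mcp_server.py").as_posix()],
                "cwd": root.as_posix(),
            }
        }
    }


if __name__ == "__main__":
    # Example path only; substitute the real location of the project.
    print(json.dumps(build_mcp_config("C:/work/pdf-rag"), indent=2))
```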