mcp-poc
Embeds queries using a local Ollama model to retrieve relevant document chunks from a vector store for RAG search.
A small Model Context Protocol (MCP) server that exposes the retrieval step of the sibling rag-poc project as an MCP tool — "RAG over MCP."
MCP is a standard way to give an LLM client (Claude Code, Claude Desktop, …) access to tools that live outside the model. Here the pattern is deliberate: the server does retrieval only — it embeds your query, finds the most similar chunks in rag-poc's local vector store, and hands them back. The client's model does the generation, reading those chunks and writing a grounded, cited answer. The server never calls a chat model.
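As a rough illustration of the retrieval step described above, here is a minimal sketch of "find the most similar chunks." It assumes cosine similarity over plain Python lists and a hypothetical `(text, source, vector)` chunk layout; rag-poc's actual metric and store format may differ.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query_vec, chunks, k=3):
    """Return the k chunks most similar to query_vec, best first.

    chunks is a list of (text, source, vector) tuples; this layout is a
    hypothetical stand-in for whatever rag-poc keeps in its store.
    """
    scored = [
        {"text": text, "source": source, "score": cosine(query_vec, vec)}
        for text, source, vec in chunks
    ]
    scored.sort(key=lambda c: c["score"], reverse=True)
    return scored[:k]
```

The server would stop here and return the scored chunks; turning them into an answer is left to the client's model.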
The tool
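| Tool | Signature | What it does |
|---|---|---|
| rag_search | | Embeds the query, finds the most similar chunks in rag-poc's local vector store, and returns them with their similarity scores. |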
The model calling it is expected to answer from the returned chunks and cite each
source, or say it doesn't know if they don't contain the answer.
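One way a tool result could make citing easy is to label every chunk with its source before handing it back. The sketch below is an assumption about the output shape, not the server's actual format; `format_chunks` and the `source`, `score`, and `text` keys are hypothetical names.

```python
def format_chunks(chunks):
    """Render retrieved chunks as numbered, source-labelled text.

    chunks is a list of dicts with hypothetical keys "source", "score",
    and "text". An empty list yields an explicit message so the client
    model can say it doesn't know instead of guessing.
    """
    if not chunks:
        return "No relevant chunks found."
    lines = []
    for i, chunk in enumerate(chunks, start=1):
        lines.append(f"[{i}] source: {chunk['source']} (score {chunk['score']:.3f})")
        lines.append(chunk["text"].strip())
        lines.append("")  # blank line between chunks
    return "\n".join(lines).rstrip()
```

With numbered entries like these, the client model can cite `[1]`, `[2]`, and so on next to each claim it takes from a chunk.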
How it connects to rag-poc
This repo doesn't reimplement RAG — it imports rag-poc's rag package. The server
puts the rag-poc folder on sys.path and reuses its vector store, query embedder,
and input-sanitising hook. By default it expects rag-poc as a sibling folder
(../rag-poc); point elsewhere with the RAG_POC_PATH environment variable. The
store is read from RAG_POC_PATH/store.npz.
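The lookup described above could be sketched roughly as follows. The function names are hypothetical and the real server.py may do this differently; only the RAG_POC_PATH variable, the ../rag-poc default, and the store.npz filename come from this README.

```python
import os
import sys
from pathlib import Path


def resolve_rag_poc(server_file, env=None):
    """Return (rag_poc_root, store_path) for a given server.py location.

    RAG_POC_PATH wins when it is set; otherwise fall back to a sibling
    folder named rag-poc next to the folder that holds server.py.
    """
    env = os.environ if env is None else env
    override = env.get("RAG_POC_PATH")
    if override:
        root = Path(override)
    else:
        root = Path(server_file).resolve().parent.parent / "rag-poc"
    return root, root / "store.npz"


def add_to_sys_path(root):
    """Make rag-poc's packages importable without adding duplicates."""
    entry = str(root)
    if entry not in sys.path:
        sys.path.insert(0, entry)
```

Calling `resolve_rag_poc(__file__)` from server.py and passing the resulting root to `add_to_sys_path` would make an `import rag` resolve to rag-poc's package, assuming that folder exists.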
Prerequisites
- Ollama running at localhost:11434 with the nomic-embed-text model pulled (ollama pull nomic-embed-text). The query must be embedded by the same model that embedded the documents.
- rag-poc ingested so its store exists. In the rag-poc folder, run python main.py ingest.
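A quick preflight check for the first prerequisite might look like the sketch below. It relies on Ollama's `/api/tags` endpoint, which lists locally pulled models; the helper names are hypothetical and this check is not part of the repo.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"
EMBED_MODEL = "nomic-embed-text"


def model_is_pulled(tags_response, model=EMBED_MODEL):
    """Check a parsed /api/tags response for the embedding model.

    Ollama reports names with a tag suffix such as "nomic-embed-text:latest",
    so compare both the full name and the part before the colon.
    """
    names = [m.get("name", "") for m in tags_response.get("models", [])]
    return any(n == model or n.split(":", 1)[0] == model for n in names)


def fetch_tags(base_url=OLLAMA_URL, timeout=5):
    """Fetch the list of local models; raises URLError if Ollama is not running."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

Running `model_is_pulled(fetch_tags())` before starting the server gives a clear yes or no, instead of a failure on the first query.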
Setup
cd "C:\Coding Space\mcp-poc"
python -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt
Test it: two ways
1. MCP Inspector (no Claude needed):
mcp dev server.py
Open the printed URL, pick rag_search, enter a query (e.g. "What is retrieval-augmented generation?"), and inspect the returned chunks with their similarity scores.
2. From Claude Code. Register the server so rag_search appears in your session:
claude mcp add mcp-poc -- "C:\Coding Space\mcp-poc\.venv\Scripts\python.exe" "C:\Coding Space\mcp-poc\server.py"
If rag-poc is not a sibling of this repo, pass its location when registering:
claude mcp add mcp-poc --env RAG_POC_PATH="C:\path\to\rag-poc" -- "C:\Coding Space\mcp-poc\.venv\Scripts\python.exe" "C:\Coding Space\mcp-poc\server.py"
Then /mcp lists connected servers, and you can ask a question about your indexed docs. Claude will call rag_search, pull the relevant chunks, and answer from them.
Remove it with claude mcp remove mcp-poc.
Next steps
- Add a rag_answer tool that runs rag-poc's full local pipeline (Ollama generation) to compare "the client model generates" vs "the local model generates."
- Expose the indexed documents as MCP resources, or add a prompt template for a standard "answer with citations" instruction.