Agentic RAG MCP
Provides tools for ingesting documents into a Supabase pgvector knowledge base and performing retrieval-augmented generation queries against it.
A multi-agent Retrieval-Augmented Generation system exposed as an MCP server. Ask a question and a LangGraph pipeline plans the retrieval, pulls evidence from a pgvector knowledge base, optionally augments it with live web research, drafts a cited answer, and then self-critiques it for grounding — revising until the answer is supported by the sources.
It plugs into any MCP client (Claude Code/Desktop, Cursor, Windsurf, …) as three tools: ingest, ask, and search.
Why this design? A bare RAG endpoint is easy to copy; a multi-agent system that verifies its own answers and ships as an MCP server is not. The architecture is the moat — "easy to buy, hard to replicate."
Architecture
flowchart LR
Q([Question]) --> P[🧭 Planner<br/>plan + search queries]
P --> R[📚 Retriever<br/>pgvector top-k]
R --> W[🌐 Web Researcher<br/>Firecrawl • optional]
W --> S[✍️ Synthesizer<br/>cited answer]
S --> C{🔎 Critic<br/>grounded?}
C -- needs revision --> S
C -- grounded --> A([Answer + citations])
subgraph Stores
DB[(Supabase<br/>pgvector)]
end
R <-->|cosine search| DB
classDef agent fill:#1e293b,stroke:#7C3AED,color:#e2e8f0;
class P,R,W,S,C agent;

Agent | Model / tool | Responsibility |
Planner | Claude | Decompose the question into focused search queries |
Retriever | Voyage embeddings + pgvector | Cosine top-k over the knowledge base |
Web Researcher | Firecrawl (optional) | Augment with live web results when a key is set |
Synthesizer | Claude | Draft an answer grounded in context, with [n] citations |
Critic | Claude | Verify grounding; loop back for revision if unsupported |
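The Synthesizer/Critic loop in the diagram can be sketched in plain Python. This is a minimal illustration of the control flow only, not the project's LangGraph code: `Verdict`, `answer_with_revisions`, the callable signatures, and the default of two revisions are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Verdict:
    grounded: bool      # True when the critic finds every claim supported
    feedback: str = ""  # critic notes handed back to the synthesizer


def answer_with_revisions(
    question: str,
    context: list[str],
    synthesize: Callable[[str, list[str], str], str],
    critique: Callable[[str, list[str]], Verdict],
    max_revisions: int = 2,  # assumed default, not taken from the project
) -> tuple[str, bool]:
    """Draft an answer, then redraft while the critic rejects it."""
    draft = synthesize(question, context, "")
    verdict = critique(draft, context)
    revisions = 0
    # Loop back to the synthesizer only while revisions remain.
    while not verdict.grounded and revisions < max_revisions:
        draft = synthesize(question, context, verdict.feedback)
        verdict = critique(draft, context)
        revisions += 1
    return draft, verdict.grounded
```

Returning the final `grounded` flag alongside the draft lets a caller tell an answer that passed the critic apart from one that simply ran out of revisions.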
MCP tools
Tool | Arguments | Returns |
ingest | | Scrapes the URL, chunks and embeds it, and stores it. |
ask | | Runs the full pipeline. |
search | | Retrieval only: top-k chunks with similarity scores. |
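The "chunks" step of ingestion can be illustrated with a fixed-size character window that overlaps its neighbour. This is a sketch under stated assumptions: the function name `chunk_text` and the `size` and `overlap` defaults are hypothetical, and the project's ingest.py may split text differently (for example by tokens or by headings).

```python
def chunk_text(text: str, size: int = 800, overlap: int = 100) -> list[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + size].strip()
        if piece:
            chunks.append(piece)
        # Stop once a window has reached the end of the text,
        # so no trailing fragment is emitted twice.
        if start + size >= len(text):
            break
    return chunks
```

The overlap keeps a sentence that straddles a boundary retrievable from at least one chunk, at the cost of storing some text twice.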
Quickstart
# 1. Install (Python 3.10+)
uv venv && uv pip install -e ".[dev]" # or: pip install -e ".[dev]"
# 2. Configure
cp .env.example .env # fill in ANTHROPIC_API_KEY, VOYAGE_API_KEY, DATABASE_URL
# 3. Create the vector table (Supabase SQL editor or psql)
psql "$DATABASE_URL" -f sql/schema.sql
# 4. Run the MCP server (stdio by default)
agentic-rag-mcp

Connect it to Claude Code
claude mcp add agentic-rag -s user \
--env ANTHROPIC_API_KEY=sk-ant-... \
--env VOYAGE_API_KEY=pa-... \
--env DATABASE_URL=postgresql://... \
-- agentic-rag-mcp

Then, from the client: "ingest https://example.com/docs" → "ask: how do I configure X?".
How it works
Plan: Claude turns the question into a short plan and 1–5 search queries.
Retrieve: each query is embedded (Voyage voyage-3.5) and matched against pgvector by cosine distance; results are de-duplicated and ranked.
Research: if FIRECRAWL_API_KEY is set, live web results are added to the context.
Synthesize: Claude writes an answer grounded only in the numbered context, citing each claim as [n].
Critique: a strict fact-checker pass decides whether the answer is fully supported. If not (and revisions remain), it loops back to the synthesizer with feedback.
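The retrieve step (several queries, cosine matching, de-duplication, ranking) can be sketched in memory as follows. In the project the similarity search runs inside pgvector, so this is only an illustration of the merging logic; `cosine_similarity`, `retrieve`, and the `(chunk_id, text, vector)` tuple layout are assumptions made for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vectors, chunks, top_k=5):
    """Score every chunk against every query and keep each chunk once.

    chunks: list of (chunk_id, text, vector) tuples (hypothetical layout).
    Returns up to top_k (chunk_id, text, score) tuples, best score first.
    """
    best = {}
    for qv in query_vectors:
        for chunk_id, text, vec in chunks:
            score = cosine_similarity(qv, vec)
            # De-duplicate: a chunk hit by several queries keeps its best score.
            if chunk_id not in best or score > best[chunk_id][1]:
                best[chunk_id] = (text, score)
    ranked = sorted(best.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(cid, text, score) for cid, (text, score) in ranked[:top_k]]
```

Keeping the maximum score per chunk is one reasonable merge rule; summing or averaging across queries would instead favour chunks that match several queries moderately well.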
Configurable via env: RAG_MODEL, RAG_TOP_K, RAG_MAX_REVISIONS, RAG_EMBED_MODEL.
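As a rough sketch of env-driven settings, the four variables could be read into a frozen dataclass. Only the variable names and the voyage-3.5 embedding model come from this README; the `Settings` class, `load_settings`, the placeholder model string, and the numeric defaults are assumptions, and the project's config.py may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    model: str
    embed_model: str
    top_k: int
    max_revisions: int


def load_settings(env=None) -> Settings:
    """Read RAG_* variables from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return Settings(
        # Placeholder default: the real model name is not stated here.
        model=env.get("RAG_MODEL", "placeholder-model"),
        embed_model=env.get("RAG_EMBED_MODEL", "voyage-3.5"),
        # Numeric defaults below are assumed for illustration.
        top_k=int(env.get("RAG_TOP_K", "5")),
        max_revisions=int(env.get("RAG_MAX_REVISIONS", "2")),
    )
```

Passing a plain dict instead of relying on `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"RAG_TOP_K": "8"})`.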
Evaluation
Answer quality is tracked with promptfoo — faithfulness, citation presence, and latency — so quality is measured, not asserted:
cd evals && promptfoo eval -c promptfooconfig.yaml
See evals/ for the rubric and test cases.
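To make "citation presence" concrete, here is a small standalone check that looks for `[n]` markers and compares them with the number of context items. It is a hypothetical helper, not the promptfoo rubric in evals/: the name `citation_report` and the shape of the returned dict are assumptions.

```python
import re

# Matches numeric citation markers such as [1] or [12].
CITATION = re.compile(r"\[(\d+)\]")


def citation_report(answer: str, n_sources: int) -> dict:
    """Summarise which numbered sources an answer cites."""
    cited = {int(m) for m in CITATION.findall(answer)}
    valid = {n for n in cited if 1 <= n <= n_sources}
    return {
        "has_citations": bool(cited),
        # Markers that point at a source number that does not exist.
        "invalid": sorted(cited - valid),
        # Sources that were supplied but never cited.
        "unused": sorted(set(range(1, n_sources + 1)) - valid),
    }
```

A check like this only verifies that citations exist and point at real sources; whether each cited source actually supports the claim is the faithfulness question, which needs a model-graded or human rubric.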
Deploy
Containerised and ready for Railway (HTTP transport):
railway up # uses Dockerfile + railway.json; set RAG_TRANSPORT=http
Expose RAG_HTTP_PORT and connect over --transport http. A cloudflared tunnel works for local demos.
Project layout
src/agentic_rag_mcp/
config.py # env-driven settings
llm.py # Anthropic (Claude) helper — adaptive thinking, JSON parsing
embeddings.py # Voyage embeddings
store.py # pgvector store (psycopg)
web.py # Firecrawl web research (optional)
ingest.py # chunking + ingestion
state.py # LangGraph state
nodes.py # planner / retriever / researcher / synthesizer / critic
graph.py # graph assembly
server.py # FastMCP server (ingest / ask / search)
sql/schema.sql # pgvector schema
evals/ # promptfoo eval suite

License
MIT — see LICENSE.