MMWRAG
A bilingual (Russian/English) RAG over scientific literature (textbooks/papers): vision PDF parsing, BGE-M3 hybrid (dense+sparse) retrieval with a cross-encoder reranker, exposed as an MCP tool (search only — the consumer composes the answer). Every retrieval decision here is measurement-driven — see DECISIONS.md.
Features
Vision PDF parsing behind a swappable interface (cloud PaddleOCR-VL / local PP-StructureV3) — required because the text layer doesn't encode formula structure.
Structure-aware chunking (~512-token packing over blocks, page spans kept for citations).
BGE-M3 dense + sparse embeddings; Qdrant hybrid search with server-side RRF.
Cross-encoder reranker (bge-reranker-v2-m3) over the top-N pool.
Book-aware cross-lingual routing: search(book_id=...) targets a specific book/language.
MCP server (search, list_books) over streamable HTTP, with no answer generation.
Eval harness: page-level hit@k/MRR/recall@k, cross-book and cross-lingual.
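To make the structure-aware chunking item concrete, here is a minimal sketch of greedy packing over parsed blocks that keeps the page span of each chunk. The `Block` and `Chunk` types, the `pack_blocks` name, and the whitespace token count are assumptions made for illustration; the project's own chunker and tokenizer may work differently.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Block:
    """One parsed layout block (paragraph, formula, heading). Hypothetical type."""
    text: str
    page: int


@dataclass
class Chunk:
    """A packed chunk plus the inclusive page span it covers. Hypothetical type."""
    text: str
    first_page: int
    last_page: int


def count_tokens(text: str) -> int:
    # Stand-in for a real tokenizer: counts whitespace-separated words.
    return len(text.split())


def _flush(blocks: List[Block]) -> Chunk:
    # Join the buffered blocks and record the pages of the first and last one.
    return Chunk(
        text="\n\n".join(b.text for b in blocks),
        first_page=blocks[0].page,
        last_page=blocks[-1].page,
    )


def pack_blocks(blocks: List[Block], budget: int = 512) -> List[Chunk]:
    """Greedily pack whole blocks into chunks of at most `budget` tokens.

    Blocks are never split, so a single oversized block becomes its own chunk.
    """
    chunks: List[Chunk] = []
    current: List[Block] = []
    used = 0
    for block in blocks:
        size = count_tokens(block.text)
        if current and used + size > budget:
            chunks.append(_flush(current))
            current, used = [], 0
        current.append(block)
        used += size
    if current:
        chunks.append(_flush(current))
    return chunks
```

Because blocks stay whole, a formula or paragraph is never cut in half, and the `first_page`/`last_page` pair is what a citation such as "158–159" would be built from.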
Architecture
INDEXING PDF ─parse─> Page[] ─chunk─> Chunk[] ─BGE-M3 (dense+sparse)─> Qdrant
QUERY question ─HybridRetriever (RRF)─> top-N ─cross-encoder rerank─> top-k Source[]
MCP client ─/mcp─> search(query, top_k, book_id) ─> fragments {book_id, pages, text, score}
list_books() ─> indexed books + language
Details in ARCHITECTURE.md.
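The QUERY path fuses the dense and sparse rankings and then reranks the fused pool. The sketch below shows that flow in plain Python, assuming the standard reciprocal rank fusion formula with the commonly used constant k = 60. In the project itself fusion runs server-side in Qdrant and the scorer is a cross-encoder, so `rrf_fuse`, `rerank`, and `toy_overlap_score` are illustrative names, not the project's API.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


def rrf_fuse(rankings: List[List[str]], k: int = 60) -> List[Tuple[str, float]]:
    """Reciprocal rank fusion: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by id to keep the order stable.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


def toy_overlap_score(query: str, text: str) -> float:
    # Stand-in for a cross-encoder: number of words shared by query and text.
    return float(len(set(query.lower().split()) & set(text.lower().split())))


def rerank(
    query: str,
    pool: Dict[str, str],
    score: Callable[[str, str], float],
    top_k: int,
) -> List[Tuple[str, float]]:
    """Score every (query, text) pair in the pool and keep the top_k best."""
    scored = [(doc_id, score(query, text)) for doc_id, text in pool.items()]
    scored.sort(key=lambda item: -item[1])
    return scored[:top_k]


# Dense and sparse retrieval each return their own ranking of chunk ids.
dense = ["c1", "c2", "c3"]
sparse = ["c3", "c1", "c4"]
fused = rrf_fuse([dense, sparse])
```

Fusion uses only ranks, so dense and sparse scores never need to share a scale. The reranker then reads the query and each candidate text together, which is why it is applied only to the small fused pool.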
Quickstart
# 1. dependencies (paddlepaddle-gpu is a manual prereq for the PARSING path only)
uv sync
# 2. vector database
docker compose up -d # Qdrant on :6333
# 3. bring your own PDF and index it
# parsing needs PADDLEOCR_TOKEN in .env (see .env.example);
# pipeline: parse(pdf) -> chunk_pages(...) -> index_chunks(...) (see notebooks/ for examples)
# 4. run the MCP server
uv run python -m src.mcp.server # streamable-http on 127.0.0.1:8000
The corpus is not included (copyright). Search/MCP need Qdrant + the local models (BGE-M3, the reranker); CPU works (slower), GPU is faster. Parsing additionally needs a PaddleOCR-VL cloud token.
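Once the server is running, a client could call it along these lines. This is a sketch that assumes the official `mcp` Python SDK and its `streamablehttp_client` helper; the URL combines the host and port above with the `/mcp` path from the architecture sketch. `build_search_args` and `run_search` are hypothetical helper names, and treating `book_id` as optional is an assumption based on the all-books search in the demo.

```python
from typing import Optional

# Host and port from the quickstart; the /mcp path is taken from the architecture sketch.
MCP_URL = "http://127.0.0.1:8000/mcp"


def build_search_args(query: str, top_k: int = 3, book_id: Optional[str] = None) -> dict:
    """Assemble arguments for the `search` tool; book_id is left out when not routing."""
    args: dict = {"query": query, "top_k": top_k}
    if book_id is not None:
        args["book_id"] = book_id
    return args


async def run_search(query: str, book_id: Optional[str] = None) -> None:
    # Imported lazily so the helper above stays usable without the MCP SDK installed.
    from mcp import ClientSession
    from mcp.client.streamable_http import streamablehttp_client

    async with streamablehttp_client(MCP_URL) as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("tools:", [tool.name for tool in tools.tools])
            result = await session.call_tool(
                "search", build_search_args(query, book_id=book_id)
            )
            for item in result.content:
                # Text content carries the returned fragments; other types are skipped.
                text = getattr(item, "text", None)
                if text:
                    print(text)
```

To try it, run something like `asyncio.run(run_search("интеграл Римана", book_id="lebl"))` (an illustrative query) with Qdrant and the server both up.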
Demo
A real session against the MCP server (notebooks/mcp_smoke.py, output trimmed to
metadata):
tools: ['search', 'list_books']
list_books:
{'book_id': 'zorich_v1', 'title': 'Zorich — Mathematical Analysis I', 'language': 'ru', 'chunks': 1472}
{'book_id': 'zorich_v2', 'title': 'Zorich — Mathematical Analysis II', 'language': 'ru', 'chunks': 2526}
{'book_id': 'lebl', 'title': 'Lebl — Basic Analysis I', 'language': 'en', 'chunks': 722}
search RU (all books), top 3:
zorich_v1 159 2.125
zorich_v1 158–159 0.297
zorich_v2 517 -0.357
search RU routed to lebl (cross-lingual), top 3:
lebl 135–136 0.123
lebl 167 -0.047
lebl 208 -0.141
The last call shows book-aware cross-lingual routing: a Russian query with
book_id="lebl" returns the English source (Lebl, p.135–136) that a plain cross-book
search buries behind the Russian equivalent (see DECISIONS.md §5).
Project structure
src/
parse/ vision PDF -> Page[] (cloud / local engines, idempotent cache)
chunk/ Page[] -> Chunk[] (structure-aware packing, page spans)
index/ Chunk[] -> BGE-M3 -> Qdrant (Embedder / VectorStore interfaces)
query/ HybridRetriever + RerankingRetriever; answer() with citations
mcp/ MCP server: search / list_books (pure core + thin FastMCP server)
eval/ page-level hit@k / MRR / recall@k; cross-book & cross-lingual
tests/ unit tests (pure logic on fakes; integration tests skip offline)
notebooks/ runnable examples & measurement runners (mcp_smoke, eval_*, diag_*)
Status & roadmap
The pipeline (parse → chunk → index → query), the measured retrieval stack (hybrid + reranker), and the MCP search server are done. Next: a network model-serving backend, client ingestion, and an agent layer over MCP. The reasoning and numbers behind each choice are in DECISIONS.md.
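For readers unfamiliar with the page-level metrics behind those numbers, here is a minimal sketch of hit@k, reciprocal rank, and recall@k for a single query. It assumes each retrieved fragment is reduced to one page number and the gold answer is a set of relevant pages. The function names and this matching rule are assumptions; the project's eval harness may define a match differently, for example against page spans.

```python
from typing import List, Set


def hit_at_k(ranked_pages: List[int], relevant: Set[int], k: int) -> float:
    """1.0 if any of the first k retrieved pages is relevant, else 0.0."""
    return 1.0 if any(page in relevant for page in ranked_pages[:k]) else 0.0


def reciprocal_rank(ranked_pages: List[int], relevant: Set[int]) -> float:
    """1 / rank of the first relevant page, or 0.0 if none was retrieved."""
    for rank, page in enumerate(ranked_pages, start=1):
        if page in relevant:
            return 1.0 / rank
    return 0.0


def recall_at_k(ranked_pages: List[int], relevant: Set[int], k: int) -> float:
    """Fraction of the relevant pages that appear among the first k results."""
    if not relevant:
        return 0.0
    return len(relevant & set(ranked_pages[:k])) / len(relevant)


def mean(values: List[float]) -> float:
    """Average per-query values, e.g. reciprocal ranks into MRR."""
    return sum(values) / len(values) if values else 0.0
```

Averaging `reciprocal_rank` over all evaluation queries with `mean` gives MRR; the same averaging applies to hit@k and recall@k.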
License
MIT © 2026 mikrominiw