OmniDocs RAG
Allows indexing JavaScript-rendered Docusaurus documentation sites using Playwright for full content extraction.
Allows indexing JavaScript-rendered GitBook documentation sites using Playwright for full content extraction.
Allows indexing GitHub repositories directly via the API, including source code and documentation files.
Allows indexing npm package metadata and README files for searchable knowledge.
Allows indexing PyPI package metadata and README files for searchable knowledge.
Allows indexing JavaScript-rendered VitePress documentation sites using Playwright for full content extraction.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@OmniDocs RAGsearch my local files for 'RAG pipeline'"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Universal High-Performance RAG Knowledge Base for AI Agents Index local files, websites, GitHub repos, npm/PyPI packages — then search them with hybrid AI-powered retrieval. All through your IDE chat.
Features • Architecture • Quickstart • Tools • FAQ
✨ Features
🔍 Search Pipeline
Hybrid Search — ChromaDB vector + BM25 keyword scoring fused via Reciprocal Rank Fusion (RRF, k=60)
Cross-Encoder Reranking —
BAAI/bge-reranker-v2-m3rescores top candidates for surgical precisionQuery Expansion — RU→EN synonym expansion for mixed-language documentation (e.g. "создать" → "create post make new")
Result Deduplication — removes near-duplicate chunks (>80% similarity threshold)
📁 Universal Source Ingestion
40+ file formats —
.md,.py,.js,.ts,.json,.yaml,.html,.csv, and moreBinary documents — PDF, DOCX, XLSX, PPTX, Jupyter Notebooks (optional packages)
Websites — full async BFS crawler with boundary control, robots.txt, sitemap.xml
GitHub repositories — direct API tree walking
npm / PyPI packages — metadata + README extraction
ZIP archives — automatic extraction and indexing
JS-rendered sites — optional Playwright support for Docusaurus, GitBook, VitePress
⚡ Performance
GPU acceleration — auto-detects CUDA (RTX 3080 = ~11x speedup)
Incremental indexing — MD5 file hashing, only re-indexes changed files
Code-aware chunking — Python files split by class/function via AST, JS/TS via regex
Heading-aware chunking — Markdown split at
##/###with 2-sentence overlapBM25 persistence — survives server restarts via pickle cache
🛠️ Management
Multi-collection — separate knowledge bases per project
Auto-categorization — YAML frontmatter → H1 heading → filename fallback
File Watcher — auto-reindex on filesystem changes (watchdog, 2s debounce)
Admin tools — list, remove, delete, reindex — all through chat
100% Local & Free — no API keys, no Docker, no monthly fees
Related MCP server: Crawl4AI+SearXNG MCP Server
🏗️ Architecture
flowchart TD
Q(["🔍 User Query"])
Q --> QE
subgraph EXPAND["📝 Stage 0 — Query Expansion"]
QE["RU→EN Synonyms<br>(up to 3 variations)"]
end
QE --> VS
QE --> BM
subgraph HYBRID["⚡ Stage 1 — Hybrid Retrieval"]
VS["🧠 Vector Search<br>(ChromaDB + bge-m3)"]
BM["📝 Keyword Search<br>(BM25 Okapi)"]
end
VS --> RRF
BM --> RRF
subgraph FUSION["🔀 Stage 2 — Fusion"]
RRF["Reciprocal Rank Fusion<br>(k=60)"]
end
RRF --> CE
subgraph RERANK["🎯 Stage 3 — Reranking"]
CE["Cross-Encoder<br>bge-reranker-v2-m3"]
end
CE --> DD
subgraph DEDUP["🧹 Stage 4 — Dedup"]
DD["Remove >80%<br>similar results"]
end
DD --> OUT
subgraph OUTPUT["📋 Stage 5 — Result"]
OUT["Top-N Documents<br>with Breadcrumbs"]
end
style Q fill:#6366f1,color:#fff,stroke:#4338ca
style EXPAND fill:#0f172a,color:#e2e8f0,stroke:#334155
style HYBRID fill:#0f172a,color:#e2e8f0,stroke:#334155
style FUSION fill:#0f172a,color:#e2e8f0,stroke:#334155
style RERANK fill:#0f172a,color:#e2e8f0,stroke:#334155
style DEDUP fill:#0f172a,color:#e2e8f0,stroke:#334155
style OUTPUT fill:#0f172a,color:#e2e8f0,stroke:#334155
style QE fill:#581c87,color:#e9d5ff,stroke:#9333ea
style VS fill:#1e40af,color:#bfdbfe,stroke:#3b82f6
style BM fill:#065f46,color:#a7f3d0,stroke:#10b981
style RRF fill:#7c3aed,color:#ede9fe,stroke:#8b5cf6
style CE fill:#b45309,color:#fef3c7,stroke:#f59e0b
style DD fill:#0e7490,color:#cffafe,stroke:#06b6d4
style OUT fill:#1e3a5f,color:#bae6fd,stroke:#38bdf8How It Works
Query Expansion — generates up to 3 query variations with RU→EN programming synonyms
Hybrid Retrieval — searches by semantic meaning (
bge-m3, 8192 tokens) and exact keywords (BM25) simultaneouslyReciprocal Rank Fusion — mathematically combines ranks from both engines
Cross-Encoder Reranking —
bge-reranker-v2-m3deeply computes relevance for top candidatesDeduplication — removes near-identical results
Structured Output — results returned with breadcrumbs (e.g.,
README.md > Quickstart > Installation)
🚀 Quickstart
1. Prerequisites
Python 3.10+ (Tested on 3.13)
git
2. Install
git clone https://github.com/ElvinBayramov/OmniDocs-RAG.git
cd OmniDocs-RAG
python install.pyThe installer downloads AI models (~2.2GB for bge-m3 + bge-reranker), installs dependencies, and auto-configures your IDE.
⚠️ Enable GPU Acceleration (Crucial for Speed): By default,
pipmay install the CPU-only version of PyTorch on Windows. To unlock your NVIDIA GPU, run:pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124 --upgrade --force-reinstall
3. Configure (if manual)
Add to your IDE's MCP config (mcp_config.json):
{
"mcpServers": {
"markdown-rag": {
"command": "python",
"args": ["C:\\path\\to\\OmniDocs-RAG\\server.py"],
"env": {
"RAG_DOCS_PATH": "C:\\path\\to\\your\\docs",
"RAG_DEVICE": "cuda"
}
}
}
}4. Use
Just talk to your AI assistant:
> Index my project docs
> Search: how does authentication work?
> Index the FastAPI documentation from https://fastapi.tiangolo.com
> Add the langchain repo: github://langchain-ai/langchain/docsThe AI calls the MCP tools automatically — no UI, no buttons, just chat.
🛠️ MCP Tools
Tool | Description |
| Index local files (40+ formats, incremental) |
| Index websites, GitHub, npm, PyPI, ZIP |
| Hybrid search with reranking |
| Full system status: models, GPU, BM25, chunks |
| List all knowledge base collections |
| List files in a collection |
| Remove a file from the index |
| Delete an entire collection |
| Force full rebuild |
index_url() — Universal Source Ingestion
# Websites (async BFS crawler)
index_url("https://docs.python.org/3/library/asyncio.html")
# GitHub repositories
index_url("github://tiangolo/fastapi/docs")
# npm packages
index_url("npm://axios@1.6")
# PyPI packages
index_url("pypi://fastapi")
# ZIP archives
index_url("file:///path/to/docs.zip")⚙️ Configuration
All settings via environment variables:
Variable | Default | Description |
| parent directory | Folder to scan for files |
|
| ChromaDB storage location |
|
|
|
|
| Embedding model |
|
| Cross-Encoder model |
| — | Directory to watch for auto-reindex |
|
| Collection for file watcher |
| — | GitHub API token (higher rate limits) |
📁 Supported Formats
Text (no extra deps):
.md .txt .rst .log .html .htm
Code (wrapped in markdown):
.py .js .ts .jsx .tsx .css .java .go .rs .c .cpp .rb .php .swift .kt .lua .sh
Config:
.json .yaml .yml .toml .xml .csv .ini .cfg
Binary (optional packages):
Format | Install |
| |
Word (.docx) |
|
Excel (.xlsx) |
|
PowerPoint (.pptx) |
|
Jupyter (.ipynb) | built-in |
🏷️ Auto-Categorization
Every file gets a category automatically (no manual tagging needed):
Priority | Source | Example |
1 | YAML frontmatter |
|
2 | First |
|
3 | Filename stem |
|
❓ FAQ
Q: Does this send my data anywhere? A: No. 100% local. Models download once from HuggingFace, then everything runs offline. No API keys, no cloud.
Q: Do I need a GPU?
A: No, but it helps. CPU works fine for search (~200ms). GPU (CUDA) accelerates indexing ~11x. Set RAG_DEVICE=cuda.
Q: How do I update the index?
A: The server uses incremental indexing — only changed files are re-indexed. Just call index_documents() again, or enable the file watcher with RAG_WATCH_PATH.
Q: Why is the first search slow? A: The Cross-Encoder (~1.1GB) loads lazily on first query. All subsequent searches are instant.
Q: Does it support my language?
A: Yes. bge-m3 supports 100+ languages including English, Russian, Chinese, Japanese, Spanish, and more. Cross-language search works too (query in Russian, find results in English).
Q: Can I have separate knowledge bases per project?
A: Yes. Use the collection parameter: index_documents(path, collection="my-project"), then search_docs(query, collection="my-project").
📄 License
Licensed under the Apache License 2.0. See LICENSE for details.
This server cannot be installed
Maintenance
Resources
Unclaimed servers have limited discoverability.
Looking for Admin?
If you are the server author, to access and configure the admin panel.
Appeared in Searches
Latest Blog Posts
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/ElvinBayramov/OmniDocs-RAG'
If you have feedback or need assistance with the MCP directory API, please join our Discord server