omni-rag-mcp
Enables embedding via Ollama for semantic search, requiring a running Ollama instance with a pulled model.
Provides local embedding using ONNX runtime with auto-downloaded all-MiniLM-L6-v2 model for zero-config semantic search.
Enables embedding via OpenAI API using text-embedding-3-small for semantic search, requiring an API key.
A general-purpose RAG MCP plugin for token-efficient semantic search over any directory of files. Auto-ingests the current working directory on first search and provides hybrid search (BM25 + semantic), directory overview, structural analysis, and dependency graphs.
Zero-config by default: local Qdrant storage, ONNX embeddings, no external services required. Supports code, markdown, PDFs, CSVs, and more via pluggable extractors.
Quick Start
pip install omni-rag-mcp
omni-rag-setup
That's it. Restart Claude Code and the plugin auto-indexes your working directory on first search.
How It Works
Your Files -> Extractors -> Chunking -> Embedding -> Qdrant (local)
|
Claude Code -> MCP Tool Call -> Hybrid Search -> Relevant Snippets
First search auto-ingests your working directory (extracts content, chunks, generates embeddings, stores in local Qdrant)
Subsequent searches are fast hybrid lookups (BM25 + semantic) -- no re-ingestion needed
Incremental updates detect git changes and only re-embed modified files
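The README does not say how the BM25 and semantic result lists are combined. As an illustration only, the sketch below uses reciprocal rank fusion (RRF), one common way to merge two rankings; the function name, the constant `k=60`, and the sample file paths are assumptions and are not taken from omni-rag-mcp.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into a single ranking.

    Each document scores 1 / (k + rank) in every list it appears in;
    the scores are summed across lists. This is a generic technique,
    not necessarily what omni-rag-mcp does internally.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword (BM25) and a semantic search.
bm25_hits = ["auth/login.py", "README.md", "auth/session.py"]
semantic_hits = ["auth/session.py", "auth/login.py", "docs/flows.md"]

fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
print(fused)
```

Files that rank well in both lists rise to the top, while a file found by only one method still appears further down, which is the usual motivation for hybrid search.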
MCP Tools
Tool | Purpose |
| Hybrid search over indexed files (auto-ingests if needed) |
| Search filtered by file path pattern |
| Compressed directory overview (languages, structure, dependencies) |
| Function/class signatures without reading every file |
| Internal import/dependency graph |
| Index size and configuration |
| Manual re-index (incremental by default, |
| Is the index current? Any uncommitted changes? |
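The freshness check and the incremental re-index both depend on knowing which files changed. A minimal sketch of one way to get that from git follows, assuming the plain `git status --porcelain` (v1) output format; `parse_porcelain` and `uncommitted_changes` are hypothetical helpers, not functions from omni-rag-mcp.

```python
import subprocess


def parse_porcelain(output):
    """Split `git status --porcelain` output into (to_reembed, to_remove).

    Each line is two status characters, a space, then the path.
    Renames are reported as "old -> new".
    """
    to_reembed, to_remove = [], []
    for line in output.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]
        if status[0] in "RC" and " -> " in path:
            old, path = path.split(" -> ", 1)
            if status[0] == "R":
                to_remove.append(old)  # the old name no longer exists
        if "D" in status:
            to_remove.append(path)
        else:
            to_reembed.append(path)
    return to_reembed, to_remove


def uncommitted_changes(repo_dir="."):
    """Run git in repo_dir and classify its uncommitted changes."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return parse_porcelain(result.stdout)
```

This sketch only sees uncommitted changes and does not handle paths that git quotes because of special characters. Detecting commits made since the last index would need an extra step, such as comparing a stored commit hash against `HEAD`.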
Embedding Providers
Zero-config by default. Choose your provider:
Provider | Config | Notes |
ONNX (default) | None needed | Auto-downloads all-MiniLM-L6-v2 (23MB, 384-dim) |
Ollama | | Requires Ollama running with model pulled |
OpenAI | | text-embedding-3-small |
Voyage | | voyage-code-3 (optimized for code) |
Optional Extras
pip install "omni-rag-mcp[pdf]" # PDF extraction (PyMuPDF)
pip install "omni-rag-mcp[docx]" # Word document extraction
pip install "omni-rag-mcp[image]" # Image/OCR extraction (Tesseract + Pillow)
pip install "omni-rag-mcp[all]" # All optional extractors
Storage
By default, uses Qdrant in local/on-disk mode -- no Docker needed. Data stored in .omni-rag/ under your project directory.
For remote Qdrant:
OMNI_RAG_QDRANT_MODE=remote
OMNI_RAG_QDRANT_HOST=your-host
OMNI_RAG_QDRANT_PORT=6333
Configuration
All settings are read from environment variables with the OMNI_RAG_ prefix. See config/.env.example for the full reference.
Variables with the legacy RAG_ prefix are still supported but emit deprecation warnings.
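The prefix lookup with a legacy fallback can be sketched as follows. This is an illustration under stated assumptions, not the plugin's actual loader: `read_setting` and `qdrant_kwargs` are hypothetical names, the `"local"` default mode value and the exact `.omni-rag` path are guesses based on the Storage section, and it assumes each legacy variable maps one-to-one onto its OMNI_RAG_ counterpart.

```python
import os
import warnings


def read_setting(name, default=None, env=None):
    """Look up OMNI_RAG_<name>, falling back to the legacy RAG_<name>."""
    env = os.environ if env is None else env
    new_key = f"OMNI_RAG_{name}"
    old_key = f"RAG_{name}"
    if new_key in env:
        return env[new_key]
    if old_key in env:
        warnings.warn(
            f"{old_key} is deprecated; use {new_key}",
            DeprecationWarning,
            stacklevel=2,
        )
        return env[old_key]
    return default


def qdrant_kwargs(env=None):
    """Build connection arguments for local or remote Qdrant storage."""
    mode = read_setting("QDRANT_MODE", "local", env)  # default value assumed
    if mode == "remote":
        return {
            "host": read_setting("QDRANT_HOST", "localhost", env),
            "port": int(read_setting("QDRANT_PORT", "6333", env)),
        }
    # On-disk mode; the exact directory layout is an assumption.
    return {"path": ".omni-rag"}
```

The returned dictionary matches the `path`, `host`, and `port` keyword arguments that `qdrant_client.QdrantClient` accepts, so it could be passed as `QdrantClient(**qdrant_kwargs())` if that package is installed.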
Development
# Install with dev dependencies
pip install -e ".[dev]"
# Run tests
python -m pytest tests/ -v
# Health check
python scripts/health_check.py
Manual MCP Registration
If omni-rag-setup doesn't work, add this to your Claude Code MCP config:
{
  "mcpServers": {
    "omni-rag": {
      "command": "omni-rag"
    }
  }
}
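If the config file already lists other servers, the entry has to be merged in rather than pasted over the existing content. A small sketch of that merge follows; `register_server` is a hypothetical helper, the `other` server is made up for the example, and the location of the config file is not stated here, so loading and saving the file is left out.

```python
import json


def register_server(config, name="omni-rag", command="omni-rag"):
    """Return a copy of config with the server added under mcpServers.

    Existing servers are preserved; an entry with the same name is replaced.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing config that already registers another server.
existing = {"mcpServers": {"other": {"command": "other-server"}}}
updated = register_server(existing)
print(json.dumps(updated, indent=2))
```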