omni-rag-mcp

by Suyash2013

A general-purpose RAG MCP plugin for token-efficient semantic search over any directory of files. Auto-ingests the current working directory on first search and provides hybrid search (BM25 + semantic), directory overview, structural analysis, and dependency graphs.

Zero-config by default: local Qdrant storage, ONNX embeddings, no external services required. Supports code, markdown, PDFs, CSVs, and more via pluggable extractors.

Quick Start

pip install omni-rag-mcp
omni-rag-setup

That's it. Restart Claude Code and the plugin auto-indexes your working directory on first search.

How It Works

Your Files  ->  Extractors  ->  Chunking  ->  Embedding  ->  Qdrant (local)
                                                                 |
Claude Code ->  MCP Tool Call  ->  Hybrid Search  ->  Relevant Snippets

The first search auto-ingests your working directory: it extracts content, chunks it, generates embeddings, and stores them in local Qdrant.
Subsequent searches are fast hybrid lookups (BM25 + semantic); no re-ingestion is needed.
Incremental updates detect git changes and re-embed only the modified files.
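
The README does not say how the BM25 and semantic rankings are merged. One common way to combine two ranked lists is reciprocal rank fusion (RRF). The sketch below illustrates that technique only; it is an assumption, not necessarily what omni-rag-mcp does internally. The function name, the file names, and the constant `k=60` are all illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Hypothetical helper for illustration; not part of omni-rag-mcp's API.
    Each document scores sum(1 / (k + rank)) over the lists it appears in.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Made-up result lists standing in for a keyword ranking and a vector ranking.
bm25_hits = ["auth.py", "login.md", "utils.py"]
semantic_hits = ["login.md", "session.py", "auth.py"]
fused = reciprocal_rank_fusion([bm25_hits, semantic_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, which is the usual motivation for hybrid search: keyword matches catch exact identifiers, while embeddings catch paraphrases.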

MCP Tools

Tool	Purpose
`search`	Hybrid search over indexed files (auto-ingests if needed)
`search_by_file`	Search filtered by file path pattern
`get_context`	Compressed directory overview (languages, structure, dependencies)
`get_file_signatures`	Function/class signatures without reading every file
`get_dependency_graph`	Internal import/dependency graph
`stats`	Index size and configuration
`ingest`	Manual re-index (incremental by default, `force=True` for full)
`check_status`	Is the index current? Any uncommitted changes?
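
Claude Code normally issues these tool calls for you. For readers curious about what goes over the wire, here is a rough sketch of the JSON-RPC `tools/call` requests an MCP client would send. The tool names and `force=True` come from the table above; the `query` argument name is a guess, so check the server's tool schema for the real parameter names.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call_request(tool, **arguments):
    """Build an MCP `tools/call` JSON-RPC request (illustrative helper)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# `query` is an assumed argument name, not confirmed by the README.
search_req = tool_call_request("search", query="user login flow")
# `force=True` requests a full re-index, per the table above.
ingest_req = tool_call_request("ingest", force=True)
print(json.dumps(ingest_req, indent=2))
```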

Embedding Providers

Zero-config by default. Choose your provider:

Provider	Config	Notes
ONNX (default)	None needed	Auto-downloads all-MiniLM-L6-v2 (23MB, 384-dim)
Ollama	`OMNI_RAG_EMBEDDING_PROVIDER=ollama`	Requires Ollama running with model pulled
OpenAI	`OMNI_RAG_EMBEDDING_PROVIDER=openai` + `OMNI_RAG_OPENAI_API_KEY=sk-...`	text-embedding-3-small
Voyage	`OMNI_RAG_EMBEDDING_PROVIDER=voyage` + `OMNI_RAG_VOYAGE_API_KEY=...`	voyage-code-3 (optimized for code)
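
A small pre-flight check can catch a missing API key before the first search. This is a sketch built from the table above, not code from the project. The variable names are taken from the table, but the helper itself and the assumption that the default provider value is spelled `onnx` are illustrative.

```python
import os

# API-key variable each provider needs, per the table above (None = no key).
REQUIRED_KEYS = {
    "onnx": None,  # assumed spelling of the default provider value
    "ollama": None,
    "openai": "OMNI_RAG_OPENAI_API_KEY",
    "voyage": "OMNI_RAG_VOYAGE_API_KEY",
}


def check_provider_env(env=None):
    """Return the selected provider, or raise if its API key is missing.

    Hypothetical helper; omni-rag-mcp may validate its settings differently.
    """
    env = os.environ if env is None else env
    provider = env.get("OMNI_RAG_EMBEDDING_PROVIDER", "onnx").lower()
    if provider not in REQUIRED_KEYS:
        raise ValueError(f"unknown embedding provider: {provider!r}")
    key_var = REQUIRED_KEYS[provider]
    if key_var and not env.get(key_var):
        raise ValueError(f"{provider} requires {key_var} to be set")
    return provider


if __name__ == "__main__":
    print("Embedding provider:", check_provider_env())
```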

Optional Extras

pip install "omni-rag-mcp[pdf]"    # PDF extraction (PyMuPDF)
pip install "omni-rag-mcp[docx]"   # Word document extraction
pip install "omni-rag-mcp[image]"  # Image/OCR extraction (Tesseract + Pillow)
pip install "omni-rag-mcp[all]"    # All optional extractors

Storage

By default, omni-rag-mcp uses Qdrant in local, on-disk mode, so no Docker is needed. Data is stored in .omni-rag/ under your project directory.

For remote Qdrant:

OMNI_RAG_QDRANT_MODE=remote
OMNI_RAG_QDRANT_HOST=your-host
OMNI_RAG_QDRANT_PORT=6333

Configuration

All settings are configured via environment variables with the OMNI_RAG_ prefix. See config/.env.example for the full reference.

Legacy RAG_ prefix variables are still supported with deprecation warnings.
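
The fallback behaviour described here can be pictured roughly as follows. This is a minimal sketch under the assumption that each legacy name maps one-to-one onto the new one (for example `RAG_QDRANT_PORT` to `OMNI_RAG_QDRANT_PORT`); the `get_setting` helper is hypothetical and the project's actual lookup logic may differ.

```python
import os
import warnings


def get_setting(name, env=None, default=None):
    """Look up OMNI_RAG_<name>, falling back to the legacy RAG_<name>.

    Hypothetical helper for illustration; not omni-rag-mcp's actual code.
    """
    env = os.environ if env is None else env
    new_key = f"OMNI_RAG_{name}"
    old_key = f"RAG_{name}"
    if new_key in env:
        return env[new_key]
    if old_key in env:
        # The legacy name still works, but the caller is told to migrate.
        warnings.warn(
            f"{old_key} is deprecated; use {new_key} instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return env[old_key]
    return default
```

When both names are set, the new prefix wins in this sketch, which keeps a half-migrated environment predictable.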

Development

# Install with dev dependencies
pip install -e ".[dev]"

# Run tests
python -m pytest tests/ -v

# Health check
python scripts/health_check.py

Manual MCP Registration

If omni-rag-setup doesn't work, add this to your Claude Code MCP config:

{
  "mcpServers": {
    "omni-rag": {
      "command": "omni-rag"
    }
  }
}
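
If you would rather script this than edit the file by hand, the entry above can be merged into an existing config with a few lines of Python. This is a sketch, not part of omni-rag-mcp: the `register_server` helper is hypothetical, and the config file path is left to the caller because the README does not name it.

```python
import json
from pathlib import Path


def register_server(config_path, name="omni-rag", command="omni-rag"):
    """Add an MCP server entry to a JSON config, keeping existing entries.

    Hypothetical helper; pass the path of your own Claude Code MCP config.
    """
    path = Path(config_path)
    text = path.read_text(encoding="utf-8") if path.exists() else ""
    config = json.loads(text) if text.strip() else {}
    # Create the "mcpServers" section if it is missing, then add the entry.
    config.setdefault("mcpServers", {})[name] = {"command": command}
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Other servers already listed under `mcpServers` are left untouched; only the `omni-rag` entry is added or overwritten.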
