code-rag-mcp

by qduc

Code-RAG

Semantic code search for your entire codebase. Ask questions in plain English, get relevant code snippets with source locations.

Instead of grepping for function names, ask about "authentication logic" to find all related auth code across your project.

Quick Start

1. Install

Using uv (recommended):

uvx --from code-rag-mcp code-rag-setup --install

Note: This installs Code-RAG in an isolated uv tool environment and keeps your configuration across updates.

Using pip:

python -m venv .venv
source .venv/bin/activate
pip install code-rag-mcp
code-rag-setup

The setup wizard refuses to install optional dependencies into non-isolated Python by default. Use code-rag-setup --allow-global-python only if you intentionally want that behavior.

New to Python? Use the one-command installer:

curl -sSL https://raw.githubusercontent.com/qduc/code-rag/main/scripts/install.sh | bash

2. Add to Claude

claude mcp add -s user code-rag --transport stdio -- uvx "code-rag-mcp[local]"

If you installed with pip, activate your virtual environment first and use code-rag-mcp instead of the uvx command above.

Need other setup variants or configuration details? See Use with Claude Code (MCP Integration).

3. Try it out (optional)

Test with the CLI before using with Claude:

code-rag-cli --path /path/to/your/project

Claude's Opinion About Code-RAG (take it with a huge grain of salt!)

Code-RAG Review

I ran 9 diverse semantic queries to test the tool across different aspects of the codebase. Here's my assessment:

Strengths

Excellent Semantic Understanding

Query "shared embedding server lifecycle management" correctly identified the core design documentation and implementation (0.54 relevance)
"syntax-aware chunking for different languages" found both the implementation guide and actual code
Searches understood concepts, not just keywords - found relevant code even when exact terms didn't match

Good Relevance Scoring

Scores generally correlate well with actual relevance (0.4-0.6+ range for good matches)
Top results consistently more relevant than lower-ranked ones
The tool correctly prioritizes implementation code over test fixtures

Metadata Quality

Results include file paths, line numbers, and function/class context
Example: src/code_rag/api.py:443-462 incremental_reindex() | CodeRAGAPI (0.61)
This makes it easy to navigate to the exact location

Handles Complex Queries Well

Multi-concept queries like "incremental reindexing detect modified files" found the right code
Domain-specific terminology works well ("embedding model configuration", "MCP server tools")

Weaknesses

Generic Queries Return Scattered Results

"error handling and retry logic" (query 5) returned less focused results
Found timeout handlers, error printing, but no cohesive retry logic
This is expected for overly broad queries, but worth noting

Some Test Code Noise

"database initialization and connection" returned test fixtures as top results instead of actual implementation
The mock database in test_mcp_server.py ranked higher (0.45) than real database code (0.35)
Could be improved with result filtering or boosting production code

Limited Context Window

Results show ~20-30 lines per chunk, which is reasonable but sometimes cuts off mid-implementation
The expand_context parameter helps but isn't always used

Specific Observations

What Works Great:

Architecture/design queries: "shared embedding server lifecycle management" → perfect results
Implementation queries: "syntax-aware chunking" → found both docs and code
Specific subsystem queries: "incremental reindexing" → found the exact functions

What Could Be Better:

Very broad queries ("error handling") need more filtering
Test vs. production code disambiguation
Maybe add file type boosting (prefer src/ over tests/)

Recommendations

Add query guidance in the tool description - Users should know that specific queries work better than generic ones (you already have this in the MCP tool description, which is good)
Consider result filtering options - Allow excluding test files or prioritizing certain directories
Expand context by default - The expand_context parameter could be enabled by default for better chunk visibility
Add result grouping - Group results by file to show "This file has 3 relevant chunks"
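
The filtering recommendation above can be sketched on the client side, without changing Code-RAG itself. This is a minimal illustration, not a feature of the tool: it assumes results are dicts with `file_path` and `similarity` keys (the shape used in the API Usage section), and `TEST_PENALTY`, `looks_like_test`, and `rerank` are hypothetical names with an arbitrary penalty value.

```python
from typing import Dict, List

# Hypothetical penalty subtracted from the score of anything that looks like test code.
TEST_PENALTY = 0.15


def looks_like_test(path: str) -> bool:
    """Heuristic: the file sits under a tests/ directory or is named test_*."""
    parts = path.replace("\\", "/").split("/")
    return "tests" in parts[:-1] or parts[-1].startswith("test_")


def rerank(results: List[Dict], exclude_tests: bool = False) -> List[Dict]:
    """Drop or down-weight test files, then sort by the adjusted score."""
    adjusted = []
    for r in results:
        is_test = looks_like_test(r["file_path"])
        if is_test and exclude_tests:
            continue
        score = r["similarity"] - (TEST_PENALTY if is_test else 0.0)
        adjusted.append({**r, "adjusted": round(score, 4)})
    return sorted(adjusted, key=lambda r: r["adjusted"], reverse=True)


# Scores mirror the review's example; the production path is made up for illustration.
sample = [
    {"file_path": "tests/test_mcp_server.py", "similarity": 0.45},
    {"file_path": "src/code_rag/database.py", "similarity": 0.35},
]
for r in rerank(sample):
    print(r["file_path"], r["adjusted"])
```

With the penalty applied, the production file moves ahead of the mock database fixture. Passing `exclude_tests=True` removes test files entirely.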

Overall Assessment

Rating: 8.5/10

The semantic search works remarkably well for its intended purpose. It successfully finds relevant code based on conceptual queries, not just keyword matching. The relevance scoring is solid, and the metadata makes results actionable.

The main improvement areas are around filtering test code and handling overly broad queries. For a developer using this tool, the key insight is: be specific in your queries. "authentication token refresh logic" will work better than just "authentication."

This is a genuinely useful tool that would save significant time when exploring unfamiliar codebases.

Why Use Code-RAG?

Understand unfamiliar codebases - Ask questions instead of reading everything
Find examples - "error handling with retries" finds all relevant patterns
Refactoring aid - Locate all code related to a feature you're changing
Documentation - Extract context for writing docs or onboarding

Use with Claude Code (MCP Integration)

Code-RAG works as an MCP server, letting Claude automatically search your codebase during conversations.

Note on uv: Many examples below use uv (specifically uvx) for fast, zero-config execution. If you don't have uv installed, you can use standard pip or npx (if using a wrapper).

Quick Setup

Option 1: Using uvx (Recommended)

# Install uv first: https://github.com/astral-sh/uv

# Claude Code
# Local variant:
claude mcp add -s user code-rag --transport stdio -- uvx "code-rag-mcp[local]"
# Cloud variant:
claude mcp add -s user code-rag --transport stdio -- uvx "code-rag-mcp[cloud]"

Done! You can start using Code-RAG with Claude Code now.

Option 2: Using pip (Standard)

# Install in an isolated environment
python -m venv .venv
source .venv/bin/activate
pip install code-rag-mcp

# Register with Claude Code using the absolute path to the binary
claude mcp add -s user code-rag --transport stdio -- "$(which code-rag-mcp)"

Option 3: Local development installation

# Clone and install
git clone https://github.com/qduc/code-rag.git
cd code-rag
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Register with Claude Code
claude mcp add -s user code-rag --transport stdio -- "$(which code-rag-mcp)"

Configuration

The MCP server reads configuration from environment variables or config files. Configure via your MCP client's settings:

Claude Desktop (on macOS: ~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "code-rag": {
      "command": "uvx",
      "args": ["code-rag-mcp"],
      "env": {
        "CODE_RAG_EMBEDDING_MODEL": "nomic-ai/CodeRankEmbed",
        "CODE_RAG_DATABASE_TYPE": "chroma",
        "CODE_RAG_RERANKER_ENABLED": "true"
      }
    }
  }
}

Claude Code: configure environment variables or config files after registering the MCP server in the setup section above.

Common Configuration Options:

CODE_RAG_EMBEDDING_MODEL - Embedding model (default: nomic-ai/CodeRankEmbed)
- nomic-ai/CodeRankEmbed - Code-optimized, runs locally, requires GPU for best performance
- text-embedding-3-small - OpenAI embeddings, no GPU required (requires OPENAI_API_KEY)
CODE_RAG_DATABASE_TYPE - Database backend: chroma or qdrant (default: chroma)
CODE_RAG_CHUNK_SIZE - Chunk size in characters (default: 1024)
CODE_RAG_RERANKER_ENABLED - Enable result reranking; may yield better results but is slower (default: false)
CODE_RAG_SHARED_SERVER - Share one embedding server across instances to reduce memory footprint (default: true)

Example with OpenAI embeddings:

{
  "mcpServers": {
    "code-rag": {
      "command": "uvx",
      "args": ["code-rag-mcp"],
      "env": {
        "CODE_RAG_EMBEDDING_MODEL": "text-embedding-3-small",
        "OPENAI_API_KEY": "sk-...",
        "CODE_RAG_RERANKER_ENABLED": "true"
      }
    }
  }
}
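
If you would rather generate the JSON than edit it by hand, the structure above can be produced with a few lines of Python. This is only a sketch: `add_mcp_server` is a hypothetical helper, not part of Code-RAG or Claude, and it works on an in-memory dict so that existing `mcpServers` entries are preserved.

```python
import json
from typing import Dict, List, Optional


def add_mcp_server(config: Dict, name: str, command: str,
                   args: List[str], env: Optional[Dict[str, str]] = None) -> Dict:
    """Return a copy of config with one entry added under "mcpServers"."""
    servers = dict(config.get("mcpServers", {}))
    entry: Dict = {"command": command, "args": list(args)}
    if env:
        entry["env"] = dict(env)
    servers[name] = entry
    return {**config, "mcpServers": servers}


# "other-server" stands in for whatever is already in your config file.
existing = {"mcpServers": {"other-server": {"command": "example", "args": []}}}
updated = add_mcp_server(
    existing, "code-rag", "uvx", ["code-rag-mcp"],
    env={"CODE_RAG_EMBEDDING_MODEL": "nomic-ai/CodeRankEmbed",
         "CODE_RAG_DATABASE_TYPE": "chroma"},
)
print(json.dumps(updated, indent=2))
```

To apply it to a real file, load the file with `json.load`, pass the result in as `config`, and write the returned dict back with `json.dump`.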

Usage

Once configured, Claude can automatically search your codebase:

You: "Find the database connection logic"

Claude: [Automatically searches and finds the code]
        "I found the database connection logic in src/code_rag/db/connection.py..."

See docs/mcp.md for detailed setup and troubleshooting.

Basic Usage

# Different codebase
code-rag-cli --path /path/to/repo

# Force reindex
code-rag-cli --reindex

# More results
code-rag-cli --results 10

# Different embedding model (OpenAI; requires OPENAI_API_KEY in the environment)
code-rag-cli --model text-embedding-3-small

# Use Qdrant instead of ChromaDB
code-rag-cli --database qdrant

Configuration

Priority Order

Configuration is loaded in this order (higher priority overrides lower):

Environment variables (highest priority)
Custom config file via CODE_RAG_CONFIG_FILE environment variable
Project config: ./code-rag.config
User config: ~/.config/code-rag/config (auto-created with defaults)

For MCP servers: Set environment variables in your MCP client config (see MCP Integration section above).

For CLI usage: Use environment variables or config files.
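
The precedence rule can be pictured as a first-match lookup across the four sources. The sketch below is an illustration of that rule only, not Code-RAG's actual loader; `resolve_setting` and its parameters are hypothetical names, and each source is modelled as a plain mapping.

```python
from typing import Mapping, Optional


def resolve_setting(key: str,
                    environ: Mapping[str, str],
                    custom_file: Mapping[str, str],
                    project_file: Mapping[str, str],
                    user_file: Mapping[str, str],
                    default: Optional[str] = None) -> Optional[str]:
    """Return the value from the highest-priority source that defines key."""
    # Sources are checked from highest to lowest priority.
    for source in (environ, custom_file, project_file, user_file):
        if key in source:
            return source[key]
    return default


environ = {"CODE_RAG_DATABASE_TYPE": "qdrant"}
project = {"CODE_RAG_DATABASE_TYPE": "chroma", "CODE_RAG_CHUNK_SIZE": "2048"}
user = {"CODE_RAG_CHUNK_SIZE": "1024"}

print(resolve_setting("CODE_RAG_DATABASE_TYPE", environ, {}, project, user))  # qdrant
print(resolve_setting("CODE_RAG_CHUNK_SIZE", environ, {}, project, user))     # 2048
```

The environment variable wins for the database type, and the project config wins over the user config for the chunk size.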

If you run code-rag-setup without --install, the wizard only installs optional dependencies when the current Python environment is isolated (virtualenv, conda env, or uv tool env). This avoids accidentally modifying system or shared Python installations.

Environment Variables

# Use code-optimized embeddings (recommended)
export CODE_RAG_EMBEDDING_MODEL="nomic-ai/CodeRankEmbed"

# Or OpenAI embeddings
export OPENAI_API_KEY="sk-..."
export CODE_RAG_EMBEDDING_MODEL="text-embedding-3-small"

# Use Qdrant
export CODE_RAG_DATABASE_TYPE="qdrant"

# Adjust chunk size
export CODE_RAG_CHUNK_SIZE="2048"

# Enable reranking for better results
export CODE_RAG_RERANKER_ENABLED="true"

# Add custom ignore patterns (comma-separated)
export CODE_RAG_ADDITIONAL_IGNORE_PATTERNS="*.tmp,*.bak,logs/"
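
To make the comma-separated ignore list concrete, here is a rough sketch of how such a value could be split and matched against file paths. The matching rules are an assumption for illustration (glob patterns apply to the file name, a trailing slash means "any directory with this name"); Code-RAG's real matching may differ. `parse_patterns` and `is_ignored` are hypothetical helpers.

```python
import fnmatch
from typing import List


def parse_patterns(value: str) -> List[str]:
    """Split a comma-separated pattern list, dropping empty items."""
    return [p.strip() for p in value.split(",") if p.strip()]


def is_ignored(rel_path: str, patterns: List[str]) -> bool:
    """Check a relative path against the patterns (assumed semantics, see above)."""
    path = rel_path.replace("\\", "/")
    segments = path.split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match if any parent directory has this name.
            if pattern.rstrip("/") in segments[:-1]:
                return True
        elif fnmatch.fnmatch(segments[-1], pattern):
            # File pattern: glob against the file name only.
            return True
    return False


patterns = parse_patterns("*.tmp,*.bak,logs/")
for candidate in ("build/out.tmp", "logs/app.txt", "src/main.py"):
    print(candidate, is_ignored(candidate, patterns))
```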

Supported Cloud Providers

Code-RAG supports various cloud embedding providers via LiteLLM. Set CODE_RAG_EMBEDDING_MODEL to the provider-specific model name and provide the necessary credentials:

| Provider | Example model | Required environment variables |
| --- | --- | --- |
| OpenAI | `text-embedding-3-small` | `OPENAI_API_KEY` |
| Azure OpenAI | `azure/text-embedding-3-small` | `AZURE_API_KEY`, `AZURE_API_BASE`, `AZURE_API_VERSION` |
| Google Vertex AI | `vertex_ai/text-embedding-004` | `VERTEX_AI_PROJECT`, `VERTEX_AI_LOCATION`, plus `gcloud auth application-default login` |
| Cohere | `cohere/embed-english-v3.0` | `COHERE_API_KEY` |
| AWS Bedrock | `bedrock/amazon.titan-embed-text-v1` | `AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION_NAME` |

For other providers (HuggingFace, Mistral, etc.), refer to the LiteLLM documentation for model names and required environment variables.
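
A quick pre-flight check can catch missing credentials before the first indexing run. The sketch below encodes only the providers and variables listed above. The prefix-to-provider mapping and the treatment of un-prefixed `text-embedding-*` names as OpenAI are assumptions drawn from those examples; `provider_for` and `missing_credentials` are hypothetical helpers, not part of Code-RAG or LiteLLM.

```python
from typing import Dict, List, Mapping, Optional

# Required variables per provider, copied from the provider list above.
REQUIRED_ENV: Dict[str, List[str]] = {
    "openai": ["OPENAI_API_KEY"],
    "azure": ["AZURE_API_KEY", "AZURE_API_BASE", "AZURE_API_VERSION"],
    "vertex_ai": ["VERTEX_AI_PROJECT", "VERTEX_AI_LOCATION"],
    "cohere": ["COHERE_API_KEY"],
    "bedrock": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_REGION_NAME"],
}


def provider_for(model: str) -> Optional[str]:
    """Guess the provider from the model name; None means local or unknown."""
    prefix, sep, _ = model.partition("/")
    if not sep:
        # Assumption: un-prefixed text-embedding-* names are OpenAI models.
        return "openai" if model.startswith("text-embedding-") else None
    return prefix if prefix in REQUIRED_ENV else None


def missing_credentials(model: str, environ: Mapping[str, str]) -> List[str]:
    """List required variables that are unset or empty for the model's provider."""
    provider = provider_for(model)
    if provider is None:
        return []
    return [name for name in REQUIRED_ENV[provider] if not environ.get(name)]


print(missing_credentials("azure/text-embedding-3-small", {"AZURE_API_KEY": "example"}))
print(missing_credentials("nomic-ai/CodeRankEmbed", {}))
```

In practice you would pass `os.environ` as the second argument. The Vertex AI `gcloud` login step is not an environment variable, so this check cannot cover it.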

Config File Format

Config files use the same variable names as the environment, one key=value pair per line:

# ~/.config/code-rag/config or ./code-rag.config
CODE_RAG_EMBEDDING_MODEL=nomic-ai/CodeRankEmbed
CODE_RAG_DATABASE_TYPE=chroma
CODE_RAG_CHUNK_SIZE=1024
CODE_RAG_RERANKER_ENABLED=false

Full configuration options in docs/IMPLEMENTATION.md.
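
For scripts that need to read the same files, a key=value format like this is simple to parse. The sketch below assumes `#` starts a comment line (as in the sample above) and that values are unquoted; Code-RAG's own parser may handle more cases. `parse_config` is a hypothetical helper.

```python
from typing import Dict


def parse_config(text: str) -> Dict[str, str]:
    """Parse key=value lines, skipping blank lines and # comments."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line; ignored in this sketch
        settings[key.strip()] = value.strip()
    return settings


example = """
# ./code-rag.config
CODE_RAG_DATABASE_TYPE=chroma
CODE_RAG_CHUNK_SIZE=1024
"""
print(parse_config(example))
```

All values come back as strings, so a caller would still convert `CODE_RAG_CHUNK_SIZE` to an integer or `CODE_RAG_RERANKER_ENABLED` to a boolean.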

How It Works

Scans your codebase (respects .gitignore)
Chunks code intelligently (syntax-aware for Python, JavaScript, TypeScript, Go, Rust, Java, C, and C++)
Embeds chunks as vectors using ML models
Stores in vector database (ChromaDB or Qdrant)
Searches semantically when you query

Pluggable architecture - swap databases, embedding models, or add new ones.
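
The chunk, embed, store, search flow can be shown end to end with a toy stand-in. This is deliberately not how Code-RAG is implemented: the "embedding" here is a bag of lowercase words rather than a learned vector, the "database" is a Python list, and every name (`embed`, `cosine`, `chunk_lines`, `ToyIndex`) is hypothetical. It only illustrates how a query is compared against stored chunk vectors and ranked by similarity.

```python
import math
import re
from collections import Counter
from typing import Dict, List, Tuple


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (a real model returns dense vectors)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def chunk_lines(source: str, size: int = 3) -> List[str]:
    """Line-based chunking: fixed-size groups of consecutive lines."""
    lines = source.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]


class ToyIndex:
    """In-memory stand-in for the vector database."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str, Counter]] = []

    def add_file(self, path: str, source: str, size: int = 3) -> None:
        for chunk in chunk_lines(source, size):
            self.entries.append((path, chunk, embed(chunk)))

    def search(self, query: str, n_results: int = 3) -> List[Dict]:
        q = embed(query)
        scored = [{"file_path": p, "content": c, "similarity": cosine(q, v)}
                  for p, c, v in self.entries]
        scored.sort(key=lambda r: r["similarity"], reverse=True)
        return scored[:n_results]


index = ToyIndex()
index.add_file("auth.py", "def verify_token(token):\n    return token == load_secret()\n")
index.add_file("db.py", "def connect(url):\n    return open_connection(url)\n")
for hit in index.search("verify token", n_results=1):
    print(hit["file_path"], round(hit["similarity"], 2))
```

Because the toy embedding only counts shared words, it cannot match concepts the way a trained embedding model does. Replacing `embed` with a real model is what turns this from keyword overlap into semantic search.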

API Usage

Use programmatically:

from code_rag.api import CodeRAGAPI

api = CodeRAGAPI(database_type="chroma", embedding_model="all-MiniLM-L6-v2")
api.initialize_collection("myproject")

# Index
chunks = api.index_codebase("/path/to/project")

# Search
results = api.search("authentication logic", n_results=5)
for r in results:
    print(f"{r['file_path']} - {r['similarity']:.2f}")
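
Search results often contain several chunks from the same file, so it can help to group them before display. The sketch below relies only on the `file_path` and `similarity` keys used in the loop above. `group_by_file` is a hypothetical helper, and the sample results are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_file(results: List[Dict]) -> Dict[str, List[Dict]]:
    """Group results by file, ordering files by their best-scoring chunk."""
    grouped: Dict[str, List[Dict]] = defaultdict(list)
    for r in results:
        grouped[r["file_path"]].append(r)
    ordered = sorted(
        grouped.items(),
        key=lambda item: max(r["similarity"] for r in item[1]),
        reverse=True,
    )
    return dict(ordered)


# Made-up results in the same shape as the search output above.
fake_results = [
    {"file_path": "src/auth/tokens.py", "similarity": 0.58},
    {"file_path": "src/auth/login.py", "similarity": 0.61},
    {"file_path": "src/auth/tokens.py", "similarity": 0.47},
]
for path, chunks in group_by_file(fake_results).items():
    print(f"{path}: {len(chunks)} relevant chunk(s)")
```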

Documentation

AGENTS.md - Developer onboarding and architecture overview
docs/IMPLEMENTATION.md - Detailed implementation reference
docs/mcp.md - MCP server setup guide

Supported Languages

Syntax-aware chunking for: Python, JavaScript, TypeScript, Go, Rust, Java, C, C++

Other languages use line-aware chunking (still works, just less context-aware).
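
The difference between the two chunking modes is easiest to see side by side. Code-RAG is built with Tree-sitter; the sketch below substitutes Python's standard `ast` module so it runs without extra dependencies, and it handles Python source only. `syntax_chunks`, `line_chunks`, and `SAMPLE` are hypothetical names for illustration.

```python
import ast
from typing import List, Tuple

SAMPLE = '''def load(path):
    with open(path) as f:
        return f.read()


class Index:
    def add(self, item):
        self.items.append(item)
'''


def syntax_chunks(source: str) -> List[Tuple[int, int, str]]:
    """One chunk per top-level function or class: (start_line, end_line, name)."""
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append((node.lineno, node.end_lineno, node.name))
    return chunks


def line_chunks(source: str, size: int = 4) -> List[Tuple[int, int]]:
    """Fixed-size line ranges that ignore code structure: (start_line, end_line)."""
    total = len(source.splitlines())
    return [(start, min(start + size - 1, total))
            for start in range(1, total + 1, size)]


print(syntax_chunks(SAMPLE))   # boundaries follow definitions
print(line_chunks(SAMPLE, 2))  # boundaries fall wherever the line count says
```

With a chunk size of 2, the line-based ranges split `load` between its second and third lines, while the syntax-aware ranges keep each definition whole.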

Requirements

Python 3.10+
Minimal dependencies (ChromaDB + sentence-transformers by default)
Optional: OpenAI API key, Qdrant server

Troubleshooting

Import errors? pip install --force-reinstall --upgrade code-rag-mcp (or pip install -e . if developing locally)

Database issues? code-rag-cli --reindex

Memory issues? export CODE_RAG_BATCH_SIZE="16"

Wizard verification passed but first use still downloads models? Expected. The wizard verifies that the selected backend and credentials are present, but it does not force model downloads or make provider API calls during verification.

Development

Setup

# Install with dev dependencies
pip install -e ".[dev]"

Testing & Linting

# Run tests
pytest

# Format code
black .
isort .

# Linting
flake8

Contributing

Fork the repo
Create feature branch
Make changes
Add tests
Submit PR

See AGENTS.md for architecture and docs/IMPLEMENTATION.md for internals.

License

MIT License. See LICENSE for details.

Built with ChromaDB, Qdrant, sentence-transformers, and Tree-sitter

Tools

search_codebase
