MCP Code Indexer
A production-ready Model Context Protocol (MCP) server that revolutionizes how AI agents navigate and understand codebases. Built for high-concurrency environments with advanced database resilience, the server provides instant access to intelligent descriptions, semantic search, and context-aware recommendations while maintaining 800+ writes/sec throughput.
What It Does
The MCP Code Indexer solves a critical problem for AI agents working with large codebases: understanding code structure without repeatedly scanning files. Instead of reading every file, agents can:
Query file purposes instantly with natural language descriptions
Search across codebases using full-text search
Get intelligent recommendations based on codebase size (overview vs search)
Generate condensed overviews for project understanding
Perfect for AI-powered code review, refactoring tools, documentation generation, and codebase analysis workflows.
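As a rough illustration of the size-based recommendation, the sketch below sums per-file description token counts and compares the total with a token limit (32000 is the documented default for --token-limit). The function name, the result fields, and the "use_search" / "use_overview" labels are hypothetical; this is not the server's actual implementation.

```python
from dataclasses import dataclass


@dataclass
class SizeRecommendation:
    total_tokens: int
    is_large: bool
    recommendation: str


def recommend_navigation(description_tokens, token_limit=32000):
    """Pick a navigation strategy from per-file description token counts.

    description_tokens maps file paths to token counts (hypothetical input).
    """
    total = sum(description_tokens.values())
    is_large = total > token_limit
    # Large codebases are steered to search; small ones can load the full overview.
    strategy = "use_search" if is_large else "use_overview"
    return SizeRecommendation(total, is_large, strategy)
```

An agent would call this once per project and branch on the recommendation field before deciding whether to load every description or run a targeted search.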
Quick Start
For Developers
Get started integrating MCP Code Indexer into your AI agent workflow:
# Install with Poetry
poetry add mcp-code-indexer
# Or with pip
pip install mcp-code-indexer
# Start the MCP server
mcp-code-indexer
# Connect your MCP client and start using tools
# See API Reference for complete tool documentation
For Web Applications
Enable HTTP/REST API access for browser-based applications:
# Start HTTP server with authentication
mcp-code-indexer --http --auth-token "your-secret-token"
# Custom host and port
mcp-code-indexer --http --host 0.0.0.0 --port 8080
# CORS configuration for web apps
mcp-code-indexer --http --cors-origins "https://localhost:3000" "https://myapp.com"
Complete HTTP API Reference →
For AI-Powered Q&A
Ask questions about your codebase using natural language:
# Set OpenRouter API key for Claude access
export OPENROUTER_API_KEY="your-openrouter-api-key"
# Simple questions about project architecture
mcp-code-indexer --ask "What does this project do?" my-project
# Enhanced analysis with file search
mcp-code-indexer --deepask "How is authentication implemented?" web-app
# JSON output for programmatic use
mcp-code-indexer --ask "List the main components" my-project --json
Complete Q&A Interface Guide →
For System Administrators
Deploy and configure the server for your team:
# Production deployment with custom settings
mcp-code-indexer \
--token-limit 64000 \
--db-path /data/mcp-index.db \
--cache-dir /var/cache/mcp \
--log-level INFO
# Check installation
mcp-code-indexer --version
For Everyone
New to MCP Code Indexer? Start here:
Install: poetry add mcp-code-indexer (or pip install mcp-code-indexer)
Run: mcp-code-indexer --token-limit 32000
Connect: Use your favorite MCP client
Explore: Try the check_codebase_size tool first
Development Setup:
# Clone and setup for contributing
git clone https://github.com/fluffypony/mcp-code-indexer.git
cd mcp-code-indexer
# Install with Poetry (recommended)
poetry install
# Or install in development mode with pip
pip install -e .
# Run the server
mcp-code-indexer --token-limit 32000
Git Hook Integration
NEW Feature: Automated code indexing with AI-powered analysis! Keep your file descriptions synchronized automatically as your codebase evolves.
For Users: Quick Setup
# Set your OpenRouter API key
export OPENROUTER_API_KEY="sk-or-v1-your-api-key-here"
# Test git hook functionality
mcp-code-indexer --githook
# Install post-commit hook
cp examples/git-hooks/post-commit .git/hooks/
chmod +x .git/hooks/post-commit
For Developers: How It Works
The git hook integration provides intelligent automation:
Git Analysis: Automatically analyzes git diffs after commits/merges
AI Processing: Uses OpenRouter API with Anthropic's Claude Sonnet 4
Smart Updates: Only processes files that actually changed
Overview Maintenance: Updates project overview when structure changes
Error Isolation: Git operations continue even if indexing fails
Rate Limiting: Built-in retry logic with exponential backoff
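To make the retry behaviour concrete, here is a minimal stdlib-only sketch of exponential backoff with a small jitter. It is an illustration of the general technique, not the project's own code (the project lists tenacity among its dependencies); the function name, parameters, and default delays are assumptions.

```python
import random
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call func until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            # 1s, 2s, 4s, ... capped at max_delay, plus up to 10% jitter
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The sleep parameter exists only so the waiting can be replaced in tests; in normal use the default time.sleep applies.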
Key Benefits
Zero Manual Work: Descriptions stay current without any effort
Performance: Only analyzes changed files, not entire codebase
Reliability: Robust error handling ensures git operations never fail
Configurable: Support for custom models and timeout settings
Learn More: See Git Hook Setup Guide for complete configuration options and troubleshooting.
Vector Mode (BETA)
NEW Feature: Semantic code search with vector embeddings! Experience AI-powered code discovery that understands context and meaning, not just keywords.
What is Vector Mode?
Vector Mode transforms how you search and understand codebases by using AI embeddings:
Semantic Search: Find code by meaning, not just text matching
Real-time Indexing: Automatic embedding generation as code changes
Secure by Default: Comprehensive secret redaction before API calls
Multi-language: Python, JavaScript, TypeScript with AST-based chunking
Smart Chunking: Context-aware code segmentation for optimal embeddings
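The idea behind AST-based chunking can be sketched with Python's standard ast module: parse the file and emit one chunk per top-level function or class, so each embedding covers a complete syntactic unit. This is a simplified illustration under that assumption; the chunk fields and function name are hypothetical, and Vector Mode's real chunker may split differently.

```python
import ast


def chunk_python_source(source):
    """Split Python source into one chunk per top-level function or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                "start_line": node.lineno,
                "end_line": node.end_lineno,
                # Exact source text of the definition (decorators not included)
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```

Imports and other module-level statements are skipped here; a fuller chunker would likely group them into their own chunk so they remain searchable.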
Quick Start
# Install MCP Code Indexer (includes vector mode)
pip install mcp-code-indexer
# Set required API keys
export VOYAGE_API_KEY="pa-your-voyage-api-key"
export TURBOPUFFER_API_KEY="your-turbopuffer-api-key"
# Optional: Configure region (default: gcp-europe-west3)
export TURBOPUFFER_REGION="gcp-europe-west3"
# Start with vector mode enabled
mcp-code-indexer --vector
# The daemon automatically starts and begins indexing your projects
Key Features
Secret Redaction: 20+ pattern types automatically detected and redacted
Merkle Trees: Efficient change detection without full directory scans
Circuit Breakers: Resilient API integration with automatic retry logic
Production Ready: Built for high-concurrency with comprehensive monitoring
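A small sketch may help show why a Merkle tree avoids full scans: every directory hash is derived from its children's hashes, so a comparison can stop at any subtree whose hash is unchanged. The example below works on an in-memory nested dict rather than a real filesystem, and all names are hypothetical; it is not the project's implementation.

```python
import hashlib


def _sha(text):
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def build_merkle(node):
    """Build a hash tree from a nested dict (dict = directory, str = file content)."""
    if isinstance(node, dict):
        children = {name: build_merkle(child) for name, child in sorted(node.items())}
        payload = "".join(f"{name}:{child['hash']};" for name, child in children.items())
        return {"hash": _sha(payload), "children": children}
    return {"hash": _sha(node), "children": None}


def changed_files(old, new, prefix=""):
    """Return paths of added or modified files, skipping unchanged subtrees."""
    if old is not None and old["hash"] == new["hash"]:
        return []  # identical subtree: nothing below it needs to be visited
    if new["children"] is None:
        return [prefix]
    old_children = (old or {}).get("children") or {}
    changed = []
    for name, child in new["children"].items():
        path = f"{prefix}/{name}" if prefix else name
        changed += changed_files(old_children.get(name), child, path)
    return changed
```

This sketch reports additions and modifications only; detecting deletions would need a second pass over the names present in the old tree but missing from the new one.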
Advanced Configuration
# Custom configuration
mcp-code-indexer --vector --vector-config /path/to/config.yaml
# HTTP mode with vector search
mcp-code-indexer --vector --http --port 8080
Architecture
Vector Mode adds powerful new MCP tools:
vector_search - Semantic code search across projects
find_similar_code - Find code similar to a given snippet or file section
similarity_search - Find similar code patterns
dependency_search - Discover code relationships
vector_status - Monitor indexing progress
Status: Currently in BETA - foundations implemented, full pipeline in development.
Development Setup
For Contributors
Contributing to MCP Code Indexer? Follow these steps for a proper development environment:
# Setup development environment
git clone https://github.com/fluffypony/mcp-code-indexer.git
cd mcp-code-indexer
# Install with Poetry (recommended)
poetry install
# Or use pip with virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pip install -e ".[dev]"
# Verify installation
python main.py --help
mcp-code-indexer --version
Important: The editable install (pip install -e .) is required for development. The project uses a standard PyPI package structure with absolute imports such as from mcp_code_indexer.database.database import DatabaseManager. Without an editable install, these imports fail with ModuleNotFoundError.
Development Workflow
# Activate virtual environment
source venv/bin/activate
# Run the server directly
python main.py --token-limit 32000
# Or use the installed CLI command
mcp-code-indexer --token-limit 32000
# Run tests
python -m pytest tests/ -v
# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html
# Format code
black src/ tests/
isort src/ tests/
# Type checking
mypy src/
MCP Tools Available
The server provides 13 powerful MCP tools for intelligent codebase management. Whether you're an AI agent or human developer, these tools make navigating code effortless.
Essential Tools (Start Here)
Tool | Purpose | When to Use |
| Get navigation recommendations | First tool to call for any project |
| Find files by functionality | When you need specific files |
| Project architectural summary | Understanding system design |
Core Operations
Tool | Purpose | Best For |
| Retrieve file summaries | Quick file understanding |
| Store detailed file analysis | AI agents updating descriptions |
| Scan for undocumented files | Maintenance and coverage |
Advanced Features
Tool | Purpose | Use Case |
| Complete project structure | Small-to-medium codebases |
| Technical vocabulary analysis | Domain understanding |
| Create project documentation | Architecture documentation |
| Search in project overviews | Finding specific topics |
| Find code similar to snippet/section | Code pattern discovery (Vector Mode) |
System Health
Tool | Purpose | For |
| Real-time performance monitoring | Production deployments |
Pro Tip: Always start with check_codebase_size to get personalized recommendations for navigating your specific codebase.
Complete API Documentation: View all 13 tools with examples →
Git Hook Integration
Keep your codebase documentation automatically synchronized with automated analysis on every commit:
# Analyze current staged changes
mcp-code-indexer --githook
# Analyze a specific commit
mcp-code-indexer --githook abc123def
# Analyze using HEAD syntax
mcp-code-indexer --githook HEAD
mcp-code-indexer --githook HEAD~1
mcp-code-indexer --githook HEAD~3
# Analyze a commit range (perfect for rebases)
mcp-code-indexer --githook abc123 def456
mcp-code-indexer --githook HEAD~5 HEAD
Perfect for:
Automated documentation that never goes stale
Rebase-aware analysis that handles complex git operations
Zero-effort maintenance with background processing
See the Git Hook Setup Guide for complete installation instructions including post-commit, post-merge, and post-rewrite hooks.
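The three invocation forms above (no revision, one commit, a commit range) each imply a different set of changed files. The sketch below shows one plausible way to map them onto standard git commands; the mapping and the function name are assumptions for illustration, and the tool itself may gather its diffs differently.

```python
def changed_files_command(revisions):
    """Build a git command that lists changed file names for the given revisions."""
    if len(revisions) == 0:
        # No revision: look at what is currently staged
        return ["git", "diff", "--cached", "--name-only"]
    if len(revisions) == 1:
        # One commit: files touched by that commit
        return ["git", "diff-tree", "--no-commit-id", "--name-only", "-r", revisions[0]]
    if len(revisions) == 2:
        # Two revisions: everything that differs across the range
        return ["git", "diff", "--name-only", f"{revisions[0]}..{revisions[1]}"]
    raise ValueError("expected at most two revisions")
```

The returned list can be passed to subprocess.run(..., capture_output=True, text=True, check=True) inside a repository, and the file names read from stdout.splitlines().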
Architecture Highlights
Performance Optimized
SQLite with WAL mode for high-concurrency access (800+ writes/sec)
Smart connection pooling with optimized pool size (3 connections default)
FTS5 full-text search with prefix indexing for sub-100ms queries
Token-aware caching to minimize expensive operations
Write operation serialization to eliminate database lock conflicts
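For readers unfamiliar with these SQLite features, the following sketch uses Python's built-in sqlite3 module to switch a database to WAL mode and create an FTS5 table with prefix indexes, then runs a prefix query. The table and column names are hypothetical and the project's real schema is not shown in this document; the example also assumes the local SQLite build includes FTS5.

```python
import os
import sqlite3
import tempfile


def open_index(db_path):
    """Open a database in WAL mode and ensure an FTS5 table with prefix indexes exists."""
    conn = sqlite3.connect(db_path)
    journal_mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        "CREATE VIRTUAL TABLE IF NOT EXISTS descriptions "
        "USING fts5(file_path, description, prefix='2 3')"
    )
    return conn, journal_mode


def search(conn, query):
    """Full-text search over all indexed columns, best matches first."""
    rows = conn.execute(
        "SELECT file_path FROM descriptions WHERE descriptions MATCH ? ORDER BY rank",
        (query,),
    )
    return [row[0] for row in rows]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        conn, journal_mode = open_index(os.path.join(tmp, "index.db"))
        conn.executemany(
            "INSERT INTO descriptions VALUES (?, ?)",
            [
                ("src/auth.py", "Handles authentication and session tokens"),
                ("src/db.py", "Connection pooling for SQLite"),
            ],
        )
        conn.commit()
        hits = search(conn, "auth*")  # trailing * makes this a prefix query
        conn.close()
    return journal_mode, hits
```

WAL mode lets readers proceed while a single writer appends to the log, which is why it pairs naturally with serialized writes; the prefix option trades some index size for faster prefix lookups.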
Production Ready
Database resilience features with <2% error rate under high load
Exponential backoff retry logic with intelligent failure recovery
Comprehensive health monitoring with automatic pool refresh
Structured JSON logging with performance metrics tracking
Async-first design with proper resource cleanup
MCP protocol compliant with clean stdio streams
Upstream inheritance for fork workflows
Git integration with .gitignore support
Developer Friendly
95%+ test coverage with async support and concurrent access tests
Integration tests for complete workflows including database stress testing
Performance benchmarks for large codebases with resilience validation
Clear error messages with MCP protocol compliance
Comprehensive configuration options for production tuning
Documentation
Comprehensive documentation organized by user journey and expertise level.
Getting Started (New Users)
Guide | Purpose | Time Investment |
Install and run your first server | 2 minutes | |
Master all 13 MCP tools | 15 minutes | |
REST API for web applications | 10 minutes | |
AI-powered codebase analysis | 8 minutes | |
Automate your workflow | 5 minutes |
Production Deployment (Teams & Admins)
Guide | Focus | Best For |
Complete command documentation | All users | |
Project & database management | System administrators | |
Production setup & tuning | System administrators | |
High-concurrency optimization | DevOps teams | |
Production monitoring | Operations teams |
Advanced Topics (Power Users)
Guide | Depth | For |
System design deep dive | Developers & architects | |
Advanced error handling | Senior developers | |
Development workflow | Contributors |
Quick References
Examples & Integrations - Ready-to-use configurations
Troubleshooting - Common issues & solutions
API Tools Summary - All 13 tools at a glance
Reading Paths:
New to MCP Code Indexer? Quick Start → API Reference → HTTP API → Q&A Interface
Web developers? Quick Start → HTTP API Reference → Q&A Interface → Git Hooks
AI/ML engineers? Quick Start → Q&A Interface → API Reference → Git Hooks
Setting up for a team? CLI Reference → Configuration → Administrative Commands → Monitoring
Contributing to the project? Architecture → Contributing → API Reference
System Requirements
Python 3.8+ with asyncio support
SQLite 3.35+ (included with Python)
4GB+ RAM for large codebases (1000+ files)
SSD storage recommended for optimal performance
Performance
Tested with codebases up to 10,000 files:
File description retrieval: < 10ms
Full-text search: < 100ms
Codebase overview generation: < 2s
Merge conflict detection: < 5s
Advanced Configuration
For Developers: Basic Configuration
# Production setup with custom limits
mcp-code-indexer \
--token-limit 50000 \
--db-path /data/mcp-index.db \
--cache-dir /tmp/mcp-cache \
--log-level INFO
# Enable structured logging
export MCP_LOG_FORMAT=json
mcp-code-indexer
For System Administrators: Database Resilience Tuning
Configure advanced database resilience features for high-concurrency environments:
# High-performance production deployment
mcp-code-indexer \
--token-limit 64000 \
--db-path /data/mcp-index.db \
--cache-dir /var/cache/mcp \
--log-level INFO \
--db-pool-size 5 \
--db-retry-count 7 \
--db-timeout 15.0 \
--enable-wal-mode \
--health-check-interval 20.0
# Environment variable configuration
export DB_POOL_SIZE=5
export DB_RETRY_COUNT=7
export DB_TIMEOUT=15.0
export DB_WAL_MODE=true
export DB_HEALTH_CHECK_INTERVAL=20.0
mcp-code-indexer --token-limit 64000
Configuration Options
Parameter | Default | Description | Use Case |
| 3 | Database connection pool size | Higher for more concurrent clients |
| 5 | Max retry attempts for failed operations | Increase for unstable environments |
| 10.0 | Transaction timeout (seconds) | Increase for large operations |
| true | Enable WAL mode for concurrency | Always enable for production |
| 30.0 | Health monitoring interval (seconds) | Lower for faster issue detection |
Performance Tip: For environments with 10+ concurrent clients, use --db-pool-size 5 and --health-check-interval 15.0 for optimal throughput.
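As an illustration of how the environment variables and defaults listed above might be combined, the sketch below reads each DB_* variable and falls back to the documented default when it is unset. The DbSettings class and load_db_settings function are hypothetical names; the server's own configuration loading may work differently.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class DbSettings:
    # Defaults taken from the configuration table above
    pool_size: int = 3
    retry_count: int = 5
    timeout: float = 10.0
    wal_mode: bool = True
    health_check_interval: float = 30.0


def load_db_settings(env=None):
    """Resolve settings from environment variables, using defaults when unset."""
    env = os.environ if env is None else env
    defaults = DbSettings()
    wal_raw = str(env.get("DB_WAL_MODE", defaults.wal_mode)).strip().lower()
    return DbSettings(
        pool_size=int(env.get("DB_POOL_SIZE", defaults.pool_size)),
        retry_count=int(env.get("DB_RETRY_COUNT", defaults.retry_count)),
        timeout=float(env.get("DB_TIMEOUT", defaults.timeout)),
        wal_mode=wal_raw in ("1", "true", "yes"),
        health_check_interval=float(
            env.get("DB_HEALTH_CHECK_INTERVAL", defaults.health_check_interval)
        ),
    )
```

Passing a plain dict instead of os.environ makes the resolution easy to check, for example load_db_settings({"DB_POOL_SIZE": "5"}).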
Integration Examples
With AI Agents
# Example: AI agent using MCP tools
# Assumes mcp_client is an already-connected MCP client with an async call_tool method
async def analyze_codebase(project_path):
    # Check if codebase is large
    size_info = await mcp_client.call_tool("check_codebase_size", {
        "projectName": "my-project",
        "folderPath": project_path
    })
    if size_info["isLarge"]:
        # Use search for large codebases
        return await mcp_client.call_tool("search_descriptions", {
            "projectName": "my-project",
            "folderPath": project_path,
            "query": "authentication logic"
        })
    else:
        # Get full overview for smaller projects
        return await mcp_client.call_tool("get_codebase_overview", {
            "projectName": "my-project",
            "folderPath": project_path
        })
With CI/CD Pipelines
# Example: GitHub Actions integration
# mcp_client / MCPClient stand in for whichever MCP client library you use
- name: Update Code Descriptions
  run: |
    python -c "
    import asyncio
    from mcp_client import MCPClient

    async def update_descriptions():
        client = MCPClient('mcp-code-indexer')
        # Find files without descriptions
        missing = await client.call_tool('find_missing_descriptions', {
            'projectName': '${{ github.repository }}',
            'folderPath': '.'
        })
        # Process with AI and update...

    asyncio.run(update_descriptions())
    "
Testing
# Install with test dependencies using Poetry
poetry install --with test
# Or with pip
pip install "mcp-code-indexer[test]"
# Run full test suite
python -m pytest tests/ -v
# Run with coverage
python -m pytest tests/ --cov=src --cov-report=html
# Run performance tests
python -m pytest tests/ -m performance
# Run integration tests only
python -m pytest tests/integration/ -v
Monitoring
The server provides structured JSON logs for monitoring:
{
  "timestamp": "2024-01-15T10:30:00Z",
  "level": "INFO",
  "message": "Tool search_descriptions completed",
  "tool_usage": {
    "tool_name": "search_descriptions",
    "success": true,
    "duration_seconds": 0.045,
    "result_size": 1247
  }
}
Command Line Options
Server Mode (Default)
mcp-code-indexer [OPTIONS]
Options:
  --token-limit INT    Maximum tokens before recommending search (default: 32000)
  --db-path PATH       SQLite database path (default: ~/.mcp-code-index/tracker.db)
  --cache-dir PATH     Cache directory path (default: ~/.mcp-code-index/cache)
  --log-level LEVEL    Logging level: DEBUG|INFO|WARNING|ERROR|CRITICAL (default: INFO)
Git Hook Mode
mcp-code-indexer --githook [OPTIONS]
# Automated analysis of git changes using OpenRouter API
# Requires: OPENROUTER_API_KEY environment variable
HTTP Server Mode
# Start HTTP/REST API server
mcp-code-indexer --http [OPTIONS]
# HTTP server with authentication
mcp-code-indexer --http --auth-token "your-secret-token"
# Custom host and port configuration
mcp-code-indexer --http --host 0.0.0.0 --port 8080
Q&A Commands
# Simple AI-powered questions (requires OPENROUTER_API_KEY)
mcp-code-indexer --ask "What does this project do?" PROJECT_NAME
# Enhanced analysis with file search
mcp-code-indexer --deepask "How is authentication implemented?" PROJECT_NAME
# JSON output for programmatic use
mcp-code-indexer --ask "Question" PROJECT_NAME --json
Administrative Commands
# List all projects
mcp-code-indexer --getprojects
# Execute MCP tool directly
mcp-code-indexer --runcommand '{"method": "tools/call", "params": {...}}'
# Export descriptions for a project
mcp-code-indexer --dumpdescriptions PROJECT_ID
# Create local database for a project
mcp-code-indexer --makelocal /path/to/project
# Generate project documentation map
mcp-code-indexer --map PROJECT_NAME
Security Features
Input validation on all MCP tool parameters
SQL injection protection via parameterized queries
File system sandboxing with .gitignore respect
Error sanitization to prevent information leakage
Async resource cleanup to prevent memory leaks
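To show what parameterized queries protect against, here is a minimal sketch using Python's built-in sqlite3 module. User-supplied values are bound through ? placeholders, so a hostile string is compared as plain data instead of being parsed as SQL. The table name, columns, and function names are hypothetical and are not taken from the project's schema.

```python
import sqlite3


def get_description(conn, project, file_path):
    """Look up one description; values are bound, never spliced into the SQL text."""
    row = conn.execute(
        "SELECT description FROM file_descriptions WHERE project = ? AND file_path = ?",
        (project, file_path),
    ).fetchone()
    return row[0] if row else None


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE file_descriptions (project TEXT, file_path TEXT, description TEXT)"
    )
    conn.execute(
        "INSERT INTO file_descriptions VALUES (?, ?, ?)",
        ("my-project", "app.py", "Application entry point"),
    )
    normal = get_description(conn, "my-project", "app.py")
    # An injection attempt is treated as a literal (non-matching) file path
    hostile = get_description(conn, "my-project", "app.py' OR '1'='1")
    conn.close()
    return normal, hostile
```

Had the query been built with string formatting, the second lookup would have matched every row; with bound parameters it simply finds nothing.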
Quick Troubleshooting
Common issues and instant solutions:
Issue | Quick Fix | Learn More |
"No module named 'mcp_code_indexer'" | | |
"OPENROUTER_API_KEY not found" | | |
"Database is locked" | Enable WAL mode: | |
"Large codebase - use search" | Normal for 200+ files. Use | |
HTTP authentication failed | Check | |
Q&A commands not working | Set | |
High memory usage | Reduce token limit: |
Not finding your issue? Check the complete troubleshooting guides in our documentation.
Next Steps
Ready to supercharge your AI agents with intelligent codebase navigation?
Choose Your Path
New to MCP Code Indexer?
Install and run your first server - Get up and running in 2 minutes
Master the API tools - Learn all 13 tools with examples
Try HTTP API access - REST API for web applications
Explore AI-powered Q&A - Ask questions about your code
Set up git hooks - Automate your workflow
Setting up for a team?
Learn all CLI commands - Complete command reference
Configure for production - Production deployment guide
Set up administrative workflows - Project & database management
Performance optimization - High-concurrency setup
Monitoring & alerts - Production monitoring
Want to contribute?
Understand the architecture - Technical deep dive
Development setup - Contribution workflow
Report issues - Share feedback and suggestions
Learning Resources:
Examples & integrations - Ready-to-use configurations
Video tutorials - Coming soon!
Community discussions - Ask questions and share tips
Contributing
We welcome contributions! See our Contributing Guide for:
Development setup
Code style guidelines
Testing requirements
Pull request process
License
MIT License - see LICENSE for details.
Built With
Model Context Protocol - The foundation for tool integration
tiktoken - Fast BPE tokenization
aiosqlite - Async SQLite operations
aiohttp - Async HTTP client for OpenRouter API
tenacity - Robust retry logic and rate limiting
Pydantic - Data validation and settings
Transform how your AI agents understand code!
New User? Get started in 2 minutes
Developer? Explore the complete API
Production? Deploy with confidence