
Codebase MCP Server

by Ravenight13


A production-grade MCP (Model Context Protocol) server that indexes code repositories into PostgreSQL with pgvector for semantic search, designed specifically for AI coding assistants.

What's New in v2.0

Version 2.0 represents a major architectural refactoring focused exclusively on semantic code search capabilities. This release removes project management, entity tracking, and work item features to maintain single-responsibility focus.

Breaking Changes:

  • 14 tools removed (project management, entity tracking, work item features extracted to workflow-mcp)

  • 3 tools remaining: start_indexing_background, get_indexing_status, and search_code with multi-project support

  • Foreground index_repository removed (all indexing now uses background jobs to prevent timeouts)

  • Database schema simplified (9 tables dropped, project_id parameter added)

  • New environment variables for optional workflow-mcp integration

Migration Required: Existing v1.x users must follow the migration guide to upgrade safely. See Migration Guide for complete upgrade and rollback procedures.

What's Preserved: All indexed repositories and code embeddings remain searchable after migration.

What's Discarded: All v1.x project management data, entities, and work items are permanently removed.


Features

The Codebase MCP Server provides exactly 3 MCP tools for semantic code search with multi-project workspace support:

  1. start_indexing_background: Start a background indexing job for a repository

    • Returns job_id immediately to prevent MCP client timeouts

    • Accepts optional project_id parameter for workspace isolation

    • Default behavior: indexes to default project workspace if project_id not specified

    • Performance target: 60-second indexing for 10,000 files

  2. get_indexing_status: Poll the status of a background indexing job

    • Query job progress using job_id from start_indexing_background

    • Returns files_indexed, chunks_created, and completion status

    • Enables responsive UIs with progress indicators

  3. search_code: Semantic code search with natural language queries

    • Accepts optional project_id parameter to restrict search scope

    • Default behavior: searches default project workspace if project_id not specified

    • Performance target: 500ms p95 search latency

Multi-Project Support

The v2.0 architecture supports isolated project workspaces through the optional project_id parameter:

Single Project Workflow (default):

```python
# Start background indexing job - uses default workspace
job = await start_indexing_background(repo_path="/path/to/repo")
job_id = job["job_id"]

# Poll for completion
while True:
    status = await get_indexing_status(job_id=job_id)
    if status["status"] in ["completed", "failed"]:
        break
    await asyncio.sleep(2)

# Search without project_id - searches default workspace
search_code(query="authentication logic")
```

Multi-Project Workflow:

```python
# Index to specific project workspace
job = await start_indexing_background(
    repo_path="/path/to/client-a-repo",
    project_id="client-a"
)
job_id = job["job_id"]

# Poll for completion
while True:
    status = await get_indexing_status(job_id=job_id, project_id="client-a")
    if status["status"] in ["completed", "failed"]:
        break
    await asyncio.sleep(2)

# Search specific project workspace
search_code(query="authentication logic", project_id="client-a")
```

Use Cases:

  • Single Project: Individual developers or small teams working on one codebase

  • Multi-Project: Consultants managing multiple client codebases, organizations with separate product lines, or multi-tenant deployments requiring workspace isolation

Optional Integration: The project_id can be automatically resolved from Git repository context when the optional workflow-mcp server is configured. Without workflow-mcp, all operations default to a single shared workspace.

Quick Start

1. Database Setup

```shell
# Create database
createdb codebase_mcp

# Initialize schema
psql -d codebase_mcp -f db/init_tables.sql
```

2. Install Dependencies

```shell
# Install dependencies including FastMCP framework
uv sync

# Or with pip
pip install -r requirements.txt
```

Key Dependencies:

  • fastmcp>=0.1.0 - Modern MCP framework with decorator-based tools

  • anthropic-mcp - MCP protocol implementation

  • sqlalchemy>=2.0 - Async ORM

  • pgvector - PostgreSQL vector extension

  • ollama - Embedding generation

3. Configure Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json:

```json
{
  "mcpServers": {
    "codebase-mcp": {
      "command": "uv",
      "args": [
        "run",
        "--with",
        "fastmcp",
        "python",
        "/absolute/path/to/codebase-mcp/server_fastmcp.py"
      ]
    }
  }
}
```

Important:

  • Use absolute paths!

  • Server uses FastMCP framework with decorator-based tool definitions

  • All logs go to /tmp/codebase-mcp.log (no stdout/stderr pollution)

4. Start Ollama

```shell
ollama serve
ollama pull nomic-embed-text
```

5. Test

```shell
# Test database and tools
uv run python tests/test_tool_handlers.py

# Test repository indexing
uv run python tests/test_embeddings.py
```

Current Status

Working Tools (3/3) ✅

| Tool | Status | Description |
| --- | --- | --- |
| start_indexing_background | ✅ Working | Start background indexing job, returns job_id immediately |
| get_indexing_status | ✅ Working | Poll indexing job status with files_indexed/chunks_created |
| search_code | ✅ Working | Semantic code search with pgvector similarity |

Recent Fixes (Oct 6, 2025)

  • ✅ Parameter passing architecture (Pydantic models)

  • ✅ MCP schema mismatches (status enums, missing parameters)

  • ✅ Timezone/datetime compatibility (PostgreSQL)

  • ✅ Binary file filtering (images, cache dirs)

Test Results

  • ✅ Task Management: 7/7 tests passed

  • ✅ Repository Indexing: 2 files indexed, 6 chunks created

  • ✅ Embeddings: 100% coverage (768-dim vectors)

  • ✅ Database: Connection pool, async operations working

Tool Usage Examples

Index a Repository (Background Job)

In Claude Desktop:

Index the repository at /Users/username/projects/myapp

Initial Response (immediate):

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "pending",
  "message": "Indexing job started",
  "project_id": "default",
  "database_name": "cb_proj_default_00000000"
}
```

Poll for Status:

Check the status of indexing job 550e8400-e29b-41d4-a716-446655440000

Completed Response:

```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "completed",
  "repo_path": "/Users/username/projects/myapp",
  "files_indexed": 234,
  "chunks_created": 1456,
  "error_message": null,
  "created_at": "2025-10-18T10:30:00Z",
  "started_at": "2025-10-18T10:30:01Z",
  "completed_at": "2025-10-18T10:30:15Z"
}
```

Search Code

Search for "authentication middleware" in Python files

Response:

```json
{
  "results": [
    {
      "file_path": "src/middleware/auth.py",
      "content": "def authenticate_request(request):\n    ...",
      "start_line": 45,
      "similarity_score": 0.92
    }
  ],
  "total_count": 5,
  "latency_ms": 250
}
```

Architecture

```
Claude Desktop ↔ FastMCP Server ↔ Tool Handlers ↔ Services ↔ PostgreSQL
                                                     ↓
                                             Ollama (embeddings)
```

MCP Framework: Built with FastMCP - a modern, decorator-based framework for building MCP servers with:

  • Type-safe tool definitions via @mcp.tool() decorators

  • Automatic JSON Schema generation from Pydantic models

  • Dual logging (file + MCP protocol) without stdout pollution

  • Async/await support throughout

See Multi-Project Architecture for detailed component diagrams.

Documentation

Database Schema

11 tables with pgvector for semantic search:

Core Tables:

  • repositories - Indexed repositories

  • code_files - Source files with metadata

  • code_chunks - Semantic chunks with embeddings (vector(768))

  • tasks - Development tasks with git tracking

  • task_status_history - Audit trail

See Multi-Project Architecture for complete schema documentation.

Technology Stack

  • MCP Framework: FastMCP 0.1+ (decorator-based tool definitions)

  • Server: Python 3.13+, FastAPI patterns, async/await

  • Database: PostgreSQL 14+ with pgvector extension

  • Embeddings: Ollama (nomic-embed-text, 768 dimensions)

  • ORM: SQLAlchemy 2.0 (async), Pydantic V2 for validation

  • Type Safety: Full mypy --strict compliance

Development

Running Tests

```shell
# Tool handlers
uv run python tests/test_tool_handlers.py

# Repository indexing
uv run python tests/test_embeddings.py

# Unit tests
uv run pytest tests/ -v
```

Code Structure

```
codebase-mcp/
├── server_fastmcp.py          # FastMCP server entry point (NEW)
├── src/
│   ├── mcp/
│   │   └── tools/             # Tool handlers with service integration
│   │       ├── tasks.py       # Task management
│   │       ├── indexing.py    # Repository indexing
│   │       └── search.py      # Semantic search
│   ├── services/              # Business logic layer
│   │   ├── tasks.py           # Task CRUD + git tracking
│   │   ├── indexer.py         # Indexing orchestration
│   │   ├── scanner.py         # File discovery
│   │   ├── chunker.py         # AST-based chunking
│   │   ├── embedder.py        # Ollama integration
│   │   └── searcher.py        # pgvector similarity search
│   └── models/                # Database models + Pydantic schemas
│       ├── task.py            # Task, TaskCreate, TaskUpdate
│       ├── code_chunk.py      # CodeChunk
│       └── ...
└── tests/
    ├── test_tool_handlers.py  # Integration tests
    └── test_embeddings.py     # Embedding validation
```

FastMCP Server Architecture:

  • server_fastmcp.py - Main entry point using @mcp.tool() decorators

  • Tool handlers in src/mcp/tools/ provide service integration

  • Services in src/services/ contain all business logic

  • Dual logging: file (/tmp/codebase-mcp.log) + MCP protocol

Installation

Prerequisites

Before installing Codebase MCP Server v2.0, ensure the following requirements are met:

Required Software:

  • PostgreSQL 14+ - Database with pgvector extension for vector similarity search

  • Python 3.11+ - Runtime environment (Python 3.13 compatible)

  • Ollama - Local embedding model server with nomic-embed-text model

System Requirements:

  • 4GB+ RAM recommended for typical workloads

  • SSD storage for optimal performance (database and embedding operations are I/O intensive)

  • Network access to Ollama server (default: localhost:11434)

Installation Commands

Install Codebase MCP Server v2.0 using pip:

```shell
# Install latest v2.0 release
pip install codebase-mcp
```

Alternative Installation Methods:

```shell
# Install specific v2.0 version
pip install codebase-mcp==2.0.0

# Install from source (for development)
git clone https://github.com/cliffclarke/codebase-mcp.git
cd codebase-mcp
pip install -e .
```

Key Dependencies Installed Automatically:

  • fastmcp>=0.1.0 - Modern MCP framework

  • sqlalchemy>=2.0 - Async database ORM

  • pgvector - PostgreSQL vector extension Python bindings

  • ollama - Embedding generation client

  • pydantic>=2.0 - Data validation and settings

Verification Steps

After installation, verify the setup is correct:

```shell
# Verify codebase-mcp is installed
codebase-mcp --version
# Expected output: codebase-mcp 2.0.0

# Check PostgreSQL is accessible
psql --version
# Expected output: psql (PostgreSQL) 14.x or higher

# Verify Ollama is running
curl http://localhost:11434/api/tags
# Expected output: JSON response with available models

# Confirm embedding model is available
ollama list | grep nomic-embed-text
# Expected output: nomic-embed-text model listed
```

Setup Complete: If all verification steps pass, Codebase MCP Server v2.0 is ready for use. Proceed to the Quick Start section for first-time indexing and search operations.

Multi-Project Configuration

The Codebase MCP server supports automatic project switching based on your working directory using .codebase-mcp/config.json files.

Quick Start

  1. Create a config file in your project root:

```shell
mkdir -p .codebase-mcp
cat > .codebase-mcp/config.json <<EOF
{
  "version": "1.0",
  "project": {
    "name": "my-project",
    "id": "optional-uuid-here"
  },
  "auto_switch": true
}
EOF
```

  2. Set your working directory (via MCP client):

```javascript
await mcpClient.callTool("set_working_directory", {
  directory: "/absolute/path/to/your/project"
});
```

  3. Use tools normally - they'll automatically use your project:

```javascript
// Automatically uses "my-project" workspace
const result = await mcpClient.callTool("start_indexing_background", {
  repo_path: "/path/to/repo"
});
const jobId = result.job_id;

// Poll for completion
while (true) {
  const status = await mcpClient.callTool("get_indexing_status", {
    job_id: jobId
  });
  if (status.status === "completed" || status.status === "failed") {
    break;
  }
  await sleep(2000);
}
```

Config File Format

```json
{
  "version": "1.0",
  "project": {
    "name": "my-project-name",
    "id": "optional-project-uuid",
    "database_name": "optional-database-override"
  },
  "auto_switch": true,
  "strict_mode": false,
  "dry_run": false,
  "description": "Optional project description"
}
```

Fields:

  • version (required): Config version (currently "1.0")

  • project.name (required): Project identifier (used if no ID provided)

  • project.id (optional): Explicit project UUID (takes priority over name)

  • project.database_name (optional): Override computed database name (see Database Name Resolution below)

  • auto_switch (optional, default true): Enable automatic project switching

  • strict_mode (optional, default false): Reject operations if project mismatch

  • dry_run (optional, default false): Log intended switches without executing

Database Name Resolution:

The server determines which database to use in this order:

  1. Explicit database_name - uses the exact database name specified:

```json
{"project": {"database_name": "cb_proj_my_project_550e8400"}}
```

  2. Computed from name and id - automatically generates the database name:

     Format: cb_proj_{sanitized_name}_{id_prefix}
     Example: cb_proj_my_project_550e8400

Use Cases for the database_name Override:

  • Recovering from database name mismatches

  • Migrating from old database naming schemes

  • Explicit control over database selection

  • Debugging and troubleshooting

Example - Auto-generated (default):

```json
{
  "version": "1.0",
  "project": {
    "name": "my-project",
    "id": "550e8400-e29b-41d4-a716-446655440000"
  }
}
```

Database used: cb_proj_my_project_550e8400 (auto-computed)

Example - Explicit override:

```json
{
  "version": "1.0",
  "project": {
    "name": "my-project",
    "id": "550e8400-e29b-41d4-a716-446655440000",
    "database_name": "cb_proj_legacy_database_12345678"
  }
}
```

Database used: cb_proj_legacy_database_12345678 (explicit override)
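The computed format can be sketched as a small Python helper. The exact sanitization rule (lowercase, runs of non-alphanumerics collapsed to underscores) and the 8-character ID prefix are assumptions inferred from the examples above, not the server's actual implementation:

```python
import re


def compute_database_name(name: str, project_id: str) -> str:
    """Sketch of cb_proj_{sanitized_name}_{id_prefix} database naming.

    Assumed sanitization: lowercase, non-alphanumeric runs become underscores.
    Assumed id_prefix: first 8 hex characters of the project UUID.
    """
    sanitized = re.sub(r"[^a-z0-9]+", "_", name.lower()).strip("_")
    id_prefix = project_id.replace("-", "")[:8]
    return f"cb_proj_{sanitized}_{id_prefix}"


# Reproduces the documented example:
print(compute_database_name("my-project", "550e8400-e29b-41d4-a716-446655440000"))
# cb_proj_my_project_550e8400
```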

Project Resolution Priority

When you call MCP tools, the server resolves the project workspace using this 4-tier priority system:

  1. Explicit project_id parameter (highest priority)

```javascript
await mcpClient.callTool("start_indexing_background", {
  repo_path: "/path/to/repo",
  project_id: "explicit-project-id"  // Always takes priority
});
```
  2. Session-based config file (via set_working_directory)

    • Server searches up to 20 directory levels for .codebase-mcp/config.json

    • Cached with mtime-based invalidation for performance

    • Isolated per MCP session (multiple clients stay independent)

  3. workflow-mcp integration (external project tracking)

    • Queries workflow-mcp server for active project context

    • Configurable timeout and caching

  4. Default workspace (fallback)

    • Uses project_default schema when no other resolution succeeds
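The four tiers can be summarized as a small resolution function. This is an illustrative sketch only; the names and signature are hypothetical and not the server's API:

```python
from typing import Callable, Optional


def resolve_project_id(
    explicit_project_id: Optional[str],
    session_config_project: Optional[str],
    workflow_mcp_lookup: Optional[Callable[[], Optional[str]]],
) -> str:
    """Illustrative 4-tier resolution: explicit > config file > workflow-mcp > default."""
    if explicit_project_id:            # Tier 1: explicit tool parameter
        return explicit_project_id
    if session_config_project:         # Tier 2: session's .codebase-mcp/config.json
        return session_config_project
    if workflow_mcp_lookup:            # Tier 3: query external workflow-mcp server
        try:
            active = workflow_mcp_lookup()
            if active:
                return active
        except Exception:
            pass                       # Unavailable: fall through to default
    return "default"                   # Tier 4: default workspace
```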

Multi-Session Isolation

The server maintains separate working directories for each MCP session (client connection):

```javascript
// Session 1 (Claude Code instance A)
await mcpClient1.callTool("set_working_directory", {
  directory: "/Users/alice/project-a"
});

// Session 2 (Claude Code instance B)
await mcpClient2.callTool("set_working_directory", {
  directory: "/Users/bob/project-b"
});

// Each session independently resolves its own project
// No cross-contamination between sessions
```

Config File Discovery

The server searches for .codebase-mcp/config.json by:

  1. Starting from your working directory

  2. Searching up to 20 parent directories

  3. Stopping at the first config file found

  4. Caching the result (with automatic invalidation on file modification)

Example directory structure:

```
/Users/alice/projects/my-app/      <- .codebase-mcp/config.json here
├── .codebase-mcp/
│   └── config.json
├── src/
│   └── components/                <- Working directory
│       └── Button.tsx
```

If you set working directory to /Users/alice/projects/my-app/src/components/, the server will find the config at /Users/alice/projects/my-app/.codebase-mcp/config.json.

Performance

  • Config discovery: <50ms (with upward traversal)

  • Cache hit: <5ms

  • Session lookup: <1ms

  • Background cleanup: Hourly (removes sessions inactive >24h)

Database Setup

1. Create Database

```shell
# Connect to PostgreSQL
psql -U postgres

# Create database
CREATE DATABASE codebase_mcp;

# Enable pgvector extension
\c codebase_mcp
CREATE EXTENSION IF NOT EXISTS vector;
\q
```

2. Initialize Schema

```shell
# Run database initialization script
python scripts/init_db.py

# Verify schema creation
alembic current
```

The initialization script will:

  • Create all required tables (repositories, files, chunks, tasks)

  • Set up vector indexes for similarity search

  • Configure connection pooling

  • Apply all database migrations

3. Verify Setup

```shell
# Check database connectivity
python -c "from src.database import Database; import asyncio; asyncio.run(Database.create_pool())"

# Run migration status check
alembic current
```

4. Database Reset & Cleanup

During development, you may need to reset your database. Three reset options are available:

  • scripts/clear_data.sh - Clear all data, keep schema (fastest, no restart needed)

  • scripts/reset_database.sh - Drop and recreate all tables (recommended for schema changes)

  • scripts/nuclear_reset.sh - Drop entire database (requires Claude Desktop restart)

```shell
# Quick data wipe (keeps schema)
./scripts/clear_data.sh

# Full table reset (recommended)
./scripts/reset_database.sh

# Nuclear option (drops database)
./scripts/nuclear_reset.sh
```

Running the Server

FastMCP Server (Recommended)

The primary way to run the server is via Claude Desktop or other MCP clients:

```shell
# Via Claude Desktop (configured in claude_desktop_config.json)
# Server starts automatically when Claude Desktop launches

# Manual testing with FastMCP CLI
uv run --with fastmcp python server_fastmcp.py

# With custom log level
LOG_LEVEL=DEBUG uv run --with fastmcp python server_fastmcp.py
```

Server Entry Point: server_fastmcp.py in repository root

Logging: All output goes to /tmp/codebase-mcp.log (configurable via LOG_FILE env var)

Development Mode (Legacy FastAPI)

```shell
# Start with auto-reload (if FastAPI server exists)
uvicorn src.main:app --reload --host 127.0.0.1 --port 3000

# With custom log level
LOG_LEVEL=DEBUG uvicorn src.main:app --reload
```

Production Mode (Legacy)

```shell
# Start production server
uvicorn src.main:app --host 0.0.0.0 --port 3000 --workers 4

# With gunicorn (recommended for production)
gunicorn src.main:app -w 4 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:3000
```

stdio Transport (Legacy CLI Mode)

The legacy MCP server supports stdio transport for CLI clients via JSON-RPC 2.0 over stdin/stdout.

```shell
# Start stdio server (reads JSON-RPC from stdin)
python -m src.mcp.stdio_server

# Echo a single request
echo '{"jsonrpc":"2.0","id":1,"method":"list_tasks","params":{"limit":5}}' | python -m src.mcp.stdio_server

# Pipe requests from a file (one JSON-RPC request per line)
cat requests.jsonl | python -m src.mcp.stdio_server

# Interactive mode (type JSON-RPC requests manually)
python -m src.mcp.stdio_server
{"jsonrpc":"2.0","id":1,"method":"get_task","params":{"task_id":"..."}}
```

JSON-RPC 2.0 Request Format:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "search_code",
  "params": {
    "query": "async def",
    "limit": 10
  }
}
```

JSON-RPC 2.0 Response Format:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "results": [...],
    "total_count": 42,
    "latency_ms": 250
  }
}
```

Available Methods:

  • search_code - Semantic code search

  • start_indexing_background - Start background indexing job

  • get_indexing_status - Poll indexing job status

Logging: All logs go to /tmp/codebase-mcp.log (configurable via LOG_FILE env var). No stdout/stderr pollution - only JSON-RPC protocol messages on stdout.
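Client-side framing for this transport is a one-line JSON-RPC request on stdin and a one-line response on stdout. A minimal sketch of the helpers a client might use (these helpers are hypothetical, not part of this project):

```python
import json


def make_request(request_id: int, method: str, params: dict) -> str:
    """Serialize a JSON-RPC 2.0 request as a single line for stdin."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "method": method, "params": params}
    )


def parse_response(line: str) -> dict:
    """Parse one JSON-RPC 2.0 response line from stdout, raising on errors."""
    msg = json.loads(line)
    if msg.get("jsonrpc") != "2.0":
        raise ValueError("not a JSON-RPC 2.0 message")
    if "error" in msg:
        raise RuntimeError(f"server error: {msg['error']}")
    return msg["result"]


request_line = make_request(1, "search_code", {"query": "async def", "limit": 10})
```

In practice these lines would be written to and read from a `python -m src.mcp.stdio_server` subprocess, one message per line.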

Health Check

```shell
# Check server health
curl http://localhost:3000/health

# Expected response:
# {
#   "status": "healthy",
#   "database": "connected",
#   "ollama": "connected",
#   "version": "0.1.0"
# }
```

Usage Examples

1. Index a Repository (Background Job)

```
# Start indexing job via MCP protocol
{
  "tool": "start_indexing_background",
  "arguments": {
    "repo_path": "/path/to/your/repo"
  }
}

# Immediate response
{
  "job_id": "uuid-here",
  "status": "pending",
  "message": "Indexing job started",
  "project_id": "default",
  "database_name": "cb_proj_default_00000000"
}

# Poll for status
{
  "tool": "get_indexing_status",
  "arguments": {
    "job_id": "uuid-here"
  }
}

# Completed response
{
  "job_id": "uuid-here",
  "status": "completed",
  "repo_path": "/path/to/your/repo",
  "files_indexed": 150,
  "chunks_created": 1200,
  "error_message": null,
  "created_at": "2025-10-18T10:30:00Z",
  "started_at": "2025-10-18T10:30:01Z",
  "completed_at": "2025-10-18T10:30:45Z"
}
```

2. Search Code

```
# Search for authentication logic
{
  "tool": "search_code",
  "arguments": {
    "query": "user authentication password validation",
    "limit": 10,
    "file_type": "py"
  }
}

# Response includes ranked code chunks with context
{
  "results": [...],
  "total_count": 25,
  "latency_ms": 230
}
```

Architecture

```
┌─────────────────────────────────────────────────┐
│                 MCP Client (AI)                 │
└─────────────────┬───────────────────────────────┘
                  │ SSE Protocol
┌─────────────────▼───────────────────────────────┐
│                MCP Server Layer                 │
│   Tool Registration & Routing                   │
│   Request/Response Handling                     │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│                 Service Layer                   │
│   Indexer │ Searcher │ Task Manager             │
│           Repository Service                    │
│           Embedding Service (Ollama)            │
└─────────────────┬───────────────────────────────┘
                  │
┌─────────────────▼───────────────────────────────┐
│                  Data Layer                     │
│   PostgreSQL with pgvector                      │
│   Repositories │ Files │ Chunks                 │
│   Tasks │ Vector Embeddings                     │
└─────────────────────────────────────────────────┘
```

Component Overview

  • MCP Layer: Handles protocol compliance, tool registration, SSE transport

  • Service Layer: Business logic for indexing, searching, task management

  • Repository Service: File system operations, git integration, .gitignore handling

  • Embedding Service: Ollama integration for generating text embeddings

  • Data Layer: PostgreSQL with pgvector for storage and similarity search

Data Flow

  1. Indexing: Repository → Parse → Chunk → Embed → Store

  2. Searching: Query → Embed → Vector Search → Rank → Return

  3. Task Tracking: Create → Update → Git Integration → Query
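The search flow (embed the query, compare against stored vectors, rank by similarity) can be illustrated with a toy in-memory ranker. In the real system pgvector performs the cosine comparison inside PostgreSQL, so this is purely illustrative:

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_chunks(query_vec: list[float], chunks: list[tuple[str, list[float]]],
                limit: int = 10) -> list[tuple[float, str]]:
    """Rank (text, embedding) chunk pairs by similarity to the query vector."""
    scored = [(cosine_similarity(query_vec, emb), text) for text, emb in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```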

Testing

Run All Tests

```shell
# Run all tests with coverage
pytest tests/ -v --cov=src --cov-report=term-missing

# Run specific test categories
pytest tests/unit/ -v         # Unit tests only
pytest tests/integration/ -v  # Integration tests
pytest tests/contract/ -v     # Contract tests
```

Test Categories

  • Unit Tests: Fast, isolated component tests

  • Integration Tests: Database and service integration

  • Contract Tests: MCP protocol compliance validation

  • Performance Tests: Latency and throughput benchmarks

Coverage Requirements

  • Minimum coverage: 95%

  • Critical paths: 100%

  • View HTML report: open htmlcov/index.html

Performance Tuning

Database Optimization

```sql
-- Optimize vector searches
CREATE INDEX ON chunks USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);

-- Adjust work_mem for large result sets
ALTER SYSTEM SET work_mem = '256MB';
SELECT pg_reload_conf();
```

Connection Pool Settings

```shell
# In .env
DATABASE_POOL_SIZE=20      # Connection pool size
DATABASE_MAX_OVERFLOW=10   # Max overflow connections
DATABASE_POOL_TIMEOUT=30   # Connection timeout in seconds
```

Embedding Batch Size

```shell
# Adjust based on available memory
EMBEDDING_BATCH_SIZE=100   # For systems with 8GB+ RAM
EMBEDDING_BATCH_SIZE=50    # Default for 4GB RAM
EMBEDDING_BATCH_SIZE=25    # For constrained environments
```
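The batch size bounds how many chunks are sent to Ollama per embedding request. A batching helper along these lines (an illustrative sketch, not the server's embedder code) shows the effect of the setting:

```python
from typing import Iterator, TypeVar

T = TypeVar("T")


def batched(items: list[T], batch_size: int) -> Iterator[list[T]]:
    """Yield successive batches of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# e.g. 230 chunks with EMBEDDING_BATCH_SIZE=100 -> requests of 100, 100, 30
batch_sizes = [len(b) for b in batched(list(range(230)), 100)]
```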

Troubleshooting

Common Issues

  1. Database Connection Failed

    • Check PostgreSQL is running: pg_ctl status

    • Verify DATABASE_URL in .env

    • Ensure database exists: psql -U postgres -l

  2. Ollama Connection Error

    • Check Ollama is running: curl http://localhost:11434/api/tags

    • Verify model is installed: ollama list

    • Check OLLAMA_BASE_URL in .env

  3. Slow Performance

    • Check database indexes: \di in psql

    • Monitor query performance: See logs at LOG_FILE path

    • Adjust batch sizes and connection pool

For detailed troubleshooting, see the Configuration Guide troubleshooting section.

Contributing

We follow a specification-driven development workflow using the Specify framework.

Development Workflow

  1. Feature Specification: Use /specify command to create feature specs

  2. Planning: Generate implementation plan with /plan

  3. Task Breakdown: Create tasks with /tasks

  4. Implementation: Execute tasks with /implement

Git Workflow

```shell
# Create feature branch
git checkout -b 001-feature-name

# Make atomic commits
git add .
git commit -m "feat(component): add specific feature"

# Push and create PR
git push origin 001-feature-name
```

Code Quality Standards

  • Type Safety: mypy --strict must pass

  • Linting: ruff check with no errors

  • Testing: All tests must pass with 95%+ coverage

  • Documentation: Update relevant docs with changes

Constitutional Principles

  1. Simplicity Over Features: Focus on core semantic search

  2. Local-First Architecture: No cloud dependencies

  3. Protocol Compliance: Strict MCP adherence

  4. Performance Guarantees: Meet stated benchmarks

  5. Production Quality: Comprehensive error handling

See .specify/memory/constitution.md for full principles.

FastMCP Migration (Oct 2025)

Migration Complete: The server has been successfully migrated from the legacy MCP SDK to the modern FastMCP framework.

What Changed

Before (MCP SDK):

```python
# Old: Manual tool registration with JSON schemas
class MCPServer:
    def __init__(self):
        self.tools = {
            "search_code": {
                "name": "search_code",
                "description": "...",
                "inputSchema": {...}
            }
        }
```

After (FastMCP):

```python
# New: Decorator-based tool definitions
@mcp.tool()
async def search_code(query: str, limit: int = 10) -> dict[str, Any]:
    """Semantic code search with natural language queries."""
    # Implementation
```

Key Benefits

  1. Simpler Tool Definitions: Decorators replace manual JSON schema creation

  2. Type Safety: Automatic schema generation from Pydantic models

  3. Dual Logging: File logging + MCP protocol without stdout pollution

  4. Better Error Handling: Structured error responses with context

  5. Cleaner Architecture: Separation of tool interface from business logic

Server Files

  • New Entry Point: server_fastmcp.py (root directory)

  • Legacy Server: src/mcp/mcp_stdio_server_v3.py (deprecated, will be removed)

  • Tool Handlers: src/mcp/tools/*.py (unchanged, reused by FastMCP)

  • Services: src/services/*.py (unchanged, business logic intact)

Configuration Update Required

Update your Claude Desktop config to use the new server:

```json
{
  "mcpServers": {
    "codebase-mcp": {
      "command": "uv",
      "args": ["run", "--with", "fastmcp", "python", "/path/to/server_fastmcp.py"]
    }
  }
}
```

Migration Notes

  • All 6 MCP tools remain functional (100% backward compatible)

  • No database schema changes required

  • Tool signatures and responses unchanged

  • Logging now goes exclusively to /tmp/codebase-mcp.log

  • All tests pass with FastMCP implementation

Performance

FastMCP maintains performance targets:

  • Repository indexing: <60 seconds for 10K files

  • Code search: <500ms p95 latency

  • Async/await throughout for optimal concurrency

License

MIT License (LICENSE file pending).

Support

Quick Start

Basic Usage (Default Project)

For most users, the default project workspace is sufficient. All indexing now uses background jobs to prevent MCP client timeouts:

```python
# Start background indexing job (returns immediately)
job = await start_indexing_background(repo_path="/path/to/your/repo")
job_id = job["job_id"]

# Poll for completion
while True:
    status = await get_indexing_status(job_id=job_id)
    if status["status"] in ["completed", "failed"]:
        break
    await asyncio.sleep(2)

# Check result
if status["status"] == "completed":
    print(f"✅ Indexed {status['files_indexed']} files, {status['chunks_created']} chunks")
else:
    print(f"❌ Indexing failed: {status['error_message']}")

# Search code
results = await search_code(query="function to handle authentication")

# Search with filters
results = await search_code(
    query="database query",
    file_type="py",
    limit=20
)
```

The server automatically uses a default project workspace (project_default) if no project ID is specified.

Multi-Project Usage

For users managing multiple codebases or client projects, use the project_id parameter to isolate repositories:

```python
# Index repositories with project_id
job_a = await start_indexing_background(
    repo_path="/path/to/client-a-repo",
    project_id="client-a"
)
job_b = await start_indexing_background(
    repo_path="/path/to/client-b-repo",
    project_id="client-b"
)

# Poll both jobs
for job in [job_a, job_b]:
    while True:
        status = await get_indexing_status(job_id=job["job_id"])
        if status["status"] in ["completed", "failed"]:
            break
        await asyncio.sleep(2)

# Search within specific project
results_a = await search_code(
    query="authentication logic",
    project_id="client-a"
)
results_b = await search_code(
    query="payment processing",
    project_id="client-b"
)
```

Each project has its own isolated database schema, ensuring repositories and embeddings are completely separated.

workflow-mcp Integration (Optional)

The Codebase MCP Server can optionally integrate with workflow-mcp for automatic project context resolution. This is an advanced feature and not required for basic usage.

Standalone Usage (Default)

By default, Codebase MCP operates independently:

```python
# Works out of the box without workflow-mcp
job = await start_indexing_background(repo_path="/path/to/repo")
results = await search_code(query="search query")
```

Integration with workflow-mcp

If you're using workflow-mcp to manage development projects, Codebase MCP can automatically resolve project context:

```shell
# Set workflow-mcp URL in environment
export WORKFLOW_MCP_URL=http://localhost:8001
```

```python
# Now project_id is automatically resolved from workflow-mcp's active project
job = await start_indexing_background(repo_path="/path/to/repo")  # Uses active project
results = await search_code(query="search query")  # Searches in active project's context
```

How It Works:

  1. Codebase MCP queries workflow-mcp for the active project

  2. If an active project exists, it's used as the project_id

  3. If no active project or workflow-mcp is unavailable, falls back to default project

  4. You can still override by passing an explicit project_id

Configuration:

```shell
# In .env file
WORKFLOW_MCP_URL=http://localhost:8001  # Optional, enables integration
```

See Also: workflow-mcp repository for details on project workspace management.

Documentation

Comprehensive documentation is available for different use cases:

For quick setup, refer to the Installation section above.

Contributing

We welcome contributions to the Codebase MCP Server. This project follows a specification-driven development workflow.

Getting Started

  1. Read the Architecture: Start with docs/architecture/multi-project-design.md to understand the system design

  2. Review the Constitution: See .specify/memory/constitution.md for project principles

  3. Follow the Workflow: Use the Specify workflow documented in CLAUDE.md

Development Process

  1. Create a feature specification using /specify command

  2. Plan the implementation with /plan

  3. Generate tasks using /tasks

  4. Implement incrementally with atomic commits

Code Standards

  • Type Safety: Full mypy --strict compliance

  • Testing: 95%+ test coverage, contract tests for MCP protocol

  • Performance: Meet benchmarks (60s indexing, 500ms search p95)

  • Documentation: Update docs with all changes

Code of Conduct

This project adheres to a code of conduct that promotes a welcoming, inclusive environment. We expect:

  • Respectful communication in issues and PRs

  • Constructive feedback focused on code and ideas

  • Recognition that contributors volunteer their time

  • Patience with maintainers and fellow contributors

By participating, you agree to uphold these standards.

Acknowledgments

  • MCP framework powered by FastMCP

  • Built with FastAPI, SQLAlchemy, and Pydantic

  • Vector search powered by pgvector

  • Embeddings via Ollama and nomic-embed-text

  • Code parsing with tree-sitter

  • MCP protocol by Anthropic
