# Codebase MCP Performance Benchmarks
This directory contains performance benchmarks that validate the constitutional performance targets of the codebase-mcp server.
## Overview
Performance benchmarks use `pytest-benchmark` to measure and validate system performance against constitutional guarantees defined in `.specify/memory/constitution.md`.
## Benchmark Categories
### 1. Indexing Performance (`test_indexing_perf.py`)
- **Target**: Index 10,000 files in <60s (p95) - Constitutional Principle IV
- **Iterations**: 5 benchmark runs with 1 warmup round
- **Validates**: FR-001 from specs/011-performance-validation-multi/spec.md
### 2. Search Performance (`test_search_perf.py`)
- **Target**: Search latency <500ms (p95) - Constitutional Principle IV
- **Validates**: FR-002 from specs/011-performance-validation-multi/spec.md
### 3. Workflow MCP Performance (`test_workflow_perf.py`)
- **Target**: Project switching <50ms (p95) - Constitutional Principle IV
- **Validates**: FR-003, FR-004 from specs/011-performance-validation-multi/spec.md
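All three suites share the same `pytest-benchmark` shape. The snippet below is a minimal sketch of that pattern for the indexing target, not the actual test: `index_repository` is a placeholder for the real async indexing entry point, and the authoritative implementation lives in `test_indexing_perf.py`.
```python
import asyncio

import pytest


async def index_repository(repo_path) -> int:
    """Placeholder for the real async indexer; returns the number of files indexed."""
    await asyncio.sleep(0)
    return 10_000


@pytest.mark.benchmark(group="indexing")
def test_index_10k_files_p95(benchmark, benchmark_repository) -> None:
    # pytest-benchmark drives sync callables, so each round runs the coroutine to completion.
    def run() -> int:
        return asyncio.run(index_repository(benchmark_repository))

    # 5 measured rounds with 1 warmup round, matching the configuration listed above.
    benchmark.pedantic(run, rounds=5, warmup_rounds=1)
```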
## Running Benchmarks
### Run All Benchmarks
```bash
# Run all benchmarks with statistics
pytest tests/benchmarks/ --benchmark-only -v

# Save results to JSON baseline
pytest tests/benchmarks/ --benchmark-only \
  --benchmark-json=performance_baselines/benchmark_results.json
```
### Run Specific Benchmark Suite
```bash
# Indexing benchmarks only
pytest tests/benchmarks/test_indexing_perf.py --benchmark-only -v
# Search benchmarks only
pytest tests/benchmarks/test_search_perf.py --benchmark-only -v
# Workflow benchmarks only
pytest tests/benchmarks/test_workflow_perf.py --benchmark-only -v
```
### Compare Against Baseline
```bash
# Compare current run against saved baseline
pytest tests/benchmarks/test_indexing_perf.py --benchmark-only \
  --benchmark-compare=performance_baselines/indexing_baseline.json \
  --benchmark-compare-fail=mean:10%
```
### Generate Performance Report
```bash
# Run with histogram visualization
pytest tests/benchmarks/ --benchmark-only \
  --benchmark-histogram=performance_reports/histogram

# Run with table output
pytest tests/benchmarks/ --benchmark-only \
  --benchmark-columns=min,max,mean,stddev,median,ops,rounds
```
## Benchmark Architecture
### Test Fixtures (`conftest.py`)
- `test_engine`: Function-scoped async database engine
- `session`: Function-scoped async database session with automatic rollback
- `benchmark_repository`: Generated 10,000-file test repository
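A condensed sketch of what these fixtures typically look like, assuming SQLAlchemy's async engine and `pytest-asyncio` (see `conftest.py` for the authoritative versions):
```python
import os

import pytest_asyncio
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine


@pytest_asyncio.fixture  # function-scoped by default, which avoids event-loop conflicts
async def test_engine():
    engine = create_async_engine(os.environ["TEST_DATABASE_URL"])
    yield engine
    await engine.dispose()


@pytest_asyncio.fixture
async def session(test_engine):
    async with AsyncSession(test_engine) as db_session:
        yield db_session
        await db_session.rollback()  # automatic rollback keeps each benchmark isolated
```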
### Test Repository Generation
Benchmarks use synthetic repositories generated via `tests/fixtures/test_repository.py`:
- **File count**: 10,000 files (60% Python, 40% JavaScript)
- **File sizes**: 100 bytes to 50KB (realistic distribution)
- **Directory depth**: Up to 5 levels (matches production codebases)
- **Code complexity**: Functions, classes, imports (tree-sitter validated syntax)
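The generator is roughly equivalent to the following sketch (heavily simplified; the real fixture in `tests/fixtures/test_repository.py` produces richer, tree-sitter-validated code and the exact size distribution described above):
```python
import random
from pathlib import Path

PY_TEMPLATE = "def func_{i}(x: int) -> int:\n    return x + {i}\n"
JS_TEMPLATE = "export function func{i}(x) {{ return x + {i}; }}\n"


def generate_repository(root: Path, file_count: int = 10_000) -> None:
    for i in range(file_count):
        depth = random.randint(0, 5)  # up to 5 directory levels
        directory = root.joinpath(*(f"pkg_{d}" for d in range(depth)))
        directory.mkdir(parents=True, exist_ok=True)
        repeats = random.randint(2, 1000)  # roughly 100 bytes to 50KB per file
        if i < int(file_count * 0.6):  # 60% Python, 40% JavaScript
            (directory / f"module_{i}.py").write_text(PY_TEMPLATE.format(i=i) * repeats)
        else:
            (directory / f"module_{i}.js").write_text(JS_TEMPLATE.format(i=i) * repeats)
```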
### Performance Validation Model
All benchmarks create `PerformanceBenchmarkResult` instances (from `src/models/performance.py`) to:
- Store benchmark results with Decimal precision
- Validate percentile ordering (p50 ≤ p95 ≤ p99)
- Determine pass/fail status against constitutional targets
- Enable regression detection across benchmark runs
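The percentile-ordering check is the piece most worth illustrating. A trimmed-down sketch, assuming Pydantic v2 (the real class in `src/models/performance.py` carries the full field set shown under "Benchmark Results Format" below):
```python
from decimal import Decimal

from pydantic import BaseModel, model_validator


class PerformanceBenchmarkResult(BaseModel):
    latency_p50_ms: Decimal
    latency_p95_ms: Decimal
    latency_p99_ms: Decimal
    target_threshold_ms: Decimal

    @model_validator(mode="after")
    def check_percentile_ordering(self) -> "PerformanceBenchmarkResult":
        # Percentiles must be monotonically non-decreasing: p50 <= p95 <= p99.
        if not (self.latency_p50_ms <= self.latency_p95_ms <= self.latency_p99_ms):
            raise ValueError("percentile ordering violated: expected p50 <= p95 <= p99")
        return self
```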
## Constitutional Compliance
Benchmarks validate these constitutional principles:
- **Principle IV**: Performance Guarantees
- Indexing: <60s (p95) for 10,000 files
- Search: <500ms (p95)
- Project switching: <50ms (p95)
- Entity queries: <100ms (p95)
- **Principle VII**: TDD (Test-Driven Development)
- Benchmarks serve as performance regression tests
- Fail fast when performance degrades
- **Principle VIII**: Type Safety
- Full mypy --strict compliance
- Pydantic models for benchmark results
## Benchmark Results Format
Each benchmark generates a `PerformanceBenchmarkResult` with:
```python
{
    "benchmark_id": "uuid",
    "server_id": "codebase-mcp" | "workflow-mcp",
    "operation_type": "index" | "search" | "project_switch" | "entity_query",
    "timestamp": "2025-10-13T10:30:00Z",
    "latency_p50_ms": Decimal("245.32"),
    "latency_p95_ms": Decimal("478.91"),
    "latency_p99_ms": Decimal("512.45"),
    "latency_mean_ms": Decimal("268.77"),
    "latency_min_ms": Decimal("198.12"),
    "latency_max_ms": Decimal("543.88"),
    "sample_size": 5,
    "test_parameters": {
        "file_count": 10000,
        "iterations": 5
    },
    "pass_status": "pass" | "warning" | "fail",
    "target_threshold_ms": Decimal("60000.0")
}
```
## Interpreting Results
### Pass/Fail Status
- **pass**: p95 latency is below the constitutional target
- **warning**: p95 latency is at or above the target but within target × 1.1 (10% grace band)
- **fail**: p95 latency exceeds target × 1.1 (outside the acceptable range)
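In code, the classification reduces to a simple threshold check (a sketch; the helper name and exact boundary handling in the real model may differ):
```python
from decimal import Decimal


def classify(p95_ms: Decimal, target_ms: Decimal) -> str:
    if p95_ms < target_ms:
        return "pass"
    if p95_ms <= target_ms * Decimal("1.1"):
        return "warning"
    return "fail"


assert classify(Decimal("478.91"), Decimal("500")) == "pass"
assert classify(Decimal("530"), Decimal("500")) == "warning"  # within the 10% grace band
assert classify(Decimal("600"), Decimal("500")) == "fail"
```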
### Variance Validation
Indexing and search benchmarks also validate variance (coefficient of variation):
- **Target**: CV < 5% (consistent performance)
- **Formula**: CV = (stddev / mean) × 100%
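A worked example of the variance check, using illustrative per-round latencies:
```python
# CV = (stddev / mean) * 100%
from statistics import mean, stdev

samples_ms = [245.3, 251.8, 248.9, 247.1, 250.4]  # illustrative per-round latencies
cv_percent = stdev(samples_ms) / mean(samples_ms) * 100  # ~1.0% for these samples
assert cv_percent < 5.0, f"variance too high: CV={cv_percent:.2f}%"
```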
## CI/CD Integration
Benchmarks are designed for CI/CD pipeline integration:
```yaml
# .github/workflows/performance.yml
- name: Run Performance Benchmarks
  run: |
    pytest tests/benchmarks/ --benchmark-only \
      --benchmark-json=results.json \
      --benchmark-compare=baselines/baseline.json \
      --benchmark-compare-fail=mean:10%
```
## Troubleshooting
### Slow Benchmark Execution
- Each indexing benchmark run takes ~30-60s by design, since it indexes a 10,000-file repository
- Use `--benchmark-only` to skip the non-benchmark tests
- Pass `--benchmark-skip` when running the regular unit-test suite to exclude benchmarks
### Database Connection Issues
- Ensure PostgreSQL is running: `pg_isready`
- Check that the `TEST_DATABASE_URL` environment variable is set correctly
- Create the test database if it does not exist: `createdb codebase_mcp_test`
### Event Loop Conflicts
- All async fixtures use function scope (see conftest.py)
- Each benchmark gets fresh engine + session
- No session- or module-scoped async fixtures (prevents event-loop conflicts)
## References
- **Specification**: `specs/011-performance-validation-multi/spec.md`
- **Quickstart**: `specs/011-performance-validation-multi/quickstart.md`
- **Data Model**: `specs/011-performance-validation-multi/data-model.md`
- **Constitution**: `.specify/memory/constitution.md` (Principle IV)