# MCP Server Benchmark Suite
This benchmark suite provides comprehensive performance testing and validation for the MCP Server, implementing the IBenchmarkRunner and IPerformanceMonitor interfaces to ensure all components meet the performance requirements specified in ROADMAP.md.
## Performance Requirements Validation ✅
| Metric | Target | Implementation Status |
|--------|--------|----------------------|
| Symbol Lookup | < 100ms (p95) | ✅ Automated validation |
| Semantic Search | < 500ms (p95) | ✅ Automated validation |
| Indexing Speed | > 10K files/minute | ✅ Throughput measurement |
| Memory Usage | < 2GB for 100K files | ✅ Memory extrapolation |
## Interface Compliance
The benchmark framework implements multiple interface contracts:
### IBenchmarkRunner (BenchmarkRunner)
- `run_indexing_benchmark()` - Validates indexing throughput
- `run_search_benchmark()` - Tests search performance across query types
- `run_memory_benchmark()` - Measures memory usage at scale
- `generate_benchmark_report()` - Creates comprehensive performance reports
### IPerformanceMonitor & IIndexPerformanceMonitor (BenchmarkSuite)
- Real-time timer management with `start_timer()` / `stop_timer()`
- Performance metrics collection with `record_indexing_time()` / `record_search_time()`
- Statistical analysis with `get_performance_metrics()` and `get_slow_queries()`
- SLO validation with `validate_performance_requirements()`
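A minimal sketch of how the monitor interface might be driven from code; the method names come from the list above, but the exact signatures and return values are assumptions.
```python
from mcp_server.benchmarks import BenchmarkSuite

suite = BenchmarkSuite()

# Time a single operation with the timer API.
timer_id = suite.start_timer("symbol_lookup")
# ... perform the symbol lookup being measured ...
elapsed_ms = suite.stop_timer(timer_id)          # assumed to return the elapsed time
suite.record_search_time("symbol_lookup", elapsed_ms)

# Inspect aggregated statistics and validate against the SLOs.
metrics = suite.get_performance_metrics()
slow_queries = suite.get_slow_queries()
slo_ok = suite.validate_performance_requirements()
```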
## Implementation Components
### Index Engine Performance Features
- **Incremental Updates**: File hash-based change detection using MD5
- **Batch Processing**: Configurable concurrency with semaphore-based rate limiting
- **Progress Tracking**: Throughput calculation and task monitoring
- **Cache Management**: File metadata and hash caching for performance
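The incremental-update and batching behaviour described above can be pictured roughly as follows: MD5 hashes gate which files are re-indexed, and an `asyncio.Semaphore` caps concurrency. The helpers here (`file_hash`, `index_file`, `index_batch`, the cache dict) are illustrative placeholders, not the real IndexEngine API.
```python
import asyncio
import hashlib
from pathlib import Path

def file_hash(path: Path) -> str:
    """MD5 of file contents, used to skip files that have not changed."""
    return hashlib.md5(path.read_bytes()).hexdigest()

async def index_file(path: Path) -> None:
    """Placeholder for the real indexing work (parsing, symbol extraction)."""
    await asyncio.sleep(0)

async def index_batch(paths: list[Path], cache: dict[str, str], limit: int = 8) -> None:
    sem = asyncio.Semaphore(limit)  # semaphore-based rate limiting

    async def index_one(path: Path) -> None:
        digest = file_hash(path)
        if cache.get(str(path)) == digest:
            return  # unchanged since the last run; incremental update skips it
        async with sem:
            await index_file(path)
        cache[str(path)] = digest

    await asyncio.gather(*(index_one(p) for p in paths))
```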
### Query Optimizer Performance Features
- **Multi-factor Cost Model**: CPU, I/O, and memory cost estimation
- **Query Selectivity**: Statistics-based filter optimization
- **Index Selection**: Cost-based index usage decisions
- **Result Caching**: Automatic cache key generation for expensive queries
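The cost model can be sketched roughly as below; the field names, weights, and `choose_plan` helper are hypothetical and only illustrate how CPU, I/O, and memory estimates feed a cost-based index decision.
```python
from dataclasses import dataclass

@dataclass
class CostEstimate:
    cpu: float      # estimated CPU work (e.g. rows scanned * per-row cost)
    io: float       # estimated pages read from disk
    memory: float   # estimated working-set size

    @property
    def total(self) -> float:
        # Weighted sum; the weights here are illustrative placeholders.
        return 1.0 * self.cpu + 4.0 * self.io + 0.5 * self.memory

def choose_plan(with_index: CostEstimate, full_scan: CostEstimate) -> str:
    """Cost-based index selection: pick whichever plan is estimated to be cheaper."""
    return "use_index" if with_index.total < full_scan.total else "full_scan"
```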
### Supported Query Types for Benchmarking
- **Symbol Search**: Lookup of functions, classes, and variables
- **Text Search**: Full-text search with SQLite FTS5
- **Fuzzy Search**: Approximate matching with trigrams
- **Semantic Search**: Vector-based similarity (when available)
- **Reference Search**: Finding the locations where a symbol is used
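Each query type needs representative inputs when benchmarking. The mapping below is purely illustrative; the actual suite generates its own query sets.
```python
# Example benchmark queries per supported query type (illustrative only).
BENCHMARK_QUERIES = {
    "symbol": ["MyClass", "parse_config", "MAX_RETRIES"],       # symbol search
    "text": ["TODO refactor", "raise ValueError"],               # FTS5 full-text search
    "fuzzy": ["pars_confg", "MyClas"],                           # trigram-based matching
    "semantic": ["function that reads a config file"],           # vector similarity
    "reference": ["parse_config"],                               # symbol usage lookup
}
```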
## Running Benchmarks
### Using pytest-benchmark
Run individual benchmarks with pytest:
```bash
# Run all benchmarks
pytest tests/test_benchmarks.py -v --benchmark-only

# Run specific benchmark group
pytest tests/test_benchmarks.py -k "symbol_lookup" --benchmark-only

# Generate detailed benchmark report
pytest tests/test_benchmarks.py --benchmark-only --benchmark-autosave
```
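An individual benchmark test typically uses the `benchmark` fixture provided by pytest-benchmark. The test below is a sketch: the benchmarked `lookup_symbol` callable is a placeholder, not one of the suite's real fixtures.
```python
import pytest

@pytest.mark.benchmark(group="symbol_lookup")
def test_symbol_lookup_performance(benchmark):
    def lookup_symbol():
        # Stand-in for a real symbol lookup through the dispatcher.
        return {"symbol": "parse_config", "file": "config.py", "line": 42}

    # pytest-benchmark calls the function repeatedly and records timing statistics.
    result = benchmark(lookup_symbol)
    assert result is not None
```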
### Using the Standalone Runner
Run the complete benchmark suite:
```bash
# Quick benchmarks
python -m mcp_server.benchmarks.run_benchmarks --quick

# Full benchmark suite
python -m mcp_server.benchmarks.run_benchmarks --full

# Compare with previous results
python -m mcp_server.benchmarks.run_benchmarks --compare

# Specific plugins only
python -m mcp_server.benchmarks.run_benchmarks --plugins python,javascript
```
### Programmatic Usage
```python
from mcp_server.benchmarks import BenchmarkSuite, BenchmarkRunner
from mcp_server.plugins.python_plugin import PythonPlugin
from mcp_server.plugins.js_plugin import JavaScriptPlugin

# Create plugins
plugins = [PythonPlugin(), JavaScriptPlugin()]

# Run benchmarks
runner = BenchmarkRunner()
result = runner.run_benchmarks(plugins)

# Check if SLOs are met
if all(result.validations.values()):
    print("All performance requirements met!")
else:
    print("Some requirements failed:", result.validations)
```
## Benchmark Components
### BenchmarkSuite
The main benchmark suite that runs performance tests:
- `benchmark_symbol_lookup()`: Tests symbol definition lookup performance
- `benchmark_search()`: Tests fuzzy and semantic search performance
- `benchmark_indexing()`: Measures file indexing throughput
- `benchmark_memory_usage()`: Tracks memory consumption for different codebase sizes
- `benchmark_cache_performance()`: Evaluates cache hit/miss performance
### BenchmarkRunner
Orchestrates benchmark execution and reporting:
- Runs complete benchmark suites
- Saves results with history tracking
- Detects performance regressions
- Generates HTML and text reports
- Exports CI-friendly metrics
### PerformanceMetrics
Tracks performance measurements:
- Timing samples (min, max, mean, median, p95, p99)
- Memory usage
- CPU utilization
- SLO validation
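The percentile summary can be reproduced with the standard library alone; this is a generic sketch of the statistics involved, not the `PerformanceMetrics` implementation itself.
```python
import statistics

def summarize(samples_ms: list[float]) -> dict[str, float]:
    """Summarize raw timing samples into the metrics listed above."""
    ordered = sorted(samples_ms)
    pct = statistics.quantiles(ordered, n=100)  # 99 cut points: pct[94] ~ p95, pct[98] ~ p99
    return {
        "min": ordered[0],
        "max": ordered[-1],
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "p95": pct[94],
        "p99": pct[98],
    }

print(summarize([12.1, 15.3, 11.8, 98.0, 14.2, 13.7, 16.9, 12.5]))
```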
## Output Reports
The benchmark suite generates several reports:
1. **HTML Report** (`benchmark_report.html`):
   - Visual performance metrics
   - SLO validation status
   - Regression analysis
   - Historical trends
2. **Text Summary** (`benchmark_summary.txt`):
   - Console-friendly summary
   - Key metrics and validation results
3. **JSON Results** (`benchmark_YYYYMMDD_HHMMSS.json`):
   - Complete benchmark data
   - Machine-readable format
   - Historical tracking
4. **CI Metrics** (`ci_metrics.json`):
   - Simplified metrics for CI/CD
   - Pass/fail status
   - Key performance indicators
## Integration with CI/CD
### GitHub Actions Example
```yaml
- name: Run Performance Benchmarks
  run: |
    python -m mcp_server.benchmarks.run_benchmarks --compare

- name: Upload Benchmark Results
  uses: actions/upload-artifact@v3
  with:
    name: benchmark-results
    path: benchmark_results/

- name: Check Performance SLOs
  run: |
    python -c "
    import json
    with open('benchmark_results/ci_metrics.json') as f:
        data = json.load(f)
    if not data['passed']:
        print('Performance requirements not met!')
        exit(1)
    "
```
## Performance Tips
1. **Warm-up Runs**: The suite includes warm-up iterations to stabilize measurements
2. **Statistical Significance**: Uses p95/p99 percentiles instead of averages
3. **Isolation**: Run benchmarks on idle systems for consistent results
4. **Regression Detection**: Automatically compares with previous runs (10% threshold)
## Troubleshooting
### High Variance in Results
- Ensure system is idle during benchmarks
- Increase iteration counts for more stable results
- Check for background processes affecting performance
### Memory Measurements
- Memory usage includes Python interpreter overhead
- Use relative measurements between runs
- Force garbage collection between tests
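One way to take comparable memory readings between tests, assuming `tracemalloc` fits your workflow; the `index_some_files` workload is a placeholder for the code under measurement.
```python
import gc
import tracemalloc

def index_some_files() -> list[bytes]:
    """Placeholder workload; replace with the indexing code being measured."""
    return [bytes(1024) for _ in range(1000)]

gc.collect()            # force garbage collection before measuring
tracemalloc.start()

data = index_some_files()

current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()
print(f"current={current / 1e6:.1f} MB, peak={peak / 1e6:.1f} MB")
```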
### Regression Detection
- First run creates baseline (no regression check)
- Subsequent runs compare against history
- Adjust threshold with `threshold_percent` parameter
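The regression check boils down to a percentage comparison against the stored baseline. The helper below mirrors that behaviour in isolation, with an assumed `threshold_percent` default matching the documented 10%.
```python
def is_regression(current_ms: float, baseline_ms: float, threshold_percent: float = 10.0) -> bool:
    """Flag a regression when the current measurement exceeds the baseline by more than the threshold."""
    if baseline_ms <= 0:
        return False  # first run: no baseline yet, so nothing to compare against
    change = (current_ms - baseline_ms) / baseline_ms * 100
    return change > threshold_percent

assert is_regression(current_ms=120.0, baseline_ms=100.0)      # 20% slower -> regression
assert not is_regression(current_ms=105.0, baseline_ms=100.0)  # within the 10% threshold
```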