Skip to main content
Glama
README.md7.6 kB
# Scout MCP Performance Benchmarks Comprehensive performance benchmarking suite for scout_mcp FastMCP server. ## Quick Start ```bash # Run all benchmarks python -m pytest benchmarks/ -v -s # Run specific benchmark suite python -m pytest benchmarks/test_connection_pool.py -v -s python -m pytest benchmarks/test_ssh_operations.py -v -s python -m pytest benchmarks/test_config_parsing.py -v -s python -m pytest benchmarks/test_uri_parsing.py -v -s python -m pytest benchmarks/test_end_to_end.py -v -s # CPU profiling python benchmarks/profile_cpu.py # Memory profiling python benchmarks/profile_memory.py ``` ## Benchmark Suites ### 1. Connection Pool (`test_connection_pool.py`) Tests connection pooling performance, lock contention, and resource usage. **Key Tests:** - `test_cold_start_latency` - First connection establishment - `test_warm_connection_latency` - Cached connection retrieval - `test_concurrent_single_host_lock_contention` - Lock contention under load - `test_concurrent_multi_host_parallelism` - Parallel connection creation - `test_pool_memory_footprint` - Memory usage scaling - `test_cleanup_task_overhead` - Background cleanup impact - `test_stale_connection_detection` - Connection failover **Results:** ``` Cold start: 10.47ms Warm lookup: 0.02ms Lock contention (100 req): 1.70ms avg, 2.60ms P95 Multi-host (10): 104ms (BOTTLENECK - should be 15ms) Memory (100 conns): 7.9 KB ``` --- ### 2. SSH Operations (`test_ssh_operations.py`) Tests SSH command execution performance and overhead. **Key Tests:** - `test_stat_path_latency` - File type detection - `test_cat_file_latency` - File reading - `test_ls_dir_latency` - Directory listing - `test_run_command_latency` - Arbitrary command execution - `test_large_file_transfer` - 1MB file transfer - `test_concurrent_operations` - Concurrent SSH ops **Results:** ``` stat_path: 1.10ms (13% overhead) cat_file: 1.13ms (13% overhead) ls_dir: 1.13ms (13% overhead) Large file (1MB): 93.46 MB/s (7% overhead) ``` --- ### 3. Configuration Parsing (`test_config_parsing.py`) Tests SSH config parsing and caching performance. **Key Tests:** - `test_ssh_config_parsing_cold` - Initial parse (100 hosts) - `test_ssh_config_parsing_cached` - Cached access - `test_large_ssh_config_parsing` - Large config (1000 hosts) - `test_host_lookup_performance` - Individual host lookup - `test_allowlist_filtering_performance` - Filter performance - `test_regex_parsing_overhead` - Regex overhead analysis **Results:** ``` Parse 100 hosts: 1.67ms (59,913 hosts/s) Parse 1000 hosts: 10.16ms (98,460 hosts/s) Cached access: 0.011ms avg, 0.013ms P95 ``` --- ### 4. URI Parsing (`test_uri_parsing.py`) Tests URI parsing performance and edge cases. **Key Tests:** - `test_uri_parsing_latency` - Standard URI parsing - `test_hosts_command_parsing` - Special 'hosts' command - `test_long_path_parsing` - Long path handling - `test_error_case_performance` - Invalid URI handling **Results:** ``` URI parsing: 0.0011ms avg (<0.01% overhead) Long paths (305 chars): 0.0007ms Error cases: 0.0004ms ``` --- ### 5. End-to-End (`test_end_to_end.py`) Tests full request flow from API to SSH. **Key Tests:** - `test_full_request_latency_cold` - Cold start request - `test_full_request_latency_warm` - Warm request - `test_concurrent_requests_same_host` - Concurrent same host - `test_concurrent_requests_different_hosts` - Concurrent multi-host - `test_mixed_operation_workload` - Mixed operations - `test_hosts_command_performance` - Hosts listing **Results:** ``` Cold start: 16.10ms Warm request: 10.62ms avg, 10.99ms P95 Single-host (50 conc): 2,186 req/s Multi-host (10 conc): 149 req/s (BOTTLENECK) Hosts command: 0.01ms ``` --- ## Profiling Tools ### CPU Profiling ```bash python benchmarks/profile_cpu.py ``` **Output:** - Top 30 functions by cumulative time - Top 20 functions by internal time - Call counts - Full profile saved to `.cache/cpu_profile.txt` **Findings:** - Most time in `asyncio.sleep()` (mock network) - Lock operations: <1% of total time - No CPU bottlenecks (I/O bound) --- ### Memory Profiling ```bash python benchmarks/profile_memory.py ``` **Output:** - Memory growth after pool creation - Memory growth after 100 connections - Memory growth after 1000 reuses - Top memory consumers - Leak detection **Findings:** - Peak memory: 0.07 MB (100 connections) - Per-connection: ~80 bytes - No memory leaks detected --- ## Performance Metrics ### Summary Table | Component | Metric | Value | Status | |-----------|--------|-------|--------| | Pool | Cold start | 10.47ms | ✅ Good | | Pool | Warm lookup | 0.02ms | ✅ Excellent | | Pool | Lock contention | 1.70ms avg | ⚠️ Moderate | | Pool | Multi-host | 104ms | ❌ Critical | | Pool | Memory (100 conns) | 7.9 KB | ✅ Excellent | | SSH | stat_path | 1.10ms | ✅ Good | | SSH | cat_file | 1.13ms | ✅ Good | | SSH | Large file (1MB) | 93.46 MB/s | ✅ Good | | Config | Parse 100 hosts | 1.67ms | ✅ Good | | Config | Cached access | 0.011ms | ✅ Excellent | | URI | Parsing | 0.0011ms | ✅ Excellent | | E2E | Cold request | 16.10ms | ✅ Good | | E2E | Warm request | 10.62ms | ✅ Good | --- ## Known Issues ### Critical Bottlenecks 1. **Global Lock Serialization** (`pool.py:44`) - Impact: 10x slowdown on parallel connections - Test: `test_concurrent_multi_host_parallelism` (FAILS) - Expected: 15ms, Actual: 104ms 2. **No Connection Pool Limits** (`pool.py:35`) - Impact: Memory exhaustion under load - Test: No limit enforcement in pool initialization 3. **No Request Concurrency Limits** (`server.py:36`) - Impact: DoS vulnerability - Test: No backpressure mechanism --- ## Interpreting Results ### Good Performance ``` [PERF] Metric: X.XXms ``` If value is: - <1ms: ✅ Excellent (negligible overhead) - 1-10ms: ✅ Good (acceptable for network operations) - 10-50ms: ⚠️ Moderate (check if expected) - >50ms: ❌ Poor (investigate bottleneck) ### Throughput ``` Throughput: X,XXX req/s ``` Target thresholds: - >1000 req/s: ✅ Excellent - 500-1000 req/s: ✅ Good - 100-500 req/s: ⚠️ Moderate - <100 req/s: ❌ Poor ### Memory ``` Memory: X.XX MB ``` Target per 100 connections: - <1 MB: ✅ Excellent - 1-10 MB: ✅ Good - 10-50 MB: ⚠️ Moderate - >50 MB: ❌ Poor --- ## Adding New Benchmarks ### Template ```python import pytest import time import statistics @pytest.mark.asyncio async def test_my_benchmark() -> None: """Benchmark description.""" latencies = [] for _ in range(100): start = time.perf_counter() # ... operation to benchmark ... latencies.append(time.perf_counter() - start) avg = statistics.mean(latencies) * 1000 p95 = statistics.quantiles(latencies, n=20)[18] * 1000 print(f"\n[PERF] My benchmark (n=100):") print(f" Avg: {avg:.2f}ms") print(f" P95: {p95:.2f}ms") assert avg < TARGET_MS, "Performance regression" ``` --- ## CI Integration Add to `.github/workflows/benchmark.yml`: ```yaml name: Benchmarks on: [push, pull_request] jobs: benchmark: runs-on: ubuntu-latest steps: - uses: actions/checkout@v3 - uses: actions/setup-python@v4 with: python-version: '3.11' - run: pip install -e .[dev] - run: python -m pytest benchmarks/ -v - run: python benchmarks/profile_cpu.py - run: python benchmarks/profile_memory.py ``` --- ## References - Full analysis: `.docs/performance-analysis.md` - Executive summary: `.docs/performance-summary.md` - Bottleneck visualization: `.docs/performance-bottlenecks.md`

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jmagar/scout_mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server