# Testing Strategy: Minimal but Effective
## Overview
The MCP split project adopts a "minimal but sufficient" testing approach focused on MCP protocol compliance and basic functionality. This strategy balances quality assurance with development velocity, avoiding over-testing while ensuring production readiness.
---
## Testing Philosophy
### Minimal but Sufficient
**What We Test:**
- ✅ MCP protocol compliance (tool schemas, error formats)
- ✅ Core functionality (index → search → results)
- ✅ Basic error handling (invalid inputs, missing data)
- ✅ Performance targets (latency SLAs)
- ✅ Integration points (codebase-mcp ↔ workflow-mcp)
**What We Don't Test:**
- ❌ Edge cases beyond common scenarios
- ❌ Exhaustive input validation (trust Pydantic)
- ❌ UI/UX testing (no UI in MCP servers)
- ❌ Load testing beyond performance targets
- ❌ Security testing (trust-based, local-first)
**Rationale:**
- MCPs are internal tools (not public APIs)
- FastMCP + Pydantic provide built-in validation
- Local-first architecture limits attack surface
- Development velocity prioritized over exhaustive coverage
---
## Test Categories
### 1. MCP Protocol Compliance Tests
**Purpose:** Ensure MCP specification adherence
**What to Test:**
#### Tool Registration
```python
# tests/test_mcp_compliance.py
import pytest
from fastmcp import FastMCP
def test_tool_registration():
    """Verify all tools registered with FastMCP."""
    from src.server import mcp

    tools = mcp.list_tools()
    tool_names = [tool["name"] for tool in tools]

    # Codebase-MCP expected tools
    assert "index_repository" in tool_names
    assert "search_code" in tool_names

    # Verify tool count (catch missing registrations)
    assert len(tool_names) == 2
```
#### Tool Schema Validation
```python
def test_search_code_schema():
    """Verify search_code tool schema follows MCP spec."""
    from src.server import mcp

    schema = mcp.get_tool_schema("search_code")

    # Required MCP fields
    assert "name" in schema
    assert "description" in schema
    assert "parameters" in schema

    # Parameter structure
    params = schema["parameters"]
    assert params["type"] == "object"
    assert "properties" in params
    assert "required" in params

    # Specific parameters
    properties = params["properties"]
    assert "query" in properties
    assert properties["query"]["type"] == "string"

    # Optional parameter marked correctly
    assert "project_id" in properties
    assert "query" in params["required"]
    assert "project_id" not in params["required"]  # Optional
```
#### Error Format Compliance
```python
async def test_mcp_error_format():
    """Verify errors follow MCP error format (JSON-RPC 2.0)."""
    from src.tools.search_code import search_code
    from fastmcp import McpError

    # Trigger error (invalid project_id)
    with pytest.raises(McpError) as exc_info:
        await search_code(query="test", project_id="invalid-uuid")
    error = exc_info.value

    # MCP error structure
    assert hasattr(error, "code")
    assert hasattr(error, "message")
    assert hasattr(error, "data")

    # JSON-RPC 2.0 error codes
    assert error.code == -32602  # Invalid params

    # Error message is descriptive
    assert len(error.message) > 0
    assert "project_id" in error.message.lower()

    # Error data provides context
    assert "project_id" in error.data
    assert error.data["project_id"] == "invalid-uuid"
```
#### No stdout/stderr Pollution
```python
async def test_no_stdout_pollution(capsys):
    """Verify no stdout/stderr output during tool execution."""
    from src.tools.search_code import search_code

    # Execute tool
    await search_code(
        query="test",
        project_id="550e8400-e29b-41d4-a716-446655440000"
    )

    # Capture output
    captured = capsys.readouterr()

    # MCP protocol requires no stdout/stderr
    assert captured.out == ""
    assert captured.err == ""

    # Note: Logging to file is allowed, just not stdout/stderr
```
**When to Run:**
- After tool implementation
- After schema changes
- Before every PR merge
- In CI/CD pipeline
**Acceptance Criteria:**
- All tools registered correctly
- All schemas valid (name, description, parameters, required fields)
- All errors use MCP format (code, message, data)
- No stdout/stderr during tool execution
---
### 2. Basic Functionality Tests
**Purpose:** Validate core workflows work end-to-end
**What to Test:**
#### Codebase-MCP: Index → Search Flow
```python
# tests/test_basic_functionality.py
import pytest
import tempfile
import os
@pytest.fixture
async def test_project():
    """Create a test project with sample code."""
    project_id = "test-proj-001"
    with tempfile.TemporaryDirectory() as tmpdir:
        # Create sample Python file
        sample_file = os.path.join(tmpdir, "sample.py")
        with open(sample_file, "w") as f:
            f.write("""
def hello_world():
    print("Hello, world!")

def fibonacci(n):
    if n <= 1:
        return n
    return fibonacci(n-1) + fibonacci(n-2)
""")
        yield project_id, tmpdir


async def test_index_then_search(test_project):
    """Test basic workflow: index repository then search."""
    project_id, repo_path = test_project

    # Step 1: Index repository
    from src.tools.index_repository import index_repository
    index_result = await index_repository(
        repo_path=repo_path,
        project_id=project_id
    )

    # Verify indexing succeeded
    assert index_result["status"] == "success"
    assert index_result["files_indexed"] == 1
    assert index_result["chunks_created"] > 0

    # Step 2: Search indexed code
    from src.tools.search_code import search_code
    search_result = await search_code(
        query="fibonacci function",
        project_id=project_id
    )

    # Verify search found results
    assert len(search_result["results"]) > 0

    # Verify result structure
    first_result = search_result["results"][0]
    assert "chunk_id" in first_result
    assert "file_path" in first_result
    assert "content" in first_result
    assert "similarity_score" in first_result

    # Verify content relevance
    assert "fibonacci" in first_result["content"].lower()
```
#### Workflow-MCP: Project Lifecycle
```python
# tests/test_basic_functionality.py
async def test_project_lifecycle():
    """Test basic workflow: create → switch → query."""
    from src.tools.create_project import create_project
    from src.tools.switch_project import switch_project
    from src.tools.get_active_project import get_active_project

    # Step 1: Create project
    project = await create_project(
        name="Test Project",
        description="Integration test project"
    )
    project_id = project["id"]
    assert project["name"] == "Test Project"
    assert project["status"] == "active"

    # Step 2: Switch to project
    await switch_project(project_id=project_id)

    # Step 3: Verify active project
    active = await get_active_project()
    assert active["project_id"] == project_id
```
**When to Run:**
- After core feature implementation
- After refactoring
- Before every PR merge
- In CI/CD pipeline
**Acceptance Criteria:**
- Index → search workflow completes successfully
- Search returns relevant results
- Project lifecycle (create → switch → query) works
- All tool outputs match expected schemas
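
The last criterion can be spot-checked with a small Pydantic model; the sketch below is illustrative only (the `SearchResult`/`SearchResponse` models are defined in the test itself rather than in `src/`, and it reuses the `test_project` fixture from above):

```python
# tests/test_basic_functionality.py
# Sketch: validate search output shape with test-local Pydantic models.
from pydantic import BaseModel


class SearchResult(BaseModel):
    chunk_id: str
    file_path: str
    content: str
    similarity_score: float


class SearchResponse(BaseModel):
    results: list[SearchResult]


async def test_search_output_matches_schema(test_project):
    """Parse search_code output into the expected result schema."""
    project_id, repo_path = test_project
    from src.tools.index_repository import index_repository
    from src.tools.search_code import search_code

    await index_repository(repo_path=repo_path, project_id=project_id)
    raw = await search_code(query="fibonacci", project_id=project_id)

    # Raises pydantic.ValidationError if a field is missing or mistyped
    parsed = SearchResponse(**raw)
    assert len(parsed.results) > 0
```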
---
### 3. Integration Tests
**Purpose:** Validate cross-MCP interactions
**What to Test:**
#### Codebase-MCP + Workflow-MCP Integration
```python
# tests/test_integration.py
@pytest.mark.integration
async def test_multi_project_isolation():
    """Test project isolation across both MCPs."""
    # Setup: Create two projects in workflow-mcp
    from workflow_mcp.tools.create_project import create_project as create_wf_project
    project_a = await create_wf_project(name="Project A")
    project_b = await create_wf_project(name="Project B")

    # Setup: Index same code in both projects (codebase-mcp)
    from codebase_mcp.tools.index_repository import index_repository
    await index_repository(
        repo_path="/tmp/shared-repo",
        project_id=project_a["id"]
    )
    await index_repository(
        repo_path="/tmp/shared-repo",
        project_id=project_b["id"]
    )

    # Test: Switch to Project A (workflow-mcp)
    from workflow_mcp.tools.switch_project import switch_project
    await switch_project(project_id=project_a["id"])

    # Test: Search should scope to Project A (codebase-mcp)
    from codebase_mcp.tools.search_code import search_code
    results_a = await search_code(query="test function")

    # Verify: All results from Project A
    assert all(r["project_id"] == project_a["id"] for r in results_a["results"])

    # Test: Switch to Project B
    await switch_project(project_id=project_b["id"])

    # Test: Search should scope to Project B
    results_b = await search_code(query="test function")

    # Verify: All results from Project B
    assert all(r["project_id"] == project_b["id"] for r in results_b["results"])

    # Verify: No cross-project leakage
    project_a_chunk_ids = {r["chunk_id"] for r in results_a["results"]}
    project_b_chunk_ids = {r["chunk_id"] for r in results_b["results"]}
    assert project_a_chunk_ids.isdisjoint(project_b_chunk_ids)
```
#### Project Switching with Cache Invalidation
```python
@pytest.mark.integration
async def test_project_switch_invalidates_cache():
    """Verify active project cache invalidates on switch."""
    from codebase_mcp.cache.project_cache import _project_cache
    from workflow_mcp.tools.switch_project import switch_project
    from codebase_mcp.helpers.project_helper import get_active_project

    # Setup: Get active project (caches it)
    project_a_id = await get_active_project()
    assert _project_cache.get("active_project") == project_a_id

    # Action: Switch projects
    project_b_id = "different-project-id"
    await switch_project(project_id=project_b_id)

    # Verify: Cache invalidated
    assert _project_cache.get("active_project") is None

    # Verify: Next call fetches new project
    new_active = await get_active_project()
    assert new_active == project_b_id
```
**When to Run:**
- After both MCPs implemented
- After integration point changes
- Before major releases
- In pre-deployment testing
**Acceptance Criteria:**
- Project isolation enforced (no cross-project leakage)
- Project switching updates active context
- Cache invalidation works correctly
- Both MCPs communicate via MCP protocol
---
### 4. Performance Validation Tests
**Purpose:** Ensure performance targets are met
**What to Test:**
#### Search Latency (p95)
```python
# tests/test_performance.py
import pytest
import time
import statistics
@pytest.mark.performance
async def test_search_latency_p95():
    """Verify search latency <500ms at p95."""
    from src.tools.search_code import search_code

    project_id = "perf-test-project"
    query = "function implementation"

    # Setup: Index 10K files (~50K chunks)
    await setup_large_codebase(project_id, num_files=10000)

    # Execute: 100 search queries
    latencies = []
    for i in range(100):
        start = time.time()
        await search_code(query=query, project_id=project_id)
        latency_ms = (time.time() - start) * 1000
        latencies.append(latency_ms)

    # Measure: p95 latency
    p95_latency = statistics.quantiles(latencies, n=20)[18]  # 95th percentile

    # Verify: <500ms target
    assert p95_latency < 500, f"p95 latency {p95_latency}ms exceeds 500ms target"

    # Report: Additional metrics
    print(f"p50: {statistics.median(latencies):.1f}ms")
    print(f"p95: {p95_latency:.1f}ms")
    print(f"max: {max(latencies):.1f}ms")
```
#### Index Creation (60s target)
```python
@pytest.mark.performance
async def test_index_creation_latency():
    """Verify indexing <60s for 10K files."""
    from src.tools.index_repository import index_repository

    project_id = "index-perf-test"
    repo_path = "/tmp/large-repo"  # Pre-created with 10K files

    # Execute: Index repository
    start = time.time()
    result = await index_repository(
        repo_path=repo_path,
        project_id=project_id,
        force_reindex=True
    )
    duration = time.time() - start

    # Verify: <60s target
    assert duration < 60, f"Indexing took {duration:.1f}s, exceeds 60s target"

    # Verify: All files indexed
    assert result["files_indexed"] == 10000
    assert result["status"] == "success"
```
#### Project Switching (<50ms)
```python
@pytest.mark.performance
async def test_project_switch_latency():
    """Verify project switching <50ms."""
    from src.tools.create_project import create_project
    from src.tools.switch_project import switch_project

    project_a = await create_project(name="Project A")
    project_b = await create_project(name="Project B")

    # Warm up
    await switch_project(project_id=project_a["id"])

    # Measure: 20 switches
    latencies = []
    for i in range(20):
        target = project_a["id"] if i % 2 == 0 else project_b["id"]
        start = time.time()
        await switch_project(project_id=target)
        latency_ms = (time.time() - start) * 1000
        latencies.append(latency_ms)

    # Verify: p95 <50ms
    p95 = statistics.quantiles(latencies, n=20)[18]
    assert p95 < 50, f"Project switching p95 {p95:.1f}ms exceeds 50ms target"
```
**When to Run:**
- After performance-sensitive changes
- Before major releases
- Weekly in CI/CD (separate job)
- Manual before deployment
**Acceptance Criteria:**
- Search latency: <500ms p95
- Index creation: <60s for 10K files
- Project switching: <50ms p95
- Project creation: <200ms
**Note:** Performance tests are slow; run them separately from the unit test suite.
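
The project-creation target above has no example; a minimal sketch, assuming the same `create_project` tool used in the lifecycle test, could look like this:

```python
# tests/test_performance.py
# Sketch only: measures create_project latency against the <200ms target.
@pytest.mark.performance
async def test_project_creation_latency():
    """Verify project creation <200ms (p95 over 20 runs)."""
    from src.tools.create_project import create_project

    latencies = []
    for i in range(20):
        start = time.time()
        await create_project(name=f"Perf Project {i}")
        latencies.append((time.time() - start) * 1000)

    p95 = statistics.quantiles(latencies, n=20)[18]
    assert p95 < 200, f"Project creation p95 {p95:.1f}ms exceeds 200ms target"
```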
---
## Test Automation
### Pytest Configuration
```toml
# pyproject.toml
[tool.pytest.ini_options]
asyncio_mode = "auto"
testpaths = ["tests"]
python_files = ["test_*.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]

# Markers for test categories
markers = [
    "unit: Unit tests (fast)",
    "integration: Integration tests (require both MCPs)",
    "performance: Performance tests (slow)",
    "mcp_compliance: MCP protocol compliance tests",
]

# Async fixture loop scope and per-test timeout
asyncio_default_fixture_loop_scope = "function"
timeout = 300  # 5 minutes max per test (requires the pytest-timeout plugin)

# Coverage settings (requires the pytest-cov plugin)
addopts = "--cov=src --cov-report=html --cov-report=term"
```
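
For the `-m` filters below to select anything, tests opt in with the matching decorator; for example, the compliance tests from earlier can be tagged like this (sketch):

```python
# Sketch: tagging a test so `pytest -m "mcp_compliance"` selects it.
import pytest

@pytest.mark.mcp_compliance
def test_tool_registration():
    """Verify all tools registered with FastMCP."""
    ...
```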
### Running Tests
#### All Tests
```bash
pytest tests/ -v
```
#### Unit Tests Only (Fast)
```bash
pytest tests/ -v -m "unit"
```
#### Integration Tests
```bash
# Requires both MCPs running
docker-compose up -d codebase-mcp workflow-mcp
pytest tests/ -v -m "integration"
```
#### Performance Tests
```bash
# Run separately (slow)
pytest tests/ -v -m "performance"
```
#### MCP Compliance Tests
```bash
pytest tests/ -v -m "mcp_compliance"
```
#### Watch Mode (Development)
```bash
# Requires the pytest-watch package
pytest-watch tests/ -v
```
---
## Test Fixtures
### Database Fixtures
```python
# tests/conftest.py
import pytest
import asyncpg
@pytest.fixture(scope="session")
async def test_db():
    """Create test database for session."""
    # Create test database
    conn = await asyncpg.connect(
        host="localhost",
        user="postgres",
        password="postgres",
        database="postgres"
    )
    await conn.execute("DROP DATABASE IF EXISTS test_codebase_mcp")
    await conn.execute("CREATE DATABASE test_codebase_mcp")
    await conn.close()

    # Run migrations
    test_conn = await asyncpg.connect(
        host="localhost",
        user="postgres",
        password="postgres",
        database="test_codebase_mcp"
    )

    # Apply schema
    with open("src/database/schema.sql") as f:
        await test_conn.execute(f.read())

    yield test_conn

    # Cleanup
    await test_conn.close()
    conn = await asyncpg.connect(
        host="localhost",
        user="postgres",
        password="postgres",
        database="postgres"
    )
    await conn.execute("DROP DATABASE test_codebase_mcp")
    await conn.close()


@pytest.fixture
async def clean_db(test_db):
    """Clean all tables before each test."""
    await test_db.execute("TRUNCATE TABLE code_chunks, repositories, projects CASCADE")
    yield test_db
```
### Project Fixtures
```python
# tests/conftest.py (continued)
import os
import shutil
import uuid


@pytest.fixture
async def test_project(clean_db):
    """Create a test project."""
    project_id = str(uuid.uuid4())
    await clean_db.execute(
        "INSERT INTO projects (id, name) VALUES ($1, $2)",
        project_id, "Test Project"
    )
    return project_id


@pytest.fixture
async def test_repository(test_project, clean_db):
    """Create a test repository with sample code."""
    repo_id = str(uuid.uuid4())
    repo_path = "/tmp/test-repo"

    # Create sample code
    os.makedirs(repo_path, exist_ok=True)
    with open(f"{repo_path}/sample.py", "w") as f:
        f.write("def test(): pass")

    # Insert into database
    await clean_db.execute(
        "INSERT INTO repositories (id, path, project_id) VALUES ($1, $2, $3)",
        repo_id, repo_path, test_project
    )

    yield repo_id, repo_path

    # Cleanup
    shutil.rmtree(repo_path, ignore_errors=True)
```
---
## When to Run Tests
### During Development
- **After each task:** Run relevant tests (not full suite)
```bash
pytest tests/test_search_code.py -v
```
- **Before committing:** Run affected tests
```bash
pytest tests/ -k "search or project" -v
```
### Before Creating PR
- **Full test suite:** All tests must pass
```bash
pytest tests/ -v
```
- **MCP compliance:** Ensure protocol adherence
```bash
pytest tests/ -v -m "mcp_compliance"
```
- **Coverage check:** Ensure adequate coverage
```bash
pytest tests/ --cov=src --cov-report=term --cov-fail-under=80
```
### CI/CD Pipeline
```yaml
# .github/workflows/test.yml
name: Tests

on:
  push:
  pull_request:
  schedule:
    - cron: '0 6 * * 1'  # Weekly; only the performance job runs on this trigger

jobs:
  unit-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install -e ".[test]"
      - run: pytest tests/ -v -m "unit"

  integration-tests:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:14
        env:
          POSTGRES_PASSWORD: postgres
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install -e ".[test]"
      - run: pytest tests/ -v -m "integration"

  performance-tests:
    runs-on: ubuntu-latest
    # Run weekly (schedule trigger), not on every push
    if: github.event_name == 'schedule'
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      - run: pip install -e ".[test]"
      - run: pytest tests/ -v -m "performance"
```
---
## Test Coverage Targets
### Codebase-MCP
- **Tools:** 90% coverage (critical path)
- **Database:** 80% coverage (migrations + queries)
- **Cache:** 85% coverage (edge cases important)
- **Helpers:** 75% coverage (utility functions)
- **Overall:** 80% minimum
### Workflow-MCP
- **Tools:** 90% coverage (critical path)
- **Database:** 80% coverage (entity relationships)
- **Session management:** 85% coverage (state critical)
- **Overall:** 80% minimum
**Rationale:**
- Focus on critical paths (tools, database)
- Lower coverage acceptable for utilities
- 80% balances quality with velocity
---
## Testing Anti-Patterns (Avoid)
### ❌ Over-Testing
```python
# Don't test Pydantic validation (trust the library)
async def test_search_code_rejects_negative_limit():
    """Pydantic already validates this!"""
    with pytest.raises(ValidationError):
        await search_code(query="test", limit=-1)
```
### ❌ Testing Implementation Details
```python
# Don't test private methods
def test_private_cache_eviction():
    """Internal implementation, may change"""
    cache = ProjectCache()
    cache._evict_lru()  # Private method
    assert cache._size == 0
```
### ❌ Slow Tests in Unit Test Suite
```python
# Don't put slow tests with fast tests
def test_search_with_large_codebase():
    """This belongs in performance tests, not unit tests"""
    # Indexes 100K files...
```
### ❌ Tests with External Dependencies
```python
# Don't rely on external services
def test_search_code():
    """Don't rely on real Ollama server"""
    # Calls live Ollama API...
    # Use mock instead!
```
---
## Mocking Strategies
### Mock External Dependencies
```python
# tests/test_search_code.py
import pytest
from unittest.mock import AsyncMock, patch

from src.tools.search_code import search_code


@pytest.mark.asyncio
async def test_search_code_with_mock_embeddings():
    """Test search without calling real embedding service."""
    # Mock embedding generation
    mock_embedding = [0.1] * 768  # Fake embedding vector

    with patch('src.embeddings.generate_embedding', new=AsyncMock(return_value=mock_embedding)):
        result = await search_code(
            query="test query",
            project_id="test-project-id"
        )

    # Verify search executed (with fake embedding)
    assert len(result["results"]) >= 0
```
### Mock Workflow-MCP Integration
```python
@pytest.mark.asyncio
async def test_get_active_project_with_mock():
    """Test active project retrieval without workflow-mcp."""
    from src.helpers.project_helper import get_active_project

    mock_project_id = "mock-project-123"

    with patch('src.helpers.project_helper.workflow_mcp.call_tool',
               new=AsyncMock(return_value={"current_session_id": mock_project_id})):
        project_id = await get_active_project()

    assert project_id == mock_project_id
```
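
If many unit tests need the embedding mock, the patch can live in a shared fixture rather than being repeated; the following is a sketch, assuming the same `src.embeddings.generate_embedding` patch target as above:

```python
# tests/conftest.py
# Sketch: reusable fixture wrapping the embedding mock shown above.
from unittest.mock import AsyncMock, patch

import pytest


@pytest.fixture
def mock_embeddings():
    """Patch embedding generation with a fixed 768-dim vector."""
    fake_vector = [0.1] * 768
    with patch("src.embeddings.generate_embedding",
               new=AsyncMock(return_value=fake_vector)) as mock:
        yield mock
```

Unit tests then request `mock_embeddings` as a parameter instead of opening the patch themselves.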
---
## Summary
**Test Categories:**
1. **MCP Protocol Compliance** - Tool registration, schemas, error formats
2. **Basic Functionality** - Core workflows (index → search)
3. **Integration Tests** - Cross-MCP interactions
4. **Performance Validation** - Latency SLAs
**When to Run:**
- Unit tests: After each task, before commit
- Integration tests: Before PR, in CI/CD
- Performance tests: Weekly, before releases
**Test Automation:**
- Pytest with async support
- Fixtures for database, projects, repositories
- Docker Compose for integration tests
- CI/CD pipeline with separate jobs
**Coverage Targets:**
- Tools: 90% (critical)
- Database: 80%
- Overall: 80% minimum
**Key Principles:**
- Minimal but sufficient (don't over-test)
- Trust Pydantic + FastMCP (don't test library code)
- Mock external dependencies (Ollama, workflow-mcp)
- Fast unit tests, separate performance tests
- Focus on protocol compliance + core workflows