# codebase-mcp Planning Review
**Date**: 2025-10-11
**Reviewer**: Senior Planning Reviewer
**Plan Version**: Initial (Pre-Implementation)
---
## Executive Summary
The codebase-mcp refactoring plan demonstrates **strong architectural thinking** with clear separation of concerns, comprehensive documentation, and adherence to constitutional principles. The plan successfully addresses the core objective: transforming a monolithic MCP into a pure semantic code search service with multi-project support.
**Overall Quality**: 8/10 (Good to Excellent)
**Strengths**:
- Clear constitutional framework with non-negotiable principles
- Comprehensive user stories with testable acceptance criteria
- Detailed phase-by-phase implementation roadmap
- Strong emphasis on testing (TDD, performance, protocol compliance)
- Well-documented migration path for existing users
**Critical Concerns**:
1. **Missing database naming strategy** - No clear specification for how project databases are named/discovered
2. **Incomplete error handling specification** - workflow-mcp integration failure scenarios need more detail
3. **No rollback strategy** - Missing plan for reverting if refactor fails mid-implementation
4. **Performance baseline missing** - No current performance metrics to compare against
5. **Concurrent access patterns undefined** - Multiple clients accessing same project database
**Recommendation**: **APPROVED WITH CHANGES**
Address critical issues #1-3 before implementation begins. Issues #4-5 can be addressed during Phase 0-1.
---
## Critical Issues (Must Fix Before Implementation)
### C1: Database Naming and Discovery Strategy Incomplete
**Location**: `refactoring-plan.md` Phase 8, `specify-prompt.txt` Questions section
**Problem**: The plan mentions `database_name = f"codebase_{project_id}"` but doesn't specify:
- How project_id format is validated (alphanumeric only? length limits?)
- How database names are sanitized (what if project_id contains spaces/special chars?)
- How the system discovers what databases exist (list all `codebase_*` databases?)
- What happens if database creation fails due to permissions
**Impact**: Runtime errors, SQL injection risks, unclear operational model
**Required Changes**:
1. Add validation rules for project_id:
```python
# Example specification needed
project_id: str = Field(
    pattern=r'^[a-z0-9-]{1,50}$',
    description="Lowercase alphanumeric with hyphens, max 50 chars"
)
```
2. Document database naming convention explicitly:
```markdown
## Database Naming Convention
- Format: `codebase_<project_id>`
- Example: project_id="my-app" → database="codebase_my-app"
- Invalid characters replaced with hyphens
- Length limit: 63 chars (PostgreSQL identifier limit)
```
3. Add database discovery endpoint specification:
```markdown
### Future Tool (v2.1.0)
- `list_indexed_projects()` → returns list of project_ids with databases
```
4. Specify permission handling in `tech-stack.md` (a combined sketch of items 1, 2, and 4 follows this list):
```markdown
## Database Permissions
- User must have CREATEDB privilege
- Error message: "Insufficient permissions to create database. Grant CREATEDB: ALTER USER postgres CREATEDB;"
```
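As referenced above, a minimal Python sketch combining the validation from item 1, the name derivation from item 2, and the permission handling from item 4. Function names are illustrative; asyncpg maps PostgreSQL's permission error (42501) to `InsufficientPrivilegeError`:
```python
import re
import asyncpg

PROJECT_ID_RE = re.compile(r'^[a-z0-9-]{1,50}$')

def database_name_for(project_id: str) -> str:
    """Validate project_id and derive the database name (items 1-2)."""
    if not PROJECT_ID_RE.fullmatch(project_id):
        raise ValueError(
            "Invalid project_id format. Use lowercase alphanumeric with hyphens."
        )
    # "codebase_" prefix (9 chars) + 50-char project_id stays under the
    # 63-char PostgreSQL identifier limit
    return f"codebase_{project_id}"

async def create_project_database(admin_conn: asyncpg.Connection, project_id: str) -> None:
    """Create the per-project database, surfacing the permission error from item 4."""
    db_name = database_name_for(project_id)
    try:
        # Identifier is safe to interpolate: validated against PROJECT_ID_RE above
        await admin_conn.execute(f'CREATE DATABASE "{db_name}"')
    except asyncpg.exceptions.InsufficientPrivilegeError:
        raise RuntimeError(
            "Insufficient permissions to create database. "
            "Grant CREATEDB: ALTER USER postgres CREATEDB;"
        )
```
Because the identifier is validated against a strict character set before interpolation, the `CREATE DATABASE` statement is not injectable.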
### C2: workflow-mcp Integration Error Handling Underspecified
**Location**: `refactoring-plan.md` Phase 7, `user-stories.md` Story 3
**Problem**: The plan shows basic error handling (`return None` if workflow-mcp unavailable) but doesn't specify:
- How to distinguish between "workflow-mcp not installed" vs "workflow-mcp crashed" vs "network timeout"
- Whether to cache active_project_id (and for how long?)
- What happens if workflow-mcp returns invalid project_id (doesn't exist, no database)
- Whether to retry on transient failures
**Impact**: Poor user experience, silent failures, unclear debugging
**Required Changes**:
1. Add error categorization in `refactoring-plan.md`:
```python
class WorkflowMCPError(Exception):
    """Base class for workflow-mcp integration errors"""
    pass

class WorkflowMCPUnavailable(WorkflowMCPError):
    """workflow-mcp not running or unreachable"""
    message = "workflow-mcp unavailable. Provide explicit project_id parameter."

class WorkflowMCPTimeout(WorkflowMCPError):
    """workflow-mcp request timed out"""
    message = "workflow-mcp timed out. Retrying with explicit project_id..."
```
2. Specify caching strategy in `implementation-phases.md` Phase 7 (a Python sketch follows this list):
```markdown
### Active Project Caching
- Cache active_project_id for 60 seconds
- Invalidate on explicit project_id parameter
- Log cache hits/misses for observability
```
3. Add validation in `user-stories.md` Story 3:
```markdown
### Scenario 5: Invalid project_id from workflow-mcp
workflow-mcp returns project_id="nonexistent-project"
search_code(query="test", project_id=None)
# Expected: Error "Project 'nonexistent-project' not indexed. Run index_repository first."
```
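As referenced in item 2, a minimal sketch of the caching behavior, assuming a module-level `logger` and the `get_active_project_id()` helper from Phase 7:
```python
import time

_CACHE_TTL_SECONDS = 60
_cached_project_id: str | None = None
_cached_at: float = 0.0

async def resolve_project_id(explicit_project_id: str | None) -> str | None:
    """Resolve project_id with a 60s cache over workflow-mcp lookups (sketch)."""
    global _cached_project_id, _cached_at
    if explicit_project_id is not None:
        # Explicit parameter wins and invalidates the cached value
        _cached_project_id, _cached_at = None, 0.0
        return explicit_project_id
    if _cached_project_id and time.monotonic() - _cached_at < _CACHE_TTL_SECONDS:
        logger.debug("active_project_id cache hit")
        return _cached_project_id
    logger.debug("active_project_id cache miss")
    _cached_project_id = await get_active_project_id()  # Phase 7 helper
    _cached_at = time.monotonic()
    return _cached_project_id
```
An explicit `project_id` always wins and clears the cache, so stale workflow-mcp context can never override a user's choice.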
### C3: No Rollback Strategy Defined
**Location**: Missing from `refactoring-plan.md` and `implementation-phases.md`
**Problem**: The plan assumes linear success through all 12 phases. No specification for:
- What to do if Phase 7 (multi-project support) fails after Phase 2-6 deleted code
- How to revert database migrations if tests fail
- Whether to maintain a "last known good" branch
- How to handle partial migration state (some projects migrated, others not)
**Impact**: High risk of getting stuck in broken state mid-refactor
**Required Changes**:
1. Add "Rollback Strategy" section to `implementation-phases.md`:
````markdown
## Rollback Strategy
### Per-Phase Rollback
- Each phase has its own commit (micro-commits)
- Rollback: `git revert <phase-commit-hash>`
- Test rollback after each phase completion
### Database Migration Rollback
Create `migrations/002_rollback.sql`:
```sql
-- Restore non-search tables from backup
-- Remove project_id columns
-- Restore original unique constraints
```
### Emergency Rollback (Full Refactor)
If refactor must be abandoned:
1. Checkout main branch: `git checkout main`
2. Restore database backup: `psql codebase_mcp < backup-before-002.sql`
3. Document lessons learned for next attempt
````
2. Add phase-specific rollback instructions in each phase:
```markdown
### Phase 2: Database Schema Refactoring
**Rollback Plan**:
- If migration fails: `psql codebase_mcp < backup-before-002.sql`
- If tests fail after migration: Revert schema.sql, revert migration script
```
### C4: Performance Baseline Missing
**Location**: `implementation-phases.md` Phase 1, `user-stories.md` Story 5
**Problem**: Phase 11 defines performance targets (<500ms search, <60s indexing for 10k files), but the Phase 1 baseline doesn't capture current performance, so there is no way to prove the refactor avoids regressions.
**Impact**: Risk of shipping slower implementation, no data-driven optimization
**Required Changes**:
1. Add to Phase 1 baseline tasks in `implementation-phases.md`:
````markdown
### Phase 1: Additional Baseline Metrics
4. **Capture Performance Baseline**
```bash
# Create test repository with 10k files
python scripts/generate_test_repo.py --files=10000 --output=/tmp/test-repo
# Benchmark indexing (current implementation)
time python -m codebase_mcp.tools.search index /tmp/test-repo
# Benchmark search (100 queries, measure p50/p95)
python scripts/benchmark_search.py --queries=100 --output=docs/baseline/search-latency.json
```
5. **Document Baseline Performance**
Create `docs/baseline/performance-before.md`:
```markdown
# Performance Baseline (Before Refactor)
## Indexing
- 10,000 files: X seconds
- Throughput: Y files/second
## Search
- p50 latency: X ms
- p95 latency: Y ms
- p99 latency: Z ms
```
````
2. Add comparison requirement to Phase 11 acceptance criteria:
```markdown
### Acceptance Criteria
- [ ] Search latency <500ms (p95) AND no worse than 20% regression from baseline
- [ ] Indexing time <60s for 10k files AND no worse than 10% throughput regression from baseline
```
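The `scripts/benchmark_search.py` helper referenced in the baseline tasks doesn't exist yet; a minimal sketch of the percentile computation it needs (the `search_code` call stands in for whatever the current v1 entry point is):
```python
import json
import statistics
import time

async def benchmark_search(queries: list[str], output_path: str) -> None:
    """Run queries sequentially and write p50/p95/p99 latency to JSON (sketch)."""
    latencies_ms: list[float] = []
    for query in queries:
        start = time.perf_counter()
        await search_code(query)  # v1 entry point; adjust to the current signature
        latencies_ms.append((time.perf_counter() - start) * 1000)
    # statistics.quantiles(n=100) returns 99 cut points:
    # index 49 -> p50, index 94 -> p95, index 98 -> p99
    cuts = statistics.quantiles(latencies_ms, n=100)
    report = {"p50_ms": cuts[49], "p95_ms": cuts[94], "p99_ms": cuts[98]}
    with open(output_path, "w") as f:
        json.dump(report, f, indent=2)
```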
### C5: Concurrent Access Patterns Undefined
**Location**: `refactoring-plan.md` Phase 8, `tech-stack.md` AsyncPG section
**Problem**: Plan specifies connection pooling (5-20 connections per project) but doesn't address:
- What happens when multiple Claude Code instances search same project simultaneously?
- Whether connection pools are shared across MCP server instances (if running multiple servers)
- How indexing locks repositories (prevent concurrent index operations on same repo)
- Whether search during indexing returns partial/stale results
**Impact**: Data races, connection pool exhaustion, unclear concurrency model
**Required Changes**:
1. Add "Concurrency Model" section to `tech-stack.md`:
```markdown
## Concurrency Model
### Single MCP Server Instance (Primary Use Case)
- Connection pool shared by all requests to that server
- 20 concurrent searches per project (pool max_size)
- Indexing operations use advisory locks (prevents concurrent indexing of same repo)
### Multiple MCP Server Instances (Advanced)
- Each server has independent connection pools
- PostgreSQL handles concurrent connections (default max_connections=100)
- Advisory locks shared across servers (PostgreSQL-level)
### Search During Indexing
- Search returns results from current index state
- No blocking (search and index are concurrent)
- Stale results possible during re-indexing (acceptable tradeoff)
```
2. Add advisory lock implementation to `refactoring-plan.md` Phase 7:
```python
import hashlib

async def index_repository_db(pool, repo_path, project_id, force_reindex):
    """Index repository with advisory lock to prevent concurrent indexing"""
    async with pool.acquire() as conn:
        # Derive a stable lock id: built-in hash() is randomized per process
        # (PYTHONHASHSEED), so it cannot coordinate locks across server instances
        digest = hashlib.sha256(f"{project_id}:{repo_path}".encode()).digest()
        lock_id = int.from_bytes(digest[:4], "big") % (2**31)
        acquired = await conn.fetchval(
            "SELECT pg_try_advisory_lock($1)",
            lock_id
        )
        if not acquired:
            raise RuntimeError(
                f"Repository {repo_path} is currently being indexed. "
                "Wait for completion or use force_reindex=True."
            )
        try:
            # Indexing logic here...
            pass
        finally:
            # Release lock
            await conn.execute("SELECT pg_advisory_unlock($1)", lock_id)
```
---
## Important Issues (Should Fix)
### I1: Migration Guide Lacks Detailed Data Migration Instructions
**Location**: `refactoring-plan.md` Phase 10
**Problem**: Migration guide tells users to "restore to workflow-mcp's database schema" for work items/tasks data but doesn't provide concrete steps or SQL scripts.
**Suggested Fix**: Add data migration section:
````markdown
## Data Migration to workflow-mcp
### Step 1: Export Non-Search Data
```bash
# Export work items
pg_dump codebase_mcp \
--table=work_items \
--table=work_item_dependencies \
--data-only \
--file=work_items_export.sql
# Export tasks
pg_dump codebase_mcp \
--table=tasks \
--table=task_planning_references \
--data-only \
--file=tasks_export.sql
```
### Step 2: Transform Schema (if needed)
```sql
-- workflow-mcp may have different column names
-- Manual review and adjustment required
```
### Step 3: Import to workflow-mcp
```bash
psql workflow_mcp < work_items_export.sql
psql workflow_mcp < tasks_export.sql
```
````
### I2: No Specification for Database Cleanup/Archival
**Location**: Missing from all documents
**Problem**: Multi-project design means one database per project. No plan for:
- How to delete project databases when project is removed
- Whether to archive databases before deletion (backup old projects)
- How to reclaim disk space from unused projects
- Whether to warn users about orphaned databases
**Suggested Fix**: Add to `refactoring-plan.md`:
```markdown
## Future Tool: Database Lifecycle Management (v2.2.0)
### delete_project_index(project_id: str, backup: bool = True)
- Optionally backup database to `backups/codebase_<project_id>_<timestamp>.sql`
- Drop database `codebase_<project_id>`
- Remove connection pool entry
### list_orphaned_databases()
- Query PostgreSQL for `codebase_*` databases
- Compare against workflow-mcp's active projects
- Return list of orphaned databases (no matching project)
```
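A sketch of the orphan-detection query behind `list_orphaned_databases()`, assuming an asyncpg admin connection and a set of active project IDs fetched from workflow-mcp:
```python
import asyncpg

async def list_orphaned_databases(
    admin_conn: asyncpg.Connection,
    active_project_ids: set[str],
) -> list[str]:
    """Return codebase_* databases with no matching active project (sketch)."""
    rows = await admin_conn.fetch(
        "SELECT datname FROM pg_database WHERE datname LIKE 'codebase_%'"
    )
    return [
        row["datname"]
        for row in rows
        if row["datname"].removeprefix("codebase_") not in active_project_ids
    ]
```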
### I3: Error Messages Not Fully Specified
**Location**: `user-stories.md` Story 8 has examples but incomplete
**Problem**: Story 8 shows 4 error scenarios but many more exist:
- Invalid file_type parameter (e.g., "python" instead of "py")
- Invalid directory parameter (e.g., "../../../etc/passwd")
- Embedding generation fails (Ollama model unloaded from memory)
- Database connection pool exhausted (all 20 connections busy)
- Repository path is a file, not a directory
**Suggested Fix**: Create comprehensive error catalog in new document `docs/error-catalog.md`:
```markdown
# Error Catalog
## E001: Invalid Repository Path
**Trigger**: `repo_path` does not exist or is not a directory
**Error**: `ValueError("Repository path does not exist: /path")`
**Action**: "Verify the path is correct and points to a directory"
## E002: Invalid Project ID Format
**Trigger**: `project_id` contains invalid characters
**Error**: `ValueError("Invalid project_id format. Use lowercase alphanumeric with hyphens.")`
**Action**: "Example valid project_id: my-project-123"
## E003: Connection Pool Exhausted
**Trigger**: All 20 connections in use
**Error**: `TimeoutError("Database connection pool exhausted for project <project_id>")`
**Action**: "Wait for active queries to complete or increase max_size in connection.py"
[... continue for all error scenarios]
```
### I4: No Specification for Incremental Updates
**Location**: `implementation-phases.md` mentions "Future Enhancements" but no detail
**Problem**: Plan mentions "incremental indexing (only changed files)" as future work but doesn't specify:
- How to detect changed files (git diff? filesystem timestamps? content hash?)
- Whether to store file hashes in database for change detection
- How to handle file deletions (remove chunks? mark as deleted?)
**Suggested Fix**: Not blocking for v2.0.0, but add placeholder in `user-stories.md`:
```markdown
## Future Story: Incremental Indexing (v2.1.0)
**As a** developer re-indexing a large repository
**I need** only changed files to be re-indexed
**So that** re-indexing is fast (seconds instead of minutes)
### Acceptance Criteria
1. Store file content hash (SHA-256) in repositories table
2. On re-index, compare current hash vs stored hash
3. Skip unchanged files (hash match)
4. Update chunks only for changed files
5. Remove chunks for deleted files
[NEEDS CLARIFICATION: How to detect file renames?]
```
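A sketch of the hash-based change detection from acceptance criteria 1-3 (helper names are illustrative):
```python
import hashlib
from pathlib import Path

def file_sha256(path: Path) -> str:
    """SHA-256 of file contents, streamed to handle large files (sketch)."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()

def changed_files(paths: list[Path], stored_hashes: dict[str, str]) -> list[Path]:
    """Files whose current hash differs from the hash stored at last index time."""
    return [p for p in paths if stored_hashes.get(str(p)) != file_sha256(p)]
```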
### I5: Test Data Generation Not Specified
**Location**: `implementation-phases.md` Phase 11 mentions "large_indexed_repo" fixture but no details
**Problem**: Performance tests need realistic test data (10k files) but plan doesn't specify:
- What languages to generate (Python, JS, TypeScript mix?)
- File size distribution (realistic: small configs, large source files)
- How to ensure deterministic generation (for reproducible benchmarks)
**Suggested Fix**: Add test data specification to `implementation-phases.md` Phase 1:
````markdown
### Phase 1: Create Test Data Generators
Create `tests/fixtures/generate_test_repo.py`:
```python
def generate_test_repo(
    num_files: int,
    output_dir: Path,
    languages: list[str] = ["py", "js", "ts"],
    seed: int = 42,  # For reproducibility
) -> Path:
    """
    Generate realistic test repository with:
    - 70% small files (<1KB): configs, tests
    - 25% medium files (1-10KB): source code
    - 5% large files (>10KB): generated code, data
    - Mixture of languages (weighted by usage)
    """
    pass
```
````
### I6: No Multi-Language Testing Strategy
**Location**: `tech-stack.md` mentions tree-sitter supports 50+ languages, but no test plan
**Problem**: Plan assumes tree-sitter "just works" for all languages but doesn't specify:
- Which languages are tested in integration tests (Python? JS? Rust?)
- Whether to test chunking quality per language (functions split correctly?)
- How to handle languages without tree-sitter parser (fallback to line-based chunking?)
**Suggested Fix**: Add to `implementation-phases.md` Phase 11:
````markdown
### Phase 11: Multi-Language Testing
Create `tests/test_language_support.py`:
```python
@pytest.mark.parametrize("language,file_path", [
    ("python", "tests/fixtures/sample.py"),
    ("javascript", "tests/fixtures/sample.js"),
    ("typescript", "tests/fixtures/sample.ts"),
    ("rust", "tests/fixtures/sample.rs"),
    ("go", "tests/fixtures/sample.go"),
])
@pytest.mark.asyncio
async def test_language_chunking_quality(language, file_path):
    """Test that tree-sitter chunks code correctly per language"""
    chunks = await chunk_file(file_path, language)
    # Verify functions not split across chunks
    # Verify classes kept together
    # Verify imports in separate chunks
    pass
```
````
---
## Suggestions (Nice to Have)
### S1: Add Architectural Decision Records (ADRs)
**Location**: New document `docs/adr/`
**Suggestion**: Document key decisions with rationale:
- ADR-001: Why pgvector instead of dedicated vector store (Pinecone, Weaviate)?
- ADR-002: Why one database per project instead of multi-tenancy in single database?
- ADR-003: Why FastMCP instead of raw MCP Python SDK?
- ADR-004: Why AsyncPG instead of SQLAlchemy async?
**Benefit**: Future maintainers understand tradeoffs, avoid relitigating decisions
### S2: Add Observability/Metrics Specification
**Location**: New section in `tech-stack.md`
**Suggestion**: Specify what metrics to track:
```markdown
## Observability
### Metrics (Prometheus-compatible)
- `codebase_search_latency_seconds{project_id}` - Search latency histogram (percentiles derived at query time via `histogram_quantile`)
- `codebase_indexing_duration_seconds{project_id}` - Indexing time histogram
- `codebase_embeddings_generated_total{project_id}` - Counter of embedding requests
- `codebase_database_connections{project_id, state}` - Connection pool usage
### Logs (Structured JSON)
- Every search: `{"op": "search", "project_id": "...", "query": "...", "latency_ms": 250}`
- Every index: `{"op": "index", "project_id": "...", "files": 1234, "duration_s": 45.2}`
- Errors: `{"op": "search", "error": "...", "project_id": "...", "traceback": "..."}`
```
**Benefit**: Production debugging, performance monitoring, SLA tracking
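A sketch of how these metrics could be registered with the `prometheus_client` package (metric names as specified above; the placement comments are assumptions):
```python
from prometheus_client import Counter, Gauge, Histogram

SEARCH_LATENCY = Histogram(
    "codebase_search_latency_seconds", "Search latency", ["project_id"]
)
EMBEDDINGS_GENERATED = Counter(
    "codebase_embeddings_generated_total", "Embedding requests", ["project_id"]
)
DB_CONNECTIONS = Gauge(
    "codebase_database_connections", "Connection pool usage", ["project_id", "state"]
)

# Inside search_code, once the request completes (assumed placement):
# SEARCH_LATENCY.labels(project_id=project_id).observe(latency_seconds)
# EMBEDDINGS_GENERATED.labels(project_id=project_id).inc()
```
Note that Prometheus derives percentiles at query time with `histogram_quantile`, so the histogram itself only needs the `project_id` label.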
### S3: Add Configuration File Examples
**Location**: New file `docs/configuration-examples.md`
**Suggestion**: Show example configurations for different scenarios:
```yaml
# Example 1: Local development (single developer)
POSTGRES_HOST: localhost
POSTGRES_PORT: 5432
OLLAMA_URL: http://localhost:11434
WORKFLOW_MCP_URL: http://localhost:3000/mcp/workflow-mcp
# Example 2: Team setup (shared PostgreSQL, local Ollama)
POSTGRES_HOST: postgres.team.local
POSTGRES_PORT: 5432
OLLAMA_URL: http://localhost:11434 # Each dev runs own Ollama
# Example 3: No workflow-mcp (standalone)
POSTGRES_HOST: localhost
POSTGRES_PORT: 5432
OLLAMA_URL: http://localhost:11434
# WORKFLOW_MCP_URL not set (explicit project_id required)
```
**Benefit**: Easier onboarding, reduced support questions
### S4: Add Troubleshooting Guide
**Location**: New file `docs/troubleshooting.md`
**Suggestion**: Common problems and solutions:
````markdown
## Troubleshooting
### Problem: Search is slow (>1s latency)
**Possible Causes**:
- HNSW index not created (check `\d code_chunks` in psql)
- Connection pool size too small (increase max_size in connection.py)
- Ollama model not loaded in memory (first query loads model)
**Solution**:
```sql
-- Verify HNSW index exists
SELECT indexname, indexdef FROM pg_indexes WHERE tablename = 'code_chunks';
-- Should show: idx_code_chunks_embedding using hnsw
```
### Problem: Indexing fails with "Out of memory"
**Possible Causes**:
- Batch size too large (50 files at once)
- Large files (>1MB) cause embedding generation to fail
**Solution**: Reduce batch size in `indexing.py`:
```python
BATCH_SIZE = 25 # Reduce from 50
```
````
### S5: Add Diagram: Multi-Project Architecture
**Location**: `README.md` or `docs/architecture.md`
**Suggestion**: Visual diagram showing:
```
┌─────────────────────────────────────────────┐
│           Claude Code (MCP Client)          │
└──────────────────────┬──────────────────────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
     ┌─────────┐  ┌─────────┐  ┌─────────┐
     │workflow │  │codebase │  │  other  │
     │  -mcp   │  │  -mcp   │  │  MCPs   │
     └────┬────┘  └────┬────┘  └─────────┘
          │            │
          │ get_active_project_id()
          └───────────>│
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
     ┌─────────┐  ┌─────────┐  ┌─────────┐
     │codebase_│  │codebase_│  │codebase_│
     │project_a│  │project_b│  │project_c│
     └─────────┘  └─────────┘  └─────────┘
```
**Benefit**: Clearer mental model of system architecture
### S6: Add Upgrade Path from v1.x to v2.0 in Code
**Location**: New migration tool `scripts/migrate_v1_to_v2.py`
**Suggestion**: Automated migration script:
```python
#!/usr/bin/env python3
"""
Automated migration from codebase-mcp v1.x to v2.0
Usage: python scripts/migrate_v1_to_v2.py --backup --project-id=default
"""
import argparse
import subprocess

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--backup", action="store_true", help="Backup database before migration")
    parser.add_argument("--project-id", required=True, help="Project ID for existing data")
    args = parser.parse_args()
    if args.backup:
        print("Creating backup...")
        subprocess.run(["pg_dump", "codebase_mcp", "-f", "backup-v1.sql"], check=True)
    print("Running migration...")
    subprocess.run(["psql", "codebase_mcp", "-f", "migrations/002_remove_non_search_tables.sql"], check=True)
    print("Migration complete!")
    print(f"All existing data tagged with project_id: {args.project_id}")

if __name__ == "__main__":
    main()
```
**Benefit**: Reduces migration friction, fewer support requests
---
## Completeness Analysis
### Requirements Coverage: 90%
**Well Covered**:
- ✅ Functional requirements: Search, indexing, multi-project support
- ✅ Non-functional requirements: Performance, local-first, protocol compliance
- ✅ User stories: 8 comprehensive stories with acceptance criteria
- ✅ Technical stack: Fully specified with rationale
- ✅ Constitutional principles: 11 principles, clear enforcement
**Missing/Incomplete**:
- ⚠️ Database naming/discovery strategy (Critical C1)
- ⚠️ Error handling for workflow-mcp integration (Critical C2)
- ⚠️ Rollback strategy (Critical C3)
- ⚠️ Concurrent access patterns (Critical C5)
- ⚠️ Data migration to workflow-mcp (Important I1)
### Success Criteria Coverage: 85%
**Measurable Criteria**:
- ✅ Functional: "Index and search across 3+ projects simultaneously"
- ✅ Performance: "<500ms search latency (p95)", "<60s indexing for 10k files"
- ✅ Quality: "mypy --strict passes", "coverage >80%", "mcp-inspector 100%"
**Incomplete Criteria**:
- ⚠️ No baseline comparison (how to prove no regression?) (Critical C4)
- ⚠️ Operational criteria vague ("Clear error messages" - what's the standard?)
### Edge Cases Coverage: 70%
**Well Covered**:
- ✅ Repository doesn't exist (Story 8, Scenario 1)
- ✅ Invalid project_id (Story 8, Scenario 2)
- ✅ Ollama not running (Story 8, Scenario 3)
- ✅ Empty search results (Story 1, Scenario 2)
**Missing Edge Cases**:
- ⚠️ Project has no repositories indexed (search returns empty, or error?)
- ⚠️ Repository path is a file, not directory
- ⚠️ Repository contains only binary files (no text files to index)
- ⚠️ Project_id contains SQL injection attempt
- ⚠️ Two repositories with same path in different projects
- ⚠️ Database disk full during indexing
- ⚠️ Ollama model unloaded from memory mid-operation
### Error Scenarios Coverage: 75%
**Documented Errors**:
- ✅ Missing repository path
- ✅ Invalid project_id
- ✅ Ollama unavailable
- ✅ PostgreSQL unavailable
- ✅ Schema version mismatch
**Missing Error Scenarios**:
- ⚠️ Invalid file_type parameter format
- ⚠️ Directory traversal attack via directory parameter
- ⚠️ Connection pool exhaustion
- ⚠️ workflow-mcp returns malformed response
- ⚠️ Database migration partially completed (half tables dropped)
---
## Correctness Analysis
### Phase Dependency Order: Correct ✅
The 12 phases follow logical dependency order:
- Phase 2 (schema) before Phase 4 (database operations) ✅
- Phase 3 (remove tools) before Phase 5 (update server) ✅
- Phase 7 (multi-project) before Phase 8 (connection pooling) ✅
- Phase 11 (performance) before Phase 12 (final validation) ✅
**No circular dependencies detected**.
### Git Strategy: Correct ✅
- Micro-commits after each phase (working state guaranteed) ✅
- Conventional commit format specified ✅
- Feature branch isolation (`002-refactor-pure-search`) ✅
- Tag for release (v2.0.0) ✅
**Concern**: No mention of squashing commits before merge. With 12+ commits, PR review may be difficult.
**Suggestion**: Add to Phase 12:
````markdown
### Optional: Squash Commits for Clean History
```bash
# Interactive rebase to squash into ~3-4 logical commits
git rebase -i main
# Group by: 1) Remove non-search features, 2) Add multi-project, 3) Documentation
```
````
### Testing Strategy: Mostly Correct ✅
- TDD approach (tests before implementation) ✅
- Unit, integration, protocol, performance tests specified ✅
- mcp-inspector validation included ✅
**Concern**: Test execution order not specified. Should run:
1. Fast unit tests (on every commit)
2. Integration tests (before push)
3. Performance tests (before merge)
4. Protocol tests (before merge)
**Suggestion**: Add to `implementation-phases.md`:
````markdown
## Test Execution Strategy
### Pre-Commit (Fast, <10s)
```bash
pytest tests/unit/ -v
```
### Pre-Push (Medium, <2min)
```bash
pytest tests/ -v --ignore=tests/performance/
```
### Pre-Merge (Full, <10min)
```bash
pytest tests/ -v --cov=src/codebase_mcp --cov-report=html
pytest tests/test_performance.py -v -m performance
mcp-inspector http://localhost:8000 --full-validation
```
````
### Performance Targets: Realistic ✅
- <500ms search latency (p95): Realistic for pgvector HNSW indexes ✅
- <60s indexing for 10k files: Realistic at ~166 files/second ✅
- 20 concurrent connections per project: Realistic for AsyncPG pooling ✅
**Validation**: Targets align with pgvector benchmarks (sub-ms vector search on millions of vectors) and Ollama embedding generation (~200ms per query).
---
## Clarity Analysis
### Developer Implementability: 85%
**Clear Sections**:
- ✅ Phase-by-phase breakdown with specific files to modify
- ✅ Code examples showing "before" and "after" states
- ✅ Acceptance criteria for each phase (clear success definition)
- ✅ Git commit messages specified per phase
**Ambiguous Sections**:
- ⚠️ "Update imports across codebase" (Phase 3) - Which files? How to find them systematically?
- ⚠️ "Optimize if needed" (Phase 11) - What's the optimization procedure? Profile with what tool?
- ⚠️ "Follow migration guide step-by-step" (Phase 10) - Guide hasn't been tested yet, may have errors
**Suggestions**:
1. Add grep commands for finding all import references:
```bash
# Find all files importing removed tools
grep -r "from.*work_items import" src/
grep -r "from.*tasks import" src/
```
2. Specify profiling tool:
```bash
# Profile search_code function
python -m cProfile -o search.prof -m pytest tests/test_performance.py::test_search_latency
snakeviz search.prof # Visualize profile
```
### Terminology Consistency: 90%
**Consistent Terms**:
- ✅ "project_id" used throughout (not "projectId" or "project-id")
- ✅ "repository" vs "project" distinction clear
- ✅ "embedding" vs "vector" used consistently
**Inconsistent Terms**:
- ⚠️ "workflow-mcp" vs "workflow_mcp" (hyphen vs underscore)
- ⚠️ "code_chunks" vs "chunks" interchangeably
**Suggestion**: Add glossary to `README.md`:
```markdown
## Glossary
- **project**: A workspace containing one or more repositories (workflow-mcp concept)
- **project_id**: Unique identifier for a project (e.g., "my-app")
- **repository**: A single code repository indexed for search
- **chunk**: A segment of code (function, class, or ~500 lines) with embedding
- **embedding**: 768-dimensional vector representing semantic meaning of code
- **workflow-mcp**: Separate MCP server handling project/task management
```
### Scope Clarity: 95%
**Clear In-Scope**:
- ✅ Semantic code search
- ✅ Multi-project support
- ✅ Repository indexing
**Clear Out-of-Scope**:
- ✅ Work item tracking (moved to workflow-mcp)
- ✅ Task management (moved to workflow-mcp)
- ✅ Vendor tracking (moved to workflow-mcp)
- ✅ Code analysis/linting
**Ambiguous**:
- ⚠️ "Context-aware results" (Story 1) - Is this just lines before/after, or semantic context?
- ⚠️ "Metadata preservation" (README) - What metadata? File path, language, git commit hash?
**Suggestion**: Clarify in `user-stories.md` Story 1:
```markdown
### Context Lines
- context_before: 3 lines immediately before matched chunk
- context_after: 3 lines immediately after matched chunk
- NOT semantic context (that would be "similar code" feature, out of scope)
```
---
## Risk Analysis
### High-Risk Areas
#### R1: workflow-mcp Integration Brittleness (HIGH)
**Risk**: codebase-mcp depends on workflow-mcp for project context. If workflow-mcp changes API or becomes unavailable, codebase-mcp breaks.
**Impact**: Users can't search without explicitly providing project_id every time
**Mitigation Specified**: Fallback to explicit project_id parameter ✅
**Additional Mitigation Needed**:
1. Version workflow-mcp API in URL:
```python
WORKFLOW_MCP_URL = "http://localhost:3000/api/v1/mcp/workflow-mcp"
# If v1 fails, try v2, then fall back to explicit project_id
```
2. Add workflow-mcp health check at startup:
```python
async def check_workflow_mcp_health():
    """Warn if workflow-mcp unavailable at startup"""
    try:
        project_id = await get_active_project_id()
        if project_id:
            logger.info(f"workflow-mcp integration: OK (active project: {project_id})")
        else:
            logger.warning("workflow-mcp integration: UNAVAILABLE (explicit project_id required)")
    except Exception as e:
        logger.error(f"workflow-mcp integration: ERROR ({e})")
```
#### R2: Database Proliferation (MEDIUM)
**Risk**: One database per project means users with 50 projects have 50 databases. Operational overhead (backups, monitoring, disk space).
**Impact**: High disk usage, PostgreSQL connection limit hit, slow backups
**Mitigation Specified**: None ❌
**Mitigation Needed**:
1. Document in `README.md`:
````markdown
## Multi-Project Considerations
### Database Count
- One database per project (project isolation guarantee)
- Example: 20 projects = 20 databases (codebase_project-1, codebase_project-2, ...)
### PostgreSQL Configuration
Increase `max_connections` if you have many projects:
```sql
ALTER SYSTEM SET max_connections = 200; -- Default 100
```
### Disk Space
Each database requires:
- Small project (1k files): ~100 MB
- Medium project (10k files): ~1 GB
- Large project (100k files): ~10 GB
````
2. Add database archival strategy (Important I2)
#### R3: Performance Regression Risk (MEDIUM)
**Risk**: Refactoring introduces performance regressions (slower search, slower indexing). No baseline means can't detect regression.
**Impact**: Users downgrade back to v1.x
**Mitigation Specified**: Performance tests in Phase 11 ✅
**Missing**: Baseline comparison (Critical C4) ❌
**Mitigation Needed**: Add baseline capture in Phase 1 (see Critical C4)
#### R4: Tree-sitter Language Support Gaps (LOW)
**Risk**: tree-sitter doesn't support some languages (e.g., proprietary DSLs). Indexing fails for those files.
**Impact**: Incomplete index for mixed-language repositories
**Mitigation Specified**: None ❌
**Mitigation Needed**:
1. Add fallback chunking strategy in `refactoring-plan.md`:
```python
async def chunk_file(file_path: Path, language: str) -> list[Chunk]:
    """Chunk file with tree-sitter, fallback to line-based chunking"""
    try:
        # Try tree-sitter parser
        chunks = await chunk_with_tree_sitter(file_path, language)
    except LanguageNotSupported:
        logger.warning(f"tree-sitter parser unavailable for {language}, using line-based chunking")
        chunks = await chunk_by_lines(file_path, max_lines=500)
    return chunks
```
2. Document supported languages in `README.md`:
```markdown
## Supported Languages
- **Full Support** (tree-sitter AST-aware): Python, JavaScript, TypeScript, Rust, Go, Java, C, C++, C#, Ruby, PHP
- **Basic Support** (line-based chunking): Any text file
- **Not Supported**: Binary files (images, compiled binaries)
```
#### R5: Data Loss During Migration (LOW)
**Risk**: Migration script bugs cause data loss. Users lose indexed repositories or code chunks.
**Impact**: Users must re-index all repositories (time-consuming)
**Mitigation Specified**: Backup instructions in migration guide ✅
**Additional Mitigation**:
1. Add migration validation script in `migrations/002_validate.sql`:
```sql
-- Validate migration completed successfully
DO $$
DECLARE
    table_count INT;
BEGIN
    -- Check only 2 tables remain
    SELECT COUNT(*) INTO table_count
    FROM information_schema.tables
    WHERE table_schema = 'public';
    IF table_count != 2 THEN
        RAISE EXCEPTION 'Expected 2 tables (repositories, code_chunks), found %', table_count;
    END IF;
    -- Check project_id column exists
    IF NOT EXISTS (
        SELECT 1 FROM information_schema.columns
        WHERE table_name = 'repositories' AND column_name = 'project_id'
    ) THEN
        RAISE EXCEPTION 'Migration failed: project_id column missing from repositories';
    END IF;
    RAISE NOTICE 'Migration validation: PASSED';
END $$;
```
---
## Gap Analysis
### Requirements Gaps
1. **Database Lifecycle Management** (Important I2)
- Missing: How to delete project databases
- Missing: How to archive old projects
- Missing: How to reclaim disk space
2. **Search Result Ranking Customization** (Nice to Have)
- Missing: Can users adjust similarity threshold? (default: no minimum, returns top N)
- Missing: Can users boost results from specific directories?
- Missing: Can users exclude certain file patterns from search?
3. **Authentication/Authorization** (Acknowledged Out of Scope)
- Missing: Multi-user environments (who can search which projects?)
- Acceptable: Relies on filesystem permissions ✅
### Testing Gaps
1. **Load Testing** (Important)
- Missing: What happens with 1000 concurrent searches? (the performance test runs 100 queries sequentially; see the load-test sketch after the security tests below)
- Missing: Connection pool behavior under load
- Missing: PostgreSQL query queue behavior
2. **Chaos Testing** (Nice to Have)
- Missing: What if PostgreSQL crashes during indexing?
- Missing: What if Ollama crashes during embedding generation?
- Missing: What if disk fills up during indexing?
3. **Security Testing** (Critical)
- Missing: SQL injection testing (even though using parameterized queries)
- Missing: Path traversal testing (directory parameter)
- Missing: Resource exhaustion testing (giant file crashes indexing?)
**Suggestion**: Add security testing to Phase 11:
```python
@pytest.mark.security
@pytest.mark.asyncio
async def test_sql_injection_in_project_id():
    """Test that SQL injection attempts are rejected"""
    malicious_project_id = "'; DROP TABLE repositories; --"
    with pytest.raises(ValueError, match="Invalid project_id format"):
        await search_code(
            SearchCodeParams(
                query="test",
                project_id=malicious_project_id
            )
        )

@pytest.mark.security
@pytest.mark.asyncio
async def test_path_traversal_in_directory_filter():
    """Test that path traversal is blocked"""
    malicious_directory = "../../../etc/"
    result = await search_code(
        SearchCodeParams(
            query="test",
            project_id="test-project",
            directory=malicious_directory
        )
    )
    # Should return empty results (no files in that directory within project)
    assert len(result["results"]) == 0
```
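For the load-testing gap flagged above, a sketch using `asyncio.gather` to fire 1000 concurrent searches (the `load` marker and the graceful-degradation assertion are assumptions about desired behavior):
```python
import asyncio
import pytest

@pytest.mark.load  # hypothetical marker, registered in pytest.ini
@pytest.mark.asyncio
async def test_concurrent_search_load():
    """Fire 1000 concurrent searches; pool exhaustion must degrade gracefully."""
    params = SearchCodeParams(query="test", project_id="test-project")
    results = await asyncio.gather(
        *(search_code(params) for _ in range(1000)),
        return_exceptions=True,
    )
    # Exhausted pools should surface as TimeoutError, never crash the server
    errors = [r for r in results if isinstance(r, Exception)]
    assert all(isinstance(e, TimeoutError) for e in errors)
```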
### Performance Gaps
1. **Latency Breakdown** (Important)
- Missing: How much of <500ms is embedding generation vs database query?
- Missing: If search is slow, how to diagnose bottleneck?
**Suggestion**: Add latency breakdown logging:
```python
import time

async def search_code(params: SearchCodeParams) -> dict:
    start = time.time()
    # Generate embedding
    embed_start = time.time()
    query_embedding = await generate_embedding(params.query)
    embed_duration = time.time() - embed_start
    # Database query
    db_start = time.time()
    results = await search_code_db(...)
    db_duration = time.time() - db_start
    total_duration = time.time() - start
    logger.info(
        "search_latency",
        total_ms=total_duration * 1000,
        embedding_ms=embed_duration * 1000,
        database_ms=db_duration * 1000,
        project_id=params.project_id,
    )
    return {
        "results": results,
        "latency_ms": total_duration * 1000,
        "latency_breakdown": {
            "embedding_ms": embed_duration * 1000,
            "database_ms": db_duration * 1000,
        }
    }
```
2. **Indexing Throughput by Language** (Nice to Have)
- Missing: Does Python index faster than Rust? (tree-sitter performance varies)
- Missing: Baseline throughput by file type
### Documentation Gaps
1. **Deployment Guide** (Important)
- Missing: How to deploy in production? (Docker? systemd service?)
- Missing: How to configure PostgreSQL for production? (connection limits, memory)
- Missing: How to configure Ollama for production? (GPU vs CPU, memory limits)
**Suggestion**: Create `docs/deployment-guide.md`:
````markdown
# Production Deployment Guide
## System Requirements
- CPU: 4 cores minimum, 8 cores recommended
- RAM: 8 GB minimum, 16 GB recommended
- Disk: SSD preferred, 100 GB per 10 projects
- GPU: Optional, speeds up Ollama embeddings by 5-10x
## PostgreSQL Configuration
```sql
ALTER SYSTEM SET shared_buffers = '4GB';
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET maintenance_work_mem = '1GB';
ALTER SYSTEM SET max_connections = 200;
ALTER SYSTEM SET work_mem = '20MB';
```
## Ollama Configuration
```bash
# Set memory limit (prevents OOM)
export OLLAMA_MAX_LOADED_MODELS=1
export OLLAMA_NUM_PARALLEL=1
# Start Ollama service
ollama serve
```
## codebase-mcp as systemd service
```ini
[Unit]
Description=Codebase MCP Server
After=network.target postgresql.service
[Service]
Type=simple
User=codebase
WorkingDirectory=/opt/codebase-mcp
ExecStart=/opt/codebase-mcp/venv/bin/python -m codebase_mcp.server
Restart=on-failure
Environment="POSTGRES_HOST=localhost"
Environment="POSTGRES_PORT=5432"
Environment="OLLAMA_URL=http://localhost:11434"
[Install]
WantedBy=multi-user.target
```
````
2. **API Reference Documentation** (Important)
- Missing: Full API documentation (only examples in user-stories.md)
- Missing: OpenAPI/Swagger spec for HTTP API (if applicable)
3. **Development Setup Guide** (Important)
- Missing: Step-by-step setup for contributors
- Missing: How to run tests locally
- Missing: How to build documentation
**Suggestion**: Create `CONTRIBUTING.md`:
````markdown
# Contributing to codebase-mcp
## Development Setup
1. **Clone Repository**
```bash
git clone https://github.com/yourusername/codebase-mcp.git
cd codebase-mcp
```
2. **Install Dependencies**
```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
```
3. **Setup PostgreSQL**
```bash
# Start PostgreSQL (Docker)
docker compose up -d postgres
# Create test database
createdb codebase_mcp_test
psql codebase_mcp_test < src/codebase_mcp/database/schema.sql
```
4. **Setup Ollama**
```bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh
# Pull embedding model
ollama pull nomic-embed-text
```
5. **Run Tests**
```bash
pytest tests/ -v
```
## Code Standards
- Black for formatting: `black src/ tests/`
- Ruff for linting: `ruff check src/ tests/`
- mypy for type checking: `mypy --strict src/`
- Coverage >80%: `pytest --cov=src/codebase_mcp --cov-report=term`
## Pull Request Process
1. Create feature branch: `git checkout -b feature-name`
2. Write tests (TDD approach)
3. Implement feature
4. Run full test suite: `pytest tests/`
5. Update documentation
6. Submit PR with descriptive title and description
````
---
## Positive Aspects
### Excellent Constitutional Framework ⭐⭐⭐⭐⭐
The 11 constitutional principles are **exceptionally well-defined**:
- Clear statement, rationale, constraints for each principle
- Non-negotiable principles explicitly marked
- Enforcement mechanisms specified (CI/CD, code review, /analyze)
- Acceptable deviations documented (shows thoughtfulness)
**Example of Excellence**: Principle IV (Performance Guarantees) specifies:
- Exact targets (<500ms, <60s)
- Batching strategies (100 files/batch, 50 embeddings/batch)
- Index types (HNSW, not IVFFlat)
- Connection pooling (min 5, max 20)
This level of specificity is rare and will prevent ambiguity during implementation.
### Comprehensive User Stories ⭐⭐⭐⭐⭐
User stories demonstrate **strong requirements engineering**:
- Follows "As a/I need/So that" format consistently
- Testable acceptance criteria (not vague "should work" statements)
- Concrete test scenarios with expected outcomes
- Edge cases documented (empty results, errors, offline mode)
**Example of Excellence**: Story 5 (Performance) includes:
- Specific scenario: "50,000 files (250,000 chunks)"
- Measurable outcome: "p95 latency <500ms"
- Failure mode: "Mock Ollama to return embedding in 250ms"
- Degradation handling: "Warning logged, total time <550ms, degraded but acceptable"
This is production-quality requirements specification.
### Detailed Implementation Roadmap ⭐⭐⭐⭐
The 12-phase implementation plan is **thorough and practical**:
- Clear objectives per phase
- Specific files to modify (with before/after code examples)
- Acceptance criteria for each phase
- Time estimates (17-25 hours total, realistic)
- Git strategy with commit messages specified
**Example of Excellence**: Phase 2 (Database Schema) includes:
- SQL migration script (with rollback warning)
- Before/after schema comparison
- Test instructions (`psql codebase_mcp_test < migrations/002_*.sql`)
- Acceptance criteria ("Migration tested on test database")
This level of detail enables a developer to execute the plan with minimal guidance.
### Strong Testing Strategy ⭐⭐⭐⭐
The plan emphasizes **test-driven development** throughout:
- TDD approach (tests before implementation) specified in constitution
- Multiple testing layers: unit, integration, protocol, performance
- Specific test scenarios in user stories
- mcp-inspector validation for protocol compliance
- Performance benchmarks with p50/p95 targets
**Example of Excellence**: Phase 11 includes:
- Performance test suite with latency histograms
- 100 searches to calculate p95
- Assertion: `assert p95 < 500`
- Optimization procedure if targets not met
This rigor will prevent quality regressions.
### Clear Migration Strategy ⭐⭐⭐⭐
The migration guide shows **empathy for users**:
- Explicit breaking changes documented
- Tool mapping (old → new)
- Database migration script provided
- Configuration examples (before/after)
- FAQ section addresses common questions
**Example of Excellence**: Migration guide includes:
- Backup instructions **before** destructive operations
- Step-by-step tool call updates (with code examples)
- Checklist for migration validation
This will reduce support burden and user frustration.
### Technology Stack Justification ⭐⭐⭐⭐
`tech-stack.md` demonstrates **informed technical decisions**:
- Each technology has "Purpose" and "Justification" sections
- Version constraints specified (e.g., `>=3.11,<4.0`)
- Key features explained (why AsyncPG over psycopg2?)
- Non-negotiable stack marked clearly
**Example of Excellence**: PostgreSQL section specifies:
- Version constraint: `>=14.0` (with rationale: pgvector performance)
- Extensions required: `vector`, `pg_trgm`
- Index type: HNSW with specific parameters (`m = 16, ef_construction = 64`)
This prevents bikeshedding and ensures consistent decisions.
---
## Overall Recommendation
### Recommendation: APPROVED WITH CHANGES ✅
**Rationale**:
The codebase-mcp refactoring plan is **well-architected and thoroughly documented**. The constitutional framework, user stories, and implementation roadmap demonstrate strong software engineering practices. However, **five critical issues** must be addressed before implementation to prevent runtime failures and operational confusion.
### Required Changes Before Implementation
**Must Fix (Critical)**:
1. ✅ **C1**: Specify database naming and discovery strategy (SQL injection risk, unclear operations)
2. ✅ **C2**: Detail workflow-mcp integration error handling (poor UX, silent failures)
3. ✅ **C3**: Define rollback strategy (high risk of getting stuck in broken state)
4. ✅ **C4**: Capture performance baseline in Phase 1 (can't prove no regression)
5. ✅ **C5**: Document concurrent access patterns (data races, connection exhaustion)
**Should Fix (Important)**:
- I1: Add data migration SQL scripts (user friction)
- I2: Plan database lifecycle management (disk space issues)
- I3: Create comprehensive error catalog (debugging difficulty)
**Nice to Have (Suggestions)**:
- S1: Add Architectural Decision Records (future maintainability)
- S2: Specify observability/metrics (production debugging)
- S3: Provide configuration examples (easier onboarding)
### Confidence Level: High (85%)
**Confidence in Plan Success**:
- Architecture: 95% (excellent separation of concerns, local-first design)
- Requirements: 85% (comprehensive user stories, some edge cases missing)
- Implementation: 80% (detailed phases, but lacks rollback strategy)
- Testing: 90% (strong TDD approach, performance benchmarks)
- Migration: 85% (clear guide, but data migration needs more detail)
**Risk Areas**:
- workflow-mcp integration brittleness (mitigated by fallback to explicit project_id)
- Database proliferation (20+ databases for power users, needs monitoring)
- Performance regression (mitigated by performance tests, but needs baseline)
### Next Steps
1. **Address Critical Issues** (C1-C5) - 4-6 hours of planning work
2. **Review Updated Plan** - Stakeholder approval of changes
3. **Begin Phase 0** (Prerequisites) - Validate workflow-mcp integration
4. **Execute Phases 1-12** - Follow implementation roadmap
5. **Monitor Phase 1-2** - High-risk phases (database schema changes)
---
## Conclusion
This is a **high-quality refactoring plan** that follows industry best practices:
- Constitutional principles establish guardrails
- User stories provide clear requirements
- Implementation phases are actionable
- Testing strategy ensures quality
- Migration guide reduces user friction
With the **five critical issues addressed**, this plan has a **high probability of success** (85%+). The plan demonstrates maturity in software architecture, requirements engineering, and operational thinking.
**Approval Status**: ✅ **APPROVED WITH CHANGES**
**Conditional on**: Addressing critical issues C1-C3 before implementation begins, with C4-C5 addressed during Phases 0-1
---
**Reviewer Signature**: Senior Planning Reviewer
**Date**: 2025-10-11