MCP Codebase Index

CHANGELOG.md•11.6 KiB

# Changelog All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). ## [1.6.0] - 2025-11-18 ### 🎉 Stable Release: Modular Architecture & Vector Visualization This is a **major stable release** graduating from beta with comprehensive refactoring and new visualization capabilities. ### Added - **🎨 Vector Visualization System** - Complete 2D/3D visualization of codebase embeddings - `visualize_collection`: Explore entire codebase in embedding space with UMAP dimensionality reduction - `visualize_query`: See how search queries relate to code in vector space - `export_visualization_html`: Generate interactive standalone HTML visualizations - K-means clustering support for identifying code patterns - Three output formats: `summary` (text), `plotly` (interactive JSON), `json` (structured data) - Smart sampling algorithms for large codebases (up to 5000 vectors) - Performance optimized: <5s for 1000 vectors, <15s for 5000 vectors - **📚 Comprehensive Documentation** - New `docs/guides/VECTOR_VISUALIZATION.md` (700+ lines) - Complete tool reference with all parameters - 6 detailed use case scenarios with examples - Technical deep-dive (UMAP, K-means, sampling strategies) - Performance benchmarks and optimization tips - Troubleshooting guide and best practices - Updated main README with visualization section - Enhanced navigation and quick reference guides ### Changed - **🏗️ Major Architecture Refactoring** (40% code reduction, zero breaking changes) - Reduced `server.ts` from 2060 lines to 1237 lines (823 lines removed) - Extracted 11 handler methods into 4 modular files: - `handlers/search.handler.ts` (74 lines): Search operations - `handlers/enhancement.handler.ts` (131 lines): Prompt enhancement & telemetry - `handlers/visualization.handler.ts` (296 lines): All visualization features - `handlers/indexing.handler.ts` (544 lines): Status, check, repair operations - Implemented context injection pattern for clean dependency management - Created `types/handlers.types.ts` for shared interfaces - Added `templates/visualization.template.ts` for HTML export ### Technical Details - **Modular Handler Architecture**: - Context injection pattern for all handlers - Clean separation: server orchestrates, handlers execute - Fully typed with TypeScript interfaces - Easier testing and maintenance - **Visualization Layer**: - `src/visualization/visualizer.ts`: UMAP dimensionality reduction - `src/visualization/reducer.ts`: Data sampling algorithms - `src/visualization/exporter.ts`: HTML template generation - `src/visualization/vectorRetriever.ts`: Efficient vector fetching - `src/visualization/types.ts`: Type definitions - **Benefits**: - Better code organization and readability - Clear dependencies via context interfaces - Simpler to extend with new features - Improved maintainability ### Migration - **Zero breaking changes** - Fully backward compatible - All existing tools and APIs work exactly the same - New visualization tools available immediately - No configuration changes required ### Package Information - **Size**: 109.4 KB (118 files) - **Dependencies**: Stable and tested - **Build**: TypeScript 5.x with strict mode - **Node**: Compatible with Node.js 16+ --- ## [1.5.4-beta.19] - 2025-11-18 ### Changed - **Major Refactoring: Extracted handlers into modular files for better maintainability** - Reduced `server.ts` from 2060 lines to 1237 lines (40% reduction, 823 lines removed) - Extracted 11 handler methods into 4 modular files with context injection pattern - Created `src/mcp/handlers/` directory structure: - `search.handler.ts` (74 lines): `handleSearch` - `enhancement.handler.ts` (131 lines): `handleEnhancePrompt`, `handleEnhancementTelemetry` - `visualization.handler.ts` (296 lines): `handleVisualizeCollection`, `handleVisualizeQuery`, `handleExportVisualizationHtml` - `indexing.handler.ts` (544 lines): `handleIndexingStatus`, `handleCheckIndex`, `handleRepairIndex` - Implemented context injection pattern for all handlers - Maintains identical public API (no breaking changes) - All handlers properly typed with context interfaces ### Technical Details - **Pattern Used**: Context injection with interfaces - Each handler receives a context object with all required dependencies - Server methods create context and delegate to handler functions - Clean separation of concerns: server orchestrates, handlers execute - **Benefits**: - Easier to test individual handlers - Clear dependencies via context interfaces - Simpler to maintain and extend - Better code organization and readability - **Migration**: Fully backward compatible - no changes required for users ## [1.4.10] - 2025-11-08 ### Fixed - **CRITICAL: Checkpoint resume system not working correctly** - **Root cause**: FileWatcher.scanForChanges() updated hashes for ALL scanned files during scan, not just indexed files - **Issue**: After indexing 30 files and restarting, scan found 0 changed files because metadata contained hashes for all 470 files - **Solution**: Only update file hashes AFTER successful indexing via new `updateFileHash()` method - Metadata now contains hashes only for actually indexed files - Resume now correctly detects remaining files that need indexing ### Changed - FileWatcher.scanForChanges() no longer updates hashes during scan - Added FileWatcher.updateFileHash() method called after successful indexing - Server now calls updateFileHash() for each indexed file to maintain accurate metadata - Checkpoint resume now works as designed: Index 30 files → Stop → Restart → Index remaining 440 files ### Technical Details - Before: scanForChanges() did `this.fileHashes.set(filePath, hash)` for all files - After: scanForChanges() only detects changes, updateFileHash() stores hash after indexing - This ensures metadata (index-metadata.json) only contains hashes for indexed files - Fixes scenario: "qdrant empty → index 30 → stop → restart → shows 0 changed files" ## [1.4.9] - 2025-11-08 ### Fixed - **Checkpoint system appearing to work but not resuming correctly** - Added double-check with `getVectorCount()` to query actual Qdrant data - Fixed Case 1b in checkAndFixSync() to distinguish between: - Collection truly deleted (actualCount = 0) → Clear and re-index - Valid checkpoint (actualCount > 0) → Resume from checkpoint - Previous logic relied on collection metadata which could be stale ### Added - QdrantVectorStore.getVectorCount() method to query actual point count - Better logging for checkpoint resume vs collection deleted scenarios ## [1.4.2] - 2025-11-08 ### Added - **Debug logging for file scanning** - Shows total source files scanned - Shows number of changed files detected - Shows number of ignored directories - Helps diagnose why some files aren't being indexed ### Fixed - Improved directory ignore logic with better pattern matching - Now checks directory basename directly - Better handling of nested ignore patterns - More accurate path separator matching ## [1.4.1] - 2025-11-08 ### Fixed - **Critical: Sync check between Qdrant and memory state** - Server now detects when Qdrant collection is deleted but memory state remains - Automatically resets state and forces full re-index when mismatch detected - Clears FileWatcher hashes to ensure all files are re-scanned - Prevents "stuck" state where server thinks files are indexed but vectors are missing - Logs clear warnings when sync issues are detected ### Added - `checkAndFixSync()` method to verify Qdrant-memory consistency on startup - `clearFileHashes()` method in FileWatcher for forced re-scanning - Detailed sync check logging with vector count and file count comparison ### Changed - Startup sequence now includes sync verification step - Better error recovery for out-of-sync scenarios ## [1.4.0] - 2025-11-08 ### Added - **Enhanced Status Reporting** (Issue #4) - Real-time progress tracking with percentage and current file - ETA calculation based on average processing speed - Performance metrics (files/sec, avg time per file, total duration) - Error tracking with timestamps (last 10 errors) - Verbose flag for detailed error logs - Human-readable time and size formatting - RPM (Requests Per Minute) visibility in status - **Incremental Indexing** (Issue #3) - File categorization: new, modified, unchanged, deleted - MD5 hash tracking for change detection - Priority queue: new/modified files indexed first - Pending queue for files exceeding daily quota - Persistent state in `memory/incremental_state.json` - Automatic queue processing on next run - 90%+ quota savings for daily updates - **Progress Tracking Types** - `IndexingProgress` interface with totalFiles, processedFiles, currentFile, percentage, startTime, estimatedTimeRemaining - `PerformanceMetrics` interface with filesPerSecond, averageTimePerFile, totalDuration, chunksProcessed - `IndexingError` interface with filePath, error, timestamp ### Changed - **Optimized Rate Limiting for text-embedding-004** - Increased batch size from 20 to 25 chunks - Parallel processing: 25 chunks/second = 1,500 RPM (100% API utilization) - Removed unnecessary delays within batches - Better handling of 1,500 RPM limit (was causing errors at ~679 chunks) - **Daily Quota Management** - Increased daily quota limit from 950 to 10,000 chunks - Added RPM_LIMIT constant (1,500) for text-embedding-004 - More accurate quota tracking per file - Better quota warning messages - **Performance Improvements** - 25x faster than sequential processing - 1,500 chunks/minute with parallel batching - Real-time metrics calculation during indexing - Efficient batch processing with Promise.all() ### Fixed - Rate limit errors when indexing >679 chunks - Quota calculation accuracy - Progress tracking during long indexing sessions - Error propagation in batch processing ### Documentation - Updated README.md with new features and performance metrics - Added detailed implementation docs in `github-issues/issue-4-status-reporting.md` - Enhanced performance comparison tables - Added real-world indexing time examples ## [1.3.7] - 2025-11-07 ### Changed - Previous stable release --- ## Release Notes v1.4.0 ### 🎯 Key Features **Enhanced Status Reporting** - Know exactly what's happening during indexing - See progress, ETA, and performance in real-time - Track errors with timestamps for easy debugging **Incremental Indexing** - Only re-indexes changed files after initial index - Saves 90%+ API quota on subsequent runs - Automatic queue management for large codebases **Optimized Performance** - 25 chunks/second parallel processing - 100% utilization of 1,500 RPM limit - Fixed quota errors that occurred at ~679 chunks ### 📈 Performance Impact **Before v1.4.0:** - 20 chunks/second - Could hit quota errors at 679 chunks - No progress visibility - Re-indexed everything every time **After v1.4.0:** - 25 chunks/second (25% faster) - No quota errors (tested up to 10,000+ chunks) - Real-time progress with ETA - Only indexes changed files (90%+ savings) ### 🔧 Breaking Changes None - fully backward compatible! ### 📦 Upgrade Instructions ```bash # If using npx (no action needed - auto-updates) # Just restart VS Code # If installed globally npm update -g @ngotaico/mcp-codebase-index # If installed locally npm update @ngotaico/mcp-codebase-index ``` ### 🙏 Acknowledgments Thanks to all users who reported the 679-chunk quota issue!

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/NgoTaiCo/mcp-codebase-index'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

CHANGELOG.md•11.6 KiB