# Changelog

All notable changes to ChunkHound will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

### Breaking Changes

- **HTTP MCP server removed** - ChunkHound now supports stdio transport only for MCP connections
  - `chunkhound mcp http` command removed
  - `--http`, `--port`, `--host` CLI flags removed
  - FastMCP dependency removed
  - Migration: Use `chunkhound mcp` (stdio) instead. All major MCP clients (Claude Code, Claude Desktop, VS Code) support stdio transport.
  - Rationale: Simplified codebase, reduced dependencies, focused on primary use case (stdio is the standard for MCP)

### Added

- OpenAI Responses API support for reasoning models (gpt-5.1, gpt-5.1-codex, o-series, gpt-5-pro) - enables deep code research with enhanced reasoning capabilities
- Automatic API routing between Chat Completions and Responses API based on model compatibility - supports 30+ models including all GPT-5, GPT-4.1, GPT-4o, and o-series models
- Reasoning effort control for deep research LLM operations - configurable levels (none, minimal, low, medium, high) via `CHUNKHOUND_LLM_CODEX_REASONING_EFFORT` with per-role overrides for utility and synthesis operations
- Structured JSON output support for Responses API - maintains schema validation consistency across both Chat Completions and Responses endpoints

### Performance

- LanceDB table creation now detects embedding dimensions upfront from configured embedding provider, eliminating O(n) table recreation penalty during first embedding insertion - significantly improves indexing performance for large codebases (e.g., 16,000+ chunks no longer require full table migration)

### Fixed

- Global chunk deduplication now applies to all parsers (YAML, Universal) - prevents duplicate chunk IDs that caused indexing failures with repeated config values

## [4.1.0b1] - 2025-11-15

### Added

- PHP configuration files with top-level return arrays are now searchable - config patterns like `return ['key' => 'value'];` are automatically indexed
- Universal config-literal parsing across Python, JavaScript, TypeScript, and JSX/TSX - all exported configuration objects and arrays are now discoverable through semantic search

### Enhanced

- Windows compatibility improved with cross-platform temporary directory handling for Claude Code CLI provider
- JavaScript-family parsers (JavaScript, TypeScript, JSX/TSX) internally streamlined to reduce maintenance overhead while preserving all functionality
- Version management now supports PEP 440 pre-release formats (beta, release candidate) for clearer update channels
- Version tagging includes safety checks to prevent accidental releases from uncommitted work

### Fixed

- Code quality improvements addressing linting warnings for cleaner, more maintainable codebase

## [4.0.1] - 2025-11-12

### Fixed

- Package build configuration now excludes test fixtures from distribution, reducing package size and removing unnecessary test data from published releases

## [4.0.0] - 2025-11-12

### Added

- Map-reduce synthesis for dramatically improved research accuracy - clusters related files and synthesizes them separately before combining insights
- Compact numbered citation system `[1][2][3]` replacing verbose `file.py:123` references for better readability
- Automatic query expansion with intelligent deduplication to find more relevant results
- Structured JSON output support for LLM providers enabling programmatic research workflows
- Tree progress display with event system for visual research feedback
- `chunkhound research <query>` command for direct code research without starting MCP server
- `chunkhound index --simulate [--json]` - Dry-run mode showing which files would be indexed without making changes
- `chunkhound diagnose [--json]` - Troubleshooting command comparing ChunkHound's decisions vs git's ignore rules
- `chunkhound calibrate` - Automatic batch size performance tuning for Qwen3 reranker
- `--show-sizes` flag for file size reporting during indexing
- Swift language support with tree-sitter parsing for classes, protocols, functions, and properties (`.swift`, `.swiftinterface`)
- Objective-C support with content detection to disambiguate from MATLAB (`.m` files)
- Zig language support with comprehensive tree-sitter parsing
- Haskell language support for functions, types, classes, and modules (`.hs`, `.lhs`, `.hs-boot`, `.hsig`, `.hsc`)
- HCL (HashiCorp Configuration Language) support for Terraform with nested object parsing (`.hcl`, `.tf`, `.tfvars`)
- Vue.js Single File Component (SFC) support with specialized parsing for template, script, and style sections
- Svelte Single File Component support with specialized parsing for template, script, and style sections (`.svelte`)
- Vue cross-reference tracking between template elements and script definitions for enhanced semantic understanding
- PHP language support with comprehensive parsing for classes, interfaces, traits, functions, methods, namespaces, and PHPDoc comments
- RapidYAML parser using native bindings (10-100x faster than tree-sitter for large YAML files)
- Helm template sanitizer for Go template syntax in Kubernetes manifests
- Automatic fallback to tree-sitter parser when RapidYAML encounters issues
- Benchmark harness comparing PyYAML, universal, and RapidYAML performance (`scripts/bench_yaml.py`)
- Repo-aware ignore engine respecting repository boundaries and preventing rule leakage between sibling repos
- Workspace overlay mode collecting .gitignore rules from root and nested files with correct anchoring
- Combined exclusion modes: `indexing.exclude_mode` supports `"combined"`, `"config_only"`, or `"gitignore_only"`
- Wildcard directory segment matching for patterns like `**/.venv*/` and `**/*.phar/`
- Git pathspec capping with fallback to prevent pathspec explosion (default: 128, env: `CHUNKHOUND_INDEXING__GIT_PATHSPEC_CAP`)
- Real-time telemetry for git pathspec usage and exclusion sources
- TEI (Text Embeddings Inference) reranking format support alongside Cohere format
- Automatic reranker format detection from response field names (Cohere vs TEI)
- Thread-safe format caching for consistent reranker behavior across requests
- Authorization header support for TEI endpoints with `--api-key` flag
- Qwen3 reranker with automatic batch size calibration for optimal performance
- Async regex search methods for concurrent search operations
- Claude Code CLI provider with direct integration (`claude-code-cli`)
- Codex CLI provider for synthesis workflows
- AWS Anthropic Bedrock provider using official Anthropic SDK
- Provider-specific synthesis concurrency limits: OpenAI (3), Bedrock (5), Claude CLI (1)
- Smart change detection using checksums for verification when mtime/size differ
- Content hash support in both DuckDB and LanceDB providers
- DuckDB schema migration with `files.content_hash` column (idempotent via `ALTER TABLE IF NOT EXISTS`)
- LanceDB execute_query adapter for lightweight batch SELECT operations
- In-memory database mode for simulate on fresh workspaces (no .chunkhound/ directory created)
- Checkpointing and recovery for more robust indexing coordinator
- Per-file timeout controls: `indexing.per_file_timeout_seconds`, `indexing.per_file_timeout_min_size_kb`
- Configurable host parameter for HTTP MCP server (`--host` for binding to specific interfaces)
- Size-based filtering threshold for structured config files (JSON/YAML/TOML)
- Environment variable override for DB executor timeout: `CHUNKHOUND_DB_EXECUTE_TIMEOUT`
- Comprehensive test suites for Swift, Objective-C, Zig, Java, C#, Python, PHP, Vue, HCL
- Test fixtures for refactored research modules with fake providers and better mocks

### Enhanced

- Native git bindings for gitignore exclusions replacing Python-based pattern matching (10-100x faster indexing)
- Parallel directory discovery with auto-scaling for enterprise monorepos
- Concurrent file parsing using ProcessPoolExecutor across CPU cores
- Lazy parser instantiation reducing startup time
- Single-file fast path using in-process handling (no ProcessPool overhead)
- Single-read checksum verification eliminating redundant file I/O
- Provider-aware embedding concurrency: OpenAI (8 concurrent batches), VoyageAI (40 concurrent batches)
- Automatic retry logic for VoyageAI embedding provider
- Real-time embedding pass: dedicated "embed" phase after quick parse/store for new chunks
- Removed redundant reranking passes from deep research pipeline
- xxHash3-64 replacing SHA-256 for faster file change detection
- Git pathspec capping preventing pathspec explosion (configurable via env)
- In-memory DuckDB for simulate mode on fresh workspaces
- Automatic parser worker auto-scaling to CPU count when timeouts enabled (capped at 32)
- Split progress reporting: "Parsing files" vs "Handling files" with live cumulative info
- Better error messages and truncation detection for LLM responses
- Non-TTY progress fallback properly working in CI environments
- Improved diagnostics for parse/store errors with clearer failure messages
- Post-run prompt to add timed-out files to `indexing.exclude` when interactive
- Skipped file counts broken out into "Unchanged" and "Filtered" buckets
- Raw markdown output from code_research tool for better formatting in Claude
- Lazy imports for MCP-safe stdio operation
- Proper JSON-RPC handshake reliability
- Test-mode patches for Codex CLI integration (env-gated, no production impact)
- Increased startup wait time for Mac CI stability (3s → 5s)
- TEI reranking format comprehensive guide in CLAUDE.md
- Test coverage documentation with refactoring progress
- README improvements with startup profile CAP notes and exclusions section updates
- Benchmark instructions for YAML parser performance testing
- MCP setup improvements with multi-client support and `--show-setup` flag

### Changed

- **BREAKING**: Removed `depth` parameter from `code_research` MCP tool - system now auto-scales synthesis budgets based on repository size
- **BREAKING**: Checksum algorithm switched from SHA-256 to xxHash3-64 for faster file change detection - all files will be reindexed on first run after upgrade
- **BREAKING**: Default exclusion behavior changed - providing `indexing.exclude` list no longer disables .gitignore (use `exclude_mode: "config_only"` for legacy behavior)
- **BREAKING**: RapidYAML is now the default YAML parser (set `CHUNKHOUND_YAML_ENGINE=tree` to revert to tree-sitter)
- **BREAKING**: LanceDB provider now requires `content_hash` column in files schema
- Default per-file timeout enabled: `indexing.per_file_timeout_seconds=3.0` (previously `0`, disabled)
- Parser workers auto-scale to CPU count when timeouts enabled (capped at 32)
- Combined exclusion mode is now default: overlays gitignore + config excludes instead of replacing
- Model defaults updated to Haiku 4.5 for claude-code-cli and bedrock providers
- Deep research service refactored into specialized modules: question_generator, synthesis_engine, budget_calculator, citation_manager, quality_validator
- Search service refactored into strategies: context_retriever, single_hop_strategy, multi_hop_strategy, result_enhancer
- Extracted research pipeline modules: unified_search, query_expander, file_reader, context_manager

### Fixed

- Fixed double `**/` prefix preventing root file matches in default excludes
- Fixed real-time indexing for newly added languages
- Fixed file diversity collapse in deep research using proper reranking
- Fixed TOML parser to extract only matched node content instead of entire file
- Fixed tree-sitter language names for C# and Makefile parsers
- Fixed .gitignore pattern handling and error logging
- Fixed symbol validation inconsistency in Chunk.from_dict()
- Fixed Config.__init__ to respect target_dir kwarg in tests
- Fixed DuckDB `get_file_by_path(as_model=True)` to return correct mtime and size_bytes for accurate skip checks
- Fixed registry provider instance handling (was storing lambda instead of provider)
- Fixed orphaned embeddings cleanup with proper per-call db_path configuration
- Fixed LanceDB optimize() API usage for 0.21.0+ (cleanup_older_than parameter)
- Fixed single-file indexing to use in-process path and call on_batch for immediate storage
- Fixed missing sources in synthesis by using correct chunk.content field (was chunk.code)
- Fixed flaky multi-hop semantic chain test
- Fixed reranker single-batch top_k filtering for consistency across backends
- Fixed concurrent rerank calls using aiohttp (replaced custom socket-based HTTP)
- Fixed MCP stdio flow for code_research end-to-end reliability
- Fixed non-TTY progress manager regression (added minimal Progress shim for CI)
- Fixed exception classes to allow __traceback__ assignment (removed frozen dataclass)
- Fixed Windows path separator issues in gitignore pattern generation and matching
- Fixed ProcessPoolExecutor segfault on Linux by forcing spawn multiprocessing
- Fixed flaky QA test with file processing completion polling
- Fixed real-time indexing flakiness with proper timeout handling and task cleanup

### Removed

- Removed AWS Bedrock provider (consolidated to Anthropic SDK-based Bedrock provider)
- Removed research tools setup section from CONTRIBUTING.md (obsolete)
- Removed obsolete tests incompatible with refactored modular architecture

### Security

- Removed embedded API key from `.chunkhound.json` - use environment variables instead (e.g., `CHUNKHOUND_EMBEDDING__API_KEY`)

## [3.3.1] - 2025-09-25

### Enhanced

- Dependency updates to latest stable versions for improved stability and performance
- Test infrastructure reliability with better provider detection and error handling

### Fixed

- Tree-sitter 0.25.x API compatibility ensuring parsing works with latest language parsers
- Code formatting and import organization for cleaner, more maintainable codebase

## [3.3.0] - 2025-09-21

### Added

- Official Windows support with full CI testing across Windows, macOS, and Ubuntu
- Command-line search functionality (`chunkhound search`) for semantic and regex queries without starting MCP
- CONTRIBUTING.md guidelines
- Setup wizard when `.chunkhound.json` isn't found in the directory

### Fixed

- File exclude patterns (`**/tmp/**`) on Linux systems
- Regex search path resolution across platforms

## [3.2.0] - 2025-08-24

### Enhanced

- Semantic search upgraded from two-hop to dynamic multi-hop expansion with intelligent stopping criteria, delivering more comprehensive and contextually relevant results while avoiding search explosion

## [3.1.0] - 2025-08-21

### Added

- PDF document parsing and indexing with full text extraction using PyMuPDF integration

### Enhanced

- Language support expanded to 29 languages with comprehensive documentation breakdown

### Fixed

- JSON file parsing now extracts specific node content instead of entire file content, improving search precision and reducing noise

## [3.0.1] - 2025-08-21

### Enhanced

- Documentation site improved with cross-linking between pages and hero image for better navigation
- OpenAI-compatible endpoint flexibility increased by making API keys optional for local deployments
- Test infrastructure reliability improved with comprehensive CI fixes and timeout handling

### Fixed

- JSON file parsing now handles empty chunks correctly, eliminating indexing failures on common JSON patterns
- Test suite stability enhanced with proper background task cleanup and configuration isolation
- GitHub Actions workflow simplified and made more reliable by removing redundant processes

## [3.0.0] - 2025-08-20

### Added

- VoyageAI embedding provider with advanced two-hop semantic search and reranking capabilities
- GitHub Pages documentation site with interactive examples and improved navigation
- Intelligent file exclusion system with .gitignore support and JSON size filtering
- Advanced makefile parsing with dependency analysis for better code comprehension
- Comprehensive test suite for database consistency and integration testing
- Real-time filesystem indexing with MCP integration for live code monitoring

### Enhanced

- Parsing system completely rebuilt with cAST (Code AST) algorithm for universal language support
- Configuration system dramatically simplified with fewer user-facing options for easier setup
- OpenAI provider unified to handle both standard and custom OpenAI-compatible endpoints
- MCP server reliability improved with proper initialization sequencing and watchdog coordination
- Test infrastructure enhanced with Ollama compatibility and extended timeouts
- Directory indexing consolidated between CLI and MCP with shared service architecture

### Fixed

- MCP server initialization blocking resolved - no more startup deadlocks during directory scanning
- Custom OpenAI endpoint configuration now properly recognized and applied
- Real-time indexing now generates missing embeddings for unchanged code chunks
- SSL verification disabled for custom OpenAI-compatible endpoints to support local deployments
- Watchdog filesystem monitoring no longer blocks MCP server startup process
- MCP server properly respects target directory path arguments across all operations

### Removed

- TEI (Text Embeddings Inference) provider support - simplified provider ecosystem
- BGE provider support - consolidated to core providers for better maintenance
- Legacy parsing system replaced with modern cAST algorithm
- Obsolete configuration documentation and setup files cleaned up

## [2.8.1] - 2025-07-20

### Enhanced

- Architecture documentation significantly improved for better LLM comprehension and AI-assisted development workflows

### Fixed

- Type annotation syntax errors that could cause import failures in Python 3.10+ environments
- Enhanced smoke tests now detect forward reference type annotation issues early

## [2.8.0] - 2025-07-20

### Added

- MCP HTTP transport support alongside stdio transport for flexible deployment options

### Enhanced

- Configuration system unified across CLI and MCP components for consistent behavior
- File change processing reliability improved in MCP servers with better debouncing and coordination
- Database portability enhanced with relative path storage

### Fixed

- MCP server initialization deadlocks and startup crashes resolved with proper async coordination
- File deletion handling improved using IndexingCoordinator for better reliability
- MCP server tool discovery enhanced with fallback logic for better error recovery
- File path resolution improved in DuckDB provider for cross-platform consistency

## [2.7.0] - 2025-07-12

### Fixed

- MCP server now uses configured embedding model instead of hardcoded text-embedding-3-small default, ensuring semantic search works with any configured model
- MCP test environment improvements with comprehensive test data and configuration files

## [2.6.3] - 2025-07-10

### Fixed

- Configuration merge precedence now correctly preserves environment variables over JSON config values
- MCP server semantic search now works properly when running from different directories

### Removed

- Removed obsolete Ubuntu 20 Dockerfile as issue was resolved in configuration system

## [2.6.2] - 2025-07-10

### Fixed

- MCP server now properly loads embedding provider configuration from target directory

## [2.6.1] - 2025-07-10

### Fixed

- MCP server now properly respects CLI-provided project root directory for configuration loading
- Configuration files (.chunkhound.json) are now correctly loaded when running MCP server from different directories

## [2.6.0] - 2025-07-10

### Fixed

- MCP server crashes on Ubuntu and Linux systems when running from different directories by fixing database path resolution and process coordination
- Enhanced TaskGroup error reporting to show underlying causes instead of generic wrapper errors
- Configuration file loading in MCP server now properly respects .chunkhound.json files in target directories
- Database lock conflicts between multiple MCP instances resolved with proper process detection

### Enhanced

- Docker test infrastructure for MCP server validation to prevent future regressions
- Improved error messages for debugging MCP server issues with detailed analysis

## [2.5.4] - 2025-07-10

### Fixed

- MCP server reliability on Ubuntu and other Linux distributions when running from different directories
- Database path resolution consistency across all MCP server components

## [2.5.3] - 2025-07-10

### Fixed

- MCP server communication reliability improved by removing debug logging that interfered with JSON-RPC protocol

## [2.5.2] - 2025-07-10

### Added

- Automatic database optimization during embedding generation to maintain performance with large datasets (every 1000 batches, configurable via `CHUNKHOUND_EMBEDDING_OPTIMIZATION_BATCH_FREQUENCY`)

### Fixed

- MCP server compatibility on Ubuntu and other strict platforms by preserving virtual environment context in subprocesses
- OpenAI embedding provider crash on Ubuntu due to async resource creation outside event loop context

## [2.5.1] - 2025-01-09

### Fixed

- Project detection now properly respects CHUNKHOUND_PROJECT_ROOT environment variable, ensuring MCP command works correctly when launched from any directory
- Removed duplicate MCP parser function that could cause confusion

## [2.5.0] - 2025-01-09

### Enhanced

- MCP positional path argument now controls complete project scope - database location, config file search, and watch paths are all set to the specified directory instead of just watch paths

### Fixed

- MCP launcher import path resolution when running from different directories, eliminating TaskGroup errors on Ubuntu and other strict platforms

## [2.4.4] - 2025-01-09

### Fixed

- Ubuntu TaskGroup crash fixed by removing problematic directory change in MCP launcher

## [2.4.3] - 2025-01-09

### Fixed

- MCP server now works correctly when launched from any directory, not just the project root
- Fixed path resolution inconsistencies that caused TaskGroup errors on Ubuntu deployments

## [2.4.2] - 2025-01-09

### Added

- MCP command now accepts optional path argument to specify directory for indexing and watching (defaults to current directory)

### Fixed

- Parser architecture inconsistencies resolved across C, Bash, and Makefile parsers for consistent search functionality
- MCP server database duplication eliminated through proper async task isolation
- LanceDB storage growth controlled with automatic optimization during quiet periods
- MCP server reliability improved with corrected import structure and dependency resolution
- Python parser behavior now consistent between CLI and MCP modes
- Search operation freezes after file deletion resolved with proper thread safety

## [2.4.1] - 2025-01-09

### Fixed

- Package structure consolidated under chunkhound/ directory for improved import reliability and Python packaging best practices

## [2.4.0] - 2025-01-09

### Fixed

- LanceDB storage growth issue resolved with automatic database optimization during quiet periods
- Configuration system project root detection for .chunkhound.json files improved

### Changed

- Enhanced database provider architecture with capability detection and activity tracking
- Modernized configuration system by removing legacy registry config building

## [2.3.1] - 2025-07-09

### Fixed

- MCP server communication reliability improved by preventing stderr output from corrupting JSON-RPC messages
- Enhanced configuration documentation with automatic .chunkhound.json detection examples

## [2.3.0] - 2025-07-08

### Changed

- **BREAKING**: Configuration system completely refactored with centralized management and clear precedence hierarchy
- **BREAKING**: Automatic configuration file loading removed - config files now only load with explicit `--config` flag
- **BREAKING**: Environment variables standardized to `CHUNKHOUND_*` prefix with `__` delimiters (e.g., `CHUNKHOUND_EMBEDDING__API_KEY`)
- **BREAKING**: Legacy `OPENAI_API_KEY` and `OPENAI_BASE_URL` environment variables no longer supported

### Added

- Complete CLI argument coverage for all configuration options
- Centralized configuration precedence: CLI args → Config file → Environment variables → Defaults
- Comprehensive migration guide for updating existing configurations
- Database file gitignore pattern for Lance database files

### Fixed

- MCP server database duplication caused by shared transaction state across async tasks
- Parser architecture inconsistencies for C, Bash, and Makefile language parsers
- Configuration auto-detection issues that caused deployment complexity

## [2.2.0] - 2025-01-07

### Fixed

- Database freezing during concurrent file operations through proper async/sync boundary handling
- Thread safety issues in DuckDB provider with synchronized WAL cleanup and operation timeouts
- LanceDB duplicate file entries through atomic merge operations and path normalization
- File deletion operations now properly handle async contexts without blocking the event loop

### Changed

- Aligned LanceDB provider with serial executor pattern for consistency with DuckDB
- Improved path normalization to handle symlinks and different path representations
- Enhanced database operation reliability with proper thread isolation

### Added

- Support for complete configuration storage including API keys in .chunkhound.json files
- Consolidated embedding provider creation system for consistent behavior across CLI and config files

## [2.1.4] - 2025-07-03

### Fixed

- CLI argument defaults no longer override config file values
- Updated dependencies via uv.lock

## [2.1.3] - 2025-07-03

### Changed

- Consolidated embedding provider creation to use single factory pattern for consistency
- Reduced embedding provider log verbosity for cleaner output

## [2.1.2] - 2025-07-03

### Fixed

- API key configuration loading from .chunkhound.json files
- Configuration precedence documentation to match actual behavior

### Added

- Complete configuration examples with API key and security guidance

## [2.1.1] - 2025-07-03

### Added

- Centralized version management system for consistent versioning across all components

### Changed

- Simplified version updates through automated scripts
- Enhanced installation and development documentation
- Code formatting improvements and linting cleanup

### Fixed

- Version consistency across CLI, MCP server, and package initialization
- Import statement in package `__init__.py` for better module exposure

## [2.1.0] - 2025-07-02

### Fixed

- Database duplication in MCP server by implementing single-threaded executor pattern
- WAL corruption handling during DuckDB catalog replay
- Parser architecture inconsistencies for C, Bash, and Makefile parsers
- DuckDB foreign key constraint transaction limitations
- Python parser CLI/MCP divergence through unified factory pattern
- Connection management architectural violations

### Changed

- Consolidated database operations through DuckDBProvider executor pattern
- Simplified ConnectionManager to handle only connection lifecycle
- Updated file discovery patterns to include all 16 supported languages
- Removed deprecated connection methods and schema fields
- Enhanced transaction handling with contextvars for task isolation

### Added

- Automatic database migration system for schema updates
- Enhanced parser functionality for C pointer functions and Bash function bodies
- Task-local transaction state management
- Comprehensive executor methods for database operations

## [2.0.0] - 2025-06-26

### Added

- 10 new language parsers: Rust, Go, C++, C, Kotlin, Groovy, Bash, TOML, Makefile, Matlab
- Search pagination with response size limits
- Registry-based parser architecture
- MCP search task coordinator
- Test coverage for file modification tracking
- Comment and docstring indexing for all language parsers
- Background periodic indexing for better performance
- Path filtering support for targeted searches
- HNSW index WAL recovery with enhanced checkpoints
- Embedding cache optimization with CRC32-based content tracking

### Changed

- **BREAKING**: 'run' command renamed to 'index' with current directory default
- **BREAKING**: Parser system refactored to registry pattern
- Centralized language support in Language enum
- Optimized embedding performance with token-aware batching
- Enhanced PyInstaller compatibility
- Improved cross-platform build support (Windows, Ubuntu Docker)
- Enhanced MCP server JSON-RPC communication with logging suppression

### Fixed

- Parser error handling and registry integration
- OpenAI token limit handling
- PyInstaller module path resolution
- Database WAL corruption issues on server exit
- File watcher cancellation responsiveness
- Signal handler safety by removing unsafe database operations
- Windows PyInstaller and MATLAB dependency issues
- Build workflow reliability across platforms

## [1.2.3] - 2025-06-23

### Changed

- Default database location changed to current directory for better persistence

### Fixed

- OpenAI token limit exceeded error with dynamic batching for large embedding requests
- Empty chunk filtering to reduce noise in search results
- Python parser validation for empty symbol names
- Windows build support with comprehensive GitHub Actions workflow
- macOS Intel build issues with UV package manager installation
- Cross-platform build workflow reliability

### Added

- Windows build support with automated testing
- Enhanced debugging for build processes across platforms

## [1.2.2] - 2024-12-15

### Added

- File watching CLI for real-time code monitoring

### Changed

- Unified JavaScript and TypeScript parsers
- Default database location to current directory

### Fixed

- Empty symbol validation in Python parser

## [1.2.1] - 2024-11-28

### Added

- Ubuntu 20.04 build support
- Token limit management for MCP search

### Fixed

- Duplicate chunks after file edits
- File modification detection race conditions

## [1.2.0] - 2024-11-15

### Added

- C# language support
- JSON, YAML, and plain text file support
- File watching with real-time indexing

### Fixed

- File deletion handling
- Database connection issues

## [1.1.0] - 2025-06-12

### Added

- Multi-language support: TypeScript, JavaScript, C#, Java, and Markdown
- Comprehensive CLI interface
- Binary distribution with faster startup

### Changed

- Improved CLI startup performance (90% faster)
- Binary startup performance (16x faster)

### Fixed

- Version display consistency
- Cross-platform build issues

## [1.0.1] - 2025-06-11

### Added

- Python 3.10+ compatibility
- PyPI publishing
- Standalone executable support
- MCP server integration

### Fixed

- Dependency conflicts
- OpenAI model parameter handling
- Binary compilation issues

## [1.0.0] - 2025-06-10

### Added

- Initial release of ChunkHound
- Python parsing with tree-sitter
- DuckDB backend for storage and search
- OpenAI embeddings for semantic search
- CLI interface for indexing and searching
- MCP server for AI assistant integration
- File watching for real-time indexing
- Regex search capabilities

For more information, visit: https://github.com/chunkhound/chunkhound

[Unreleased]: https://github.com/chunkhound/chunkhound/compare/v4.1.0b1...HEAD
[4.1.0b1]: https://github.com/chunkhound/chunkhound/compare/v4.0.1...v4.1.0b1
[4.0.1]: https://github.com/chunkhound/chunkhound/compare/v4.0.0...v4.0.1
[4.0.0]: https://github.com/chunkhound/chunkhound/compare/v3.3.1...v4.0.0
[3.3.1]: https://github.com/chunkhound/chunkhound/compare/v3.3.0...v3.3.1
[3.3.0]: https://github.com/chunkhound/chunkhound/compare/v3.2.0...v3.3.0
[3.2.0]: https://github.com/chunkhound/chunkhound/compare/v3.1.0...v3.2.0
[3.1.0]: https://github.com/chunkhound/chunkhound/compare/v3.0.1...v3.1.0
[3.0.1]: https://github.com/chunkhound/chunkhound/compare/v3.0.0...v3.0.1
[3.0.0]: https://github.com/chunkhound/chunkhound/compare/v2.8.1...v3.0.0
[2.8.1]: https://github.com/chunkhound/chunkhound/compare/v2.8.0...v2.8.1
[2.8.0]: https://github.com/chunkhound/chunkhound/compare/v2.7.0...v2.8.0
[2.7.0]: https://github.com/chunkhound/chunkhound/compare/v2.6.3...v2.7.0
[2.6.3]: https://github.com/chunkhound/chunkhound/compare/v2.6.2...v2.6.3
[2.6.2]: https://github.com/chunkhound/chunkhound/compare/v2.6.1...v2.6.2
[2.6.1]: https://github.com/chunkhound/chunkhound/compare/v2.6.0...v2.6.1
[2.6.0]: https://github.com/chunkhound/chunkhound/compare/v2.5.4...v2.6.0
[2.5.4]: https://github.com/chunkhound/chunkhound/compare/v2.5.3...v2.5.4
[2.5.3]: https://github.com/chunkhound/chunkhound/compare/v2.5.2...v2.5.3
[2.5.2]: https://github.com/chunkhound/chunkhound/compare/v2.5.1...v2.5.2
[2.5.1]: https://github.com/chunkhound/chunkhound/compare/v2.5.0...v2.5.1
[2.5.0]: https://github.com/chunkhound/chunkhound/compare/v2.4.4...v2.5.0
[2.4.4]: https://github.com/chunkhound/chunkhound/compare/v2.4.3...v2.4.4
[2.4.3]: https://github.com/chunkhound/chunkhound/compare/v2.4.2...v2.4.3
[2.4.2]: https://github.com/chunkhound/chunkhound/compare/v2.4.1...v2.4.2
[2.4.1]: https://github.com/chunkhound/chunkhound/compare/v2.4.0...v2.4.1
[2.4.0]: https://github.com/chunkhound/chunkhound/compare/v2.3.1...v2.4.0
[2.3.1]: https://github.com/chunkhound/chunkhound/compare/v2.3.0...v2.3.1
[2.3.0]: https://github.com/chunkhound/chunkhound/compare/v2.2.0...v2.3.0
[2.2.0]: https://github.com/chunkhound/chunkhound/compare/v2.1.4...v2.2.0
[2.1.4]: https://github.com/chunkhound/chunkhound/compare/v2.1.3...v2.1.4
[2.1.3]: https://github.com/chunkhound/chunkhound/compare/v2.1.2...v2.1.3
[2.1.2]: https://github.com/chunkhound/chunkhound/compare/v2.1.1...v2.1.2
[2.1.1]: https://github.com/chunkhound/chunkhound/compare/v2.1.0...v2.1.1
[2.1.0]: https://github.com/chunkhound/chunkhound/compare/v2.0.0...v2.1.0
[2.0.0]: https://github.com/chunkhound/chunkhound/compare/v1.2.3...v2.0.0
[1.2.3]: https://github.com/chunkhound/chunkhound/compare/v1.2.2...v1.2.3
[1.2.2]: https://github.com/chunkhound/chunkhound/compare/v1.2.1...v1.2.2
[1.2.1]: https://github.com/chunkhound/chunkhound/compare/v1.2.0...v1.2.1
[1.2.0]: https://github.com/chunkhound/chunkhound/compare/v1.1.0...v1.2.0
[1.1.0]: https://github.com/chunkhound/chunkhound/compare/v1.0.1...v1.1.0
[1.0.1]: https://github.com/chunkhound/chunkhound/compare/v1.0.0...v1.0.1
[1.0.0]: https://github.com/chunkhound/chunkhound/releases/tag/v1.0.0
