Code-Index-MCP

Code-Index-MCP
docs
implementation

CONTEXTUAL_EMBEDDINGS_SUMMARY.md•7.03 KiB

# Contextual Embeddings Service - Phase 2 Implementation Summary ## Overview The Contextual Embeddings Service has been successfully implemented as Phase 2 of the enhanced document processing system. This service uses Claude to generate rich, searchable context for each document chunk, significantly improving semantic search capabilities. ## Key Components Implemented ### 1. Core Service (`contextual_embeddings.py`) The main service provides: - **Intelligent Document Categorization**: Automatically detects document types (code, documentation, tutorial, reference, configuration) based on file path and content - **Prompt Templates**: Specialized templates for each document category to generate optimal context - **Caching System**: Both memory and disk-based caching to avoid redundant API calls - **Batch Processing**: Efficient parallel processing of multiple chunks with progress tracking - **Cost & Token Monitoring**: Tracks API usage and provides cost estimates - **Graceful Degradation**: Works even without the anthropic package installed (returns mock contexts) ### 2. Document Categories Six specialized categories for optimal context generation: 1. **CODE**: Source code files with focus on functionality and dependencies 2. **DOCUMENTATION**: General documentation with emphasis on concepts 3. **TUTORIAL**: Learning materials highlighting objectives and prerequisites 4. **REFERENCE**: API docs and specifications with technical details 5. **CONFIGURATION**: Config files with option purposes and values 6. **GENERAL**: Fallback for unclassified content ### 3. Prompt Template System Each category has a tailored prompt template with: - **System Prompt**: Sets the AI's role and objectives - **User Prompt Template**: Formats chunk content with metadata - **Context Focus**: Category-specific emphasis (e.g., learning objectives for tutorials, API details for references) ### 4. Caching Infrastructure Multi-level caching for efficiency: - **Memory Cache**: Fast in-process cache for current session - **Disk Cache**: Persistent JSON-based cache across sessions - **Cache Keys**: Based on content hash + document path + category - **Cache Validation**: Automatic loading and verification ### 5. Metrics & Monitoring Comprehensive tracking includes: - Total chunks processed - Cache hit rates - Token usage (input/output) - Cost estimation (Claude 3.5 Sonnet pricing) - Processing time - Error tracking ## API Usage ### Basic Usage ```python from mcp_server.document_processing import ( ContextualEmbeddingService, DocumentChunk, ChunkType, ChunkMetadata ) # Initialize service service = ContextualEmbeddingService( api_key="your-api-key", # Or set ANTHROPIC_API_KEY env var enable_prompt_caching=True, max_concurrent_requests=5 ) # Create a chunk chunk = DocumentChunk( id="chunk_1", content="def fibonacci(n): return n if n <= 1 else fibonacci(n-1) + fibonacci(n-2)", type=ChunkType.CODE_BLOCK, metadata=ChunkMetadata( document_path="/src/math.py", section_hierarchy=["Math Functions"], chunk_index=0, total_chunks=1, has_code=True, language="python" ) ) # Generate context context, was_cached = await service.generate_context_for_chunk(chunk) ``` ### Batch Processing ```python # Process multiple chunks with progress tracking contexts = await service.generate_contexts_batch( chunks, document_context={"project": "MCP Server", "version": "1.0"}, progress_callback=lambda processed, total: print(f"{processed}/{total}") ) # Get processing metrics metrics = service.get_metrics() print(f"Total cost: ${metrics.total_cost:.4f}") print(f"Cache hit rate: {metrics.cached_chunks/metrics.total_chunks*100:.1f}%") ``` ### Integration with Document Processing The service integrates seamlessly with existing document processors: ```python from mcp_server.plugins.markdown_plugin import MarkdownPlugin # Process document plugin = MarkdownPlugin() processed_doc = plugin.process_document(file_path, content) # Generate contexts for all chunks contexts = await service.generate_contexts_batch(processed_doc.chunks) # Enhance chunks with context for chunk in processed_doc.chunks: chunk.context = contexts[chunk.id] ``` ## Features & Benefits ### 1. Intelligent Context Generation - Category-specific prompts ensure relevant context - Captures relationships, dependencies, and purpose - Optimized for semantic search queries ### 2. Performance Optimization - Prompt caching reduces API costs by ~50% - Local caching eliminates redundant calls - Concurrent processing with rate limiting - Batch operations for efficiency ### 3. Cost Management - Real-time token tracking - Cost estimation per operation - Cache metrics for optimization - Configurable concurrency limits ### 4. Reliability - Graceful error handling - Fallback contexts on API failures - Works without anthropic package (mock mode) - Comprehensive error logging ### 5. Extensibility - Easy to add new document categories - Customizable prompt templates - Pluggable cache backends - Metrics export capability ## File Structure ``` mcp_server/document_processing/ ├── contextual_embeddings.py # Main service implementation ├── __init__.py # Updated with new exports └── ... tests/ ├── test_contextual_embeddings.py # Full test suite (requires anthropic) ├── test_contextual_embeddings_mock.py # Mock tests (no dependencies) └── ... demos/ ├── demo_contextual_embeddings.py # Full API demonstration ├── demo_contextual_simple.py # Simple usage example └── demo_contextual_integration.py # Integration with doc processing ``` ## Configuration ### Environment Variables - `ANTHROPIC_API_KEY`: API key for Claude access - `MCP_CONTEXT_CACHE_DIR`: Custom cache directory (optional) ### Service Parameters - `model`: Claude model to use (default: claude-3-5-sonnet-20241022) - `max_concurrent_requests`: Rate limiting (default: 5) - `enable_prompt_caching`: Use Anthropic's caching (default: True) - `cache_dir`: Local cache location ## Testing The implementation includes comprehensive tests: 1. **Unit Tests**: Document categorization, caching, templates 2. **Integration Tests**: Full service workflow 3. **Mock Tests**: Run without anthropic dependency 4. **Performance Tests**: Batch processing, caching efficiency Run tests: ```bash # Mock tests (no dependencies) python test_contextual_embeddings_mock.py # Full tests (requires anthropic) pytest test_contextual_embeddings.py ``` ## Next Steps With Phase 2 complete, the system is ready for: 1. **Phase 3**: Multi-modal embedding generation combining contexts with code understanding 2. **Phase 4**: Production deployment with monitoring and optimization 3. **Phase 5**: Advanced features like cross-document context and incremental updates ## Conclusion The Contextual Embeddings Service successfully enhances document chunks with rich, searchable context using Claude's advanced language understanding. The implementation is production-ready with robust caching, error handling, and monitoring capabilities.

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ViperJuice/Code-Index-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

CONTEXTUAL_EMBEDDINGS_SUMMARY.md•7.03 KiB