# Contextual Embeddings Service - Phase 2 Implementation Summary
## Overview
The Contextual Embeddings Service has been successfully implemented as Phase 2 of the enhanced document processing system. This service uses Claude to generate rich, searchable context for each document chunk, significantly improving semantic search capabilities.
## Key Components Implemented
### 1. Core Service (`contextual_embeddings.py`)
The main service provides:
- **Intelligent Document Categorization**: Automatically detects document types (code, documentation, tutorial, reference, configuration) based on file path and content
- **Prompt Templates**: Specialized templates for each document category to generate optimal context
- **Caching System**: Both memory and disk-based caching to avoid redundant API calls
- **Batch Processing**: Efficient parallel processing of multiple chunks with progress tracking
- **Cost & Token Monitoring**: Tracks API usage and provides cost estimates
- **Graceful Degradation**: Works even without the anthropic package installed (returns mock contexts)
### 2. Document Categories
Six specialized categories for optimal context generation:
1. **CODE**: Source code files with focus on functionality and dependencies
2. **DOCUMENTATION**: General documentation with emphasis on concepts
3. **TUTORIAL**: Learning materials highlighting objectives and prerequisites
4. **REFERENCE**: API docs and specifications with technical details
5. **CONFIGURATION**: Config files with option purposes and values
6. **GENERAL**: Fallback for unclassified content
### 3. Prompt Template System
Each category has a tailored prompt template with:
- **System Prompt**: Sets the AI's role and objectives
- **User Prompt Template**: Formats chunk content with metadata
- **Context Focus**: Category-specific emphasis (e.g., learning objectives for tutorials, API details for references)
### 4. Caching Infrastructure
Multi-level caching for efficiency:
- **Memory Cache**: Fast in-process cache for current session
- **Disk Cache**: Persistent JSON-based cache across sessions
- **Cache Keys**: Based on content hash + document path + category
- **Cache Validation**: Automatic loading and verification
### 5. Metrics & Monitoring
Comprehensive tracking includes:
- Total chunks processed
- Cache hit rates
- Token usage (input/output)
- Cost estimation (Claude 3.5 Sonnet pricing)
- Processing time
- Error tracking
## API Usage
### Basic Usage
```python
from mcp_server.document_processing import (
ContextualEmbeddingService,
DocumentChunk,
ChunkType,
ChunkMetadata
)
# Initialize service
service = ContextualEmbeddingService(
api_key="your-api-key", # Or set ANTHROPIC_API_KEY env var
enable_prompt_caching=True,
max_concurrent_requests=5
)
# Create a chunk
chunk = DocumentChunk(
id="chunk_1",
content="def fibonacci(n): return n if n <= 1 else fibonacci(n-1) + fibonacci(n-2)",
type=ChunkType.CODE_BLOCK,
metadata=ChunkMetadata(
document_path="/src/math.py",
section_hierarchy=["Math Functions"],
chunk_index=0,
total_chunks=1,
has_code=True,
language="python"
)
)
# Generate context
context, was_cached = await service.generate_context_for_chunk(chunk)
```
### Batch Processing
```python
# Process multiple chunks with progress tracking
contexts = await service.generate_contexts_batch(
chunks,
document_context={"project": "MCP Server", "version": "1.0"},
progress_callback=lambda processed, total: print(f"{processed}/{total}")
)
# Get processing metrics
metrics = service.get_metrics()
print(f"Total cost: ${metrics.total_cost:.4f}")
print(f"Cache hit rate: {metrics.cached_chunks/metrics.total_chunks*100:.1f}%")
```
### Integration with Document Processing
The service integrates seamlessly with existing document processors:
```python
from mcp_server.plugins.markdown_plugin import MarkdownPlugin
# Process document
plugin = MarkdownPlugin()
processed_doc = plugin.process_document(file_path, content)
# Generate contexts for all chunks
contexts = await service.generate_contexts_batch(processed_doc.chunks)
# Enhance chunks with context
for chunk in processed_doc.chunks:
chunk.context = contexts[chunk.id]
```
## Features & Benefits
### 1. Intelligent Context Generation
- Category-specific prompts ensure relevant context
- Captures relationships, dependencies, and purpose
- Optimized for semantic search queries
### 2. Performance Optimization
- Prompt caching reduces API costs by ~50%
- Local caching eliminates redundant calls
- Concurrent processing with rate limiting
- Batch operations for efficiency
### 3. Cost Management
- Real-time token tracking
- Cost estimation per operation
- Cache metrics for optimization
- Configurable concurrency limits
### 4. Reliability
- Graceful error handling
- Fallback contexts on API failures
- Works without anthropic package (mock mode)
- Comprehensive error logging
### 5. Extensibility
- Easy to add new document categories
- Customizable prompt templates
- Pluggable cache backends
- Metrics export capability
## File Structure
```
mcp_server/document_processing/
├── contextual_embeddings.py # Main service implementation
├── __init__.py # Updated with new exports
└── ...
tests/
├── test_contextual_embeddings.py # Full test suite (requires anthropic)
├── test_contextual_embeddings_mock.py # Mock tests (no dependencies)
└── ...
demos/
├── demo_contextual_embeddings.py # Full API demonstration
├── demo_contextual_simple.py # Simple usage example
└── demo_contextual_integration.py # Integration with doc processing
```
## Configuration
### Environment Variables
- `ANTHROPIC_API_KEY`: API key for Claude access
- `MCP_CONTEXT_CACHE_DIR`: Custom cache directory (optional)
### Service Parameters
- `model`: Claude model to use (default: claude-3-5-sonnet-20241022)
- `max_concurrent_requests`: Rate limiting (default: 5)
- `enable_prompt_caching`: Use Anthropic's caching (default: True)
- `cache_dir`: Local cache location
## Testing
The implementation includes comprehensive tests:
1. **Unit Tests**: Document categorization, caching, templates
2. **Integration Tests**: Full service workflow
3. **Mock Tests**: Run without anthropic dependency
4. **Performance Tests**: Batch processing, caching efficiency
Run tests:
```bash
# Mock tests (no dependencies)
python test_contextual_embeddings_mock.py
# Full tests (requires anthropic)
pytest test_contextual_embeddings.py
```
## Next Steps
With Phase 2 complete, the system is ready for:
1. **Phase 3**: Multi-modal embedding generation combining contexts with code understanding
2. **Phase 4**: Production deployment with monitoring and optimization
3. **Phase 5**: Advanced features like cross-document context and incremental updates
## Conclusion
The Contextual Embeddings Service successfully enhances document chunks with rich, searchable context using Claude's advanced language understanding. The implementation is production-ready with robust caching, error handling, and monitoring capabilities.