PERFORMANCE_ANALYSIS.md•8.62 KiB

# Performance Analysis Report

**TDZ C64 Knowledge Base - v2.14.0**

**Date**: 2024-12-21

**Benchmark Runtime**: 6.27 seconds

## Executive Summary

Comprehensive performance benchmarking reveals:

- ✅ **FTS5 search**: Excellent performance (<1ms) with caching
- ⚠️ **Semantic search**: 39x slower than FTS5 (14.53ms vs 0.37ms)
- 🔴 **Entity extraction**: Extreme variance (0.12ms to 3293ms - 27,000x range)
- ⚠️ **Document ingestion**: High variance on large documents (96ms-477ms - 5x range)

**Priority Optimizations:**

1. **HIGH**: Cache semantic search embeddings
2. **HIGH**: Implement entity extraction result caching
3. **MEDIUM**: Optimize large document chunking strategy
4. **LOW**: Consider hybrid search weight optimization

---

## Detailed Benchmark Results

### 1. Document Ingestion Performance

| Size | Mean | Median | Min | Max | StdDev | Iterations |
|------|------|--------|-----|-----|--------|------------|
| **Small** | 5.54ms | 2.26ms | 1.83ms | 67.91ms | 14.68ms | 20 |
| **Medium** | 20.11ms | 13.70ms | 12.11ms | 79.29ms | 20.81ms | 10 |
| **Large** | 182.01ms | 111.06ms | 96.42ms | 477.27ms | 165.47ms | 5 |

**Findings:**

- Linear scaling with document size (expected)
- High variance on large documents (5x difference between min/max)
- Likely caused by:
  - Chunking overhead
  - Semantic embedding generation (if enabled)
  - Database transaction timing
  - File I/O variability

**Optimization Recommendations:**

1. Profile large document chunking to identify bottleneck
2. Batch embed generation for chunks (if semantic enabled)
3. Consider async document processing for large files
4. Investigate file I/O caching

---

### 2. Search Performance

| Method | Mean | Median | Min | Max | StdDev | Results | Iterations |
|--------|------|--------|-----|-----|--------|---------|------------|
| **FTS5** | 0.37ms | 0.0036ms | 0.0033ms | 36.31ms | 3.63ms | 10 | 100 |
| **Semantic** | 14.53ms | 14.22ms | 12.28ms | 19.37ms | 1.38ms | 10 | 50 |
| **Hybrid** | 19.44ms | 18.95ms | 15.63ms | 26.40ms | 1.93ms | 10 | 50 |
| **Complex Query** | 0.063ms | 0.0048ms | 0.0045ms | 5.77ms | 0.58ms | 2 | 100 |

**Key Findings:**

#### FTS5 Search (✅ Excellent)

- Sub-millisecond performance with caching (median: 0.0036ms)
- Occasional cache misses cause spikes (max: 36.31ms)
- Scales well: 100 docs still averages 0.16ms
- **No optimization needed**

#### Semantic Search (⚠️ Slow)

- Consistent 14.53ms average (39x slower than FTS5)
- Low variance (1.38ms stdev) - predictable performance
- Bottleneck: Query embedding generation + FAISS search
- **Optimization potential**: Cache query embeddings for repeated searches

#### Hybrid Search (⚠️ Slower than both)

- 19.44ms average (slower than semantic alone!)
- Combines FTS5 + semantic but adds overhead
- Current weight: Equal blending (0.5/0.5)
- **Issue**: Overhead of running both searches + merging results exceeds benefits

**Optimization Recommendations:**

1. **Cache semantic query embeddings** - Hash query text, cache embedding for 1 hour
2. **Optimize hybrid search**:
   - Run FTS5 and semantic in parallel (currently sequential)
   - Adjust default weight to favor FTS5 (0.7 FTS5 / 0.3 semantic)
   - Consider early termination if FTS5 returns high-confidence results
3. **Query result caching** - Cache full search results for identical queries (TTL: 5 minutes)

---

### 3. Entity Extraction Performance (🔴 Critical Issue)

| Metric | Value |
|--------|-------|
| **Mean** | 329.47ms |
| **Median** | 0.15ms |
| **Min** | 0.12ms |
| **Max** | 3293ms (3.3 seconds!) |
| **StdDev** | 1041ms |
| **Variance Ratio** | **27,000x** (min to max) |

**Analysis:**

This extreme variance indicates a clear pattern:

- **First extraction per document**: 3.3 seconds (LLM API call)
- **Subsequent extractions**: 0.12ms (database cache hit)

**Why this matters:**

- Unpredictable user experience
- First-time entity extraction appears to "hang"
- No progress indication for long operations

**Optimization Recommendations:**

1. **Implement aggressive result caching**:

   ```python
   # Cache entity extraction results by doc_id + confidence_threshold
   cache_key = f"{doc_id}:{confidence_threshold}"
   if cache_key in entity_cache:
       return entity_cache[cache_key]
   ```

2. **Add progress callbacks** for LLM operations:

   ```python
   def extract_entities(doc_id, progress_callback=None):
       if progress_callback:
           progress_callback("Preparing document chunks...")
       # ... chunking ...
       if progress_callback:
           progress_callback("Calling LLM for entity extraction...")
       # ... LLM call ...
   ```

3. **Background entity extraction**:
   - Extract entities automatically after document ingestion
   - Store results immediately in database
   - User never experiences the 3.3s wait
4. **Batch entity extraction**:
   - Process multiple documents in parallel
   - Reduces per-document overhead

---

### 4. Large Dataset Performance

| Test | Documents | Mean | Results | Iterations |
|------|-----------|------|---------|------------|
| **Search (100 docs)** | 100 | 0.16ms | 10 | 50 |
| **Large Result Set** | 50 | 0.37ms | 12 | 20 |

**Findings:**

- FTS5 search scales excellently (minimal degradation with 100 docs)
- Large result sets (12+ results) have no performance penalty
- Database indexes are effective

**Optimization Recommendations:**

- None needed for current scale
- Monitor performance at 1000+ documents
- Consider result pagination for very large result sets (100+)

---

## Performance Bottleneck Summary

### Critical Issues (🔴 Fix Immediately)

1. **Entity extraction variance** - 27,000x range (0.12ms to 3.3s)
   - Solution: Aggressive caching, background extraction, progress callbacks

### Important Issues (⚠️ High Impact)

2. **Semantic search slowness** - 39x slower than FTS5
   - Solution: Cache query embeddings, parallel hybrid search
3. **Hybrid search overhead** - Slower than individual methods
   - Solution: Parallel execution, optimize weights

### Minor Issues (ℹ️ Low Priority)

4. **Large document ingestion variance** - 5x range
   - Solution: Profile chunking, optimize I/O

---

## Recommended Optimization Roadmap

### Phase 1: Quick Wins (2-4 hours)

1. **Implement semantic query embedding cache** (~60 lines)
   - LRU cache with 1-hour TTL
   - Expected improvement: 10-12ms → 2-4ms (3-6x faster)
2. **Add entity extraction result caching** (~40 lines)
   - Store in memory + database
   - Expected improvement: Eliminates 3.3s first-time delay
3. **Parallel hybrid search** (~80 lines)
   - ThreadPoolExecutor for FTS5 + semantic
   - Expected improvement: 19.44ms → 14-15ms (25% faster)

### Phase 2: Background Processing (4-6 hours)

4. **Automatic entity extraction on ingestion** (~120 lines)
   - Background thread pool
   - Queue-based processing
   - Expected improvement: Zero user-facing extraction delays
5. **Query result caching** (~60 lines)
   - Cache complete search results (TTL: 5 minutes)
   - Expected improvement: Instant results for repeated queries

### Phase 3: Advanced Optimizations (6-8 hours)

6. **Large document chunking optimization** (~100 lines)
   - Profile and optimize chunking algorithm
   - Investigate parallel chunk processing
7. **Hybrid search weight auto-tuning** (~80 lines)
   - ML-based weight optimization based on query patterns
   - A/B testing framework

---

## Baseline Metrics for Tracking

Use these baselines to measure optimization effectiveness:

| Operation | Current | Target | Improvement |
|-----------|---------|--------|-------------|
| FTS5 Search | 0.37ms | 0.37ms | - (already optimal) |
| Semantic Search | 14.53ms | **4-5ms** | **3x faster** |
| Hybrid Search | 19.44ms | **15ms** | **25% faster** |
| Entity Extraction (first) | 3293ms | **<100ms** | **33x faster** |
| Entity Extraction (cached) | 0.12ms | 0.12ms | - (already optimal) |
| Large Doc Ingestion | 182ms | **150ms** | **20% faster** |

---

## Testing Strategy for Optimizations

After each optimization:

1. **Run benchmark suite**: `python benchmark.py`
2. **Compare results**: Check improvement vs baseline
3. **Verify correctness**: Ensure results match pre-optimization
4. **Load test**: Test with 1000+ documents
5. **Update PERFORMANCE_ANALYSIS.md**: Document improvements

---

## Conclusion

The system has excellent FTS5 search performance but suffers from:

- **Entity extraction unpredictability** (highest priority)
- **Semantic search overhead** (high impact on hybrid search)
- **Document ingestion variance** (lower priority)

Implementing Phase 1 optimizations (2-4 hours) will deliver:

- 3-6x faster semantic search
- Elimination of 3.3s entity extraction delays
- 25% faster hybrid search

**Recommended next step**: Implement Phase 1 optimizations immediately.
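
### Appendix: Sketch of the Phase 1 query embedding cache

As a starting point for the first Phase 1 item, the following is a minimal sketch of an LRU cache with a 1-hour TTL keyed by a hash of the query text. It is not taken from the project's code: the class name `QueryEmbeddingCache`, the `embed_fn` parameter, the whitespace normalization of the query, and the default `max_size` are all assumptions made for illustration. The sketch is not thread-safe; a lock around `get` would be needed if searches run concurrently.

```python
import hashlib
import time
from collections import OrderedDict
from typing import Callable, List, Tuple


class QueryEmbeddingCache:
    """LRU cache with a per-entry TTL, keyed by a hash of the query text.

    Hypothetical helper: `embed_fn` stands in for whatever function the
    knowledge base uses to turn a query string into an embedding.
    """

    def __init__(
        self,
        embed_fn: Callable[[str], List[float]],
        max_size: int = 1024,          # assumed capacity, tune as needed
        ttl_seconds: float = 3600.0,   # 1-hour TTL, as proposed in the roadmap
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._embed_fn = embed_fn
        self._max_size = max_size
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry timestamp, embedding); insertion order tracks recency
        self._entries: "OrderedDict[str, Tuple[float, List[float]]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query: str) -> str:
        # Assumption: queries differing only in whitespace share an embedding.
        normalized = " ".join(query.split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, query: str) -> List[float]:
        key = self._key(query)
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            self._entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return entry[1]

        # Missing or expired: recompute and store with a fresh expiry.
        self.misses += 1
        embedding = self._embed_fn(query)
        self._entries[key] = (now + self._ttl, embedding)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict least recently used
        return embedding
```

Tracking `hits` and `misses` makes it straightforward to check, after a benchmark run, whether repeated queries are actually being served from the cache before comparing against the 14.53ms baseline.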
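
### Appendix: Sketch of parallel hybrid search

The parallel hybrid search item in Phase 1 could look roughly like the sketch below, which runs both searches in a `ThreadPoolExecutor` and blends their scores with the proposed 0.7 / 0.3 weighting. The function name `hybrid_search`, the callables `fts_search` and `semantic_search`, and the assumption that each returns a mapping of document ID to a score already normalized to [0, 1] are illustrative and not taken from the project's code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Tuple

# Hypothetical signature: query -> {doc_id: score normalized to [0, 1]}
SearchFn = Callable[[str], Dict[str, float]]


def hybrid_search(
    query: str,
    fts_search: SearchFn,
    semantic_search: SearchFn,
    fts_weight: float = 0.7,  # proposed default favoring FTS5
    top_k: int = 10,
) -> List[Tuple[str, float]]:
    """Run both searches concurrently and return the top_k blended results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        fts_future = pool.submit(fts_search, query)
        sem_future = pool.submit(semantic_search, query)
        fts_scores = fts_future.result()
        sem_scores = sem_future.result()

    sem_weight = 1.0 - fts_weight
    # A document missing from one result set contributes 0 for that method.
    blended = {
        doc_id: fts_weight * fts_scores.get(doc_id, 0.0)
        + sem_weight * sem_scores.get(doc_id, 0.0)
        for doc_id in fts_scores.keys() | sem_scores.keys()
    }
    # Highest score first; ties broken by doc_id for deterministic output.
    return sorted(blended.items(), key=lambda item: (-item[1], item[0]))[:top_k]
```

With both searches running concurrently, latency is bounded below by the slower of the two, which is consistent with the 14-15ms estimate given the 14.53ms semantic baseline. Whether threads deliver that gain depends on the underlying calls releasing the GIL, so it should be confirmed with the benchmark suite. Note also that `sqlite3` connections refuse use from a thread other than the one that created them unless opened with `check_same_thread=False`, so the FTS5 side may need its own connection.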