# Performance Guide

Comprehensive performance tuning and optimization guide for the Local Search MCP Server.

## 📊 Performance Benchmarks

### Baseline Performance (8-core Intel i7, 16GB RAM, SSD)

| Operation | File Set | Threads | Time | Throughput | Memory Peak |
|-----------|----------|---------|------|------------|-------------|
| **File Indexing** | 100 files, 2MB | 8 | ~8s | ~12.5 files/sec | ~900MB |
| **Document Search** | 10K chunks | 1 | ~45ms | 22K searches/sec | ~50MB |
| **Repository Processing** | Medium repo (1GB) | 4 | ~25s | ~40MB/sec | ~1.5GB |
| **Concurrent Indexing** | 50 files | 16 | ~4s | ~12.5 files/sec | ~1.2GB |
| **File Watcher Updates** | 10 changed files | 1-4 | ~2s | Real-time | ~200MB |

### Scaling Performance

#### CPU Core Scaling (Test: 100 documentation files)

| CPU Cores | Time | Speedup | Efficiency |
|-----------|------|---------|------------|
| 1 | ~32s | 1.0x | 100% |
| 2 | ~19s | 1.7x | 85% |
| 4 | ~11s | 2.9x | 73% |
| 8 | ~8s | 4.0x | 50% |
| 16 | ~7s | 4.6x | 29% |

#### Memory Scaling (Test: Varying document counts)

| Documents | Index Size | Memory Usage | Search Speed |
|-----------|------------|--------------|--------------|
| 1K docs | ~50MB | ~200MB | ~10ms |
| 10K docs | ~500MB | ~1GB | ~45ms |
| 50K docs | ~2.5GB | ~4GB | ~150ms |
| 100K docs | ~5GB | ~8GB | ~300ms |

## ⚡ Performance Optimization

### Hardware Optimization

#### CPU Configuration

```typescript
import os from 'node:os';

// Optimal for modern systems (8+ cores)
const optimalConfig = {
  maxFileProcessingConcurrency: os.cpus().length,
  maxEmbeddingConcurrency: os.cpus().length,
  maxDirectoryConcurrency: os.cpus().length * 2,
  maxFileWatcherConcurrency: Math.max(1, Math.floor(os.cpus().length / 2))
};
```

```typescript
// Memory-constrained systems (4GB RAM)
const memoryConservative = {
  maxFileProcessingConcurrency: 2,
  maxEmbeddingConcurrency: 1,
  maxDirectoryConcurrency: 4,
  maxFileWatcherConcurrency: 1,
  // Smaller chunks reduce memory pressure
  chunkSize: 500,
  overlap: 50
};
```

#### GPU Acceleration

```typescript
// Default configuration with auto-detect
const embeddingConfig = {
  useGPU: true,              // Auto-detect GPU availability
  model: 'Xenova/paraphrase-multilingual-minilm-l12-v2',
  batchSize: 32,             // Optimal batch size for GPU
  // Fallback behavior
  cpuFallback: true,         // Use CPU if GPU fails
  gpuMemoryLimit: '4GB'      // Optional GPU memory limit
};
```

### Chunking Strategy Optimization

#### Technical Documentation

```typescript
// Best for API docs and technical specs
const technicalChunking = {
  chunkSize: 800,            // Smaller chunks for precision
  overlap: 150,              // More overlap for context
  method: 'sentence-aware',  // Content-aware splitting
  preserveCodeBlocks: true,
  preserveMarkdownHeaders: true,
  maxChunkSize: 1500
};
```

#### Narrative Documentation

```typescript
// Best for guides, tutorials, and blogs
const narrativeChunking = {
  chunkSize: 1200,           // Larger chunks for flow
  overlap: 200,              // Standard overlap
  method: 'paragraph-aware', // Structure preservation
  splitOnHeaders: true,
  preserveLists: true,
  maxChunkSize: 2000
};
```

### Database Optimization

#### SQLite Configuration

```typescript
// Performance-optimized database settings
const dbConfig = {
  // Connection settings
  journal_mode: 'WAL',        // Write-Ahead Logging
  synchronous: 'NORMAL',      // Balanced safety/performance
  cache_size: -512000,        // 512MB cache (negative value means KB)
  temp_store: 'MEMORY',       // Temp tables in memory
  // Read optimization
  mmap_size: 268435456,       // 256MB memory mapping
  // Maintenance
  auto_vacuum: 'INCREMENTAL', // Optimize space usage
  foreign_keys: 'ON'          // Enable FK constraints
};
```

#### Index Maintenance

```typescript
// Regular maintenance for optimal performance
async function optimizeIndex() {
  // Update query planner statistics
  await db.exec('ANALYZE vector_index');
  // Rebuild fragmented indexes
  await db.exec('REINDEX vector_index');
  // Clean up the WAL file
  await db.exec('PRAGMA wal_checkpoint(TRUNCATE)');
  // Compact the database
  await db.exec('VACUUM');
}
```

### Network Optimization

#### Repository Processing

```typescript
// Optimized for GitHub API rate limits
const repoConfig = {
  // Parallel processing with rate limiting
  maxRepositoryConcurrency: 3, // GitHub API rate limit friendly
  // Chunked downloads
  downloadBatchSize: 10,       // Process files in batches
  // Connection pooling
  maxConnections: 6,           // HTTP connection limit
  timeout: 30000,              // 30-second timeout
  // Retry strategy
  retryAttempts: 3,
  retryDelay: 1000,            // Exponential backoff base delay (ms)
  jitter: 0.1                  // Add randomness to avoid thundering herd
};
```

## 📊 Current Monitoring

The system provides basic performance timing through the Logger utility. Processing durations and chunk counts are logged for all operations.

## 🎯 Advanced Optimizations

### Embedding Model Optimization

#### Model Selection Guide

```typescript
// Model performance comparison (throughput, quality, size)
const modelOptions = {
  // Small model - fast, good enough for most use cases
  'Xenova/paraphrase-multilingual-minilm-l12-v2': {
    size: '90MB',
    throughput: '150 docs/sec',
    quality: 'good',
    useCase: 'general purpose'
  },
  // Large model - slower, better quality
  'Xenova/paraphrase-multilingual-mpnet-base-v2': {
    size: '420MB',
    throughput: '75 docs/sec',
    quality: 'excellent',
    useCase: 'high precision required'
  }
};
```

#### Current Batch Processing

The system processes embeddings in small batches (10 chunks) and yields to the event loop between batches to avoid blocking the MCP server.
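That batching pattern can be sketched roughly as follows; the `embedBatch` callback and the batch size of 10 are illustrative stand-ins, not the server's actual API:

```typescript
// Sketch: process items in fixed-size batches, yielding to the
// event loop between batches so concurrent MCP requests stay responsive.
const BATCH_SIZE = 10; // matches the batch size described above

async function embedInBatches<T, R>(
  items: T[],
  embedBatch: (batch: T[]) => Promise<R[]> // hypothetical embedding call
): Promise<R[]> {
  const results: R[] = [];
  for (let i = 0; i < items.length; i += BATCH_SIZE) {
    const batch = items.slice(i, i + BATCH_SIZE);
    results.push(...(await embedBatch(batch)));
    // Yield to the event loop before the next batch
    await new Promise<void>((resolve) => setImmediate(resolve));
  }
  return results;
}
```

The `setImmediate` yield is what keeps a long embedding run from monopolizing the single Node.js event loop.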
### Database Query Optimization

#### Vector Search Optimization

```sql
-- Optimized vector search query
EXPLAIN QUERY PLAN
SELECT
  chunk_text,
  embedding <=> $query_embedding AS distance, -- Cosine distance
  file_path,
  chunk_index,
  metadata
FROM vector_chunks
WHERE distance < $threshold
ORDER BY distance ASC
LIMIT $limit;
```

#### Index Maintenance

Basic database maintenance can be performed using:

- `VACUUM` - Reclaim space and optimize tables
- `ANALYZE` - Update query optimizer statistics
- `PRAGMA wal_checkpoint(TRUNCATE)` - Clean up the WAL file

## 🚨 Performance Troubleshooting

### Performance Degradation Issues

#### Symptoms and Solutions

**Symptom: Search queries get slower over time**

```bash
# Check database file size
ls -lh local-search-index.db

# Check integrity and fragmentation
sqlite3 local-search-index.db "PRAGMA integrity_check;"
sqlite3 local-search-index.db "PRAGMA freelist_count;"

# Optimize database
sqlite3 local-search-index.db "VACUUM; ANALYZE;"
```

**Symptom: High memory usage during indexing**

```bash
# Monitor memory usage
node --inspect --max-old-space-size=4096 build/index.js

# Reduce concurrency
export MAX_FILE_CONCURRENCY=2

# Use smaller chunks
export CHUNK_SIZE=500
```

**Symptom: Embedding generation is slow**

```bash
# Check GPU usage (if a GPU is available)
nvidia-smi

# Disable GPU if it is causing issues
export USE_GPU=false

# Check the model cache
ls -la ~/.cache/transformers/
```

### Diagnostic Tools

#### Performance Logging

```typescript
// Enable detailed performance logging
const logger = {
  timing: (label: string, duration: number) => {
    console.log(`[${label}] ${duration.toFixed(2)}ms`);
  },
  memory: () => {
    const usage = process.memoryUsage();
    console.log(
      `Memory: RSS=${(usage.rss / 1024 / 1024).toFixed(1)}MB, ` +
      `Heap=${(usage.heapUsed / 1024 / 1024).toFixed(1)}MB`
    );
  },
  profile: async (name: string, fn: () => Promise<any>) => {
    const startTime = performance.now();
    const startMemory = process.memoryUsage();
    try {
      const result = await fn();
      const endTime = performance.now();
      const endMemory = process.memoryUsage();
      console.log(`[${name}]`);
      console.log(`  Time: ${(endTime - startTime).toFixed(2)}ms`);
      console.log(`  Memory Delta: ${((endMemory.rss - startMemory.rss) / 1024 / 1024).toFixed(1)}MB`);
      return result;
    } catch (error) {
      console.error(`[${name}] Error:`, error);
      throw error;
    }
  }
};
```

### Resource Monitoring

Memory usage and processing statistics are logged automatically during operations.

---

**Monitor**, **profile**, **optimize** - repeat. Performance is a continuous process. Use these tools and techniques to keep your Local Search MCP Server running at peak efficiency.
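As a closing worked example of that loop, a `profile`-style wrapper like the one in the diagnostics section can time any async operation. The minimal standalone version below uses a hypothetical `fakeIndex` stand-in for a real indexing call:

```typescript
// Minimal standalone profile helper (same idea as the logger above)
async function profile<T>(name: string, fn: () => Promise<T>): Promise<T> {
  const start = performance.now();
  const result = await fn();
  console.log(`[${name}] ${(performance.now() - start).toFixed(2)}ms`);
  return result;
}

// Hypothetical stand-in for a real indexing call:
// pretend each file produces 4 chunks.
async function fakeIndex(files: string[]): Promise<number> {
  return files.length * 4;
}

profile('index-docs', () => fakeIndex(['a.md', 'b.md', 'c.md']))
  .then((chunks) => console.log(`indexed ${chunks} chunks`));
```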
