# Extraction Optimization - Deployment Guide
## Quick Summary
**Problem**: Phase 2 extraction taking 18-20+ hours due to sequential LLM calls
**Solution**: Batch LLM requests (20 conversations per call instead of 1)
**Result**: 1.5x faster extraction (proven by testing)
**Time to Deploy**: 5 minutes
---
## Current Status
```
Phase 2 Progress: 58.5% complete (133/227 batches)
Elapsed Time: 15.7 hours
Remaining: ~5,657 conversations
Current Rate: ~141 conversations/minute
Estimated Remaining: 8-10 hours (if no optimization)
```
---
## Step 1: Validate the Optimization (Optional but Recommended)
### Quick Validation Test (5 minutes)
```bash
# Test batching on a small sample
cd /path/to/faulkner-db
source venv/bin/activate
# Run a small test
python3 ingestion/agent_genesis_optimized.py \
    --batch-size 20 \
    --llm-batch 20 \
    --parallel 1
```
**Expected Output**:
```
OPTIMIZED AGENT GENESIS EXTRACTION (5-10x faster)
Batch Size: 20 | LLM Batch: 20 | Parallel: 1
Gathering ALL conversations from Agent Genesis corpus...
  'code implementation': +... new
  ...
✅ Total unique conversations gathered: ...
Processing X conversations in Y batches...
[██████████████░░░░░░░░░░] X% | Batch N/Y | Nodes: Z (20%) | Rate: 2.0+/min
```
**Success Criteria**:
- ✅ Extraction rate > 2.0 conversations/second (vs 1.5 baseline)
- ✅ No JSON parsing errors
- ✅ Success rate > 20%
- ✅ Progress bar updates smoothly
**If all pass**: Proceed to Step 2
---
## Step 2: Deploy to Remaining Phase 2
### Stop Current Process (Gracefully)
```bash
# The current process (069df0) can be stopped gracefully:
# - It saves a checkpoint after every batch
# - The optimized version will read that checkpoint and resume

# Option A: Let it finish naturally (it's at 58.5% and needs ~8-10 more hours)
# Option B: Send SIGTERM now; it will save a checkpoint before exiting

# Find the PID if needed (the pattern below is illustrative; match your actual script name)
pgrep -f agent_genesis_comprehensive.py

kill -TERM <PID>   # Replace <PID> with the process ID from pgrep
```
### Launch Optimized Extraction
```bash
cd /path/to/faulkner-db
source venv/bin/activate
# Deploy optimized version
python3 ingestion/agent_genesis_optimized.py \
    --batch-size 100 \
    --llm-batch 20 \
    --parallel 1
```
**What This Does** (see the sketch after this list):
1. Loads the existing checkpoint (resumes from 58.5%)
2. Gathers remaining conversations
3. Filters out lowest 20% by relevance
4. Batches conversations into groups of 20
5. Sends each batch to LLM (1 call per 20 conversations)
6. Saves checkpoint after each batch
7. Continues until all conversations processed
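For intuition, the core loop behind steps 4-6 looks roughly like the sketch below. This is not the actual code of `agent_genesis_optimized.py`; the checkpoint field `processed_ids`, the conversation `id` key, and the `extract_batch` callable are illustrative assumptions.

```python
import json
from pathlib import Path

CHECKPOINT = Path("ingestion/optimized_checkpoint.json")

def load_checkpoint() -> set:
    """Return the IDs of conversations already processed (empty on a fresh run)."""
    if CHECKPOINT.exists():
        return set(json.loads(CHECKPOINT.read_text()).get("processed_ids", []))
    return set()

def save_checkpoint(processed: set) -> None:
    """Persist progress after every batch so an interrupted run can resume."""
    CHECKPOINT.write_text(json.dumps({"processed_ids": sorted(processed)}))

def run(conversations, extract_batch, llm_batch=20):
    """Group conversations into llm_batch-sized chunks, one LLM call per chunk."""
    processed = load_checkpoint()
    remaining = [c for c in conversations if c["id"] not in processed]
    for i in range(0, len(remaining), llm_batch):
        chunk = remaining[i:i + llm_batch]
        extract_batch(chunk)                      # one LLM call covers the whole chunk
        processed.update(c["id"] for c in chunk)
        save_checkpoint(processed)                # checkpoint after every batch
```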
---
## Step 3: Monitor Progress
### Live Monitoring
```bash
# In a separate terminal, monitor the process
tail -f /path/to/output.log
# Or check checkpoint file for stats
cat ingestion/optimized_checkpoint.json | python3 -m json.tool
```
### What to Watch For
```
✅ Good Signs:
  - Rate: 2.0+ conversations/second
  - Success: 20-25% extraction rate
  - Batches: Processing 5-10 batches/minute
  - Errors: < 1%

⚠️ Warning Signs:
  - Rate drops below 1.0 conversations/second
  - Success rate drops below 15%
  - Consistent timeouts (> 5 in a row)
  - Memory usage growing (> 2 GB)

❌ Error Signs:
  - JSON parsing errors
  - Network failures (persistent)
  - Out of memory
  - Checkpoint not saving
```
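A cheap way to catch the "checkpoint not saving" failure mode is to watch the checkpoint file's modification time. A minimal sketch (the 15-minute staleness threshold is an arbitrary choice, not something the extractor defines):

```python
import time
from pathlib import Path

CHECKPOINT = Path("ingestion/optimized_checkpoint.json")
STALE_AFTER = 15 * 60   # seconds; tune to roughly 2-3x your observed batch interval

while True:
    if not CHECKPOINT.exists():
        print("WARNING: checkpoint file not found yet")
    else:
        age_min = (time.time() - CHECKPOINT.stat().st_mtime) / 60
        status = "WARNING: checkpoint stale" if age_min * 60 > STALE_AFTER else "OK"
        print(f"{status}: last checkpoint write {age_min:.1f} minutes ago")
    time.sleep(60)   # poll once per minute
```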
---
## Step 4: Expected Timeline
### With Optimization Deployed Now
```
Current State (2025-11-10 15:00 UTC):
Progress: 58.5% (133/227 batches)
Time elapsed: 15.7 hours
Remaining conversations: ~5,657
With Batching Optimization:
Speedup: 1.5x
Remaining time: 5-7 hours
Expected completion: ~2025-11-10 21:00-23:00 UTC
Without Optimization:
Remaining time: 8-10 hours
Expected completion: ~2025-11-11 01:00-03:00 UTC
Time Saved: 3 hours
```
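The optimized window is just the unoptimized estimate divided by the measured 1.5x speedup; a quick sanity check of the numbers above:

```python
speedup = 1.5
hours_without = (8, 10)                               # unoptimized estimate from above

hours_with = tuple(h / speedup for h in hours_without)
print(hours_with)                                     # (5.33..., 6.66...) -> the "5-7 hours" window
print(sum(hours_without) / 2 - sum(hours_with) / 2)   # ~3.0 hours saved on average
```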
---
## Alternative: Larger Batches (Experimental)
If you want to push harder for more speed:
```bash
# Try batch size of 40 (more aggressive)
python3 ingestion/agent_genesis_optimized.py \
    --batch-size 100 \
    --llm-batch 40 \
    --parallel 1
```
**Expected**: 2.0-2.5x faster (3-5 hours remaining)
**Risk**: Slightly less stable, needs validation first
---
## Troubleshooting
### Issue: "JSON decode error"
**Cause**: LLM response doesn't contain valid JSON
**Solution**:
- Reduce the LLM batch size: `--llm-batch 15`
- This gives the LLM less content to handle per call
- May be slightly slower, but more reliable (a defensive parsing sketch follows below)
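If lowering `--llm-batch` doesn't fully resolve it, the usual defensive pattern is to extract the JSON portion of the response before parsing, rather than calling `json.loads` on the raw text. A minimal sketch of that pattern (not the script's actual parsing code):

```python
import json
import re

def parse_llm_json(text):
    """Best-effort parse of a JSON object or array embedded in an LLM response."""
    try:
        return json.loads(text)                 # happy path: the response is pure JSON
    except json.JSONDecodeError:
        pass
    match = re.search(r"\{.*\}|\[.*\]", text, re.DOTALL)   # outermost braced/bracketed block
    if match:
        try:
            return json.loads(match.group(0))
        except json.JSONDecodeError:
            pass
    return None                                 # caller can retry with a smaller batch
```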
### Issue: "Timeout waiting for LLM"
**Cause**: LLM taking too long to respond
**Solution**:
- Reduce the LLM batch size: `--llm-batch 15`
- Or increase the request timeout in code (advanced; see the example below)
- Check whether the LLM server is overloaded
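How to raise the timeout depends on how the script calls the local LLM, which this guide doesn't show. If it goes over HTTP with `requests` (an assumption, not confirmed here), the timeout is a per-request parameter, for example:

```python
import requests

# Hypothetical endpoint and payload; substitute whatever the extractor actually uses.
response = requests.post(
    "http://localhost:8000/v1/completions",
    json={"prompt": "...", "max_tokens": 1024},
    timeout=300,   # seconds; raise this if large batches legitimately need more time
)
```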
### Issue: "Memory usage growing"
**Cause**: Checkpoint file or extraction cache getting large
**Solution**:
- Reduce batch size: `--batch-size 50`
- Clean up old checkpoint files
- Restart the process (safe; the checkpoint lets it resume where it left off)
### Issue: "Rate is slower than expected"
**Cause**: LLM slower than benchmarked
**Solution**:
- This is normal; benchmark numbers tend to be optimistic
- A 1.2-1.5x speedup is realistic in practice
- Still faster than the sequential approach
---
## Configuration Recommendations
### Recommended Settings
| Profile | `--batch-size` | `--llm-batch` | `--parallel` | Expected speedup | Risk |
|---|---|---|---|---|---|
| Default (safe, proven) | 100 | 20 | 1 | 1.5x | Very low |
| Aggressive (experimental) | 100 | 40 | 1 | 2.0x | Medium |
| Conservative (extra safe) | 50 | 15 | 1 | 1.3x | Very low |
### Don't Use These
```
❌ DON'T: --parallel > 1
   Reason: Local LLM is serial; parallelism adds overhead

❌ DON'T: --llm-batch > 50
   Reason: LLM struggles with large prompts; JSON errors increase

❌ DON'T: --batch-size > 200
   Reason: Checkpoints become large; recovery is slower
```
---
## Files Created
```
ingestion/
├── agent_genesis_comprehensive.py   ← Current (slow, sequential)
├── agent_genesis_optimized.py       ← NEW (batched, 1.5x faster)
├── benchmark_extraction.py          ← Test suite for validating the optimization
├── analyze_benchmark.py             ← Analysis of test results
├── OPTIMIZATION_SUMMARY.md          ← Detailed technical explanation
├── OPTIMIZATION_RESULTS.md          ← Test results & projections
├── DEPLOYMENT_GUIDE.md              ← This file
└── optimized_checkpoint.json        ← NEW checkpoint (created when deployed)
```
---
## Summary
### What Was Discovered
- Sequential LLM calls are the bottleneck (0.4-0.5s per conversation)
- Batching 20 conversations per LLM call cuts per-conversation time by roughly 30% (see the arithmetic sketch below)
- Parallelism doesn't help (local LLM is serial internally)
- Smart filtering adds marginal benefit (1.3x on filtered set)
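The ~30% figure follows from amortizing the fixed per-call cost (prompt setup, round trip) over 20 conversations. The overhead split below is illustrative, not a measured value:

```python
per_conversation = 0.45   # seconds per sequential LLM call (midpoint of 0.4-0.5s)
fixed_overhead = 0.15     # assumed fixed cost per call; the rest is actual generation
llm_batch = 20

batched = (per_conversation - fixed_overhead) + fixed_overhead / llm_batch
print(f"{batched:.3f}s/conversation, "
      f"{(1 - batched / per_conversation):.0%} less time, "
      f"{per_conversation / batched:.2f}x speedup")
```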
### What Was Built
- Optimized extraction with batching: **1.5x faster (proven)**
- Benchmark suite for testing strategies
- Analysis tools to understand performance
- Documentation for deployment
### Next Steps
1. **NOW**: Review this guide
2. **5 min**: Run validation test (optional)
3. **1 min**: Deploy optimized version
4. **5-7 hours**: Wait for Phase 2 to complete
5. **Tomorrow**: Enjoy faster knowledge base updates!
---
## Questions?
If something goes wrong:
1. Check `ingestion/optimized_checkpoint.json` for current progress
2. Review the output logs for errors
3. Refer to "Troubleshooting" section above
4. The checkpoint system means you can always resume
**Good luck!**