CodeGraph CLI MCP Server
by Jakedismo

hotpath_optimization.md
## Hot Path Optimization Report

Role: hotpath_optimization
Cluster: cpu_optimization
Specialization: Critical path optimization

### Phase 0: Development Context

- Feature: Vector search hot path optimization (query-time + index build clustering)
- Technical Stack: `rust`, `tokio`, `rayon`, `dashmap`, `parking_lot`, optional `faiss`, `ndarray`, `criterion`
- Constraints:
  - Keep interfaces stable (public API unchanged)
  - No heavy new dependencies; avoid nightly
  - Optimize the CPU-critical 20% of code paths first
  - Maintain thread-safety and async correctness
- Success Criteria:
  - Reduced CPU time along hot paths (distance calculation, ranking, mapping)
  - No regressions in result correctness
  - Minimal memory overhead; no extra allocations on the hot path

### Phase 1: Library Insights (Context7)

- Tokio: avoid blocking in async contexts; move CPU-heavy tasks to `spawn_blocking` (already done for FAISS search). Prefer the work-stealing scheduler defaults. Use `Arc` and lock-free patterns where possible.
- Rayon: beneficial for large parallel workloads; small collections should stay sequential to avoid scheduling overhead.
- DashMap/parking_lot: use short-lived guard scopes; read-mostly patterns with `RwLock` are preferred.
- FAISS: heavy vector ops run natively; optimize the wrapper overhead (allocation, mapping, copies) around those calls.

### Phase 2: Current Best Practices (summary)

- Prefer simple index loops over iterator adapters on hot math paths.
- Use `mul_add` where appropriate for tighter FP loops.
- Avoid unnecessary `sqrt` in comparisons (use squared distances) to reduce FLOPs.
- Pre-allocate buffers with `with_capacity` for known sizes.
- Use parallelism only above a size threshold to avoid overhead.

### Phase 5: TDD Setup (targeted)

- Criterion benches are already present in `codegraph-vector/benches/knn_benchmark.rs`.
- Planned micro-bench: distance kernel performance (optional) to isolate the changes.

### Phase 6–7: Core Implementation & Optimization (Completed)

Applied optimizations in `crates/codegraph-vector/src/knn.rs`:

- Distance kernel:
  - Replaced the iterator-based Euclidean distance with an unrolled index loop using `mul_add` and `#[inline(always)]` (see the kernel sketch after the benchmarking plan below).
  - Added `squared_distance` and switched k-means assignment to squared distances (removes `sqrt()` from the hot inner loop).
- Result mapping: pre-allocated `node_results` with `Vec::with_capacity(k)`.
- Context ranking: sequential fast path for small `k` (<= 64) to avoid Rayon overhead; the parallel path is retained for larger batches (see the threshold sketch below).
- Vector prep: pre-allocated vector and mapping capacities based on expected sizes.
- Marked `calculate_context_score` as `#[inline]`.

All changes are allocation- and branch-prediction-friendly and keep external behavior identical.

### Phase 8–9: Integration & Security

- No API changes; thread-safety preserved. No new external dependencies added.

### Phase 13: Benchmarking Plan (to run locally/CI)

Run with FAISS enabled where available:

- Build: `cargo build --release -p codegraph-vector --features faiss`
- Bench (subset): `cargo bench -p codegraph-vector --bench knn_benchmark --features faiss -- --warm-up-time 2 --measurement-time 5`
- Optional CPU profiles:
  - Linux: `cargo flamegraph -p codegraph-vector --bench knn_benchmark --features faiss`
  - macOS: `cargo instruments -t time -- cargo bench -p codegraph-vector --bench knn_benchmark --features faiss`

Focus metrics:

- Query-time CPU: FAISS call wrapper, result mapping, contextual ranking.
- Index-build CPU: clustering assignment loop (distance function).
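For reference, a minimal sketch of the kind of unrolled, `mul_add`-based squared-distance kernel described in Phase 6–7. The function names and the unroll factor are illustrative, not the exact internals of `knn.rs`:

```rust
/// Squared Euclidean distance over two equal-length slices.
/// Unrolled by four, accumulating with `f32::mul_add`; no allocation, no `sqrt`.
#[inline(always)]
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    debug_assert_eq!(a.len(), b.len());
    let len = a.len();
    let chunks = len - (len % 4);
    let (mut s0, mut s1, mut s2, mut s3) = (0.0f32, 0.0f32, 0.0f32, 0.0f32);
    let mut i = 0;
    while i < chunks {
        let d0 = a[i] - b[i];
        let d1 = a[i + 1] - b[i + 1];
        let d2 = a[i + 2] - b[i + 2];
        let d3 = a[i + 3] - b[i + 3];
        s0 = d0.mul_add(d0, s0);
        s1 = d1.mul_add(d1, s1);
        s2 = d2.mul_add(d2, s2);
        s3 = d3.mul_add(d3, s3);
        i += 4;
    }
    // Scalar remainder for lengths not divisible by four.
    while i < len {
        let d = a[i] - b[i];
        s0 = d.mul_add(d, s0);
        i += 1;
    }
    s0 + s1 + s2 + s3
}

/// Take the square root only where a true Euclidean distance is required;
/// ordering comparisons (k-means assignment, top-k) can stay on squared values.
#[inline(always)]
fn euclidean_distance(a: &[f32], b: &[f32]) -> f32 {
    squared_distance(a, b).sqrt()
}
```

Because squaring is monotonic for non-negative distances, comparisons on squared values preserve ordering, so the `sqrt` can be deferred to the few distances that are actually reported.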
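Likewise, a sketch of the size-threshold pattern behind the context-ranking fast path; `rank_candidates`, `context_score`, and the candidate layout are hypothetical stand-ins for the real ranking code:

```rust
use rayon::prelude::*;

/// Batches at or below this size are ranked sequentially; Rayon's fork/join
/// overhead only pays off for larger candidate sets. The value mirrors the
/// small-`k` fast path described in Phase 6–7.
const PARALLEL_THRESHOLD: usize = 64;

fn rank_candidates(candidates: &[(u64, f32)]) -> Vec<f32> {
    if candidates.len() <= PARALLEL_THRESHOLD {
        // Sequential fast path: no task scheduling for small k.
        candidates.iter().map(|&(id, dist)| context_score(id, dist)).collect()
    } else {
        // Parallel path retained for large batches.
        candidates.par_iter().map(|&(id, dist)| context_score(id, dist)).collect()
    }
}

fn context_score(_node_id: u64, dist: f32) -> f32 {
    // Stand-in for the real contextual scoring logic.
    1.0 / (1.0 + dist)
}
```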
### Notes

- If FAISS is unavailable, validate distance and ranking micro-benchmarks in isolation (unit benches) to confirm kernel speedups; a minimal Criterion sketch follows below.
- No behavioral changes expected; only performance improvements.
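As a possible starting point for that FAISS-free validation, a self-contained Criterion bench sketch; the bench name, the 768-dimensional test vectors, and the reference kernel are illustrative only:

```rust
use criterion::{black_box, criterion_group, criterion_main, Criterion};

// Reference kernel so the bench compiles standalone; in the real crate this
// would call the optimized `squared_distance` from `knn.rs`.
fn squared_distance(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y) * (x - y)).sum()
}

fn bench_distance_kernel(c: &mut Criterion) {
    // Fixed-size vectors as a stand-in for typical embedding dimensionality.
    let a: Vec<f32> = (0..768).map(|i| i as f32 * 0.001).collect();
    let b: Vec<f32> = (0..768).map(|i| (i as f32 * 0.002).sin()).collect();
    c.bench_function("squared_distance_768d", |bencher| {
        bencher.iter(|| squared_distance(black_box(a.as_slice()), black_box(b.as_slice())))
    });
}

criterion_group!(benches, bench_distance_kernel);
criterion_main!(benches);
```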
