# Mimir Performance Benchmarks
Comprehensive benchmark suite for measuring Mimir's performance across three core areas:
1. **File Indexing Pipeline** - Chunking, embedding generation, and Neo4j writes
2. **Vector Search** - Semantic search latency and accuracy
3. **Neo4j Graph Queries** - Relationship traversal and graph operations
## Quick Start
```bash
# Run all benchmarks
npm run bench
# Run specific benchmark suite
npm run bench -- mimir-performance
# Run with custom iterations
npm run bench -- --iterations 1000
# Generate JSON output for analysis
npm run bench -- --reporter=json > results/benchmark-$(date +%Y%m%d).json
```
## Prerequisites
- **Neo4j running** on `localhost:7687` (or set `NEO4J_URI`)
- **Neo4j credentials** set in the environment:

  ```bash
  export NEO4J_USER=neo4j
  export NEO4J_PASSWORD=password
  ```

- **Sufficient memory** for test data (the suite creates 1000 nodes plus relationships)
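The suite reads these variables when it connects. A minimal connection sketch, assuming the standard `neo4j-driver` package and the defaults above (the suite's actual setup code may differ):

```typescript
import neo4j from 'neo4j-driver';

// Fall back to the documented defaults when the variables are unset.
const driver = neo4j.driver(
  process.env.NEO4J_URI ?? 'neo4j://localhost:7687',
  neo4j.auth.basic(
    process.env.NEO4J_USER ?? 'neo4j',
    process.env.NEO4J_PASSWORD ?? 'password',
  ),
);

// Fail fast if Neo4j is unreachable before any benchmark runs.
await driver.verifyConnectivity();
```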
## Benchmark Categories
### 1. File Indexing Pipeline
Measures end-to-end file indexing performance:
| Benchmark | File Size | Iterations | What It Tests |
|-----------|-----------|------------|---------------|
| Small file | 5 KB | 50 | Fast indexing of small files |
| Medium file | 50 KB | 20 | Typical source file indexing |
| Large file | 500 KB | 10 | Large documentation/data files |
| Batch index 100 files | 5 KB each | 5 | Concurrent indexing throughput |
**Metrics:**
- Total indexing time (ms)
- Chunking overhead
- Embedding generation time (mocked)
- Neo4j write latency
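Each row above maps to a `bench()` call of roughly the following shape. This is a minimal sketch: the `indexFile` import is a hypothetical stand-in for Mimir's actual pipeline entry point, not taken from the source.

```typescript
import { bench, describe } from 'vitest';
// Hypothetical import: the real module path and function name may differ.
import { indexFile } from '../src/indexing';

describe('File Indexing Pipeline', () => {
  // ~5 KB of synthetic markdown, built once and reused across iterations.
  const smallFile = '# Heading\n' + 'lorem ipsum '.repeat(420);

  bench('Index small file (5KB)', async () => {
    await indexFile('fixtures/small.md', smallFile);
  }, { iterations: 50, warmupIterations: 5 });
});
```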
### 2. Vector Search Performance
Tests semantic search across different result sizes:
| Benchmark | Top-K | Iterations | What It Tests |
|-----------|-------|------------|---------------|
| Vector search | 10 | 100 | Fast retrieval for small result sets |
| Vector search | 25 | 100 | Standard search result size |
| Vector search | 50 | 100 | Large result set performance |
| Hybrid search | 25 | 50 | Combined vector + full-text (RRF) |
**Metrics:**
- Query latency (ms)
- Cosine similarity computation time
- Full-text search overhead (hybrid)
- Reciprocal Rank Fusion time
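The hybrid benchmark's fusion step follows the standard Reciprocal Rank Fusion formula: each document scores the sum of `1 / (k + rank)` over the rankings it appears in, with `k` conventionally set to 60. A self-contained sketch of that step (illustrating the standard formula, not copied from Mimir's source):

```typescript
// Fuse several ranked ID lists with Reciprocal Rank Fusion.
// k = 60 is the conventional smoothing constant; larger k flattens rank differences.
function reciprocalRankFusion(rankings: string[][], k = 60): Map<string, number> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1; // ranks are 1-based
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return scores;
}

// Placeholder rankings standing in for vector and full-text results.
const vectorIds = ['doc-a', 'doc-b', 'doc-c'];
const fullTextIds = ['doc-b', 'doc-c', 'doc-d'];
const topK = [...reciprocalRankFusion([vectorIds, fullTextIds]).entries()]
  .sort((a, b) => b[1] - a[1])
  .slice(0, 25); // top-K after fusion
```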
### 3. Neo4j Graph Query Performance
Benchmarks common graph operations:
| Benchmark | Query Type | Iterations | What It Tests |
|-----------|------------|------------|---------------|
| Node lookup by ID | Point query | 1000 | Index performance |
| Node lookup by property | Filtered scan | 500 | Property index efficiency |
| Traversal depth 1 | Single hop | 200 | Direct relationship queries |
| Traversal depth 2 | Two hops | 100 | Multi-hop traversal |
| Traversal depth 3 | Three hops | 50 | Deep graph exploration |
| Subgraph extraction | APOC path | 50 | Complex subgraph queries |
| Batch create 100 nodes | Write batch | 50 | Write throughput |
| Batch create 1000 nodes | Large write | 10 | Large batch performance |
| Batch create 100 edges | Relationship writes | 50 | Edge creation speed |
| Complex aggregation | Analytics | 100 | Aggregation performance |
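As a concrete reference for the traversal rows, a depth-2 benchmark might look like the sketch below, assuming the standard `neo4j-driver` session API; the `:Node` label, `:RELATES_TO` relationship type, and `id` property are illustrative, not necessarily the suite's actual schema.

```typescript
import { bench, describe } from 'vitest';
import neo4j from 'neo4j-driver';

const driver = neo4j.driver(
  process.env.NEO4J_URI ?? 'neo4j://localhost:7687',
  neo4j.auth.basic(process.env.NEO4J_USER ?? 'neo4j', process.env.NEO4J_PASSWORD ?? 'password'),
);

describe('Neo4j Graph Queries', () => {
  bench('Traversal depth 2', async () => {
    const session = driver.session();
    try {
      // Variable-length pattern bounded at two hops from a known start node.
      await session.run(
        'MATCH (start:Node {id: $id})-[:RELATES_TO*1..2]->(n) RETURN DISTINCT n.id',
        { id: 'node-1' },
      );
    } finally {
      await session.close();
    }
  }, { iterations: 100, warmupIterations: 10 });
});
```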
## Interpreting Results
### Vitest Benchmark Output
```
 ✓ testing/benchmarks/mimir-performance.bench.ts (3) 45231ms
   ✓ File Indexing Pipeline (4) 12453ms
     name                        hz      min     max     mean    p75     p99     p995    p999    rme     samples
   · Index small file (5KB)      8.2341  115.32  128.45  121.47  123.12  127.89  128.21  128.45  ±1.23%  50
   · Index medium file (50KB)    2.1234  465.23  489.12  471.02  475.34  487.23  488.45  489.12  ±2.11%  20
```
**Key Metrics:**
- **hz (ops/sec)**: Operations per second; higher is better
- **mean (ms)**: Average latency; lower is better
- **p99 (ms)**: 99th-percentile latency; lower is better
- **rme (%)**: Relative margin of error; lower is better (<5% is ideal)

Note that `hz` is just the reciprocal of the mean latency (`hz ≈ 1000 / mean` when mean is in ms), so the two always move together.
### Performance Targets (M1 Max, 32GB RAM)
| Category | Operation | Target | Good | Needs Improvement |
|----------|-----------|--------|------|-------------------|
| Indexing | Small file (5KB) | <150ms | <200ms | >300ms |
| Indexing | Medium file (50KB) | <500ms | <800ms | >1200ms |
| Vector Search | Top 10 | <50ms | <100ms | >200ms |
| Vector Search | Top 50 | <150ms | <250ms | >500ms |
| Graph Query | Depth 1 | <10ms | <25ms | >50ms |
| Graph Query | Depth 3 | <100ms | <200ms | >500ms |
| Batch Write | 100 nodes | <50ms | <100ms | >200ms |
### Baseline Results (Example)
```
Platform: macOS 14.6.0, M1 Max, 32GB RAM
Neo4j:    5.x Community Edition
Node.js:  v20.x

File Indexing:
  - Small (5KB):   121ms avg (8.2 ops/sec)
  - Medium (50KB): 471ms avg (2.1 ops/sec)
  - Large (500KB): 4.2s avg  (0.24 ops/sec)

Vector Search:
  - Top 10: 45ms avg  (22 ops/sec)
  - Top 25: 89ms avg  (11 ops/sec)
  - Top 50: 142ms avg (7 ops/sec)

Graph Queries:
  - Depth 1: 8ms avg  (125 ops/sec)
  - Depth 2: 34ms avg (29 ops/sec)
  - Depth 3: 98ms avg (10 ops/sec)
```
## Running in CI/CD
### GitHub Actions Example
```yaml
name: Performance Benchmarks

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  benchmark:
    runs-on: ubuntu-latest

    services:
      neo4j:
        image: neo4j:5-community
        env:
          NEO4J_AUTH: neo4j/password
        ports:
          - 7687:7687
        options: >-
          --health-cmd "cypher-shell -u neo4j -p password 'RETURN 1'"
          --health-interval 10s
          --health-timeout 5s
          --health-retries 5

    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Run benchmarks
        env:
          NEO4J_URI: neo4j://localhost:7687
          NEO4J_USER: neo4j
          NEO4J_PASSWORD: password
        run: |
          npm run bench -- --reporter=json > benchmark-results.json

      - name: Upload results
        uses: actions/upload-artifact@v4
        with:
          name: benchmark-results
          path: benchmark-results.json

      - name: Comment PR with results
        if: github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const results = JSON.parse(fs.readFileSync('benchmark-results.json', 'utf8'));
            // Format and post results as PR comment
```
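The `script` body in the last step is left as a placeholder above. One way to fill it in, as a sketch: post the raw JSON inside a collapsible block rather than parsing it, since the report schema varies by Vitest version (`github` and `context` are injected by `actions/github-script`).

```js
// Sketch of a github-script step body; adapt formatting to your report schema.
const fs = require('fs');
const raw = fs.readFileSync('benchmark-results.json', 'utf8');
const body = [
  '### Benchmark results',
  '<details><summary>Raw JSON</summary>',
  '',
  '```json',
  raw,
  '```',
  '</details>',
].join('\n');
await github.rest.issues.createComment({
  owner: context.repo.owner,
  repo: context.repo.repo,
  issue_number: context.issue.number,
  body,
});
```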
## Comparing Results
### Track Performance Over Time
```bash
# Run benchmarks and save with timestamp
npm run bench -- --reporter=json > results/bench-$(date +%Y%m%d-%H%M%S).json
# Compare two runs
node scripts/compare-benchmarks.js results/bench-20250101.json results/bench-20250115.json
```
### Regression Detection
Set performance thresholds in CI:
```bash
# Fail if any benchmark regresses by >20%
npm run bench:check-regression -- --threshold 0.20
```
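Neither `scripts/compare-benchmarks.js` nor `bench:check-regression` is shown in this document, so here is a minimal sketch of the threshold check. It assumes each results file maps benchmark names to an object with a `mean` latency in ms; adapt the parsing to the reporter's real JSON shape.

```typescript
import { readFileSync } from 'node:fs';

// Assumed shape: { "benchmark name": { "mean": <ms> }, ... }
type Results = Record<string, { mean: number }>;

// Exit non-zero if any benchmark's mean latency grew by more than `threshold`
// (e.g. 0.20 = 20%), so CI can fail the build on a regression.
function checkRegression(baselinePath: string, currentPath: string, threshold: number): void {
  const baseline: Results = JSON.parse(readFileSync(baselinePath, 'utf8'));
  const current: Results = JSON.parse(readFileSync(currentPath, 'utf8'));
  const regressions: string[] = [];
  for (const [name, base] of Object.entries(baseline)) {
    const now = current[name];
    if (!now) continue; // benchmark was removed or renamed
    const delta = (now.mean - base.mean) / base.mean;
    if (delta > threshold) regressions.push(`${name}: +${(delta * 100).toFixed(1)}%`);
  }
  if (regressions.length > 0) {
    console.error('Regressions detected:\n' + regressions.join('\n'));
    process.exit(1);
  }
}

const [, , baselinePath, currentPath, threshold] = process.argv;
if (!baselinePath || !currentPath) {
  console.error('usage: check-regression <baseline.json> <current.json> [threshold]');
  process.exit(1);
}
checkRegression(baselinePath, currentPath, Number(threshold ?? 0.2));
```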
## Troubleshooting
### Neo4j Connection Issues
```bash
# Check Neo4j is running
docker ps | grep neo4j
# Test connection
cypher-shell -u neo4j -p password "RETURN 1"
# Check environment variables
echo $NEO4J_URI
echo $NEO4J_USER
```
### Out of Memory
If benchmarks fail with OOM:
```bash
# Increase Node.js heap size
NODE_OPTIONS="--max-old-space-size=8192" npm run bench
# Reduce test data size (edit mimir-performance.bench.ts)
# Change: UNWIND range(1, 1000) -> UNWIND range(1, 100)
```
### Slow Benchmarks
Reduce iterations for faster runs:
```bash
# Quick smoke test (10 iterations each)
npm run bench -- --iterations 10
# Skip slow benchmarks
npm run bench -- --exclude "Batch index 100 files"
```
## Adding New Benchmarks
```typescript
// In mimir-performance.bench.ts
import { bench, describe } from 'vitest';

describe('My New Benchmark Category', () => {
  bench('My specific test', async () => {
    // Your benchmark code here
    await myFunction();
  }, {
    iterations: 100,      // number of measured runs
    warmupIterations: 10, // warmup runs (not counted in results)
  });
});
```
## Best Practices
1. **Isolate benchmarks** - Each should be independent
2. **Use realistic data** - Match production workloads
3. **Warm up** - First few iterations are slower (JIT compilation)
4. **Consistent environment** - Close other apps, use same hardware
5. **Multiple runs** - Run 3-5 times, take median
6. **Document changes** - Note any config/hardware changes
## Publishing Results
### Generate Markdown Report
```bash
npm run bench:report
```
Creates `results/BENCHMARK_REPORT.md` with formatted tables.
### Share Results
Commit results to `docs/benchmarks/`:
```
docs/benchmarks/
├── 2025-01-15-m1-max.md
├── 2025-01-15-intel-i9.md
└── baseline-results.md
```
## Resources
- [Vitest Benchmark API](https://vitest.dev/api/bench.html)
- [Neo4j Performance Tuning](https://neo4j.com/docs/operations-manual/current/performance/)
- [Node.js Profiling Guide](https://nodejs.org/en/docs/guides/simple-profiling/)
---
**Last Updated:** 2025-01-15
**Maintainer:** Mimir Development Team