Skip to main content
Glama
orneryd

M.I.M.I.R - Multi-agent Intelligent Memory & Insight Repository

by orneryd
README.md3.05 kB
All k-means operations now have structured logging with the `[K-MEANS]` prefix: | Log | Description | |-----|-------------| | `[K-MEANS] ✅ ENABLED` | Clustering enabled with mode/clusters/init | | `[K-MEANS] 🔄 STARTING` | Clustering starting with embedding count | | `[K-MEANS] ✅ COMPLETE` | Clustering complete with stats + duration | | `[K-MEANS] ⏭️ SKIPPED` | Skipped (too few embeddings or not enabled) | | `[K-MEANS] ❌ FAILED` | Failed with error | | `[K-MEANS] 🔍 SEARCH` | Search executed with mode + timing | ## 2. Test Data Generator Tool New tool at `cmd/kmeans-test-data/main.go`: ```bash # Generate 5000 embeddings with 20 natural clusters and save to file go run cmd/kmeans-test-data/main.go -mode clusters -count 5000 -clusters 20 # Import directly into NornicDB go run cmd/kmeans-test-data/main.go -mode clusters -count 5000 -db ./data/kmeans-test # Generate larger dataset for stress testing go run cmd/kmeans-test-data/main.go -mode download -download large-text -db ./data/stress-test ``` **Modes:** - `synthetic` - Random uniformly distributed embeddings - `clusters` - Embeddings with natural cluster structure (best for k-means testing) - `download` - Pre-defined datasets (sift-small, glove-25, text-1024, large-text) **Features:** - Idempotent (same seed = same data) - Ground truth cluster labels for validation - Statistics reporting (cluster sizes, norms, memory) - Direct import to NornicDB or JSON export ## Full Test Flow ```bash # 1. Generate test data with clusters go run cmd/kmeans-test-data/main.go -mode clusters -count 5000 -clusters 50 -db ./data/kmeans-test # 2. Enable clustering and start NornicDB export NORNICDB_KMEANS_CLUSTERING_ENABLED=true go run cmd/nornicdb/main.go -data ./data/kmeans-test # 3. Watch logs for k-means activity # [K-MEANS] ✅ Clustering ENABLED | mode=CPU clusters=100 ... # [K-MEANS] 🔄 STARTING | embeddings=5000 # [K-MEANS] ✅ COMPLETE | clusters=100 embeddings=5000 iterations=12 duration=234ms ``` ## K-Means Clustering Integration ### How to Enable ```bash export NORNICDB_KMEANS_CLUSTERING_ENABLED=true ``` ### What Happens | Stage | Action | |-------|--------| | **Startup** | If flag enabled, clustering initialized (CPU mode) | | **GPU Available** | Upgrades to GPU-accelerated clustering | | **Index Build** | After indexes built → triggers clustering | | **Embed Queue Empty** | After batch embedding completes → auto-triggers clustering | | **Search** | Uses cluster-accelerated search when active (10-50x faster) | ### Smart Behavior - **Minimum threshold**: Only clusters when 1000+ embeddings (below this, brute-force is faster) - **Fire-and-forget**: Clustering runs in background, doesn't block embedding worker - **Auto-upgrade**: Starts with CPU, upgrades to GPU if available later ### New Files Modified - `pkg/search/search.go` - Added clustering methods and cluster-accelerated search - `pkg/nornicdb/embed_queue.go` - Added `onQueueEmpty` callback - `pkg/nornicdb/db.go` - Wired everything together with feature flag

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/orneryd/Mimir'

If you have feedback or need assistance with the MCP directory API, please join our Discord server