M.I.M.I.R - Multi-agent Intelligent Memory & Insight Repository

by orneryd
gpu-acceleration.md (4.17 kB)
# GPU Acceleration

**10-100x speedup for vector operations with GPU acceleration.**

## Overview

NornicDB leverages GPU acceleration for:

- Vector similarity search
- K-means clustering
- Embedding generation
- Matrix operations

## Supported Backends

| Backend | Platform | Performance |
|---------|----------|-------------|
| Metal | macOS (Apple Silicon) | Excellent |
| CUDA | NVIDIA GPUs | Excellent |
| OpenCL | AMD/Intel GPUs | Good |
| Vulkan | Cross-platform | Good |
| CPU | Fallback | Baseline |

## Automatic Detection

GPU acceleration is automatically detected and enabled:

```
🟢 GPU Acceleration: Available (backend: metal)
```

If no GPU is available:

```
🟡 GPU Acceleration: Unavailable (using CPU fallback)
```

## Configuration

### Environment Variables

```bash
# Force a specific backend
export NORNICDB_GPU_BACKEND=metal  # metal, cuda, opencl, vulkan, cpu

# Disable GPU (force CPU)
export NORNICDB_GPU_BACKEND=cpu
```

### Docker with GPU

**Apple Silicon (Metal):**

```yaml
services:
  nornicdb:
    image: timothyswt/nornicdb-arm64-metal:latest
    # Metal is automatically enabled
```

**NVIDIA (CUDA):**

```yaml
services:
  nornicdb:
    image: timothyswt/nornicdb-amd64-cuda:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

## Performance Benchmarks

### Vector Search (1M vectors, 1024 dimensions)

| Backend | Latency | Throughput |
|---------|---------|------------|
| Metal (M2) | 2ms | 500 qps |
| CUDA (A100) | 1ms | 1000 qps |
| CPU (AVX2) | 50ms | 20 qps |

### K-Means Clustering (100K points)

| Backend | Time | Speedup |
|---------|------|---------|
| Metal | 120ms | 50x |
| CUDA | 80ms | 75x |
| CPU | 6000ms | 1x |

## GPU Memory Management

### Memory Limits

```go
// GPU manager respects memory limits
gpuManager := gpu.NewManager(&gpu.Config{
	Backend:     gpu.BackendMetal,
	MaxMemoryMB: 2048, // Limit to 2GB
	BatchSize:   1000, // Process in batches
})
```

### Automatic Batching

For large datasets, operations are automatically batched:

```go
// Search 10M vectors in batches
results, err := searchService.VectorSearch(ctx, query, 10000000, 10)
// Automatically batched to fit GPU memory
```

## Supported Operations

### Vector Search

```go
// GPU-accelerated cosine similarity search
results, err := db.HybridSearch(ctx,
	"query text",
	queryEmbedding, // GPU computes similarity
	nil,
	100,
)
```

### K-Means Clustering

```go
// GPU-accelerated clustering
clusterer := kmeans.NewGPUClusterer(&kmeans.Config{
	K:         10,
	MaxIters:  100,
	Tolerance: 0.0001,
	UseGPU:    true,
})
centroids, assignments, err := clusterer.Cluster(vectors)
```

### Embedding Generation

```go
// GPU-accelerated local embeddings
embedder := embed.NewLocalEmbedder(&embed.Config{
	ModelPath: "/models/mxbai-embed-large.gguf",
	GPULayers: -1, // Auto-detect
})
```

## Troubleshooting

### GPU Not Detected

```bash
# Check GPU status
curl http://localhost:7474/status | jq .gpu

# Force CPU fallback
export NORNICDB_GPU_BACKEND=cpu
```

### Out of Memory

```bash
# Reduce batch size
export NORNICDB_GPU_BATCH_SIZE=500

# Limit GPU memory
export NORNICDB_GPU_MAX_MEMORY_MB=1024
```

### CUDA Errors

```bash
# Check NVIDIA driver
nvidia-smi

# Check CUDA version
nvcc --version

# Ensure the container has GPU access
docker run --gpus all nvidia/cuda:11.8-base nvidia-smi
```

### Metal Errors (macOS)

```bash
# Check Metal support
system_profiler SPDisplaysDataType | grep Metal

# Ensure running on Apple Silicon
uname -m  # Should show arm64
```

## Fallback Behavior

If GPU acceleration fails, NornicDB automatically falls back to CPU:

```go
// Automatic fallback
result, err := gpuManager.VectorSearch(vectors, query, k)
if err == gpu.ErrGPUUnavailable {
	// Falls back to CPU automatically
	result = cpuSearch(vectors, query, k)
}
```

## See Also

- **[Vector Search](../user-guides/vector-search.md)** - Search guide
- **[Memory Decay](memory-decay.md)** - Time-based scoring
- **[Performance](../performance/)** - Benchmarks
