M.I.M.I.R - Multi-agent Intelligent Memory & Insight Repository

by orneryd
gpu-acceleration.md (4.17 kB)
# GPU Acceleration

**10-100x speedup for vector operations with GPU acceleration.**

## Overview

NornicDB leverages GPU acceleration for:

- Vector similarity search
- K-means clustering
- Embedding generation
- Matrix operations

## Supported Backends

| Backend | Platform | Performance |
|---------|----------|-------------|
| Metal | macOS (Apple Silicon) | Excellent |
| CUDA | NVIDIA GPUs | Excellent |
| OpenCL | AMD/Intel GPUs | Good |
| Vulkan | Cross-platform | Good |
| CPU | Fallback | Baseline |

## Automatic Detection

GPU acceleration is automatically detected and enabled:

```
🟢 GPU Acceleration: Available (backend: metal)
```

If no GPU is available:

```
🟡 GPU Acceleration: Unavailable (using CPU fallback)
```

## Configuration

### Environment Variables

```bash
# Force a specific backend
export NORNICDB_GPU_BACKEND=metal  # metal, cuda, opencl, vulkan, cpu

# Disable GPU (force CPU)
export NORNICDB_GPU_BACKEND=cpu
```

### Docker with GPU

**Apple Silicon (Metal):**

```yaml
services:
  nornicdb:
    image: timothyswt/nornicdb-arm64-metal:latest
    # Metal is automatically enabled
```

**NVIDIA (CUDA):**

```yaml
services:
  nornicdb:
    image: timothyswt/nornicdb-amd64-cuda:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

## Performance Benchmarks

### Vector Search (1M vectors, 1024 dimensions)

| Backend | Latency | Throughput |
|---------|---------|------------|
| Metal (M2) | 2ms | 500 qps |
| CUDA (A100) | 1ms | 1000 qps |
| CPU (AVX2) | 50ms | 20 qps |

### K-Means Clustering (100K points)

| Backend | Time | Speedup |
|---------|------|---------|
| Metal | 120ms | 50x |
| CUDA | 80ms | 75x |
| CPU | 6000ms | 1x |

## GPU Memory Management

### Memory Limits

```go
// GPU manager respects memory limits
gpuManager := gpu.NewManager(&gpu.Config{
	Backend:     gpu.BackendMetal,
	MaxMemoryMB: 2048, // Limit to 2GB
	BatchSize:   1000, // Process in batches
})
```

### Automatic Batching

For large datasets, operations are automatically batched:

```go
// Search 10M vectors in batches
results, err := searchService.VectorSearch(ctx, query, 10000000, 10)
// Automatically batched to fit GPU memory
```

## Supported Operations

### Vector Search

```go
// GPU-accelerated cosine similarity search
results, err := db.HybridSearch(ctx,
	"query text",
	queryEmbedding, // GPU computes similarity
	nil,
	100,
)
```

### K-Means Clustering

```go
// GPU-accelerated clustering
clusterer := kmeans.NewGPUClusterer(&kmeans.Config{
	K:         10,
	MaxIters:  100,
	Tolerance: 0.0001,
	UseGPU:    true,
})
centroids, assignments, err := clusterer.Cluster(vectors)
```

### Embedding Generation

```go
// GPU-accelerated local embeddings
embedder := embed.NewLocalEmbedder(&embed.Config{
	ModelPath: "/models/mxbai-embed-large.gguf",
	GPULayers: -1, // Auto-detect
})
```

## Troubleshooting

### GPU Not Detected

```bash
# Check GPU status
curl http://localhost:7474/status | jq .gpu

# Force CPU fallback
export NORNICDB_GPU_BACKEND=cpu
```

### Out of Memory

```bash
# Reduce batch size
export NORNICDB_GPU_BATCH_SIZE=500

# Limit GPU memory
export NORNICDB_GPU_MAX_MEMORY_MB=1024
```

### CUDA Errors

```bash
# Check NVIDIA driver
nvidia-smi

# Check CUDA version
nvcc --version

# Ensure the container has GPU access
docker run --gpus all nvidia/cuda:11.8-base nvidia-smi
```

### Metal Errors (macOS)

```bash
# Check Metal support
system_profiler SPDisplaysDataType | grep Metal

# Ensure running on Apple Silicon
uname -m  # Should show arm64
```

## Fallback Behavior

If GPU acceleration fails, NornicDB automatically falls back to CPU:

```go
// Automatic fallback
result, err := gpuManager.VectorSearch(vectors, query, k)
if err == gpu.ErrGPUUnavailable {
	// Falls back to CPU automatically
	result = cpuSearch(vectors, query, k)
}
```

## See Also

- **[Vector Search](../user-guides/vector-search.md)** - Search guide
- **[Memory Decay](memory-decay.md)** - Time-based scoring
- **[Performance](../performance/)** - Benchmarks
