# llama.cpp Docker Images
This directory contains Dockerfiles and scripts to build llama.cpp server for **multiple platforms and use cases**.
## Available Images
### 1. ARM64 - Embeddings (Apple Silicon)
- **Image:** `timothyswt/llama-cpp-server-arm64:latest`
- **Use Case:** Text embeddings for semantic search (Mac M1/M2/M3)
- **Models:** nomic-embed-text (768d), mxbai-embed-large (1024d)
- **Files:** `Dockerfile`, `Dockerfile.mxbai`
### 2. AMD64 - Embeddings (Windows/Linux x86_64)
- **Image:** `timothyswt/llama-cpp-server-amd64-mxbai:latest`
- **Use Case:** Text embeddings for semantic search (Windows/Linux PC)
- **Models:** mxbai-embed-large (1024d)
- **Files:** `Dockerfile.mxbai` (build with `--platform linux/amd64`)
### 3. AMD64 - Vision Models (Windows/Linux x86_64) ⭐ NEW
- **Image:** `timothyswt/llama-cpp-server-amd64-vision:latest`
- **Use Case:** Image understanding and multimodal AI (Windows/Linux PC)
- **Models:** Llama 3.2 Vision (11B/90B), Qwen2.5-VL (2B/7B)
- **Files:** `Dockerfile.amd64-vision`
- **Docs:** `BUILD_AMD64_VISION.md`
- **Build Script:** `scripts/build-llama-cpp-vision-amd64.ps1`
## Why Custom Builds?
The official `ghcr.io/ggml-org/llama.cpp:server` image claims multi-arch support but actually only provides AMD64. These custom builds provide:
- **Native ARM64** for Apple Silicon Macs (no emulation overhead)
- **Native AMD64** with vision capabilities for Windows/Linux
- **Embedded models** - no external downloads required
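If you want to confirm that a pulled image really is a native build for your platform (rather than an emulated one), Docker can report the image's OS and architecture. A minimal check, using the ARM64 image name from above:

```bash
# Prints the platform the image was built for (expect linux/arm64 for the ARM64 image)
docker image inspect timothyswt/llama-cpp-server-arm64:latest --format '{{.Os}}/{{.Architecture}}'
```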
---
## Quick Start - Embeddings (ARM64)
### Using Pre-built Image
```bash
docker pull timothyswt/llama-cpp-server-arm64:latest
docker run -p 11434:8080 -v "$(pwd)/models:/models" timothyswt/llama-cpp-server-arm64:latest
```
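Once the container is up, a quick sanity check against the health endpoint (assuming the `11434:8080` port mapping used above):

```bash
# Returns an OK/healthy response once the server has finished loading the model
curl http://localhost:11434/health
```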
### Building Locally
```bash
# Build the image
./scripts/build-llama-cpp.sh
# Or manually:
docker build -t timothyswt/llama-cpp-server-arm64:latest -f docker/llama-cpp/Dockerfile .
```
## Features
- **Native ARM64**: Built specifically for Apple Silicon
- **OpenAI-Compatible API**: Drop-in replacement for Ollama
- **Embeddings Support**: Mean pooling for semantic search
- **Lightweight**: ~200MB runtime image
- **No GPU Required**: Optimized for CPU inference
## Usage with Mimir
The `docker-compose.yml` automatically uses this image for the `llama-server` service:
```yaml
llama-server:
  image: timothyswt/llama-cpp-server-arm64:latest
  ports:
    - "11434:8080"
  volumes:
    - ollama_models:/models
```
## API Endpoints
- **Health**: `GET /health`
- **Embeddings**: `POST /v1/embeddings`
- **Models**: `GET /v1/models`
All endpoints follow the OpenAI API format, matching Ollama's OpenAI-compatible `/v1` endpoints.
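For example, a minimal embeddings request (assuming the server was started with the `nomic-embed-text` alias shown under Model Management below):

```bash
# Request an embedding from the OpenAI-compatible endpoint
curl http://localhost:11434/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text",
    "input": "How do I build llama.cpp for ARM64?"
  }'
```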
## Model Management
### Download Models
Models must be in GGUF format (the same format Ollama uses). Place them in the directory mounted at `/models`:
```bash
# If you have Ollama models already:
cp -r ~/.ollama/models ./data/ollama/models
# Or download directly:
curl -L https://huggingface.co/nomic-ai/nomic-embed-text-v1.5-GGUF/resolve/main/nomic-embed-text-v1.5.Q8_0.gguf \
-o ./data/ollama/models/nomic-embed-text.gguf
```
### Specify Model in docker-compose.yml
Uncomment and set the model path in the `command` section:
```yaml
command:
  - "--model"
  - "/models/nomic-embed-text.gguf"
  - "--alias"
  - "nomic-embed-text"
```
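After restarting the service, the alias should appear in the model list served by the `/v1/models` endpoint described above (assuming the default port mapping):

```bash
# The configured alias shows up in the "data" array of the response
curl http://localhost:11434/v1/models
```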
## Performance
ARM64 build performance:
- **Embeddings**: ~50-100ms per request (768-dim)
- **Memory**: ~200MB base + model size
- **CPU**: Utilizes all cores efficiently
## Publishing Updates
```bash
# Build and push new version
./scripts/build-llama-cpp.sh 1.0.1
# Or manually:
docker push timothyswt/llama-cpp-server-arm64:1.0.1
docker push timothyswt/llama-cpp-server-arm64:latest
```
## Troubleshooting
### Image Not Found
```bash
docker pull timothyswt/llama-cpp-server-arm64:latest
```
### Health Check Fails
Wait 30 seconds for server startup, or check logs:
```bash
docker logs llama_server
```
### Model Not Loading
Verify model path and format (must be GGUF):
```bash
docker exec llama_server ls -la /models
```
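GGUF files begin with the ASCII magic bytes `GGUF`, which makes a mislabeled file easy to spot from the host. A quick check, using the download path from Model Management above:

```bash
# Prints "GGUF" for a valid file; anything else means the file is not GGUF
head -c 4 ./data/ollama/models/nomic-embed-text.gguf; echo
```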
## Architecture
```
┌─────────────────────────────────┐
│ Docker Container (ARM64) │
│ │
│ ┌──────────────────────────┐ │
│ │ llama-server binary │ │
│ │ (compiled from source) │ │
│ └──────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────┐ │
│ │ OpenAI-compatible API │ │
│ │ Port: 8080 │ │
│ └──────────────────────────┘ │
│ ↓ │
│ ┌──────────────────────────┐ │
│ │ /models (volume) │ │
│ │ GGUF model files │ │
│ └──────────────────────────┘ │
└─────────────────────────────────┘
↓
Port 11434 (external)
↓
Mimir Server
```
## Quick Reference - Vision Models (AMD64)
### Build Vision Image for Windows
```powershell
# Build with Llama 3.2 Vision models (11B + 90B)
.\scripts\build-llama-cpp-vision-amd64.ps1 -ModelType llama32
# Or build with Qwen2.5-VL models (2B + 7B)
.\scripts\build-llama-cpp-vision-amd64.ps1 -ModelType qwen25
```
### Enable Vision in docker-compose.amd64.yml
```yaml
# Uncomment one or both vision services
llama-server-vision-2b: # Port 8081 - Faster, lower memory
llama-server-vision-7b: # Port 8082 - Higher quality
```
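Once a vision service is enabled and started, the same `/health` endpoint is available on its mapped port (assuming the 2B service on port 8081, as commented above):

```bash
# Returns an OK/healthy response once the vision model has loaded
curl http://localhost:8081/health
```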
### Test Vision API
```powershell
curl.exe http://localhost:8081/v1/chat/completions `
  -H "Content-Type: application/json" `
  -d '{
    "model": "vision-2b",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "What is in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
      ]
    }]
  }'
```
See `BUILD_AMD64_VISION.md` for complete documentation.
## References
- [llama.cpp GitHub](https://github.com/ggerganov/llama.cpp)
- [GGUF Format](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md)
- [OpenAI Embeddings API](https://platform.openai.com/docs/api-reference/embeddings)
- [Llama 3.2 Vision](https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD_VISION.md)
- [Qwen2.5-VL](https://github.com/QwenLM/Qwen2.5-VL)