# AMD64 Vision Models Setup - Complete

## Overview

Successfully built and deployed **two separate AMD64 Docker images** for Qwen2.5-VL vision models:

- **2B Model** (faster, lower memory): `timothyswt/llama-cpp-server-amd64-qwen-vl-2b:latest`
- **7B Model** (higher quality): `timothyswt/llama-cpp-server-amd64-qwen-vl-7b:latest`

## Architecture Decision

- **TWO separate images** (one per model size) instead of a single multi-model image
- Each image contains its specific model.gguf and mmproj.gguf files embedded
- Cleaner architecture, easier to deploy, no command overrides needed

## Model Sources

Downloaded from HuggingFace:

- **2B Model**: [mradermacher/Qwen2.5_VL_2B-GGUF](https://huggingface.co/mradermacher/Qwen2.5_VL_2B-GGUF)
  - `Qwen2.5_VL_2B.Q4_K_M.gguf` (1.84 GB)
  - `Qwen2.5_VL_2B.mmproj-Q8_0.gguf` (808 MB)
- **7B Model**: [unsloth/Qwen2.5-VL-7B-Instruct-GGUF](https://huggingface.co/unsloth/Qwen2.5-VL-7B-Instruct-GGUF)
  - `Qwen2.5-VL-7B-Instruct-Q4_K_M.gguf` (4.47 GB)
  - `mmproj-F16.gguf` (1.29 GB)

## Files Modified/Created

### 1. Docker Configuration

- **`docker/llama-cpp/Dockerfile.amd64-vision`**: single-model-per-image builder driven by the build args `MODEL_FILE` and `MMPROJ_FILE`
- **`docker-compose.amd64.yml`**: two separate services:
  - `llama-server-vision-2b` on port 8081 (8G memory limit, 4G reservation)
  - `llama-server-vision-7b` on port 8082 (commented out; 16G limit when enabled)

### 2. Scripts

- **`scripts/download-vision-models.ps1`**: downloads all 4 GGUF files from HuggingFace
  - Fixed PowerShell syntax errors (try-catch, backticks, string terminators)
- **`scripts/build-amd64-vision.ps1`**: builds the two separate images
  - Loops through the 2B and 7B configs
  - Tests each container after build
  - Optionally pushes to Docker Hub

### 3. Installation Automation

- **`install.sh`**: macOS/Linux automated installer with dependency checking
  - Detects the OS (Linux/Mac/Windows)
  - Checks for git, docker, docker-compose, node
  - Gives platform-specific installation instructions
  - Verifies the Docker daemon is running
- **`install.ps1`**: Windows PowerShell equivalent
  - Same dependency checks
  - Windows-specific installation guidance
  - One-command install: `iwr -useb https://raw.githubusercontent.com/orneryd/Mimir/main/install.ps1 | iex`

### 4. Documentation

- **`docs/index.html`**: updated Quick Start section
  - Added a macOS/Linux vs Windows toggle
  - Copy-pasteable installation commands
  - One-command remote installation for both platforms

## Docker Images Built

```
✅ timothyswt/llama-cpp-server-amd64-qwen-vl-2b:latest
   - Port: 8081
   - Memory: 8G limit / 4G reservation
   - Model: Qwen2.5_VL_2B.Q4_K_M.gguf
   - MMProj: Qwen2.5_VL_2B.mmproj-Q8_0.gguf

✅ timothyswt/llama-cpp-server-amd64-qwen-vl-7b:latest
   - Port: 8082
   - Memory: 16G limit / 8G reservation
   - Model: Qwen2.5-VL-7B-Instruct-Q4_K_M.gguf
   - MMProj: mmproj-F16.gguf
```

## Testing Results

✅ **Vision recognition WORKING**

- Tested the 7B model with a character image; it returned a detailed visual description
- Both containers running and healthy

## Usage

### Download Models (First Time Only)

```powershell
# Windows
cd docker/llama-cpp
../../scripts/download-vision-models.ps1
```

### Build Images

```powershell
# Windows
./scripts/build-amd64-vision.ps1
```

### Deploy with Docker Compose

```bash
# Start the 2B model (default)
docker-compose -f docker-compose.amd64.yml up llama-server-vision-2b

# Start the 7B model (uncomment it in docker-compose.amd64.yml first)
docker-compose -f docker-compose.amd64.yml up llama-server-vision-7b

# Start both models
docker-compose -f docker-compose.amd64.yml up llama-server-vision-2b llama-server-vision-7b
```

### Test Vision API

```bash
# 2B Model
curl http://localhost:8081/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vision",
    "messages": [
      {
        "role": "user",
        "content": [
          {"type": "text", "text": "What do you see in this image?"},
          {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,..."}}
        ]
      }
    ]
  }'

# 7B Model
curl http://localhost:8082/v1/chat/completions ...
```
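The `image_url` in the request above elides the actual base64 payload. A minimal sketch of filling it in from a local file (the `test.jpg` path is just an example, not a file shipped with the repo):

```bash
# Base64-encode a local image (GNU coreutils; on macOS use `base64 -i test.jpg`)
IMG_B64=$(base64 -w0 test.jpg)

# Send it to the 2B server as a data URL, reading the JSON body from stdin
curl http://localhost:8081/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d @- <<EOF
{
  "model": "vision",
  "messages": [
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What do you see in this image?"},
        {"type": "image_url", "image_url": {"url": "data:image/jpeg;base64,${IMG_B64}"}}
      ]
    }
  ]
}
EOF
```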
## One-Command Installation

### macOS / Linux

```bash
curl -fsSL https://raw.githubusercontent.com/orneryd/Mimir/main/install.sh | bash
```

### Windows (PowerShell)

```powershell
iwr -useb https://raw.githubusercontent.com/orneryd/Mimir/main/install.ps1 | iex
```

## Resource Requirements

### 2B Model

- **Memory**: 8 GB RAM recommended (4 GB minimum)
- **Disk**: ~3 GB for model files
- **CPU**: multi-core recommended for vision processing

### 7B Model

- **Memory**: 16 GB RAM recommended (8 GB minimum)
- **Disk**: ~6 GB for model files
- **CPU**: multi-core recommended; benefits from more cores

## Troubleshooting

### PowerShell Execution Policy

If you get execution policy errors:

```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

### Docker Desktop Not Running

```powershell
# Start Docker Desktop manually, then verify:
docker info
```

### HuggingFace Download Issues

```powershell
# Install/upgrade huggingface-hub:
pip install -U huggingface-hub
```

### Out of Memory

- For the 7B model, ensure you have at least 16 GB RAM
- Close other applications
- Consider using the 2B model instead

## Next Steps

1. ✅ Models downloaded and built
2. ✅ Docker images created and tested
3. ✅ docker-compose.amd64.yml updated
4. ✅ Installation scripts created (bash + PowerShell)
5. ✅ Documentation website updated
6. 🔄 Ready for deployment!

## Links

- [HuggingFace Models](https://huggingface.co/models?search=Qwen2.5-VL)
- [Llama.cpp Documentation](https://github.com/ggerganov/llama.cpp)
- [Docker Hub Images](https://hub.docker.com/u/timothyswt)
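## Appendix: Post-Deploy Health Check

The llama.cpp server exposes a `/health` endpoint, so a quick smoke test after deployment (assuming the default ports from `docker-compose.amd64.yml`, with both services running) is a sketch like:

```bash
# Check each vision server before sending real requests (8081 = 2B, 8082 = 7B)
for port in 8081 8082; do
  if curl -fsS "http://localhost:${port}/health" > /dev/null; then
    echo "Server on port ${port} is healthy"
  else
    echo "Server on port ${port} is not responding"
  fi
done
```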
