Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Moondream MCP Server caption this image of my cat sleeping on the couch"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Moondream MCP Server
A FastMCP server for Moondream, an AI vision language model. This server provides image analysis capabilities including captioning, visual question answering, object detection, and visual pointing through the Model Context Protocol (MCP).
Features
Image Captioning: Generate short, normal, or detailed captions for images
Visual Question Answering: Ask natural language questions about images
Object Detection: Detect and locate specific objects with bounding boxes
Visual Pointing: Get precise coordinates of objects in images
URL Support: Process images from both local files and remote URLs
Batch Processing: Analyze multiple images efficiently
Device Optimization: Automatic detection and optimization for CPU, CUDA, and MPS (Apple Silicon)
Installation
Prerequisites
Python 3.10 or higher
PyTorch 2.0+ (with appropriate device support)
Using uvx (Recommended for Claude Desktop)
Install from PyPI
Install from Source
Development Installation
Quick Start
Running the Server
Claude Desktop Integration
Add to your Claude Desktop configuration file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
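As a rough illustration of what goes into that file, the Python sketch below builds an entry in the `mcpServers` layout that Claude Desktop reads. The server label `moondream` and the command name `moondream-mcp` are assumptions made for this sketch and are not confirmed by this document; substitute the command your installation actually provides.

```python
import json


def build_server_entry(command: str, args: list, env: dict) -> dict:
    """Return a Claude Desktop config fragment registering one MCP server."""
    return {
        "mcpServers": {
            # "moondream" is an arbitrary label chosen for this sketch.
            "moondream": {"command": command, "args": args, "env": env}
        }
    }


# Assumption: the package exposes a "moondream-mcp" command runnable via uvx.
entry = build_server_entry(
    command="uvx",
    args=["moondream-mcp"],
    env={"MOONDREAM_DEVICE": "auto"},
)
config_text = json.dumps(entry, indent=2)
print(config_text)
```

The printed JSON can be merged by hand into an existing configuration file; the sketch deliberately does not write to disk.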
Using uvx (Recommended)
Using pip-installed command
Configuration
The server can be configured using environment variables:
Model Settings
MOONDREAM_MODEL_NAME: Model name (default: vikhyatk/moondream2)
MOONDREAM_MODEL_REVISION: Model revision (default: 2025-01-09)
MOONDREAM_TRUST_REMOTE_CODE: Trust remote code (default: true)
Device Settings
MOONDREAM_DEVICE: Force a specific device (cpu, cuda, mps, or auto)
Image Processing
MOONDREAM_MAX_IMAGE_SIZE: Maximum image dimensions (default: 2048x2048)
MOONDREAM_MAX_FILE_SIZE_MB: Maximum file size in MB (default: 50)
Performance
MOONDREAM_TIMEOUT_SECONDS: Processing timeout (default: 120)
MOONDREAM_MAX_CONCURRENT_REQUESTS: Max concurrent requests (default: 5)
MOONDREAM_ENABLE_STREAMING: Enable streaming for captions (default: true)
MOONDREAM_MAX_BATCH_SIZE: Maximum batch size for batch operations (default: 10)
MOONDREAM_BATCH_CONCURRENCY: Concurrent batch processing limit (default: 3)
MOONDREAM_ENABLE_BATCH_PROGRESS: Enable progress reporting for batch operations (default: true)
Network (for URLs)
MOONDREAM_REQUEST_TIMEOUT_SECONDS: HTTP request timeout (default: 30)
MOONDREAM_MAX_REDIRECTS: Maximum HTTP redirects (default: 5)
MOONDREAM_USER_AGENT: HTTP User-Agent header
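To make the defaults above concrete, here is a minimal sketch of how a subset of these variables could be read into a typed object. It is not the server's actual loader: the `Settings` class, the `load_settings` helper, the boolean parsing rules, and the `auto` default for MOONDREAM_DEVICE are all assumptions made for illustration.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Interpret common truthy strings; use the default when the variable is unset."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    model_name: str
    model_revision: str
    device: str
    max_image_size: Tuple[int, int]
    timeout_seconds: int
    max_concurrent_requests: int
    enable_streaming: bool


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, falling back to the documented defaults."""
    env = os.environ if env is None else env
    # Dimensions are documented as WIDTHxHEIGHT, e.g. "2048x2048".
    width, height = env.get("MOONDREAM_MAX_IMAGE_SIZE", "2048x2048").lower().split("x")
    return Settings(
        model_name=env.get("MOONDREAM_MODEL_NAME", "vikhyatk/moondream2"),
        model_revision=env.get("MOONDREAM_MODEL_REVISION", "2025-01-09"),
        # Assumption: "auto" is the behavior when no device is forced.
        device=env.get("MOONDREAM_DEVICE", "auto"),
        max_image_size=(int(width), int(height)),
        timeout_seconds=int(env.get("MOONDREAM_TIMEOUT_SECONDS", "120")),
        max_concurrent_requests=int(env.get("MOONDREAM_MAX_CONCURRENT_REQUESTS", "5")),
        enable_streaming=_flag(env, "MOONDREAM_ENABLE_STREAMING", True),
    )


print(load_settings({"MOONDREAM_MAX_IMAGE_SIZE": "1024x768"}))
```

Passing a plain dictionary instead of `os.environ` keeps the sketch easy to try without changing the real environment.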
Available Tools
1. caption_image
Generate captions for images.
Parameters:
image_path (string): Path to image file or URL
length (string): Caption length - "short", "normal", or "detailed"
stream (boolean): Whether to stream caption generation
Example:
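The original example is not reproduced here. As a stand-in, the sketch below builds the kind of JSON-RPC `tools/call` request an MCP client would send for this tool, using the three parameters listed above. The helper name `caption_request` and the file name `cat.jpg` are hypothetical, and most MCP clients construct this envelope for you.

```python
import json
from itertools import count

_request_ids = count(1)
VALID_LENGTHS = {"short", "normal", "detailed"}


def caption_request(image_path: str, length: str, stream: bool) -> dict:
    """Build an MCP tools/call request for the caption_image tool."""
    if length not in VALID_LENGTHS:
        raise ValueError(f"length must be one of {sorted(VALID_LENGTHS)}, got {length!r}")
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {
            "name": "caption_image",
            "arguments": {"image_path": image_path, "length": length, "stream": stream},
        },
    }


# "cat.jpg" is a placeholder path used only for this sketch.
print(json.dumps(caption_request("cat.jpg", "short", False), indent=2))
```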
2. query_image
Ask questions about images.
Parameters:
image_path (string): Path to image file or URL
question (string): Question to ask about the image
Example:
3. detect_objects
Detect specific objects in images.
Parameters:
image_path (string): Path to image file or URL
object_name (string): Name of object to detect
Example:
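The response format is not shown in this document. Assuming, purely for illustration, that each detected object comes back as a bounding box with normalized `x_min`, `y_min`, `x_max`, and `y_max` values between 0 and 1, the following sketch converts such a box to pixel coordinates for a given image size. Check the actual output of your server before relying on these key names.

```python
from typing import Dict, Tuple


def to_pixel_box(box: Dict[str, float], width: int, height: int) -> Tuple[int, int, int, int]:
    """Scale a normalized box (values in 0..1) to integer pixel coordinates.

    Returns (left, top, right, bottom), clamped to the image bounds.
    """
    def clamp(value: float, upper: int) -> int:
        return max(0, min(upper, round(value)))

    return (
        clamp(box["x_min"] * width, width),
        clamp(box["y_min"] * height, height),
        clamp(box["x_max"] * width, width),
        clamp(box["y_max"] * height, height),
    )


# Hypothetical detection result for an 800x600 image.
example_box = {"x_min": 0.25, "y_min": 0.5, "x_max": 0.75, "y_max": 1.0}
print(to_pixel_box(example_box, 800, 600))  # (200, 300, 600, 600)
```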
4. point_objects
Get coordinates of objects in images.
Parameters:
image_path (string): Path to image file or URL
object_name (string): Name of object to locate
Example:
5. analyze_image
Multi-purpose image analysis tool.
Parameters:
image_path (string): Path to image file or URL
operation (string): Operation type ("caption", "query", "detect", "point")
parameters (string): JSON string with operation-specific parameters
Example:
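Because `parameters` is a JSON string rather than a nested object, the operation-specific values have to be serialized before they are placed in the tool arguments. The sketch below shows one way to do that. It assumes the keys inside `parameters` mirror the dedicated tools (for example `question` for a query), which this document does not confirm; `analyze_arguments` is a hypothetical helper.

```python
import json

OPERATIONS = {"caption", "query", "detect", "point"}


def analyze_arguments(image_path: str, operation: str, **params) -> dict:
    """Build the argument dict for analyze_image, encoding params as a JSON string."""
    if operation not in OPERATIONS:
        raise ValueError(f"operation must be one of {sorted(OPERATIONS)}, got {operation!r}")
    return {
        "image_path": image_path,
        "operation": operation,
        # The tool expects a string here, so the dict is serialized first.
        "parameters": json.dumps(params),
    }


# Assumption: a query operation takes a "question" key, like the query_image tool.
args = analyze_arguments("photo.jpg", "query", question="What color is the car?")
print(args)
```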
6. batch_analyze_images
Process multiple images in batch.
Parameters:
image_paths (string): JSON array of image paths
operation (string): Operation to perform on all images
parameters (string): JSON string with operation-specific parameters
Example:
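Since `image_paths` is passed as a JSON array string and batches are capped by MOONDREAM_MAX_BATCH_SIZE (default 10), a client with a long list of images may want to split it into several calls. The sketch below does this on the client side; `batch_arguments` is a hypothetical helper, and how the server reacts to an oversized batch is not stated in this document.

```python
import json
from typing import Dict, Iterator, List


def batch_arguments(
    image_paths: List[str],
    operation: str,
    params: Dict[str, object],
    max_batch_size: int = 10,
) -> Iterator[Dict[str, str]]:
    """Yield one batch_analyze_images argument dict per chunk of images."""
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    for start in range(0, len(image_paths), max_batch_size):
        chunk = image_paths[start:start + max_batch_size]
        yield {
            "image_paths": json.dumps(chunk),   # JSON array, encoded as a string
            "operation": operation,
            "parameters": json.dumps(params),   # JSON object, encoded as a string
        }


# 23 placeholder paths split into chunks of 10, 10, and 3.
paths = [f"image_{i}.jpg" for i in range(23)]
for call in batch_arguments(paths, "caption", {"length": "short"}):
    print(len(json.loads(call["image_paths"])))
```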
Usage Examples
Basic Image Captioning
Visual Question Answering
Object Detection
Batch Processing
Device Support
The server automatically detects and optimizes for available hardware:
Apple Silicon (MPS)
Optimal performance on M1/M2/M3 Macs
Automatic memory management
Native acceleration
NVIDIA CUDA
GPU acceleration for NVIDIA cards
Automatic CUDA memory management
Mixed precision support
CPU Fallback
Works on any system
Optimized for multi-core processing
Lower memory requirements
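The detection logic itself is not shown in this document. One plausible way to implement it with PyTorch is sketched below: honor an explicit MOONDREAM_DEVICE setting, otherwise prefer CUDA, then MPS, then CPU. The preference order and the `pick_device` helper are assumptions, not the server's confirmed behavior.

```python
import os
from typing import Optional

ALLOWED_DEVICES = {"auto", "cpu", "cuda", "mps"}


def pick_device(requested: Optional[str] = None) -> str:
    """Return "cuda", "mps", or "cpu" based on a request and available hardware."""
    requested = (requested or os.environ.get("MOONDREAM_DEVICE", "auto")).lower()
    if requested not in ALLOWED_DEVICES:
        raise ValueError(f"unknown device {requested!r}")
    if requested != "auto":
        return requested  # an explicit choice is passed through unchanged
    try:
        import torch
    except ImportError:
        return "cpu"  # without PyTorch there is nothing to probe
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch builds
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device("auto"))
```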
Error Handling
The server provides detailed error information:
Common error codes:
MODEL_LOAD_ERROR: Issues loading the Moondream model
IMAGE_PROCESSING_ERROR: Problems with image files or URLs
INFERENCE_ERROR: Model inference failures
INVALID_REQUEST: Invalid parameters or requests
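A client can map these codes to readable summaries when reporting failures. The sketch below assumes an error payload with `error_code` and `error_message` fields; that shape is hypothetical, since the actual response format is not reproduced in this document.

```python
ERROR_DESCRIPTIONS = {
    "MODEL_LOAD_ERROR": "Issues loading the Moondream model",
    "IMAGE_PROCESSING_ERROR": "Problems with image files or URLs",
    "INFERENCE_ERROR": "Model inference failures",
    "INVALID_REQUEST": "Invalid parameters or requests",
}


def describe_error(response: dict) -> str:
    """Summarize an error payload using the documented error codes."""
    # Assumption: the payload carries "error_code" and "error_message" keys.
    code = response.get("error_code", "")
    message = response.get("error_message", "")
    summary = ERROR_DESCRIPTIONS.get(code, "Unrecognized error code")
    text = f"{code or 'UNKNOWN'}: {summary}"
    return f"{text} ({message})" if message else text


print(describe_error({"error_code": "INVALID_REQUEST", "error_message": "missing image_path"}))
```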
Performance Tips
Use appropriate image sizes: Resize large images before processing
Batch processing: Use batch_analyze_images for multiple images
Device optimization: Let the server auto-detect the best device
Concurrent requests: Adjust MOONDREAM_MAX_CONCURRENT_REQUESTS based on your hardware
Memory management: Monitor memory usage, especially with large images
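For the first tip, the target dimensions can be computed ahead of time so that an image fits within the configured maximum while keeping its aspect ratio. The sketch below covers only the arithmetic; `fit_within` is a hypothetical helper, and the 2048x2048 limit is the documented default for MOONDREAM_MAX_IMAGE_SIZE.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_width: int = 2048, max_height: int = 2048) -> Tuple[int, int]:
    """Return dimensions scaled down to fit the limits, preserving aspect ratio.

    Images already within the limits are returned unchanged (never upscaled).
    """
    if width < 1 or height < 1:
        raise ValueError("width and height must be positive")
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


print(fit_within(4096, 3072))  # (2048, 1536)
print(fit_within(800, 600))    # (800, 600)
```

The resulting size can then be passed to the resize function of whichever image library you use.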
Troubleshooting
Model Loading Issues
Memory Issues
Reduce MOONDREAM_MAX_IMAGE_SIZE
Lower MOONDREAM_MAX_CONCURRENT_REQUESTS
Use CPU instead of GPU for large images
Network Issues
Check firewall settings for URL access
Increase MOONDREAM_REQUEST_TIMEOUT_SECONDS
Verify SSL certificates for HTTPS URLs
Development
Running Tests
Code Quality
Contributing
Fork the repository
Create a feature branch
Make your changes
Add tests
Run quality checks
Submit a pull request
License
This project is licensed under the MIT License. See LICENSE for details.
Acknowledgments
Moondream - The amazing vision language model
FastMCP - The MCP server framework
Model Context Protocol - The protocol specification
Support
Documentation
Issue Tracker
Discussions
Note: This server requires downloading the Moondream model on first use, which may take some time depending on your internet connection.