Uses Docker Desktop with MCP Toolkit for containerized deployment and secret management of the image analysis service
Leverages OpenCV library for local image processing, technical metadata extraction, and optical character recognition capabilities
Implements the MCP server using Python with PIL/Pillow for image processing and httpx for API communication
Image Description MCP Server
A Model Context Protocol (MCP) server that provides AI-powered image analysis using xAI's Grok API.
Purpose
This MCP server provides a secure interface for AI assistants to analyze images using Grok's advanced vision capabilities. It supports both web-hosted images and local files, offering detailed descriptions, technical metadata extraction, and optical character recognition (OCR).
Features
Current Implementation
describe_image_url
- Analyzes images from web URLs and provides AI-generated descriptionsdescribe_image_file
- Analyzes local image files and provides AI-generated descriptionsextract_text_from_image
- Performs OCR to extract readable text from images
Prerequisites
Docker Desktop with MCP Toolkit enabled
Docker MCP CLI plugin (
docker mcp
command)Grok API key from https://console.x.ai/
Installation
See the step-by-step instructions provided with the files.
Usage Examples
In Grok4 Code Fast, you can ask:
"Describe this image: https://example.com/image.jpg"
"What does this local image show: /path/to/image.png"
"Extract any text from this image: https://example.com/document.jpg"
"Give me a detailed analysis of this photo: https://example.com/photo.jpg"
"What text can you read in this screenshot: https://example.com/screenshot.png"
Local Testing
Adding New Tools
Add the function to
image-description-mcp_server.py
Decorate with
@mcp.tool()
Update the catalog entry with the new tool name
Rebuild the Docker image
Troubleshooting
Authentication Errors
Verify secrets with
docker mcp secret list
Ensure GROK_API_KEY is set correctly
Check API key validity at https://console.x.ai/
Image Processing Errors
Ensure image URLs are accessible and valid
Check local file paths exist and are readable
Verify image formats are supported (JPEG, PNG, WebP, etc.)
Security Considerations
All secrets stored in Docker Desktop secrets
Never hardcode API keys
Running as non-root user in Docker
Images processed temporarily in memory
No permanent storage of image data
Sensitive data never logged
API Documentation
This service integrates with xAI's Grok API:
Grok API Reference: https://docs.x.ai/docs/api-reference
MCP SDK Documentation: https://github.com/modelcontextprotocol/sdk
Data Sources
External Image URLs
Source: Web-hosted images accessible via HTTP/HTTPS
Access Method: HTTP GET requests using httpx
Purpose: Download images for analysis from any public URL
Limitations: Only accessible URLs; no authentication-protected images
Local Image Files
Source: Filesystem access to local image files
Access Method: Python file I/O
Purpose: Analyze images stored locally on the user's system
Supported Paths: Absolute and relative file paths
Supported Formats: JPEG, PNG, WebP, TIFF, GIF, BMP
Grok API
Source: xAI's Grok model with vision capabilities
Access Method: REST API calls via httpx
Purpose: AI-powered image analysis and description generation
Data Flow: Images converted to base64, sent to Grok, receive structured analysis
Image Processing
Source: PIL (Pillow) and OpenCV libraries
Access Method: Local processing
Purpose: Extract technical metadata and perform OCR
No External Calls: Pure local processing
License
MIT License
This server cannot be installed
hybrid server
The server is able to function both locally and remotely, depending on the configuration or use case.
Enables AI assistants to analyze images from URLs or local files using xAI's Grok API. Provides detailed image descriptions, technical metadata extraction, and optical character recognition (OCR) capabilities.