readme.md•3.75 kB
# Image Description MCP Server
A Model Context Protocol (MCP) server that provides AI-powered image analysis using xAI's Grok API.
## Purpose
This MCP server provides a secure interface for AI assistants to analyze images using Grok's advanced vision capabilities. It supports both web-hosted images and local files, offering detailed descriptions, technical metadata extraction, and optical character recognition (OCR).
## Features
### Current Implementation
- **`describe_image_url`** - Analyzes images from web URLs and provides AI-generated descriptions
- **`describe_image_file`** - Analyzes local image files and provides AI-generated descriptions
- **`extract_text_from_image`** - Performs OCR to extract readable text from images
## Prerequisites
- Docker Desktop with MCP Toolkit enabled
- Docker MCP CLI plugin (`docker mcp` command)
- Grok API key from https://console.x.ai/
## Installation
See the step-by-step instructions provided with the files.
## Usage Examples
In Grok4 Code Fast, you can ask:
- "Describe this image: https://example.com/image.jpg"
- "What does this local image show: /path/to/image.png"
- "Extract any text from this image: https://example.com/document.jpg"
- "Give me a detailed analysis of this photo: https://example.com/photo.jpg"
- "What text can you read in this screenshot: https://example.com/screenshot.png"
### Local Testing
```bash
# Set environment variables for testing
export GROK_API_KEY="your-grok-api-key"
# Run directly
python image-description-mcp_server.py
# Test MCP protocol
echo '{"jsonrpc":"2.0","method":"tools/list","id":1}' | python image-description-mcp_server.py
```
### Adding New Tools
1. Add the function to `image-description-mcp_server.py`
2. Decorate with `@mcp.tool()`
3. Update the catalog entry with the new tool name
4. Rebuild the Docker image
## Troubleshooting
### Authentication Errors
- Verify secrets with `docker mcp secret list`
- Ensure GROK_API_KEY is set correctly
- Check API key validity at https://console.x.ai/
### Image Processing Errors
- Ensure image URLs are accessible and valid
- Check local file paths exist and are readable
- Verify image formats are supported (JPEG, PNG, WebP, etc.)
## Security Considerations
- All secrets stored in Docker Desktop secrets
- Never hardcode API keys
- Running as non-root user in Docker
- Images processed temporarily in memory
- No permanent storage of image data
- Sensitive data never logged
## API Documentation
This service integrates with xAI's Grok API:
- **Grok API Reference**: https://docs.x.ai/docs/api-reference
- **MCP SDK Documentation**: https://github.com/modelcontextprotocol/sdk
## Data Sources
### External Image URLs
- **Source**: Web-hosted images accessible via HTTP/HTTPS
- **Access Method**: HTTP GET requests using httpx
- **Purpose**: Download images for analysis from any public URL
- **Limitations**: Only accessible URLs; no authentication-protected images
### Local Image Files
- **Source**: Filesystem access to local image files
- **Access Method**: Python file I/O
- **Purpose**: Analyze images stored locally on the user's system
- **Supported Paths**: Absolute and relative file paths
- **Supported Formats**: JPEG, PNG, WebP, TIFF, GIF, BMP
### Grok API
- **Source**: xAI's Grok model with vision capabilities
- **Access Method**: REST API calls via httpx
- **Purpose**: AI-powered image analysis and description generation
- **Data Flow**: Images converted to base64, sent to Grok, receive structured analysis
### Image Processing
- **Source**: PIL (Pillow) and OpenCV libraries
- **Access Method**: Local processing
- **Purpose**: Extract technical metadata and perform OCR
- **No External Calls**: Pure local processing
## License
MIT License