Image Gen MCP Server
Empowering Universal Image Generation for AI Chatbots
Traditional AI chatbot interfaces are limited to text-only interactions, regardless of how powerful their underlying language models are. Image Gen MCP Server bridges this gap by enabling any LLM-powered chatbot client to generate professional-quality images through the standardized Model Context Protocol (MCP).
Whether you're using Claude Desktop, a custom ChatGPT interface, Llama-based applications, or any other LLM client that supports MCP, this server democratizes access to multiple AI image generation models including OpenAI's gpt-image-1, dall-e-3, dall-e-2, and Google's Imagen series (imagen-4, imagen-4-ultra, imagen-3), transforming text-only conversations into rich, visual experiences.
📦 Package Manager: This project uses UV for fast, reliable Python package management. UV provides better dependency resolution, faster installs, and proper environment isolation compared to traditional pip/venv workflows.
Why This Matters
The AI ecosystem has evolved to include powerful language models from multiple providers (OpenAI, Anthropic, Meta, Google, etc.), but image generation capabilities remain fragmented and platform-specific. This creates a significant gap:
🚫 Limited Access: Only certain platforms offer built-in image generation
🔒 Vendor Lock-in: Image capabilities tied to specific LLM providers
⚡ Poor Integration: Switching between text and image tools breaks workflow
🛠️ Complex Setup: Each client needs custom integrations
Image Gen MCP Server solves this by providing:
🌐 Universal Compatibility: Works with any MCP-enabled LLM client
🔄 Seamless Integration: No context switching or workflow interruption
⚡ Standardized Protocol: One server, multiple client support
🎨 Multi-Provider Support: Access to OpenAI and Google's latest image generation models
🔧 Unified Interface: Single API for multiple AI providers with automatic model discovery
Visual Showcase
Real-World Usage
Claude Desktop seamlessly generating images through MCP integration
Generated Examples
High-quality images generated through the MCP server, demonstrating professional-grade output
Use Cases & Applications
🎯 Content Creation Workflows
Bloggers & Writers: Generate custom illustrations directly in writing tools
Social Media Managers: Create platform-specific graphics without leaving chat interfaces
Marketing Teams: Rapid prototyping of visual concepts during brainstorming sessions
Educators: Generate teaching materials and visual aids on-demand
🚀 Development & Design
UI/UX Designers: Quick mockup generation during design discussions
Frontend Developers: Placeholder and concept images within development environments
Technical Writers: Custom diagrams and illustrations for documentation
Product Managers: Visual concept communication in any LLM-powered tool
🏢 Enterprise Integration
Customer Support: Generate visual explanations and guides
Sales Teams: Custom presentation materials tailored to client needs
Training Programs: Visual learning materials created in conversational interfaces
Internal Tools: Add image generation to existing LLM-powered applications
🎨 Creative Industries
Game Developers: Concept art and asset ideation
Film & Media: Storyboard and concept visualization
Architecture: Quick visual references and mood boards
Advertising: Campaign concept development
Key Advantage: Unlike platform-specific solutions, this universal approach means your image generation capabilities move with you across different tools and workflows, eliminating vendor lock-in and maximizing workflow efficiency.
Features
🎨 Multi-Provider Image Generation
Multiple AI Models: Support for OpenAI (gpt-image-1, dall-e-3, dall-e-2) and Google Gemini (imagen-4, imagen-4-ultra, imagen-3)
Text-to-Image: Generate high-quality images from text descriptions
Image Editing: Edit existing images with text instructions (OpenAI models)
Multiple Formats: Support for PNG, JPEG, and WebP output formats
Quality Control: Auto, high, medium, and low quality settings
Background Control: Transparent, opaque, or auto background options
Dynamic Model Discovery: Query available models and capabilities at runtime
🔗 MCP Integration
FastMCP Framework: Built with the latest MCP Python SDK
Multiple Transports: STDIO, HTTP, and SSE transport support
Structured Output: Validated tool responses with proper schemas
Resource Access: MCP resources for image retrieval and management
Prompt Templates: 10+ built-in templates for common use cases
💾 Storage & Caching
Local Storage: Organized directory structure with metadata
URL-based Access: Transport-aware URL generation for images
Dual Access: Immediate base64 data + persistent resource URIs
Smart Caching: Memory-based caching with TTL and Redis support
Auto Cleanup: Configurable file retention policies
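As a rough illustration of the memory-based TTL caching mentioned above, here is a minimal sketch. The class name `TTLCache` and its methods are hypothetical and are not taken from the project's `utils/cache.py`; the real cache manager may work differently and also supports Redis.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl_seconds`.

    Hypothetical sketch, not the project's cache manager.
    """

    def __init__(
        self,
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with its absolute expiry time.
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop the entry and report a miss.
            del self._entries[key]
            return None
        return value
```

A caller would typically key entries on the generation parameters, so that a repeated request within the TTL can be served without calling the provider again.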
🚀 Production Deployment
Docker Support: Production-ready Docker containers
Multi-Transport: STDIO for Claude Desktop, HTTP for web deployment
Reverse Proxy: Nginx configuration with rate limiting
Monitoring: Grafana and Prometheus integration
SSL/TLS: Automatic certificate management with Certbot
🛠️ Development Features
Type Safety: Full type hints with Pydantic models
Error Handling: Comprehensive error handling and logging
Configuration: Environment-based configuration management
Testing: Pytest-based test suite with async support
Dev Tools: Hot reload, Redis Commander, debug logging
Quick Start
Prerequisites
Python 3.10+
OpenAI API key (for OpenAI models)
Google Cloud service account with Vertex AI access (for Imagen models, optional)
Installation
Clone and setup:
git clone <repository-url>
cd image-gen-mcp
uv sync
Note: This project uses UV for fast, reliable Python package management. UV provides better dependency resolution and faster installs compared to pip.
Configure environment:
cp .env.example .env
# Edit .env and add your credentials:
# - PROVIDERS__OPENAI__API_KEY for OpenAI models
# - PROVIDERS__GEMINI__API_KEY for Imagen models (path to service account JSON file)
For Imagen models (Vertex AI setup):
Go to Google Cloud Console
Enable Vertex AI API for your project
Create a service account with "Vertex AI User" role
Download the JSON key file to your project directory
Set PROVIDERS__GEMINI__API_KEY to the path of your JSON file
Test the setup:
uv run python scripts/dev.py setup
uv run python scripts/dev.py test
Running the Server
Development Mode
Manual Execution
Command Line Options
MCP Client Integration
This server works with any MCP-compatible chatbot client. Here are configuration examples:
Claude Desktop (Anthropic)
Claude Code (Anthropic CLI)
Continue.dev (VS Code Extension)
Custom MCP Clients
For other MCP-compatible applications, use the standard MCP STDIO transport:
Universal Compatibility: This server follows the standard MCP protocol, ensuring compatibility with current and future MCP-enabled clients across the AI ecosystem.
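For clients that read a JSON configuration with an `mcpServers` map (the layout Claude Desktop uses), a STDIO entry can be sketched as below. This is an illustration under assumptions: the label `image-gen` is arbitrary, and the launch arguments are a placeholder because the exact command is not given here. Only the `PROVIDERS__OPENAI__API_KEY` variable name comes from this README.

```python
import json
from typing import Dict, List


def build_stdio_entry(command: str, args: List[str], env: Dict[str, str]) -> dict:
    """Return one server entry in the `mcpServers` layout (command, args, env)."""
    return {"command": command, "args": list(args), "env": dict(env)}


config = {
    "mcpServers": {
        # "image-gen" is an arbitrary label chosen for this example.
        "image-gen": build_stdio_entry(
            command="uv",
            # Placeholder: substitute the project's real launch arguments.
            args=["run", "<server-entry-point>"],
            env={"PROVIDERS__OPENAI__API_KEY": "<your-openai-api-key>"},
        )
    }
}

# Print the JSON that would be merged into the client's configuration file.
print(json.dumps(config, indent=2))
```

Keeping the API key in the `env` block means the client passes it to the server process at launch rather than relying on the shell environment.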
Usage Examples
Basic Image Generation
Using Prompt Templates
Accessing Generated Images
Available Tools
list_available_models
List all available image generation models and their capabilities.
Returns: Dictionary with model information, capabilities, and provider details.
generate_image
Generate images from text descriptions using any supported model.
Parameters:
prompt (required): Text description of desired image
model (optional): Model to use (e.g., "gpt-image-1", "dall-e-3", "imagen-4")
quality: "auto" | "high" | "medium" | "low" (default: "auto")
size: "1024x1024" | "1536x1024" | "1024x1536" (default: "1536x1024")
style: "vivid" | "natural" (default: "vivid")
output_format: "png" | "jpeg" | "webp" (default: "png")
background: "auto" | "transparent" | "opaque" (default: "auto")
Note: Parameter availability depends on the selected model. Use list_available_models to check capabilities.
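A client might validate arguments before calling generate_image. The sketch below is one way to do that, using only the allowed values and defaults listed for this tool. The helper name `build_generate_image_args` is hypothetical, and the server performs its own validation regardless.

```python
from typing import Any, Dict, Optional

# Allowed values and defaults as documented for generate_image.
ALLOWED = {
    "quality": ("auto", "high", "medium", "low"),
    "size": ("1024x1024", "1536x1024", "1024x1536"),
    "style": ("vivid", "natural"),
    "output_format": ("png", "jpeg", "webp"),
    "background": ("auto", "transparent", "opaque"),
}
DEFAULTS = {
    "quality": "auto",
    "size": "1536x1024",
    "style": "vivid",
    "output_format": "png",
    "background": "auto",
}


def build_generate_image_args(
    prompt: str, model: Optional[str] = None, **options: str
) -> Dict[str, Any]:
    """Build an arguments dict for generate_image (hypothetical client-side helper)."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    args: Dict[str, Any] = {"prompt": prompt, **DEFAULTS}
    for name, value in options.items():
        if name not in ALLOWED:
            raise ValueError(f"unknown parameter: {name}")
        if value not in ALLOWED[name]:
            raise ValueError(f"{name} must be one of {ALLOWED[name]}, got {value!r}")
        args[name] = value
    if model is not None:
        # The model is passed through unchecked; list_available_models is the
        # authoritative source for which models and parameters are supported.
        args["model"] = model
    return args
```

For example, `build_generate_image_args("a red fox", model="gpt-image-1", quality="high")` keeps the documented defaults for every option except quality.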
edit_image
Edit existing images with text instructions.
Parameters:
image_data (required): Base64 encoded image or data URL
prompt (required): Edit instructions
mask_data: Optional mask for targeted editing
size, quality, output_format: Same as generate_image
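Since image_data accepts a base64 string or a data URL, preparing the input can be sketched with the standard library alone. The helper names `to_data_url` and `from_data_url` are hypothetical and not part of the server.

```python
import base64


def to_data_url(image_bytes: bytes, mime_type: str = "image/png") -> str:
    """Encode raw image bytes as a base64 data URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_url(data_url: str) -> bytes:
    """Decode a base64 data URL back into raw bytes."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    return base64.b64decode(payload)
```

In practice the bytes would come from reading the image file in binary mode, and the resulting string would be passed as image_data alongside the edit prompt.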
Available Resources
generated-images://{image_id} - Access specific generated images
image-history://recent - Browse recent generation history
storage-stats://overview - Storage usage and statistics
model-info://gpt-image-1 - Model capabilities and pricing
Prompt Templates
Built-in templates for common use cases:
Creative Image: Artistic image generation
Product Photography: Commercial product images
Social Media Graphics: Platform-optimized posts
Blog Headers: Article header images
OG Images: Social media preview images
Hero Banners: Website hero sections
Email Headers: Newsletter headers
Video Thumbnails: YouTube/video thumbnails
Infographics: Data visualization images
Artistic Style: Specific art movement styles
Configuration
Configure via environment variables or a .env file.
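The variable names used in this README (PROVIDERS__OPENAI__API_KEY, PROVIDERS__GEMINI__API_KEY) look like nested settings flattened with a double-underscore delimiter, a convention that settings libraries such as pydantic-settings support. Assuming that reading is right, the mapping can be sketched as follows. The function `nest_env` is hypothetical and is not the project's loader.

```python
from typing import Any, Dict, Mapping


def nest_env(
    environ: Mapping[str, str], prefix: str = "PROVIDERS", delimiter: str = "__"
) -> Dict[str, Any]:
    """Turn flat NAME__SUB__KEY variables into a nested, lower-cased dict."""
    nested: Dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix + delimiter):
            continue  # ignore unrelated environment variables
        parts = [part.lower() for part in name.split(delimiter)]
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


settings = nest_env(
    {
        "PROVIDERS__OPENAI__API_KEY": "<your-openai-api-key>",
        "PROVIDERS__GEMINI__API_KEY": "/path/to/service-account.json",
        "HOME": "/home/user",
    }
)
print(settings["providers"]["openai"]["api_key"])
```

Under this convention each provider gets its own settings group, which is why the Gemini key sits beside the OpenAI key rather than in a separate namespace.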
Deployment
Production Deployment
The server supports production deployment with Docker, monitoring, and reverse proxy:
Production Stack includes:
Image Gen MCP Server: Main application container
Redis: Caching and session storage
Nginx: Reverse proxy with rate limiting (configured separately)
Prometheus: Metrics collection
Grafana: Monitoring dashboards
Access Points:
Main Service: http://localhost:3001 (behind proxy)
Grafana Dashboard: http://localhost:3000
Prometheus: http://localhost:9090 (localhost only)
VPS Deployment
For VPS deployment with SSL, monitoring, and production hardening:
Features included:
Docker containerization
Nginx reverse proxy with SSL
Automatic certificate management (Certbot)
System monitoring and logging
Firewall configuration
Automatic backups
See VPS Deployment Guide for detailed instructions.
Docker Configuration
Available Docker Compose profiles:
Development
Development Tools
Testing
Architecture
The server follows a modular, production-ready architecture:
Core Components:
Server Layer (server.py): FastMCP-based MCP server with multi-transport support
Configuration (config/): Environment-based settings management with validation
Tool Layer (tools/): Image generation and editing capabilities
Resource Layer (resources/): MCP resources for data access and model registry
Storage Manager (storage/): Organized local image storage with cleanup
Cache Manager (utils/cache.py): Memory and Redis-based caching system
Multi-Provider Architecture:
Provider Registry (providers/registry.py): Centralized provider and model management
Provider Base (providers/base.py): Abstract base class for all providers
OpenAI Provider (providers/openai.py): OpenAI API integration with retry logic
Gemini Provider (providers/gemini.py): Google Gemini API integration
Type System (types/): Pydantic models for type safety
Validation (utils/validators.py): Input validation and sanitization
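The registry-plus-base-class arrangement described above can be pictured with a small sketch. Every name here (`ImageProvider`, `ProviderRegistry`, `FakeProvider`) is hypothetical; the real interfaces live in providers/base.py and providers/registry.py and may differ.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ImageProvider(ABC):
    """Hypothetical provider interface, for illustration only."""

    @abstractmethod
    def list_models(self) -> List[str]:
        """Return the model names this provider can serve."""

    @abstractmethod
    def generate(self, model: str, prompt: str) -> bytes:
        """Return image bytes for the prompt."""


class ProviderRegistry:
    """Maps each model name to the provider that serves it."""

    def __init__(self) -> None:
        self._by_model: Dict[str, ImageProvider] = {}

    def register(self, provider: ImageProvider) -> None:
        for model in provider.list_models():
            self._by_model[model] = provider

    def available_models(self) -> List[str]:
        return sorted(self._by_model)

    def generate(self, model: str, prompt: str) -> bytes:
        provider = self._by_model.get(model)
        if provider is None:
            raise ValueError(f"unknown model: {model}")
        return provider.generate(model, prompt)


class FakeProvider(ImageProvider):
    """Stand-in that returns placeholder bytes instead of calling an API."""

    def __init__(self, name: str, models: List[str]) -> None:
        self._name = name
        self._models = models

    def list_models(self) -> List[str]:
        return list(self._models)

    def generate(self, model: str, prompt: str) -> bytes:
        return f"{self._name}:{model}:{prompt}".encode()


registry = ProviderRegistry()
registry.register(FakeProvider("openai", ["gpt-image-1", "dall-e-3"]))
registry.register(FakeProvider("gemini", ["imagen-4"]))
print(registry.available_models())
```

Routing by model name keeps the tool layer unaware of which provider is behind a request, which is what lets a single generate_image tool cover several providers.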
Infrastructure:
Prompt Templates (prompts/): Template system for optimized prompts
Dynamic Model Discovery: Runtime model capability detection
Parameter Translation: Automatic parameter mapping between providers
Deployment:
Docker Support: Development and production containers
Multi-Transport: STDIO, HTTP, SSE transport layers
Monitoring: Prometheus metrics and Grafana dashboards
Reverse Proxy: Nginx configuration with SSL and rate limiting
Cost Estimation
The server provides cost estimation for operations:
Text Input: ~$5 per 1M tokens
Image Output: ~$40 per 1M tokens (~1750 tokens per image)
Typical Cost: ~$0.07 per image generation
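The typical figure follows from the rates above: 1750 output tokens at about $40 per 1M tokens is about $0.07, with the prompt's input tokens adding a small amount on top. A minimal sketch of that arithmetic, using only the approximate rates quoted here (the function name is hypothetical, and real pricing may differ by model and quality):

```python
# Approximate rates quoted in this README; treat them as estimates.
TEXT_INPUT_USD_PER_MILLION_TOKENS = 5.0
IMAGE_OUTPUT_USD_PER_MILLION_TOKENS = 40.0
IMAGE_OUTPUT_TOKENS_PER_IMAGE = 1750


def estimate_cost_usd(prompt_tokens: int, images: int = 1) -> float:
    """Estimate the cost of one request from token counts."""
    text_cost = prompt_tokens * TEXT_INPUT_USD_PER_MILLION_TOKENS / 1_000_000
    image_tokens = images * IMAGE_OUTPUT_TOKENS_PER_IMAGE
    image_cost = image_tokens * IMAGE_OUTPUT_USD_PER_MILLION_TOKENS / 1_000_000
    return text_cost + image_cost


# One image from a 100-token prompt: 0.0005 (text) + 0.07 (image) USD.
print(f"${estimate_cost_usd(prompt_tokens=100):.4f}")
```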
Error Handling
Comprehensive error handling includes:
API rate limiting and retries
Invalid parameter validation
Storage error recovery
Cache failure fallbacks
Detailed error logging
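One common way to handle rate limits with retries is exponential backoff with a little jitter. The sketch below illustrates that general technique and is not the server's actual retry logic. `RateLimitError` and `call_with_retries` are hypothetical names, and the delays are arbitrary.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit error."""


def call_with_retries(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation`, retrying on RateLimitError with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except RateLimitError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            delay = base_delay * 2 ** (attempt - 1)  # 0.5s, 1s, 2s, ...
            # Add up to 10% jitter so concurrent clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 10))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; production code would leave it at the default.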
Security
Security features include:
OpenAI API key protection
Input validation and sanitization
File system access controls
Rate limiting protection
No credential exposure in logs
License
MIT License - see LICENSE file for details.
Contributing
Fork the repository
Create a feature branch
Make your changes
Add tests for new functionality
Run the test suite
Submit a pull request
Support
For issues and questions:
Check the troubleshooting guide
Review common issues
Open an issue on GitHub
Built with ❤️ using the Model Context Protocol and OpenAI's gpt-image-1
The Future of AI Integration
The Model Context Protocol represents a paradigm shift towards standardized AI tool integration. As more LLM clients adopt MCP support, servers like this one become increasingly valuable by providing universal capabilities across the entire ecosystem.
Current MCP Adoption:
✅ Claude Desktop (Anthropic) - Full MCP support
✅ Continue.dev - VS Code extension with MCP integration
✅ Zed Editor - Built-in MCP support for coding workflows
🚀 Growing Ecosystem - New clients adopting MCP regularly
Vision: A future where AI capabilities are modular, interoperable, and user-controlled rather than locked to specific platforms.
🌟 Building the Universal AI Ecosystem
Democratizing advanced AI capabilities across all platforms through the power of the Model Context Protocol. One server, infinite possibilities.