Provides tools for image generation and editing using Google Gemini AI, supporting multiple aspect ratios, custom styles, context images for guidance, watermark overlay, and automatic file saving with configurable safety settings.
Gemini Image MCP Server
A Model Context Protocol (MCP) server for image generation and editing using Google Gemini AI. Supports optional context images to guide results and now includes a dedicated edit workflow. Optimized for creating eye-catching social media images with square (1:1) format by default.
Features
Image generation with Google Gemini AI
Multiple aspect ratios (1:1, 16:9, 9:16, 4:3, 3:4)
Optimized for social media with 1:1 format by default
Custom style support
Context images to guide generation
Dedicated edit tool for modifying existing assets without juggling extra options
Watermark support: overlay watermark images on generated results
Automatic saving of images to local files
Flexible output path configuration
Customizable safety settings
Installation
Clone this repository
Install dependencies:
npm install
Build the project:
npm run build
Configuration
Environment Variables
You need to configure your Google AI API key:
export GOOGLE_API_KEY="your-api-key-here"
Getting Google AI API Key
Go to Google AI Studio
Create a new API key
Copy the key and set it as an environment variable
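As a rough illustration of how the key might be checked at startup, here is a minimal sketch. The helper name `requireApiKey` and the `Env` type are hypothetical and not taken from the project; only the variable name and the error message (listed under Troubleshooting) come from this README.

```typescript
// Hypothetical helper: the real server's startup code may differ.
type Env = Record<string, string | undefined>;

function requireApiKey(env: Env): string {
  // Treat a missing, empty, or whitespace-only value as "not configured".
  const key = env["GOOGLE_API_KEY"]?.trim();
  if (!key) {
    // Message mirrors the error listed under Troubleshooting.
    throw new Error("GOOGLE_API_KEY environment variable is required");
  }
  return key;
}

// Typical call site in Node.js:
//   const apiKey = requireApiKey(process.env);
```

Failing fast like this surfaces a missing key when the server starts rather than on the first image request.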
Client Configuration
{
  "servers": {
    "gemini-image": {
      "command": "node",
      "args": ["/full/path/to/project/dist/index.js"],
      "env": {
        "GOOGLE_API_KEY": "your-api-key-here"
      }
    }
  }
}
Command Line Interface
In addition to the MCP server, the project now ships with a CLI for quick terminal-friendly workflows.
Build the project once:
npm run build
Make sure GOOGLE_API_KEY is set in your environment.
Explore the CLI:
node dist/cli.js --help
# or, after publishing/packing:
gemini-image --help
Commands
gemini-image generate: Create new imagery from a text prompt.
gemini-image generate --prompt "A banana astronaut on Mars" --output ./images/
gemini-image edit: Apply instructions to an existing image.
gemini-image edit --prompt "Add neon lights to the skyline" --input ./images/city.png
Both commands support --help for detailed, friendly option descriptions. CLI option names are intentionally concise (for example --prompt, --context, --input) so they are easier to memorize than the MCP tool identifiers.
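If you want to drive the CLI from a script, the argument list can be assembled programmatically. The sketch below is illustrative only: `CliCommand` and `buildCliArgs` are hypothetical names, and it uses just the flags shown in the examples above (--prompt, --output, --input).

```typescript
// Hypothetical wrapper types; only the subcommands and flags come from the README.
type CliCommand =
  | { kind: "generate"; prompt: string; output?: string }
  | { kind: "edit"; prompt: string; input: string };

// Builds the argument array for `node dist/cli.js ...` without spawning anything.
function buildCliArgs(cmd: CliCommand): string[] {
  const args: string[] = ["dist/cli.js", cmd.kind, "--prompt", cmd.prompt];
  if (cmd.kind === "generate") {
    // --output is optional for generate.
    if (cmd.output !== undefined) args.push("--output", cmd.output);
  } else {
    // edit always needs the image to modify.
    args.push("--input", cmd.input);
  }
  return args;
}
```

The resulting array can be passed to Node's `execFile("node", args)` from `node:child_process`, which avoids shell quoting problems with prompts that contain spaces or quotes.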
Available Tools
generate_image
Creates a brand-new image from a text description, optionally using one or more images as visual context. Use this tool when you want to generate fresh content.
Parameters:
description (string, required): Detailed description of the desired image.
images (string[], optional): Array of image paths used as context (absolute or relative). Use this to "edit" or guide style/content.
aspectRatio (string, optional): Orientation preset (square, landscape, portrait). Default: square.
style (string, optional): Additional style (e.g., "minimalist", "colorful", "professional", "artistic").
outputPath (string, optional): Where to save the image. If omitted, saves in current directory.
watermarkPath (string, optional): Path to watermark image to overlay.
watermarkPosition (string, optional): One of top-left, top-right, bottom-left, bottom-right. Default: bottom-right.
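The parameter list above can be expressed as a TypeScript type with a small normalizer that applies the documented defaults. This is a sketch under assumptions: the names `GenerateImageParams` and `normalizeGenerateParams` are hypothetical, and the project's actual definitions in src/types/index.ts may look different.

```typescript
type AspectRatio = "square" | "landscape" | "portrait";
type WatermarkPosition = "top-left" | "top-right" | "bottom-left" | "bottom-right";

// Hypothetical shape mirroring the documented generate_image parameters.
interface GenerateImageParams {
  description: string;
  images?: string[];
  aspectRatio?: AspectRatio;
  style?: string;
  outputPath?: string;
  watermarkPath?: string;
  watermarkPosition?: WatermarkPosition;
}

const ASPECT_RATIOS: readonly string[] = ["square", "landscape", "portrait"];
const POSITIONS: readonly string[] = ["top-left", "top-right", "bottom-left", "bottom-right"];

// Validates untyped input and fills in the documented defaults.
function normalizeGenerateParams(raw: Record<string, unknown>): GenerateImageParams {
  const description = raw["description"];
  if (typeof description !== "string" || description.trim() === "") {
    throw new Error("description must be a non-empty string");
  }
  const aspectRatio = raw["aspectRatio"] ?? "square"; // documented default
  if (typeof aspectRatio !== "string" || !ASPECT_RATIOS.includes(aspectRatio)) {
    throw new Error("aspectRatio must be square, landscape, or portrait");
  }
  const position = raw["watermarkPosition"] ?? "bottom-right"; // documented default
  if (typeof position !== "string" || !POSITIONS.includes(position)) {
    throw new Error("watermarkPosition must name one of the four corners");
  }
  const params: GenerateImageParams = {
    description,
    aspectRatio: aspectRatio as AspectRatio,
    watermarkPosition: position as WatermarkPosition,
  };
  const images = raw["images"];
  if (images !== undefined) {
    if (!Array.isArray(images) || !images.every((p) => typeof p === "string")) {
      throw new Error("images must be an array of paths");
    }
    params.images = images as string[];
  }
  // Optional free-form strings are copied through only when present.
  for (const key of ["style", "outputPath", "watermarkPath"] as const) {
    const value = raw[key];
    if (typeof value === "string") params[key] = value;
  }
  return params;
}
```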
Usage Examples:
# Basic - saves to current directory
Generate an image of a mountain landscape at sunset with warm, minimalist style
# With context image to guide composition
Generate an image: "Create a futuristic city skyline inspired by this photo", images: ["./reference-skyline.jpg"], aspectRatio: "landscape"
# Multiple context images
Generate an image combining style of a logo and a photo, images: ["./photo.jpg", "./logo.png"], style: "professional"
When you request a specific orientation (square, landscape, or portrait), the server automatically appends an invisible helper image (assets/square.png, assets/landscape.png, or assets/portrait.png) so Gemini respects the target dimensions.
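The helper-image behavior described above could be sketched roughly as follows. The function names are hypothetical; only the asset paths and the idea of appending the helper to the context images come from this README, and the server's real logic may resolve paths differently.

```typescript
type AspectRatio = "square" | "landscape" | "portrait";

// Maps an orientation preset to its documented helper asset.
function helperImagePath(ratio: AspectRatio): string {
  return `assets/${ratio}.png`;
}

// Returns a new context-image list with the helper appended last,
// leaving the caller's array untouched.
function withHelperImage(images: readonly string[], ratio: AspectRatio): string[] {
  return [...images, helperImagePath(ratio)];
}
```

For example, `withHelperImage(["./reference-skyline.jpg"], "landscape")` yields the user's reference image followed by assets/landscape.png.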
edit_image
Modifies an existing image using a focused text instruction. This tool keeps the original framing unless you explicitly ask for structural changes.
Parameters:
description (string, required): Instructions describing the edits to apply to the provided image.
image (string, required): Path to the image file you want to edit (absolute or relative).
outputPath (string, optional): Where to save the edited result. If omitted, the server uses the working directory and an auto-generated filename.
Usage Examples:
# Simple edit
Edit image: "Soften skin tones and remove flyaway hairs", image: "./headshot.png"
# Heavier retouch
Edit image: "Turn the product label red and add subtle sparkle highlights", image: "./product-shot.jpg"
# Custom path and watermark (top-left)
Generate an image of a space cat, outputPath: "./images/epic_pizza.png", watermarkPath: "./my_logo.png", watermarkPosition: "top-left"
Watermark Functionality
The generate_image tool supports adding watermarks to your images:
Features:
Add image watermarks to any generated output
Position in any corner (watermarkPosition)
Smart sizing (25% of image width, maintaining aspect ratio)
Consistent spacing (3% padding from edges)
Supports PNG, JPG, WebP watermark files
Only applied when the watermarkPath parameter is provided
Usage:
# For image generation
watermarkPath: "./my-brand-logo.png"
# With context images
watermarkPath: "./watermark.jpg"
Watermark Specifications:
Position: Configurable corner via watermarkPosition
Size: 25% of image width (maintains watermark aspect ratio)
Padding: 3% of image width from the selected edges
Blend mode: Over (watermark appears on top of image)
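The sizing and padding rules above amount to a little arithmetic. The following sketch computes where a watermark would land; `placeWatermark` and its types are hypothetical, and rounding to whole pixels is an assumption rather than documented behavior.

```typescript
type WatermarkPosition = "top-left" | "top-right" | "bottom-left" | "bottom-right";

interface Size { width: number; height: number }
interface Placement { left: number; top: number; width: number; height: number }

function placeWatermark(
  image: Size,
  watermark: Size,
  position: WatermarkPosition = "bottom-right",
): Placement {
  // Width is 25% of the image width; height follows the watermark's aspect ratio.
  const width = Math.round(image.width * 0.25);
  const height = Math.round(width * (watermark.height / watermark.width));
  // Padding is 3% of the image width, applied to both selected edges.
  const padding = Math.round(image.width * 0.03);
  const left = position.endsWith("left") ? padding : image.width - width - padding;
  const top = position.startsWith("top") ? padding : image.height - height - padding;
  return { left, top, width, height };
}
```

For a 1000x1000 image and a 200x100 watermark, the watermark scales to 250x125 with 30 px of padding, so bottom-right places it at left 720, top 845.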
Save Functionality:
Default: Images are saved in the directory from which the MCP client is executed
Automatic naming: Generated based on description, date and time
Supported formats: PNG, JPG, WebP (depending on what Gemini returns)
Automatic creation: Creates necessary folders if they don't exist
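The README does not spell out the exact naming pattern, so the following is only one plausible way to derive a filename from the description, date, and time. The function name, the slug rules, the 40-character cap, and the use of UTC are all assumptions made for illustration.

```typescript
// Hypothetical naming scheme: <slug>-<YYYYMMDD>-<HHMMSS>.<ext>, using UTC.
function buildFileName(description: string, now: Date, extension: string = "png"): string {
  const slug =
    description
      .toLowerCase()
      .replace(/[^a-z0-9]+/g, "-") // collapse anything unsafe into single hyphens
      .slice(0, 40)                // keep filenames reasonably short
      .replace(/^-+|-+$/g, "")     // trim leading/trailing hyphens
    || "image";                    // fallback when nothing usable remains
  const stamp = now
    .toISOString()                 // e.g. 2025-01-02T03:04:05.000Z
    .slice(0, 19)
    .replace(/[-:]/g, "")
    .replace("T", "-");
  return `${slug}-${stamp}.${extension}`;
}
```

Including the timestamp keeps repeated generations with the same description from overwriting each other.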
Development
Available Scripts
npm run build: Compiles TypeScript to JavaScript
npm run dev: Development mode with automatic reload
npm start: Runs the compiled server
npm run cli: Runs the CLI entry directly (node dist/cli.js)
Project Structure
gemini-image-mcp-server/
├── src/
│   ├── index.ts              # Main server entry point
│   ├── cli.ts                # CLI entry point (generate/edit commands)
│   ├── services/
│   │   ├── gemini.ts         # Gemini AI calls
│   │   ├── imageService.ts   # File system + watermark handling
│   │   └── serviceFactory.ts # Shared initialization helpers
│   ├── tools/
│   │   ├── index.ts          # Tools exports
│   │   ├── generateImage.ts  # Tool for creating new images
│   │   └── editImage.ts      # Tool for editing existing images
│   └── types/
│       └── index.ts          # Type definitions
├── dist/                     # Compiled files
├── package.json
├── tsconfig.json
└── README.md
Troubleshooting
Error: "GOOGLE_API_KEY environment variable is required"
Make sure you have configured the GOOGLE_API_KEY environment variable with your Google AI API key.
Error: "Could not generate image"
Verify that your API key is valid and has permissions for the gemini-2.5-flash-image-preview model
Ensure the description doesn't contain content that might be blocked by safety filters
File saving error
Verify you have write permissions in the specified path
Make sure the path is valid and accessible
If specifying a folder, end it with /
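The trailing-slash tip suggests the server tells folders and files apart by how the path ends. A minimal sketch of that rule, with a hypothetical `resolveOutputPath` helper and assuming nothing beyond what the tips above state:

```typescript
// Hypothetical resolver: a trailing "/" means "folder", anything else is a full file path.
function resolveOutputPath(outputPath: string | undefined, generatedName: string): string {
  if (outputPath === undefined || outputPath === "") {
    // No path given: save under the auto-generated name in the working directory.
    return generatedName;
  }
  if (outputPath.endsWith("/")) {
    // Folder: keep the folder and append the auto-generated name.
    return outputPath + generatedName;
  }
  // Otherwise the caller supplied the complete file path.
  return outputPath;
}
```

Under this rule, "./images" without the slash would be treated as a file named images, which is a likely cause of a file saving error when a folder was intended.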
Server not responding
Verify the server is running correctly
Check logs in stderr for error messages
Make sure the MCP client is configured correctly
License
MIT
Contributing
Contributions are welcome. Please open an issue before making significant changes.