# Gemini Image MCP Server

A Model Context Protocol (MCP) server for image generation and editing using Google Gemini AI. Supports optional context images to guide results and now includes a dedicated edit workflow. Optimized for creating eye-catching social media images with square (1:1) format by default.

## Features

- ✨ Image generation with Google Gemini AI
- 🎨 Multiple aspect ratios (1:1, 16:9, 9:16, 4:3, 3:4)
- 📱 Optimized for social media with 1:1 format by default
- 🎯 Custom style support
- 🧩 Context images to guide generation
- ✏️ Dedicated edit tool for modifying existing assets without juggling extra options
- 🏷️ **Watermark support** - Overlay watermark images on generated results
- 💾 Automatic saving of images to local files
- 📁 Flexible output path configuration
- 🛡️ Customizable safety settings

## Installation

1. Clone this repository
2. Install dependencies:

   ```bash
   npm install
   ```

3. Build the project:

   ```bash
   npm run build
   ```

## Configuration

### Environment Variables

You need to configure your Google AI API key:

```bash
export GOOGLE_API_KEY="your-api-key-here"
```

### Getting Google AI API Key

1. Go to [Google AI Studio](https://makersuite.google.com/app/apikey)
2. Create a new API key
3. Copy the key and set it as an environment variable

## Client Configuration

```json
{
  "servers": {
    "gemini-image": {
      "command": "node",
      "args": ["/full/path/to/project/dist/index.js"],
      "env": {
        "GOOGLE_API_KEY": "your-api-key-here"
      }
    }
  }
}
```

## Command Line Interface

In addition to the MCP server, the project now ships with a CLI for quick terminal-friendly workflows.

1. Build the project once:

   ```bash
   npm run build
   ```

2. Make sure `GOOGLE_API_KEY` is set in your environment.
3. Explore the CLI:

   ```bash
   node dist/cli.js --help
   # or, after publishing/packing:
   gemini-image --help
   ```

### Commands

- `gemini-image generate`: Create new imagery from a text prompt.

  ```bash
  gemini-image generate --prompt "A banana astronaut on Mars" --output ./images/
  ```

- `gemini-image edit`: Apply instructions to an existing image.

  ```bash
  gemini-image edit --prompt "Add neon lights to the skyline" --input ./images/city.png
  ```

Both commands support `--help` for detailed, friendly option descriptions. CLI option names are intentionally concise (for example `--prompt`, `--context`, `--input`) so they are easier to memorize than the MCP tool identifiers.

## Available Tools

### `generate_image`

Creates a brand-new image from a text description, optionally using one or more images as visual context. Use this tool when you want to generate fresh content.

**Parameters:**

- `description` (string, required): Detailed description of the desired image.
- `images` (string[], optional): Array of image paths used as context (absolute or relative). Use this to “edit” or guide style/content.
- `aspectRatio` (string, optional): Orientation preset (`square`, `landscape`, `portrait`). Default: `square`.
- `style` (string, optional): Additional style (e.g., "minimalist", "colorful", "professional", "artistic").
- `outputPath` (string, optional): Where to save the image. If omitted, saves in current directory.
- `watermarkPath` (string, optional): Path to watermark image to overlay.
- `watermarkPosition` (string, optional): One of `top-left`, `top-right`, `bottom-left`, `bottom-right`. Default: `bottom-right`.
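
For clients that call the tool programmatically, the parameters listed above could be typed and wrapped in an MCP `tools/call` request roughly as follows. This is a minimal sketch, not code from this repository: `GenerateImageArgs`, `ToolCallRequest`, and `buildGenerateImageRequest` are hypothetical names, and only the JSON-RPC envelope (`jsonrpc`, `id`, `method: "tools/call"`, `params.name`, `params.arguments`) comes from the MCP specification.

```typescript
// Hypothetical types mirroring the documented `generate_image` parameters.
type AspectRatio = "square" | "landscape" | "portrait";
type WatermarkPosition = "top-left" | "top-right" | "bottom-left" | "bottom-right";

interface GenerateImageArgs {
  description: string;
  images?: string[];
  aspectRatio?: AspectRatio;
  style?: string;
  outputPath?: string;
  watermarkPath?: string;
  watermarkPosition?: WatermarkPosition;
}

// Shape of a JSON-RPC 2.0 `tools/call` request as used by MCP clients.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: GenerateImageArgs };
}

// Hypothetical helper: rejects an empty description, then wraps the
// arguments unchanged so the server can apply its own defaults.
function buildGenerateImageRequest(id: number, args: GenerateImageArgs): ToolCallRequest {
  if (args.description.trim() === "") {
    throw new Error("description is required");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "generate_image", arguments: args },
  };
}

// Example: a landscape image guided by one context image.
const exampleRequest = buildGenerateImageRequest(1, {
  description: "Create a futuristic city skyline inspired by this photo",
  images: ["./reference-skyline.jpg"],
  aspectRatio: "landscape",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Most MCP clients build this envelope for you; the sketch is mainly useful for seeing which fields are optional and which string values the enumerated parameters accept.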

**Usage Examples:**

```
# Basic - saves to current directory
Generate an image of a mountain landscape at sunset with warm, minimalist style
```

```
# With context image to guide composition
Generate an image: "Create a futuristic city skyline inspired by this photo", images: ["./reference-skyline.jpg"], aspectRatio: "landscape"
```

```
# Multiple context images
Generate an image combining style of a logo and a photo, images: ["./photo.jpg", "./logo.png"], style: "professional"
```

When you request a specific orientation (`square`, `landscape`, or `portrait`), the server automatically appends an invisible helper image (`assets/square.png`, `assets/landscape.png`, or `assets/portrait.png`) so Gemini respects the target dimensions.

### `edit_image`

Modifies an existing image using a focused text instruction. This tool keeps the original framing unless you explicitly ask for structural changes.

**Parameters:**

- `description` (string, required): Instructions describing the edits to apply to the provided image.
- `image` (string, required): Path to the image file you want to edit (absolute or relative).
- `outputPath` (string, optional): Where to save the edited result. If omitted, the server uses the working directory and an auto-generated filename.
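
The required/optional split above can be checked before a request is sent. The sketch below is illustrative only and does not reproduce the server's own validation: `EditImageArgs` and `parseEditImageArgs` are hypothetical names, and the error messages are invented for the example.

```typescript
// Hypothetical type mirroring the documented `edit_image` parameters.
interface EditImageArgs {
  description: string;
  image: string;
  outputPath?: string;
}

// Hypothetical validator: narrows untyped input to EditImageArgs and
// throws a descriptive error when a documented constraint is violated.
function parseEditImageArgs(input: unknown): EditImageArgs {
  if (typeof input !== "object" || input === null) {
    throw new Error("arguments must be an object");
  }
  const { description, image, outputPath } = input as Record<string, unknown>;

  if (typeof description !== "string" || description.trim() === "") {
    throw new Error("`description` is required and must be a non-empty string");
  }
  if (typeof image !== "string" || image.trim() === "") {
    throw new Error("`image` is required and must be a non-empty path");
  }

  // `outputPath` is optional, but must be a string when present.
  if (typeof outputPath === "string") {
    return { description, image, outputPath };
  }
  if (outputPath !== undefined) {
    throw new Error("`outputPath` must be a string when provided");
  }
  return { description, image };
}

// Example: a valid call without an explicit output path.
const parsedArgs = parseEditImageArgs({
  description: "Soften skin tones and remove flyaway hairs",
  image: "./headshot.png",
});
console.log(parsedArgs);
```

Validating on the client side gives a clear error before any API call is made; the server remains the authority on what it accepts.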

**Usage Examples:**

```
# Simple edit
Edit image: "Soften skin tones and remove flyaway hairs", image: "./headshot.png"
```

```
# Heavier retouch
Edit image: "Turn the product label red and add subtle sparkle highlights", image: "./product-shot.jpg"
```

```
# Custom path and watermark (top-left), using generate_image
Generate an image of a space cat, outputPath: "./images/epic_pizza.png", watermarkPath: "./my_logo.png", watermarkPosition: "top-left"
```

## Watermark Functionality

The `generate_image` tool supports adding watermarks to your images:

**Features:**

- 🏷️ Add image watermarks to any generated output
- 📍 Position in any corner (`watermarkPosition`)
- 📏 Smart sizing (25% of image width, maintaining aspect ratio)
- 🎯 Consistent spacing (3% padding from edges)
- 🖼️ Supports PNG, JPG, WebP watermark files
- ⚡ Only applied when `watermarkPath` parameter is provided

**Usage:**

```bash
# For image generation
watermarkPath: "./my-brand-logo.png"

# With context images
watermarkPath: "./watermark.jpg"
```

**Watermark Specifications:**

- Position: Configurable corner via `watermarkPosition`
- Size: 25% of image width (maintains watermark aspect ratio)
- Padding: 3% of image width from the selected edges
- Blend mode: Over (watermark appears on top of image)

**Save Functionality:**

- Default: Images are saved in the directory from which the MCP client is executed
- Automatic naming: Generated from the description, date, and time
- Supported formats: PNG, JPG, WebP (depending on what Gemini returns)
- Automatic creation: Creates necessary folders if they don't exist

## Development

### Available Scripts

- `npm run build`: Compiles TypeScript to JavaScript
- `npm run dev`: Development mode with automatic reload
- `npm start`: Runs the compiled server
- `npm run cli`: Runs the CLI entry directly (`node dist/cli.js`)

### Project Structure

```
gemini-image-mcp-server/
├── src/
│   ├── index.ts                # Main server entry point
│   ├── cli.ts                  # CLI entry point (generate/edit commands)
│   ├── services/
│   │   ├── gemini.ts           # Gemini AI calls
│   │   ├── imageService.ts     # File system + watermark handling
│   │   └── serviceFactory.ts   # Shared initialization helpers
│   ├── tools/
│   │   ├── index.ts            # Tools exports
│   │   ├── generateImage.ts    # Tool for creating new images
│   │   └── editImage.ts        # Tool for editing existing images
│   └── types/
│       └── index.ts            # Type definitions
├── dist/                       # Compiled files
├── package.json
├── tsconfig.json
└── README.md
```

## Troubleshooting

### Error: "GOOGLE_API_KEY environment variable is required"

Make sure you have configured the `GOOGLE_API_KEY` environment variable with your Google AI API key.

### Error: "Could not generate image"

- Verify that your API key is valid and has permissions for the `gemini-2.5-flash-image-preview` model
- Ensure the description doesn't contain content that might be blocked by safety filters

### File saving error

- Verify you have write permissions in the specified path
- Make sure the path is valid and accessible
- If specifying a folder, end it with `/`

### Server not responding

- Verify the server is running correctly
- Check logs in stderr for error messages
- Make sure the MCP client is configured correctly

## License

MIT

## Contributing

Contributions are welcome. Please open an issue before making significant changes.
