# Gemini 2.5 Flash Image Tool (Nano Banana)
A tool for generating and editing images using Google's Gemini 2.5 Flash Image API (affectionately known as "Nano Banana").
Includes both a **Python CLI tool** and a **Model Context Protocol (MCP) server** for integration with AI assistants like Claude Code.
## Features
- **Text-to-Image Generation**: Create images from text prompts
- **Image Editing**: Modify existing images with natural language instructions
- **Multi-Image Composition**: Combine multiple images into one
- **Flexible Aspect Ratios**: Support for 10 different aspect ratios
- **Character Consistency**: Maintain character appearance across multiple generations
- **MCP Server**: Integrate with Claude Code and other MCP clients
- **Command-Line Interface**: Easy-to-use CLI for quick operations
- **Python API**: Use as a library in your own projects
## Installation
### Option 1: MCP Server (Recommended for AI Assistants)
#### Simplest Install (using npx)
For Claude Code MCP configuration, you can reference the package directly via GitHub:
**Add to your MCP settings** (`~/.config/claude/claude_desktop_config.json`):
```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "github:brunoqgalvao/gemini-image-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
Then restart Claude Code. The `generate_image` tool becomes available once the server has started.
#### Local Install
```bash
# Clone the repository
git clone https://github.com/brunoqgalvao/gemini-image-mcp-server.git
cd gemini-image-mcp-server
# Run the installer
./install.sh
```
The installer will:
- Install Node.js dependencies
- Create a `.env` file from template
- Run validation tests
- Show you the MCP configuration to add to Claude Code
#### Manual Install
1. Clone or download this repository
2. Install Node.js dependencies:
```bash
npm install
```
3. Get your API key from [Google AI Studio](https://aistudio.google.com/apikey)
4. Create a `.env` file in the project directory:
```bash
GEMINI_API_KEY=your_api_key_here
```
5. Configure your MCP client (e.g., Claude Code):
**For macOS/Linux** - Add to `~/.config/claude/claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "node",
      "args": ["/absolute/path/to/gemini-image-mcp-server/index.js"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
**For Windows** - Add to `%APPDATA%\Claude\claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "node",
      "args": ["C:\\absolute\\path\\to\\gemini-image-mcp-server\\index.js"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
6. Restart Claude Code or your MCP client
#### Installing on Another Computer
**Easiest way** - Use npx. On any computer with Node.js, add this to your Claude Code MCP settings:
```json
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "github:brunoqgalvao/gemini-image-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
```
No cloning needed! `npx` will fetch and run it automatically.
**Alternative: Local installation**
```bash
# Clone and install
git clone https://github.com/brunoqgalvao/gemini-image-mcp-server.git
cd gemini-image-mcp-server
./install.sh
```
### Option 2: Python CLI Tool
1. Clone or download this repository
2. Install Python dependencies:
```bash
pip install -r requirements.txt
```
3. Get your API key from [Google AI Studio](https://aistudio.google.com/apikey)
4. Create a `.env` file in the project directory:
```bash
GEMINI_API_KEY=your_api_key_here
```
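If you call the Python API from your own script, the key stored in `.env` has to reach the process environment first. The following is a minimal sketch that assumes a plain `KEY=value` file. `load_env_file` is a hypothetical helper, not part of this repository, and the tool itself may load `.env` differently (for example with `python-dotenv`).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Hypothetical helper: copy KEY=value pairs from a file into os.environ.

    Variables that are already set in the environment are left untouched.
    Returns a dict of the pairs found in the file.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an '=' separator.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip("'\"")  # drop optional surrounding quotes
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded
```

After calling `load_env_file()`, `os.environ.get("GEMINI_API_KEY")` returns the key from the file unless the variable was already set in the shell.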
## Usage
### MCP Server
Once configured, the `generate_image` tool will be available in your MCP client:
**Parameters:**
- `prompt` (required): Text description of the image to generate or edits to make
- `output_path` (required): Path where the image will be saved (must end in .png)
- `input_images` (optional): Array of paths to input images for editing/composition
- `aspect_ratio` (optional): One of: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9
- `image_only` (optional): Set to true for image-only output without text
**Example usage in Claude Code:**
```
"Generate a sunset over mountains and save it to sunset.png"
```
The MCP server will handle the API call and save the image automatically.
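The parameter rules listed above can be checked on the client side before a tool call is sent. This is a sketch under stated assumptions: `check_generate_image_args` is a hypothetical helper that only mirrors the rules in this README, and the server's own validation may differ.

```python
SUPPORTED_ASPECT_RATIOS = {
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
}


def check_generate_image_args(args):
    """Hypothetical check: return a list of problems (empty if none found)."""
    problems = []
    prompt = args.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required")
    output_path = args.get("output_path")
    if not isinstance(output_path, str) or not output_path.endswith(".png"):
        problems.append("output_path is required and must end in .png")
    ratio = args.get("aspect_ratio")
    if ratio is not None and ratio not in SUPPORTED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect_ratio: {ratio}")
    images = args.get("input_images")
    if images is not None and not isinstance(images, list):
        problems.append("input_images must be an array of paths")
    return problems


# A well-formed argument set for the sunset request shown above.
example_args = {
    "prompt": "A sunset over mountains",
    "output_path": "sunset.png",
    "aspect_ratio": "16:9",
}
print(check_generate_image_args(example_args))
```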
### Command Line
**Basic text-to-image generation:**
```bash
python gemini_image_tool.py "A cat eating a banana in space" -o cat_banana.png
```
**Edit an existing image:**
```bash
python gemini_image_tool.py "Remove the background" -i photo.jpg -o edited.png
```
**Compose multiple images:**
```bash
python gemini_image_tool.py "Combine these into a collage" -i img1.jpg -i img2.jpg -o collage.png
```
**Specify aspect ratio:**
```bash
python gemini_image_tool.py "A cinematic landscape" -o wide.png --aspect-ratio 21:9
```
**Image-only output (no text response):**
```bash
python gemini_image_tool.py "A red apple" -o apple.png --image-only
```
**Save full API response:**
```bash
python gemini_image_tool.py "A sunset" -o sunset.png --save-json response.json
```
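To drive the CLI from another Python script, the same flags can be assembled into an argument list and passed to `subprocess`. This is a sketch that assumes `gemini_image_tool.py` sits in the current working directory. `build_command` and `run_tool` are hypothetical helper names, and only flags documented in this README are used.

```python
import subprocess
import sys


def build_command(prompt, output="output.png", inputs=(), aspect_ratio=None,
                  image_only=False, save_json=None,
                  script="gemini_image_tool.py"):
    """Hypothetical helper: build the argv list for one CLI invocation."""
    cmd = [sys.executable, script, prompt, "-o", output]
    for path in inputs:
        cmd += ["-i", path]  # -i may be repeated, once per input image
    if aspect_ratio:
        cmd += ["--aspect-ratio", aspect_ratio]
    if image_only:
        cmd.append("--image-only")
    if save_json:
        cmd += ["--save-json", save_json]
    return cmd


def run_tool(prompt, **options):
    """Run the CLI; raises CalledProcessError on a non-zero exit status."""
    return subprocess.run(build_command(prompt, **options), check=True)
```

For example, `run_tool("Combine these into a collage", output="collage.png", inputs=["img1.jpg", "img2.jpg"])` corresponds to the collage command above. Passing a list instead of a shell string avoids quoting problems with prompts that contain spaces or quotes.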
### Python API
```python
from gemini_image_tool import GeminiImageTool

# Initialize the tool
tool = GeminiImageTool(api_key="your_api_key_here")

# Generate an image
result = tool.generate_content(
    prompt="A futuristic city at night",
    aspect_ratio="16:9",
    output_path="city.png",
)

# Edit an image
result = tool.generate_content(
    prompt="Make the sky purple",
    input_images=["city.png"],
    output_path="city_purple.png",
)

# Combine multiple images
result = tool.generate_content(
    prompt="Create a before/after comparison",
    input_images=["before.jpg", "after.jpg"],
    aspect_ratio="16:9",  # must be one of the supported aspect ratios
    output_path="comparison.png",
)
```
## Available Aspect Ratios
- `1:1` - Square (default)
- `2:3` - Portrait
- `3:2` - Landscape
- `3:4` - Portrait
- `4:3` - Landscape
- `4:5` - Portrait
- `5:4` - Landscape
- `9:16` - Vertical (social media)
- `16:9` - Widescreen
- `21:9` - Cinematic
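When the target size is known in pixels, the nearest supported ratio can be picked programmatically. This is a minimal sketch; `closest_aspect_ratio` is a hypothetical helper and not part of the tool. It compares ratios on a logarithmic scale so that portrait and landscape shapes are treated symmetrically.

```python
import math

SUPPORTED_ASPECT_RATIOS = [
    "1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9",
]


def closest_aspect_ratio(width, height):
    """Hypothetical helper: return the supported ratio nearest to width:height."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)

    def distance(ratio):
        w, h = map(int, ratio.split(":"))
        # Distance in log space: 2:1 and 1:2 are equally far from 1:1.
        return abs(math.log(w / h) - target)

    return min(SUPPORTED_ASPECT_RATIOS, key=distance)


print(closest_aspect_ratio(1920, 1080))  # 16:9
print(closest_aspect_ratio(1080, 1920))  # 9:16
```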
## Supported Image Formats
**Input:** JPG, JPEG, PNG, WebP, GIF
**Output:** PNG
## Pricing
As of 2025, Gemini 2.5 Flash Image is priced at:
- $30.00 per 1 million output tokens
- Each image = 1290 output tokens
- Cost per image: ~$0.039
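The per-image figure follows directly from the two numbers above: 1290 tokens × $30.00 / 1,000,000 tokens = $0.0387. A small sketch for estimating batch costs, using only the prices quoted in this README (they may change, so check current pricing before relying on the result):

```python
# Figures quoted in this README; verify against current Gemini API pricing.
PRICE_PER_MILLION_OUTPUT_TOKENS_USD = 30.00
OUTPUT_TOKENS_PER_IMAGE = 1290


def estimate_cost_usd(image_count):
    """Estimate the output-token cost in USD for a number of generated images."""
    tokens = image_count * OUTPUT_TOKENS_PER_IMAGE
    return tokens * PRICE_PER_MILLION_OUTPUT_TOKENS_USD / 1_000_000


print(f"1 image:    ${estimate_cost_usd(1):.4f}")    # $0.0387
print(f"100 images: ${estimate_cost_usd(100):.2f}")  # $3.87
```

The estimate covers image output tokens only; any text output or input tokens are not included.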
## Use Cases
- **E-commerce**: Product photography and variations
- **Content Creation**: Social media graphics, blog images
- **Marketing**: Ad creatives, promotional materials
- **Storytelling**: Consistent character illustrations
- **Photo Editing**: Background removal, color correction, object removal
- **Design**: Logo variations, mockups, concept art
## Command-Line Arguments
```
positional arguments:
  prompt                Text prompt for image generation/editing

optional arguments:
  -h, --help            Show help message
  -i INPUT, --input INPUT
                        Input image file path (can be specified multiple times)
  -o OUTPUT, --output OUTPUT
                        Output image file path (default: output.png)
  -a ASPECT_RATIO, --aspect-ratio ASPECT_RATIO
                        Output aspect ratio (default: 1:1)
  --image-only          Request image-only output (no text response)
  --api-key API_KEY     Google AI API key (or set GEMINI_API_KEY env variable)
  --save-json SAVE_JSON
                        Save full API response to JSON file
```
## Error Handling
The tool handles the following error conditions:
- Missing API keys
- Invalid image paths
- Unsupported image formats
- Invalid aspect ratios
- API request failures
- Network errors
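Several of these conditions can be caught before any request is made. The sketch below shows one way a caller might pre-check the API key, input paths, and input formats. `preflight_errors` is a hypothetical helper and does not reflect the tool's actual implementation; the extension list is taken from the supported input formats above.

```python
import os
from pathlib import Path

# Input formats listed in this README: JPG, JPEG, PNG, WebP, GIF.
SUPPORTED_INPUT_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp", ".gif"}


def preflight_errors(input_images=(), api_key=None):
    """Hypothetical pre-check: return a list of problems found before calling the API."""
    errors = []
    if not (api_key or os.environ.get("GEMINI_API_KEY")):
        errors.append("missing API key: pass one or set GEMINI_API_KEY")
    for image in input_images:
        path = Path(image)
        if not path.is_file():
            errors.append(f"input image not found: {image}")
        elif path.suffix.lower() not in SUPPORTED_INPUT_EXTENSIONS:
            errors.append(f"unsupported image format: {image}")
    return errors
```

An empty list means the local checks passed; API request failures and network errors can still occur afterwards and need to be handled at call time.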
## Notes
- All generated images include a SynthID watermark (added by Google)
- The model benefits from Gemini's world knowledge for enhanced generation
- Character consistency works best with clear, descriptive prompts
- For best results, be specific in your prompts
## Documentation
For more information about Gemini 2.5 Flash Image:
- [Official Documentation](https://ai.google.dev/gemini-api/docs/image-generation)
- [Google Developers Blog](https://developers.googleblog.com/en/introducing-gemini-2-5-flash-image/)
## License
This tool is provided as-is for use with the Gemini API. See Google's terms of service for API usage restrictions.