generate_image
Create new images or modify existing ones using natural language instructions. Supports generating from prompts, editing local images, and combining multiple images.
Instructions
Generate new images or edit existing images using natural language instructions.
Supports multiple input modes:
Pure generation: Provide only a prompt to create new images
Multi-image conditioning: Provide up to 3 input images using the input_image_path_1, input_image_path_2, and input_image_path_3 parameters
File ID editing: Edit a previously uploaded image using its Files API file ID (file_id)
File path editing: Edit a local image by providing a single input image path
The mode is detected automatically from the parameters provided, or can be set explicitly with the mode parameter. Input images are read from the local filesystem to avoid massive token usage. Returns both MCP image content blocks and structured JSON with metadata.
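The four input modes above can be illustrated with a small sketch. This is not the server's actual detection logic, which is not documented here. The helper name describe_input_mode and the returned labels are hypothetical, and the only rule taken from the parameter descriptions is that file_id takes precedence over the input_image_path_* parameters.

```python
from typing import Optional, Sequence


def describe_input_mode(
    file_id: Optional[str] = None,
    input_image_paths: Sequence[str] = (),
) -> str:
    """Hypothetical helper: name the input mode implied by the arguments.

    The labels returned here are illustrative only; the real tool reports
    'generate' or 'edit' through its own `mode` parameter.
    """
    # Ignore empty or missing paths so callers can pass optional values as-is.
    paths = [p for p in input_image_paths if p]
    if len(paths) > 3:
        raise ValueError("at most 3 input images are supported")

    if file_id:
        # file_id takes precedence over input_image_path_* for the primary input.
        return "file_id_editing"
    if len(paths) == 1:
        return "file_path_editing"
    if len(paths) > 1:
        return "multi_image_conditioning"
    return "pure_generation"
```

Under these assumptions, a call with only a prompt falls through to pure generation, while passing both a file_id and a path is treated as File ID editing.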
Input Schema
Name | Required | Description | Default |
---|---|---|---|
file_id | No | Files API file ID to use as input/edit source (e.g., 'files/abc123'). If provided, this takes precedence over input_image_path_* parameters for the primary input. | |
input_image_path_1 | No | Path to first input image for composition/conditioning | |
input_image_path_2 | No | Path to second input image for composition/conditioning | |
input_image_path_3 | No | Path to third input image for composition/conditioning | |
mode | No | Operation mode: 'generate' for new image creation, 'edit' for modifying existing images. Auto-detected based on input parameters if not specified. | auto |
n | No | Requested image count (model may return fewer). | |
negative_prompt | No | Things to avoid (style, objects, text). | |
prompt | Yes | Clear, detailed image prompt. Include subject, composition, action, location, style, and any text to render. Add 'Square image' or '16:9' in the text to influence aspect ratio. | |
system_instruction | No | Optional system tone/style guidance. | |
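As a rough illustration of the schema, the sketch below assembles the arguments for a generate_image call and wraps them in an MCP tools/call JSON-RPC request. The helper build_generate_image_call, its keyword names, and the sample prompt are assumptions made for this example; only the argument keys come from the table above, and optional values that are left unset are simply omitted.

```python
import json
from typing import Any, Dict, Optional, Sequence


def build_generate_image_call(
    prompt: str,
    *,
    request_id: int = 1,
    input_image_paths: Sequence[str] = (),
    file_id: Optional[str] = None,
    mode: Optional[str] = None,
    n: Optional[int] = None,
    negative_prompt: Optional[str] = None,
    system_instruction: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a tools/call request for generate_image."""
    # prompt is the only required parameter in the schema.
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if len(input_image_paths) > 3:
        raise ValueError("at most 3 input images are supported")

    arguments: Dict[str, Any] = {"prompt": prompt}

    # Map the paths onto input_image_path_1, _2, _3 in order.
    for index, path in enumerate(input_image_paths, start=1):
        arguments[f"input_image_path_{index}"] = path

    # Include optional parameters only when the caller set them.
    optional = {
        "file_id": file_id,
        "mode": mode,
        "n": n,
        "negative_prompt": negative_prompt,
        "system_instruction": system_instruction,
    }
    arguments.update({k: v for k, v in optional.items() if v is not None})

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "generate_image", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_generate_image_call(
        "Square image of a red bicycle leaning on a brick wall, watercolor style",
        negative_prompt="text, watermark",
        n=2,
    )
    print(json.dumps(request, indent=2))
```

Most MCP client libraries build the JSON-RPC envelope themselves, so in practice only the arguments dictionary would usually be passed to the client's tool-call method.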