gemini_generate_image
Generate or edit images using natural language prompts. Optionally provide images to edit and a conversation ID for iterative refinement.
Instructions
Generate or edit images with Gemini.
Without files: generates a new image from the text prompt. With files: edits/transforms the provided image(s) based on the prompt.
Pass conversation_id from a previous call to continue refining images in the same conversation thread (e.g. "make it more dramatic", "add rain"). You can also use a cid from the Gemini web URL (gemini.google.com/app/{cid}).
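As a minimal sketch of the browser-URL option above, the helper below pulls the cid out of a URL shaped like gemini.google.com/app/{cid} and wraps it in the single-element list form. The function name `conversation_id_from_url` and the sample cid are hypothetical, and the helper assumes the URL includes a scheme such as https://.

```python
from typing import List
from urllib.parse import urlparse


def conversation_id_from_url(url: str) -> List[str]:
    """Return [cid] for a URL shaped like https://gemini.google.com/app/{cid}."""
    parsed = urlparse(url)
    # Split the path into non-empty segments; we expect exactly ["app", cid].
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc != "gemini.google.com" or len(parts) != 2 or parts[0] != "app":
        raise ValueError(f"not a Gemini conversation URL: {url}")
    return [parts[1]]


# Hypothetical cid, for illustration only.
print(conversation_id_from_url("https://gemini.google.com/app/abc123def456"))
# → ['abc123def456']
```

The returned list can be passed as conversation_id in place of a full [cid, rid, rcid] triple.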
Images are saved to ~/Pictures/gemini/ and full file paths are returned.
Args:
- prompt: Description of the image to generate, or an editing instruction (e.g. 'change the background to blue', 'make it a cartoon').
- model: Model name. Defaults to gemini-3.0-flash-thinking (Nano Banana 2, supports non-square aspect ratios).
- files: Optional list of file paths to images to edit/transform.
- conversation_id: Optional list of [cid, rid, rcid] from a previous gemini_generate_image response to continue the conversation. Passing just [cid] (from the browser URL) also works.
Returns: JSON with generated image paths, conversation_id for continuation, or an error message.
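The arguments described above can be assembled and checked before calling the tool. The sketch below is illustrative only: `build_arguments` is a hypothetical helper, not part of the tool, and the way the resulting dictionary is sent (for example through an MCP client) is left out because this page does not specify it. The rid and rcid values in the usage lines are placeholders.

```python
import json
from typing import Any, Dict, List, Optional


def build_arguments(
    prompt: str,
    model: Optional[str] = None,
    files: Optional[List[str]] = None,
    conversation_id: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build the argument dict for gemini_generate_image (hypothetical helper)."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    args: Dict[str, Any] = {"prompt": prompt}
    # Omit model so the tool falls back to its documented default.
    if model is not None:
        args["model"] = model
    # With files the tool edits the given images; without, it generates a new one.
    if files:
        args["files"] = list(files)
    if conversation_id is not None:
        # Either [cid] alone or the full [cid, rid, rcid] triple is accepted.
        if len(conversation_id) not in (1, 3):
            raise ValueError("conversation_id must be [cid] or [cid, rid, rcid]")
        args["conversation_id"] = list(conversation_id)
    return args


# First call: generate a new image from text only.
first = build_arguments("a lighthouse at dusk, watercolor")
# Follow-up: refine in the same thread (placeholder identifiers).
follow_up = build_arguments("add rain", conversation_id=["cid-1", "rid-1", "rcid-1"])
print(json.dumps(follow_up, indent=2))
```

Leaving out model, files, and conversation_id keeps the payload minimal, so the tool applies its own defaults.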
Input Schema
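| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Description of the image to generate, or an editing instruction. | |
| model | No | Model name. | gemini-3.0-flash-thinking |
| files | No | List of file paths to images to edit/transform. | |
| conversation_id | No | List of [cid, rid, rcid] from a previous response, or just [cid] from the browser URL. | |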
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | JSON with generated image paths and a conversation_id for continuation, or an error message. | |