Generate image (Grok Imagine, GPT Image 2, Seedream V4, Wan, Imagen 4, Nano Banana, Ideogram V3, Z-Image Turbo)
aetherwave_generate_image
Generate images from text prompts or combine with reference images for style transfer and editing. The tool submits a job, polls until done, and returns the resulting image URLs.
Instructions
Generates one or more images from a text prompt (T2I) or from a text prompt plus reference image(s) (I2I). Submits the job, polls until it reaches a terminal state, and returns the final image URLs. Default model is 'grok-imagine-t2i' (fast, 6 images per generation, 5 credits). Use list_image_models to see the full lineup with pricing. For I2I, pass referenceImages as an array of public image URLs and pick a model with I2I support (e.g. 'grok-imagine-i2i', 'wan-2.5-spicy-i2i').
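
The submit / poll / return flow described above can be sketched as follows. This is only an illustration of the pattern, not the tool's actual implementation: the `submit` and `get_status` callables, the `state` and `imageUrls` field names, and the status values are all hypothetical.

```python
import time
from typing import Callable, Dict, List

# Assumed status names; the real service may use different ones.
TERMINAL_STATES = {"succeeded", "failed"}


def generate_and_wait(
    submit: Callable[[Dict], str],
    get_status: Callable[[str], Dict],
    args: Dict,
    poll_interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Submit a job, poll until it is terminal, and return the image URLs."""
    job_id = submit(args)  # hypothetical: returns a job identifier
    deadline = time.monotonic() + timeout
    while True:
        status = get_status(job_id)  # hypothetical: returns a status dict
        if status["state"] in TERMINAL_STATES:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout}s")
        sleep(poll_interval)
    if status["state"] == "failed":
        raise RuntimeError(status.get("error", "generation failed"))
    return status["imageUrls"]
```

A caller of aetherwave_generate_image does not need to write this loop, since the tool polls internally; the sketch only shows what "polls until terminal" means.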
Model selection guide (when the user does not specify a model)
Default: grok-imagine-t2i (5 cr, 6 outputs per call, fast, general purpose).
Strong recommendation: when a single high-quality output is what's wanted (most agent / one-shot workflows), prefer gpt-image-2-t2i (9 cr @ 1K / higher @ 2K, single deterministic image, best general quality across realism, illustration, typography, and composition; supports up to 2K resolution and most aspect ratios including auto). This is the front-runner for serious creative output where you don't need to pick from 6 variations.
Pick a different model when the prompt has these signals:
"single best result" / "one image" / production / no time to pick from variations -> gpt-image-2-t2i (9 cr, 1 output, top general quality)
"photoreal" / "photo of" / "realistic" -> gpt-image-2-t2i (9 cr, best general realism) or imagen-4 (12 cr, very high quality) or z-image-turbo (3 cr, fastest)
"highest quality" / "premium" / no budget -> gpt-image-2-t2i at 2K, or grok-imagine-quality-t2i (16 cr @ 1K, 22 cr @ 2K), or imagen-4-ultra
Text inside the image (signs, posters, typography) -> ideogram-v3-t2i (best in class) or gpt-image-2-t2i (also strong)
Artistic / painterly / stylized -> midjourney-t2i
Album art / cover art -> gpt-image-2-t2i for one strong image; grok-imagine-t2i for 6 variations to choose from; seedream-v4-t2i if 4K wanted
Logo or design with embedded text -> ideogram-v3-t2i
NSFW / adult / explicit -> wan-2.5-spicy-t2i (auto-tags creation as 18+; routes to adult gallery)
Cheapest possible / quick test -> z-image-turbo (3 cr)
Multiple variations to compare -> keep grok-imagine-t2i (6 outputs default) or use numImages on a multi-output model
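
As a rough illustration, the signal-to-model guide could be approximated with a small keyword matcher. This is a deliberately crude sketch: the regular expressions, the rule priority order, and the `pick_model` helper are assumptions made for the example, and only a subset of the signals is covered. An agent would normally judge intent from the whole prompt rather than from keywords.

```python
import re
from typing import List, Tuple

DEFAULT_MODEL = "grok-imagine-t2i"

# (pattern, model ID), checked in order. The ordering is an assumption;
# the guide itself does not define a priority between signals.
RULES: List[Tuple[str, str]] = [
    (r"\b(nsfw|explicit)\b", "wan-2.5-spicy-t2i"),
    (r"\b(logo|signs?|posters?|typography)\b", "ideogram-v3-t2i"),
    (r"\b(painterly|artistic|stylized)\b", "midjourney-t2i"),
    (r"\b(cheapest|quick test)\b", "z-image-turbo"),
    (r"\bvariations\b", "grok-imagine-t2i"),
    (
        r"\b(photoreal\w*|photo of|realistic|one image|album art|"
        r"cover art|highest quality|premium)\b",
        "gpt-image-2-t2i",
    ),
]


def pick_model(prompt: str) -> str:
    """Return the first model whose signal pattern appears in the prompt."""
    text = prompt.lower()
    for pattern, model in RULES:
        if re.search(pattern, text):
            return model
    return DEFAULT_MODEL


if __name__ == "__main__":
    for p in ["A photo of a red fox in snow", "a cat on a sofa"]:
        print(p, "->", pick_model(p))
```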
For I2I (reference image provided): prefer the dedicated aetherwave_edit_image tool for "change something in this image" intent. Use aetherwave_generate_image with I2I models only when you specifically want style transfer (midjourney-i2i), premium quality (grok-imagine-quality-i2i), or adult content (wan-2.5-spicy-i2i).
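
The I2I routing rule above can be written as a small lookup. The intent labels (`edit`, `style_transfer`, `premium`, `adult`) and the `route_reference_request` helper are hypothetical names chosen for this sketch; only the tool and model IDs come from the guidance itself.

```python
from typing import Optional, Tuple

# intent label -> (tool name, model ID or None when the tool picks its own)
I2I_ROUTES = {
    "edit": ("aetherwave_edit_image", None),
    "style_transfer": ("aetherwave_generate_image", "midjourney-i2i"),
    "premium": ("aetherwave_generate_image", "grok-imagine-quality-i2i"),
    "adult": ("aetherwave_generate_image", "wan-2.5-spicy-i2i"),
}


def route_reference_request(intent: str) -> Tuple[str, Optional[str]]:
    """Pick the tool (and I2I model, if any) for a request with a reference image."""
    # Unknown intents fall back to the dedicated edit tool, matching the
    # "prefer aetherwave_edit_image" guidance.
    return I2I_ROUTES.get(intent, I2I_ROUTES["edit"])
```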
Always pass an explicit aspectRatio (e.g. "1:1" for square album art, "16:9" for video thumbnails, "9:16" for shorts/reels). Some upstream providers reject submissions with no aspect ratio.
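
One way to make sure aspectRatio is never omitted is to derive it from the intended use before calling the tool. The use-case keys and the `with_aspect_ratio` helper below are assumptions for illustration; the ratios are the ones named above.

```python
from typing import Any, Dict

# Use-case labels are hypothetical; the ratios come from the guidance above.
ASPECT_BY_USE = {
    "album_art": "1:1",
    "video_thumbnail": "16:9",
    "short": "9:16",
}


def with_aspect_ratio(args: Dict[str, Any], use_case: str) -> Dict[str, Any]:
    """Return a copy of args that always carries an explicit aspectRatio."""
    result = dict(args)
    if "aspectRatio" in result:
        return result  # caller already chose one; leave it alone
    if use_case not in ASPECT_BY_USE:
        # Refuse to guess, so the caller has to pick a ratio explicitly.
        raise ValueError(f"no aspect ratio known for use case {use_case!r}")
    result["aspectRatio"] = ASPECT_BY_USE[use_case]
    return result
```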
Ask the user only when:
The prompt contradicts itself (e.g., "highest quality but cheapest")
The user requested "the best model" with no context (in that case, surface 2-3 options with tradeoffs)
A single generation would cost more than 20 credits and the user has not confirmed
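
The three conditions above could be checked before submitting, roughly as in the sketch below. The `reason_to_ask` helper, its return labels, and the keyword tests are assumptions for illustration; in particular, detecting a contradiction or a missing context from substrings is a simplification of what an agent would actually judge.

```python
from typing import Optional

CONFIRM_THRESHOLD_CREDITS = 20  # from the guidance: ask above 20 credits


def reason_to_ask(
    prompt: str,
    estimated_credits: int,
    user_confirmed_cost: bool = False,
) -> Optional[str]:
    """Return a short reason to ask the user, or None to proceed silently."""
    text = prompt.lower()
    # Simplified contradiction check using the example from the guidance.
    if "highest quality" in text and "cheapest" in text:
        return "contradictory"
    # Simplified stand-in for "asked for the best model with no context".
    if "best model" in text:
        return "offer-options"
    if estimated_credits > CONFIRM_THRESHOLD_CREDITS and not user_confirmed_cost:
        return "confirm-cost"
    return None
```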
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Text description of the image to generate. | |
| model | No | Model ID. Use list_image_models for the full list. | grok-imagine-t2i |
| aspectRatio | No | Aspect ratio (e.g. '1:1', '16:9', '9:16'). Pass this explicitly when possible; some upstream providers reject submissions without an aspect ratio. Default ratios vary by model. | |
| resolution | No | Output resolution. Most models accept '1K' or '2K'; some accept '480p'/'720p'. | |
| referenceImages | No | Array of public image URLs for image-to-image generation. Required when using an I2I model. A single URL string is also accepted (wrapped as a one-element array). | |
| numImages | No | Number of images for models that support multiple outputs. | |
| negative_prompt | No | What to avoid in the output (supported by some models). | |
| seed | No | Seed for deterministic generation (supported by some models). | |
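
A client-side helper that assembles the arguments for this schema might look like the sketch below. The `build_arguments` function is hypothetical, and the rule that a model ID ending in `-i2i` is an I2I model is an assumption inferred from the model names in this document, not a documented contract.

```python
from typing import Any, Dict, List, Optional, Union


def build_arguments(
    prompt: str,
    model: str = "grok-imagine-t2i",
    aspect_ratio: Optional[str] = None,
    resolution: Optional[str] = None,
    reference_images: Union[None, str, List[str]] = None,
    num_images: Optional[int] = None,
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
) -> Dict[str, Any]:
    """Build the arguments dict for aetherwave_generate_image."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    # The tool accepts a single URL string, but normalizing here keeps
    # the payload shape consistent.
    if isinstance(reference_images, str):
        reference_images = [reference_images]
    # Assumption: I2I model IDs end in "-i2i".
    if model.endswith("-i2i") and not reference_images:
        raise ValueError("referenceImages is required for I2I models")
    args = {
        "prompt": prompt,
        "model": model,
        "aspectRatio": aspect_ratio,
        "resolution": resolution,
        "referenceImages": reference_images,
        "numImages": num_images,
        "negative_prompt": negative_prompt,
        "seed": seed,
    }
    # Drop unset optional fields so model defaults apply.
    return {key: value for key, value in args.items() if value is not None}
```

For example, `build_arguments("a fox in snow", aspect_ratio="1:1")` yields a dict containing only `prompt`, `model`, and `aspectRatio`.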