generate_image
Generate images from text prompts. Optionally use reference images for character or style consistency.
Instructions
Generate an image from a text prompt, optionally conditioned on one or more reference images (file paths, http(s) URLs, or data URLs) for character or style consistency. The tool sends `modalities: ["image","text"]` by default; override this via the `modalities` field if needed.
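As a rough illustration of the call described above, the following sketch builds the arguments for a `generate_image` call as a plain Python dict. The field names are those of the tool's input schema; the prompt text and the reference URL are placeholder values, and how the arguments are actually delivered depends on the client you use.

```python
import json

# Arguments for a generate_image call. The field names are the tool's own;
# the prompt text and reference URL are placeholder values for illustration.
arguments = {
    "prompt": "A red fox reading a book, watercolor style",
    "input_images": ["https://example.com/fox-reference.png"],  # placeholder URL
}

# modalities is left unset, so the tool sends its default ["image", "text"].
payload = json.dumps(arguments, indent=2)
print(payload)
```

Only `prompt` is required; the reference image is included here to show the optional conditioning path.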
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Text prompt describing the image to generate. | |
| model | No | | |
| aspect_ratio | No | Output aspect ratio (e.g. 1:1, 16:9, 9:16, 4:3, 3:4, 21:9). Model-dependent. | |
| image_size | No | Output resolution bucket. 1K is the default; 0.5K / 2K / 4K are model-dependent. | 1K |
| max_tokens | No | Cap on completion tokens. Defaults to the model context window, which can trip free-tier quotas; set e.g. 4096 on low-credit accounts. | |
| save_path | No | Optional path to save the image. Routed through the OPENROUTER_OUTPUT_DIR sandbox. | |
| input_images | No | Optional reference images for visual consistency. Each entry may be a local file path (sandboxed to OPENROUTER_INPUT_DIR / OPENROUTER_OUTPUT_DIR / cwd), an http(s) URL, or a `data:image/...;base64,...` URL. Inlined as multimodal user content in the order given. | |
| modalities | No | Override the default `modalities: ["image","text"]` sent to OpenRouter. Most callers should leave this unset. Provide e.g. `["text"]` to suppress image output for inspection / captioning. | `["image","text"]` |