fix_image
Repairs images with garbled or glitched text by splitting them into tiles, re-rendering each tile, and stitching the tiles back together so the text is legible.
Instructions
Fix an image that has glitched or garbled text by splitting it into tiles, re-rendering each tile, and stitching the results back together. This works because each tile contains less text for the model to handle at once. Use this when a generated image has text artifacts or regions crowded with too much text.
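
The split-and-stitch step described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the helper names (`parse_grid`, `tile_boxes`, `split`, `stitch`) are hypothetical, the image is modeled as a plain list of pixel rows, and the re-rendering call to the image model is left out entirely.

```python
from typing import List, Tuple

# (left, top, right, bottom); right and bottom are exclusive.
Box = Tuple[int, int, int, int]


def parse_grid(grid: str) -> Tuple[int, int]:
    """Parse a 'cols x rows' string such as '2x2' into (cols, rows)."""
    cols_text, rows_text = grid.lower().split("x")
    cols, rows = int(cols_text), int(rows_text)
    if cols < 1 or rows < 1:
        raise ValueError(f"grid must be at least 1x1, got {grid!r}")
    return cols, rows


def tile_boxes(width: int, height: int, grid: str = "2x2") -> List[Box]:
    """Return tile boxes in row-major order that cover the image exactly."""
    cols, rows = parse_grid(grid)
    # Integer edges so tiles neither overlap nor leave gaps,
    # even when the size is not divisible by the grid.
    xs = [width * c // cols for c in range(cols + 1)]
    ys = [height * r // rows for r in range(rows + 1)]
    return [
        (xs[c], ys[r], xs[c + 1], ys[r + 1])
        for r in range(rows)
        for c in range(cols)
    ]


def split(pixels: List[list], grid: str = "2x2") -> List[List[list]]:
    """Cut a list-of-rows image into tiles (each tile is a list of rows)."""
    height, width = len(pixels), len(pixels[0])
    return [
        [row[left:right] for row in pixels[top:bottom]]
        for (left, top, right, bottom) in tile_boxes(width, height, grid)
    ]


def stitch(tiles: List[List[list]], grid: str = "2x2") -> List[list]:
    """Reassemble row-major tiles into a single list-of-rows image."""
    cols, _ = parse_grid(grid)
    stitched = []
    for start in range(0, len(tiles), cols):
        band = tiles[start:start + cols]  # one horizontal band of tiles
        for y in range(len(band[0])):
            stitched.append([px for tile in band for px in tile[y]])
    return stitched
```

In a real pipeline each tile would be sent to the image model between `split` and `stitch`; without that step the two functions simply round-trip the image, which is a convenient sanity check.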
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Filename of the source image in /root/Pictures/pixel-surgeon | |
| prompt | No | Instructions for fixing each tile | Clean up and fix any garbled, glitched, or distorted text in this image tile. Preserve the style, colors, and layout exactly but make all text crisp and legible. |
| grid | No | How to split the image: cols x rows | 2x2 |
| image_size | No | Resolution for each tile | 1K |
| model | No | Model to use. Available: 'gemini-3.1-flash-image' (Gemini 3.1 Flash Image), 'gemini-2.5-flash-image' (Gemini 2.5 Flash Image), 'gpt-image-1' (GPT Image 1, OpenAI), 'gpt-image-2' (GPT Image 2, OpenAI), 'grok-imagine' (Grok Imagine, xAI). Set the DEFAULT_IMAGE_MODEL env var to change the default. Provider tradeoffs: grok-imagine is fastest and cheapest; the Gemini models are mid-quality with the best price/performance ratio (free tier available); gpt-image-2 is highest quality but slower and more expensive. Gemini models fall back to the free tier on billing errors. OpenAI models require OPENAI_API_KEY. Grok requires XAI_API_KEY. | gpt-image-2 |
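
The model-selection rules in the table can be illustrated with a short Python sketch. It is not the tool's code: `resolve_model` and `required_key` are hypothetical helpers, and the precedence order (explicit argument, then `DEFAULT_IMAGE_MODEL`, then the built-in default) is an assumption, since the description only says the env var changes the default. The description does not name a key variable for the Gemini models, so none is given here.

```python
import os
from typing import Mapping, Optional

# Model identifiers as listed in the input schema.
AVAILABLE_MODELS = (
    "gemini-3.1-flash-image",
    "gemini-2.5-flash-image",
    "gpt-image-1",
    "gpt-image-2",
    "grok-imagine",
)
BUILT_IN_DEFAULT = "gpt-image-2"


def resolve_model(
    requested: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> str:
    """Pick a model: explicit argument, then env var, then built-in default.

    The precedence order is an assumption made for this sketch.
    """
    model = requested or env.get("DEFAULT_IMAGE_MODEL") or BUILT_IN_DEFAULT
    if model not in AVAILABLE_MODELS:
        raise ValueError(f"unknown model {model!r}; choose from {AVAILABLE_MODELS}")
    return model


def required_key(model: str) -> Optional[str]:
    """Return the API key variable the description names for a model, if any."""
    if model.startswith("gpt-image-"):
        return "OPENAI_API_KEY"
    if model == "grok-imagine":
        return "XAI_API_KEY"
    return None  # the description names no key variable for Gemini models
```

For example, `resolve_model(None, {"DEFAULT_IMAGE_MODEL": "grok-imagine"})` returns `'grok-imagine'`, and `required_key` then reports that `XAI_API_KEY` must be set.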