Glama

generate_image_structured

Generate images with structured prompts for precise control over composition, style, lighting, and subjects, enabling AI agents to create detailed visual content.

Instructions

Generate images with detailed structured prompts for precise control over composition, style, lighting, and subjects. Ideal for AI agents that need fine-grained control.

Input Schema

Name | Required | Description | Default
scene | Yes | Overall scene description - the main subject and setting |
subjects | No | List of subjects with their positions and descriptions |
style | No | Art style (e.g., 'Digital art painting', 'Photorealistic', 'Watercolor', 'Oil painting') |
color_palette | No | Hex color codes for the palette (e.g., ['#000033', '#6A0DAD', '#FFFFFF']) |
lighting | No | Lighting description (e.g., 'Soft golden hour lighting', 'Dramatic chiaroscuro') |
mood | No | Emotional mood of the image (e.g., 'Serene', 'Dramatic', 'Mysterious') |
background | No | Background description |
composition | No | Compositional rules (e.g., 'Rule of thirds', 'Centered', 'Golden ratio') |
camera | No | Camera settings for photographic style control |
effects | No | Visual effects (e.g., ['Bokeh', 'Light rays', 'Lens flare', 'Motion blur']) |
negative_prompt | No | What to avoid in the image (e.g., 'blurry, low quality, distorted') |
model | No | Model ID or alias. Use list_models to see options. | flux_schnell
image_size | No | | landscape_16_9
num_images | No | |
seed | No | Seed for reproducible generation |
enable_safety_checker | No | Enable safety checker to filter inappropriate content |
output_format | No | Output image format | png
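
As an illustration of the schema above, a set of call arguments might look like the sketch below. The field names come from the input schema; every value is invented for this example, and only `scene` is required.

```python
import json

# Hypothetical example arguments for generate_image_structured.
# Field names follow the input schema; all values are made up for illustration.
example_arguments = {
    "scene": "A lighthouse on a rocky coast at dusk",
    "subjects": [
        {
            "type": "building",
            "description": "Weathered white lighthouse with a red lantern room",
            "position": "midground",
        }
    ],
    "style": "Photorealistic",
    "color_palette": ["#000033", "#6A0DAD", "#FFFFFF"],
    "lighting": "Soft golden hour lighting",
    "camera": {
        "angle": "Low angle",
        "lens": "Wide-angle",
        "f_number": "f/5.6",
        "iso": 100,
    },
    "effects": ["Light rays"],
    "negative_prompt": "blurry, low quality, distorted",
    "image_size": "landscape_16_9",
    "num_images": 1,
    "output_format": "png",
}

# The arguments are plain JSON-compatible data, so they serialize directly.
payload = json.dumps(example_arguments, indent=2)
print(payload)
```

Omitted optional fields are simply left out of the call rather than sent as empty values.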

Implementation Reference

  • The core handler function that implements the generate_image_structured tool logic. Constructs a structured JSON prompt from the input parameters (scene, subjects, style, etc.) and generates images using the Fal.ai model registry and queue strategy.
    async def handle_generate_image_structured(
        arguments: Dict[str, Any],
        registry: ModelRegistry,
        queue_strategy: QueueStrategy,
    ) -> List[TextContent]:
        """Handle the generate_image_structured tool."""
        model_input = arguments.get("model", "flux_schnell")
        try:
            model_id = await registry.resolve_model_id(model_input)
        except ValueError as e:
            return [
                TextContent(
                    type="text",
                    text=f"❌ {e}. Use list_models to see available options.",
                )
            ]
    
        # Build structured JSON prompt from arguments
        structured_prompt: Dict[str, Any] = {}
    
        # Required field
        structured_prompt["scene"] = arguments["scene"]
    
        # Optional structured fields
        for field in [
            "subjects",
            "style",
            "color_palette",
            "lighting",
            "mood",
            "background",
            "composition",
            "camera",
            "effects",
        ]:
            if field in arguments:
                structured_prompt[field] = arguments[field]
    
        # Convert structured prompt to JSON string
        json_prompt = json.dumps(structured_prompt, indent=2)
    
        fal_args: Dict[str, Any] = {
            "prompt": json_prompt,
            "image_size": arguments.get("image_size", "landscape_16_9"),
            "num_images": arguments.get("num_images", 1),
        }
    
        # Add optional generation parameters
        if "negative_prompt" in arguments:
            fal_args["negative_prompt"] = arguments["negative_prompt"]
        if "seed" in arguments:
            fal_args["seed"] = arguments["seed"]
        if "enable_safety_checker" in arguments:
            fal_args["enable_safety_checker"] = arguments["enable_safety_checker"]
        if "output_format" in arguments:
            fal_args["output_format"] = arguments["output_format"]
    
        # Use fast execution with timeout protection
        logger.info("Starting structured image generation with %s", model_id)
        try:
            result = await asyncio.wait_for(
                queue_strategy.execute_fast(model_id, fal_args),
                timeout=60,
            )
        except asyncio.TimeoutError:
            logger.error("Structured image generation timed out for %s", model_id)
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image generation timed out after 60 seconds with {model_id}. Please try again.",
                )
            ]
    
        # Check for error in response
        if "error" in result:
            error_msg = result.get("error", "Unknown error")
            logger.error(
                "Structured image generation failed for %s: %s", model_id, error_msg
            )
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image generation failed: {error_msg}",
                )
            ]
    
        images = result.get("images", [])
        if not images:
            logger.warning(
                "Structured image generation returned no images. Model: %s",
                model_id,
            )
            return [
                TextContent(
                    type="text",
                    text=f"❌ No images were generated by {model_id}. The prompt may have been filtered or the request format was invalid.",
                )
            ]
    
        # Extract URLs safely
        try:
            urls = [img["url"] for img in images]
        except (KeyError, TypeError) as e:
            logger.error("Malformed image response from %s: %s", model_id, e)
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image generation completed but response was malformed: {e}",
                )
            ]
    
        response = (
            f"🎨 Generated {len(urls)} image(s) with {model_id} (structured prompt):\n\n"
        )
        for i, url in enumerate(urls, 1):
            response += f"Image {i}: {url}\n"
        return [TextContent(type="text", text=response)]
  • The input schema and Tool definition for generate_image_structured, providing detailed structured input parameters for precise image generation control including subjects, camera settings, effects, and more.
    Tool(
        name="generate_image_structured",
        description="Generate images with detailed structured prompts for precise control over composition, style, lighting, and subjects. Ideal for AI agents that need fine-grained control.",
        inputSchema={
            "type": "object",
            "properties": {
                "scene": {
                    "type": "string",
                    "description": "Overall scene description - the main subject and setting",
                },
                "subjects": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "type": {
                                "type": "string",
                                "description": "Type of subject (e.g., 'person', 'building', 'animal')",
                            },
                            "description": {
                                "type": "string",
                                "description": "Detailed description of the subject",
                            },
                            "pose": {
                                "type": "string",
                                "description": "Pose or action of the subject",
                            },
                            "position": {
                                "type": "string",
                                "enum": ["foreground", "midground", "background"],
                                "description": "Position in the composition",
                            },
                        },
                    },
                    "description": "List of subjects with their positions and descriptions",
                },
                "style": {
                    "type": "string",
                    "description": "Art style (e.g., 'Digital art painting', 'Photorealistic', 'Watercolor', 'Oil painting')",
                },
                "color_palette": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Hex color codes for the palette (e.g., ['#000033', '#6A0DAD', '#FFFFFF'])",
                },
                "lighting": {
                    "type": "string",
                    "description": "Lighting description (e.g., 'Soft golden hour lighting', 'Dramatic chiaroscuro')",
                },
                "mood": {
                    "type": "string",
                    "description": "Emotional mood of the image (e.g., 'Serene', 'Dramatic', 'Mysterious')",
                },
                "background": {
                    "type": "string",
                    "description": "Background description",
                },
                "composition": {
                    "type": "string",
                    "description": "Compositional rules (e.g., 'Rule of thirds', 'Centered', 'Golden ratio')",
                },
                "camera": {
                    "type": "object",
                    "properties": {
                        "angle": {
                            "type": "string",
                            "description": "Camera angle (e.g., 'Low angle', 'Eye level', 'Bird's eye')",
                        },
                        "distance": {
                            "type": "string",
                            "description": "Shot distance (e.g., 'Close-up', 'Medium shot', 'Wide shot')",
                        },
                        "focus": {
                            "type": "string",
                            "description": "Focus description (e.g., 'Sharp focus on subject, blurred background')",
                        },
                        "lens": {
                            "type": "string",
                            "description": "Lens type (e.g., 'Wide-angle', '50mm portrait', 'Telephoto')",
                        },
                        "f_number": {
                            "type": "string",
                            "description": "Aperture (e.g., 'f/1.8', 'f/5.6', 'f/11')",
                        },
                        "iso": {
                            "type": "integer",
                            "description": "ISO setting (e.g., 100, 400, 800)",
                        },
                    },
                    "description": "Camera settings for photographic style control",
                },
                "effects": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Visual effects (e.g., ['Bokeh', 'Light rays', 'Lens flare', 'Motion blur'])",
                },
                "negative_prompt": {
                    "type": "string",
                    "description": "What to avoid in the image (e.g., 'blurry, low quality, distorted')",
                },
                "model": {
                    "type": "string",
                    "default": "flux_schnell",
                    "description": "Model ID or alias. Use list_models to see options.",
                },
                "image_size": {
                    "type": "string",
                    "enum": [
                        "square",
                        "landscape_4_3",
                        "landscape_16_9",
                        "portrait_3_4",
                        "portrait_9_16",
                    ],
                    "default": "landscape_16_9",
                },
                "num_images": {
                    "type": "integer",
                    "default": 1,
                    "minimum": 1,
                    "maximum": 4,
                },
                "seed": {
                    "type": "integer",
                    "description": "Seed for reproducible generation",
                },
                "enable_safety_checker": {
                    "type": "boolean",
                    "default": True,
                    "description": "Enable safety checker to filter inappropriate content",
                },
                "output_format": {
                    "type": "string",
                    "enum": ["jpeg", "png", "webp"],
                    "default": "png",
                    "description": "Output image format",
                },
            },
            "required": ["scene"],
        },
    ),
  • Registration of the generate_image_structured handler in the TOOL_HANDLERS dictionary, mapping the tool name to its handler function for execution in the MCP server.
    TOOL_HANDLERS = {
        # Utility tools (no queue needed)
        "list_models": handle_list_models,
        "recommend_model": handle_recommend_model,
        "get_pricing": handle_get_pricing,
        "get_usage": handle_get_usage,
        "upload_file": handle_upload_file,
        # Image generation tools
        "generate_image": handle_generate_image,
        "generate_image_structured": handle_generate_image_structured,
        "generate_image_from_image": handle_generate_image_from_image,
        # Image editing tools
        "remove_background": handle_remove_background,
        "upscale_image": handle_upscale_image,
        "edit_image": handle_edit_image,
        "inpaint_image": handle_inpaint_image,
        "resize_image": handle_resize_image,
        "compose_images": handle_compose_images,
        # Video tools
        "generate_video": handle_generate_video,
        "generate_video_from_image": handle_generate_video_from_image,
        "generate_video_from_video": handle_generate_video_from_video,
        # Audio tools
        "generate_music": handle_generate_music,
    }
  • Export/import of the handler function in the handlers package __init__.py, making it available for server imports.
    from fal_mcp_server.handlers.image_handlers import (
        handle_generate_image,
        handle_generate_image_from_image,
        handle_generate_image_structured,
    )
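
The prompt-assembly step of the handler above can be isolated as a small pure function, sketched here for illustration. `build_fal_args` is a hypothetical helper name that does not appear in the source; the field lists and defaults are copied from the handler excerpt.

```python
import json
from typing import Any, Dict

# Fields copied into the structured prompt, as listed in the handler excerpt.
PROMPT_FIELDS = [
    "subjects", "style", "color_palette", "lighting", "mood",
    "background", "composition", "camera", "effects",
]

# Generation parameters the handler forwards only when the caller supplies them.
PASSTHROUGH_FIELDS = [
    "negative_prompt", "seed", "enable_safety_checker", "output_format",
]


def build_fal_args(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper mirroring the handler's argument construction."""
    # "scene" is required, so a missing key raises KeyError, as in the handler.
    structured_prompt: Dict[str, Any] = {"scene": arguments["scene"]}
    for field in PROMPT_FIELDS:
        if field in arguments:
            structured_prompt[field] = arguments[field]

    # The whole structured prompt is sent as one JSON string in "prompt".
    fal_args: Dict[str, Any] = {
        "prompt": json.dumps(structured_prompt, indent=2),
        "image_size": arguments.get("image_size", "landscape_16_9"),
        "num_images": arguments.get("num_images", 1),
    }
    for field in PASSTHROUGH_FIELDS:
        if field in arguments:
            fal_args[field] = arguments[field]
    return fal_args


args = build_fal_args({"scene": "A quiet harbor", "style": "Watercolor", "seed": 7})
print(args["prompt"])
```

In this excerpt the schema defaults for `enable_safety_checker` and `output_format` are not applied by the handler: when the caller omits them, the keys are not sent at all.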
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'fine-grained control' but lacks details on permissions, rate limits, costs, or output behavior (e.g., image URLs, processing time). For a complex tool with 17 parameters and no annotations, this is a significant gap in transparency.
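
One behavior that the handler excerpt does make visible is its 60-second cap on generation time, enforced with `asyncio.wait_for`. The sketch below shows that pattern in isolation; `slow_generation` is a hypothetical stand-in for `queue_strategy.execute_fast`, and the short timeouts are chosen only so the example finishes quickly.

```python
import asyncio


async def slow_generation() -> dict:
    # Stand-in for queue_strategy.execute_fast (hypothetical stub):
    # pretends the backend takes 0.2 seconds to respond.
    await asyncio.sleep(0.2)
    return {"images": [{"url": "https://example.invalid/image.png"}]}


async def run_with_timeout(timeout: float) -> str:
    # Same pattern as the handler: cancel the call if it exceeds the timeout.
    try:
        result = await asyncio.wait_for(slow_generation(), timeout=timeout)
    except asyncio.TimeoutError:
        return "timed out"
    return f"{len(result['images'])} image(s)"


outcome_short = asyncio.run(run_with_timeout(0.05))  # shorter than the stub
outcome_long = asyncio.run(run_with_timeout(1.0))    # longer than the stub
print(outcome_short, "/", outcome_long)
```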

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and well-structured in two sentences: the first states the purpose and key features, and the second specifies the ideal use case. Every word earns its place with no redundancy or fluff, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (17 parameters, nested objects, no output schema, and no annotations), the description is insufficient. It doesn't explain what the tool returns (e.g., image data or URLs), potential errors, or behavioral traits like costs or limitations. For such a rich input schema, more contextual information is needed to guide effective use.
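
According to the handler excerpt, the tool returns a single text item: on success, a header line followed by one `Image N: <url>` line per image, and on failure a message starting with ❌. A caller could recover the URLs with something like the sketch below; `extract_image_urls` is a hypothetical helper, and the sample text uses a placeholder model name and URLs.

```python
import re
from typing import List


def extract_image_urls(response_text: str) -> List[str]:
    """Hypothetical helper: pull URLs out of the handler's text response."""
    # Every failure path in the handler excerpt prefixes its message with ❌.
    if response_text.startswith("❌"):
        raise RuntimeError(response_text)
    # Success lines have the form "Image 1: <url>".
    return re.findall(r"^Image \d+: (\S+)$", response_text, flags=re.MULTILINE)


# Sample text in the handler's success format (placeholder model and URLs).
sample = (
    "🎨 Generated 2 image(s) with example-model (structured prompt):\n\n"
    "Image 1: https://example.invalid/a.png\n"
    "Image 2: https://example.invalid/b.png\n"
)
urls = extract_image_urls(sample)
print(urls)
```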

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 88%, which is high, so the baseline score is 3. The description adds minimal value beyond the schema by listing key control aspects ('composition, style, lighting, and subjects'), but doesn't provide additional syntax, examples, or constraints for the parameters. It compensates slightly for the 12% coverage gap but not substantially.
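
The two parameters that carry no description in the schema, `image_size` and `num_images`, are still constrained by an enum and a numeric range. A client-side pre-check of those constraints might look like the sketch below; `check_arguments` is a hypothetical helper, and the allowed values are copied from the Tool definition.

```python
from typing import Any, Dict, List

# Allowed values copied from the Tool definition's enums.
IMAGE_SIZES = {"square", "landscape_4_3", "landscape_16_9", "portrait_3_4", "portrait_9_16"}
OUTPUT_FORMATS = {"jpeg", "png", "webp"}
POSITIONS = {"foreground", "midground", "background"}


def check_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Hypothetical pre-check; returns a list of human-readable problems."""
    problems: List[str] = []
    if not isinstance(arguments.get("scene"), str):
        problems.append("scene is required and must be a string")
    if arguments.get("image_size", "landscape_16_9") not in IMAGE_SIZES:
        problems.append("image_size is not one of the allowed values")
    n = arguments.get("num_images", 1)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(n, bool) or not isinstance(n, int) or not 1 <= n <= 4:
        problems.append("num_images must be an integer from 1 to 4")
    if arguments.get("output_format", "png") not in OUTPUT_FORMATS:
        problems.append("output_format must be jpeg, png, or webp")
    for i, subject in enumerate(arguments.get("subjects", [])):
        position = subject.get("position")
        if position is not None and position not in POSITIONS:
            problems.append(f"subjects[{i}].position is not an allowed value")
    return problems


print(check_arguments({"scene": "A quiet harbor", "num_images": 9}))
```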

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate images with detailed structured prompts for precise control over composition, style, lighting, and subjects.' It specifies the verb ('generate'), resource ('images'), and scope ('structured prompts'), but doesn't explicitly differentiate from sibling tools like 'generate_image' or 'compose_images', which likely offer different approaches to image generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidance: 'Ideal for AI agents that need fine-grained control.' This suggests this tool is for detailed, structured prompts rather than simpler ones, but it doesn't explicitly state when to use this versus alternatives like 'generate_image' or 'compose_images', nor does it mention prerequisites or exclusions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/raveenb/fal-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.