Glama

generate_image

Create images from text descriptions using AI models. Specify prompts, adjust settings like size and format, and generate visual content for various applications.

Instructions

Generate images from text prompts. Use list_models with category='image' to discover available models.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| prompt | Yes | Text description of the image to generate | — |
| model | No | Model ID (e.g., 'fal-ai/flux-pro') or alias (e.g., 'flux_schnell'). Use list_models to see options. | flux_schnell |
| negative_prompt | No | What to avoid in the image | — |
| image_size | No | — | landscape_16_9 |
| num_images | No | — | 1 |
| seed | No | Seed for reproducible generation | — |
| enable_safety_checker | No | Enable safety checker to filter inappropriate content | true |
| output_format | No | Output image format | png |
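For illustration, a call to this tool might pass an arguments payload like the following. Only `prompt` is required; every other key falls back to the defaults in the schema, and the specific values here are purely illustrative:

```python
# Illustrative arguments payload for the generate_image tool.
# Only "prompt" is required; the rest override schema defaults.
arguments = {
    "prompt": "A lighthouse on a cliff at sunset, oil painting style",
    "model": "flux_schnell",          # alias; resolvable via list_models
    "image_size": "landscape_16_9",   # must be one of the schema's enum values
    "num_images": 2,                  # integer in the range 1-4
    "seed": 42,                       # fixed seed for reproducible output
    "output_format": "png",           # "jpeg", "png", or "webp"
}
```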

Implementation Reference

  • The core handler function that implements the logic for the 'generate_image' tool. It resolves the model, prepares parameters for Fal.ai, executes the generation via the queue strategy's fast (no-queue) path, processes the results, handles errors, and returns the URLs of the generated images.
    async def handle_generate_image(
        arguments: Dict[str, Any],
        registry: ModelRegistry,
        queue_strategy: QueueStrategy,
    ) -> List[TextContent]:
        """Handle the generate_image tool."""
        model_input = arguments.get("model", "flux_schnell")
        try:
            model_id = await registry.resolve_model_id(model_input)
        except ValueError as e:
            return [
                TextContent(
                    type="text",
                    text=f"❌ {e}. Use list_models to see available options.",
                )
            ]
    
        fal_args: Dict[str, Any] = {
            "prompt": arguments["prompt"],
            "image_size": arguments.get("image_size", "landscape_16_9"),
            "num_images": arguments.get("num_images", 1),
        }
    
        # Add optional parameters
        if "negative_prompt" in arguments:
            fal_args["negative_prompt"] = arguments["negative_prompt"]
        if "seed" in arguments:
            fal_args["seed"] = arguments["seed"]
        if "enable_safety_checker" in arguments:
            fal_args["enable_safety_checker"] = arguments["enable_safety_checker"]
        if "output_format" in arguments:
            fal_args["output_format"] = arguments["output_format"]
    
        # Use fast execution (no queue) for image generation
        try:
            result = await queue_strategy.execute_fast(model_id, fal_args)
        except Exception as e:
            logger.error("Image generation failed: %s", e)
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image generation failed: {e}",
                )
            ]
    
        # Check for error in response
        if "error" in result:
            error_msg = result.get("error", "Unknown error")
            logger.error("Image generation failed for %s: %s", model_id, error_msg)
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image generation failed: {error_msg}",
                )
            ]
    
        images = result.get("images", [])
        if not images:
            logger.warning("Image generation returned no images. Model: %s", model_id)
            return [
                TextContent(
                    type="text",
                    text=f"❌ No images were generated by {model_id}. The prompt may have been filtered.",
                )
            ]
    
        # Extract URLs safely
        try:
            urls = [img["url"] for img in images]
        except (KeyError, TypeError) as e:
            logger.error("Malformed image response from %s: %s", model_id, e)
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image generation completed but response was malformed: {e}",
                )
            ]
    
        response = f"🎨 Generated {len(urls)} image(s) with {model_id}:\n\n"
        for i, url in enumerate(urls, 1):
            response += f"Image {i}: {url}\n"
        return [TextContent(type="text", text=response)]
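    The result-processing tail of this handler can be exercised in isolation. A minimal sketch, using a fake Fal.ai-style result dict (the URLs and model ID below are illustrative, not real outputs):

    ```python
    # Fake result in the shape the handler expects from Fal.ai:
    # a dict with an "images" list of {"url": ...} entries.
    result = {
        "images": [
            {"url": "https://example.com/img-1.png"},
            {"url": "https://example.com/img-2.png"},
        ]
    }
    model_id = "fal-ai/flux/schnell"

    images = result.get("images", [])
    # Raises KeyError/TypeError on a malformed entry, which the
    # handler above catches and reports as an error.
    urls = [img["url"] for img in images]

    response = f"🎨 Generated {len(urls)} image(s) with {model_id}:\n\n"
    for i, url in enumerate(urls, 1):
        response += f"Image {i}: {url}\n"
    ```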
  • The input schema and Tool definition for the 'generate_image' tool, defining parameters such as prompt (required), model, image_size, num_images, and others with types, enums, defaults, and descriptions.
    Tool(
        name="generate_image",
        description="Generate images from text prompts. Use list_models with category='image' to discover available models.",
        inputSchema={
            "type": "object",
            "properties": {
                "prompt": {
                    "type": "string",
                    "description": "Text description of the image to generate",
                },
                "model": {
                    "type": "string",
                    "default": "flux_schnell",
                    "description": "Model ID (e.g., 'fal-ai/flux-pro') or alias (e.g., 'flux_schnell'). Use list_models to see options.",
                },
                "negative_prompt": {
                    "type": "string",
                    "description": "What to avoid in the image",
                },
                "image_size": {
                    "type": "string",
                    "enum": [
                        "square",
                        "landscape_4_3",
                        "landscape_16_9",
                        "portrait_3_4",
                        "portrait_9_16",
                    ],
                    "default": "landscape_16_9",
                },
                "num_images": {
                    "type": "integer",
                    "default": 1,
                    "minimum": 1,
                    "maximum": 4,
                },
                "seed": {
                    "type": "integer",
                    "description": "Seed for reproducible generation",
                },
                "enable_safety_checker": {
                    "type": "boolean",
                    "default": True,
                    "description": "Enable safety checker to filter inappropriate content",
                },
                "output_format": {
                    "type": "string",
                    "enum": ["jpeg", "png", "webp"],
                    "default": "png",
                    "description": "Output image format",
                },
            },
            "required": ["prompt"],
        },
    ),
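    A client can pre-validate arguments against this schema before calling the tool. A stdlib-only sketch of the key checks (the helper function below is hypothetical, not part of the server):

    ```python
    # Hypothetical client-side pre-validation mirroring the inputSchema above.
    IMAGE_SIZES = {"square", "landscape_4_3", "landscape_16_9",
                   "portrait_3_4", "portrait_9_16"}
    OUTPUT_FORMATS = {"jpeg", "png", "webp"}

    def validate_generate_image_args(args: dict) -> list:
        """Return a list of validation errors (empty if args are valid)."""
        errors = []
        if not isinstance(args.get("prompt"), str) or not args.get("prompt"):
            errors.append("prompt is required and must be a non-empty string")
        if args.get("image_size", "landscape_16_9") not in IMAGE_SIZES:
            errors.append(f"image_size must be one of {sorted(IMAGE_SIZES)}")
        n = args.get("num_images", 1)
        if not isinstance(n, int) or not 1 <= n <= 4:
            errors.append("num_images must be an integer between 1 and 4")
        if args.get("output_format", "png") not in OUTPUT_FORMATS:
            errors.append(f"output_format must be one of {sorted(OUTPUT_FORMATS)}")
        return errors
    ```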
  • Registration of the 'generate_image' handler in the TOOL_HANDLERS dictionary, which is used by the server's call_tool method to dispatch tool executions to the appropriate handler function.
    TOOL_HANDLERS = {
        # Utility tools (no queue needed)
        "list_models": handle_list_models,
        "recommend_model": handle_recommend_model,
        "get_pricing": handle_get_pricing,
        "get_usage": handle_get_usage,
        "upload_file": handle_upload_file,
        # Image tools
        "generate_image": handle_generate_image,
        "generate_image_structured": handle_generate_image_structured,
        "generate_image_from_image": handle_generate_image_from_image,
        # Video tools
        "generate_video": handle_generate_video,
        "generate_video_from_image": handle_generate_video_from_image,
        "generate_video_from_video": handle_generate_video_from_video,
        # Audio tools
        "generate_music": handle_generate_music,
    }
  • Server decorator registration for list_tools() method that returns ALL_TOOLS, which includes the 'generate_image' tool schema, making it discoverable by MCP clients.
    @self.server.list_tools()
    async def list_tools() -> List[Tool]:
        """List all available Fal.ai tools"""
        return ALL_TOOLS
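Based on the registration above, the server's call_tool dispatch presumably amounts to a dictionary lookup. A simplified, self-contained sketch (the stand-in handler and error message shape are illustrative):

```python
# Simplified sketch of dictionary-based tool dispatch.
import asyncio

async def handle_generate_image(arguments):
    # Stand-in for the real handler shown above.
    return [f"generated for prompt: {arguments['prompt']}"]

TOOL_HANDLERS = {"generate_image": handle_generate_image}

async def call_tool(name, arguments):
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return [f"❌ Unknown tool: {name}"]
    return await handler(arguments)

result = asyncio.run(call_tool("generate_image", {"prompt": "a red fox"}))
missing = asyncio.run(call_tool("no_such_tool", {}))
```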
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. The description only states what the tool does ('Generate images from text prompts') without mentioning any behavioral traits like rate limits, authentication requirements, cost implications, or what the output looks like. For a complex tool with 8 parameters and no annotations, this is a significant gap.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise: two sentences that each earn their place. The first sentence states the core purpose, and the second provides essential usage guidance. There's zero waste or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, no annotations, no output schema), the description is incomplete. It doesn't explain what the tool returns (images in what format? URLs? base64?), doesn't mention cost or rate limit considerations, and provides minimal behavioral context. For an image generation tool with significant parameters, this leaves too many gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no parameter-specific information beyond what's in the input schema. With 75% schema description coverage (6 of 8 parameters have descriptions), the schema does most of the work. The description's reference to list_models for model selection provides some context for the model parameter, but doesn't add meaningful semantics beyond the schema's documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate images from text prompts.' This is a specific verb+resource combination that distinguishes it from siblings like edit_image or generate_video. However, it doesn't explicitly differentiate from generate_image_from_image or generate_image_structured, which are also image generation tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for usage: 'Use list_models with category='image' to discover available models.' This gives practical guidance on how to select the model parameter. However, it doesn't explicitly state when to use this tool versus alternatives like generate_image_from_image or generate_image_structured, nor does it provide exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

