Glama

generate_image_from_image

Transform existing images into new versions using text prompts for style transfer, editing, and creative variations.

Instructions

Transform an existing image into a new image based on a prompt. Use for style transfer, editing, variations, and more. Use upload_file first if you have a local image.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| image_url | Yes | URL of the source image to transform (use upload_file for local images) | |
| prompt | Yes | Text description of desired transformation (e.g., 'Transform into a watercolor painting') | |
| model | No | Image-to-image model. Options: fal-ai/flux/dev/image-to-image, fal-ai/flux-2/edit | fal-ai/flux/dev/image-to-image |
| strength | No | How much to transform (0=keep original, 1=ignore original) | 0.75 |
| num_images | No | Number of images to generate (1-4) | 1 |
| negative_prompt | No | What to avoid in the output image | |
| seed | No | Seed for reproducible generation | |
| enable_safety_checker | No | Enable safety checker to filter inappropriate content | true |
| output_format | No | Output image format (jpeg, png, webp) | png |
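
As an illustration, a minimal argument payload for this tool might look like the following sketch (the image URL is a placeholder; parameter names and defaults are taken from the schema above):

```python
# Hypothetical payload for a generate_image_from_image call.
# Only image_url and prompt are required; the others fall back to defaults.
arguments = {
    "image_url": "https://example.com/photo.png",  # placeholder; use upload_file for local files
    "prompt": "Transform into a watercolor painting",
    "strength": 0.75,        # schema default: balance source fidelity vs. prompt
    "num_images": 1,         # schema default
    "output_format": "png",  # schema default
}

# The two schema-required fields must always be present.
assert {"image_url", "prompt"}.issubset(arguments)
```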

Implementation Reference

  • The handler function that implements the core logic for the generate_image_from_image tool. It resolves the model, prepares arguments including image_url and prompt, executes the image-to-image generation via queue_strategy.execute_fast, handles errors and timeouts, and returns generated image URLs.
    async def handle_generate_image_from_image(
        arguments: Dict[str, Any],
        registry: ModelRegistry,
        queue_strategy: QueueStrategy,
    ) -> List[TextContent]:
        """Handle the generate_image_from_image tool."""
        model_input = arguments.get("model", "fal-ai/flux/dev/image-to-image")
        try:
            model_id = await registry.resolve_model_id(model_input)
        except ValueError as e:
            return [
                TextContent(
                    type="text",
                    text=f"❌ {e}. Use list_models to see available options.",
                )
            ]
    
        # Both image_url and prompt are required
        img2img_args: Dict[str, Any] = {
            "image_url": arguments["image_url"],
            "prompt": arguments["prompt"],
            "strength": arguments.get("strength", 0.75),
            "num_images": arguments.get("num_images", 1),
        }
    
        # Add optional parameters
        if "negative_prompt" in arguments:
            img2img_args["negative_prompt"] = arguments["negative_prompt"]
        if "seed" in arguments:
            img2img_args["seed"] = arguments["seed"]
        if "enable_safety_checker" in arguments:
            img2img_args["enable_safety_checker"] = arguments["enable_safety_checker"]
        if "output_format" in arguments:
            img2img_args["output_format"] = arguments["output_format"]
    
        logger.info(
            "Starting image-to-image transformation with %s from %s",
            model_id,
            (
                arguments["image_url"][:50] + "..."
                if len(arguments["image_url"]) > 50
                else arguments["image_url"]
            ),
        )
    
        # Use fast execution with timeout protection
        try:
            result = await asyncio.wait_for(
                queue_strategy.execute_fast(model_id, img2img_args),
                timeout=60,
            )
        except asyncio.TimeoutError:
            logger.error(
                "Image-to-image transformation timed out after 60s. Model: %s",
                model_id,
            )
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image transformation timed out after 60 seconds with {model_id}. Please try again.",
                )
            ]
        except Exception as e:
            logger.exception("Image-to-image transformation failed: %s", e)
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image transformation failed: {e}",
                )
            ]
    
        # Check for error in response
        if "error" in result:
            error_msg = result.get("error", "Unknown error")
            logger.error(
                "Image-to-image transformation failed for %s: %s",
                model_id,
                error_msg,
            )
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image transformation failed: {error_msg}",
                )
            ]
    
        images = result.get("images", [])
        if not images:
            logger.warning(
                "Image-to-image transformation returned no images. Model: %s",
                model_id,
            )
            return [
                TextContent(
                    type="text",
                    text=f"❌ No images were generated by {model_id}. The source image may have been filtered.",
                )
            ]
    
        # Extract URLs safely
        try:
            urls = [img["url"] for img in images]
        except (KeyError, TypeError) as e:
            logger.error("Malformed image response from %s: %s", model_id, e)
            return [
                TextContent(
                    type="text",
                    text=f"❌ Image transformation completed but response was malformed: {e}",
                )
            ]
    
        response = f"🎨 Transformed image with {model_id}:\n\n"
        response += f"**Source**: {arguments['image_url'][:50]}...\n\n"
        for i, url in enumerate(urls, 1):
            response += f"Result {i}: {url}\n"
        return [TextContent(type="text", text=response)]
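
The handler above inlines a small URL-truncation expression for its log messages; isolated as a named helper (the function name is ours, not the project's), the behavior is:

```python
def truncate_url(url: str, limit: int = 50) -> str:
    """Mirror the handler's inline logging logic: cap long URLs at `limit` chars."""
    return url[:limit] + "..." if len(url) > limit else url

# Short URLs pass through unchanged; long ones are cut to limit plus an ellipsis.
print(truncate_url("https://example.com/a.png"))
print(len(truncate_url("https://example.com/" + "a" * 100)))  # 53
```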
  • The MCP Tool schema definition for generate_image_from_image, including detailed inputSchema with required fields image_url and prompt, defaults, and parameter descriptions.
    Tool(
        name="generate_image_from_image",
        description="Transform an existing image into a new image based on a prompt. Use for style transfer, editing, variations, and more. Use upload_file first if you have a local image.",
        inputSchema={
            "type": "object",
            "properties": {
                "image_url": {
                    "type": "string",
                    "description": "URL of the source image to transform (use upload_file for local images)",
                },
                "prompt": {
                    "type": "string",
                    "description": "Text description of desired transformation (e.g., 'Transform into a watercolor painting')",
                },
                "model": {
                    "type": "string",
                    "default": "fal-ai/flux/dev/image-to-image",
                    "description": "Image-to-image model. Options: fal-ai/flux/dev/image-to-image, fal-ai/flux-2/edit",
                },
                "strength": {
                    "type": "number",
                    "default": 0.75,
                    "minimum": 0.0,
                    "maximum": 1.0,
                    "description": "How much to transform (0=keep original, 1=ignore original)",
                },
                "num_images": {
                    "type": "integer",
                    "default": 1,
                    "minimum": 1,
                    "maximum": 4,
                },
                "negative_prompt": {
                    "type": "string",
                    "description": "What to avoid in the output image",
                },
                "seed": {
                    "type": "integer",
                    "description": "Seed for reproducible generation",
                },
                "enable_safety_checker": {
                    "type": "boolean",
                    "default": True,
                    "description": "Enable safety checker to filter inappropriate content",
                },
                "output_format": {
                    "type": "string",
                    "enum": ["jpeg", "png", "webp"],
                    "default": "png",
                    "description": "Output image format",
                },
            },
            "required": ["image_url", "prompt"],
        },
    ),
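
A rough sketch of checking a payload against this schema's required fields and numeric bounds (hand-rolled for illustration; an actual server would typically delegate this to a JSON Schema validator):

```python
def validate_args(args: dict) -> list:
    """Return validation errors for a generate_image_from_image payload."""
    errors = []
    for field in ("image_url", "prompt"):  # required by the schema
        if field not in args:
            errors.append(f"missing required field: {field}")
    if not 0.0 <= args.get("strength", 0.75) <= 1.0:
        errors.append("strength must be between 0.0 and 1.0")
    if not 1 <= args.get("num_images", 1) <= 4:
        errors.append("num_images must be between 1 and 4")
    if args.get("output_format", "png") not in ("jpeg", "png", "webp"):
        errors.append("output_format must be jpeg, png, or webp")
    return errors

print(validate_args({"image_url": "u", "prompt": "p"}))  # []
```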
  • Registration of the tool handler in the TOOL_HANDLERS dictionary used by the MCP server to route tool calls to handle_generate_image_from_image.
    TOOL_HANDLERS = {
        # Utility tools (no queue needed)
        "list_models": handle_list_models,
        "recommend_model": handle_recommend_model,
        "get_pricing": handle_get_pricing,
        "get_usage": handle_get_usage,
        "upload_file": handle_upload_file,
        # Image generation tools
        "generate_image": handle_generate_image,
        "generate_image_structured": handle_generate_image_structured,
        "generate_image_from_image": handle_generate_image_from_image,
        # Image editing tools
        "remove_background": handle_remove_background,
        "upscale_image": handle_upscale_image,
        "edit_image": handle_edit_image,
        "inpaint_image": handle_inpaint_image,
        "resize_image": handle_resize_image,
        "compose_images": handle_compose_images,
        # Video tools
        "generate_video": handle_generate_video,
        "generate_video_from_image": handle_generate_video_from_image,
        "generate_video_from_video": handle_generate_video_from_video,
        # Audio tools
        "generate_music": handle_generate_music,
    }
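
Routing through this dictionary is a single lookup followed by an await; a simplified, self-contained sketch (the stub handler and fallback message are illustrative, not taken from the server source):

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

async def stub_handler(arguments: Dict[str, Any]) -> str:
    # Stand-in for a real handler such as handle_generate_image_from_image.
    return f"handled {arguments['prompt']}"

TOOL_HANDLERS: Dict[str, Callable[[Dict[str, Any]], Awaitable[str]]] = {
    "generate_image_from_image": stub_handler,
}

async def route(name: str, arguments: Dict[str, Any]) -> str:
    handler = TOOL_HANDLERS.get(name)
    if handler is None:
        return f"unknown tool: {name}"  # illustrative fallback
    return await handler(arguments)

result = asyncio.run(route("generate_image_from_image", {"prompt": "watercolor"}))
print(result)  # handled watercolor
```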
  • The shared async handler signature used by the image generation tools; server.py imports these handlers via from fal_mcp_server.handlers import (handle_generate_image_from_image, ...).
    async def handle_generate_image(
        arguments: Dict[str, Any],
        registry: ModelRegistry,
        queue_strategy: QueueStrategy,
    ) -> List[TextContent]:
  • Export of the handler function from the handlers package for use by the server.
    from fal_mcp_server.handlers.image_handlers import (
        handle_generate_image,
        handle_generate_image_from_image,
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the transformation purpose and prerequisite for local images, but doesn't describe rate limits, authentication needs, cost implications, output format details, or what happens when transformations fail. It provides basic context but lacks comprehensive behavioral traits.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with two sentences that each earn their place. The first sentence states the core purpose and use cases, while the second provides crucial prerequisite guidance. No wasted words, and information is front-loaded appropriately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 9 parameters, no annotations, and no output schema, the description provides adequate but incomplete context. It covers the basic purpose and a key prerequisite, but doesn't address mutation implications (creates new images), performance characteristics, error conditions, or output expectations. Given the complexity, more completeness would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is high (89%), so the baseline is 3. The description doesn't add significant parameter semantics beyond what's in the schema: it mentions 'prompt' and 'image_url' implicitly but doesn't explain parameter interactions, default behaviors, or advanced usage patterns. The schema already documents parameters well.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('transform', 'edit') and resources ('existing image', 'new image'), distinguishing it from siblings like generate_image (text-to-image), edit_image (likely different editing), and compose_images (combining images). It explicitly mentions style transfer, editing, and variations as use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('transform an existing image into a new image based on a prompt') and includes a specific prerequisite instruction ('use upload_file first if you have a local image'). However, it doesn't explicitly state when NOT to use it or name alternatives among siblings (e.g., when to use edit_image instead).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/raveenb/fal-mcp-server'
