Glama
by yunwoong7

image_conditioning

Generate images matching the layout and composition of a reference image using text prompts and control modes. Specify attributes to include or exclude for precise output customization.

Instructions

Generate an image that follows the layout and composition of a reference image.

Args:
    image_path: File path of the reference image
    prompt: Text describing the image to be generated
    negative_prompt: Text specifying attributes to exclude from generation
    control_mode: Control mode (CANNY_EDGE, etc.)
    height: Output image height (pixels)
    width: Output image width (pixels)
    cfg_scale: Prompt matching degree (1-20)
    
Returns:
    Dict: Dictionary containing the file path of the generated image

Input Schema

Name             Required  Description  Default
cfg_scale        No
control_mode     No                     CANNY_EDGE
height           No
image_path       Yes
negative_prompt  No
prompt           Yes
width            No
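
As a rough illustration of the schema, the sketch below assembles an argument dictionary for a call to the tool. The helper name `build_arguments` and the sample values are hypothetical; only the parameter names, the required/optional split, and the `CANNY_EDGE` default come from the schema listed above.

```python
from typing import Any, Dict

# Names, required flags, and the single default are taken from the input schema.
REQUIRED = ("image_path", "prompt")
OPTIONAL = ("negative_prompt", "control_mode", "height", "width", "cfg_scale")
SCHEMA_DEFAULTS = {"control_mode": "CANNY_EDGE"}


def build_arguments(**kwargs: Any) -> Dict[str, Any]:
    """Hypothetical helper: reject unknown or missing names, apply the default."""
    unknown = set(kwargs) - set(REQUIRED) - set(OPTIONAL)
    if unknown:
        raise ValueError(f"unknown arguments: {sorted(unknown)}")
    missing = [name for name in REQUIRED if name not in kwargs]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    # Caller-supplied values override the schema default.
    return {**SCHEMA_DEFAULTS, **kwargs}


# Sample values are placeholders, not taken from the tool's documentation.
args = build_arguments(image_path="reference.png", prompt="a watercolor city street")
print(args)
```

Checking names on the caller side is a design choice for this sketch; the server may apply its own validation.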

Implementation Reference

  • The primary handler function implementing the image_conditioning tool. It reads and base64-encodes the reference image, requests a conditioned image through the Bedrock API, saves the result, and returns the saved path.
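    import base64
    import json
    from typing import Any, Dict, Optional

    # generate_image, save_image, and McpError are not shown in this excerpt.
    async def image_conditioning(
            image_path: str,
            prompt: str,
            negative_prompt: str = "",
            control_mode: str = "CANNY_EDGE",
            height: int = 512,
            width: int = 512,
            cfg_scale: float = 8.0,
            output_path: Optional[str] = None,
    ) -> Dict[str, Any]: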
        """
        Generate an image that follows the layout and composition of a reference image.
        
        Args:
            image_path: File path of the reference image
            prompt: Text describing the image to be generated
            negative_prompt: Text specifying attributes to exclude from generation
            control_mode: Control mode (CANNY_EDGE, etc.)
            height: Output image height (pixels)
            width: Output image width (pixels)
            cfg_scale: Prompt matching degree (1-20)
            output_path: Absolute path to save the image
            
        Returns:
            Dict: Dictionary containing the file path of the generated image
        """
        try:
            # Read image file and encode to base64
            with open(image_path, "rb") as image_file:
                input_image = base64.b64encode(image_file.read()).decode('utf8')
    
            body = json.dumps({
                "taskType": "TEXT_IMAGE",
                "textToImageParams": {
                    "text": prompt,
                    "negativeText": negative_prompt,
                    "conditionImage": input_image,
                    "controlMode": control_mode
                },
                "imageGenerationConfig": {
                    "numberOfImages": 1,
                    "height": height,
                    "width": width,
                    "cfgScale": cfg_scale
                }
            })
    
            # Generate image
            image_bytes = generate_image(body)
    
            # Save image
            image_info = save_image(image_bytes, output_path=output_path)
    
            # Generate result
            result = {
                "image_path": image_info["image_path"],
                "message": f"Image conditioning completed successfully. Saved location: {image_info['image_path']}"
            }
    
            return result
    
        except Exception as e:
            raise McpError(f"Error occurred while image conditioning: {str(e)}")
  • Registration point for the image_conditioning tool in the MCP server (currently commented out).
    # mcp.add_tool(image_conditioning)
  • Import statement bringing the image_conditioning handler into the server module.
    from .tools.image_conditioning import image_conditioning
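
To make the request payload easier to inspect in isolation, here is a minimal sketch that serializes the same JSON structure the handler above builds. The function name `build_conditioning_body` is hypothetical, and it takes raw bytes instead of a file path so it can run without a real image; the field names and default values are copied from the handler.

```python
import base64
import json


def build_conditioning_body(
    image_bytes: bytes,
    prompt: str,
    negative_prompt: str = "",
    control_mode: str = "CANNY_EDGE",
    height: int = 512,
    width: int = 512,
    cfg_scale: float = 8.0,
) -> str:
    """Hypothetical helper: build the JSON request body used by the handler."""
    # The reference image travels as a base64 string inside the JSON body.
    condition_image = base64.b64encode(image_bytes).decode("utf8")
    return json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {
            "text": prompt,
            "negativeText": negative_prompt,
            "conditionImage": condition_image,
            "controlMode": control_mode,
        },
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "height": height,
            "width": width,
            "cfgScale": cfg_scale,
        },
    })


# Placeholder bytes stand in for a real image file's contents.
body = build_conditioning_body(b"placeholder image bytes", "a watercolor city street")
payload = json.loads(body)
print(payload["taskType"], payload["imageGenerationConfig"])
```

Separating payload construction from the network call and the file write keeps the part that is easiest to test free of side effects.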
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It says the tool will 'Generate an image' (implying creation/mutation), but it does not disclose important behavioral traits: whether the operation is computationally intensive, potential rate limits, authentication requirements, or what happens if the reference image is invalid. The description provides basic functional information but lacks operational context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
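
One way a caller could reduce the uncertainty around invalid reference images is a preflight check before invoking the tool. The sketch below is an assumption-laden illustration: the helper `sniff_image_format` is hypothetical, and the choice to accept only PNG and JPEG is an assumption, since the tool description does not state which formats are supported.

```python
import os
import tempfile

# Standard file signatures for PNG and JPEG.
PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"


def sniff_image_format(path: str) -> str:
    """Hypothetical preflight check: return 'png' or 'jpeg', else raise ValueError."""
    if not os.path.isfile(path):
        raise ValueError(f"reference image not found: {path}")
    with open(path, "rb") as fh:
        header = fh.read(8)
    if header.startswith(PNG_MAGIC):
        return "png"
    if header.startswith(JPEG_MAGIC):
        return "jpeg"
    # Assumption: any other signature is treated as unsupported.
    raise ValueError(f"unsupported or corrupt image: {path}")


# Demonstration with a throwaway file that only carries a PNG signature.
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "reference.png")
    with open(sample, "wb") as fh:
        fh.write(PNG_MAGIC + b"remaining bytes")
    print(sniff_image_format(sample))
```

A signature check only catches obviously wrong files; it does not prove the image decodes correctly.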

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well structured: a clear purpose statement followed by organized parameter explanations. The first sentence states the tool's function, and the subsequent lines document the parameters and return value. Some parameter explanations could be combined, but overall there is little wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 7-parameter image generation tool with no annotations and no output schema, the description provides adequate functional coverage but lacks operational context. It explains what the tool does and what parameters mean, but doesn't address important considerations like performance characteristics, error conditions, or format details of the returned dictionary. The absence of output schema means the description should ideally explain the return structure more thoroughly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
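
Because there is no output schema, the shape of the returned dictionary has to be read from the handler. As a sketch, the two keys the handler sets on success (`image_path` and `message`) could be written down as a `TypedDict`. The class name `ImageConditioningResult` and the example path are hypothetical.

```python
from typing import TypedDict


class ImageConditioningResult(TypedDict):
    """Hypothetical type mirroring the dict the handler returns on success."""
    image_path: str  # where the generated image was saved
    message: str     # human-readable status that repeats the saved location


def saved_location(result: ImageConditioningResult) -> str:
    """Return the path an agent would read to find the generated image."""
    return result["image_path"]


# The path below is a placeholder, not an actual output location.
example: ImageConditioningResult = {
    "image_path": "/tmp/output/conditioned.png",
    "message": "Image conditioning completed successfully. "
               "Saved location: /tmp/output/conditioned.png",
}
print(saved_location(example))
```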

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage (titles only, no descriptions), the description provides substantial value by explaining all 7 parameters. It clarifies that 'image_path' is for the reference image, 'prompt' describes what to generate, 'negative_prompt' excludes attributes, 'control_mode' specifies the technique (with CANNY_EDGE as example), and provides context for dimensions and cfg_scale. This compensates well for the schema's lack of descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('Generate an image') and resource ('that follows the layout and composition of a reference image'). It distinguishes from siblings like 'text_to_image' (no reference image) and 'image_variation' (likely modifies existing images rather than generating new ones based on reference composition).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through the phrase 'follows the layout and composition of a reference image,' suggesting this tool should be used when you want to generate new images with similar structure to an existing image. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the sibling tools, leaving some ambiguity about tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/yunwoong7/aws-nova-canvas-mcp'
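
The same request can be sketched in Python using only the standard library. The URL is the one from the curl command above; the function names are hypothetical, and treating the response as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/yunwoong7/aws-nova-canvas-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0):
    """Send the request and parse the body, assuming the API returns JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)


# Building the request performs no network I/O; call fetch_server_info()
# explicitly when network access is available.
request = build_request()
print(request.get_method(), request.full_url)
```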

If you have feedback or need assistance with the MCP directory API, please join our Discord server