Glama

generate_image

Generate images from text prompts using ComfyUI's default workflow for quick and straightforward image creation.

Instructions

Generate an image using the default workflow.

    This is a simplified interface for quick image generation.
    Requires COMFY_WORKFLOW_JSON_FILE, PROMPT_NODE_ID, and OUTPUT_NODE_ID
    to be configured.

    For more control, use run_workflow() or execute_workflow().

    Args:
        prompt: Text description of the image to generate
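
The three settings named above must all be present before the tool will run. A minimal sketch of a pre-flight check, assuming the settings are supplied as environment variables under exactly these names; the helper `missing_config` is hypothetical and not part of the server:

```python
import os

# Names taken from the tool description. Reading them from the process
# environment is an assumption about how the server is configured.
REQUIRED_VARS = ("COMFY_WORKFLOW_JSON_FILE", "PROMPT_NODE_ID", "OUTPUT_NODE_ID")


def missing_config(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_config()
    if missing:
        print("Not configured:", ", ".join(missing))
    else:
        print("generate_image configuration looks complete")
```

Running a check like this before starting the server surfaces all missing settings at once, whereas the tool itself reports only the first one it finds.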
    

Input Schema

Name   | Required | Description                      | Default
prompt | Yes      | Text prompt for image generation | (none)
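
Because `prompt` is the only argument, a call to this tool is small. The sketch below builds the JSON-RPC request an MCP client would send, assuming the standard `tools/call` method. The helper name `build_call_request`, the request `id`, the example prompt, and the client-side non-empty check are illustrative assumptions, not part of the server:

```python
import json


def build_call_request(prompt, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for generate_image."""
    # Client-side sanity check (an assumption; the schema only marks
    # `prompt` as required and does not state a minimum length).
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "generate_image", "arguments": {"prompt": prompt}},
    }


print(json.dumps(build_call_request("a lighthouse at dusk, oil painting"), indent=2))
```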

Implementation Reference

  • The primary handler for the generate_image tool. Decorated with @mcp.tool() for automatic registration and schema definition. Loads the default workflow, injects the user prompt into the configured prompt node, and delegates execution to _execute_workflow.
    @mcp.tool()
    def generate_image(
        prompt: str = Field(description="Text prompt for image generation"),
        ctx: Context = None,
    ):
        """Generate an image using the default workflow.
    
        This is a simplified interface for quick image generation.
        Requires COMFY_WORKFLOW_JSON_FILE, PROMPT_NODE_ID, and OUTPUT_NODE_ID
        to be configured.
    
        For more control, use run_workflow() or execute_workflow().
    
        Args:
            prompt: Text description of the image to generate
        """
        if not settings.workflow_json_file:
            return "Error: COMFY_WORKFLOW_JSON_FILE not configured"
        if not settings.prompt_node_id:
            return "Error: PROMPT_NODE_ID not configured"
        if not settings.output_node_id:
            return "Error: OUTPUT_NODE_ID not configured"
    
        with open(settings.workflow_json_file) as f:
            workflow = json.load(f)
    
        workflow[settings.prompt_node_id]["inputs"]["text"] = prompt
    
        if ctx:
            ctx.info(f"Generating: {prompt[:50]}...")
    
        return _execute_workflow(workflow, settings.output_node_id, ctx)
  • Key helper function called by generate_image. It submits the workflow to ComfyUI, polls for completion, retrieves the image, and returns either an Image object or a URL, depending on settings.output_mode.
    def _execute_workflow(workflow: dict, output_node_id: str, ctx: Context | None):
        """Internal function to execute workflow and return result."""
        # Submit workflow
        status, resp_data = comfy_post("/prompt", {"prompt": workflow})
    
        if status != 200:
            error_msg = resp_data.get("error", f"status {status}")
            return f"Failed to submit workflow: {error_msg}"
    
        prompt_id = resp_data.get("prompt_id")
        if not prompt_id:
            node_errors = resp_data.get("node_errors", {})
            if node_errors:
                return f"Workflow validation failed:\n{json.dumps(node_errors, indent=2)}"
            return "Failed to get prompt_id from response"
    
        if ctx:
            ctx.info(f"Submitted: {prompt_id}")
    
        # Poll callback for progress logging
        def on_poll(attempt: int, max_attempts: int):
            if ctx and attempt % 5 == 0:
                ctx.info(f"Waiting... ({attempt}/{max_attempts})")
    
        # Poll for result
        image_data = poll_for_result(prompt_id, output_node_id, on_poll=on_poll)
    
        if image_data:
            if ctx:
                ctx.info("Image generated successfully")
    
            if settings.output_mode.lower() == "url":
                # Return URL instead of image data
                history = comfy_get(f"/history/{prompt_id}")
                if prompt_id in history:
                    outputs = history[prompt_id].get("outputs", {})
                    if output_node_id in outputs:
                        images = outputs[output_node_id].get("images", [])
                        if images:
                            url_values = urllib.parse.urlencode(images[0])
                            return get_file_url(settings.comfy_url_external, url_values)
    
            return Image(data=image_data, format="png")
    
        return "Failed to generate image. Use get_queue_status() and get_history() to debug."
  • Top-level registration function that invokes register_execution_tools(mcp), which defines and registers the generate_image tool.
    def register_all_tools(mcp):
        """Register all tools with the MCP server."""
        register_system_tools(mcp)
        register_discovery_tools(mcp)
        register_workflow_tools(mcp)
        register_execution_tools(mcp)
  • Pydantic settings fields required by generate_image for the default workflow path, prompt input node ID, and output node ID.
    workflow_json_file: str | None = Field(
        default=None,
        alias="comfy_workflow_json_file",
        description="Default workflow file for generate_image",
    )
    prompt_node_id: str | None = Field(default=None, description="Default prompt node ID")
    output_node_id: str | None = Field(default=None, description="Default output node ID")
    output_mode: str = Field(
  • Initializes the MCP server and calls register_all_tools(mcp), starting the tool registration process for generate_image.
    register_all_tools(mcp)
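
Two steps in the excerpts above can be tried in isolation: writing the prompt into the configured node, and turning a history entry into a URL. The following is a sketch under stated assumptions, not the project's code. The helpers `inject_prompt` and `build_view_url` are hypothetical, the two-node workflow is a made-up example in ComfyUI's API format, and the `/view?...` URL shape is an assumption about what `get_file_url` produces:

```python
import copy
import urllib.parse


def inject_prompt(workflow, node_id, prompt):
    """Return a copy of the workflow with `prompt` set on the given node."""
    if node_id not in workflow:
        raise KeyError(f"node {node_id!r} not found in workflow")
    updated = copy.deepcopy(workflow)
    # setdefault guards against a node that has no "inputs" mapping yet.
    updated[node_id].setdefault("inputs", {})["text"] = prompt
    return updated


def build_view_url(base_url, image_entry):
    """Build a ComfyUI /view URL from one entry of a history "images" list."""
    query = urllib.parse.urlencode(image_entry)
    return f"{base_url.rstrip('/')}/view?{query}"


# Hypothetical two-node workflow; real workflows contain many more nodes.
workflow = {
    "6": {"class_type": "CLIPTextEncode", "inputs": {"text": "", "clip": ["4", 1]}},
    "9": {"class_type": "SaveImage", "inputs": {"filename_prefix": "ComfyUI", "images": ["8", 0]}},
}
ready = inject_prompt(workflow, "6", "a lighthouse at dusk")
print(ready["6"]["inputs"]["text"])

# Example history entry; the filename is illustrative.
entry = {"filename": "ComfyUI_00001_.png", "subfolder": "", "type": "output"}
print(build_view_url("http://localhost:8188", entry))
```

Unlike the handler shown above, this sketch copies the workflow before editing it and raises a clear error when the configured node ID is absent from the file, a case in which the direct dictionary assignment would raise a bare KeyError.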
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of disclosing behavior. It states that this is a 'simplified interface' and names the required configuration (COMFY_WORKFLOW_JSON_FILE, PROMPT_NODE_ID, and OUTPUT_NODE_ID), which is useful context. However, it says nothing about error handling, rate limits, or output format (e.g., image type, size). With no annotations there is nothing to contradict, but the description is incomplete for a tool that has no annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
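
The implementation excerpts on this page suggest what a caller can expect even though the description does not say so: failures come back as plain strings, and success is either a URL string or an image object. A minimal sketch of telling these apart, assuming the error prefixes seen in those excerpts are the only ones in use; `classify_result` is a hypothetical helper, and treating every non-error string as a URL is an assumption:

```python
# Prefixes of the error strings returned in the excerpts shown on this page.
ERROR_PREFIXES = ("Error:", "Failed", "Workflow validation failed")


def classify_result(result):
    """Label a generate_image return value as 'error', 'url', or 'image'."""
    if isinstance(result, str):
        if result.startswith(ERROR_PREFIXES):
            return "error"
        # Assumption: any other string is the URL returned in "url" mode.
        return "url"
    # Anything that is not a string is treated as the Image object.
    return "image"


print(classify_result("Error: PROMPT_NODE_ID not configured"))
print(classify_result("http://localhost:8188/view?filename=example.png"))
```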

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and appropriately sized. It front-loads the main purpose, follows with usage context and alternatives, and ends with parameter details. Each sentence earns its place, though the Args section is somewhat redundant given the schema. It's concise but could be slightly tighter by integrating the parameter note more seamlessly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (image generation with configuration dependencies), no annotations, and no output schema, the description is moderately complete. It covers purpose, usage guidelines, and hints at configuration needs, but lacks details on output (e.g., image format, handling), error cases, or performance expectations. For a tool with no structured data support, it should do more to compensate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the 'prompt' parameter well-documented. The description adds minimal value beyond the schema, only restating 'prompt: Text description of the image to generate' in the Args section. Since schema coverage is high, the baseline score of 3 is appropriate, as the description doesn't significantly enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate an image using the default workflow.' It specifies the verb ('generate') and resource ('image'), though it doesn't explicitly distinguish it from sibling tools like 'run_workflow' or 'execute_workflow' beyond mentioning them as alternatives. The purpose is clear but lacks explicit sibling differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool vs. alternatives: 'This is a simplified interface for quick image generation' and 'For more control, use run_workflow() or execute_workflow().' It clearly states the intended context (simplified, quick) and names specific alternatives, making it easy for an agent to choose appropriately.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/IO-AtelierTech/comfyui-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.