Glama › gerred

MCP Server Replicate

generate_image

Create images from text prompts with customizable parameters like style, quality, dimensions, and output count using AI model inference.

Instructions

Generate an image using the specified parameters.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| prompt | Yes | | |
| style | No | | |
| quality | No | | balanced |
| width | No | | |
| height | No | | |
| num_outputs | No | | 1 |
| seed | No | | |

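Since none of the parameters carry schema descriptions, a concrete call is the quickest orientation. A minimal example arguments object (field names come from the schema above; the values are illustrative, and only `prompt` is required):

```python
# Illustrative arguments for a generate_image call.
# Field names come from the input schema; only "prompt" is required.
args = {
    "prompt": "a lighthouse on a cliff at dusk, dramatic sky",
    "style": "cinematic",    # optional style preset
    "quality": "balanced",   # default when omitted
    "width": 1024,           # optional override of the preset size
    "height": 1024,
    "num_outputs": 1,
    "seed": 42,              # optional, for reproducible results
}
```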
Implementation Reference

  • The core handler for the 'generate_image' tool. It applies quality and style presets to construct input parameters for the SDXL model (hardcoded version), creates a Replicate prediction, and returns a resource URI and metadata for tracking progress.
    async def generate_image(
        prompt: str,
        style: str | None = None,
        quality: str = "balanced",
        width: int | None = None,
        height: int | None = None,
        num_outputs: int = 1,
        seed: int | None = None,
    ) -> dict[str, Any]:
        """Generate an image using the specified parameters."""
        # Get quality preset parameters
        if quality not in QUALITY_PRESETS["presets"]:
            quality = "balanced"
        parameters = QUALITY_PRESETS["presets"][quality]["parameters"].copy()
    
        # Apply style preset if specified
        if style:
            if style in STYLE_PRESETS["presets"]:
                style_params = STYLE_PRESETS["presets"][style]["parameters"]
                # Merge prompt prefixes
                if "prompt_prefix" in style_params:
                    prompt = f"{style_params['prompt_prefix']}, {prompt}"
                # Copy other parameters
                for k, v in style_params.items():
                    if k != "prompt_prefix":
                        parameters[k] = v
    
        # Override size if specified
        if width:
            parameters["width"] = width
        if height:
            parameters["height"] = height
    
        # Add other parameters
        parameters.update(
            {
                "prompt": prompt,
                "num_outputs": num_outputs,
            }
        )
        if seed is not None:
            parameters["seed"] = seed
    
        # Create prediction with SDXL model
        async with ReplicateClient(api_token=os.getenv("REPLICATE_API_TOKEN")) as client:
            result = await client.create_prediction(
                version="39ed52f2a78e934b3ba6e2a89f5b1c712de7dfea535525255b1aa35c5565e08b",  # SDXL v1.0
                input=parameters,
            )
    
            # Return resource information
            return {
                "resource_uri": f"generations://{result['id']}",
                "status": result["status"],
                "message": (
                    "🎨 Starting your image generation...\n\n"
                    f"Prompt: {prompt}\n"
                    f"Style: {style or 'Default'}\n"
                    f"Quality: {quality}\n\n"
                    "Let me check the status of your generation. I'll use:\n"
                    f"`get_prediction(\"{result['id']}\", wait=true)`\n\n"
                    "This will let me monitor the progress and show you the image as soon as it's ready."
                ),
                "next_prompt": "after_generation",
                "metadata": {  # Add metadata for client use
                    "prompt": prompt,
                    "style": style,
                    "quality": quality,
                    "width": width,
                    "height": height,
                    "seed": seed,
                    "model": "SDXL v1.0",
                    "created_at": result.get("created_at"),
                },
            }
  • QUALITY_PRESETS dictionary used by generate_image to set default parameters based on quality level (draft, balanced, quality, extreme).
    QUALITY_PRESETS = {
        "id": "quality-presets",
        "name": "Quality Presets",
        "description": "Common quality presets for different generation scenarios",
        "model_type": "any",
        "presets": {
            "draft": {
                "description": "Fast draft quality for quick iterations",
                "parameters": {
                    "num_inference_steps": 20,
                    "guidance_scale": 5.0,
                    "width": 512,
                    "height": 512,
                },
            },
            "balanced": {
                "description": "Balanced quality and speed for most use cases",
                "parameters": {
                    "num_inference_steps": 30,
                    "guidance_scale": 7.5,
                    "width": 768,
                    "height": 768,
                },
            },
            "quality": {
                "description": "High quality for final outputs",
                "parameters": {
                    "num_inference_steps": 50,
                    "guidance_scale": 7.5,
                    "width": 1024,
                    "height": 1024,
                },
            },
            "extreme": {
                "description": "Maximum quality, very slow",
                "parameters": {
                    "num_inference_steps": 150,
                    "guidance_scale": 8.0,
                    "width": 1536,
                    "height": 1536,
                },
            },
        },
        "version": "1.0.0",
    }
  • STYLE_PRESETS dictionary used by generate_image to apply style-specific parameters and prompt prefixes (photorealistic, cinematic, anime, etc.).
    STYLE_PRESETS = {
        "id": "style-presets",
        "name": "Style Presets",
        "description": "Common style presets for different artistic looks",
        "model_type": "any",
        "presets": {
            "photorealistic": {
                "description": "Highly detailed photorealistic style",
                "parameters": {
                    "prompt_prefix": "professional photograph, photorealistic, highly detailed, 8k uhd",
                    "negative_prompt": "painting, drawing, illustration, anime, cartoon, artistic, unrealistic",
                    "guidance_scale": 8.0,
                },
            },
            "cinematic": {
                "description": "Dramatic cinematic style",
                "parameters": {
                    "prompt_prefix": "cinematic shot, dramatic lighting, movie scene, high budget film",
                    "negative_prompt": "low quality, amateur, poorly lit",
                    "guidance_scale": 7.5,
                },
            },
            "anime": {
                "description": "Anime/manga style",
                "parameters": {
                    "prompt_prefix": "anime style, manga art, clean lines, vibrant colors",
                    "negative_prompt": "photorealistic, 3d render, photograph, western art style",
                    "guidance_scale": 7.0,
                },
            },
            "digital_art": {
                "description": "Digital art style",
                "parameters": {
                    "prompt_prefix": "digital art, vibrant colors, detailed illustration",
                    "negative_prompt": "photograph, realistic, grainy, noisy",
                    "guidance_scale": 7.0,
                },
            },
            "oil_painting": {
                "description": "Oil painting style",
                "parameters": {
                    "prompt_prefix": "oil painting, textured brushstrokes, artistic, rich colors",
                    "negative_prompt": "photograph, digital art, 3d render, smooth",
                    "guidance_scale": 7.0,
                },
            },
        },
        "version": "1.0.0",
    }
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It states that the tool generates an image but doesn't disclose behavioral traits such as whether it's a read or write operation, potential costs, rate limits, authentication needs, or what happens on failure (e.g., error handling). The description is minimal and misses critical context for a generative tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise with a single sentence that directly states the tool's function. It's front-loaded with no wasted words, making it easy to parse quickly. However, this conciseness comes at the cost of completeness, as noted in other dimensions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, generative function), lack of annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., image URL, binary data), error conditions, or usage constraints. For an image generation tool with rich parameters, this minimal description leaves too much unspecified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It only mentions 'specified parameters' generically without explaining what parameters exist (e.g., prompt, style, quality) or their meanings. With 7 parameters including one required ('prompt'), this lack of semantic detail is a significant gap, failing to add value beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
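One way to close this gap without lengthening the description is to put the documentation into the JSON Schema itself. A hedged sketch of what that could look like (the description strings are illustrative suggestions, not the server's actual schema):

```python
# Illustrative input schema with per-parameter descriptions; the wording is a
# suggestion, not the server's actual schema.
input_schema = {
    "type": "object",
    "properties": {
        "prompt": {"type": "string",
                   "description": "Text description of the image to generate."},
        "style": {"type": "string",
                  "description": "Optional style preset: 'photorealistic', "
                                 "'cinematic', 'anime', 'digital_art', "
                                 "or 'oil_painting'."},
        "quality": {"type": "string",
                    "enum": ["draft", "balanced", "quality", "extreme"],
                    "default": "balanced",
                    "description": "Speed/quality trade-off preset."},
        "width": {"type": "integer",
                  "description": "Output width in pixels; overrides the quality preset."},
        "height": {"type": "integer",
                   "description": "Output height in pixels; overrides the quality preset."},
        "num_outputs": {"type": "integer", "default": 1,
                        "description": "Number of images to generate."},
        "seed": {"type": "integer",
                 "description": "Random seed for reproducible results."},
    },
    "required": ["prompt"],
}
```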

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the tool's purpose ('Generate an image') which is clear but vague. It specifies the action and resource but lacks detail about what kind of image generation (e.g., AI-generated, from templates) or how it differs from sibling tools like 'create_prediction' or 'open_image_with_system'. It's not tautological but doesn't provide specific differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It mentions 'using the specified parameters' but doesn't indicate context, prerequisites, or exclusions. With multiple sibling tools like 'create_prediction' and 'search_models', this lack of guidance leaves the agent uncertain about tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
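Taken together, the critiques above point toward a fuller description. An illustrative rewrite, grounded only in what the implementation shown earlier actually does (the wording is a suggestion, not the server's text):

```python
# Illustrative tool description addressing the review's gaps; the wording is a
# suggestion, not the server's actual description.
DESCRIPTION = (
    "Generate images from a text prompt using Replicate's SDXL model "
    "(hardcoded version). Creates a prediction on Replicate, which requires a "
    "REPLICATE_API_TOKEN and may incur usage costs. Returns a resource URI and "
    "metadata, not the image itself; poll with get_prediction to retrieve "
    "results. Use this for SDXL generation with quality/style presets; use "
    "create_prediction directly for other models or custom parameters."
)
```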
