Agnes AI MCP Server

image_to_image

Generate new images from reference images and a text prompt. Supply existing images as URLs and describe the desired output to create customized versions.

Instructions

Generate new image(s) based on reference image(s) and a text prompt.

This is image-to-image generation: provide one or more reference images (as URLs) along with a text prompt describing the desired output.

Args:
    prompt: Text description guiding the generation.
    images: List of reference image URLs (at least one required).
    model: Model name (agnes-image-2.0-flash or agnes-image-2.1-flash).
    size: Output size (e.g. 1024x768, 1024x1024, 768x1024).
    n: Number of images to generate (1-4). Default: 1.
    output_dir: Directory to save the downloaded image(s). Defaults to ~/agnes_output.
    return_mode: 'url' for image URL, 'b64' for base64 + local save.

Returns: dict with url, local_path, model, size, n, images.
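
The constraints listed under Args can be checked on the client side before the tool is called. The following is a minimal sketch, not the server's own validation: the function name `check_image_to_image_args` is hypothetical, the WIDTHxHEIGHT pattern is inferred from the example sizes, and the http(s) prefix check is an assumption about what counts as a URL.

```python
import re

# Values taken from the Args text above; the helper itself is hypothetical.
ALLOWED_MODELS = {"agnes-image-2.0-flash", "agnes-image-2.1-flash"}
ALLOWED_RETURN_MODES = {"url", "b64"}
# Assumption: sizes follow a WIDTHxHEIGHT pattern, as in the listed examples.
SIZE_PATTERN = re.compile(r"^\d+x\d+$")


def check_image_to_image_args(prompt, images, model="agnes-image-2.1-flash",
                              size="1024x768", n=1, return_mode="url"):
    """Return a list of problems; an empty list means the arguments look valid."""
    problems = []
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt must be a non-empty string")
    if not isinstance(images, list) or len(images) < 1:
        problems.append("images must be a list with at least one URL")
    elif not all(isinstance(u, str) and u.startswith(("http://", "https://"))
                 for u in images):
        # Assumption: reference images are plain http(s) URLs.
        problems.append("every entry in images must be an http(s) URL")
    if model not in ALLOWED_MODELS:
        problems.append("model must be one of: " + ", ".join(sorted(ALLOWED_MODELS)))
    if not isinstance(size, str) or not SIZE_PATTERN.match(size):
        problems.append("size must look like WIDTHxHEIGHT, e.g. 1024x768")
    if type(n) is not int or not 1 <= n <= 4:
        problems.append("n must be an integer from 1 to 4")
    if return_mode not in ALLOWED_RETURN_MODES:
        problems.append("return_mode must be 'url' or 'b64'")
    return problems


if __name__ == "__main__":
    print(check_image_to_image_args("a cat in watercolor",
                                    ["https://example.com/cat.png"], n=2))
```

Returning a list of problems rather than raising on the first one lets an agent correct every invalid argument in a single retry.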

Input Schema


Name	Required	Default
`n`	No
`size`	No	1024x768
`model`	No	agnes-image-2.1-flash
`images`	Yes
`prompt`	Yes
`output_dir`	No
`return_mode`	No	url
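
As an illustration of how the required and optional parameters fit together, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0) for this tool. The helper name `build_tools_call` and the `SCHEMA_DEFAULTS` constant are hypothetical; the default values are copied from the table above, and parameters with no listed default (`n`, `output_dir`) are left out unless the caller supplies them.

```python
import json

# Defaults as listed in the Input Schema table. Sending them explicitly is
# optional; they are included here only to make the final payload visible.
SCHEMA_DEFAULTS = {
    "size": "1024x768",
    "model": "agnes-image-2.1-flash",
    "return_mode": "url",
}


def build_tools_call(request_id, prompt, images, **overrides):
    """Build a JSON-RPC 2.0 'tools/call' request for image_to_image.

    prompt and images are the two required parameters; anything passed in
    overrides (n, size, model, output_dir, return_mode) replaces a default
    or is added to the arguments.
    """
    arguments = {**SCHEMA_DEFAULTS, **overrides}
    arguments["prompt"] = prompt
    arguments["images"] = list(images)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "image_to_image", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tools_call(
        1,
        "turn the photo into a watercolor painting",
        ["https://example.com/reference.png"],  # placeholder URL
        n=2,
    )
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally construct this envelope; the sketch only shows which keys end up in `arguments`.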

Output Schema


Name	Required	Description	Default
No arguments

Tool Definition Quality

Grade A (4.3/5.0)

Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description bears full responsibility. It discloses that the tool generates images, accepts image URLs, and returns a dict with url, local_path, etc. It mentions saving locally and return modes. It does not cover potential rate limits or file size constraints, but the core behavior is transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with 'Args' and 'Returns' sections, making it easy to parse. It is concise, with only minor redundancy in the opening sentences. Every sentence provides value, though a slight trim could improve focus.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters (2 required) and an output schema, the description covers all inputs and outputs. It explains each parameter's role and the return format. No critical gaps are present, though it could mention that 'images' must be URLs. Overall, it is complete enough for an agent to use effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, meaning no parameter descriptions in the schema. However, the tool description lists all 7 parameters with explanations, defaults, and allowed values (e.g., model names, size formats). This adds significant meaning beyond the bare schema, fulfilling the compensation requirement.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate new image(s) based on reference image(s) and a text prompt.' It uses specific verbs ('generate') and resources ('image(s)'), and distinguishes from sibling tools like text_to_image, which lacks reference images. The title is absent but the description compensates.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains that reference images are required and a text prompt guides generation. It lists parameters and their defaults, providing context. However, it does not explicitly state when to use this tool over alternatives (e.g., text_to_image) or when not to use it. Still, the purpose is clear enough for an agent to infer appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MSWEIMZ/agnes-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.