Nanana AI Image Generation Server

image_to_image

Transform existing images using text prompts to create new variations. Upload 1-9 images and describe desired changes to generate modified versions with AI.

Instructions

Transform existing images based on a text prompt using Nanana AI. This operation typically takes 15-30 seconds to complete. The tool will wait for transformation to finish and return the final image URL.

Input Schema


Name	Required	Description	Default
`imageUrls`	Yes	Array of image URLs to transform (1-9 images)
`prompt`	Yes	The text prompt describing how to transform the images
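
The constraints in the table can be checked before a call is made. The following is a minimal sketch, not part of the server source: `validateImageToImageArgs` is a hypothetical helper, and it checks only what the schema states (an array of 1-9 strings and a string prompt), not whether the URLs are reachable.

```typescript
// Mirrors the documented input: imageUrls (1-9 strings) and prompt (string).
interface ImageToImageParams {
  imageUrls: string[];
  prompt: string;
}

// Hypothetical helper (an assumption, not in the source): returns a list of
// problems; an empty list means the arguments satisfy the documented schema.
function validateImageToImageArgs(args: unknown): string[] {
  if (typeof args !== "object" || args === null) {
    return ["arguments must be an object"];
  }
  const problems: string[] = [];
  const { imageUrls, prompt } = args as Record<string, unknown>;

  if (!Array.isArray(imageUrls)) {
    problems.push("imageUrls must be an array");
  } else {
    if (imageUrls.length < 1 || imageUrls.length > 9) {
      problems.push("imageUrls must contain 1-9 items");
    }
    if (!imageUrls.every((url) => typeof url === "string")) {
      problems.push("every imageUrls item must be a string");
    }
  }
  if (typeof prompt !== "string") {
    problems.push("prompt must be a string");
  }
  return problems;
}

// Example arguments; the URL is a placeholder.
const exampleArgs: ImageToImageParams = {
  imageUrls: ["https://example.com/photo.png"],
  prompt: "Make the sky look like a sunset",
};
```

Calling `validateImageToImageArgs(exampleArgs)` yields an empty list, while an empty `imageUrls` array or a missing `prompt` produces a readable message that could be surfaced to the caller before any request is sent.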

Implementation Reference

src/index.ts:106-137 (handler)

Core handler function that executes the image-to-image tool logic: sends POST request to Nanana API with imageUrls and prompt, handles response and errors, converts image to base64.

```
async function callImageToImage(
  apiToken: string,
  params: ImageToImageParams
): Promise<{ imageUrl: string; imageBase64: string; prompt: string }> {
  const response = await fetch(`${API_BASE_URL}/api/mcp/v1/image-to-image`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      imageUrls: params.imageUrls,
      prompt: params.prompt,
    }),
  });

  if (!response.ok) {
    const error = (await response.json().catch(() => ({}))) as {
      error?: string;
    };
    throw new Error(
      error.error || `API request failed with status ${response.status}`
    );
  }

  const data = (await response.json()) as { imageUrl: string; prompt: string };

  // Download image and convert to base64
  const imageBase64 = await imageUrlToBase64(data.imageUrl);

  return { imageUrl: data.imageUrl, imageBase64, prompt: data.prompt };
}
```
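
The handler above calls `imageUrlToBase64`, whose body is not included in this listing. The sketch below shows one way such a helper could work; it is an assumption, not the server's actual implementation. The names `imageUrlToBase64Sketch`, `Fetcher`, `BinaryResponse`, and `encodeBase64` are hypothetical, and the download function is passed in so the encoding step can be exercised without network access.

```typescript
// Minimal response shape this sketch relies on; the Response returned by the
// global fetch in Node 18+ and browsers satisfies it.
interface BinaryResponse {
  ok: boolean;
  status: number;
  arrayBuffer(): Promise<ArrayBuffer>;
}
type Fetcher = (url: string) => Promise<BinaryResponse>;

const BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 encoding with "=" padding. In Node,
// Buffer.from(bytes).toString("base64") gives the same result.
function encodeBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const hasSecond = i + 1 < bytes.length;
    const hasThird = i + 2 < bytes.length;
    const b0 = bytes[i];
    const b1 = hasSecond ? bytes[i + 1] : 0;
    const b2 = hasThird ? bytes[i + 2] : 0;
    out += BASE64_ALPHABET[b0 >> 2];
    out += BASE64_ALPHABET[((b0 & 3) << 4) | (b1 >> 4)];
    out += hasSecond ? BASE64_ALPHABET[((b1 & 15) << 2) | (b2 >> 6)] : "=";
    out += hasThird ? BASE64_ALPHABET[b2 & 63] : "=";
  }
  return out;
}

// Hypothetical stand-in for imageUrlToBase64: download the image bytes with
// the supplied fetcher, then encode them.
async function imageUrlToBase64Sketch(
  url: string,
  fetcher: Fetcher
): Promise<string> {
  const response = await fetcher(url);
  if (!response.ok) {
    throw new Error(`Image download failed with status ${response.status}`);
  }
  return encodeBase64(new Uint8Array(await response.arrayBuffer()));
}
```

In a runtime with a global `fetch`, the sketch could be called as `imageUrlToBase64Sketch(data.imageUrl, fetch)`. Failing on a non-OK download mirrors how the handler treats a non-OK API response, though the real helper may behave differently.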

src/index.ts:18-21 (schema)

TypeScript interface defining the input parameters for the image_to_image tool.

```
interface ImageToImageParams {
  imageUrls: string[];
  prompt: string;
}
```

src/index.ts:40-61 (registration)

Tool registration object defining the name, description, and input schema for 'image_to_image'.

```
const IMAGE_TO_IMAGE_TOOL: Tool = {
  name: "image_to_image",
  description:
    "Transform existing images based on a text prompt using Nanana AI. This operation typically takes 15-30 seconds to complete. The tool will wait for transformation to finish and return the final image URL.",
  inputSchema: {
    type: "object",
    properties: {
      imageUrls: {
        type: "array",
        items: { type: "string" },
        description: "Array of image URLs to transform (1-9 images)",
        minItems: 1,
        maxItems: 9,
      },
      prompt: {
        type: "string",
        description: "The text prompt describing how to transform the images",
      },
    },
    required: ["imageUrls", "prompt"],
  },
};
```

src/index.ts:161-165 (registration)

Registration of available tools list, including image_to_image tool, in the MCP listTools handler.

```
server.setRequestHandler(ListToolsRequestSchema, async () => {
  return {
    tools: [TEXT_TO_IMAGE_TOOL, IMAGE_TO_IMAGE_TOOL],
  };
});
```

src/index.ts:187-202 (handler)

MCP CallToolRequestSchema dispatch handler block that invokes the image_to_image tool handler.

```
} else if (name === "image_to_image") {
  const params = args as unknown as ImageToImageParams;
  console.error(
    `[MCP] Starting image-to-image transformation: "${params.prompt}"`
  );
  const result = await callImageToImage(apiToken, params);
  console.error(`[MCP] Transformation completed: ${result.imageUrl}`);
  return {
    content: [
      {
        type: "text",
        text: `Successfully transformed image!\n\nPrompt: ${result.prompt}\n\nImage URL: ${result.imageUrl}`,
      },
    ],
  };
} else {
```
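
To make the response shape concrete, the sketch below isolates the part of the dispatch block that builds the tool result. `buildTextResult`, `ImageToImageResult`, and `TextContent` are hypothetical names introduced for illustration; the message text is copied from the excerpt above. In that excerpt, only the prompt and image URL appear in the returned content, and the base64 payload is not included.

```typescript
// Shape of the value returned by callImageToImage in the handler excerpt.
interface ImageToImageResult {
  imageUrl: string;
  imageBase64: string;
  prompt: string;
}

// A single text content item, as used in the dispatch excerpt.
interface TextContent {
  type: "text";
  text: string;
}

// Hypothetical helper (an assumption, not in the source): builds the same
// text-only result that the dispatch block returns inline.
function buildTextResult(result: ImageToImageResult): { content: TextContent[] } {
  return {
    content: [
      {
        type: "text",
        text: `Successfully transformed image!\n\nPrompt: ${result.prompt}\n\nImage URL: ${result.imageUrl}`,
      },
    ],
  };
}
```

An agent consuming this result would read the image URL out of the single text item, which is consistent with the tool description's statement that the final image URL is returned.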

Tool Definition Quality

A (4.2/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does well by disclosing key behavioral traits: the operation takes 15-30 seconds, the tool waits for completion, and it returns a final image URL. This covers timing, blocking behavior, and output format, none of which is available elsewhere.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste: first states purpose, second provides timing information, third describes blocking behavior and return value. Each sentence earns its place and the description is appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no annotations and no output schema, the description does well by covering purpose, timing, blocking behavior, and return format. However, it doesn't mention potential errors, rate limits, or authentication requirements that might be relevant for an AI image transformation service.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description adds no meaning about the parameters beyond what the schema provides (an array of 1-9 image URLs and a text prompt). A baseline score of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Transform existing images'), the resource ('images'), and the method ('based on a text prompt using Nanana AI'). It distinguishes from the sibling tool 'text_to_image' by specifying it works with existing images rather than generating from text alone.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool (transforming existing images with a text prompt) and implicitly distinguishes it from 'text_to_image', which likely generates images from text alone. However, it doesn't explicitly state when NOT to use this tool or mention alternatives beyond the sibling tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/nanana-app/mcp-server-nano-banana'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.