
Nanana AI Image Generation Server

by nanana-app

text_to_image

Generate custom images from text descriptions using AI. Enter a detailed prompt to create visual content for various applications.

Instructions

Generate an image from a text prompt using Nanana AI. This operation typically takes 15-30 seconds to complete. The tool will wait for generation to finish and return the final image URL.

Input Schema

Name    | Required | Description                                      | Default
prompt  | Yes      | The text prompt describing the image to generate | —
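A hypothetical example of the arguments object an MCP client would send for this tool; the prompt text is illustrative, not from the source, and the validation helper is a sketch of the schema's `required: ["prompt"]` constraint, not server code.

```typescript
// Hypothetical example arguments for a tools/call targeting text_to_image.
const exampleArgs = { prompt: "A watercolor lighthouse at sunset" };

// Minimal runtime check mirroring the schema's required: ["prompt"] constraint.
function hasValidPrompt(args: Record<string, unknown>): boolean {
  return typeof args.prompt === "string" && args.prompt.length > 0;
}

console.log(hasValidPrompt(exampleArgs)); // true
console.log(hasValidPrompt({})); // false
```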

Implementation Reference

  • Core handler function that executes the text-to-image generation by calling the Nanana AI API, handling errors, and returning the image URL.
    async function callTextToImage(
      apiToken: string,
      params: TextToImageParams
    ): Promise<{ imageUrl: string; imageBase64: string; prompt: string }> {
      const response = await fetch(`${API_BASE_URL}/api/mcp/v1/text-to-image`, {
        method: "POST",
        headers: {
          Authorization: `Bearer ${apiToken}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({ prompt: params.prompt }),
      });
    
      if (!response.ok) {
        const error = (await response.json().catch(() => ({}))) as {
          error?: string;
        };
        throw new Error(
          error.error || `API request failed with status ${response.status}`
        );
      }
    
      const data = (await response.json()) as { imageUrl: string; prompt: string };
    
      // Image download / base64 conversion is currently stubbed out;
      // imageBase64 is returned empty.
      // const imageBase64 = await imageUrlToBase64(data.imageUrl);
      const imageBase64 = "";
    
      return { imageUrl: data.imageUrl, imageBase64, prompt: data.prompt };
    }
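The handler above stubs out the base64 step: the `imageUrlToBase64` call is commented out and `imageBase64` is returned empty. A sketch of what such a helper could look like, assuming a Node 18+ runtime with global `fetch`; this is not the server's actual implementation, only a plausible shape for the helper named in the commented-out line.

```typescript
// Hypothetical helper, named after the commented-out call above.
async function imageUrlToBase64(imageUrl: string): Promise<string> {
  const response = await fetch(imageUrl);
  if (!response.ok) {
    throw new Error(`Failed to download image: ${response.status}`);
  }
  const bytes = await response.arrayBuffer();
  return Buffer.from(bytes).toString("base64");
}

// The encoding step itself, demonstrated on local bytes:
console.log(Buffer.from("hello").toString("base64")); // "aGVsbG8="
```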
  • MCP protocol handler dispatch for the text_to_image tool call: logs the request, invokes the core handler, and formats the response content.
    if (name === "text_to_image") {
      const params = args as unknown as TextToImageParams;
      console.error(
        `[MCP] Starting text-to-image generation: "${params.prompt}"`
      );
      const result = await callTextToImage(apiToken, params);
      console.error(`[MCP] Generation completed: ${result.imageUrl}`);
      return {
        content: [
          {
            type: "text",
            text: `Successfully generated image!\n\nPrompt: ${result.prompt}\n\nImage URL: ${result.imageUrl}`,
          },
        ],
      };
    }
  • Type definition for input parameters of the text_to_image tool.
    interface TextToImageParams {
      prompt: string;
    }
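The dispatch code narrows `args` with a double cast (`as unknown as TextToImageParams`). A hedged alternative is a runtime type guard, sketched below; the guard name and structure are illustrative, not part of the source.

```typescript
interface TextToImageParams {
  prompt: string;
}

// Illustrative type guard; narrows unknown args without an unchecked cast.
function isTextToImageParams(args: unknown): args is TextToImageParams {
  return (
    typeof args === "object" &&
    args !== null &&
    typeof (args as { prompt?: unknown }).prompt === "string"
  );
}

console.log(isTextToImageParams({ prompt: "a red fox" })); // true
console.log(isTextToImageParams({})); // false
```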
  • src/index.ts:24-38 (registration)
    Tool registration object defining name, description, and input schema for text_to_image.
    const TEXT_TO_IMAGE_TOOL: Tool = {
      name: "text_to_image",
      description:
        "Generate an image from a text prompt using Nanana AI. This operation typically takes 15-30 seconds to complete. The tool will wait for generation to finish and return the final image URL.",
      inputSchema: {
        type: "object",
        properties: {
          prompt: {
            type: "string",
            description: "The text prompt describing the image to generate",
          },
        },
        required: ["prompt"],
      },
    };
  • src/index.ts:162-165 (registration)
    Registration of available tools in the MCP listTools handler, including text_to_image.
      return {
        tools: [TEXT_TO_IMAGE_TOOL, IMAGE_TO_IMAGE_TOOL],
      };
    });
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It covers the key behavioral traits: the long-running nature of the operation (15-30 second completion time), the blocking tool behavior (it waits for generation to finish), and the return value (the final image URL). This goes beyond what the input schema provides.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with three focused sentences: purpose statement, timing information, and behavioral details. Every sentence earns its place by providing essential information without redundancy or unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (asynchronous operation with waiting behavior), no annotations, and no output schema, the description provides good contextual coverage. It explains the operation timing, waiting behavior, and return format. However, it doesn't mention potential failure modes, rate limits, or authentication requirements that would be helpful for a complete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents the single 'prompt' parameter. The description adds no parameter semantics beyond what the schema provides. A baseline score of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Generate an image'), identifies the resource ('from a text prompt'), and specifies the service provider ('using Nanana AI'). It distinguishes from the sibling tool 'image_to_image' by focusing on text-to-image generation rather than image-to-image transformation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through the generation time (15-30 seconds) and the tool's waiting behavior, but it never states explicitly when to use this tool versus alternatives. It offers no when-not-to-use guidance and no comparison with the sibling tool 'image_to_image'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
