Gemini MCP Server

gemini-image

Generate images from text descriptions using Gemini's Nano Banana models. Aspect ratio and image count are adjustable, and the Pro model is selected automatically for text-heavy prompts.

Instructions

Generate images using Nano Banana (Gemini's native image generation).

Two models available:

Nano Banana (default): Fast, cheap (~$0.04/image), good for most use cases
Nano Banana Pro: Advanced model with better text rendering, infographics, diagrams

Auto-detection: If the prompt says "nano banana pro" or mentions text, infographic, diagram, chart, logo, or poster, the Pro model is used automatically.
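The auto-detection rule can be sketched roughly as follows. This is only an illustration built from the keywords listed above: the function name, the whole-word matching, and the rule that an explicit usePro value overrides detection are assumptions, not the server's published implementation.

```python
import re
from typing import Optional

# Keywords copied from the tool description; the server's real matching
# rules are not documented here, so treat this list as an assumption.
PRO_KEYWORDS = ("text", "infographic", "diagram", "chart", "logo", "poster")

# Whole-word match with an optional plural "s", so "context" is not read as "text".
_PRO_PATTERN = re.compile(r"\b(?:" + "|".join(PRO_KEYWORDS) + r")s?\b")


def should_use_pro(prompt: str, use_pro: Optional[bool] = None) -> bool:
    """Hypothetical helper: decide whether Nano Banana Pro would be selected."""
    # Assumption: an explicit usePro value wins over auto-detection.
    if use_pro is not None:
        return use_pro
    lowered = prompt.lower()
    if "nano banana pro" in lowered:
        return True
    return _PRO_PATTERN.search(lowered) is not None
```

Substring matching would be simpler but would flag prompts such as "a cat in context", which is why the sketch matches whole words.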

Parameters:

prompt: Text description of the image to generate (required)
numberOfImages: How many images to generate (1-4, default: 1)
aspectRatio: Image aspect ratio ("1:1", "3:4", "4:3", "9:16", "16:9", default: "1:1")
usePro: Force Nano Banana Pro (auto-detected from prompt if not specified)
outputPath: Optional path to save images
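As a sketch of how a client might check these constraints before calling the tool, the helper below builds an arguments object from the documented parameters. The function name and the choice to validate on the client side are assumptions; the server may validate differently or report errors in another form.

```python
from typing import Any, Dict, Optional

# Allowed values taken from the parameter list above.
ALLOWED_ASPECT_RATIOS = {"1:1", "3:4", "4:3", "9:16", "16:9"}


def build_image_arguments(
    prompt: str,
    number_of_images: int = 1,
    aspect_ratio: str = "1:1",
    use_pro: Optional[bool] = None,
    output_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: validate inputs and return the tool arguments."""
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if not 1 <= number_of_images <= 4:
        raise ValueError("numberOfImages must be between 1 and 4")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspectRatio: {aspect_ratio}")

    arguments: Dict[str, Any] = {
        "prompt": prompt,
        "numberOfImages": number_of_images,
        "aspectRatio": aspect_ratio,
    }
    # Optional parameters are omitted when unset so the server can apply
    # its own behavior (for usePro, auto-detection from the prompt).
    if use_pro is not None:
        arguments["usePro"] = use_pro
    if output_path is not None:
        arguments["outputPath"] = output_path
    return arguments
```

Leaving usePro out when it is unset preserves the documented fallback to auto-detection from the prompt.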

Input Schema


Name	Required	Description
`prompt`	Yes	Text description of the image to generate
`numberOfImages`	No	Number of images to generate (1-4)
`aspectRatio`	No	Aspect ratio of generated images
`usePro`	No	Use Nano Banana Pro for higher quality (better text, infographics)
`outputPath`	No	Optional directory path to save generated images
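For illustration, a call to this tool over the Model Context Protocol would be sent as a JSON-RPC "tools/call" request. The sketch below only builds that payload; the example prompt, the request id, and the helper name are made up, and how the message is transported (stdio, HTTP) depends on the client.

```python
import json
from typing import Any, Dict


def make_tool_call(arguments: Dict[str, Any], request_id: int = 1) -> Dict[str, Any]:
    """Hypothetical helper: wrap tool arguments in an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "gemini-image",  # tool name as listed on this page
            "arguments": arguments,
        },
    }


# Example arguments using only fields from the input schema table.
request = make_tool_call(
    {
        "prompt": "A lighthouse at dusk, watercolor style",
        "numberOfImages": 2,
        "aspectRatio": "16:9",
    }
)
print(json.dumps(request, indent=2))
```

Only prompt is required; the other fields can be dropped from the arguments object.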

Tool Definition Quality

A (4.4/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden of behavioral disclosure. It does well by explaining cost implications (~$0.04/image), model performance characteristics (fast/cheap vs better text rendering), and auto-detection behavior. However, it doesn't mention rate limits, authentication requirements, or error conditions. For a generative AI tool with no annotations, this is strong but not comprehensive behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with clear sections: purpose statement, model comparison, auto-detection logic, and parameter summary. Every sentence earns its place by providing essential information without redundancy. The information is front-loaded with the core purpose and model options before parameter details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For an image generation tool with 5 parameters, 100% schema coverage, and no output schema, the description provides strong contextual information about model selection, cost, and auto-detection. However, without annotations or output schema, it doesn't describe what the tool returns (image URLs? file paths? metadata?) or potential limitations. Given the complexity, it's mostly complete but missing output information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal value beyond the schema: it mentions the auto-detection logic, which relates to the 'usePro' parameter, but doesn't provide additional semantic context about parameter interactions or usage patterns. The baseline of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate images using Nano Banana (Gemini's native image generation).' It specifies the exact action (generate images) and resource (Nano Banana/Gemini), distinguishing it from sibling tools like gemini (likely text generation) and gemini-video-generate (video generation). The description immediately establishes this is an image generation tool with specific model options.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use different model options: 'Fast, cheap (~$0.04/image), good for most use cases' for the default model versus 'Advanced model with better text rendering, infographics, diagrams' for Pro. It also explains auto-detection logic for Pro model usage and mentions the 'usePro' parameter for manual override. This gives clear decision criteria for model selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/riotofgeese/gemini-mcp'
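The same request can be sketched in Python with the standard library. The URL is the one from the curl command above; the Accept header, the timeout, the function names, and the assumption that the endpoint returns JSON are illustrative and not confirmed by this page.

```python
import json
import urllib.request
from typing import Any

API_URL = "https://glama.ai/api/mcp/v1/servers/riotofgeese/gemini-mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json"},  # assumed, not required by the docs
    )


def fetch_server_info(url: str = API_URL, timeout: float = 10.0) -> Any:
    """Send the request and parse the body, assuming a JSON response."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling fetch_server_info() performs the network request; build_request() can be inspected on its own without network access.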

If you have feedback or need assistance with the MCP directory API, please join our Discord server.