What can you do with this server?

mcp-alphabanana is an MCP server that generates high-quality image assets using Google Gemini AI models, designed for web/game asset pipelines and MCP-compatible clients (Claude Desktop, VS Code, Cursor).

Image Generation
* Create images from text prompts using Gemini model tiers: Flash3.1, Flash2.5, and Pro3
* Generate at 0.5K, 1K, 2K, or 4K source resolutions with native aspect ratio support
* Supported aspect ratios: 1:1, 2:3, 3:2, 3:4, 4:3, 9:16, 16:9, 21:9, and more (Flash3.1 adds 1:4, 4:1, 1:8, 8:1)
* Zero watermarks on all outputs

Output Control
* Formats: PNG, JPEG, or WebP
* Delivery: Save to file, return as base64, or both
* Sizing: Specify exact pixel dimensions or use noresize to return Gemini's native dimensions
* Resize modes: crop, stretch, letterbox, or contain

Transparency & Post-Processing
* Generate transparent PNG/WebP via automatic background removal (histogram analysis)
* Custom color-key transparency with configurable hex color and tolerance
* Fringe reduction modes: auto, crisp, or hd for clean alpha edges

Advanced Features
* Reference image guidance: Provide up to 14 local images (3 for Flash2.5) to influence style/composition
* Thinking Mode (minimal or high): Higher prompt adherence (Flash3.1 only)
* Grounding via Google Search (text, image, or both): Search-backed accuracy (Flash3.1 only)
* Metadata output: Grounding/reasoning metadata and model thought content in JSON
* Debug mode: Save intermediate processing artifacts

Which integrations are available for this server?

Enables image generation using Google Gemini models (Flash 3.1, Flash 2.5, Pro 3), supporting features like transparent PNG/WebP assets, local reference image guidance, and search-backed grounding.

mcp-alphabanana

by tasopen

TypeScript

mcp-alphabanana is a Model Context Protocol (MCP) server for generating image assets with Google Gemini. It is built for MCP-compatible clients and agent workflows that need fast image generation, transparent outputs, reference-image guidance, and flexible delivery formats.

Keywords: MCP server, Model Context Protocol, Gemini AI, image generation, FastMCP

Key capabilities:

Ultra-fast Gemini image generation across Flash and Pro tiers
Transparent PNG/WebP asset output for web and game pipelines
Multi-image style guidance with local reference image files
Flexible file, base64, or combined outputs for agent workflows

Quick Start

Run the MCP server with npx:

npx -y @tasopen/mcp-alphabanana

Or add it to your MCP configuration:

{
  "mcp": {
    "servers": {
      "alphabanana": {
        "command": "npx",
        "args": ["-y", "@tasopen/mcp-alphabanana"],
        "env": {
          "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
        }
      }
    }
  }
}

Set GEMINI_API_KEY before starting the server.

For Claude Desktop, download mcp-alphabanana-latest.mcpb and add it as an extension from Claude Desktop Settings. On Windows, adding the FileSystem extension is recommended for better local file handling.

Claude Registry

The Claude registry / MCPB package metadata is defined in manifest.json and ships with the static 512x512 icon at images/mcp-alphabanana.png.

Native sharp runtime packages are declared as optional dependencies so .mcpb installs can resolve the correct prebuilt binary on each supported platform without relying on postinstall hooks.

Stable MCPB URL: https://github.com/tasopen/mcp-alphabanana/releases/latest/download/mcp-alphabanana-latest.mcpb
Versioned MCPB URL pattern: https://github.com/tasopen/mcp-alphabanana/releases/download/vVERSION/mcp-alphabanana-VERSION.mcpb
Support: GitHub Issues
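
The versioned URL pattern above can be filled in programmatically. The following is a minimal sketch; `mcpbUrl` is a hypothetical helper name, and the version string in the usage note is a placeholder, not a real release.

```typescript
// Hypothetical helper: builds the versioned MCPB download URL from the
// pattern listed above. Accepts "1.2.3" or "v1.2.3" style input.
function mcpbUrl(version: string): string {
  const v = version.replace(/^v/, ""); // normalize away a leading "v"
  return (
    "https://github.com/tasopen/mcp-alphabanana/releases/download/" +
    `v${v}/mcp-alphabanana-${v}.mcpb`
  );
}
```

Calling `mcpbUrl("1.2.3")` (placeholder version) yields a URL ending in `/v1.2.3/mcp-alphabanana-1.2.3.mcpb`. Whether a given version exists must be checked against the releases page.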

MCP Server

This repository provides an MCP server that enables AI agents to generate images using Google Gemini.

It can be used with MCP-compatible clients such as:

Claude Desktop
VS Code MCP
Cursor

Built with FastMCP 3 for a simplified codebase and flexible output options.

Available Tools

generate_image

Generates images using Google Gemini with optional transparency, local reference images, grounding, and reasoning metadata.

For Claude Desktop, prefer outputType=file for medium or large images. base64 and combine responses consume Claude context and can hit the client's size limit. On Windows, use the FileSystem extension to choose a writable absolute outputPath and any local referenceImages paths.

Key parameters:

prompt (string): description of the image to generate
model: Flash3.1, Flash2.5, Pro3, flash, pro
outputWidth and outputHeight: requested final image size in pixels in normal mode
noresize + aspectRatio + output_resolution: return Gemini native size without resizing
output_resolution: 0.5K, 1K, 2K, 4K
output_format: png, jpg, webp
outputType: file, base64, combine
outputPath: required when outputType is file or combine
transparent: enable transparent PNG/WebP post-processing
referenceImages: optional array of local reference image files
grounding_type and thinking_mode: advanced Gemini 3.1 controls

Model Selection

| Input Model ID | Internal Model ID | Description |
| --- | --- | --- |
| `Flash3.1` | `gemini-3.1-flash-image-preview` | Ultra-fast, supports Thinking/Grounding. |
| `Flash2.5` | `gemini-2.5-flash-image` | Legacy Flash. High stability. Low cost. |
| `Pro3` | `gemini-3.0-pro-image-preview` | High-fidelity Pro model. |
| `flash` | `gemini-3.1-flash-image-preview` | Alias for backward compatibility. |
| `pro` | `gemini-3.0-pro-image-preview` | Alias for backward compatibility. |
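
The table above amounts to a lookup from input model ID to internal model ID. The sketch below shows one way a client could mirror that mapping. `resolveModelId` is a hypothetical helper, not the server's actual code; the IDs are copied from the table.

```typescript
// Mapping copied from the Model Selection table.
const MODEL_IDS = {
  "Flash3.1": "gemini-3.1-flash-image-preview",
  "Flash2.5": "gemini-2.5-flash-image",
  "Pro3": "gemini-3.0-pro-image-preview",
  "flash": "gemini-3.1-flash-image-preview", // backward-compatible alias
  "pro": "gemini-3.0-pro-image-preview", // backward-compatible alias
} as const;

type InputModelId = keyof typeof MODEL_IDS;

// Hypothetical helper: resolves an input model ID to the internal one,
// throwing on anything that is not listed in the table.
function resolveModelId(input: string): string {
  if (!Object.prototype.hasOwnProperty.call(MODEL_IDS, input)) {
    throw new Error(`Unknown model: ${input}`);
  }
  return MODEL_IDS[input as InputModelId];
}
```

With this mapping, `flash` and `Flash3.1` resolve to the same internal ID, as do `pro` and `Pro3`.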

Parameters

Full parameter reference for the generate_image tool.

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| `prompt` | string | required | Description of the image to generate |
| `outputFileName` | string | required | Output filename (extension auto-added if missing) |
| `outputType` | enum | `combine` | `file`, `base64`, or `combine` |
| `model` | enum | `Flash3.1` | Model: `Flash3.1`, `Flash2.5`, `Pro3`, `flash`, `pro` |
| `output_resolution` | enum | auto | `0.5K`, `1K`, `2K`, `4K`; required when `noresize=true` |
| `noresize` | boolean | `false` | Skip post-generation resize and return Gemini native dimensions |
| `aspectRatio` | enum | optional | Required when `noresize=true`; e.g. `1:1`, `16:9`, `4:5` |
| `outputWidth` | integer | required unless `noresize=true` | Final output width in pixels |
| `outputHeight` | integer | required unless `noresize=true` | Final output height in pixels |
| `output_format` | enum | `png` | `png`, `jpg`, `webp` |
| `outputPath` | string | required for `file` / `combine` | Absolute output directory path |
| `transparent` | boolean | `false` | Transparent background (PNG/WebP only) |
| `transparentColor` | string or null | `null` | Color key override for transparency extraction |
| `colorTolerance` | integer | `30` | Transparency color matching tolerance |
| `fringeMode` | enum | `auto` | `auto`, `crisp`, `hd` |
| `resizeMode` | enum | `crop` | `crop`, `stretch`, `letterbox`, `contain` |
| `grounding_type` | enum | `none` | `none`, `text`, `image`, `both` (Flash3.1 only) |
| `thinking_mode` | enum | `minimal` | `minimal`, `high` (Flash3.1 only) |
| `include_thoughts` | boolean | `false` | Return model reasoning fields when metadata is enabled |
| `include_metadata` | boolean | `false` | Include grounding and reasoning metadata in JSON output |
| `referenceImages` | array | `[]` | Up to 14 local reference files (Flash3.1/Pro3), 3 for Flash2.5 |
| `debug` | boolean | `false` | Save intermediate debug artifacts |
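
Several parameters in the table are conditionally required. The sketch below is a client-side pre-check of those combinations before calling generate_image. `GenerateImageArgs` and `checkArgs` are hypothetical names, the interface covers only a subset of the parameters, and the server's own validation may differ.

```typescript
// Subset of generate_image arguments, named as in the parameter table.
interface GenerateImageArgs {
  prompt: string;
  outputFileName: string;
  outputType?: "file" | "base64" | "combine";
  outputPath?: string;
  noresize?: boolean;
  aspectRatio?: string;
  output_resolution?: "0.5K" | "1K" | "2K" | "4K";
  outputWidth?: number;
  outputHeight?: number;
}

function isPositiveInt(value: number | undefined): boolean {
  return typeof value === "number" && value > 0 && Math.floor(value) === value;
}

// Hypothetical pre-check: returns a list of problems (empty means OK).
function checkArgs(args: GenerateImageArgs): string[] {
  const problems: string[] = [];
  const outputType = args.outputType ?? "combine"; // documented default
  if (outputType !== "base64" && !args.outputPath) {
    problems.push(`outputPath is required when outputType is ${outputType}`);
  }
  if (args.noresize) {
    // Native-size mode needs a ratio and a resolution instead of pixel sizes.
    if (!args.aspectRatio) problems.push("aspectRatio is required when noresize=true");
    if (!args.output_resolution) problems.push("output_resolution is required when noresize=true");
  } else {
    if (!isPositiveInt(args.outputWidth)) problems.push("outputWidth must be a positive integer");
    if (!isPositiveInt(args.outputHeight)) problems.push("outputHeight must be a positive integer");
  }
  return problems;
}
```

Returning a list rather than throwing on the first problem lets an agent report every missing field in one pass.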

Why alphabanana?

Zero Watermarks: API-native clean images.
Thinking/Grounding Support: Higher prompt adherence and search-backed accuracy.
Production Ready: Supports transparent WebP and exact aspect ratios for web and game assets.

Features

Ultra-fast image generation (Gemini 3.1 Flash, 0.5K/1K/2K/4K)
Advanced multi-image reasoning (up to 14 reference images)
Thinking/Grounding support (Flash3.1 only)
Transparent PNG/WebP output (color-key post-processing, despill)
Multiple output formats: file, base64, or both
Flexible resize modes: crop, stretch, letterbox, contain
Multiple model tiers: Flash3.1, Flash2.5, Pro3, legacy aliases
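
The resize modes listed above are not specified in detail here. The sketch below illustrates the two scale factors such modes are commonly built on, under loudly stated assumptions: `crop` is assumed to scale the image until it covers the target (overflow is cropped), and `letterbox` is assumed to scale it until it fits inside the target (the remainder is padded). How `contain` differs from `letterbox` is not covered. All names are hypothetical.

```typescript
interface Size {
  width: number;
  height: number;
}

// Assumed meaning of "crop": scale so the image covers the whole target.
function coverScale(src: Size, dst: Size): number {
  return Math.max(dst.width / src.width, dst.height / src.height);
}

// Assumed meaning of "letterbox": scale so the whole image fits inside the target.
function fitScale(src: Size, dst: Size): number {
  return Math.min(dst.width / src.width, dst.height / src.height);
}

// Size of the scaled image before any cropping or padding is applied.
function scaledSize(src: Size, scale: number): Size {
  return {
    width: Math.round(src.width * scale),
    height: Math.round(src.height * scale),
  };
}
```

For a 1024x1024 source and a 600x448 target, the cover scale gives a 600x600 intermediate image (152 rows to crop), while the fit scale gives 448x448 (152 columns to pad).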

Example Outputs

These sample outputs were generated with mcp-alphabanana and stored in images/examples.

Pixel art asset	Reference-image game scene	Photorealistic generation

Configuration

Configure the GEMINI_API_KEY in your MCP configuration (for example, mcp.json).

Examples:

Reference an OS environment variable from mcp.json:

{
  "env": {
    "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
  }
}

Provide the key directly in mcp.json:

{
  "env": {
    "GEMINI_API_KEY": "your_api_key_here"
  }
}

VS Code Integration

Add the server to your VS Code settings (.vscode/settings.json or user settings). Set the server's env in mcp.json or through the VS Code MCP settings.

{
  "mcp": {
    "servers": {
      "mcp-alphabanana": {
        "command": "npx",
        "args": ["-y", "@tasopen/mcp-alphabanana"],
        "env": {
          "GEMINI_API_KEY": "${env:GEMINI_API_KEY}"
        }
      }
    }
  }
}

Optional: Set a custom fallback directory for write failures by adding MCP_FALLBACK_OUTPUT to the env object.

Usage Examples

Basic Generation

{
  "prompt": "A pixel art treasure chest, golden trim, wooden texture",
  "model": "Flash3.1",
  "outputFileName": "chest",
  "outputType": "base64",
  "outputWidth": 64,
  "outputHeight": 64,
  "transparent": true
}

Native Size Without Resize

{
  "prompt": "A clean app icon with a banana mascot, flat graphic design",
  "model": "Flash3.1",
  "outputFileName": "banana-icon-native",
  "outputType": "base64",
  "noresize": true,
  "aspectRatio": "1:1",
  "output_resolution": "0.5K",
  "output_format": "png"
}

This mode returns the Gemini native pixel size for the requested ratio and resolution. For example, 1:1 + 0.5K returns 512x512 without any resize pass.

Advanced (Vertical poster and thinking)

{
  "prompt": "A vertical, photorealistic travel poster advertising Magical Wings Day Tours. A joyful young couple flies high above a breathtaking European countryside at golden hour, holding hands as they soar through a partly cloudy sky. Below them are vineyards, villages, forests, a winding river, and a hilltop medieval castle. The poster uses large, elegant typography with the headline FLY THE COUNTRYSIDE at the top and Magical Wings Day Tours branding near the bottom.",
  "model": "Flash3.1",
  "output_resolution": "1K",
  "outputFileName": "photoreal-travel-poster",
  "outputType": "file",
  "outputPath": "/path/to/output",
  "outputWidth": 848,
  "outputHeight": 1264,
  "output_format": "jpg",
  "thinking_mode": "high",
  "include_metadata": true
}

Grounding Sample (Search-backed)

{
  "prompt": "A modern travel poster featuring today's weather and skyline highlights in Kuala Lumpur",
  "model": "Flash3.1",
  "outputFileName": "kl_travel_poster",
  "outputType": "base64",
  "outputWidth": 1024,
  "outputHeight": 1024,
  "grounding_type": "text",
  "thinking_mode": "high",
  "include_metadata": true,
  "include_thoughts": true
}

This sample enables Google Search grounding and returns grounding and reasoning metadata in JSON.
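
Because grounding and thinking are documented as Flash3.1 only, a client may want to flag them before sending a request to another model. The sketch below is one way to do that. `unsupportedAdvancedOptions` is a hypothetical helper, and it assumes that only non-default values (`grounding_type` other than `none`, `thinking_mode` other than `minimal`) matter on other models; how the server actually reacts is not stated here.

```typescript
type GroundingType = "none" | "text" | "image" | "both";
type ThinkingMode = "minimal" | "high";

interface AdvancedOptions {
  model?: string; // documented default: Flash3.1
  grounding_type?: GroundingType; // documented default: none
  thinking_mode?: ThinkingMode; // documented default: minimal
}

// Hypothetical helper: lists the advanced options that are set to a
// non-default value while the selected model is not Flash3.1.
function unsupportedAdvancedOptions(opts: AdvancedOptions): string[] {
  const model = opts.model ?? "Flash3.1";
  // "flash" is the documented alias for Flash3.1.
  const isFlash31 = model === "Flash3.1" || model === "flash";
  if (isFlash31) return [];
  const flagged: string[] = [];
  if ((opts.grounding_type ?? "none") !== "none") flagged.push("grounding_type");
  if ((opts.thinking_mode ?? "minimal") !== "minimal") flagged.push("thinking_mode");
  return flagged;
}
```

For the grounding sample above the result is an empty list, since it targets Flash3.1.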

With Reference Images

{
  "prompt": "Use the reference image to create a game screen showing an opened treasure chest filled with coins and treasure, 8-bit dungeon crawler style, after-battle reward scene, dungeon corridor background, four-party status UI at the bottom",
  "model": "Flash3.1",
  "output_resolution": "0.5K",
  "outputFileName": "reference-image-dungeon-loot",
  "outputType": "file",
  "outputPath": "/path/to/output",
  "outputWidth": 600,
  "outputHeight": 448,
  "output_format": "webp",
  "transparent": false,
  "referenceImages": [
    {
      "description": "Treasure chest style reference",
      "filePath": "/path/to/references/pixel-art-treasure-chest.png"
    }
  ]
}
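
The documented reference-image limits (14 for Flash3.1/Pro3, 3 for Flash2.5) can be checked before a request is sent. This is a minimal sketch with hypothetical helper names. It assumes the `flash` and `pro` aliases share the 14-image limit of the models they map to, and that `description` is optional, which the text above does not confirm.

```typescript
interface ReferenceImage {
  filePath: string;
  description?: string; // assumed optional; not confirmed by the docs above
}

// Documented limits: 3 for Flash2.5, 14 for Flash3.1 and Pro3.
function referenceImageLimit(model: string): number {
  return model === "Flash2.5" ? 3 : 14;
}

// Hypothetical pre-check: throws when too many reference images are supplied.
function checkReferenceImages(model: string, refs: ReferenceImage[]): void {
  const limit = referenceImageLimit(model);
  if (refs.length > limit) {
    throw new Error(
      `${model} accepts at most ${limit} reference images, got ${refs.length}`
    );
  }
}
```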

Transparency & Output Formats

PNG: Full alpha, color-key + despill
WebP: Full alpha, better compression (Flash3.1+)
JPEG: No transparency (falls back to solid background)
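
To illustrate what color-key transparency with a tolerance means, here is a minimal sketch that clears the alpha of every pixel close to a key color. It is not the server's implementation: the distance metric (largest per-channel difference) is an assumption, despill and fringe handling are omitted, and `parseHex` and `applyColorKey` are hypothetical names.

```typescript
type Rgb = [number, number, number];

// Parses "#RRGGBB" or "RRGGBB" into its three channel values.
function parseHex(hex: string): Rgb {
  const m = /^#?([0-9a-fA-F]{6})$/.exec(hex);
  if (!m) throw new Error(`Invalid hex color: ${hex}`);
  const n = parseInt(m[1], 16);
  return [(n >> 16) & 0xff, (n >> 8) & 0xff, n & 0xff];
}

// rgba is a flat RGBA byte buffer and is modified in place.
// Assumption: a pixel matches when its largest per-channel difference
// from the key color is within the tolerance.
function applyColorKey(
  rgba: Uint8ClampedArray,
  keyHex: string,
  tolerance: number
): void {
  const [kr, kg, kb] = parseHex(keyHex);
  for (let i = 0; i + 3 < rgba.length; i += 4) {
    const diff = Math.max(
      Math.abs(rgba[i] - kr),
      Math.abs(rgba[i + 1] - kg),
      Math.abs(rgba[i + 2] - kb)
    );
    if (diff <= tolerance) rgba[i + 3] = 0; // make the matching pixel transparent
  }
}
```

A hard threshold like this leaves jagged, color-tinted edges, which is the kind of artifact the despill step and the fringeMode options are presumably there to address.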

Development

# Development mode with MCP CLI
npm run dev

# MCP Inspector (Web UI)
npm run inspect

# Build for production
npm run build

License

MIT
