Open Google Image Generator MCP

by miracorhan

This project is a Model Context Protocol (MCP) server that exposes Google Cloud Vertex AI and Google GenAI SDK capabilities (Imagen, Gemini, Veo, Lyria, and Chirp models) to MCP-compatible clients. It is built with the FastMCP framework.

Current version: 3.1.1. Highlights: full GenAI SDK integration (embeddings, speech, video analysis, live generation), WebP/AVIF format support, multi-tier model selection, parallel batch generation, a sequential pipeline engine, comprehensive video tools, Service Account JSON authentication, and Gemini 3.x text/speech/live models.

Features & Tools

Image Tools

| Tool | Description | Backend |
| --- | --- | --- |
| tool_generate_image | Text-to-image generation. Supports aspect ratio, negative prompt, seed, watermark, GCS output, and WebP/AVIF output | Imagen 4 (imagen-4.0-fast-generate-001) |
| tool_edit_image | Mask-based inpaint/outpaint, background swap, product image, and prompt-driven edit. See Edit modes below | Imagen 3 Capability (imagen-3.0-capability-001) |
| tool_transform_image | Free-form image + text → image transformation: style transfer, scene rewriting, multi-reference composition | Gemini multimodal (gemini-2.5-flash-image) |
| tool_analyze_image | Multimodal image understanding and Q&A. Supports thinking_level (MINIMAL/LOW/MEDIUM/HIGH) and media_resolution (LOW/MEDIUM/HIGH/ULTRA_HIGH) | Gemini Vision (gemini-2.5-flash) |
| tool_upscale_image | Upscale low-resolution images | Imagen |
| tool_remove_background | Remove background via EDIT_MODE_BGSWAP | Imagen |
| tool_batch_generate | Parallel batch text-to-image generation (up to 10 prompts, max 4 concurrent). The balanced tier is not supported for batch | Imagen |
| tool_run_pipeline | Sequential multi-step image processing pipeline (generate → edit → transform → …) | Mixed |
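
The batch limits above (up to 10 prompts, at most 4 in flight, no balanced tier) can be pictured with a small asyncio sketch. This is not the server's implementation: `batch_generate` and the `generate_one` callback are hypothetical names, and the real tool may enforce its limits differently.

```python
import asyncio
from typing import Awaitable, Callable, List

MAX_PROMPTS = 10     # documented batch size limit
MAX_CONCURRENT = 4   # documented concurrency limit


async def batch_generate(
    prompts: List[str],
    generate_one: Callable[[str], Awaitable[str]],  # hypothetical per-prompt call
    model_tier: str = "fast",
) -> List[str]:
    """Run generate_one over all prompts with at most MAX_CONCURRENT in flight."""
    if len(prompts) > MAX_PROMPTS:
        raise ValueError(f"at most {MAX_PROMPTS} prompts per batch")
    if model_tier == "balanced":
        raise ValueError("the balanced tier is not supported for batch generation")

    gate = asyncio.Semaphore(MAX_CONCURRENT)

    async def guarded(prompt: str) -> str:
        async with gate:  # blocks while MAX_CONCURRENT calls are already running
            return await generate_one(prompt)

    # gather returns results in input order, whatever the completion order
    return await asyncio.gather(*(guarded(p) for p in prompts))
```

A semaphore keeps the code simple: every prompt gets its own task, and the gate alone decides how many run at once.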

Video Tools

| Tool | Description | Backend |
| --- | --- | --- |
| tool_generate_video | Text-to-video generation. Supports audio_enabled for Veo 3+ | Veo (veo-3.1-fast-generate-001) |
| tool_image_to_video | Animate a still image into video. Supports optional last_frame_path for first+last frame mode | Veo |
| tool_extend_video | Extend an existing video clip by 4, 6, or 8 seconds | Veo |
| tool_video_object_edit | Insert or remove an object in a video via operation (insert/remove) and prompt | Veo |
| tool_analyze_video | Video understanding and Q&A (max 20MB; mp4, mov, avi, webm, mkv) | Gemini GenAI SDK |
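
Before calling tool_analyze_video, a client can pre-check a file against the documented limits. The sketch below is illustrative only: `check_video_input` is a hypothetical helper, and reading "20MB" as 20 MiB is an assumption (the server may count 20,000,000 bytes instead).

```python
from pathlib import Path

ALLOWED_VIDEO_SUFFIXES = {".mp4", ".mov", ".avi", ".webm", ".mkv"}
MAX_VIDEO_BYTES = 20 * 1024 * 1024  # assumption: "20MB" interpreted as 20 MiB


def check_video_input(path: str) -> Path:
    """Return the file as a Path, or raise ValueError if it cannot be analyzed."""
    video = Path(path).expanduser()
    if not video.is_file():
        raise ValueError(f"file not found: {video}")
    if video.suffix.lower() not in ALLOWED_VIDEO_SUFFIXES:
        raise ValueError(f"unsupported container: {video.suffix or '(none)'}")
    size = video.stat().st_size
    if size > MAX_VIDEO_BYTES:
        raise ValueError(f"{size} bytes exceeds the {MAX_VIDEO_BYTES}-byte limit")
    return video
```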

Audio Tools

| Tool | Description | Backend |
| --- | --- | --- |
| tool_generate_speech | Text-to-speech with voice selection (Aoede, Charon, Fenrir, Kore, Puck). Supports model_tier (fast/quality). Outputs WAV | Gemini TTS (gemini-3.1-flash-tts-preview / gemini-2.5-pro-preview-tts) |
| tool_generate_music | Music generation from a text prompt | Lyria 2 / Lyria 3 (GenAI SDK) |
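
For readers curious how raw model audio becomes a WAV file, here is a minimal standard-library sketch. It assumes the TTS backend returns raw 16-bit mono PCM at 24 kHz, which is not stated in this README and should be checked against the model documentation; `write_wav` is a hypothetical helper, not a function of this server.

```python
import wave


def write_wav(
    path: str,
    pcm: bytes,
    sample_rate: int = 24000,  # assumption: 24 kHz output
    channels: int = 1,         # assumption: mono
    sample_width: int = 2,     # bytes per sample (2 = 16-bit)
) -> None:
    """Wrap raw PCM samples in a WAV container."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
```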

GenAI SDK Tools

| Tool | Description | Backend |
| --- | --- | --- |
| tool_embed | Text embeddings as float vectors | Gemini Embedding (text-embedding-004 on Vertex AI, gemini-embedding-2 / gemini-embedding-001 on Gemini API) |
| tool_live_generate | Streaming text generation; the response is accumulated and returned in full | Gemini Live (gemini-3.1-flash-lite / gemini-3.5-flash / gemini-3.1-flash-live-preview) |
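
The float vectors returned by tool_embed are typically compared with cosine similarity. The sketch below shows one way to rank stored vectors against a query vector; the function names are illustrative and the vectors are assumed to be plain Python lists of floats.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / norm


def rank_by_similarity(
    query: Sequence[float], corpus: Dict[str, Sequence[float]]
) -> List[str]:
    """Return corpus keys ordered from most to least similar to the query."""
    return sorted(corpus, key=lambda k: cosine_similarity(query, corpus[k]), reverse=True)
```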

Utility Tools

| Tool | Description |
| --- | --- |
| tool_list_available_models | Live-probes every candidate model in the configured project/location and returns only those that respond (200/400 = reachable, 404 = excluded). Cached for the server process lifetime; pass force_refresh=true to rescan. Also reports available update versions. |
| tool_upload_file | Register a local file for use as a reference image in subsequent tool calls (e.g. tool_transform_image). Returns a file_uri. |
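
The probe-and-cache behaviour described for tool_list_available_models can be sketched as follows. This is a simplified illustration, not the server's code: `probe` is a hypothetical callable that returns an HTTP status for a model name, and treating every status other than 200 or 400 as "excluded" is an assumption (the README only specifies 200, 400, and 404).

```python
from typing import Callable, Iterable, List, Optional

_cache: Optional[List[str]] = None  # lives as long as the process


def is_reachable(status: int) -> bool:
    """200 and 400 both show the model exists; 400 only means the probe payload was rejected."""
    return status in (200, 400)


def list_available(
    candidates: Iterable[str],
    probe: Callable[[str], int],  # hypothetical: model name -> HTTP status
    force_refresh: bool = False,
) -> List[str]:
    """Probe each candidate once and reuse the result until force_refresh is set."""
    global _cache
    if _cache is None or force_refresh:
        _cache = [model for model in candidates if is_reachable(probe(model))]
    return list(_cache)  # copy, so callers cannot mutate the cache
```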


Edit modes (tool_edit_image)

| edit_mode | What it does | Mask required? |
| --- | --- | --- |
| EDIT_MODE_DEFAULT (default) | Prompt-driven full-image edit, no mask | No |
| EDIT_MODE_INPAINT_INSERTION | Add an object into the masked region | Yes |
| EDIT_MODE_INPAINT_REMOVAL | Remove content in the masked region | Yes |
| EDIT_MODE_OUTPAINT | Extend the image beyond its original bounds | Yes |
| EDIT_MODE_BGSWAP | Swap the background | No |
| EDIT_MODE_PRODUCT_IMAGE | Product reference styling | No |

Use imagen-3.0-capability-001 (default) for all of the above. The legacy imagen-3.0-generate-002 only supports EDIT_MODE_DEFAULT and does not accept a mask.
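
The mask rules and the legacy-model restriction can be checked before any request is sent. The sketch below encodes the edit modes listed in this section; `check_edit_request` is a hypothetical helper and does not claim to mirror the server's own validation.

```python
from typing import Optional

# Mask requirement per edit mode, as listed in this section
MASK_REQUIRED = {
    "EDIT_MODE_DEFAULT": False,
    "EDIT_MODE_INPAINT_INSERTION": True,
    "EDIT_MODE_INPAINT_REMOVAL": True,
    "EDIT_MODE_OUTPAINT": True,
    "EDIT_MODE_BGSWAP": False,
    "EDIT_MODE_PRODUCT_IMAGE": False,
}
LEGACY_MODEL = "imagen-3.0-generate-002"


def check_edit_request(
    edit_mode: str = "EDIT_MODE_DEFAULT",
    mask_path: Optional[str] = None,
    model: str = "imagen-3.0-capability-001",
) -> None:
    """Raise ValueError if the combination of mode, mask, and model is invalid."""
    if edit_mode not in MASK_REQUIRED:
        raise ValueError(f"unknown edit_mode: {edit_mode}")
    if model == LEGACY_MODEL:
        if edit_mode != "EDIT_MODE_DEFAULT":
            raise ValueError(f"{LEGACY_MODEL} only supports EDIT_MODE_DEFAULT")
        if mask_path is not None:
            raise ValueError(f"{LEGACY_MODEL} does not accept a mask")
    if MASK_REQUIRED[edit_mode] and mask_path is None:
        raise ValueError(f"{edit_mode} requires a mask")
```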

When to use which "image + text → image" tool

| Need | Use |
| --- | --- |
| Mask-based inpaint/outpaint/BG-swap with pixel precision | tool_edit_image (Imagen Capability) |
| "Make it look like X" / style transfer / scene rewriting / multi-reference compositions | tool_transform_image (Gemini multimodal) |

Model tiers

Most tools accept a model_tier parameter:

| Tier | Description |
| --- | --- |
| fast (default) | Lowest latency, lowest cost |
| balanced | Quality / speed trade-off; routes to Gemini for image generation. Not supported for tool_batch_generate |
| quality | Higher quality, moderate latency |
| ultra | Maximum quality (Imagen 4 Ultra / Veo quality models) |

Model resolution by tier and tool

| Tier | tool_generate_image | tool_transform_image | tool_generate_video |
| --- | --- | --- | --- |
| fast | imagen-4.0-fast-generate-001 | gemini-2.5-flash-image ¹ | veo-3.1-fast-generate-001 |
| balanced | gemini-2.5-flash-image ¹ | gemini-2.5-flash-image ¹ | veo-3.1-fast-generate-001 |
| quality | imagen-4.0-generate-001 | gemini-2.5-pro-image | veo-3.1-generate-001 |
| ultra | imagen-4.0-ultra-generate-001 | gemini-2.5-pro-image | veo-3.1-generate-001 |
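
Tier resolution for these three tools amounts to a table lookup. The sketch below copies the model names from the table above into a dictionary; `resolve_model` is an illustrative name, and the server may resolve tiers through a different mechanism.

```python
# Model names taken from the tier table above
MODELS_BY_TOOL = {
    "tool_generate_image": {
        "fast": "imagen-4.0-fast-generate-001",
        "balanced": "gemini-2.5-flash-image",
        "quality": "imagen-4.0-generate-001",
        "ultra": "imagen-4.0-ultra-generate-001",
    },
    "tool_transform_image": {
        "fast": "gemini-2.5-flash-image",
        "balanced": "gemini-2.5-flash-image",
        "quality": "gemini-2.5-pro-image",
        "ultra": "gemini-2.5-pro-image",
    },
    "tool_generate_video": {
        "fast": "veo-3.1-fast-generate-001",
        "balanced": "veo-3.1-fast-generate-001",
        "quality": "veo-3.1-generate-001",
        "ultra": "veo-3.1-generate-001",
    },
}


def resolve_model(tool: str, model_tier: str = "fast") -> str:
    """Look up the model for a tool and tier; 'fast' is the documented default."""
    try:
        return MODELS_BY_TOOL[tool][model_tier]
    except KeyError:
        raise ValueError(f"no model for tool={tool!r}, model_tier={model_tier!r}") from None
```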

GenAI SDK tools (tool_live_generate, tool_analyze_video, tool_generate_speech, tool_embed):

| Tier | tool_live_generate / tool_analyze_video | tool_generate_speech | tool_embed |
| --- | --- | --- | --- |
| fast | gemini-3.1-flash-lite ¹ | gemini-3.1-flash-tts-preview ¹ | gemini-embedding-2 / text-embedding-004 ² |
| balanced | gemini-3.5-flash ¹ | | |
| quality | gemini-2.5-pro | gemini-2.5-pro-preview-tts | |

¹ Requires GOOGLE_CLOUD_API_KEY (Gemini API endpoint). Verified working as of June 2026 per Gemini API Models docs.

² gemini-embedding-2 / gemini-embedding-001 are used when GOOGLE_CLOUD_API_KEY is set; text-embedding-004 is used with Vertex AI ADC.


Model availability: Vertex AI vs. Gemini API

Google exposes models through two separate endpoints with different model catalogs:

| Endpoint | Auth | Image models available | Text/speech models available |
| --- | --- | --- | --- |
| Vertex AI (aiplatform.googleapis.com) | ADC / Service Account | gemini-2.5-flash-image, gemini-2.5-pro-image | gemini-2.5-pro/flash/flash-lite |
| Gemini API (generativelanguage.googleapis.com) | GOOGLE_CLOUD_API_KEY | gemini-3.1-flash-image, gemini-3-pro-image ³ | gemini-3.5-flash, gemini-3.1-flash-lite, gemini-3.1-pro-preview |

³ Why gemini-3.1-flash-image and gemini-3-pro-image are not currently usable: tool_transform_image and the balanced generate tier call Vertex AI's :generateContent endpoint with an OAuth token. As of June 2026, Google has not yet published these models to the Vertex AI publisher catalog — requesting them returns HTTP 404 regardless of project or region. They are listed in the Gemini API catalog and are reachable only via API key (generativelanguage.googleapis.com). The image transformation backend currently uses Vertex AI, so it cannot use them until Google makes them available there. This server will be updated automatically once they appear.

Output formats

tool_generate_image, tool_edit_image, tool_transform_image, and tool_upscale_image accept a save_format / output_format parameter:

| Format | Notes |
| --- | --- |
| PNG (default) | Lossless |
| JPEG | Smaller files, lossy. compression_quality (0-100, default 85) applies only to JPEG |
| WEBP | Modern lossless/lossy, wide browser support |
| AVIF | Best compression, requires Pillow>=10 |
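
The rule that compression_quality applies only to JPEG can be expressed as a small helper that builds save options. This is a sketch under assumptions: `save_options` is a hypothetical name, and the returned keys (`format`, `quality`) are chosen to match the keyword arguments of Pillow's `Image.save`, which the AVIF note suggests is the underlying library.

```python
SUPPORTED_FORMATS = ("PNG", "JPEG", "WEBP", "AVIF")


def save_options(output_format: str = "PNG", compression_quality: int = 85) -> dict:
    """Build keyword arguments for an image save call.

    compression_quality is validated for every format but only emitted for JPEG.
    """
    fmt = output_format.upper()
    if fmt not in SUPPORTED_FORMATS:
        raise ValueError(f"unsupported format: {output_format}")
    if not 0 <= compression_quality <= 100:
        raise ValueError("compression_quality must be between 0 and 100")
    options = {"format": fmt}
    if fmt == "JPEG":
        options["quality"] = compression_quality
    return options
```

With Pillow installed, the result could be passed along as `image.save(path, **save_options("JPEG", 70))`.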


Error handling

All tools return a uniform error shape:

{
  "success": false,
  "error": {
    "code": 404,
    "model": "gemini-9.9-nonexistent",
    "endpoint": ":generateContent",
    "message": "Publisher Model `...` is not found.",
    "hint": "Model '...' not found in project '...' / location '...'. Try: gemini-2.5-flash-image.",
    "docs_url": "https://docs.cloud.google.com/...",
    "log_path": ".../logs/vertex_ai_mcp.log",
    "duration_s": 0.42
  }
}

| HTTP code | What you'll see in error.hint |
| --- | --- |
| 400 | Vertex's parameter-validation message verbatim |
| 401 | "Run gcloud auth application-default login and retry." |
| 403 | IAM role hint (roles/aiplatform.user) + Vertex AI API enablement check |
| 404 | Live alternatives from the probe cache (tool_list_available_models) |
| 429 | Retry after N (from Retry-After header) + quota-increase pointer |
| 500/502/503/504 | "Safe to retry once" |
| TIMEOUT | After 90s; suggests a -fast- variant |
| VALIDATION | Client-side validation failure (mask missing, file not found, etc.); no HTTP call is made |

Full request/response logs are written to logs/vertex_ai_mcp.log.
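
A client can branch on error.code to decide what to do next. The sketch below maps the codes documented above to a coarse action string. It rests on two assumptions not confirmed by this README: that successful results carry "success": true, and that TIMEOUT and VALIDATION appear as string values of error.code. The action names are purely illustrative.

```python
from typing import Any, Dict

SERVER_ERRORS = {500, 502, 503, 504}


def next_step(result: Dict[str, Any]) -> str:
    """Map a tool result to a coarse follow-up action."""
    if result.get("success", False):
        return "ok"
    code = result.get("error", {}).get("code")
    if code == 429:
        return "retry_after_delay"       # wait for the period given in the hint
    if code in SERVER_ERRORS:
        return "retry_once"
    if code == "TIMEOUT":
        return "retry_with_fast_model"
    if code == 401:
        return "reauthenticate"
    if code == 403:
        return "check_iam_and_api_enablement"
    if code == 404:
        return "pick_another_model"
    if code in (400, "VALIDATION"):
        return "fix_request"
    return "inspect_log"                 # unknown code: fall back to the log file
```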

Resources & Prompts

  • Local Resources (local://outputs/{filename}): Generated and processed media files are exposed as MCP resources for seamless display in MCP clients (Claude Desktop, Cursor, etc.).

  • Pre-built Prompts: Includes specialized prompt templates for character_design, logo_concept, and UI_UX_mockup.
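
To illustrate the local://outputs/{filename} resource scheme, here is a hedged sketch of how such a URI could be mapped back to a file under the output directory. `resolve_output_uri` is a hypothetical helper; the server's actual resource handler is not shown in this README, and the rejection of path separators is a defensive assumption.

```python
from pathlib import Path

URI_PREFIX = "local://outputs/"


def resolve_output_uri(uri: str, output_dir: str = "./outputs") -> Path:
    """Map local://outputs/{filename} to a path inside output_dir."""
    if not uri.startswith(URI_PREFIX):
        raise ValueError(f"not a local output URI: {uri}")
    filename = uri[len(URI_PREFIX):]
    # {filename} is a single path segment: reject empty names, dot segments,
    # and anything containing a separator, so the result stays inside output_dir
    if not filename or filename in (".", "..") or "/" in filename or "\\" in filename:
        raise ValueError(f"invalid filename in URI: {uri}")
    return Path(output_dir) / filename
```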


Prerequisites & Resources

  1. Python 3.9 or newer

  2. Google Cloud Account with an active project

  3. Vertex AI API enabled in your project

  4. Google Cloud CLI (gcloud) installed and configured

For GenAI SDK tools (tool_embed, tool_analyze_video, tool_generate_speech, tool_live_generate, tool_generate_music), you additionally need one of:

  • A Cloud API Key (GOOGLE_CLOUD_API_KEY) — created in GCP Console, uses your existing GCP billing. Enables newer models like gemini-3.5-flash. (Recommended)

  • A Gemini API key (GOOGLE_GENAI_API_KEY) — from AI Studio, separate billing.

  • Vertex AI ADC — set GOOGLE_GENAI_BACKEND=vertexai, uses gcloud auth application-default login.

  • Service Account JSON (GOOGLE_APPLICATION_CREDENTIALS) — see below.

Cloud API Key setup: Run gcloud services enable generativelanguage.googleapis.com apikeys.googleapis.com then gcloud services api-keys create --display-name="GeminiKey". The key string appears in the command output — copy it directly to .env.
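
The four credential options can be summarised as a selection function over environment variables. The precedence shown here is an assumption made for illustration (an explicit GOOGLE_GENAI_BACKEND=vertexai wins, then the Cloud API key, then the AI Studio key, then Vertex AI credentials); the server's real order is not documented in this README, and `pick_genai_backend` is a hypothetical name.

```python
import os
from typing import Mapping, Tuple


def pick_genai_backend(env: Mapping[str, str] = os.environ) -> Tuple[str, str]:
    """Return (backend, credential source) for GenAI SDK tools.

    The precedence is an assumption for illustration, not the server's logic.
    """
    backend = env.get("GOOGLE_GENAI_BACKEND", "").strip().lower()
    if backend != "vertexai":
        if env.get("GOOGLE_CLOUD_API_KEY"):
            return ("gemini_api", "GOOGLE_CLOUD_API_KEY")
        if env.get("GOOGLE_GENAI_API_KEY"):
            return ("gemini_api", "GOOGLE_GENAI_API_KEY")
    if env.get("GOOGLE_APPLICATION_CREDENTIALS"):
        return ("vertexai", "GOOGLE_APPLICATION_CREDENTIALS")
    return ("vertexai", "application-default credentials")
```

Passing a plain dictionary instead of `os.environ` makes it easy to try combinations without touching the real environment.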


Installation & Setup

Option A: Install from PyPI

pip install open-google-image-generator-mcp

Option B: Clone the Repository

git clone https://github.com/miracorhan/OpenGoogleImageGeneratorMCP.git
cd OpenGoogleImageGeneratorMCP
pip install -r requirements.txt

Authentication

Choose one of the following methods:

Option A — User account (interactive login):

gcloud auth application-default login

Opens a browser. Use an account that has access to your Google Cloud project.

Option B — Service Account JSON key:

# Download key: GCP Console → IAM & Admin → Service Accounts → Keys → Add Key → JSON
# Required role: Vertex AI User (roles/aiplatform.user)

Set in .env:

GOOGLE_APPLICATION_CREDENTIALS=C:\path\to\service-account-key.json

Both vertex_ai_tools (Imagen, Veo, Lyria) and genai_tools (embed, speech, video analysis, live) pick up the SA JSON automatically. No gcloud CLI required when using a service account key.

Environment Configuration

Create a .env file in the project root:

# Required
GOOGLE_CLOUD_PROJECT=your-google-cloud-project-id
GOOGLE_CLOUD_LOCATION=us-central1

# Output directory for generated media
DEFAULT_OUTPUT_DIR=./outputs

# --- GenAI SDK (for embed, speech, live, music, video-analysis tools) ---
# Option A: Cloud API Key (recommended — uses GCP billing, enables gemini-3.5-flash)
# Create: gcloud services enable generativelanguage.googleapis.com apikeys.googleapis.com
#         gcloud services api-keys create --display-name="GeminiKey"
GOOGLE_CLOUD_API_KEY=AIza...

# Option B: Gemini API key (AI Studio, separate billing, free tier available)
# GOOGLE_GENAI_BACKEND=gemini_api
# GOOGLE_GENAI_API_KEY=AIza...

# Option C: Vertex AI ADC (no key needed, uses gcloud auth application-default login)
# GOOGLE_GENAI_BACKEND=vertexai

# Option D: Service Account JSON key (no gcloud CLI required)
# Download: GCP Console → IAM & Admin → Service Accounts → Keys → Add Key → JSON
# Required role: Vertex AI User (roles/aiplatform.user)
# GOOGLE_APPLICATION_CREDENTIALS=C:\path\to\service-account-key.json

# --- Advanced Vertex AI Authentication (Optional) ---
# Direct OAuth 2.0 Access Token
# GOOGLE_ACCESS_TOKEN=ya29.a0AfB_by...

# Service Account Impersonation
# IMPERSONATE_SERVICE_ACCOUNT=your-service-account@your-project.iam.gserviceaccount.com

Usage

Running as a Standalone Script

python mcp_server.py

Integrating with MCP Clients

For Claude Desktop (claude_desktop_config.json):

{
  "mcpServers": {
    "OpenGoogleImageGenerator": {
      "command": "python",
      "args": ["/absolute/path/to/OpenGoogleImageGeneratorMCP/mcp_server.py"],
      "env": {
        "GOOGLE_CLOUD_PROJECT": "your-google-cloud-project-id",
        "GOOGLE_CLOUD_LOCATION": "us-central1",
        "GOOGLE_GENAI_API_KEY": "AIza..."
      }
    }
  }
}

Replace /absolute/path/to/OpenGoogleImageGeneratorMCP/mcp_server.py with the actual path on your machine. If you use a virtual environment, set "command" to that environment's Python executable instead of the bare python.
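
To avoid hand-editing paths, the config entry can be generated from inside the environment that will run the server. The sketch below is a convenience under assumptions: `claude_desktop_entry` is a hypothetical helper, and it only fills in the project and location variables shown in the example above (add an API key variable yourself if you use one).

```python
import json
import sys
from pathlib import Path


def claude_desktop_entry(
    server_script: str,
    project: str,
    location: str = "us-central1",
    python_executable: str = sys.executable,  # the interpreter running this script
) -> dict:
    """Build the mcpServers entry with an absolute script path."""
    return {
        "mcpServers": {
            "OpenGoogleImageGenerator": {
                "command": python_executable,
                "args": [str(Path(server_script).resolve())],
                "env": {
                    "GOOGLE_CLOUD_PROJECT": project,
                    "GOOGLE_CLOUD_LOCATION": location,
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(claude_desktop_entry("mcp_server.py", "your-google-cloud-project-id"), indent=2))
```

Run it with the virtual environment activated so that `sys.executable` points at the right interpreter, then paste the printed JSON into claude_desktop_config.json.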

Example prompts

  • "Generate an image of a futuristic city at sunset."

  • "Edit this banner — add a glowing cyan halo around the logo." (tool_edit_image, EDIT_MODE_DEFAULT)

  • "Transform this photo into a hand-drawn pencil sketch." (tool_transform_image)

  • "Remove the background from the image I just generated."

  • "Analyze this image and tell me what objects are present."

  • "Generate 8 product shots in parallel with different backgrounds." (tool_batch_generate)

  • "Run a pipeline: generate → remove background → upscale." (tool_run_pipeline)

  • "Convert this text to speech using the Kore voice." (tool_generate_speech)

  • "Generate a 30-second ambient music track." (tool_generate_music)

  • "Embed this sentence for semantic search." (tool_embed)

  • "Animate this product photo into a 5-second video." (tool_image_to_video)

  • "Generate a video of a sunset with audio." (tool_generate_video, audio_enabled=true)


Troubleshooting

| Symptom | Cause | Fix |
| --- | --- | --- |
| RefreshError / "Reauthentication needed" | ADC token expired | gcloud auth application-default login or switch to SA JSON |
| available: {} from tool_list_available_models | Same ADC expiry, probe skipped | Re-authenticate (above) |
| 403 PERMISSION_DENIED with SA JSON | Service account missing role | Grant roles/aiplatform.user to the service account |
| FileNotFoundError for SA JSON | Wrong path in GOOGLE_APPLICATION_CREDENTIALS | Use an absolute path; forward slashes or escaped backslashes |
| 401 UNAUTHENTICATED with GOOGLE_CLOUD_API_KEY | Key used with Vertex AI endpoint (rejects keys) | Ensure GOOGLE_GENAI_BACKEND is not set to vertexai when using a Cloud API key |
| 403 PERMISSION_DENIED: "Gemini API has not been used" | generativelanguage.googleapis.com not enabled | gcloud services enable generativelanguage.googleapis.com |
| 400 INVALID_ARGUMENT: "API key not valid" even with a brand-new key | Google's key-validation error is identical whether the key is dead or generativelanguage.googleapis.com just isn't enabled on that project; the message doesn't distinguish the two | Confirm with gcloud services list --project=YOUR_PROJECT --filter="config.name:generativelanguage.googleapis.com"; if missing, gcloud services enable generativelanguage.googleapis.com --project=YOUR_PROJECT and allow a few minutes for propagation before retrying |
| 403 PERMISSION_DENIED: "API Keys API disabled" | apikeys.googleapis.com blocked by org policy | gcloud services enable apikeys.googleapis.com |
| tool_generate_speech / tool_embed / tool_live_generate / tool_analyze_video return 404 NOT_FOUND under GOOGLE_GENAI_BACKEND=vertexai | The TTS-preview, live-preview, and embedding-preview models used by these tools are not published to the Vertex AI publisher catalog. Unlike for image models, Vertex AI/ADC is not a usable fallback for them | Use GOOGLE_CLOUD_API_KEY or GOOGLE_GENAI_API_KEY (Gemini API endpoint) for these four tools specifically |
| Worried a GenAI SDK call will charge your Cloud Billing / free-trial credit | Once billing is enabled on a project, the Gemini API free tier is gone entirely for that project: every call becomes billable from the first token, and some promotional/free-trial credits explicitly exclude Gemini API usage regardless | For risk-free testing, create a separate GCP project with no billing account attached at all and generate its AI Studio key there (GOOGLE_GENAI_API_KEY); verify with gcloud billing projects describe YOUR_PROJECT, which should report billingEnabled: false. Keep your billed project for Vertex AI (Imagen/Veo/Lyria) only |
| Edited .env but the exact same auth error keeps happening | config.py loads .env once at process start and genai_tools._genai_client is cached for the process lifetime, so editing the file does not affect an already-running server | Reconnect the MCP server (/mcp → reconnect, or restart the client) after any .env change |
| 404 NOT_FOUND for text-embedding-004 | Vertex AI embedding model not available on Gemini API endpoint | Set GOOGLE_CLOUD_API_KEY (auto-selects gemini-embedding-2) or use ADC |
| 404 NOT_FOUND for gemini-3.1-flash-image or gemini-3-pro-image | These models are Gemini API only and not yet on Vertex AI | Use gemini-2.5-flash-image / gemini-2.5-pro-image for transform, or wait for Vertex AI availability |
| 404 NOT_FOUND for a model | Model not available in your project/region | Run tool_list_available_models and pick from the available list |
| gcloud api-keys command not found | Wrong command prefix | Use gcloud services api-keys list/create/delete |

Security: The Cloud API Key string appears in plaintext in gcloud services api-keys create output. Copy it directly to .env — never paste it into a chat, commit, or log. Revoke a leaked key immediately with gcloud services api-keys delete KEY_ID.

