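Gemini Image Generator MCP Server

Generate high-quality images from text prompts using Google's Gemini model through the MCP protocol.

Overview

This MCP server allows any AI assistant to generate images using Google's Gemini AI model. The server handles prompt engineering, text-to-image conversion, filename generation, and local image storage, so you can create and manage AI-generated images through any MCP client.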

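Features

- Text-to-image generation using Gemini 2.0 Flash
- Image-to-image transformation based on text prompts
- Support for both file-based and base64-encoded images
- Automatic intelligent filename generation based on prompts
- Automatic translation of non-English prompts
- Local image storage with a configurable output path
- Strict exclusion of text from generated images
- High-resolution image output
- Direct access to both image data and file path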
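The feature list describes filename generation only as automatic and prompt-based; how the server actually does it is not shown here. As a rough local illustration of the idea, the sketch below derives a filesystem-safe name from a prompt. The function name `filename_from_prompt`, the word limit, and the `.png` extension are assumptions made for this example, not part of the server.

```python
import re
import unicodedata


def filename_from_prompt(prompt: str, max_words: int = 6, ext: str = ".png") -> str:
    """Build a filesystem-safe filename from the first few words of a prompt.

    Hypothetical helper for illustration; the server's own method may differ.
    """
    # Reduce accented characters to their closest ASCII form and drop the rest.
    ascii_text = (
        unicodedata.normalize("NFKD", prompt).encode("ascii", "ignore").decode("ascii")
    )
    # Keep only runs of lowercase letters and digits.
    words = re.findall(r"[a-z0-9]+", ascii_text.lower())
    # Fall back to a generic stem if nothing usable remains.
    stem = "_".join(words[:max_words]) or "image"
    return stem + ext
```

A prompt with no ASCII letters or digits falls back to `image.png`, so a real implementation would likely add a timestamp or counter to avoid collisions.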

Available MCP Tools

The server provides the following MCP tools for AI assistants:

1. `generate_image_from_text`

Creates a new image from a text prompt description.

generate_image_from_text(prompt: str) -> Tuple[bytes, str]

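Parameters:

- `prompt`: Text description of the image you want to generate

Returns:

A tuple containing:
- Raw image data (bytes)
- Path to the saved image file (str)

This dual return format lets AI assistants either work with the image data directly or reference the saved file path.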
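To make the dual return format concrete, here is a minimal sketch of client-side code consuming such a `(bytes, str)` tuple. It checks the leading "magic" bytes of the image data as a sanity check and reports the saved path. The helper names `sniff_image_format` and `describe_result` are hypothetical, and the sketch assumes nothing about which image format the server actually emits.

```python
from typing import Tuple

# Leading "magic" bytes for a few common image formats.
_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_format(data: bytes) -> str:
    """Guess the image format from its first bytes, or return 'unknown'."""
    for signature, name in _SIGNATURES.items():
        if data.startswith(signature):
            return name
    return "unknown"


def describe_result(result: Tuple[bytes, str]) -> str:
    """Summarize a (raw image bytes, saved file path) tuple."""
    image_bytes, saved_path = result
    fmt = sniff_image_format(image_bytes)
    return f"{len(image_bytes)} bytes of {fmt} data, saved at {saved_path}"
```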

Examples:

- "Generate an image of a sunset over mountains"
- "Create a photorealistic flying pig in a sci-fi city"

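Example Output

This image was generated using the prompt:

"Hi, can you create a 3d rendered image of a pig with wings and a top hat flying over a happy futuristic scifi city with lots of greenery?"

[Image: flying pig over a sci-fi city]

A 3D-rendered pig with wings and a top hat flying over a futuristic sci-fi city filled with greenery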

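Known Issues

When using this MCP server with the Claude Desktop host:

- Performance issues: `transform_image_from_encoded` may take significantly longer to process than the other methods, because transferring large base64-encoded image data through the MCP protocol adds overhead.
- Path resolution problems: Claude Desktop may not correctly interpret the returned file paths, which makes the generated images hard to access.

For the best experience, consider using an alternative MCP client or the `transform_image_from_file` method when possible.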
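Part of the overhead mentioned above comes from base64 itself: every 3 raw bytes become 4 encoded characters, so the payload grows by roughly a third before any protocol framing is added. The sketch below only illustrates that size arithmetic; `base64_size` is a hypothetical helper, and the 3 MB buffer is a stand-in for a real image, not a measurement of this server.

```python
import base64


def base64_size(raw_size: int) -> int:
    """Length in characters of the padded base64 encoding of raw_size bytes."""
    return 4 * ((raw_size + 2) // 3)


raw = bytes(3_000_000)  # zero-filled stand-in for a ~3 MB image
encoded = base64.b64encode(raw)

# The formula matches what the standard library actually produces.
assert len(encoded) == base64_size(len(raw))
print(f"raw: {len(raw)} bytes, encoded: {len(encoded)} characters")
```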

2. `transform_image_from_encoded`

Transforms an existing image based on a text prompt, using base64-encoded image data.

transform_image_from_encoded(encoded_image: str, prompt: str) -> Tuple[bytes, str]

Parameters:

- `encoded_image`: Base64-encoded image data with a format header (must be in the format "data:image/[format];base64,[data]")
- `prompt`: Text description of how to transform the image
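As an illustration of the `encoded_image` format, the sketch below builds such a string from a local file and splits it back apart. It assumes the header follows the standard data URI form `data:image/<format>;base64,<data>`; the helper names `encode_image_file` and `decode_image_uri` are hypothetical and not part of the server.

```python
import base64
import mimetypes
from pathlib import Path


def encode_image_file(path: str) -> str:
    """Return the file's contents as a data URI with an image media type."""
    # Guess the media type from the file extension (e.g. .png -> image/png).
    mime, _ = mimetypes.guess_type(path)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"not a recognized image type: {path}")
    payload = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:{mime};base64,{payload}"


def decode_image_uri(uri: str) -> tuple[str, bytes]:
    """Split a data URI into (format, raw bytes)."""
    header, sep, payload = uri.partition(";base64,")
    if not sep or not header.startswith("data:image/"):
        raise ValueError("expected data:image/<format>;base64,<data>")
    return header[len("data:image/"):], base64.b64decode(payload)
```

The string returned by `encode_image_file` is what would be passed as `encoded_image`, alongside the `prompt` argument.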

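Returns:

A tuple containing:
- Raw transformed image data (bytes)
- Path to the saved transformed image file (str)

Examples:

- "Add snow to this landscape"
- "Change the background to a beach"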

3. `transform_image_from_file`

Transforms an existing image file based on a text prompt.

transform_image_from_file(image_file_path: str, prompt: str) -> Tuple[bytes, str]

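Parameters:

- `image_file_path`: Path to the image file to be transformed
- `prompt`: Text description of how to transform the image

Returns:

A tuple containing:
- Raw transformed image data (bytes)
- Path to the saved transformed image file (str)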
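Since `image_file_path` is an ordinary filesystem path, a caller may want to normalize and check it before handing it to the tool. The following is a minimal sketch of such a pre-check; `resolve_image_path` and the `ALLOWED_SUFFIXES` set are assumptions for illustration, and the formats the server actually accepts are not stated here.

```python
from pathlib import Path

# Assumed set of extensions for this example; not taken from the server.
ALLOWED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp"}


def resolve_image_path(image_file_path: str) -> str:
    """Expand '~', make the path absolute, and verify it points to an image file."""
    path = Path(image_file_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"no such image file: {path}")
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"unsupported image type: {path.suffix}")
    return str(path)
```

Passing an absolute path avoids ambiguity about which working directory the server process resolves relative paths against.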

Examples:

- "Add a llama next to the person in this image"
- "Make this daytime scene look like nighttime"

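Example Transformation

Using the flying pig image created above, we applied a transformation with the following prompt:

"Add a cute baby whale flying alongside the pig"

Before: [Image: flying pig over a sci-fi city]

After: [Image: flying pig with a baby whale]

The original flying pig image with a cute baby whale added, flying alongside it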

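Setup

Prerequisites

- Python 3.11+
- Google AI API key (Gemini)
- MCP host application (Claude Desktop App, Cursor, or another MCP-compatible client)

Getting a Gemini API Key

1. Visit the Google AI Studio API keys page
2. Sign in with your Google account
3. Click "Create API Key"
4. Copy the new API key for use in your configuration

Note: The API key includes a certain quota of free usage per month. You can check your usage in Google AI Studio.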

Installation

Clone the repository:

git clone https://github.com/your-username/gemini-image-generator.git
cd gemini-image-generator

Create a virtual environment and install dependencies:

# Using regular venv
python -m venv .venv
source .venv/bin/activate
pip install -e .

# Or using uv
uv venv
source .venv/bin/activate
uv pip install -e .

Copy the example environment file and add your API key:

cp .env.example .env

Edit the .env file to include your Google Gemini API key and preferred output path:

GEMINI_API_KEY="your-gemini-api-key-here"
OUTPUT_IMAGE_PATH="/path/to/save/images"
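The two variables above are all the configuration this section describes. As a hedged sketch of how a Python process might read and validate them, the snippet below assumes the values are already present in the process environment (for example, loaded from `.env` by the server or exported in the shell). The function name `load_settings` is hypothetical, and creating the output directory on startup is a design choice of this example, not documented server behavior.

```python
import os
from pathlib import Path
from typing import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> tuple[str, Path]:
    """Read GEMINI_API_KEY and OUTPUT_IMAGE_PATH, failing early if either is missing."""
    api_key = env.get("GEMINI_API_KEY", "").strip()
    # Treat the placeholder from the example file as "not configured".
    if not api_key or api_key == "your-gemini-api-key-here":
        raise RuntimeError("GEMINI_API_KEY is not set")

    raw_path = env.get("OUTPUT_IMAGE_PATH", "").strip()
    if not raw_path:
        raise RuntimeError("OUTPUT_IMAGE_PATH is not set")

    # Assumption of this sketch: create the output directory if it does not exist.
    output_dir = Path(raw_path).expanduser()
    output_dir.mkdir(parents=True, exist_ok=True)
    return api_key, output_dir
```

Failing early with a clear message makes a missing key easier to diagnose than an authentication error from the API later on.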

Configure Claude Desktop

Add the following to your claude_desktop_config.json:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
    "mcpServers": {
        "gemini-image-generator": {
            "command": "uv",
            "args": [
                "--directory",
                "/absolute/path/to/gemini-image-generator",
                "run",
                "server.py"
            ],
            "env": {
                "GEMINI_API_KEY": "GEMINI_API_KEY",
                "OUTPUT_IMAGE_PATH": "OUTPUT_IMAGE_PATH"
            }
        }
    }
}
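If claude_desktop_config.json already lists other servers, the new entry has to be merged into the existing `mcpServers` object rather than replacing the file. The sketch below shows that merge on an in-memory dictionary. The helper `add_server_entry` and the `other-server` entry are hypothetical, and the sketch assumes the `env` values in the JSON above are placeholders to be replaced with your real key and output path.

```python
import json


def add_server_entry(config: dict, project_dir: str, api_key: str, output_path: str) -> dict:
    """Add or overwrite the gemini-image-generator entry, keeping other servers."""
    servers = config.setdefault("mcpServers", {})
    servers["gemini-image-generator"] = {
        "command": "uv",
        "args": ["--directory", project_dir, "run", "server.py"],
        "env": {"GEMINI_API_KEY": api_key, "OUTPUT_IMAGE_PATH": output_path},
    }
    return config


# Hypothetical existing configuration with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["index.js"]}}}
updated = add_server_entry(
    existing,
    "/absolute/path/to/gemini-image-generator",
    "your-gemini-api-key-here",
    "/path/to/save/images",
)
print(json.dumps(updated, indent=4))
```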

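Usage

Once installed and configured, you can ask Claude to generate or transform images with prompts like these:

Generating New Images

- "Generate an image of a sunset over mountains"
- "Create an illustration of a futuristic cityscape"
- "Make a picture of a cat wearing sunglasses"

Transforming Existing Images

- "Transform this image by adding snow to the scene"
- "Edit this photo to make it look like it was taken at night"
- "Add a dragon flying in the background of this picture"

The generated or transformed images are saved to your configured output path and displayed in Claude. Because the tools return the image data as well as the file path, AI assistants can also work with the image data directly without accessing the saved files.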

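Testing

You can test the application by running the FastMCP development server:

fastmcp dev server.py

This command starts a local development server and makes the MCP Inspector available at http://localhost:5173/. The MCP Inspector provides a web interface where you can test the image generation tools directly, without Claude or another MCP client. You can enter text prompts, execute the tools, and see the results immediately, which is helpful for development and debugging.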

License

MIT License
