
image_generate

Generate images from text descriptions. Supports custom sizes, multiple outputs, and automatic resolution inference.

Instructions

Text-to-image generation, using the Micu (米醋) proxy and the gpt-image-2 model family.

[WHAT] Renders a text prompt into one or N images and saves them to local disk.

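[WHEN TO USE]

  • The user asks to "draw / generate / create an image" and provides no reference image → use this tool.

  • The user provides one reference image and wants to "modify / edit / replace part of it" → use image_edit instead.

  • The user provides several reference images and wants "a new image in their style" → use image_multi_reference.

  • If you are unsure which size to pick, call server_info() first and check recommended_sizes.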
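The routing rule above can be sketched as a small function. This is only an illustration: `choose_tool` and its parameter are hypothetical names, and the sketch reduces the decision to the number of reference images, whereas the description also takes the user's intent (edit versus restyle) into account.

```python
def choose_tool(reference_image_count: int) -> str:
    """Pick a tool name from the number of reference images (hypothetical helper)."""
    if reference_image_count < 0:
        raise ValueError("reference_image_count must not be negative")
    if reference_image_count == 0:
        return "image_generate"      # text only: generate from the prompt
    if reference_image_count == 1:
        return "image_edit"          # one reference: modify or replace part of it
    return "image_multi_reference"   # several references: new image in their style
```

For example, `choose_tool(0)` returns `"image_generate"`, which is the case this tool covers.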

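[SIZE SELECTION TIPS]

  • Default None: the MCP server infers the size from keywords in the prompt (4K/UHD → 3840x2160; 1080p/2K → 2048x1152; square/logo/avatar → 1024x1024; portrait/9:16 → 1024x1536; landscape/16:9 → 1536x1024; and so on). If nothing can be inferred, it falls back to 1024x1024.

  • Strongly recommended: if you (the LLM) have already read a definite size preference from the user's message, pass size explicitly; this is more accurate than keyword inference.

  • The user mentions "HD / 4K / poster / wallpaper" → "3840x2160" (landscape) or "2160x3840" (portrait); pro is selected automatically.

  • The user mentions "FullHD / 1080p / landscape video cover" → "2048x1152" (landscape) or "1152x2048" (portrait), which clears the 2.25MP threshold.

  • pro and non-pro cost the same, so if you want true resolution, raise size directly; a size at or below 2.25MP, such as 1920×1080, is compressed to ~1.57MP.

  • W and H must both be multiples of 8 (a constraint measured on Micu; OpenAI officially requires 16, and Micu is more lenient).

  • Every request at or below 2.25MP is compressed by the proxy to ~1.57MP; true resolution requires ≥4MP (that is, 2048² or larger).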
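The size rules above can be sketched in a few lines of Python. This is a minimal illustration, not the server's actual implementation: the names `infer_size`, `parse_size` and `megapixels`, the matching order, and the English keywords are assumptions; the size values, the original Chinese keywords, and the multiple-of-8 rule come from the description.

```python
import re

# Keyword -> size pairs from the description. The matching order and the
# English spellings are assumptions; the Chinese keywords are the originals.
KEYWORD_SIZES = [
    (("4k", "uhd"), "3840x2160"),
    (("1080p", "2k"), "2048x1152"),
    (("square", "正方形", "logo", "avatar", "头像"), "1024x1024"),
    (("portrait", "竖屏", "9:16"), "1024x1536"),
    (("landscape", "横屏", "16:9"), "1536x1024"),
]
FALLBACK_SIZE = "1024x1024"


def infer_size(prompt: str) -> str:
    """Return the size of the first keyword group found in the prompt."""
    text = prompt.lower()
    for keywords, size in KEYWORD_SIZES:
        if any(keyword in text for keyword in keywords):
            return size
    return FALLBACK_SIZE


def parse_size(size: str) -> tuple[int, int]:
    """Parse "WxH" and enforce the multiple-of-8 rule."""
    match = re.fullmatch(r"(\d+)x(\d+)", size)
    if not match:
        raise ValueError(f"size must look like 'WxH', got {size!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width <= 0 or height <= 0 or width % 8 or height % 8:
        raise ValueError("W and H must be positive multiples of 8")
    return width, height


def megapixels(size: str) -> float:
    """Pixel count in millions, for comparing against the 2.25MP and 4MP marks."""
    width, height = parse_size(size)
    return width * height / 1_000_000
```

With these helpers, `megapixels("1920x1080")` is about 2.07, which is below 2.25MP, while `megapixels("2048x2048")` is about 4.19, which is above 4MP.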

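[PROMPT WRITING TIPS]

  • Mixed Chinese and English is fine. gpt-image-2 renders text almost perfectly, so long passages of embedded text work (both Chinese and English punctuation).

  • The more specific the better: style / viewpoint / lighting / subject / level of detail.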

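Args:
  prompt: Image description, 1-2000 characters. Example: "A minimalist sushi mascot logo, soft pastel palette".
  size: A "WxH" string or None. Leave it as None to let the MCP server infer it from the prompt (a fallback for weaker LLMs); a strong LLM that already knows the preference should pass it explicitly for better accuracy. W and H must both be multiples of 8 (Micu constraint). Common values:
    "1024x1024" "1280x720" "1024x1536" "1536x1024" "720x1280" ← 1K tier (compressed to 1.57MP)
    "1920x1080" "1080x1920" ← nominally 2K but ≤2.25MP, compressed to 1.57MP
    "2048x2048" "2048x1152" "1152x2048" ← true 2K tier (pro only, strictly 1:1 at ≥4MP)
    "3840x2160" "2160x3840" ← 4K tier (pro only, strictly 1:1)
    Default None (falls back to 1024x1024 if inference finds nothing).
  n: Number of images, 1-10. At 1K, N>1 runs automatically with a concurrency of 5; at ≥2K, N=1 is enforced (proxy rate limiting). Default 1.
  model: Explicit model choice. When left empty, the model is selected from size (max edge ≥1600 uses pro, otherwise non-pro). Allowed values: "gpt-image-2" (fast, cheap) / "gpt-image-2-pro" (high detail, required for ≥2K).
  save_dir: Output directory. It must be under the safe root directory MICU_SAVE_DIR_ROOT (default ~/Pictures/micu-out); a path outside the root is rejected. Leave empty to use the default.
  basename: File name prefix (without extension); only [A-Za-z0-9_-.] is allowed. A value containing /, .. or path components is rejected. Default "gen_".
  api_key: Overrides the MICU_API_KEY environment variable. Usually left empty.
  Note: base_url is locked to the environment at startup and is not accepted as a tool parameter at run time (this prevents the key from leaking to an attacker-controlled host).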
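The model auto-selection and the path restrictions described in Args could look roughly like the following. This is a sketch under stated assumptions, not the server's code: `pick_model`, `is_safe_basename` and `is_under_root` are hypothetical helpers, and only the 1600-pixel edge rule, the model ids, and the allowed character set come from the description.

```python
import re
from pathlib import Path
from typing import Optional

PRO_EDGE_THRESHOLD = 1600                     # from the description: max edge >= 1600 uses pro
BASENAME_RE = re.compile(r"[A-Za-z0-9_.-]+")  # allowed characters per the description


def pick_model(size: str, model: Optional[str] = None) -> str:
    """Honor an explicit model; otherwise choose by the longest edge."""
    if model:
        return model
    width, height = (int(part) for part in size.split("x"))
    if max(width, height) >= PRO_EDGE_THRESHOLD:
        return "gpt-image-2-pro"
    return "gpt-image-2"


def is_safe_basename(basename: str) -> bool:
    """Reject path separators, '..' and characters outside the allowed set."""
    return bool(BASENAME_RE.fullmatch(basename)) and ".." not in basename


def is_under_root(save_dir: str, root: str) -> bool:
    """True if save_dir resolves to root itself or to a directory below it."""
    resolved = Path(save_dir).expanduser().resolve()
    root_path = Path(root).expanduser().resolve()
    return resolved == root_path or root_path in resolved.parents
```

Resolving both paths before comparing them means a value such as `<root>/../elsewhere` is treated as outside the root, which matches the stated intent of rejecting paths outside MICU_SAVE_DIR_ROOT.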

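Returns: a dict with the following fields:
  ok (bool): True only if at least one image succeeded.
  model (str): The model id actually used.
  size (str): The requested size.
  requested_n (int): The number of images actually generated.
  saved (list[dict]): One entry per successful image, each with path (absolute path) / size_bytes / actual_size (the real pixel dimensions read from the PNG header) / actual_megapixels.
  errors (list[str]): Error descriptions for failed requests.
  notes (list[str]): Notes on routing, automatic decisions, and deviations in the measured size.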
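A caller might turn the returned dict into a short report as sketched below. The field names come from the Returns description; `summarize_result` is a hypothetical helper, and the example values (including the file name and the string form of actual_size) are made up for illustration.

```python
def summarize_result(result: dict) -> str:
    """Build a short human-readable summary from an image_generate result."""
    status = "ok" if result.get("ok") else "failed"
    lines = [f"{status}: model={result.get('model')} size={result.get('size')}"]
    for item in result.get("saved", []):
        lines.append(
            f"saved {item['path']} ({item['actual_size']}, {item['actual_megapixels']} MP)"
        )
    for error in result.get("errors", []):
        lines.append(f"error: {error}")
    for note in result.get("notes", []):
        lines.append(f"note: {note}")
    return "\n".join(lines)


# Hypothetical result; every value here is invented for the example.
example = {
    "ok": True,
    "model": "gpt-image-2",
    "size": "1024x1024",
    "requested_n": 1,
    "saved": [
        {
            "path": "/home/user/Pictures/micu-out/gen_0.png",
            "size_bytes": 123456,
            "actual_size": "1024x1024",
            "actual_megapixels": 1.05,
        }
    ],
    "errors": [],
    "notes": [],
}
print(summarize_result(example))
```

Comparing actual_size with the requested size is the simplest way to notice the compression described in the size tips.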

Examples:
# Simplest case: a single image at the default 1024x1024
image_generate(prompt="a red apple on white")

# 4K wallpaper
image_generate(prompt="cyberpunk Tokyo at night", size="3840x2160")

# Four candidates in one call (1K runs concurrently by default)
image_generate(prompt="cute sticker of a cat", size="1024x1024", n=4)

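Common errors and what to do:
  "size W/H 必须是 8 的倍数" (size W/H must be a multiple of 8) → rejected at the client entry point; just change size (the OpenAI side sometimes returns a "divisible by 16" hint, but on Micu a multiple of 8 already passes).
  "HTTP 524: timeout" → still failing after 3 automatic retries; try a smaller size or retry later.
  "未配置 API key" (API key not configured) → set the MICU_API_KEY environment variable or pass the api_key parameter.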
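A client could map these error strings to a next step along the following lines. This is a sketch under assumptions: the action labels and both helper names are hypothetical, and matching on substrings of the quoted messages is a guess at how the strings appear in errors.

```python
# Substrings of the error messages quoted above, mapped to a suggested next step.
ERROR_ACTIONS = [
    ("必须是 8 的倍数", "fix_size"),
    ("HTTP 524", "reduce_size_or_retry_later"),
    ("未配置 API key", "set_api_key"),
]


def suggest_action(error: str) -> str:
    """Return a hypothetical action label for a known error, else a generic one."""
    for needle, action in ERROR_ACTIONS:
        if needle in error:
            return action
    return "report_to_user"


def round_to_multiple_of_8(value: int) -> int:
    """Round to the nearest multiple of 8 (ties round up), never below 8."""
    return max(8, (value + 4) // 8 * 8)
```

For the size error, rounding each dimension with `round_to_multiple_of_8` before retrying is one way to "just change size".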

Input Schema

Name | Required | Description | Default
prompt | Yes | |
size | No | |
n | No | |
model | No | |
save_dir | No | |
basename | No | |
api_key | No | |

Output Schema


No arguments

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses many behavioral details: size compression logic (≤2.25MP compressed to ~1.57MP), pro/non-pro model selection based on size, concurrency behavior for n>1, save_dir security restrictions, and common errors with solutions. This fully informs the agent of the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (WHAT, WHEN TO USE, SIZE, PROMPT, Args, Returns, Examples, Common errors). It is front-loaded with the most critical information. Despite its length, every section provides value and the format is easy to scan.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers all 7 parameters in detail and provides the output schema explicitly. It includes examples and common errors, making it highly complete for a complex tool with multiple parameters and behaviors.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, but the description adds extensive meaning to each parameter. It explains size constraints (multiples of 8, resolution tiers, compression), prompt length and style advice, n concurrency, model auto-selection, save_dir restrictions, basename constraints, api_key usage, and fallback behaviors. This fully compensates for the lack of schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: text-to-image generation, rendering a prompt into images and saving locally. It explicitly distinguishes from siblings by providing when-to-use conditions for image_edit and image_multi_reference, making differentiation clear.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes an explicit 'WHEN TO USE' section that defines conditions for using this tool versus alternatives. It also offers guidance on selecting size, including recommending to call server_info() for recommended sizes, and advises on when to specify size explicitly vs letting MCP infer.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Subaru486desuwa/micu-image-mcp'
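The same request can be prepared from Python with the standard library. This is a minimal sketch: `build_request` and `fetch_server_info` are hypothetical names, and the assumption that the endpoint returns a JSON object is not confirmed by this page.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/Subaru486desuwa/micu-image-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Prepare the same GET request as the curl command."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is a JSON object."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone does not.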

If you have feedback or need assistance with the MCP directory API, please join our Discord server.