image_edit

Apply text-driven edits to local images, supporting up to 2K resolution and optional alpha masks for precise local modifications.

Instructions

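Image editing (image-to-image, single input image; supports 1K and true 2K; 4K is disabled, see below).

[WHAT] Takes one local image plus an edit instruction and outputs the edited image.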

[WHEN TO USE]

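  • The user provides one image (a path, or an image that was just generated) and wants to change, replace, add, or remove part of it → use this tool.

  • If the user provides no image and wants to generate one from scratch → use image_generate instead.

  • If the user provides several images and wants a batch edit (the same operation on each) → use image_batch_edit instead.

  • If the user wants to draw a new image using several images as style references → use image_multi_reference (≤1.57MP).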

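[4K DISABLED] The origin consistently takes more than 120s to process 4K with a reference image, which hits the CF proxy read timeout (524). 4K sizes are rejected at the entry point and no request is sent. Use 2K instead ("2048x1152" / "1152x2048" / "2048x2048"), or use the two-step method: first run image_edit at 1K/2K, then call image_generate(size="3840x2160", with a prompt describing the same scene) to reach 4K.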
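The two-step method can be sketched as below. This is a minimal illustration, not the server's code: `edit_then_upscale` is a hypothetical helper, the two tools are passed in as plain callables, and the `prompt` keyword on image_generate is an assumption (the source only shows its `size` argument). Since image_generate works from a text description rather than from the edited file, the caller has to supply a description of the edited scene.

```python
from typing import Callable


def edit_then_upscale(
    image_edit: Callable[..., dict],
    image_generate: Callable[..., dict],
    prompt: str,
    image_path: str,
    scene_description: str,
) -> dict:
    """Hypothetical helper: edit at 2K first, then regenerate the scene at 4K."""
    # Step 1: edit at a size image_edit still accepts (true 2K here).
    draft = image_edit(prompt=prompt, image_path=image_path, size="2048x2048")
    if not draft.get("ok"):
        # Stop early and hand the failed result back to the caller.
        return draft
    # Step 2: generate a 4K image from a description of the same scene.
    # Assumption: image_generate accepts a `prompt` keyword.
    return image_generate(prompt=scene_description, size="3840x2160")
```

Passing the tools in as arguments keeps the sketch independent of how an MCP client actually exposes them.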

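[ROUTING IMPLEMENTATION] (determined by testing; two paths)

  • 1K (edge ≤1536): uses /v1/images/edits multipart and supports an alpha mask. On failure, falls back to /v1/chat/completions stream.

  • 2K (edge 1600–2999): uses /v1/images/generations plus the Micu (米醋) extension field reference_image=data_url. The size really takes effect (measured: a 2048² request yields a true 2048² output). This path does not support masks (the Micu extension field does not accept a mask); if you need a mask, drop to 1K.

  • Automatic pro lock: max edge ≥1600 → gpt-image-2-pro.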
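The routing rules above can be read as a small decision function. The sketch below is an illustration only: `choose_route` and the keys of its return dict are hypothetical, the 3000 cutoff for rejecting 4K is inferred from the "1600–2999" range, and edges between 1537 and 1599 are not covered by the description, so the sketch refuses them instead of guessing.

```python
def choose_route(size: str, has_mask: bool = False) -> dict:
    """Hypothetical sketch of the documented 1K / 2K routing rules."""
    width, height = (int(v) for v in size.lower().split("x"))
    max_edge = max(width, height)

    if max_edge >= 3000:
        # Assumption: anything above the 2K range is treated as disabled 4K.
        raise ValueError(f"size={size} (4K) is disabled in image_edit")
    if max_edge <= 1536:
        return {
            "endpoint": "/v1/images/edits",
            "fallback": "/v1/chat/completions",
            "model": "gpt-image-2",
            "mask_applied": has_mask,
        }
    if max_edge >= 1600:
        return {
            "endpoint": "/v1/images/generations",
            "fallback": None,
            "model": "gpt-image-2-pro",  # automatic pro lock
            "mask_applied": False,  # the mask is ignored on this path
        }
    # 1537..1599 is not described in the source.
    raise ValueError(f"size={size} falls outside the documented ranges")
```

For example, `choose_route("2048x1152", has_mask=True)` selects the generations endpoint with gpt-image-2-pro and reports the mask as not applied.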

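[HOW MASKS WORK] (1K path only)

  • mask_path points to a PNG whose dimensions should match those of image_path.

  • Pixels with alpha=0 (transparent) in the mask = the region to edit.

  • Pixels with alpha=255 (opaque) = the region to keep unchanged.

  • Without a mask, the model decides freely what to change.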
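As an illustration of the alpha convention above, the following sketch writes a mask PNG with a transparent rectangle (the edit region) on an opaque background, using only the standard library. `build_mask_png` and its `edit_box` argument are hypothetical names, and a real workflow would more likely paint the mask in an image editor or with an imaging library.

```python
import struct
import zlib


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Frame one PNG chunk: length, tag, data, CRC over tag + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + tag + data + struct.pack(">I", crc)


def build_mask_png(width: int, height: int, edit_box: tuple) -> bytes:
    """Return RGBA PNG bytes: alpha=0 inside edit_box, alpha=255 elsewhere.

    edit_box is (left, top, right, bottom) in pixels, right/bottom exclusive,
    and is assumed to lie inside the image.
    """
    left, top, right, bottom = edit_box
    keep = b"\x00\x00\x00\xff"  # opaque pixel: keep unchanged
    edit = b"\x00\x00\x00\x00"  # transparent pixel: region to edit
    # Each scanline starts with filter byte 0 (no filtering).
    keep_row = b"\x00" + keep * width
    edit_row = b"\x00" + keep * left + edit * (right - left) + keep * (width - right)
    raw = b"".join(edit_row if top <= y < bottom else keep_row for y in range(height))
    # IHDR: 8-bit depth, color type 6 (RGBA), default compression/filter, no interlace.
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 6, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(raw))
        + _chunk(b"IEND", b"")
    )
```

Writing `build_mask_png(1024, 1024, (256, 256, 768, 768))` to a file gives a mask that, under the convention above, marks only the central square for editing.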

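Args:
  prompt: The edit instruction; the more specific the better. Example: "change the background to deep navy with stars, keep the subject pixel-identical".
  image_path: Absolute or relative path of the input image. PNG, JPG, and WebP are supported.
  mask_path: Optional path to an alpha-mask PNG; transparent areas are the edit region. Only effective on the 1K path; ignored at ≥2K.
  size: Output size. Default "1024x1024".
    "1024x1024" "1280x720" "1024x1536" "1536x1024" "720x1280" ← 1K (downscaled to 1.57MP, mask supported)
    "1920x1080" "1080x1920" ← nominally 2K but ≤2.25MP, so downscaled to 1.57MP
    "2048x2048" "2048x1152" "1152x2048" ← true 2K (≥4MP, strict 1:1, pro selected automatically)
    "3840x2160" / "2160x3840" ← 4K disabled (hits the hard CF 524 limit); rejected on input
  model: "gpt-image-2" (default) / "gpt-image-2-pro" (switched to automatically at ≥2K).
  save_dir: Output directory (must be under the safe root directory). Defaults to ~/Pictures/micu-out or MICU_SAVE_DIR.
  basename: Filename prefix (only [A-Za-z0-9_-.]). Default "edit_".
  api_key: Overrides MICU_API_KEY; base_url is locked from the environment at startup and is not accepted at runtime.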
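The megapixel figures quoted for the size options are plain width × height arithmetic, which the short sketch below reproduces. `megapixels` is a hypothetical helper for illustration, not part of the tool.

```python
def megapixels(size: str) -> float:
    """Convert a "WIDTHxHEIGHT" string to megapixels (1 MP = 1,000,000 pixels)."""
    width, height = (int(v) for v in size.lower().split("x"))
    return width * height / 1_000_000


for size in ("1024x1024", "1024x1536", "1920x1080", "2048x1152", "2048x2048", "3840x2160"):
    print(f"{size:>10}  {megapixels(size):.2f} MP")
```

The 1.57MP figure corresponds to 1024x1536 (1,572,864 pixels), 1920x1080 comes out at about 2.07MP, which is under the 2.25MP mark, and of the listed 2K sizes only 2048x2048 (about 4.19MP) is above 4MP.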

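Returns: a dict containing:
  ok (bool): Whether the call succeeded.
  model (str): The model actually used.
  size (str): The requested size.
  used_fallback (bool): True means the primary endpoint failed and the call switched to chat/completions (can only occur at 1K).
  saved (dict): { path, size_bytes, actual_size, actual_megapixels }.
  notes (list[str]): Decisions and hints.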
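A caller might consume the returned dict along the following lines. This is a sketch under assumptions: `summarize_result` is a hypothetical helper, the field names come from the list above, and the shape of a failed result (beyond ok being false) is not documented, so the sketch only reads `notes` in that case.

```python
def summarize_result(result: dict) -> str:
    """Hypothetical helper: turn an image_edit result into a one-line summary."""
    if not result.get("ok"):
        # Assumption: notes may still carry hints when the call fails.
        notes = "; ".join(result.get("notes", []))
        return f"edit failed: {notes or 'no notes returned'}"
    saved = result["saved"]
    line = (
        f"{saved['path']} ({saved['actual_size']}, "
        f"{saved['actual_megapixels']} MP, model={result['model']})"
    )
    if result.get("used_fallback"):
        # The primary endpoint failed and chat/completions was used instead.
        line += " [fallback: chat/completions]"
    return line
```

Comparing `actual_size` with the requested `size` is a quick way to notice when a nominally 2K request was downscaled.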

Examples:

# 1K background replacement
image_edit(prompt="replace background with a sunset beach", image_path="/p/portrait.jpg")

# 1K local edit (mask takes effect)
image_edit(prompt="change hair color to silver", image_path="/p/x.png", mask_path="/p/x_mask.png")

# True 2K upgrade (no mask)
image_edit(prompt="enhance to cinematic detail, preserve composition", image_path="/p/draft.png", size="2048x2048")

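Common errors:
  "image_path 不存在" (image_path does not exist) → Check the path; an absolute path is recommended.
  "size=3840x2160 (4K) 在 image_edit 已禁用" (4K is disabled in image_edit) → 4K image_edit hits the hard CF 524 limit; use 2K or the two-step method.
  "HTTP 524" → A single 2K image normally takes ~50s; hitting this means the origin was unusually busy at that moment. If the automatic retry also fails, try again later.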

Input Schema

Name        Required  Description  Default
prompt      Yes
image_path  Yes
mask_path   No
size        No                     1024x1024
model       No
save_dir    No
basename    No
api_key     No

Output Schema


No arguments

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description fully compensates. It discloses routing behavior (1K vs 2K paths), mask mechanics, 4K disablement, fallback logic, auto pro model selection, and common errors. This is highly transparent beyond the schema.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well structured, with clearly labeled sections (WHAT, WHEN TO USE, 4K DISABLED, etc.) and bullet points. The key sections are front-loaded for quick scanning. Minor redundancy (e.g., mask details repeated in the MASK section and the parameter docs) prevents a perfect 5.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 params, routing, mask, size constraints, fallback), the description is remarkably complete. It includes examples, common errors, and return value details. The presence of an output schema further reduces the burden, but the description still adds value.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 0% description coverage, so the description must carry the full burden. It provides extensive details for all 8 parameters: prompt (specificity advice), image_path (supported formats), mask_path (1K path only, ignored at 2K), size (valid values, default, 4K disabled), model (automatic pro selection), save_dir/basename (safety constraints), api_key (overrides MICU_API_KEY, while base_url stays locked at startup). This compensates completely.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('edit') and clearly states it operates on a single image with a modification prompt. It distinguishes from sibling tools by explicitly listing when to use image_generate, image_batch_edit, and image_multi_reference instead.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit 'when to use' and 'when not to use' guidance, including alternatives such as image_generate for generating from scratch and image_batch_edit for batch edits. It also details specific conditions (e.g., 4K disabled, mask only on the 1K path).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Subaru486desuwa/micu-image-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.