mcp_gemini_edit_image

Edit and update images using text prompts and Base64-encoded image data with Gemini AI. Supports multi-turn edits, enabling contextual adjustments. Include "image update" or "image edit" in prompts to trigger generation.

Instructions

Edits an existing image using a Gemini model. Requires a text prompt and Base64-encoded image data. Limitations: 1) audio/video input is not supported; 2) image generation is not always triggered (the request must explicitly say "image update", "image edit", or similar); 3) the model may occasionally stop mid-generation. When generating text inside an image, it works best to generate the text first and then request an image containing that text. Multi-turn image editing (chat) is supported, so the image can be modified while preserving context.
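The text-first strategy described above can be sketched as follows. This is an illustrative sketch only: the slogan string stands in for the output of a prior text-generation turn, and the prompt wording is an assumption based on the tool's guidance to phrase requests explicitly.

```typescript
// Sketch of the "generate text first, then request the image" strategy.
// `slogan` represents the result of an earlier text-generation turn (illustrative value).
const slogan = "Fresh Coffee, Every Morning";

// Then request an image edit that embeds that exact text, using the explicit
// "Image edit" phrasing the tool requires to reliably trigger generation.
const prompt = `Image edit: add the text "${slogan}" as a banner at the top of the image.`;
```

In a multi-turn session, follow-up prompts (e.g., "Image edit: make the banner text larger") can then refine the result while the chat context preserves the earlier edits.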

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| fileName | No | Name of the saved image file (without extension) | |
| imageData | Yes | Base64-encoded image data; can be obtained from a file via `fs.readFileSync(imagePath).toString("base64")` | |
| imageMimeType | No | Image MIME type (e.g., image/png, image/jpeg) | image/png |
| model | No | Gemini model ID to use (e.g., gemini-2.0-flash-exp-image-generation) | gemini-2.0-flash-exp-image-generation |
| prompt | Yes | Text prompt for the image edit; include explicit phrasing such as "image update" or "image edit" | |
| saveDir | No | Directory where the edited image is saved | ./temp |
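A minimal Node sketch of assembling the tool's arguments, assuming the schema above. The helper name `buildEditImageArgs` and the MIME-type inference are illustrative, not part of the tool itself; only `imageData` and `prompt` are required.

```typescript
import * as fs from "fs";

// Hypothetical helper: build the arguments object for mcp_gemini_edit_image.
function buildEditImageArgs(imagePath: string, prompt: string) {
  // Read the source image and Base64-encode it, as the schema suggests.
  const imageData = fs.readFileSync(imagePath).toString("base64");

  // Infer the MIME type from the extension (default per the schema is image/png).
  const ext = imagePath.split(".").pop()?.toLowerCase();
  const imageMimeType =
    ext === "jpg" || ext === "jpeg" ? "image/jpeg" : "image/png";

  return { imageData, imageMimeType, prompt, saveDir: "./temp" };
}
```

The returned object would then be passed as the tool-call arguments by whatever MCP client the agent uses; `model` and `fileName` can be added when the defaults are not wanted.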
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It does an excellent job describing important behavioral traits: limitations (no audio/video input, image generation not always triggered), potential failures (model may stop generation), and effective usage patterns (multi-turn editing, context preservation, text generation strategy). This goes well beyond basic functionality description.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and well-structured. It starts with the core purpose, then lists requirements, limitations, and effective strategies. Each sentence adds value, though some sentences could be more concise (e.g., the text generation strategy explanation is somewhat verbose). Overall, it's efficient without being terse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of an image editing tool with no annotations and no output schema, the description provides substantial context. It covers purpose, requirements, limitations, behavioral patterns, and usage strategies. The main gap is the lack of information about return values or what happens after editing (though the saveDir parameter suggests files are saved). For a tool with this complexity, it's quite complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 6 parameters thoroughly. The description mentions that text prompts and Base64-encoded image data are required, which aligns with the required parameters in the schema, but doesn't add significant semantic value beyond what the schema already provides. The baseline of 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: "Edits an existing image using a Gemini model." It specifies the verb (edit) and resource (images) with the Gemini model context. However, it doesn't explicitly differentiate from sibling tools like mcp_gemini_create_image or mcp_gemini_generate_image, which might also involve image manipulation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: it requires text prompts and Base64-encoded image data for editing existing images. It mentions multi-turn image editing capabilities and suggests effective strategies (e.g., generating text first for text-in-image edits). However, it doesn't explicitly state when NOT to use this tool or name specific alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
