continue_image_session

Continue editing images in an existing multi-turn session while preserving the previous conversation context. New instructions build on the session's accumulated history.

Instructions

Continues image editing in an existing multi-turn session. The previous conversation context is preserved.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| sessionId | Yes | Session ID returned by start_image_session | |
| prompt | Yes | Additional editing instructions | |
| outputPath | Yes | File path where the resulting image is saved | |
| imagePath | No | Path to an additional reference image (optional) | |
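A call to this tool might look like the following sketch; the session ID and file paths are hypothetical placeholder values, not taken from a real session:

```javascript
// Example arguments for continue_image_session.
// The sessionId and paths below are illustrative placeholders.
const args = {
  sessionId: "session-1234",            // returned by start_image_session
  prompt: "Make the background darker", // additional edit instruction
  outputPath: "./edits/step-2.png",     // where the edited image is written
  // imagePath is optional; include it to reference an extra input image:
  // imagePath: "./inputs/reference.png",
};

// Only sessionId, prompt, and outputPath are required by the schema.
const required = ["sessionId", "prompt", "outputPath"];
console.log(required.every((k) => typeof args[k] === "string")); // → true
```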

Implementation Reference

  • Main tool registration and handler for continue_image_session. Validates sessionId, retrieves session from sessions Map, calls session.sendWithImage() or session.send() based on whether an imagePath is provided, and returns formatted success/error responses.
    server.tool(
      "continue_image_session",
      "기존 멀티턴 세션을 이어서 이미지를 편집합니다. 이전 대화 맥락이 유지됩니다.",
      {
        sessionId: z.string().describe("start_image_session에서 받은 세션 ID"),
        prompt: z.string().describe("추가 편집 지시사항"),
        outputPath: z.string().describe("결과 이미지를 저장할 파일 경로"),
        imagePath: z
          .string()
          .optional()
          .describe("추가로 참조할 이미지 경로 (선택사항)"),
      },
      async ({ sessionId, prompt, outputPath, imagePath }) => {
        try {
          const session = sessions.get(sessionId);
          if (!session) {
            return {
              content: [
                {
                  type: "text",
                  text: `세션을 찾을 수 없습니다 (ID: ${sessionId}). start_image_session으로 새 세션을 시작하세요.`,
                },
              ],
              isError: true,
            };
          }
    
          let result;
          if (imagePath) {
            result = await session.sendWithImage(imagePath, prompt, { outputPath });
          } else {
            result = await session.send(prompt, { outputPath });
          }
    
          if (!result.success) {
            return {
              content: [{ type: "text", text: result.text }],
              isError: !result.overloaded,
            };
          }
    
          return {
            content: [
              {
                type: "text",
                text: `이미지가 편집되었습니다: ${result.outputPath}${result.text ? "\n\n" + result.text : ""}`,
              },
            ],
          };
        } catch (error) {
          return {
            content: [
              {
                type: "text",
                text: `세션 편집 실패: ${error.message}`,
              },
            ],
            isError: true,
          };
        }
      }
    );
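The handler's session lookup can be exercised in isolation. The sketch below mimics the sessions Map behavior with a stub; the stub session object and the lookup helper are stand-ins for illustration, not the real createSession result or handler code:

```javascript
// Minimal sketch of the handler's lookup logic against a sessions Map.
// The stub session is hypothetical; real sessions come from createSession().
const sessions = new Map();
sessions.set("s1", {
  send: async () => ({ success: true, outputPath: "./out.png", text: "" }),
});

function lookup(sessionId) {
  const session = sessions.get(sessionId);
  if (!session) {
    // Mirrors the handler's isError response for an unknown session ID.
    return { isError: true, message: `Session not found: ${sessionId}` };
  }
  return { isError: false, session };
}

console.log(lookup("s1").isError);      // → false
console.log(lookup("missing").isError); // → true
```

When the lookup fails, the real handler tells the agent to start over with start_image_session rather than retrying with the same ID.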
  • createSession function that creates a multi-turn session object with send() and sendWithImage() methods. These methods are called by the continue_image_session handler to maintain conversation context across multiple edits.
    export function createSession(options = {}) {
      const ai = getAI();
      const model = options.model || DEFAULT_MODEL;
      const config = buildConfig(options);
    
      const chat = ai.chats.create({
        model,
        config,
      });
    
      return {
        async send(prompt, sendOptions = {}) {
          const outputPath = resolveOutput(
            sendOptions.output || sendOptions.outputPath || "./session-output.png"
          );
    
          try {
            const response = await chat.sendMessage({
              message: prompt,
            });
    
            const image = extractImageFromResponse(response);
            const text = extractTextFromResponse(response);
    
            if (!image) {
              const fallbackMsg = text || "이미지가 생성되지 않았습니다.";
              return { success: false, text: fallbackMsg, outputPath: null };
            }
    
            ensureDir(outputPath);
            const finalPath = outputPath.match(/\.\w+$/)
              ? outputPath
              : outputPath + getExtFromMimeType(image.mimeType);
    
            fs.writeFileSync(finalPath, Buffer.from(image.data, "base64"));
    
            return { success: true, outputPath: finalPath, text: text || "" };
          } catch (error) {
            if (isOverloadError(error)) {
              return {
                success: false,
                text: `API 과부하 에러 (${model}). model: "${FALLBACK_MODEL}" 옵션으로 다시 시도해보세요.`,
                outputPath: null,
                overloaded: true,
              };
            }
            throw error;
          }
        },
    
        async sendWithImage(imagePath, prompt, sendOptions = {}) {
          const outputPath = resolveOutput(
            sendOptions.output || sendOptions.outputPath || "./session-output.png"
          );
    
          const absImagePath = path.resolve(imagePath);
          if (!fs.existsSync(absImagePath)) {
            throw new Error(`이미지 파일을 찾을 수 없습니다: ${absImagePath}`);
          }
    
          const imageData = fs.readFileSync(absImagePath);
          const base64Image = imageData.toString("base64");
          const mimeType = getMimeTypeFromPath(absImagePath);
    
          const message = [
            {
              inlineData: {
                mimeType,
                data: base64Image,
              },
            },
            { text: prompt },
          ];
    
          try {
            const response = await chat.sendMessage({ message });
    
            const image = extractImageFromResponse(response);
            const text = extractTextFromResponse(response);
    
            if (!image) {
              const fallbackMsg = text || "편집된 이미지가 생성되지 않았습니다.";
              return { success: false, text: fallbackMsg, outputPath: null };
            }
    
            ensureDir(outputPath);
            const finalPath = outputPath.match(/\.\w+$/)
              ? outputPath
              : outputPath + getExtFromMimeType(image.mimeType);
    
            fs.writeFileSync(finalPath, Buffer.from(image.data, "base64"));
    
            return { success: true, outputPath: finalPath, text: text || "" };
          } catch (error) {
            if (isOverloadError(error)) {
              return {
                success: false,
                text: `API 과부하 에러 (${model}). model: "${FALLBACK_MODEL}" 옵션으로 다시 시도해보세요.`,
                outputPath: null,
                overloaded: true,
              };
            }
            throw error;
          }
        },
      };
    }
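Both send() and sendWithImage() append a file extension derived from the response MIME type when outputPath lacks one. A standalone sketch of that check follows; getExtFromMimeType is simplified here to a small lookup and may differ from the actual helper:

```javascript
// Standalone sketch of the output-path extension logic shared by send()
// and sendWithImage(). getExtFromMimeType is reduced to a tiny lookup.
function getExtFromMimeType(mimeType) {
  const map = { "image/png": ".png", "image/jpeg": ".jpg", "image/webp": ".webp" };
  return map[mimeType] || ".png";
}

function resolveFinalPath(outputPath, mimeType) {
  // Keep the path as-is if it already ends in an extension like ".png";
  // otherwise append one that matches the returned image's MIME type.
  return outputPath.match(/\.\w+$/)
    ? outputPath
    : outputPath + getExtFromMimeType(mimeType);
}

console.log(resolveFinalPath("./result.png", "image/jpeg")); // → "./result.png"
console.log(resolveFinalPath("./result", "image/jpeg"));     // → "./result.jpg"
```

Note that an existing extension always wins, even when it disagrees with the MIME type of the generated image.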
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions that 'previous conversation context is maintained', which is useful behavioral context about session state persistence. However, it lacks critical details such as whether this is a read-only or mutation operation, what permissions are needed, how errors are handled, or whether there are rate limits. For a tool with 4 parameters and no annotations, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two concise sentences in Korean that efficiently convey the core purpose and key behavioral trait. Every word earns its place with no redundancy or fluff. It's appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 4 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns (e.g., success confirmation, error details, or image metadata), doesn't cover error conditions, and provides minimal behavioral context beyond session continuity. For a session-based image editing tool, this leaves significant gaps for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 4 parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema (e.g., it doesn't explain format constraints or provide examples). With high schema coverage, the baseline is 3 even without additional param details in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('continue') and resource ('multi-turn session') with the specific action 'edit images'. It distinguishes from siblings like 'start_image_session' by focusing on continuation rather than initiation. However, it doesn't explicitly differentiate from 'edit_image' which might also edit images but without session context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context by mentioning 'previous conversation context is maintained', suggesting it should be used when there's an existing session. However, it doesn't explicitly state when NOT to use it or name alternatives like 'start_image_session' for new sessions or 'edit_image' for single edits without session continuity.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jk7g14/gemini-image-mcp'
