gpt-image-2-mcp
Provides tools for generating and editing images using OpenAI's gpt-image-2 model, including iterative multi-turn editing sessions.
An MCP server that exposes OpenAI's gpt-image-2 (released 2026-04-21) to any MCP client — Claude Desktop, Claude Code, Cursor, MCP Inspector, etc.
Six tools:
Tool | What it does |
| text → image |
edit_image | 1–8 reference images (+ optional mask) → image |
start_edit_session | begin an iterative multi-turn edit |
continue_edit_session | apply another refinement turn; the previous output becomes the new input |
end_edit_session | release a session |
| show active sessions |
Every generated image is saved to disk and returned inline so the calling model sees it.
Requirements
Node.js ≥ 20
An OpenAI API key on an org with gpt-image-2 access (Organization Verification may be required)
Install
pnpm install
pnpm run build
This produces build/index.js, which is the server entry point.
Configure a client
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
  "mcpServers": {
    "gpt-image-2": {
      "command": "node",
      "args": ["/absolute/path/to/gpt_image_2_mcp/build/index.js"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
Claude Code
Either add to ~/.claude.json under mcpServers with the same shape, or drop an .mcp.json next to your project:
{
  "mcpServers": {
    "gpt-image-2": {
      "command": "node",
      "args": ["/absolute/path/to/gpt_image_2_mcp/build/index.js"],
      "env": { "OPENAI_API_KEY": "sk-..." }
    }
  }
}
MCP Inspector (interactive testing)
pnpm run inspect
Launches the official inspector UI pointed at your local build.
Environment variables
Var | Required | Purpose |
OPENAI_API_KEY | ✅ | Auth |
| | Override for proxies / enterprise routes |
| | Forwarded as |
| | Forwarded as |
| | Global default for where images are saved. Absolute paths are used as-is; relative paths are resolved from CWD. |
| | Set to |
| | Max concurrent in-memory edit sessions, LRU-evicted beyond this (default 20; |
| | Idle TTL before an edit session is swept (default 3600000 = 1h; |
| | Set to |
OPENAI_RESPONSES_EDIT_MODEL | | Host model used by the Responses-API fallback edit route (default gpt-4.1-mini) |
Where images go
Unless overridden, each tool writes to:
<OS config dir>/gpt-image-2-mcp/output/<project-name>-<hash>/
macOS/Linux:
~/.config/gpt-image-2-mcp/output/<project>-<hash>/
Windows:
%APPDATA%\gpt-image-2-mcp\output\<project>-<hash>\
<project>-<hash> is derived from the git root (if any) or the current working directory — each project gets its own folder so generations don't collide.
Per-call override: pass output_dir: "/some/path" to any tool.
Filenames look like image-20260422-150301-a1b2c3.png. If you pass filename_prefix: "hero-banner", it becomes image-20260422-150301-a1b2c3-hero-banner.png.
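As a rough illustration of the naming scheme above, the following sketch assembles a filename from a timestamp, a short id, and the optional filename_prefix. The helper name buildFilename, its parameters, and the use of UTC are assumptions made for this example; the server's own code may differ.

```typescript
// Hypothetical helper, not the server's actual code. It reproduces the
// documented shape: image-YYYYMMDD-HHMMSS-<id>[-<filename_prefix>].png
function buildFilename(now: Date, id: string, filenamePrefix?: string): string {
  const pad = (n: number): string => String(n).padStart(2, "0");
  // Assumption: timestamps are UTC; the server may use local time instead.
  const date = `${now.getUTCFullYear()}${pad(now.getUTCMonth() + 1)}${pad(now.getUTCDate())}`;
  const time = `${pad(now.getUTCHours())}${pad(now.getUTCMinutes())}${pad(now.getUTCSeconds())}`;
  // Despite its name, the documented example appends the prefix after the id.
  const suffix = filenamePrefix ? `-${filenamePrefix}` : "";
  return `image-${date}-${time}-${id}${suffix}.png`;
}
```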
What the tools return
Every tool result contains:
An inline ImageContent block per generated image (so the LLM sees the image)
A text summary: applied settings, file path, token usage, estimated cost
structuredContent for programmatic consumers:
{
  "model": "gpt-image-2",
  "prompt": "…",
  "requested": { "size": "auto", "quality": "auto", "n": 1, "format": "png" },
  "applied": { "size": "1024x1024", "quality": "high", "background": "opaque", "output_format": "png" },
  "images": [ { "file_path": "…", "filename": "…", "size_bytes": 123456, "mime_type": "image/png" } ],
  "usage": { "input_tokens": …, "output_tokens": …, "total_tokens": …, "input_tokens_details": { … } },
  "cost_usd_estimated": 0.2112
}
Session tools additionally return session_id and turn.
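For programmatic consumers, a minimal sketch of how structuredContent might be typed and aggregated is shown here. The interface and function names are invented for this example, and only a subset of the documented fields is modelled.

```typescript
// Illustrative types covering part of the documented structuredContent shape.
// These names are assumptions, not exports of the server.
interface SavedImage {
  file_path: string;
  filename: string;
  size_bytes: number;
  mime_type: string;
}

interface ToolStructuredContent {
  model: string;
  prompt: string;
  images: SavedImage[];
  cost_usd_estimated: number;
  session_id?: string; // present on session tools only
  turn?: number; // present on session tools only
}

// Collect every saved file path and sum the estimated cost across results.
function summarizeResults(results: ToolStructuredContent[]): { files: string[]; totalCostUsd: number } {
  const files: string[] = [];
  let totalCostUsd = 0;
  for (const result of results) {
    for (const image of result.images) files.push(image.file_path);
    totalCostUsd += result.cost_usd_estimated;
  }
  return { files, totalCostUsd };
}
```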
Sizes
Default is auto (the model picks). You can pass:
A preset: 1024x1024, 1536x1024, 1024x1536
Any custom WxH where:
Both edges are multiples of 16
Max edge ≤ 3840px (outputs above 2K are beta)
Aspect ratio within 1:3 and 3:1
Total pixels between 655,360 and 8,294,400
Invalid sizes fail before the API call with a clear error — no wasted requests.
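The size rules above can be sketched as a small pre-flight check. This is an illustration under the stated constraints only; validateSize and its error strings are hypothetical and are not taken from the server's source.

```typescript
// Hypothetical validator mirroring the documented size constraints.
const ACCEPTED_AS_IS = new Set(["auto", "1024x1024", "1536x1024", "1024x1536"]);

// Returns null when the size is acceptable, otherwise a human-readable reason.
function validateSize(size: string): string | null {
  if (ACCEPTED_AS_IS.has(size)) return null;
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return `size must be "auto" or WxH, got "${size}"`;
  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width % 16 !== 0 || height % 16 !== 0) return "both edges must be multiples of 16";
  if (Math.max(width, height) > 3840) return "max edge is 3840px";
  // Integer comparison avoids floating-point ratio edge cases at exactly 3:1.
  if (width > height * 3 || height > width * 3) return "aspect ratio must be within 1:3 and 3:1";
  const pixels = width * height;
  if (pixels < 655360 || pixels > 8294400) return "total pixels must be between 655,360 and 8,294,400";
  return null;
}
```

For instance, 3840x2160 passes every rule (8,294,400 pixels exactly), while 512x512 is rejected for having too few pixels.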
background: "transparent" is NOT supported by gpt-image-2. Use a model that supports it if you need alpha.
Iterative editing example
start_edit_session prompt: "A coastal lighthouse at dawn, photorealistic", images: ["./sketch.png"]
→ session_id: edit-1761149123-a1b2c3d4, turn 1, saved to …/session-…-turn1.png
continue_edit_session session_id: "edit-…-a1b2c3d4", prompt: "Make the sky more orange. Keep everything else the same."
→ turn 2
continue_edit_session session_id: "edit-…-a1b2c3d4", prompt: "Add a small boat on the horizon."
→ turn 3
end_edit_session session_id: "edit-…-a1b2c3d4"
Sessions are in-memory only and discarded on server restart. This is intentional (it keeps the server stateless on the wire) and mirrors the Gemini MCP pattern.
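To make the in-memory session model concrete, here is a minimal sketch of a store with an LRU cap and an idle TTL, using the documented defaults (20 sessions, 1 hour). The class, its method names, and the explicit now parameter are assumptions for illustration; the server's real bookkeeping may be organised differently.

```typescript
// Hypothetical in-memory session store; not the server's actual code.
interface EditSession {
  id: string;
  turn: number;
  lastUsed: number; // epoch milliseconds
}

class SessionStore {
  // A Map keeps insertion order, so its first key is the least recently used.
  private sessions = new Map<string, EditSession>();
  private maxSessions: number;
  private ttlMs: number;

  constructor(maxSessions = 20, ttlMs = 60 * 60 * 1000) {
    this.maxSessions = maxSessions;
    this.ttlMs = ttlMs;
  }

  start(id: string, now: number): EditSession {
    this.sweep(now);
    const session: EditSession = { id, turn: 1, lastUsed: now };
    this.sessions.set(id, session);
    // Evict least recently used sessions once the cap is exceeded.
    while (this.sessions.size > this.maxSessions) {
      const oldest = this.sessions.keys().next().value as string;
      this.sessions.delete(oldest);
    }
    return session;
  }

  nextTurn(id: string, now: number): EditSession | undefined {
    this.sweep(now);
    const session = this.sessions.get(id);
    if (!session) return undefined;
    session.turn += 1;
    session.lastUsed = now;
    // Re-insert so the session moves to the most recently used position.
    this.sessions.delete(id);
    this.sessions.set(id, session);
    return session;
  }

  end(id: string): boolean {
    return this.sessions.delete(id);
  }

  list(): string[] {
    return Array.from(this.sessions.keys());
  }

  // Drop sessions that have been idle for longer than the TTL.
  private sweep(now: number): void {
    this.sessions.forEach((session, id) => {
      if (now - session.lastUsed > this.ttlMs) this.sessions.delete(id);
    });
  }
}
```

Passing the clock in as a parameter keeps the sketch deterministic and easy to test; a real server would more likely call Date.now() internally.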
Image inputs for edit_image and start_edit_session
Accepts any mix of:
Absolute path: /Users/me/photo.png
Relative path: ./photo.png (resolved from CWD)
file:///Users/me/photo.png
https://example.com/photo.png (downloaded, size-capped)
data:image/png;base64,iVBOR…
Up to 8 images per call. Each ≤ 50MB. PNG/WEBP/JPG supported.
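One way to picture how mixed inputs could be told apart is the sketch below. The function names and the classification order are assumptions for this example; it does not read files, download URLs, or enforce the 50MB cap.

```typescript
// Hypothetical classifier for the accepted image input forms.
type ImageInputKind = "data-uri" | "https-url" | "file-url" | "absolute-path" | "relative-path";

function classifyImageInput(input: string): ImageInputKind {
  if (input.startsWith("data:image/")) return "data-uri";
  if (input.startsWith("https://")) return "https-url";
  if (input.startsWith("file://")) return "file-url";
  // POSIX absolute paths start with "/", Windows ones with a drive letter.
  if (input.startsWith("/") || /^[A-Za-z]:[\\/]/.test(input)) return "absolute-path";
  return "relative-path"; // would be resolved from CWD
}

const MAX_IMAGES_PER_CALL = 8;

// Enforce the documented per-call image count, then classify each input.
function classifyImageInputs(inputs: string[]): ImageInputKind[] {
  if (inputs.length < 1 || inputs.length > MAX_IMAGES_PER_CALL) {
    throw new Error(`expected 1 to ${MAX_IMAGES_PER_CALL} images, got ${inputs.length}`);
  }
  return inputs.map(classifyImageInput);
}
```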
Cost guardrails
The server ships no hard spending limits — you should watch your OpenAI usage dashboard. Each tool result includes an estimated cost in USD computed from the token usage returned by the API, plus an approximate pre-flight estimate logged to stderr.
Rough per-image cost at common sizes:
Quality | 1024×1024 | 1024×1536 / 1536×1024 |
low | ~$0.006 | ~$0.005 |
medium | ~$0.053 | ~$0.041 |
high | ~$0.211 | ~$0.165 |
Custom sizes scale with pixel count. Edit calls additionally tokenize input images at high fidelity — large reference images are expensive.
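For quick budgeting, the rough table above can be turned into a lookup. This sketch simply multiplies the approximate per-image figures by n; it is not how the server computes cost_usd_estimated (which is derived from token usage), and it ignores input-image tokens on edit calls.

```typescript
// Approximate per-image costs copied from the table above (USD).
type Quality = "low" | "medium" | "high";
type PresetSize = "1024x1024" | "1024x1536" | "1536x1024";

const ROUGH_COST_USD: Record<Quality, { square: number; rectangular: number }> = {
  low: { square: 0.006, rectangular: 0.005 },
  medium: { square: 0.053, rectangular: 0.041 },
  high: { square: 0.211, rectangular: 0.165 },
};

// Hypothetical helper: rough cost of generating n images at a preset size.
function roughBatchCostUsd(quality: Quality, size: PresetSize, n: number): number {
  const row = ROUGH_COST_USD[quality];
  const perImage = size === "1024x1024" ? row.square : row.rectangular;
  return perImage * n;
}
```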
Edit routing
edit_image, start_edit_session, and continue_edit_session call POST /v1/images/edits directly. This is the canonical endpoint: it supports n > 1, masks, and returns accurate per-call token usage for cost estimation.
History: at launch (2026-04-21) the endpoint rejected gpt-image-2 (and gpt-image-1.5) with 400 Invalid value: 'gpt-image-2'. Value must be 'dall-e-2'. This was an OpenAI-side bug. Versions ≤ 0.2.0 of this server therefore routed edits through the Responses API by default. OpenAI fixed the endpoint silently in early May 2026 (verified live 2026-06-11), and since 0.3.0 the direct endpoint is the default again.
The Responses-API workaround is kept as a fallback (src/utils/edit-via-responses.ts):
It engages automatically if the direct endpoint ever returns the launch-era 400 again (matched narrowly; the rejection is remembered for 10 minutes so only the first call in that window pays the failed attempt, then the direct endpoint is re-probed).
Set OPENAI_FORCE_RESPONSES_EDITS=1 to pin it explicitly.
The legacy OPENAI_USE_DIRECT_EDITS toggle from 0.2.0 is deprecated and ignored (its only meaningful setting was 1, which opted into the direct endpoint that is now the default).
Fallback mechanics: input images are uploaded via the Files API (purpose: "vision"), a cheap host model (default gpt-4.1-mini, override with OPENAI_RESPONSES_EDIT_MODEL) is forced to invoke the image_generation tool, the base64 result is extracted, and uploaded files are deleted afterwards.
Fallback trade-offs versus the direct endpoint (only apply when the fallback is active — the tool result carries route: "responses" and a note when they do):
n > 1 is not supported: the Responses path returns one image per call.
Cost accounting undercounts: usage only reports the host chat model's text tokens; the image tool is billed separately (~$0.04–0.05 extra for a 1024×1536 medium edit).
Masks still work: they are uploaded and referenced via input_image_mask.file_id.
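The routing rule described in this section (direct endpoint by default, automatic fallback on the launch-era 400, rejection remembered for 10 minutes, optional pinning) can be sketched roughly as follows. The EditRouter class and the exact string match are assumptions for illustration and are not copied from src/utils/edit-via-responses.ts.

```typescript
// Hypothetical routing helper; names and matching logic are illustrative.
type EditRoute = "direct" | "responses";

const REJECTION_MEMO_MS = 10 * 60 * 1000; // rejection is remembered for 10 minutes

class EditRouter {
  private rejectedAt: number | null = null;
  private forceResponses: boolean;

  constructor(forceResponses: boolean) {
    this.forceResponses = forceResponses;
  }

  // Which route to try first for a call made at `now` (epoch milliseconds).
  pickRoute(now: number): EditRoute {
    if (this.forceResponses) return "responses";
    if (this.rejectedAt !== null && now - this.rejectedAt < REJECTION_MEMO_MS) return "responses";
    return "direct"; // default, and the re-probe once the memo has expired
  }

  // Record a direct-endpoint failure. Returns true only for the launch-era
  // rejection, in which case the caller would retry via the Responses API.
  noteDirectFailure(status: number, message: string, now: number): boolean {
    const isLaunchEraRejection =
      status === 400 &&
      message.includes("Invalid value: 'gpt-image-2'") &&
      message.includes("Value must be 'dall-e-2'");
    if (isLaunchEraRejection) this.rejectedAt = now;
    return isLaunchEraRejection;
  }
}
```

In a real process the forceResponses flag would presumably be derived from OPENAI_FORCE_RESPONSES_EDITS=1; it is passed in explicitly here to keep the sketch free of environment access.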
Troubleshooting
"OPENAI_API_KEY is not set": add it to the env block of your MCP config.
403 / organization verification: gpt-image-2 may require Organization Verification on your OpenAI org. Check the dashboard.
429: you hit the IPM (images per minute) cap for your tier. Lower n, or wait.
Image doesn't appear in the client: check the file path in the text block; the image is saved regardless of inline display.
Protocol disconnects silently: something printed to stdout. Check src/**/*.ts; all logs must use utils/logger.ts (stderr). This is the single biggest MCP footgun.
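The stdout rule matters because the stdio transport carries protocol messages on stdout, so any stray print corrupts the stream. A minimal stderr-only logger might look like the sketch below; it is an illustration and not the contents of utils/logger.ts, and the message format is invented.

```typescript
// Hypothetical stderr-only logger; the real utils/logger.ts may differ.
type LogLevel = "info" | "warn" | "error";

function formatLog(level: LogLevel, message: string): string {
  return `[gpt-image-2-mcp] ${level.toUpperCase()} ${message}`;
}

function log(level: LogLevel, message: string): void {
  // console.error writes to stderr. console.log would write to stdout and
  // interleave with protocol messages on the stdio transport.
  console.error(formatLog(level, message));
}
```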
Development
pnpm run dev # tsx watch
pnpm run typecheck # tsc --noEmit
pnpm run build # compile to build/
pnpm run inspect # launch MCP Inspector
License
MIT