gpt-image-2-mcp
Provides tools for generating and editing images using OpenAI's gpt-image-2 model, including iterative multi-turn editing sessions.
An MCP server that exposes OpenAI's gpt-image-2 (released 2026-04-21) to any MCP client — Claude Desktop, Claude Code, Cursor, MCP Inspector, etc.
Six tools:
Tool | What it does |
| text → image |
edit_image | 1–8 reference images (+ optional mask) → image |
start_edit_session | begin an iterative multi-turn edit |
continue_edit_session | apply another refinement turn; the previous output becomes the new input |
end_edit_session | release a session |
| show active sessions |
Every generated image is saved to disk and returned inline so the calling model sees it.
Requirements
Node.js ≥ 20
An OpenAI API key on an org with gpt-image-2 access (Organization Verification may be required)
Install
pnpm install
pnpm run build
This produces build/index.js, which is the server entry point.
Configure a client
Claude Desktop
Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):
{
"mcpServers": {
"gpt-image-2": {
"command": "node",
"args": ["/absolute/path/to/gpt_image_2_mcp/build/index.js"],
"env": {
"OPENAI_API_KEY": "sk-..."
}
}
}
}
Claude Code
Either add to ~/.claude.json under mcpServers with the same shape, or drop an .mcp.json next to your project:
{
"mcpServers": {
"gpt-image-2": {
"command": "node",
"args": ["/absolute/path/to/gpt_image_2_mcp/build/index.js"],
"env": { "OPENAI_API_KEY": "sk-..." }
}
}
}
MCP Inspector (interactive testing)
pnpm run inspect
Launches the official inspector UI pointed at your local build.
Environment variables
Var | Required | Purpose |
OPENAI_API_KEY | ✅ | Auth |
| | Override for proxies / enterprise routes |
| | Forwarded as |
| | Forwarded as |
| | Global default for where images are saved. Absolute paths used as-is, relative resolved from CWD. |
| | Set to |
| | Max concurrent in-memory edit sessions, LRU-evicted beyond this (default 20; |
| | Idle TTL before an edit session is swept (default 3600000 = 1h; |
| | Set to |
OPENAI_RESPONSES_EDIT_MODEL | | Host model used by the Responses-API fallback edit route (default gpt-4.1-mini) |
Where images go
Unless overridden, each tool writes to:
<OS config dir>/gpt-image-2-mcp/output/<project-name>-<hash>/
macOS/Linux: ~/.config/gpt-image-2-mcp/output/<project>-<hash>/
Windows: %APPDATA%\gpt-image-2-mcp\output\<project>-<hash>\
<project>-<hash> is derived from the git root (if any) or the current working directory — each project gets its own folder so generations don't collide.
Per-call override: pass output_dir: "/some/path" to any tool.
Filenames look like image-20260422-150301-a1b2c3.png. If you pass filename_prefix: "hero-banner", it becomes image-20260422-150301-a1b2c3-hero-banner.png.
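A minimal sketch of the naming scheme, reproducing the two example filenames. The helper name buildFilename, the use of UTC for the timestamp, and the caller-supplied random id are assumptions made for illustration; the server's own code may differ.

```typescript
// Hypothetical helper, not the server's actual implementation.
function pad2(n: number): string {
  return (n < 10 ? "0" : "") + String(n);
}

// Builds "image-<YYYYMMDD>-<HHMMSS>-<id>[-<prefix>].<ext>".
// Assumption: the timestamp is rendered in UTC.
function buildFilename(now: Date, id: string, prefix?: string, ext: string = "png"): string {
  const date = String(now.getUTCFullYear()) + pad2(now.getUTCMonth() + 1) + pad2(now.getUTCDate());
  const time = pad2(now.getUTCHours()) + pad2(now.getUTCMinutes()) + pad2(now.getUTCSeconds());
  const suffix = prefix ? "-" + prefix : "";
  return "image-" + date + "-" + time + "-" + id + suffix + "." + ext;
}
```

Note that, per the example above, filename_prefix ends up after the random id rather than at the start of the name.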
What the tools return
Every tool result contains:
An inline ImageContent block per generated image (so the LLM sees the image)
A text summary: applied settings, file path, token usage, estimated cost
structuredContent for programmatic consumers:
{
"model": "gpt-image-2",
"prompt": "…",
"requested": { "size": "auto", "quality": "auto", "n": 1, "format": "png" },
"applied": { "size": "1024x1024", "quality": "high", "background": "opaque", "output_format": "png" },
"images": [ { "file_path": "…", "filename": "…", "size_bytes": 123456, "mime_type": "image/png" } ],
"usage": { "input_tokens": …, "output_tokens": …, "total_tokens": …, "input_tokens_details": { … } },
"cost_usd_estimated": 0.2112
}
Session tools additionally return session_id and turn.
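For programmatic consumers, the payload can be typed roughly as follows. The field names come from the example above; which fields are optional, and the summarizeResult helper itself, are assumptions for illustration.

```typescript
// Shape inferred from the example payload; optionality is a guess.
interface GeneratedImage {
  file_path: string;
  filename: string;
  size_bytes: number;
  mime_type: string;
}

interface ImageToolResult {
  model: string;
  prompt: string;
  requested: Record<string, unknown>;
  applied: Record<string, unknown>;
  images: GeneratedImage[];
  usage?: { input_tokens: number; output_tokens: number; total_tokens: number };
  cost_usd_estimated?: number;
  session_id?: string; // session tools only
  turn?: number; // session tools only
}

// Hypothetical helper: collect saved file paths and total bytes from one result.
function summarizeResult(result: ImageToolResult): { paths: string[]; totalBytes: number } {
  const paths = result.images.map((img) => img.file_path);
  const totalBytes = result.images.reduce((sum, img) => sum + img.size_bytes, 0);
  return { paths, totalBytes };
}
```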
Sizes
Default is auto (the model picks). You can pass:
A preset: 1024x1024, 1536x1024, 1024x1536
Any custom WxH where:
Both edges are multiples of 16
Max edge ≤ 3840px (outputs above 2K are beta)
Aspect ratio within 1:3 and 3:1
Total pixels between 655,360 and 8,294,400
Invalid sizes fail before the API call with a clear error — no wasted requests.
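The size rules above can be sketched as a pre-flight check. The function name validateSize and the error strings are hypothetical, and the bounds are assumed to be inclusive; the server's real validator may word or order its checks differently.

```typescript
// Hypothetical validator mirroring the documented rules.
// Assumption: all documented bounds are inclusive.
const SIZE_PRESETS = ["auto", "1024x1024", "1536x1024", "1024x1536"];

// Returns null when the size is acceptable, otherwise an error message.
function validateSize(size: string): string | null {
  if (SIZE_PRESETS.indexOf(size) !== -1) return null;
  const m = /^(\d+)x(\d+)$/.exec(size);
  if (!m) return 'size must be "auto" or WxH, got "' + size + '"';
  const w = Number(m[1]);
  const h = Number(m[2]);
  if (w % 16 !== 0 || h % 16 !== 0) return "both edges must be multiples of 16";
  if (Math.max(w, h) > 3840) return "max edge is 3840px";
  // Cross-multiply instead of dividing to avoid floating-point ratios.
  if (w > 3 * h || h > 3 * w) return "aspect ratio must be within 1:3 and 3:1";
  const pixels = w * h;
  if (pixels < 655360 || pixels > 8294400) {
    return "total pixels must be between 655,360 and 8,294,400";
  }
  return null;
}
```

For instance, 3840x2160 passes every check (it is exactly 8,294,400 pixels), while 1000x1000 is rejected because neither edge is a multiple of 16.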
background: "transparent" is NOT supported by gpt-image-2. Use a model that supports it if you need alpha.
Iterative editing example
start_edit_session prompt: "A coastal lighthouse at dawn, photorealistic", images: ["./sketch.png"]
→ session_id: edit-1761149123-a1b2c3d4, turn 1, saved to …/session-…-turn1.png
continue_edit_session session_id: "edit-…-a1b2c3d4", prompt: "Make the sky more orange. Keep everything else the same."
→ turn 2
continue_edit_session session_id: "edit-…-a1b2c3d4", prompt: "Add a small boat on the horizon."
→ turn 3
end_edit_session session_id: "edit-…-a1b2c3d4"
Sessions are in-memory only and discarded on server restart. This is intentional (it keeps the server stateless on the wire) and mirrors the Gemini MCP pattern.
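To illustrate what in-memory sessions with an LRU cap and an idle TTL might look like, here is a minimal sketch. The SessionStore class and its method names are hypothetical; the defaults (20 sessions, 3600000 ms) are taken from the environment variable table, and the real server presumably also stores the latest image for each session.

```typescript
// Hypothetical in-memory store: LRU cap plus idle TTL.
// A Map iterates in insertion order, so the first key is the least recently used.
interface EditSession {
  id: string;
  turn: number;
  lastUsed: number; // epoch milliseconds
}

class SessionStore {
  private sessions = new Map<string, EditSession>();
  private maxSessions: number;
  private idleTtlMs: number;

  constructor(maxSessions: number = 20, idleTtlMs: number = 3600000) {
    this.maxSessions = maxSessions;
    this.idleTtlMs = idleTtlMs;
  }

  startSession(id: string, now: number): EditSession {
    this.sweep(now);
    const session: EditSession = { id, turn: 1, lastUsed: now };
    this.sessions.set(id, session);
    // Evict least recently used sessions beyond the cap.
    while (this.sessions.size > this.maxSessions) {
      const oldest = this.sessions.keys().next().value;
      if (oldest === undefined) break;
      this.sessions.delete(oldest);
    }
    return session;
  }

  continueSession(id: string, now: number): EditSession | undefined {
    this.sweep(now);
    const session = this.sessions.get(id);
    if (!session) return undefined;
    session.turn += 1;
    session.lastUsed = now;
    // Re-insert so this session becomes the most recently used.
    this.sessions.delete(id);
    this.sessions.set(id, session);
    return session;
  }

  endSession(id: string): boolean {
    return this.sessions.delete(id);
  }

  listSessions(): string[] {
    return Array.from(this.sessions.keys());
  }

  // Drop sessions that have been idle longer than the TTL.
  private sweep(now: number): void {
    this.sessions.forEach((s, id) => {
      if (now - s.lastUsed > this.idleTtlMs) this.sessions.delete(id);
    });
  }
}
```

Passing the clock in as a parameter keeps the sketch deterministic; a real implementation would more likely read Date.now() internally.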
Image inputs for edit_image and start_edit_session
Accepts any mix of:
Absolute path: /Users/me/photo.png
Relative path: ./photo.png (resolved from CWD)
file:///Users/me/photo.png
https://example.com/photo.png (downloaded, size-capped)
data:image/png;base64,iVBOR…
Up to 8 images per call. Each ≤ 50MB. PNG/WEBP/JPG supported.
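One way to tell these input forms apart before loading them is sketched below. The names classifyImageInput and checkImageCount are hypothetical, and the detection rules are an assumption about how such a dispatcher could work, not a description of the server's code.

```typescript
// Hypothetical classifier for the accepted image input forms.
type ImageInputKind = "data-url" | "http-url" | "file-url" | "absolute-path" | "relative-path";

function classifyImageInput(input: string): ImageInputKind {
  if (/^data:/i.test(input)) return "data-url";
  if (/^https?:\/\//i.test(input)) return "http-url";
  if (/^file:\/\//i.test(input)) return "file-url";
  // POSIX absolute paths start with "/"; Windows ones with a drive letter.
  if (input.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(input)) return "absolute-path";
  return "relative-path"; // resolved from CWD by the caller
}

const MAX_IMAGES_PER_CALL = 8;

// Returns an error message, or null if the number of inputs is acceptable.
function checkImageCount(inputs: string[]): string | null {
  if (inputs.length === 0) return "at least one image is required";
  if (inputs.length > MAX_IMAGES_PER_CALL) return "at most 8 images per call";
  return null;
}
```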
Cost guardrails
The server ships no hard spending limits — you should watch your OpenAI usage dashboard. Each tool result includes an estimated cost in USD computed from the token usage returned by the API, plus an approximate pre-flight estimate logged to stderr.
Rough per-image cost at common sizes:
Quality | 1024×1024 | 1024×1536 / 1536×1024 |
low | ~$0.006 | ~$0.005 |
medium | ~$0.053 | ~$0.041 |
high | ~$0.211 | ~$0.165 |
Custom sizes scale with pixel count. Edit calls additionally tokenize input images at high fidelity — large reference images are expensive.
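As a sketch of how a token-based estimate can be computed, the function below multiplies token counts by per-million-token rates. The function name and the Rates shape are hypothetical, and the rates are supplied by the caller: no actual OpenAI prices are assumed here, so check the current pricing page before plugging in numbers.

```typescript
// Hypothetical estimator. Rates are caller-supplied placeholders, not real prices.
interface TokenUsage {
  input_tokens: number;
  output_tokens: number;
}

interface Rates {
  inputUsdPerMillion: number;
  outputUsdPerMillion: number;
}

// Returns the estimated cost in USD, rounded to four decimal places.
function estimateCostUsd(usage: TokenUsage, rates: Rates): number {
  const raw =
    (usage.input_tokens * rates.inputUsdPerMillion +
      usage.output_tokens * rates.outputUsdPerMillion) /
    1000000;
  return Math.round(raw * 10000) / 10000;
}
```

Because edit calls tokenize input images at high fidelity, input_tokens can dominate the estimate when large reference images are passed.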
Edit routing
edit_image, start_edit_session, and continue_edit_session call POST /v1/images/edits directly. This is the canonical endpoint: it supports n > 1, masks, and returns accurate per-call token usage for cost estimation.
History: at launch (2026-04-21) the endpoint rejected gpt-image-2 (and gpt-image-1.5) with 400 Invalid value: 'gpt-image-2'. Value must be 'dall-e-2'. This was an OpenAI-side bug. Versions ≤ 0.2.0 of this server therefore routed edits through the Responses API by default. OpenAI fixed the endpoint silently in early May 2026 (verified live 2026-06-11), and since 0.3.0 the direct endpoint is the default again.
The Responses-API workaround is kept as a fallback (src/utils/edit-via-responses.ts):
It engages automatically if the direct endpoint ever returns the launch-era 400 again (matched narrowly; the rejection is remembered for 10 minutes so only the first call in that window pays the failed attempt, then the direct endpoint is re-probed).
Set OPENAI_FORCE_RESPONSES_EDITS=1 to pin it explicitly.
The legacy OPENAI_USE_DIRECT_EDITS toggle from 0.2.0 is deprecated and ignored (its only meaningful setting was 1, which opted into the direct endpoint; that is now the default).
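The routing behaviour described above (try the direct endpoint, fall back on the launch-era 400, remember the rejection for 10 minutes, then re-probe) could be sketched like this. The EditRouter class and the exact error-matching rule are assumptions for illustration; the real logic lives in the server source and may match the error differently.

```typescript
// Hypothetical router, not the server's actual implementation.
const REJECTION_MEMORY_MS = 10 * 60 * 1000;

type EditRoute = "direct" | "responses";

class EditRouter {
  private rejectedAt: number | null = null;
  private forceResponses: boolean;

  // forceResponses models OPENAI_FORCE_RESPONSES_EDITS=1.
  constructor(forceResponses: boolean) {
    this.forceResponses = forceResponses;
  }

  chooseRoute(now: number): EditRoute {
    if (this.forceResponses) return "responses";
    if (this.rejectedAt !== null && now - this.rejectedAt < REJECTION_MEMORY_MS) {
      return "responses";
    }
    return "direct"; // first call, or the memory expired: re-probe
  }

  // Returns true when the failure is the launch-era rejection and the caller
  // should retry through the Responses API. Assumed matching rule: HTTP 400
  // plus the "Value must be 'dall-e-2'" text.
  recordDirectFailure(status: number, message: string, now: number): boolean {
    const isLaunchEraBug = status === 400 && message.indexOf("Value must be 'dall-e-2'") !== -1;
    if (isLaunchEraBug) this.rejectedAt = now;
    return isLaunchEraBug;
  }
}
```

Matching narrowly matters here: other 400s (for example an invalid size) should surface to the caller instead of silently switching routes.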
Fallback mechanics: input images are uploaded via the Files API (purpose: "vision"), a cheap host model (default gpt-4.1-mini, override with OPENAI_RESPONSES_EDIT_MODEL) is forced to invoke the image_generation tool, the base64 result is extracted, and uploaded files are deleted afterwards.
Fallback trade-offs versus the direct endpoint (these apply only when the fallback is active; the tool result then carries route: "responses" and a note):
n > 1 is not supported: the Responses path returns one image per call.
Cost accounting undercounts: usage only reports the host chat model's text tokens; the image tool is billed separately (~$0.04–0.05 extra for a 1024×1536 medium edit).
Masks still work: they are uploaded and referenced via input_image_mask.file_id.
Troubleshooting
"OPENAI_API_KEY is not set": add it to the env block of your MCP config.
403 / organization verification: gpt-image-2 may require Organization Verification on your OpenAI org. Check the dashboard.
429: you hit the IPM (images per minute) cap for your tier. Lower n, or wait.
Image doesn't appear in the client: check the file path in the text block; the image is saved regardless of inline display.
Protocol disconnects silently: something printed to stdout. Check src/**/*.ts; all logs must use utils/logger.ts (stderr). This is the single biggest MCP footgun.
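With the stdio transport, stdout carries the protocol messages, so any stray console.log corrupts the stream. A stderr-only logger in the spirit of utils/logger.ts might look like the sketch below; createLogger, the Sink type, and the line format are hypothetical and not taken from the project's source.

```typescript
// Hypothetical stand-in for utils/logger.ts: it never writes to stdout.
type Sink = (line: string) => void;

// console.error writes to stderr in Node.js, which leaves stdout free
// for protocol traffic. The sink is injectable so it can be tested.
function createLogger(sink: Sink = (line) => console.error(line)) {
  const write = (level: string, msg: string): void => {
    sink("[gpt-image-2-mcp] " + level + ": " + msg);
  };
  return {
    info: (msg: string): void => write("info", msg),
    error: (msg: string): void => write("error", msg),
  };
}
```

Usage is then `const log = createLogger(); log.info("server started");` anywhere a console.log would otherwise be tempting.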
Development
pnpm run dev # tsx watch
pnpm run typecheck # tsc --noEmit
pnpm run build # compile to build/
pnpm run inspect # launch MCP Inspector
License
MIT