Codex Vision MCP
Uses OpenAI's Codex/Responses backend to provide computer vision capabilities, enabling text-only agents to analyze images, screenshots, diagrams, and local files.
Codex Vision MCP is a local MCP server that adds computer-vision tools to coding agents and models that do not support native image input. It uses existing Codex OAuth credentials and calls the ChatGPT/Codex Responses backend.
The main use case is running a strong text/code model, such as GLM 5.2, while still giving the agent a way to inspect screenshots, diagrams, UI mockups, chart images, and local image files through MCP tools.
Name
Use Codex Vision MCP.
Vision MCP is too generic. The important differentiator is that this server uses Codex OAuth and the Codex Responses backend as the vision provider.
The package and executable names are:
package: codex-vision-mcp
command: codex-vision-mcp
suggested MCP server name: codex-vision
Tools
ui_to_artifact
extract_text_from_screenshot
diagnose_error_screenshot
understand_technical_diagram
analyze_data_visualization
ui_diff_check
analyze_image
analyze_video
Tool names are intentionally compatible with the Z.AI vision MCP tool shape so prompts and agent behavior can transfer easily.
How It Works
The MCP client calls one of the vision tools with a local file path, remote URL, or data:image/... URL. This server loads the image, sends it to the Codex Responses backend with Codex OAuth, and returns a text result to the calling agent.
This is useful when the active model is text-only or when the model provider rejects native image input.
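As a rough illustration of the first step, the three accepted input forms could be told apart as below. This is a minimal sketch, not the server's actual code: the function name `classifyImageSource` and the returned shape are hypothetical, and the real implementation may validate inputs differently.

```typescript
// Hypothetical helper: names and return shape are illustrative only,
// not taken from the codex-vision-mcp source.
type ImageSource =
  | { kind: "data-url"; mimeType: string }
  | { kind: "remote-url"; url: string }
  | { kind: "local-path"; path: string };

function classifyImageSource(input: string): ImageSource {
  const value = input.trim();

  // data:image/png;base64,... is already inline; only the MIME type is extracted.
  const mimeType = /^data:(image\/[a-z0-9.+-]+)[;,]/i.exec(value)?.[1];
  if (mimeType !== undefined) {
    return { kind: "data-url", mimeType: mimeType.toLowerCase() };
  }

  // http(s) URLs would be fetched over the network.
  if (/^https?:\/\//i.test(value)) {
    return { kind: "remote-url", url: value };
  }

  // Everything else is treated as a path on the local file system.
  return { kind: "local-path", path: value };
}
```

With this sketch, `./demo.png` and `C:\images\demo.png` both classify as local paths, while `https://example.com/demo.png` classifies as a remote URL.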
Current Limitation
In Claude Code, pasting an image directly into the client does not automatically call this MCP server. Claude Code transcodes pasted images and sends them directly to the selected model. If the selected model does not support image input, the model call can fail before MCP tools are involved.
For now, use a local file path and ask the agent to inspect that file:
What does ./demo.png describe?
or:
Use codex-vision to analyze /absolute/path/to/demo.png
A future Claude Code integration can intercept pasted images before the model call, route them through this MCP server, and continue with a text-only prompt.
Install From Source
npm install
npm run build
node dist/index.js --smoke
The --smoke flag checks Codex OAuth discovery and prints redacted auth metadata.
Add To Claude Code
Project-local registration is recommended. It keeps this vision server enabled only for the repository that needs it.
cd /path/to/your/project
claude mcp add --scope local codex-vision -- npx -y codex-vision-mcp
For local development from this checkout:
cd /path/to/your/project
claude mcp add --scope local codex-vision -- node /absolute/path/to/vision_mcp/dist/index.js
On Windows PowerShell:
cd C:\path\to\your\project
claude mcp add --scope local codex-vision -- npx -y codex-vision-mcp
Auth
Auth discovery checks explicit environment overrides first, then local Codex auth files, in this order:
CODEX_VISION_AUTH_JSON
CODEX_IMAGEN_AUTH_JSON
CODEX_AUTH_JSON
CODEX_HOME/auth.json
~/.codex/auth.json
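The discovery order above can be sketched as a function that returns candidate paths, highest priority first. This is only an illustration: `authJsonCandidates` is a hypothetical name, and it assumes that each `*_AUTH_JSON` variable holds a file path and that `CODEX_HOME` holds a directory. How the server picks among the candidates (for example, the first file that exists) is not shown here.

```typescript
import * as os from "node:os";
import * as path from "node:path";

type Env = Record<string, string | undefined>;

// Hypothetical helper: returns candidate auth.json paths, highest priority first.
// Assumption: the *_AUTH_JSON variables hold file paths, CODEX_HOME a directory.
function authJsonCandidates(env: Env, homeDir: string = os.homedir()): string[] {
  const candidates: string[] = [];

  // 1-3: explicit environment overrides, in the documented order.
  const overrides = ["CODEX_VISION_AUTH_JSON", "CODEX_IMAGEN_AUTH_JSON", "CODEX_AUTH_JSON"];
  for (const name of overrides) {
    const value = env[name];
    if (value) candidates.push(value);
  }

  // 4: auth.json inside CODEX_HOME, when that variable is set.
  const codexHome = env["CODEX_HOME"];
  if (codexHome) candidates.push(path.join(codexHome, "auth.json"));

  // 5: the default location under the current user's home directory.
  candidates.push(path.join(homeDir, ".codex", "auth.json"));

  return candidates;
}
```

Calling `authJsonCandidates(process.env)` yields the list for the current process; unset variables are simply skipped.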
The server refreshes near-expiry OAuth tokens through https://auth.openai.com/oauth/token and writes them back atomically.
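An atomic write-back is commonly done by writing to a temporary file in the same directory and then renaming it over the target. The sketch below shows that general technique under stated assumptions: `writeJsonAtomic` and the temp-file naming are hypothetical, and the server's own implementation may differ.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: writes JSON so readers never see a half-written file.
// The data goes to a temporary sibling first, then rename() swaps it in.
function writeJsonAtomic(target: string, data: unknown): void {
  const dir = path.dirname(target);
  const tmp = path.join(dir, `.${path.basename(target)}.${process.pid}.tmp`);

  try {
    fs.writeFileSync(tmp, JSON.stringify(data, null, 2) + "\n");
    // Keeping the temp file in the same directory keeps the rename on one
    // file system, where POSIX rename() replaces the target atomically.
    fs.renameSync(tmp, target);
  } catch (err) {
    // Clean up the temp file if either step failed, then rethrow.
    fs.rmSync(tmp, { force: true });
    throw err;
  }
}

// Demo on a throwaway directory; the payload is a placeholder, not the real auth.json shape.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), "codex-vision-atomic-"));
const demoFile = path.join(demoDir, "auth.json");
writeJsonAtomic(demoFile, { example: true });
```

If the process is interrupted before the rename, the previous file is left untouched and only a stray temp file may remain.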
Environment
CODEX_VISION_MODEL: Responses model, default gpt-5.5
CODEX_VISION_BASE_URL: default https://chatgpt.com/backend-api/codex
CODEX_VISION_AUTH_JSON: explicit auth JSON path
CODEX_VISION_IMAGE_DETAIL: auto, low, high, or original
CODEX_VISION_TIMEOUT_MS: default 300000
CODEX_VISION_MAX_IMAGE_MB: default 10
CODEX_VISION_MAX_VIDEO_MB: default 50
CODEX_VISION_VIDEO_FRAMES: default 4
CODEX_VISION_MCP_LOG_PATH: optional log file path
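One way these variables could be read into a typed configuration is sketched below. The defaults come from the list above; everything else is an assumption made for illustration: the names `VisionConfig` and `loadVisionConfig`, falling back to the default when a numeric value is invalid, and using `auto` when CODEX_VISION_IMAGE_DETAIL is unset or unrecognized (the list above does not state a default for it).

```typescript
type Env = Record<string, string | undefined>;
type ImageDetail = "auto" | "low" | "high" | "original";

// Hypothetical config shape; field names are illustrative.
interface VisionConfig {
  model: string;
  baseUrl: string;
  authJsonPath: string | undefined;
  imageDetail: ImageDetail;
  timeoutMs: number;
  maxImageMb: number;
  maxVideoMb: number;
  videoFrames: number;
  logPath: string | undefined;
}

const IMAGE_DETAILS: readonly ImageDetail[] = ["auto", "low", "high", "original"];

// Assumption: unset, empty, non-numeric, or non-positive values use the default.
function positiveNumber(raw: string | undefined, fallback: number): number {
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

function loadVisionConfig(env: Env): VisionConfig {
  const detail = env["CODEX_VISION_IMAGE_DETAIL"];
  return {
    model: env["CODEX_VISION_MODEL"] || "gpt-5.5",
    baseUrl: env["CODEX_VISION_BASE_URL"] || "https://chatgpt.com/backend-api/codex",
    authJsonPath: env["CODEX_VISION_AUTH_JSON"] || undefined,
    // Assumption: "auto" when the variable is unset or not one of the four values.
    imageDetail: IMAGE_DETAILS.find((d) => d === detail) ?? "auto",
    timeoutMs: positiveNumber(env["CODEX_VISION_TIMEOUT_MS"], 300000),
    maxImageMb: positiveNumber(env["CODEX_VISION_MAX_IMAGE_MB"], 10),
    maxVideoMb: positiveNumber(env["CODEX_VISION_MAX_VIDEO_MB"], 50),
    videoFrames: positiveNumber(env["CODEX_VISION_VIDEO_FRAMES"], 4),
    logPath: env["CODEX_VISION_MCP_LOG_PATH"] || undefined,
  };
}
```

In a Node.js process this would be called as `loadVisionConfig(process.env)`.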
Platform Support
The server is implemented in Node.js and should run on macOS, Linux, and Windows with Node.js 22 or newer.
Cross-platform notes:
Local paths are resolved with Node path and os.homedir().
~/.codex/auth.json maps to the current user's home directory on each OS.
Auth file permissions are tightened with chmod only on non-Windows systems.
The optional Codex CLI version lookup falls back safely if codex is not on PATH.
analyze_video requires ffmpeg on PATH; install ffmpeg separately on each OS or extract frames manually and use analyze_image.
The source has been checked for obvious macOS-only assumptions. Windows and Linux should work, but both still need smoke testing on real machines.
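The platform-dependent permission handling noted above can be illustrated with a small guard around chmod. This is a sketch under assumptions: `tightenPermissions` is a hypothetical name, and the owner-only mode 0o600 is a guess, since the notes only say that permissions are tightened.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: restricts a file to its owner on POSIX systems and
// skips the call on Windows. Returns true when chmod was applied.
// Assumption: 0o600 (owner read/write only) is the intended mode.
function tightenPermissions(file: string, platform: string = process.platform): boolean {
  if (platform === "win32") return false;
  fs.chmodSync(file, 0o600);
  return true;
}

// Demo on a throwaway file.
const permDir = fs.mkdtempSync(path.join(os.tmpdir(), "codex-vision-perm-"));
const permFile = path.join(permDir, "auth.json");
fs.writeFileSync(permFile, "{}\n");
const applied = tightenPermissions(permFile);
```

Passing the platform as a parameter keeps the Windows branch testable on any OS.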
Development
npm run check
npm run build
Run the server directly:
node dist/index.js