Codex Vision MCP

Codex Vision MCP is a local MCP server that adds computer-vision tools to coding agents and models that do not support native image input. It uses existing Codex OAuth credentials and calls the ChatGPT/Codex Responses backend.

The main use case is running a strong text/code model, such as GLM 5.2, while still giving the agent a way to inspect screenshots, diagrams, UI mockups, chart images, and local image files through MCP tools.

Name

Use Codex Vision MCP.

The shorter name "Vision MCP" is too generic. The important differentiator is that this server uses Codex OAuth and the Codex Responses backend as the vision provider.

The package and executable names are:

  • package: codex-vision-mcp

  • command: codex-vision-mcp

  • suggested MCP server name: codex-vision

Tools

  • ui_to_artifact

  • extract_text_from_screenshot

  • diagnose_error_screenshot

  • understand_technical_diagram

  • analyze_data_visualization

  • ui_diff_check

  • analyze_image

  • analyze_video

Tool names are intentionally compatible with the Z.AI vision MCP tool shape so prompts and agent behavior can transfer easily.

How It Works

The MCP client calls one of the vision tools with a local file path, remote URL, or data:image/... URL. This server loads the image, sends it to the Codex Responses backend with Codex OAuth, and returns a text result to the calling agent.

This is useful when the active model is text-only or when the model provider rejects native image input.
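
The three accepted input forms can be told apart before any loading happens. The following is a minimal sketch of that step, not the server's actual code: the `ImageSource` type and the `classifyImageSource` helper are hypothetical names introduced here for illustration.

```typescript
// Hypothetical helper (not from the project source): classify the image
// reference a vision tool receives as a data URL, a remote URL, or a local path.
type ImageSource =
  | { kind: "data-url"; mimeType: string }
  | { kind: "remote-url"; url: string }
  | { kind: "local-path"; path: string };

function classifyImageSource(input: string): ImageSource {
  const trimmed = input.trim();

  // data:image/<subtype> followed by ";" (parameters) or "," (payload).
  const mimeType = /^data:(image\/[a-z0-9.+-]+)[;,]/i.exec(trimmed)?.[1];
  if (mimeType !== undefined) {
    return { kind: "data-url", mimeType: mimeType.toLowerCase() };
  }

  // Only http and https are treated as remote in this sketch.
  if (/^https?:\/\//i.test(trimmed)) {
    return { kind: "remote-url", url: trimmed };
  }

  // Everything else is assumed to be a path on the local filesystem.
  return { kind: "local-path", path: trimmed };
}

console.log(classifyImageSource("./demo.png").kind);
```

Classifying first keeps the loading logic for each form separate, so size limits and error messages can be specific to files, downloads, or inline data.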

Current Limitation

In Claude Code, pasting an image directly into the client does not automatically call this MCP server. Claude Code transcodes pasted images and sends them directly to the selected model. If the selected model does not support image input, the model call can fail before MCP tools are involved.

For now, use a local file path and ask the agent to inspect that file:

What does ./demo.png describe?

or:

Use codex-vision to analyze /absolute/path/to/demo.png

A future Claude Code integration can intercept pasted images before the model call, route them through this MCP server, and continue with a text-only prompt.

Install From Source

npm install
npm run build
node dist/index.js --smoke

--smoke checks Codex OAuth discovery and prints redacted auth metadata.

Add To Claude Code

Project-local registration is recommended. It keeps this vision server enabled only for the repository that needs it.

cd /path/to/your/project
claude mcp add --scope local codex-vision -- npx -y codex-vision-mcp

For local development from this checkout:

cd /path/to/your/project
claude mcp add --scope local codex-vision -- node /absolute/path/to/vision_mcp/dist/index.js

On Windows PowerShell:

cd C:\path\to\your\project
claude mcp add --scope local codex-vision -- npx -y codex-vision-mcp

Auth

Auth discovery checks explicit environment overrides first, then the local Codex auth files, in this order:

  1. CODEX_VISION_AUTH_JSON

  2. CODEX_IMAGEN_AUTH_JSON

  3. CODEX_AUTH_JSON

  4. CODEX_HOME/auth.json

  5. ~/.codex/auth.json
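
The order above can be sketched as a function that returns candidate paths from highest to lowest priority. This is an illustration under assumptions: `authCandidates` is a hypothetical name, and the real server may also check that each file exists or parses before moving on.

```typescript
import * as path from "node:path";

// Hypothetical helper: list candidate auth.json paths in the documented order.
// Unset or empty environment variables are skipped.
function authCandidates(
  env: Record<string, string | undefined>,
  homeDir: string,
): string[] {
  const candidates: string[] = [];

  // 1-3: explicit overrides that point directly at an auth JSON file.
  for (const name of [
    "CODEX_VISION_AUTH_JSON",
    "CODEX_IMAGEN_AUTH_JSON",
    "CODEX_AUTH_JSON",
  ]) {
    const value = env[name];
    if (value) candidates.push(value);
  }

  // 4: auth.json inside CODEX_HOME, when that variable is set.
  const codexHome = env["CODEX_HOME"];
  if (codexHome) candidates.push(path.join(codexHome, "auth.json"));

  // 5: the default location in the user's home directory.
  candidates.push(path.join(homeDir, ".codex", "auth.json"));

  return candidates;
}

console.log(authCandidates({ CODEX_HOME: "/opt/codex" }, "/home/dev"));
```

A caller would typically pass `process.env` and `os.homedir()` and use the first candidate that can be read.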

The server refreshes near-expiry OAuth tokens through https://auth.openai.com/oauth/token and writes them back atomically.
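
An atomic write-back is commonly done by writing a temporary file next to the target and renaming it into place. The sketch below shows that general pattern as an assumption about what "atomically" means here; `writeJsonAtomically` is a hypothetical name and the project's own implementation may differ. The example writes to a scratch directory, never to a real auth file.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: write JSON to a temp file in the same directory, then
// rename it over the target so readers never observe a half-written file.
function writeJsonAtomically(targetPath: string, data: unknown): void {
  const dir = path.dirname(targetPath);
  const tmpPath = path.join(
    dir,
    `.${path.basename(targetPath)}.${process.pid}.tmp`,
  );

  fs.writeFileSync(tmpPath, JSON.stringify(data, null, 2), { mode: 0o600 });

  // Tighten permissions only where POSIX modes apply.
  if (process.platform !== "win32") {
    fs.chmodSync(tmpPath, 0o600);
  }

  fs.renameSync(tmpPath, targetPath);
}

// Usage: a throwaway directory stands in for the Codex home directory.
const scratchDir = fs.mkdtempSync(path.join(os.tmpdir(), "codex-vision-"));
const scratchFile = path.join(scratchDir, "auth.json");
writeJsonAtomically(scratchFile, { example: true });
console.log(fs.readFileSync(scratchFile, "utf8"));
```

Keeping the temporary file in the same directory matters because a rename is only a single filesystem operation when source and target are on the same volume.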

Environment

  • CODEX_VISION_MODEL: Responses model, default gpt-5.5

  • CODEX_VISION_BASE_URL: default https://chatgpt.com/backend-api/codex

  • CODEX_VISION_AUTH_JSON: explicit auth JSON path

  • CODEX_VISION_IMAGE_DETAIL: auto, low, high, or original

  • CODEX_VISION_TIMEOUT_MS: default 300000

  • CODEX_VISION_MAX_IMAGE_MB: default 10

  • CODEX_VISION_MAX_VIDEO_MB: default 50

  • CODEX_VISION_VIDEO_FRAMES: default 4

  • CODEX_VISION_MCP_LOG_PATH: optional log file path
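
As an illustration, these variables could be read into a typed config object with the documented defaults. This is a sketch, not the server's code: `loadVisionConfig` and `positiveNumber` are hypothetical names, and falling back to the default on an unparseable number is an assumption. No default is documented for CODEX_VISION_IMAGE_DETAIL, so the sketch leaves it undefined when unset.

```typescript
type Env = Record<string, string | undefined>;

const IMAGE_DETAILS = ["auto", "low", "high", "original"] as const;
type ImageDetail = (typeof IMAGE_DETAILS)[number];

// Parse a positive number, falling back when the value is missing or invalid.
function positiveNumber(raw: string | undefined, fallback: number): number {
  const parsed = Number(raw);
  return raw !== undefined && Number.isFinite(parsed) && parsed > 0
    ? parsed
    : fallback;
}

// Hypothetical loader that applies the defaults listed above.
function loadVisionConfig(env: Env) {
  // Accept only the four documented values; anything else becomes undefined.
  const imageDetail: ImageDetail | undefined = IMAGE_DETAILS.find(
    (detail) => detail === env["CODEX_VISION_IMAGE_DETAIL"],
  );

  return {
    model: env["CODEX_VISION_MODEL"] || "gpt-5.5",
    baseUrl:
      env["CODEX_VISION_BASE_URL"] || "https://chatgpt.com/backend-api/codex",
    authJsonPath: env["CODEX_VISION_AUTH_JSON"],
    imageDetail,
    timeoutMs: positiveNumber(env["CODEX_VISION_TIMEOUT_MS"], 300000),
    maxImageMb: positiveNumber(env["CODEX_VISION_MAX_IMAGE_MB"], 10),
    maxVideoMb: positiveNumber(env["CODEX_VISION_MAX_VIDEO_MB"], 50),
    videoFrames: positiveNumber(env["CODEX_VISION_VIDEO_FRAMES"], 4),
    logPath: env["CODEX_VISION_MCP_LOG_PATH"],
  };
}

console.log(loadVisionConfig({ CODEX_VISION_TIMEOUT_MS: "60000" }));
```

Passing the environment in as an argument, rather than reading `process.env` inside the function, keeps the loader easy to test.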

Platform Support

The server is implemented in Node.js and should run on macOS, Linux, and Windows with Node.js 22 or newer.

Cross-platform notes:

  • Local paths are resolved with Node path and os.homedir().

  • ~/.codex/auth.json maps to the current user's home directory on each OS.

  • Auth file permissions are tightened with chmod only on non-Windows systems.

  • The optional Codex CLI version lookup falls back safely if codex is not on PATH.

  • analyze_video requires ffmpeg on PATH; install ffmpeg separately on each OS or extract frames manually and use analyze_image.
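
The first two notes can be illustrated with a small path helper built on Node `path` and `os.homedir()`. This is a sketch under assumptions: `resolveLocalPath` is a hypothetical name, and the server may handle `~` differently or not at all for tool inputs.

```typescript
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical helper: expand a leading "~" to the current user's home
// directory and resolve everything else against a working directory.
function resolveLocalPath(input: string, cwd: string = process.cwd()): string {
  if (input === "~") {
    return os.homedir();
  }
  // Accept both separators so "~\\.codex" works on Windows as well.
  if (input.startsWith("~/") || input.startsWith("~\\")) {
    return path.join(os.homedir(), input.slice(2));
  }
  // path.resolve returns absolute inputs unchanged (normalized).
  return path.resolve(cwd, input);
}

console.log(resolveLocalPath("~/.codex/auth.json"));
```

Because `path.join` and `path.resolve` use the host platform's separator, the same call yields a backslash path on Windows and a forward-slash path on macOS and Linux.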

The source has been checked for obvious macOS-only assumptions. Windows and Linux should work, but both still need smoke testing on real machines.

Development

npm run check
npm run build

Run the server directly:

node dist/index.js