Codex Vision MCP
Server Configuration
Environment variables used to configure the server. All of them are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| CODEX_VISION_MODEL | No | Responses model to use for vision analysis. | gpt-5.5 |
| CODEX_VISION_BASE_URL | No | Base URL for the Codex Responses backend. | https://chatgpt.com/backend-api/codex |
| CODEX_VISION_AUTH_JSON | No | Explicit path to a JSON file containing OAuth credentials. If not set, the server attempts to discover auth from other sources. | |
| CODEX_VISION_TIMEOUT_MS | No | Timeout in milliseconds for API requests. | 300000 |
| CODEX_VISION_IMAGE_DETAIL | No | Image detail level: auto, low, high, or original. | auto |
| CODEX_VISION_MAX_IMAGE_MB | No | Maximum image file size in MB. | 10 |
| CODEX_VISION_MAX_VIDEO_MB | No | Maximum video file size in MB. | 50 |
| CODEX_VISION_MCP_LOG_PATH | No | Optional path to a log file for the MCP server. | |
| CODEX_VISION_VIDEO_FRAMES | No | Number of frames to extract from video. | 4 |
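
As a rough illustration of how the defaults in the table above combine with overrides, the following Python sketch resolves the variables from an environment mapping. The `load_config` helper and its validation rules are hypothetical and are not taken from the server's source. In particular, parsing the timeout and frame count as integers and the size limits as floats is an assumption.

```python
import os

# Defaults copied from the table above. The loader itself is a hypothetical
# sketch, not the server's actual implementation.
DEFAULTS = {
    "CODEX_VISION_MODEL": "gpt-5.5",
    "CODEX_VISION_BASE_URL": "https://chatgpt.com/backend-api/codex",
    "CODEX_VISION_TIMEOUT_MS": "300000",
    "CODEX_VISION_IMAGE_DETAIL": "auto",
    "CODEX_VISION_MAX_IMAGE_MB": "10",
    "CODEX_VISION_MAX_VIDEO_MB": "50",
    "CODEX_VISION_VIDEO_FRAMES": "4",
}

# Variables with no documented default: they stay None when unset.
OPTIONAL_PATHS = ("CODEX_VISION_AUTH_JSON", "CODEX_VISION_MCP_LOG_PATH")

IMAGE_DETAIL_LEVELS = {"auto", "low", "high", "original"}

# Assumed numeric types; the table does not state them.
INT_KEYS = ("CODEX_VISION_TIMEOUT_MS", "CODEX_VISION_VIDEO_FRAMES")
FLOAT_KEYS = ("CODEX_VISION_MAX_IMAGE_MB", "CODEX_VISION_MAX_VIDEO_MB")


def load_config(env=None):
    """Resolve settings from `env` (defaults to os.environ)."""
    env = os.environ if env is None else env
    config = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    for name in OPTIONAL_PATHS:
        config[name] = env.get(name)

    detail = config["CODEX_VISION_IMAGE_DETAIL"]
    if detail not in IMAGE_DETAIL_LEVELS:
        raise ValueError(f"unsupported image detail level: {detail!r}")

    for name in INT_KEYS:
        config[name] = int(config[name])
    for name in FLOAT_KEYS:
        config[name] = float(config[name])
    return config


if __name__ == "__main__":
    # Override two settings and leave the rest at their documented defaults.
    overrides = {"CODEX_VISION_IMAGE_DETAIL": "high", "CODEX_VISION_VIDEO_FRAMES": "8"}
    for key, value in sorted(load_config(overrides).items()):
        print(f"{key}={value}")
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the sketch easy to exercise without touching the real process environment.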
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": true } |
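
In the Model Context Protocol, a server that advertises `listChanged` for tools may send a `notifications/tools/list_changed` notification, and a client typically responds by requesting `tools/list` again. The Python sketch below shows that reaction on the client side. The `on_notification` helper is hypothetical and only builds the JSON-RPC message; transport and response handling are left out.

```python
def on_notification(message, next_id):
    """Return a tools/list request if the server says its tool list changed.

    `message` is a decoded JSON-RPC notification; `next_id` is the request id
    the (hypothetical) client wants to use for the follow-up request.
    """
    if message.get("method") == "notifications/tools/list_changed":
        return {"jsonrpc": "2.0", "id": next_id, "method": "tools/list"}
    # Any other notification needs no tool refresh in this sketch.
    return None


if __name__ == "__main__":
    changed = {"jsonrpc": "2.0", "method": "notifications/tools/list_changed"}
    print(on_notification(changed, next_id=7))
```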
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| ui_to_artifact | Convert UI screenshots into artifacts: frontend code, AI prompts, design specifications, or natural-language descriptions. Use only for UI screenshots. Do not use for OCR-only screenshots, error messages, diagrams, or charts. |
| extract_text_from_screenshot | Extract and recognize text from screenshots, including code, terminal output, logs, documentation, and UI copy. Use when the user needs faithful OCR or text reconstruction from an image. |
| diagnose_error_screenshot | Diagnose screenshots of error messages, stack traces, exception dialogs, or failed command output. Returns likely cause, actionable fixes, and prevention notes where possible. |
| understand_technical_diagram | Analyze technical diagrams including architecture diagrams, flowcharts, UML, ER diagrams, network diagrams, and sequence diagrams. |
| analyze_data_visualization | Analyze charts, graphs, dashboards, and other data visualizations to extract metrics, trends, anomalies, and recommendations. |
| ui_diff_check | Compare an expected/reference UI screenshot with an actual/current UI screenshot and report visual implementation differences. |
| analyze_image | General-purpose image analysis for cases not covered by the specialized tools. Use as a fallback for flexible visual understanding. |
| analyze_video | Analyze a local or remote video by extracting sampled frames with ffmpeg and sending them to Codex vision analysis. Requires ffmpeg on PATH. For precise video work, extract important frames manually and use analyze_image or ui_diff_check. |
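
The descriptions above imply a routing rule: pick the specialized tool when the input matches it, and fall back to analyze_image otherwise. The Python sketch below encodes that rule and wraps the choice in an MCP `tools/call` JSON-RPC request. The task labels (`"ocr"`, `"chart"`, and so on) and the argument names in the usage example are assumptions made for illustration; the real input schema of each tool is not shown on this page.

```python
import json

# Routing table derived from the tool descriptions above.
# The task labels on the left are hypothetical names chosen for this sketch.
TOOL_FOR_TASK = {
    "ui_screenshot": "ui_to_artifact",
    "ocr": "extract_text_from_screenshot",
    "error": "diagnose_error_screenshot",
    "diagram": "understand_technical_diagram",
    "chart": "analyze_data_visualization",
    "ui_diff": "ui_diff_check",
    "video": "analyze_video",
}


def pick_tool(task_kind):
    """Return the specialized tool for a task, or the general fallback."""
    return TOOL_FOR_TASK.get(task_kind, "analyze_image")


def build_tool_call(request_id, task_kind, arguments):
    """Build an MCP tools/call request for the tool matching `task_kind`."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": pick_tool(task_kind), "arguments": arguments},
    }


if __name__ == "__main__":
    # "image_source" and "prompt" are assumed argument names, not the
    # documented schema of the tool.
    request = build_tool_call(
        1, "ocr", {"image_source": "./terminal.png", "prompt": "Extract the log text"}
    )
    print(json.dumps(request, indent=2))
```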
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/darkamenosa/codex-vision-mcp'