opencode-openai-vision-mcp
Provides vision capabilities by sending images to OpenAI-compatible vision models for analysis and description.
A tiny MCP server that gives OpenCode (and any other MCP client) working image/vision support through any OpenAI-compatible endpoint — for example an OmniRoute or LiteLLM gateway, or OpenAI itself.
It reads an image file from disk, sends it to a vision-capable model as an
image_url content block, and returns the model's text description.
Why this exists
OpenCode currently cannot send image attachments to vision models served through a
custom OpenAI-compatible provider (@ai-sdk/openai-compatible). The attachment is
dropped before it reaches the model, and the assistant replies with something like:
this model does not support image input
This is an OpenCode adapter bug, tracked upstream in
anomalyco/opencode#20802
(fix PRs #26826 /
#21627 were not merged at the time
of writing). The underlying model and gateway are usually fine — a direct
/chat/completions call with image_url works; only OpenCode's conversion is broken.
This project is a workaround: instead of relying on OpenCode's native image path, it routes images through an MCP tool that you fully control.
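To check that the gateway and model accept images independently of OpenCode, a direct call can be sketched roughly as below. This is a minimal illustration, not this project's code: `askVisionModel` and its option names are hypothetical, the request body follows the standard OpenAI chat-completions shape, and `fetchImpl` is only there so the function can be exercised without a network.

```javascript
// Sketch: send one image to an OpenAI-compatible /chat/completions endpoint.
// askVisionModel and its option names are illustrative, not this project's API.
async function askVisionModel({ baseUrl, apiKey, model, imageDataUrl, question, fetchImpl = fetch }) {
  const headers = { "Content-Type": "application/json" };
  // Only send a bearer token when a key is actually configured.
  if (apiKey) headers.Authorization = `Bearer ${apiKey}`;

  // Strip trailing slashes so ".../v1" and ".../v1/" both work.
  const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
  const response = await fetchImpl(url, {
    method: "POST",
    headers,
    body: JSON.stringify({
      model,
      stream: false,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: question },
            { type: "image_url", image_url: { url: imageDataUrl } },
          ],
        },
      ],
    }),
  });
  if (!response.ok) {
    throw new Error(`Vision endpoint returned HTTP ${response.status}`);
  }
  const data = await response.json();
  return data.choices[0].message.content;
}
```

If a call like this returns a sensible description while OpenCode still reports that the model does not support image input, the problem is on the OpenCode side rather than in the gateway.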
How it works
OpenCode (you paste a screenshot)
  └─ opencode-vision plugin            intercepts the image inside OpenCode,
     (separate project, see below)     saves it to a temp file, and rewrites the
                                       message to "call local_vision with this path"
       └─ local_vision (THIS server)   reads the file, base64-encodes it, and POSTs
                                       it as image_url to an OpenAI-compatible endpoint
            └─ your gateway/model      e.g. OmniRoute -> any vision model
                                       returns a text description back to OpenCode

This is a describe-then-reason approach: a vision model looks at the image and returns text; your main model works from that text. It is not native multimodality, but it restores the "paste a screenshot and ask" workflow.
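The "reads the file, base64-encodes it" step can be sketched as follows. The helper names (`mimeTypeFor`, `toImageUrlBlock`) and the extension table are assumptions made for illustration; the actual server.js may detect the type and build the block differently.

```javascript
// Hypothetical helpers for the "base64-encode and wrap as image_url" step.
// The extension-to-MIME table is an assumption covering PNG/JPEG/WebP/GIF.
const MIME_BY_EXTENSION = new Map([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
  ["webp", "image/webp"],
  ["gif", "image/gif"],
]);

// Derive a MIME type from the file extension; throw for anything unsupported.
function mimeTypeFor(filePath) {
  const dot = filePath.lastIndexOf(".");
  const ext = dot === -1 ? "" : filePath.slice(dot + 1).toLowerCase();
  const mime = MIME_BY_EXTENSION.get(ext);
  if (!mime) throw new Error(`Unsupported image type: ${filePath}`);
  return mime;
}

// Turn raw image bytes into an image_url content block carrying a data URL.
function toImageUrlBlock(imageBytes, mimeType) {
  const base64 = Buffer.from(imageBytes).toString("base64");
  return { type: "image_url", image_url: { url: `data:${mimeType};base64,${base64}` } };
}
```

In a real server the bytes would come from something like `fs.readFileSync(path)`, with the resulting block placed in the user message next to the text question.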
Prerequisites
Node.js >= 18
A vision-capable model reachable via an OpenAI-compatible /chat/completions endpoint (OmniRoute, LiteLLM, OpenAI, etc.).
The opencode-vision plugin (AGPL-3.0): this is what intercepts the pasted image inside OpenCode and calls the local_vision tool. This MCP server is the tool it calls. You need both.
Install
git clone https://github.com/WormAlien/opencode-openai-vision-mcp.git
cd opencode-openai-vision-mcp
npm install
Quick self-test (optional). Point it at your gateway and it should print a color word:
VISION_BASE_URL=http://localhost:20128/v1 \
VISION_API_KEY=your-key \
VISION_MODEL=your-vision-model \
node server.js
# then drive it with any MCP client, or just confirm it starts without error
Configure OpenCode
Add the MCP server to your opencode.json (see opencode.example.json):
{
  "mcp": {
    "local": {
      "type": "local",
      "command": ["node", "/absolute/path/to/opencode-openai-vision-mcp/server.js"],
      "enabled": true,
      "environment": {
        "VISION_BASE_URL": "http://localhost:20128/v1",
        "VISION_API_KEY": "YOUR_GATEWAY_API_KEY",
        "VISION_MODEL": "your-vision-model-or-alias"
      }
    }
  }
}
Naming the server local makes its tool resolve to local_vision, which matches the default imageAnalysisTool of the opencode-vision plugin, so no extra wiring is needed.
Then enable the plugin for your models via opencode-vision.json
(see opencode-vision.example.json):
{
  "models": ["your-provider/*"],
  "imageAnalysisTool": "local_vision"
}
Restart OpenCode, select a model under that provider, paste an image, and ask away.
Environment variables
| Variable | Default | Description |
| VISION_BASE_URL | … | OpenAI-compatible base URL (must end in …) |
| VISION_API_KEY | (empty) | Bearer token for the endpoint. Omitted if empty. |
| VISION_MODEL | … | Vision model name, or a gateway alias. |
| … | … | Max tokens for the description. |
Tip: point VISION_MODEL at a gateway alias (e.g. a vision alias in OmniRoute).
Then you can swap the real model in your gateway dashboard without editing any config.
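As a rough illustration of how these variables could be turned into a request target, consider the sketch below. `readVisionConfig` is a hypothetical helper, and unlike the real server it assumes no defaults: it simply requires the base URL and model to be set. The one behavior taken from the table is that the Authorization header is omitted when the key is empty.

```javascript
// Sketch of reading the VISION_* variables; readVisionConfig is hypothetical.
// Assumption: no defaults are applied here, unlike the real server.
function readVisionConfig(env) {
  // Tolerate a trailing slash on the base URL.
  const baseUrl = (env.VISION_BASE_URL || "").replace(/\/+$/, "");
  const model = env.VISION_MODEL || "";
  if (!baseUrl || !model) {
    throw new Error("VISION_BASE_URL and VISION_MODEL must be set");
  }
  const headers = { "Content-Type": "application/json" };
  // An empty key means "no auth": leave the Authorization header out entirely.
  if (env.VISION_API_KEY) headers.Authorization = `Bearer ${env.VISION_API_KEY}`;
  return { endpoint: `${baseUrl}/chat/completions`, model, headers };
}
```

Called as `readVisionConfig(process.env)`, this yields the endpoint, model name, and headers for a single request.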
The vision tool
Input: path (absolute path to a PNG/JPEG/WebP/GIF), optional question.
Output: a text description (transcribes visible text by default).
Handles both plain JSON and SSE/streamed responses from the endpoint.
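One way to accept both response styles is sketched below. It assumes the endpoint returns either a single chat-completion JSON object or an SSE stream of `data:` lines carrying `delta.content` chunks and ending with `[DONE]`. `extractText` is a hypothetical name, and the real server may parse responses differently.

```javascript
// Hypothetical helper: pull the assistant text out of either a plain JSON body
// or an SSE stream of chat-completion chunks.
function extractText(rawBody) {
  const trimmed = rawBody.trim();
  if (trimmed.startsWith("{")) {
    // Plain JSON: one chat-completion object with the full message.
    return JSON.parse(trimmed).choices?.[0]?.message?.content ?? "";
  }
  // Otherwise treat the body as SSE and concatenate the streamed deltas.
  let text = "";
  for (const line of trimmed.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines, comments, other fields
    const payload = line.slice("data:".length).trim();
    if (payload === "" || payload === "[DONE]") continue;
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}
```

Buffering the whole body before parsing keeps the sketch simple; a streaming parser would be needed only if partial output had to be forwarded as it arrives.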
Notes / limitations
The image must be readable on the same machine the server runs on (it reads from the local filesystem path the plugin saved).
Quality depends entirely on the vision model you point it at.
Once OpenCode merges native image support for openai-compatible providers (#20802), you may not need this.
Credits
opencode-vision (AGPL-3.0) — the OpenCode plugin that intercepts images and calls the MCP tool. A separate project; not bundled here.
Model Context Protocol and its TypeScript SDK.
License
MIT — see LICENSE. (This applies to this MCP server only; the opencode-vision plugin has its own AGPL-3.0 license.)