opencode-openai-vision-mcp

by WormAlien

A tiny MCP server that gives OpenCode (and any other MCP client) working image/vision support through any OpenAI-compatible endpoint — for example an OmniRoute or LiteLLM gateway, or OpenAI itself.

It reads an image file from disk, sends it to a vision-capable model as an image_url content block, and returns the model's text description.

Why this exists

OpenCode currently cannot send image attachments to vision models served through a custom OpenAI-compatible provider (@ai-sdk/openai-compatible). The attachment is dropped before it reaches the model, and the assistant replies with something like:

this model does not support image input

This is an OpenCode adapter bug, tracked upstream in anomalyco/opencode#20802 (fix PRs #26826 / #21627 were not merged at the time of writing). The underlying model and gateway are usually fine — a direct /chat/completions call with image_url works; only OpenCode's conversion is broken.
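To confirm that the gateway and model are fine independently of OpenCode, a direct call can be sketched roughly as follows. This is a minimal sketch, not code from this project: `describeImageDirect` and its parameter names are hypothetical, the image is assumed to be a PNG, and the response is assumed to follow the usual non-streaming OpenAI chat completion shape. It relies on the global `fetch` available in Node.js 18 and later.

```javascript
const fs = require("node:fs/promises");

// Hypothetical helper: send one image straight to /chat/completions and
// return the reply text. Assumes a PNG file and a non-streaming JSON reply.
async function describeImageDirect({ baseUrl, apiKey, model, imagePath }, fetchImpl = fetch) {
  const base64 = (await fs.readFile(imagePath)).toString("base64");

  const headers = { "Content-Type": "application/json" };
  if (apiKey) headers.Authorization = `Bearer ${apiKey}`; // skipped when no key is set

  const url = `${baseUrl.replace(/\/+$/, "")}/chat/completions`;
  const res = await fetchImpl(url, {
    method: "POST",
    headers,
    body: JSON.stringify({
      model,
      max_tokens: 256,
      messages: [
        {
          role: "user",
          content: [
            { type: "text", text: "Describe this image briefly." },
            { type: "image_url", image_url: { url: `data:image/png;base64,${base64}` } },
          ],
        },
      ],
    }),
  });

  if (!res.ok) throw new Error(`Gateway returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

If a call like `describeImageDirect({ baseUrl: "http://localhost:20128/v1", apiKey: "your-key", model: "your-vision-model", imagePath: "/tmp/shot.png" })` returns a sensible description, the problem lies in OpenCode's conversion rather than in the gateway.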

This project is a workaround: instead of relying on OpenCode's native image path, it routes images through an MCP tool that you fully control.

How it works

OpenCode (you paste a screenshot)
  └─ opencode-vision plugin            intercepts the image inside OpenCode,
     (separate project, see below)     saves it to a temp file, and rewrites the
                                        message to "call local_vision with this path"
        └─ local_vision (THIS server)  reads the file, base64-encodes it, and POSTs
                                        it as image_url to an OpenAI-compatible endpoint
              └─ your gateway/model     e.g. OmniRoute -> any vision model
                                        returns a text description back to OpenCode

This is a describe-then-reason approach: a vision model looks at the image and returns text; your main model works from that text. It is not native multimodality, but it restores the "paste a screenshot and ask" workflow.
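The "reads the file, base64-encodes it" step in the diagram can be illustrated with a short sketch. It is not the actual server.js: `MIME_TYPES`, `imageContentBlock`, `buildMessages`, and the default question text are all hypothetical names and values chosen for illustration. The extension-to-MIME mapping is an assumption based on the formats this README lists.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Assumed mapping from file extension to MIME type for the supported formats.
const MIME_TYPES = {
  ".png": "image/png",
  ".jpg": "image/jpeg",
  ".jpeg": "image/jpeg",
  ".webp": "image/webp",
  ".gif": "image/gif",
};

// Hypothetical helper: turn a file on disk into an image_url content block
// whose URL is a base64 data URL.
function imageContentBlock(imagePath) {
  const mime = MIME_TYPES[path.extname(imagePath).toLowerCase()];
  if (!mime) throw new Error(`Unsupported image type: ${imagePath}`);
  const base64 = fs.readFileSync(imagePath).toString("base64");
  return { type: "image_url", image_url: { url: `data:${mime};base64,${base64}` } };
}

// Hypothetical helper: pair the question (or an assumed default prompt)
// with the image block in a single user message.
function buildMessages(imagePath, question = "Describe this image and transcribe any visible text.") {
  return [
    {
      role: "user",
      content: [{ type: "text", text: question }, imageContentBlock(imagePath)],
    },
  ];
}
```

Embedding the image as a data URL means the gateway never needs access to the local filesystem, which is why only the server has to be able to read the temp file.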

Prerequisites

  1. Node.js >= 18

  2. A vision-capable model reachable via an OpenAI-compatible /chat/completions endpoint (OmniRoute, LiteLLM, OpenAI, etc.).

  3. The opencode-vision plugin (AGPL-3.0) — this is what intercepts the pasted image inside OpenCode and calls the local_vision tool. This MCP server is the tool it calls. You need both.

Install

git clone https://github.com/WormAlien/opencode-openai-vision-mcp.git
cd opencode-openai-vision-mcp
npm install

Quick self-test (optional): start the server against your gateway and confirm that it launches without error:

VISION_BASE_URL=http://localhost:20128/v1 \
VISION_API_KEY=your-key \
VISION_MODEL=your-vision-model \
node server.js
# then drive it with any MCP client, or just confirm it starts without error

Configure OpenCode

Add the MCP server to your opencode.json (see opencode.example.json):

{
  "mcp": {
    "local": {
      "type": "local",
      "command": ["node", "/absolute/path/to/opencode-openai-vision-mcp/server.js"],
      "enabled": true,
      "environment": {
        "VISION_BASE_URL": "http://localhost:20128/v1",
        "VISION_API_KEY": "YOUR_GATEWAY_API_KEY",
        "VISION_MODEL": "your-vision-model-or-alias"
      }
    }
  }
}

Naming the server local makes its vision tool resolve to local_vision, which matches the default imageAnalysisTool of the opencode-vision plugin, so no extra wiring is needed.

Then enable the plugin for your models via opencode-vision.json (see opencode-vision.example.json):

{
  "models": ["your-provider/*"],
  "imageAnalysisTool": "local_vision"
}

Restart OpenCode, select a model under that provider, paste an image, and ask away.
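If the plugin never calls the tool, a quick consistency check between the two config files may help. The sketch below assumes that OpenCode exposes an MCP tool as "<server name>_<tool name>", which is the behavior this README relies on for local_vision. `expectedToolNames` and `checkWiring` are hypothetical helpers, not part of OpenCode or the plugin.

```javascript
// Assumption: a tool is exposed as "<server name>_<tool name>", so a server
// called "local" that offers a tool called "vision" yields "local_vision".
function expectedToolNames(opencodeConfig, toolName = "vision") {
  return Object.keys(opencodeConfig.mcp ?? {}).map((server) => `${server}_${toolName}`);
}

// Returns true when the plugin's imageAnalysisTool matches one of the
// tool names derived from the configured MCP servers.
function checkWiring(opencodeConfig, pluginConfig) {
  return expectedToolNames(opencodeConfig).includes(pluginConfig.imageAnalysisTool);
}

// Example with the two configs from this section, passed as parsed objects.
const opencodeConfig = { mcp: { local: { type: "local", enabled: true } } };
const pluginConfig = { models: ["your-provider/*"], imageAnalysisTool: "local_vision" };
console.log(checkWiring(opencodeConfig, pluginConfig));
```

Renaming the server key (for example to vision) would require changing imageAnalysisTool to the matching prefixed name.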

Environment variables

| Variable | Default | Description |
| --- | --- | --- |
| VISION_BASE_URL | http://localhost:20128/v1 | OpenAI-compatible base URL (must end in /v1). |
| VISION_API_KEY | (empty) | Bearer token for the endpoint. Omitted if empty. |
| VISION_MODEL | gpt-4o | Vision model name, or a gateway alias. |
| VISION_MAX_TOKENS | 1024 | Max tokens for the description. |

Tip: point VISION_MODEL at a gateway alias (e.g. a vision alias in OmniRoute). Then you can swap the real model in your gateway dashboard without editing any config.
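The defaults listed here could be resolved along these lines. This is only a sketch of the documented behavior, not the project's actual code: `loadVisionConfig` and `authHeaders` are hypothetical names, and falling back to 1024 on an invalid VISION_MAX_TOKENS is an assumption.

```javascript
// Hypothetical loader: read the VISION_* variables and apply the defaults
// documented in this README.
function loadVisionConfig(env = process.env) {
  const parsed = Number.parseInt(env.VISION_MAX_TOKENS || "1024", 10);
  return {
    baseUrl: env.VISION_BASE_URL || "http://localhost:20128/v1",
    apiKey: env.VISION_API_KEY || "",
    model: env.VISION_MODEL || "gpt-4o",
    // Assumption: an unparsable or non-positive value falls back to 1024.
    maxTokens: Number.isFinite(parsed) && parsed > 0 ? parsed : 1024,
  };
}

// The Authorization header is only added when a key is configured,
// matching "Omitted if empty" above.
function authHeaders(config) {
  const headers = { "Content-Type": "application/json" };
  if (config.apiKey) headers.Authorization = `Bearer ${config.apiKey}`;
  return headers;
}
```

Passing the environment in as a parameter keeps the loader easy to exercise with a plain object instead of the real process environment.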

The vision tool

  • Input: path (absolute path to a PNG/JPEG/WebP/GIF), optional question.

  • Output: a text description (transcribes visible text by default).

  • Handles both plain JSON and SSE/streamed responses from the endpoint.
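Handling both response styles might look like the sketch below. It assumes the common OpenAI-compatible shapes: a single JSON object with `choices[0].message.content`, or an SSE stream of `data:` lines carrying `choices[0].delta.content` and ending in `[DONE]`. `extractText` is a hypothetical helper, and the real server may detect and parse responses differently.

```javascript
// Hypothetical helper: extract the assistant text from a raw response body
// that is either one JSON object or an SSE stream of "data: {...}" lines.
function extractText(rawBody) {
  const trimmed = rawBody.trim();

  // Plain JSON response: the full text sits in message.content.
  if (trimmed.startsWith("{")) {
    const data = JSON.parse(trimmed);
    return data.choices?.[0]?.message?.content ?? "";
  }

  // SSE response: concatenate the delta fragments from each data line.
  let text = "";
  for (const line of trimmed.split(/\r?\n/)) {
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice(5).trim();
    if (payload === "" || payload === "[DONE]") continue;
    const chunk = JSON.parse(payload);
    text += chunk.choices?.[0]?.delta?.content ?? "";
  }
  return text;
}
```

Buffering the whole body before parsing keeps the logic simple. Because the tool returns a single text result, incremental streaming to the client is not needed here.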

Notes / limitations

  • The image must be readable on the same machine the server runs on (it reads from the local filesystem path the plugin saved).

  • Quality depends entirely on the vision model you point it at.

  • Once OpenCode merges native image support for openai-compatible providers (#20802), you may not need this.

Credits

License

MIT — see LICENSE. (This applies to this MCP server only; the opencode-vision plugin has its own AGPL-3.0 license.)
