low-hallucination-vision
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@low-hallucination-visionanalyze this image of a receipt and extract the total amount"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Low-Hallucination Vision Toolkit
A drop-in replacement for high-hallucination vision MCPs (like the default
analyze_image), built on top of your own OpenAI-compatible multimodal model
(mimo v2.5). Two pieces that work together:
vision-mcp/ ← MCP server (Python). The engine. Plug into any agent.
vision-skill/ ← Skill (SKILL.md). The cross-verification workflow. Plug into ZCode.Why this is lower-hallucination than a generic VLM call
It's not magic — it's five boring disciplines, all in the MCP layer:
Mode-routed prompts — UI / general / OCR / detect each get a tightly scoped system prompt instead of one "describe everything" prompt.
Forced structured JSON — every claim is an object with
confidence.Low temperature (0.2 default) — less creative completion.
"Allowed to be ignorant" — prompts explicitly forbid common-sense completion of details not actually visible.
Confidence gating — the MCP reflags any claim below threshold as
"_flag": "存疑", so the agent can't accidentally report it as fact.
The Skill adds a sixth layer on top: cross-verification (run two independent modes and only trust claims both agree on).
Related MCP server: image-tiler-mcp-server
Setup (uv-managed environment)
This project uses uv for environment management.
uv creates an isolated .venv per project and pins the Python version, so
nothing pollutes your global Python. The .venv is what VSCode auto-detects.
1. Install dependencies & create the venv
cd C:\Users\zrzring\ZCodeProject\vision-mcp
uv syncThat single command:
reads
.python-version(3.12) and auto-downloads that Python if missing,creates
vision-mcp\.venv,installs everything in
pyproject.toml(currentlymcp[cli]).
To add a package later:
uv add <pkg>. To rebuild after pulling the repo: justuv syncagain. Never use rawpiphere — it would install into the wrong place.
Verify it works:
uv run python -c "import main; print('OK', main.mcp.name)"
# → OK low-hallucination-vision2. Configure API credentials
copy .env.example .env # then edit .envVISION_API_BASE=https://api.mimo.example.com/v1 # your OpenAI-compatible endpoint
VISION_API_KEY=sk-...
VISION_MODEL=mimo-vl-2.5
VISION_TEMPERATURE=0.23. Make VSCode detect the venv
VSCode's Python extension auto-detects .venv in the workspace. To be safe:
Open the folder
C:\Users\zrzring\ZCodeProject(not the single file) in VSCode.Install the Python extension (ms-python.python) if not already.
Ctrl+Shift+P→ Python: Select Interpreter → pick the one shown asPython 3.12.13 ('.venv')undervision-mcp\.venv\Scripts\python.exe.
If it doesn't show up, force it with a workspace setting — create
.vscode/settings.json in the project root:
{
"python.defaultInterpreterForWorkspace": "vision-mcp\\.venv\\Scripts\\python.exe",
"python.terminal.activateEnvironment": true
}Now any terminal you open in VSCode auto-activates .venv, and you get
autocomplete / type-checking for mcp and your code.
4. Register the MCP server with your agents
The server speaks stdio MCP. Use uv run to launch it — this guarantees
the project's .venv is used regardless of the agent's working directory:
Claude Code — ~/.claude.json (or project .mcp.json):
{
"mcpServers": {
"low-hallucination-vision": {
"command": "uv",
"args": ["run", "--directory",
"C:\\Users\\zrzring\\ZCodeProject\\vision-mcp",
"python", "main.py"],
"env": {
"VISION_API_BASE": "https://api.mimo.example.com/v1",
"VISION_API_KEY": "sk-...",
"VISION_MODEL": "mimo-vl-2.5"
}
}
}
}OpenCode — opencode.json:
{
"mcp": {
"low-hallucination-vision": {
"type": "local",
"command": ["uv", "run", "--directory",
"C:\\Users\\zrzring\\ZCodeProject\\vision-mcp",
"python", "main.py"],
"environment": {
"VISION_API_BASE": "https://api.mimo.example.com/v1",
"VISION_API_KEY": "sk-...",
"VISION_MODEL": "mimo-vl-2.5"
}
}
}
}ZCode — same mcpServers shape as Claude Code.
Why
uv run --directoryinstead of a barepython? Because the agent may launch the server from any working directory;uv run --directoryalways activates the right.venv. Environment variables can live in the config (as above) OR invision-mcp/.env— either works.
Alternative: build a standalone vision-mcp.exe
If you'd rather not depend on uv/Python at runtime, package the server into
a single executable with PyInstaller. The exe is self-contained (~24 MB),
needs no Python installed, and works on any machine when shipped with its
.env. It runs in two modes: a stdio MCP server (default) and a
command-line image tool.
Build it
pyinstaller is already in pyproject.toml, so after uv sync:
cd C:\Users\zrzring\ZCodeProject\vision-mcp
uv run pyinstaller --onefile --name vision-mcp --collect-all mcp --clean --noconfirm main.pyOutput lands in dist\vision-mcp.exe. The vision-mcp.spec file is
auto-generated; you can re-run pyinstaller vision-mcp.spec --noconfirm
after that for identical builds.
Put it on PATH and configure
Copy the exe and your
.envto a directory already on PATH (e.g.C:\Users\<you>\.local\bin):copy dist\vision-mcp.exe C:\Users\<you>\.local\bin\ copy .env C:\Users\<you>\.local\bin\The exe reads
.envfrom its own directory first, then the source dir, then the working dir. So keep.envnext to the exe — change key/endpoint there, no rebuild needed.Verify from anywhere:
vision-mcp --help vision-mcp analyze C:\path\to\pic.png --mode general --prompt "describe it"
Register the exe with agents
Because the exe defaults to MCP-server mode, agent config is minimal — no
uv run, no args, no env block (creds come from the exe's .env):
Claude Code — ~/.claude.json (or project .mcp.json):
{
"mcpServers": {
"low-hallucination-vision": {
"command": "vision-mcp"
}
}
}OpenCode — opencode.json:
{
"mcp": {
"low-hallucination-vision": {
"type": "local",
"command": ["vision-mcp"]
}
}
}If vision-mcp isn't on PATH for the agent, use the full path instead:
"command": "C:\\Users\\<you>\\.local\\bin\\vision-mcp.exe".
CLI mode (use it directly, no agent)
The same exe doubles as a terminal image tool:
vision-mcp # = MCP server (default)
vision-mcp mcp # " (explicit)
vision-mcp analyze <image> [--mode general|ui_screenshot|ocr|detect] [--prompt "..."]
vision-mcp ocr <image> [--prompt "..."]
vision-mcp detect <image> [--prompt "..."]<image> is a local path or an http(s) URL. Output is the same JSON the MCP
tools return (with bbox normalization + confidence flagging applied).
Source vs exe — which to use? Source (
uv run) is best while developing (editmain.py, reload instantly). The exe is best for daily use and sharing to other machines — no Python toolchain needed.
3. (Optional) Register the Skill with ZCode
Copy or symlink vision-skill/ into your skills directory so the
cross-verification workflow is auto-loaded:
<skills-dir>/low-hallucination-vision/SKILL.mdThe Skill is agent-agnostic in content but only ZCode auto-discovers Skills. For Claude Code / OpenCode, the MCP tools alone still work — just keep the Skill's workflow in mind (or paste the relevant section into your own prompt).
Tools provided
Tool | What it does | When to use |
| Structured analysis; | Default entry point |
| Text-only extraction | When you only need words |
| Object detection with mandatory bbox | When you need locations |
All three return JSON. Claims below VISION_CONFIDENCE_THRESHOLD (default 0.6)
are tagged "_flag": "存疑".
image_source accepts either a local file path or an http(s) URL.
File map
ZCodeProject/
├── vision-mcp/ ← uv project (this README lives here)
│ ├── main.py ← the MCP server + CLI (engine + anti-hallucination)
│ ├── pyproject.toml ← deps: mcp[cli], pyinstaller
│ ├── uv.lock ← pinned versions (auto-generated)
│ ├── .python-version ← 3.12 (uv auto-downloads it)
│ ├── .env.example ← copy to .env and fill in
│ ├── vision-mcp.spec ← auto-generated by PyInstaller (for rebuilds)
│ ├── .venv/ ← created by `uv sync` (gitignored)
│ ├── build/ ← PyInstaller intermediates (gitignored)
│ └── dist/
│ └── vision-mcp.exe ← the standalone exe (built, gitignored)
└── .agents/ ← skill(s) discovered by ZCode
└── skills/vision-skill/
└── SKILL.md ← cross-verification workflow for the agentTuning
Still too much hallucination? Lower
VISION_TEMPERATUREto 0.1 and raiseVISION_CONFIDENCE_THRESHOLDto 0.7.Missing real things (over-conservative)? Lower the threshold to 0.5 and raise temperature slightly to 0.3.
Model keeps breaking JSON? Some VLMs ignore schema instructions; in that case the tool returns
"_parse_error": truewith the raw text so you can post-process. Consider switching to a model with stronger JSON support.
This server cannot be installed
Maintenance
Resources
Unclaimed servers have limited discoverability.
Looking for Admin?
If you are the server author, to access and configure the admin panel.
Latest Blog Posts
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/ZRZRING/vision-mcp'
If you have feedback or need assistance with the MCP directory API, please join our Discord server