photo-vlm-mcp
Uses Ollama's vision-capable models to analyze photos, extract text, inspect scenes, compare photos, and extract metadata.
photo-vlm-mcp is a local, Ollama-backed MCP server that gives coding assistants a
portable photo-understanding toolset:
analyze_photo - ask questions about a real-world photo.
photo_ocr - extract text from labels, receipts, whiteboards, signs, and photographed pages.
inspect_scene - return structured objects, scene context, visible text, quality, and uncertainty.
compare_photos - compare before/after or near-duplicate photos.
extract_metadata - read dimensions, orientation, EXIF fields, GPS, and optional SHA-256.
health - check Ollama reachability and configured model availability.
It is designed for user-level registration with Claude Code, Codex, Antigravity, Cursor, Cline, Windsurf, Zed, and other MCP clients.
Requirements
Python 3.11+
Ollama running locally
A vision-capable Ollama model
Recommended local models:
ollama pull qwen3-vl:8b
ollama pull minicpm-v
If qwen3-vl:8b is unavailable in your Ollama build, use qwen2.5vl:7b or another
vision model from ollama.com/search?c=vision.
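Before registering the server, it can help to confirm that Ollama is reachable and that the chosen model has been pulled. The sketch below queries Ollama's `/api/tags` endpoint, which lists locally available models. The helper names `model_available` and `fetch_tags` are illustrative and are not part of photo-vlm-mcp; the base URL is the one used in the configuration examples in this README.

```python
import json
import urllib.request


def model_available(tags_payload: dict, wanted: str) -> bool:
    """Return True if `wanted` appears in an Ollama /api/tags payload.

    A name without a tag (e.g. "minicpm-v") is treated as "<name>:latest",
    which is how Ollama resolves untagged model names.
    """
    if ":" not in wanted:
        wanted = f"{wanted}:latest"
    names = {m.get("name", "") for m in tags_payload.get("models", [])}
    return wanted in names


def fetch_tags(base_url: str = "http://127.0.0.1:11434", timeout: float = 5.0) -> dict:
    """Fetch the list of local models from a running Ollama instance.

    Raises an OSError subclass (URLError, timeout) if Ollama is unreachable.
    """
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
        return json.load(resp)
```

With Ollama running, `model_available(fetch_tags(), "qwen3-vl:8b")` reports whether the model is present locally. The server's own health tool covers the same ground once it is registered.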
Install
Directly from GitHub:
python -m pip install "git+https://github.com/YehudRaanan/photo-vlm-mcp.git"
Or run without a persistent install using uvx:
uvx --from "git+https://github.com/YehudRaanan/photo-vlm-mcp.git" photo-vlm-mcp --version
For local development:
git clone https://github.com/YehudRaanan/photo-vlm-mcp.git
cd photo-vlm-mcp
python -m pip install -e .
With optional Tesseract OCR support:
python -m pip install -e ".[tesseract]"
Run
photo-vlm-mcp
The server uses MCP over stdio, so it normally runs under an MCP client rather than as a long-lived terminal command.
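To make the stdio transport concrete: an MCP client spawns the server process and exchanges JSON-RPC 2.0 messages with it, one message per line, on stdin and stdout. The sketch below shows that framing with two standard MCP requests. The helper names and the `clientInfo` values are hypothetical, and the protocol version string is one published MCP revision, not necessarily the one this server negotiates.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the line stays intact.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_message(line: bytes) -> dict:
    """Parse one newline-delimited JSON-RPC message."""
    return json.loads(line.decode("utf-8"))


# First request a client sends after spawning the server process.
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",
        "capabilities": {},
        # Hypothetical client identity, for illustration only.
        "clientInfo": {"name": "example-client", "version": "0.0.1"},
    },
}

# After the initialize response, the client sends this notification (no id).
initialized_notification = {"jsonrpc": "2.0", "method": "notifications/initialized"}

# Then it can ask which tools the server exposes.
list_tools_request = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
```

In practice the MCP client handles this exchange, which is why running photo-vlm-mcp by hand in a terminal appears to do nothing until a message arrives on stdin.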
Helpful local checks:
photo-vlm-mcp --version
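photo-vlm-mcp --print-config
Register With Claude Code
claude mcp add photo-vlm --scope user -- photo-vlm-mcp
With explicit model config (the trailing backticks are PowerShell line continuations; in bash or zsh, use a backslash instead):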
claude mcp add photo-vlm --scope user `
-e OLLAMA_URL=http://127.0.0.1:11434 `
-e PHOTO_VLM_MODEL=qwen3-vl:8b `
-e PHOTO_OCR_MODEL=minicpm-v `
-- photo-vlm-mcp
Register With Codex / Antigravity / Other MCP Clients
Add this server to the client's user-level MCP config:
{
  "mcpServers": {
    "photo-vlm": {
      "command": "photo-vlm-mcp",
      "env": {
        "OLLAMA_URL": "http://127.0.0.1:11434",
        "PHOTO_VLM_MODEL": "qwen3-vl:8b",
        "PHOTO_OCR_MODEL": "minicpm-v"
      }
    }
  }
}
If the console script is not on PATH, use:
{
  "command": "python",
  "args": ["-m", "photo_vlm_mcp"]
}
Configuration
| Variable | Default | Meaning |
| --- | --- | --- |
| | | Ollama endpoint |
| | | Model for analysis, scene inspection, comparison |
| | | Model for VLM OCR |
| | | Default generation limit |
| | | Ollama request timeout in seconds |
| | | Ollama keep-alive setting |
| | | Downscale longest side before inference |
| | | Reject larger images |
| | | URL fetch timeout |
| | | Allow private/loopback URLs |
| | unset | Optional path allow-list, separated by |
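The "Allow private/loopback URLs" setting suggests that URL inputs are screened by address class before they are fetched. The following is a minimal sketch of such a check using only the standard library. It is not the server's actual logic, and `is_private_or_loopback` is a hypothetical name.

```python
import ipaddress
from urllib.parse import urlsplit


def is_private_or_loopback(url: str) -> bool:
    """Return True if the URL's host is a literal private or loopback address.

    Hostnames other than "localhost" are not resolved here; a real guard
    would also need to check the addresses a hostname resolves to.
    """
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"URL has no host: {url!r}")
    if host == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        return False  # not an IP literal; DNS resolution is out of scope
    return addr.is_private or addr.is_loopback
```

A guard like this would reject `http://192.168.1.20/cam.jpg` unless the corresponding setting is enabled, while letting public addresses through.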
Legacy aliases VLM_MODEL, OCR_MODEL, and related VLM_* variables are also accepted.
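One way to picture the alias handling is a lookup that tries the current variable name first and falls back to the legacy one. The sketch below makes two assumptions not confirmed by this README: that VLM_MODEL is the alias of PHOTO_VLM_MODEL (and OCR_MODEL of PHOTO_OCR_MODEL), and that the current name wins when both are set. `resolve_setting` is a hypothetical helper.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    primary: str,
    legacy: str,
    default: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the primary variable if set, else the legacy alias, else default.

    Empty or whitespace-only values are treated as unset. The precedence
    order (primary before legacy) is an assumption.
    """
    for name in (primary, legacy):
        value = env.get(name, "").strip()
        if value:
            return value
    return default
```

For example, `resolve_setting("PHOTO_VLM_MODEL", "VLM_MODEL")` reads the process environment, while passing `env={...}` explicitly keeps the function easy to test.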
QA
python -m pytest
python -m ruff check src tests scripts
python -m black --check src tests scripts
python -m isort --check-only src tests scripts
The unit tests mock Ollama and validate image input handling, metadata extraction, prompt construction, and client behavior. Live model quality evaluation belongs in a separate environment with Ollama and selected models installed.
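As an illustration of what mocking Ollama in a unit test can look like, here is a minimal sketch using `unittest.mock`. The `describe_image` helper and the `client.generate` interface are hypothetical stand-ins, not the project's code; the payload fields mirror Ollama's `/api/generate` request (model, prompt, base64 images, stream).

```python
import base64
import unittest
from unittest import mock


def describe_image(client, model: str, prompt: str, image_bytes: bytes) -> str:
    """Hypothetical helper: send one image to a client and return its text.

    `client.generate` stands in for whatever wrapper talks to Ollama.
    """
    payload = {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    return client.generate(payload)["response"].strip()


class DescribeImageTest(unittest.TestCase):
    def test_sends_base64_image_and_returns_text(self):
        # The mock replaces the network call, so no Ollama instance is needed.
        client = mock.Mock()
        client.generate.return_value = {"response": " a whiteboard \n"}

        text = describe_image(client, "qwen3-vl:8b", "Describe.", b"\x89PNG")

        self.assertEqual(text, "a whiteboard")
        payload = client.generate.call_args.args[0]
        self.assertEqual(payload["model"], "qwen3-vl:8b")
        self.assertEqual(base64.b64decode(payload["images"][0]), b"\x89PNG")
        self.assertFalse(payload["stream"])
```

Tests in this style check request construction and response handling only; they say nothing about how well a given model reads a photo, which is why live evaluation is kept separate.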
License
MIT