vision-mcp
Server Configuration
Environment variables used to configure the server. All of them are optional.
| Name | Required | Description | Default |
|---|---|---|---|
| VISION_MODEL | No | Model identifier for the vision backend | depends on profile |
| Z_AI_API_KEY | No | Alternative API key (alias for VISION_API_KEY) | |
| VISION_API_KEY | No | API key for the vision backend (alias Z_AI_API_KEY) | |
| VISION_PROFILE | No | Backend profile: glm, kimi, mimo, openai, generic | glm |
| VISION_BASE_URL | No | Base URL for the OpenAI-compatible endpoint | depends on profile |
| VISION_MAX_EDGE_PX | No | Overview downsample longest edge in pixels | 1568 |
| VISION_ALLOWED_DIRS | No | Directories allowed for local image paths (semicolon or colon separated) | current working directory |
| VISION_MAX_IMAGE_MB | No | Maximum image file size in MB | 10 |
| VISION_MAX_VIDEO_MB | No | Maximum video file size in MB | 50 |
| VISION_VIDEO_FRAMES | No | Frames sampled per video | 8 |
| VISION_MAX_ZOOM_ROUNDS | No | Maximum agentic zoom rounds | 3 |
| VISION_ALLOW_URL_PASSTHROUGH | No | Forward image URLs to the backend (default: download + SSRF-check → data URI) | false |
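As an illustration of how the table above might be consumed, here is a minimal Python sketch that reads the variables and applies the listed defaults. It is not the server's actual loader: the `VisionConfig` and `load_config` names, the precedence of `VISION_API_KEY` over `Z_AI_API_KEY`, the accepted boolean spellings, and the directory-splitting rule are all assumptions made for this example.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping, Optional


@dataclass
class VisionConfig:
    """Hypothetical container for the settings listed in the table."""
    profile: str
    api_key: Optional[str]
    model: Optional[str]
    base_url: Optional[str]
    max_edge_px: int
    allowed_dirs: List[str]
    max_image_mb: int
    max_video_mb: int
    video_frames: int
    max_zoom_rounds: int
    allow_url_passthrough: bool


def split_dirs(raw: str) -> List[str]:
    # Assumption: use ';' when it appears (keeps Windows drive letters intact),
    # otherwise fall back to the platform path separator.
    sep = ";" if ";" in raw else os.pathsep
    return [part.strip() for part in raw.split(sep) if part.strip()]


def load_config(env: Mapping[str, str] = os.environ) -> VisionConfig:
    return VisionConfig(
        profile=env.get("VISION_PROFILE", "glm"),
        # Assumption: VISION_API_KEY wins when both names are set.
        api_key=env.get("VISION_API_KEY") or env.get("Z_AI_API_KEY"),
        # None means "let the profile pick its own default".
        model=env.get("VISION_MODEL") or None,
        base_url=env.get("VISION_BASE_URL") or None,
        max_edge_px=int(env.get("VISION_MAX_EDGE_PX", "1568")),
        allowed_dirs=split_dirs(env.get("VISION_ALLOWED_DIRS", "")) or [os.getcwd()],
        max_image_mb=int(env.get("VISION_MAX_IMAGE_MB", "10")),
        max_video_mb=int(env.get("VISION_MAX_VIDEO_MB", "50")),
        video_frames=int(env.get("VISION_VIDEO_FRAMES", "8")),
        max_zoom_rounds=int(env.get("VISION_MAX_ZOOM_ROUNDS", "3")),
        # Assumption: "1", "true" and "yes" (any case) enable passthrough.
        allow_url_passthrough=env.get("VISION_ALLOW_URL_PASSTHROUGH", "false")
        .strip()
        .lower() in ("1", "true", "yes"),
    )
```

Passing a plain dictionary instead of `os.environ`, for example `load_config({"VISION_PROFILE": "openai"})`, makes the defaults easy to inspect without touching the real environment.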
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{"listChanged": true}` |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
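| ui_to_artifact | Converts a UI screenshot or design mockup into runnable code or a structured spec. Use when the host has an interface image and needs to generate or reproduce a front-end implementation from it. |
| extract_text_from_screenshot | Extracts text from a screenshot verbatim (code, terminal output, error messages, documents, etc.), preserving reading order and layout. Use when the text in an image needs to be read out. |
| diagnose_error_screenshot | Analyzes an error or exception screenshot and returns the root cause, the verbatim error text, its location, and actionable fix steps. Use for crash, red-screen, or stack-trace screenshots. |
| understand_technical_diagram | Interprets architecture diagrams, flowcharts, UML, ER, and sequence diagrams: nodes, connections, flows, and design intent. Use when a technical diagram needs to be understood. |
| analyze_data_visualization | Reads charts and dashboards and extracts data and insights (trends, anomalies, values). Use when numbers or conclusions need to be read from a chart. |
| ui_diff_check | Compares two UI screenshots (A baseline, B comparison) and lists, item by item, the visual and implementation differences and possible regressions. Use for visual regression testing or before/after comparison. |
| image_analysis | General-purpose fallback: understands any image and answers questions about it. Use when unsure which specialized tool applies, or when simply asking about an image. |
| video_analysis | Understands a video (timing plus frames) and answers questions about it. Backends without native video support automatically fall back to frame sampling. Use for screen recordings or short clips. |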
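MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request in Python, assuming the tool is registered as `image_analysis`. The argument names `image` and `prompt` are hypothetical, since the listing does not show the tools' input schemas; a client would normally read them from the server's `tools/list` response.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, tool: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # ensure_ascii=False keeps non-ASCII prompt text readable in the payload.
    return json.dumps(message, ensure_ascii=False)


# Hypothetical arguments: the real parameter names come from the tool's schema.
request = build_tool_call(
    1,
    "image_analysis",
    {"image": "./screenshot.png", "prompt": "What does this dialog say?"},
)
print(request)
```

In practice an MCP client library handles this framing; the raw message is shown only to make the request shape concrete.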
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/Pelican0126/vision-mcp'
If you have feedback or need assistance with the MCP directory API, please join our Discord server