vision_query
Process images through the GLM-4.5V MCP Server: extract text via OCR, answer questions about an image, or detect elements in it. Set mode to one of describe, ocr, qa, or detect to choose the kind of analysis.
Instructions
Call GLM-4.5V to perform OCR, question answering, or detection on an image.
Input Schema
Name | Required | Description | Default |
---|---|---|---|
mode | No | Query mode (describe, ocr, qa, or detect) | describe |
path | Yes | Image path or URL | |
prompt | Yes | Query prompt | |
returnJson | No | Whether to return the result in JSON format | false |
Input Schema (JSON Schema)
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "mode": {
      "default": "describe",
      "description": "Query mode",
      "enum": [
        "describe",
        "ocr",
        "qa",
        "detect"
      ],
      "type": "string"
    },
    "path": {
      "description": "Image path or URL",
      "type": "string"
    },
    "prompt": {
      "description": "Query prompt",
      "type": "string"
    },
    "returnJson": {
      "default": false,
      "description": "Whether to return the result in JSON format",
      "type": "boolean"
    }
  },
  "required": [
    "path",
    "prompt"
  ],
  "type": "object"
}
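Example

The schema above can be exercised with a small client-side sketch. It is not part of the server: the helper names build_vision_query_args and build_tools_call_request are hypothetical, the file name invoice.png and the prompt are made up, and the request envelope assumes the standard MCP tools/call JSON-RPC shape. The response format is not documented here, so the sketch only builds the request.

```python
import json

# Allowed values for "mode", copied from the enum in the input schema.
ALLOWED_MODES = ("describe", "ocr", "qa", "detect")


def build_vision_query_args(path, prompt, mode="describe", return_json=False):
    """Build and check an arguments object for vision_query (hypothetical helper)."""
    # "path" and "prompt" are the two required string fields.
    if not isinstance(path, str) or not path:
        raise ValueError("path must be a non-empty string (image path or URL)")
    if not isinstance(prompt, str) or not prompt:
        raise ValueError("prompt must be a non-empty string")
    # "mode" must be one of the four enum values.
    if mode not in ALLOWED_MODES:
        raise ValueError(f"mode must be one of {ALLOWED_MODES}, got {mode!r}")
    # "returnJson" is a boolean in the schema.
    if not isinstance(return_json, bool):
        raise ValueError("return_json must be a bool")
    # Only schema-defined keys are emitted, since additionalProperties is false.
    return {
        "path": path,
        "prompt": prompt,
        "mode": mode,
        "returnJson": return_json,
    }


def build_tools_call_request(arguments, request_id=1):
    """Wrap the arguments in an MCP tools/call request (assumed JSON-RPC 2.0 envelope)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "vision_query", "arguments": arguments},
    }


if __name__ == "__main__":
    # Illustrative values only: an OCR request against a local file.
    args = build_vision_query_args(
        "invoice.png", "Extract all text from this image", mode="ocr", return_json=True
    )
    print(json.dumps(build_tools_call_request(args), indent=2, ensure_ascii=False))
```

Omitting mode and return_json falls back to the schema defaults (describe and false). How the request is sent depends on the MCP client and transport in use.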