analyze_image
Analyze images by providing a URL, file path, or base64 data and a prompt. Describe contents, extract text, interpret charts, or identify objects using a vision model.
Instructions
Analyze an image with a multimodal LLM (GPT-4o, Claude, Gemini, etc.).
Provide an image (URL, local path, or base64) and a description of what you want to know. The tool calls an OpenAI-compatible vision API and returns the model's text response.
Use this tool whenever you have an image and need to:
Describe its contents
Extract text / OCR
Understand a chart, diagram, or data visualization
Analyse a UI screenshot (layout, elements, issues)
Identify objects, colours, people, or scenes in a photo
Compare or summarise visual information
Args: params (AnalyzeImageInput): - image_source (str): URL, local file path, or base64 image data. - prompt (str): What to analyze or extract from the image. - mime_type (Optional[str]): Override auto-detected MIME type.
Returns: str: The multimodal model's analysis as plain text / Markdown.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| params | Yes |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes |