chat_with_vision
Analyze images using Grok vision models to answer questions about visual content from local files or public URLs.
Instructions
Analyze one or more images with a Grok vision model.
Accepts local image paths and/or public URLs in the same call. Local images
are sent as base64 data URIs (JPG/JPEG/PNG only, max 20 MiB each).
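
As a rough illustration of the local-image handling described above, the sketch below encodes a file as a base64 data URI and enforces the stated format and size limits. The helper name `to_data_uri` and the constants are hypothetical and are not taken from the tool's source; the actual implementation may validate differently (for example, by inspecting file contents rather than the extension).

```python
import base64
from pathlib import Path

# Limits taken from the description above; the names are illustrative only.
MAX_IMAGE_BYTES = 20 * 1024 * 1024  # 20 MiB
MIME_BY_SUFFIX = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def to_data_uri(path):
    """Return a base64 data URI for a local JPG/JPEG/PNG file (hypothetical helper)."""
    file_path = Path(path)

    # Assumption: the format check is based on the file extension.
    mime = MIME_BY_SUFFIX.get(file_path.suffix.lower())
    if mime is None:
        raise ValueError(f"Unsupported image type: {file_path.suffix!r}")

    data = file_path.read_bytes()
    if len(data) > MAX_IMAGE_BYTES:
        raise ValueError(f"Image exceeds 20 MiB: {len(data)} bytes")

    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Public URLs would not need this step, since they can be passed through as-is.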
Args:
prompt: Question or instruction about the image(s).
session: Optional session name for persistent history in `chats/{session}.json`.
model: Vision-capable Grok model (default `grok-4-1-fast-reasoning`).
image_paths: Local image file paths to analyze.
image_urls: Public image URLs to analyze.
detail: Image detail level. One of `"auto"`, `"low"`, or `"high"`.
Returns:
The model's textual answer about the image(s).
Input Schema
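| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Question or instruction about the image(s). | |
| session | No | Optional session name for persistent history in `chats/{session}.json`. | |
| model | No | Vision-capable Grok model. | `grok-4-1-fast-reasoning` |
| image_paths | No | Local image file paths to analyze. | |
| image_urls | No | Public image URLs to analyze. | |
| detail | No | Image detail level. One of `"auto"`, `"low"`, or `"high"`. | `auto` |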
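
The arguments in the schema can be assembled as a plain dictionary before being handed to whatever client invokes `chat_with_vision`. The sketch below is only an illustration: `build_vision_args` is a hypothetical helper, not part of the tool, and it assumes that optional fields may simply be omitted when unset. The defaults and the allowed `detail` values are the ones listed in the schema.

```python
ALLOWED_DETAIL = {"auto", "low", "high"}


def build_vision_args(
    prompt,
    image_paths=None,
    image_urls=None,
    session=None,
    model="grok-4-1-fast-reasoning",
    detail="auto",
):
    """Assemble an argument dict for chat_with_vision (hypothetical helper)."""
    if detail not in ALLOWED_DETAIL:
        raise ValueError(f"detail must be one of {sorted(ALLOWED_DETAIL)}")

    args = {"prompt": prompt, "model": model, "detail": detail}

    # Assumption: unset optional fields are left out rather than sent as null.
    if session is not None:
        args["session"] = session
    if image_paths:
        args["image_paths"] = list(image_paths)
    if image_urls:
        args["image_urls"] = list(image_urls)
    return args
```

For example, `build_vision_args("What is shown here?", image_urls=["https://example.com/cat.png"])` yields a dictionary containing `prompt`, `model`, `detail`, and `image_urls`. Local paths and public URLs can be supplied together in the same call.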