chat_with_vision
Analyze images and answer questions about visual content using AI vision capabilities. Upload images via paths or URLs and receive detailed descriptions or insights based on your prompts.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | | |
| model | No | | grok-4 |
| image_paths | No | | |
| image_urls | No | | |
| detail | No | | auto |
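
To illustrate how the schema's required field and defaults combine, here is a minimal sketch of assembling an arguments payload for this tool. `build_arguments` is a hypothetical helper, not part of the server, and the URL is a placeholder. Only the parameter names and the defaults (`grok-4`, `auto`) come from the table above.

```python
from typing import Any, Dict

# Defaults taken from the schema table; everything else here is illustrative.
DEFAULTS: Dict[str, Any] = {"model": "grok-4", "detail": "auto"}

# Optional parameter names listed in the schema table.
OPTIONAL = {"model", "image_paths", "image_urls", "detail"}


def build_arguments(prompt: str, **overrides: Any) -> Dict[str, Any]:
    """Hypothetical helper: merge caller-supplied values over the schema defaults."""
    if not prompt:
        raise ValueError("prompt is required")
    unknown = set(overrides) - OPTIONAL
    if unknown:
        raise ValueError(f"Unknown arguments: {sorted(unknown)}")
    return {"prompt": prompt, **DEFAULTS, **overrides}


args = build_arguments(
    "What is shown in this picture?",
    image_urls=["https://example.com/photo.png"],  # placeholder URL
)
```

Omitted optional fields fall back to the table's defaults, so `args` here carries `model` and `detail` even though the caller never set them.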
Implementation Reference
- src/server.py:120-150 (handler): Main handler function for the chat_with_vision tool. This async function takes a prompt, a model name, optional image paths and URLs, and a detail level. It creates a chat client, processes the images (encoding local files to base64), appends them as image content along with the prompt, and returns the AI response content.

  ```python
  @mcp.tool()
  async def chat_with_vision(
      prompt: str,
      model: str = "grok-4",
      image_paths: Optional[List[str]] = None,
      image_urls: Optional[List[str]] = None,
      detail: str = "auto"
  ):
      client = Client(api_key=XAI_API_KEY)
      chat = client.chat.create(model=model, store_messages=False)

      user_content = []

      if image_paths:
          for path in image_paths:
              ext = Path(path).suffix.lower().replace('.', '')
              if ext not in ["jpg", "jpeg", "png"]:
                  raise ValueError(f"Unsupported image type: {ext}")
              base64_img = encode_image_to_base64(path)
              user_content.append(image(image_url=f"data:image/{ext};base64,{base64_img}", detail=detail))

      if image_urls:
          for url in image_urls:
              user_content.append(image(image_url=url, detail=detail))

      user_content.append(prompt)

      chat.append(user(*user_content))
      response = chat.sample()
      client.close()
      return response.content
  ```
- src/server.py:120-120 (registration): Tool registration using the @mcp.tool() decorator on the chat_with_vision function, which registers it as an MCP tool with the FastMCP server.

  ```python
  @mcp.tool()
  ```
- src/utils.py:13-18 (helper): Helper function encode_image_to_base64, which reads an image file from disk and converts it to a base64-encoded string. chat_with_vision uses it to process local image files.

  ```python
  def encode_image_to_base64(image_path: str):
      path = Path(image_path)
      if not path.exists():
          raise FileNotFoundError(f"Image file not found: {image_path}")
      with open(image_path, "rb") as image_file:
          return base64.b64encode(image_file.read()).decode("utf-8")
  ```
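
The local-image path described above (check the extension, base64-encode the file, wrap it in a data URL) can be sketched as a standalone function. `to_data_url` and `MIME_SUBTYPES` are hypothetical names, not part of the server. This variant also makes one assumption the handler does not: it maps `.jpg` to the registered `image/jpeg` subtype, whereas the handler interpolates the raw extension. The source does not say whether the upstream API requires this.

```python
import base64
from pathlib import Path

# Extension -> MIME subtype. ".jpg" maps to "jpeg", the registered subtype
# (an assumption of this sketch; the handler uses the raw extension).
MIME_SUBTYPES = {"jpg": "jpeg", "jpeg": "jpeg", "png": "png"}


def to_data_url(image_path: str) -> str:
    """Hypothetical helper: validate the extension and build a base64 data URL."""
    path = Path(image_path)
    ext = path.suffix.lower().lstrip(".")
    if ext not in MIME_SUBTYPES:
        raise ValueError(f"Unsupported image type: {ext}")
    if not path.exists():
        raise FileNotFoundError(f"Image file not found: {image_path}")
    # Read the raw bytes and encode them as ASCII-safe base64 text.
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return f"data:image/{MIME_SUBTYPES[ext]};base64,{encoded}"
```

Checking the extension before touching the disk mirrors the order in the handler, so an unsupported type is rejected without any file I/O.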