# analyze_image
Analyze an image from a file path, URL, or data URL. Optionally ask a question about the image to get specific insights.
## Instructions
Analyze an image using a vision model
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| image_path | Yes | File path, URL, or data URL | |
| question | No | Question about the image | |
| model | No | Model to use; falls back to the server default, then a built-in default | |
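As an illustration of the schema above, a `tools/call` request body for this tool might look like the following (the URL and question values are hypothetical):

```json
{
  "name": "analyze_image",
  "arguments": {
    "image_path": "https://example.com/chart.png",
    "question": "What trend does this chart show?"
  }
}
```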
## Implementation Reference
- **src/tool-handlers/analyze-image.ts:23-80** (handler): Main handler function that validates `image_path`, fetches and optimizes the image via `prepareImageUrl`, sends it to a vision model via the OpenAI chat completions API, and returns the extracted text response.
```typescript
export async function handleAnalyzeImage(
  request: { params: { arguments: AnalyzeImageToolRequest } },
  openai: OpenAI,
  defaultModel?: string,
) {
  const { image_path, question, model } = request.params.arguments ?? { image_path: '' };
  if (!image_path) {
    return toolError(ErrorCode.INVALID_INPUT, 'image_path is required.');
  }

  let imageUrl: string;
  try {
    imageUrl = await prepareImageUrl(image_path);
  } catch (err) {
    const msg = err instanceof Error ? err.message : String(err);
    if (msg.includes('Blocked host')) return toolErrorFrom(ErrorCode.UPSTREAM_REFUSED, err);
    if (msg.toLowerCase().includes('too large')) {
      return toolErrorFrom(ErrorCode.RESOURCE_TOO_LARGE, err);
    }
    return toolErrorFrom(ErrorCode.INVALID_INPUT, err);
  }

  let completion: ChatCompletion;
  try {
    completion = await openai.chat.completions.create({
      model: model || defaultModel || DEFAULT_MODEL,
      messages: [
        {
          role: 'user',
          content: [
            { type: 'text', text: question || "What's in this image?" },
            { type: 'image_url', image_url: { url: imageUrl } },
          ],
        },
      ] as ChatCompletionMessageParam[],
    });
  } catch (err) {
    return classifyUpstreamError(err);
  }

  const extracted = extractCompletionText(completion);
  const cutoff = detectReasoningCutoff(extracted);
  if (cutoff) return cutoff;
  if (!extracted.text) {
    return toolError(ErrorCode.INTERNAL, 'Vision model returned no textual content.', {
      finish_reason: extracted.finishReason,
    });
  }

  return {
    content: [{ type: 'text' as const, text: extracted.text }],
    _meta: {
      finish_reason: extracted.finishReason,
      ...(toUsageMeta(extracted.usage) ?? {}),
    },
  };
}
```

- Input schema/interface for the `analyze_image` tool: requires `image_path`; `question` and `model` are optional.
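The handler's `catch` block around `prepareImageUrl` routes errors by inspecting the message text. A minimal standalone sketch of that pattern (the `ErrCode` union and `classifyPrepareError` name here are illustrative, not the project's actual `toolErrorFrom` helpers):

```typescript
// Illustrative sketch: map a thrown error to an error code by message content,
// mirroring the branching in handleAnalyzeImage's catch block above.
type ErrCode = 'UPSTREAM_REFUSED' | 'RESOURCE_TOO_LARGE' | 'INVALID_INPUT';

function classifyPrepareError(err: unknown): ErrCode {
  const msg = err instanceof Error ? err.message : String(err);
  if (msg.includes('Blocked host')) return 'UPSTREAM_REFUSED';
  if (msg.toLowerCase().includes('too large')) return 'RESOURCE_TOO_LARGE';
  return 'INVALID_INPUT';
}

console.log(classifyPrepareError(new Error('Image too large: 12MB')));
// → RESOURCE_TOO_LARGE
```

Note that the match on `'Blocked host'` is case-sensitive while `'too large'` is matched case-insensitively, so the exact wording of the thrown messages matters.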
```typescript
export interface AnalyzeImageToolRequest {
  image_path: string;
  question?: string;
  model?: string;
}
```

- **src/tool-handlers.ts:146-161** (registration): Registration of the `analyze_image` tool in the `ListToolsResponseSchema` response, defining its name, description, and input schema.
```typescript
name: 'analyze_image',
description: 'Analyze an image using a vision model',
annotations: {
  readOnlyHint: true,
  destructiveHint: false,
  idempotentHint: false,
},
inputSchema: {
  type: 'object',
  properties: {
    image_path: { type: 'string', description: 'File path, URL, or data URL' },
    question: { type: 'string', description: 'Question about the image' },
    model: { type: 'string' },
  },
  required: ['image_path'],
},
```

- **src/tool-handlers.ts:475-480** (registration): Dispatch in the `CallToolRequestSchema` handler: routes `analyze_image` calls to the `handleAnalyzeImage` function.
```typescript
case 'analyze_image':
  return handleAnalyzeImage(
    wrapToolArgs(args as AnalyzeImageToolRequest | undefined),
    this.openai,
    this.defaultModel,
  );
```

- `prepareImageUrl` (helper): Fetches an image (HTTP, data URL, or local file), optionally optimizes it with sharp, and returns a data URL string for the vision model API.
```typescript
export async function prepareImageUrl(source: string): Promise<string> {
  if (source.startsWith('data:')) return source;
  const buffer = await fetchImage(source);
  const { base64, mime } = await optimizeImage(buffer);
  // When optimization succeeded, mime is 'image/jpeg'. When it failed, we
  // use the sniffed mime. For local files we prefer the extension-derived
  // mime (more specific) when optimization fell back.
  const finalMime =
    mime === 'image/jpeg' || source.startsWith('http') ? mime : getMimeType(source);
  return `data:${finalMime};base64,${base64}`;
}
```
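The last line of `prepareImageUrl` embeds the base64 payload in an RFC 2397 data URL. A self-contained sketch of just that formatting step (the `toDataUrl` helper is illustrative, not project code):

```typescript
// Build a data URL from a MIME type and a base64 payload, as in the
// final line of prepareImageUrl above.
function toDataUrl(mime: string, base64: string): string {
  return `data:${mime};base64,${base64}`;
}

// 'iVBORw0KGgo=' is the base64 encoding of the 8-byte PNG file signature.
console.log(toDataUrl('image/png', 'iVBORw0KGgo='));
// → data:image/png;base64,iVBORw0KGgo=
```

Passing the image as a data URL rather than a remote URL means the vision API never has to fetch the image itself, which is what lets this tool accept local file paths.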