analyze_image
Analyze images using Gemini vision capabilities to answer questions or follow instructions about visual content.
Instructions
Analyze images using Gemini vision capabilities
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Question or instruction about the image | |
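| imageUrl | No | URL of the image to analyze. One of imageUrl or imageBase64 must be provided. The handler currently passes the URL to the model as text rather than fetching the image. | |
| imageBase64 | No | Base64-encoded image data (alternative to URL), either as a data URI or as raw base64. Raw base64 is treated as image/jpeg. | |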
| model | No | Vision-capable Gemini model | gemini-2.5-flash |
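
As a rough illustration of how a client might assemble a call from the parameters above, the sketch below builds a JSON-RPC `tools/call` request for `analyze_image`. The helper name `buildAnalyzeImageCall` and the two interfaces are hypothetical and not part of the server. Only the field names and model values come from the input schema. The "exactly one image source" check follows the schema's `oneOf` clause; the handler itself only rejects requests where both fields are missing.

```typescript
// Hypothetical client-side types; only the field names and model values
// are taken from the tool's input schema.
type AnalyzeImageModel = 'gemini-2.5-pro' | 'gemini-2.5-flash' | 'gemini-2.0-flash';

interface AnalyzeImageArgs {
  prompt: string;
  imageUrl?: string;
  imageBase64?: string;
  model?: AnalyzeImageModel;
}

interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number | string;
  method: 'tools/call';
  params: { name: 'analyze_image'; arguments: AnalyzeImageArgs };
}

// Hypothetical helper: validates the arguments, then wraps them in a
// JSON-RPC 2.0 tools/call request.
function buildAnalyzeImageCall(id: number | string, args: AnalyzeImageArgs): ToolCallRequest {
  if (!args.prompt) {
    throw new Error('prompt is required');
  }
  // Count the image sources that were actually supplied.
  const sources = [args.imageUrl, args.imageBase64].filter(
    (value) => value !== undefined && value !== ''
  );
  if (sources.length !== 1) {
    throw new Error('Provide exactly one of imageUrl or imageBase64');
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'analyze_image', arguments: args },
  };
}

// Example: a request carrying a data URI and relying on the default model.
const exampleRequest = buildAnalyzeImageCall(1, {
  prompt: 'What is shown in this image?',
  imageBase64: 'data:image/png;base64,iVBORw0KGgo=',
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

Omitting `model` leaves the choice to the server, which falls back to `gemini-2.5-flash`.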
Implementation Reference
- src/enhanced-stdio-server.ts:619-696 (handler)
  The core handler function that executes the analyze_image tool. Processes input arguments, handles image data (URL or base64), calls the Gemini vision API, and returns the analysis result as an MCP response.

  ```typescript
  private async analyzeImage(id: any, args: any): Promise<MCPResponse> {
    try {
      const model = args.model || 'gemini-2.5-flash';

      // Validate inputs
      if (!args.imageUrl && !args.imageBase64) {
        throw new Error('Either imageUrl or imageBase64 must be provided');
      }

      // Prepare image part
      let imagePart: any;
      if (args.imageUrl) {
        // For URL, we'd need to fetch and convert to base64
        // For now, we'll just pass the URL as instruction
        imagePart = { text: `[Image URL: ${args.imageUrl}]` };
      } else if (args.imageBase64) {
        // Log base64 data size for debugging
        console.error(`Image base64 length: ${args.imageBase64.length}`);

        // Extract MIME type and data
        const matches = args.imageBase64.match(/^data:(.+);base64,(.+)$/);
        if (matches) {
          console.error(`MIME type: ${matches[1]}, Data length: ${matches[2].length}`);
          imagePart = {
            inlineData: {
              mimeType: matches[1],
              data: matches[2]
            }
          };
        } else {
          // If no data URI format, assume raw base64
          console.error('Raw base64 data detected');
          imagePart = {
            inlineData: {
              mimeType: 'image/jpeg',
              data: args.imageBase64
            }
          };
        }
      }

      const result = await this.genAI.models.generateContent({
        model,
        contents: [{
          parts: [
            { text: args.prompt },
            imagePart
          ],
          role: 'user'
        }]
      });

      const text = result.text || '';

      return {
        jsonrpc: '2.0',
        id,
        result: {
          content: [{
            type: 'text',
            text: text
          }]
        }
      };
    } catch (error) {
      console.error('Error in analyzeImage:', error);
      return {
        jsonrpc: '2.0',
        id,
        error: {
          code: -32603,
          message: `Image analysis failed: ${error instanceof Error ? error.message : 'Unknown error'}`
        }
      };
    }
  }
  ```
- src/enhanced-stdio-server.ts:276-303 (schema)
  Input schema definition for the analyze_image tool, specifying parameters like prompt, imageUrl or imageBase64 (one required), and model.

  ```typescript
  inputSchema: {
    type: 'object',
    properties: {
      prompt: {
        type: 'string',
        description: 'Question or instruction about the image'
      },
      imageUrl: {
        type: 'string',
        description: 'URL of the image to analyze'
      },
      imageBase64: {
        type: 'string',
        description: 'Base64-encoded image data (alternative to URL)'
      },
      model: {
        type: 'string',
        description: 'Vision-capable Gemini model',
        enum: ['gemini-2.5-pro', 'gemini-2.5-flash', 'gemini-2.0-flash'],
        default: 'gemini-2.5-flash'
      }
    },
    required: ['prompt'],
    oneOf: [
      { required: ['imageUrl'] },
      { required: ['imageBase64'] }
    ]
  }
  ```
- src/enhanced-stdio-server.ts:273-304 (registration)
  Tool registration entry returned by tools/list, defining the name, description, and full input schema for analyze_image.

  ```typescript
  {
    name: 'analyze_image',
    description: 'Analyze images using Gemini vision capabilities',
    inputSchema: {
      type: 'object',
      properties: {
        prompt: {
          type: 'string',
          description: 'Question or instruction about the image'
        },
        imageUrl: {
          type: 'string',
          description: 'URL of the image to analyze'
        },
        imageBase64: {
          type: 'string',
          description: 'Base64-encoded image data (alternative to URL)'
        },
        model: {
          type: 'string',
          description: 'Vision-capable Gemini model',
          enum: ['gemini-2.5-pro', 'gemini-2.5-flash', 'gemini-2.0-flash'],
          default: 'gemini-2.5-flash'
        }
      },
      required: ['prompt'],
      oneOf: [
        { required: ['imageUrl'] },
        { required: ['imageBase64'] }
      ]
    }
  },
  ```
- src/enhanced-stdio-server.ts:477-478 (registration)
  Dispatch logic in the handleToolCall method that maps 'analyze_image' tool calls to the analyzeImage handler.

  ```typescript
  case 'analyze_image':
    return await this.analyzeImage(request.id, args);
  ```
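
The handler's base64 branch accepts either a data URI or raw base64. The sketch below restates that branch as a standalone pure function so the two cases are easier to see and test in isolation. The name `toInlineImagePart`, the `InlineImagePart` interface, and the `fallbackMimeType` parameter are hypothetical. The regular expression and the `image/jpeg` default are the ones the handler uses.

```typescript
// Shape of the inline image part that the handler places in the request.
interface InlineImagePart {
  inlineData: { mimeType: string; data: string };
}

// Hypothetical helper mirroring the handler's base64 branch.
// A data URI yields its own MIME type; anything else is treated as raw
// base64 and tagged with the fallback type (image/jpeg in the handler).
function toInlineImagePart(
  imageBase64: string,
  fallbackMimeType: string = 'image/jpeg'
): InlineImagePart {
  const matches = imageBase64.match(/^data:(.+);base64,(.+)$/);
  if (matches) {
    return { inlineData: { mimeType: matches[1], data: matches[2] } };
  }
  return { inlineData: { mimeType: fallbackMimeType, data: imageBase64 } };
}

// Data URI: the MIME type is read from the prefix.
console.log(toInlineImagePart('data:image/png;base64,iVBORw0KGgo=').inlineData.mimeType); // image/png

// Raw base64: the fallback MIME type is applied.
console.log(toInlineImagePart('iVBORw0KGgo=').inlineData.mimeType); // image/jpeg
```

Because raw base64 is always labelled `image/jpeg`, callers sending PNG or WebP data may prefer the data URI form so that the MIME type is explicit. The URL branch would need a comparable conversion after downloading the image, as the handler's own comments note.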