Glama

image_recognition

Analyzes and describes images using Google Gemini AI by processing file paths and custom prompts, enabling detailed content recognition for diverse use cases.

Instructions

Analyze and describe images using Google Gemini AI

Input Schema

Name       Required  Description                          Default
filepath   Yes       Path to the media file to analyze
modelname  No        Gemini model to use for recognition  gemini-2.0-flash
prompt     No        Custom prompt for the recognition    Describe this content
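
For illustration, a client might invoke this tool with an MCP `tools/call` request whose `arguments` follow the input schema above. The sketch below only builds the request object. The `buildImageRecognitionCall` helper, the `ImageRecognitionArgs` and `ToolCallRequest` type names, and the sample file path are assumptions made for this example, not part of the server's code.

```typescript
// Argument shape taken from the input schema; the type name is hypothetical.
interface ImageRecognitionArgs {
  filepath: string;   // required
  modelname?: string; // optional, listed default: gemini-2.0-flash
  prompt?: string;    // optional, listed default: Describe this content
}

// JSON-RPC 2.0 envelope that MCP uses for tool invocations.
interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: string; arguments: ImageRecognitionArgs };
}

// Hypothetical helper: wraps the arguments in a tools/call request.
function buildImageRecognitionCall(id: number, args: ImageRecognitionArgs): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'image_recognition', arguments: args },
  };
}

// Only filepath is required; omitted fields are left for the server to default.
const request = buildImageRecognitionCall(1, { filepath: '/tmp/example.png' });
console.log(JSON.stringify(request, null, 2));
```

How the request is transported (stdio, HTTP, or otherwise) depends on the client and is not shown here.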

Implementation Reference

  • The main handler function (callback) that implements the image_recognition tool logic. It validates the image file, uploads it using GeminiService, processes it with a prompt, and returns the result or error.
    callback: async (args: ImageRecognitionParams): Promise<CallToolResult> => {
      try {
        log.info(`Processing image recognition request for file: ${args.filepath}`);
        log.verbose('Image recognition request', JSON.stringify(args));

        // Verify file exists
        if (!fs.existsSync(args.filepath)) {
          throw new Error(`Image file not found: ${args.filepath}`);
        }

        // Verify file is an image
        const ext = path.extname(args.filepath).toLowerCase();
        if (!['.jpg', '.jpeg', '.png', '.webp'].includes(ext)) {
          throw new Error(`Unsupported image format: ${ext}. Supported formats are: .jpg, .jpeg, .png, .webp`);
        }

        // Default prompt if not provided
        const prompt = args.prompt || 'Describe this image';
        const modelName = args.modelname || 'gemini-2.0-flash';

        // Upload the file
        log.info('Uploading image file...');
        const file = await geminiService.uploadFile(args.filepath);

        // Process with Gemini
        log.info('Generating content from image...');
        const result = await geminiService.processFile(file, prompt, modelName);

        if (result.isError) {
          log.error(`Error in image recognition: ${result.text}`);
          return {
            content: [{ type: 'text', text: result.text }],
            isError: true
          };
        }

        log.info('Image recognition completed successfully');
        log.verbose('Image recognition result', JSON.stringify(result));

        return {
          content: [{ type: 'text', text: result.text }]
        };
      } catch (error) {
        log.error('Error in image recognition tool', error);
        const errorMessage = error instanceof Error ? error.message : String(error);
        return {
          content: [{ type: 'text', text: `Error processing image: ${errorMessage}` }],
          isError: true
        };
      }
    }
  • Zod schema for the input parameters of the image_recognition tool. It extends the common RecognitionParamsSchema without adding fields, so filepath and the optional prompt and modelname are inherited from that base schema.
    export const ImageRecognitionParamsSchema = RecognitionParamsSchema.extend({});
    export type ImageRecognitionParams = z.infer<typeof ImageRecognitionParamsSchema>;
  • src/server.ts:58-62 (registration)
    Registration of the image_recognition tool with the MCP server using mcpServer.tool() method, passing name, description, inputSchema, and callback.
    this.mcpServer.tool(
      imageRecognitionTool.name,
      imageRecognitionTool.description,
      imageRecognitionTool.inputSchema.shape,
      imageRecognitionTool.callback
    );
  • src/server.ts:53-53 (registration)
    Creation of the imageRecognitionTool instance using createImageRecognitionTool factory function before registration.
    const imageRecognitionTool = createImageRecognitionTool(this.geminiService);
  • Factory function that creates the full tool definition object for image_recognition, including name, description, schema, and handler callback.
    export const createImageRecognitionTool = (geminiService: GeminiService) => {
      return {
        name: 'image_recognition',
        description: 'Analyze and describe images using Google Gemini AI',
        inputSchema: ImageRecognitionParamsSchema,
        callback: async (args: ImageRecognitionParams): Promise<CallToolResult> => {
          try {
            log.info(`Processing image recognition request for file: ${args.filepath}`);
            log.verbose('Image recognition request', JSON.stringify(args));

            // Verify file exists
            if (!fs.existsSync(args.filepath)) {
              throw new Error(`Image file not found: ${args.filepath}`);
            }

            // Verify file is an image
            const ext = path.extname(args.filepath).toLowerCase();
            if (!['.jpg', '.jpeg', '.png', '.webp'].includes(ext)) {
              throw new Error(`Unsupported image format: ${ext}. Supported formats are: .jpg, .jpeg, .png, .webp`);
            }

            // Default prompt if not provided
            const prompt = args.prompt || 'Describe this image';
            const modelName = args.modelname || 'gemini-2.0-flash';

            // Upload the file
            log.info('Uploading image file...');
            const file = await geminiService.uploadFile(args.filepath);

            // Process with Gemini
            log.info('Generating content from image...');
            const result = await geminiService.processFile(file, prompt, modelName);

            if (result.isError) {
              log.error(`Error in image recognition: ${result.text}`);
              return {
                content: [{ type: 'text', text: result.text }],
                isError: true
              };
            }

            log.info('Image recognition completed successfully');
            log.verbose('Image recognition result', JSON.stringify(result));

            return {
              content: [{ type: 'text', text: result.text }]
            };
          } catch (error) {
            log.error('Error in image recognition tool', error);
            const errorMessage = error instanceof Error ? error.message : String(error);
            return {
              content: [{ type: 'text', text: `Error processing image: ${errorMessage}` }],
              isError: true
            };
          }
        }
      };
    };
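
The handler's pre-flight steps (the extension allow-list, then the `||` fallbacks for prompt and model) can be isolated as pure functions, which makes them testable without touching the file system or the Gemini API. This is a sketch under stated assumptions: `extensionOf`, `isSupportedImage`, and `resolveOptions` are hypothetical names that do not appear in the server's code, and `extensionOf` is a dependency-free stand-in for `path.extname` that may differ from it on unusual paths.

```typescript
const SUPPORTED_EXTENSIONS = ['.jpg', '.jpeg', '.png', '.webp'];

// Simplified stand-in for path.extname: lower-cased extension of the last path segment.
function extensionOf(filepath: string): string {
  const lastSeparator = Math.max(filepath.lastIndexOf('/'), filepath.lastIndexOf('\\'));
  const base = filepath.slice(lastSeparator + 1);
  const dot = base.lastIndexOf('.');
  // A leading dot (e.g. ".hidden") is treated as part of the name, not an extension.
  return dot > 0 ? base.slice(dot).toLowerCase() : '';
}

// Extension allow-list check, matching the formats the handler accepts.
function isSupportedImage(filepath: string): boolean {
  return SUPPORTED_EXTENSIONS.includes(extensionOf(filepath));
}

// Mirrors the handler's `||` fallbacks: undefined and empty strings both yield the default.
function resolveOptions(args: { prompt?: string; modelname?: string }) {
  return {
    prompt: args.prompt || 'Describe this image',
    modelName: args.modelname || 'gemini-2.0-flash',
  };
}

console.log(isSupportedImage('photos/cat.PNG'));
console.log(isSupportedImage('clips/cat.mp4'));
console.log(resolveOptions({ prompt: '' }));
```

Because `||` treats an empty string as missing, a caller who passes `prompt: ''` still gets the fallback prompt; `??` would preserve the empty string instead.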

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mario-andreschak/mcp_video_recognition'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.