video_recognition

Instructions

Analyze and describe videos using Google Gemini AI

Input Schema

Name	Required	Description	Default
`filepath`	Yes	Path to the media file to analyze
`modelname`	No	Gemini model to use for recognition	gemini-2.0-flash
`prompt`	No	Custom prompt for the recognition	Describe this content
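
As a minimal sketch of how the defaults in the table apply, the helper below fills in `modelname` and `prompt` when a caller supplies only `filepath`. The names `VideoRecognitionArgs` and `withDefaults` are hypothetical and used only for illustration; the server itself applies these defaults through its Zod schema rather than a helper like this.

```typescript
// Hypothetical argument shape mirroring the table above (not the server's own type).
interface VideoRecognitionArgs {
  filepath: string;
  modelname?: string;
  prompt?: string;
}

// Illustrative helper: fills in the documented defaults for omitted fields.
function withDefaults(args: VideoRecognitionArgs): Required<VideoRecognitionArgs> {
  return {
    filepath: args.filepath,
    modelname: args.modelname ?? "gemini-2.0-flash",
    prompt: args.prompt ?? "Describe this content",
  };
}

// Only the required field is given, so both defaults are applied.
const minimal = withDefaults({ filepath: "/path/to/clip.mp4" });

// A custom prompt overrides its default; modelname still falls back.
const custom = withDefaults({
  filepath: "/path/to/clip.mp4",
  prompt: "List the objects that appear",
});
```

The file path and the custom prompt here are placeholder values.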

Implementation Reference

src/tools/video-recognition.ts:16-89 (handler)
Factory function that creates the video_recognition tool object, defining its name, description, input schema reference, and the core callback handler. The handler checks that the file exists and has a supported video extension, uploads it through GeminiService, processes it with the prompt and model name, and returns the recognition result or an error.
export const createVideoRecognitionTool = (geminiService: GeminiService) => {
  return {
    name: 'video_recognition',
    description: 'Analyze and describe videos using Google Gemini AI',
    inputSchema: VideoRecognitionParamsSchema,
    callback: async (args: VideoRecognitionParams): Promise<CallToolResult> => {
      try {
        log.info(`Processing video recognition request for file: ${args.filepath}`);
        log.verbose('Video recognition request', JSON.stringify(args));

        // Verify file exists
        if (!fs.existsSync(args.filepath)) {
          throw new Error(`Video file not found: ${args.filepath}`);
        }

        // Verify file is a video
        const ext = path.extname(args.filepath).toLowerCase();
        if (ext !== '.mp4' && ext !== '.mpeg' && ext !== '.mov' && ext !== '.avi' && ext !== '.webm') {
          throw new Error(`Unsupported video format: ${ext}. Supported formats are: .mp4, .mpeg, .mov, .avi, .webm`);
        }

        // Default prompt if not provided
        const prompt = args.prompt || 'Describe this video';
        const modelName = args.modelname || 'gemini-2.0-flash';

        // Upload the file - this will handle waiting for video processing
        log.info('Uploading and processing video file...');
        const file = await geminiService.uploadFile(args.filepath);

        // Process with Gemini
        log.info('Video processing complete, generating content...');
        const result = await geminiService.processFile(file, prompt, modelName);

        if (result.isError) {
          log.error(`Error in video recognition: ${result.text}`);
          return {
            content: [
              {
                type: 'text',
                text: result.text
              }
            ],
            isError: true
          };
        }

        log.info('Video recognition completed successfully');
        log.verbose('Video recognition result', JSON.stringify(result));

        return {
          content: [
            {
              type: 'text',
              text: result.text
            }
          ]
        };
      } catch (error) {
        log.error('Error in video recognition tool', error);
        const errorMessage = error instanceof Error ? error.message : String(error);

        return {
          content: [
            {
              type: 'text',
              text: `Error processing video: ${errorMessage}`
            }
          ],
          isError: true
        };
      }
    }
  };
};
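
The handler's format check accepts five extensions, compared case-insensitively. As a rough sketch, the same check could be written as a set lookup. `SUPPORTED_VIDEO_EXTENSIONS`, `extensionOf`, and `isSupportedVideo` are hypothetical names, and `extensionOf` is a simplified stand-in for Node's `path.extname`, so edge cases may differ from the real handler.

```typescript
// Extensions accepted by the handler shown above.
const SUPPORTED_VIDEO_EXTENSIONS = new Set([".mp4", ".mpeg", ".mov", ".avi", ".webm"]);

// Illustrative helper: returns the lowercased extension (with the dot), or "" if none.
// Simplified approximation of path.extname, kept dependency-free for this sketch.
function extensionOf(filepath: string): string {
  // Strip any directory part, handling both "/" and "\" separators.
  const lastSeparator = Math.max(filepath.lastIndexOf("/"), filepath.lastIndexOf("\\"));
  const base = filepath.slice(lastSeparator + 1);
  // A leading dot (e.g. ".mp4" as a whole file name) does not count as an extension.
  const dot = base.lastIndexOf(".");
  return dot > 0 ? base.slice(dot).toLowerCase() : "";
}

// Illustrative helper: true when the file name ends in one of the supported extensions.
function isSupportedVideo(filepath: string): boolean {
  return SUPPORTED_VIDEO_EXTENSIONS.has(extensionOf(filepath));
}
```

With this sketch, `isSupportedVideo("/videos/Clip.MP4")` is true, while `isSupportedVideo("notes.txt")` and `isSupportedVideo("archive")` are false.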
src/types/index.ts:11-23 (schema)
Zod schemas defining the input parameters for the video_recognition tool: filepath (required), optional prompt and modelname. VideoRecognitionParamsSchema extends the common RecognitionParamsSchema.
export const RecognitionParamsSchema = z.object({
  filepath: z.string().describe('Path to the media file to analyze'),
  prompt: z.string().default('Describe this content').describe('Custom prompt for the recognition'),
  modelname: z.string().default('gemini-2.0-flash').describe('Gemini model to use for recognition')
});

export type RecognitionParams = z.infer<typeof RecognitionParamsSchema>;

/**
 * Video recognition specific types
 */
export const VideoRecognitionParamsSchema = RecognitionParamsSchema.extend({});
export type VideoRecognitionParams = z.infer<typeof VideoRecognitionParamsSchema>;
src/server.ts:72-77 (registration)
Registers the video_recognition tool with the MCP server by calling mcpServer.tool() with the tool's name, description, input schema shape, and callback handler.
this.mcpServer.tool(
  videoRecognitionTool.name,
  videoRecognitionTool.description,
  videoRecognitionTool.inputSchema.shape,
  videoRecognitionTool.callback
);
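
Once the tool is registered, a caller gets back the result object built by the handler: a `content` array of text items, plus `isError: true` on failure. The sketch below shows one way a caller might unwrap that shape. `ToolResult`, `TextItem`, and `unwrapResult` are hypothetical names, the types are a simplified stand-in for the SDK's `CallToolResult`, and the sample description text is made up.

```typescript
// Simplified, hypothetical stand-ins for the result shape the handler returns.
interface TextItem {
  type: "text";
  text: string;
}

interface ToolResult {
  content: TextItem[];
  isError?: boolean;
}

// Illustrative helper: joins the text items, throwing if the tool reported an error.
function unwrapResult(result: ToolResult): string {
  const text = result.content.map((item) => item.text).join("\n");
  if (result.isError) {
    throw new Error(text);
  }
  return text;
}

// Success shape: no isError flag. The description text is an invented sample.
const ok: ToolResult = {
  content: [{ type: "text", text: "A cat jumps onto a table." }],
};

// Failure shape, using the message format from the handler's catch branch.
const failed: ToolResult = {
  content: [{ type: "text", text: "Error processing video: Video file not found: /missing.mp4" }],
  isError: true,
};
```

Throwing on `isError` is a design choice of this sketch; a caller could equally return the error text to the user.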
