searchVisualContent
Search video content visually to find specific frames using OCR and AI descriptions. Returns matching images with timestamps for evidence-based discovery.
Instructions
Search the actual visual content of a video or your indexed frame library. Uses Apple Vision OCR, optional Gemini frame descriptions, and optional Gemini semantic embeddings. Always returns frame/image evidence with timestamps.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Visual search query, e.g. 'whiteboard diagram' or 'slide that says title research checklist' | |
| videoIdOrUrl | No | Optional video scope. If provided, the server can auto-index this video if needed. | |
| maxResults | No | ||
| minScore | No | ||
| autoIndexIfNeeded | No | If scoped to a video and no visual index exists yet, build it automatically | true |
| intervalSec | No | Frame sampling interval, in seconds, to use if auto-indexing is triggered | |
| maxFrames | No | Frame cap to use if auto-indexing is triggered | |
| imageFormat | No | ||
| width | No | ||
| autoDownload | No | ||
| downloadFormat | No | ||
| includeGeminiDescriptions | No | ||
| includeGeminiEmbeddings | No | ||
| dryRun | No | | |
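
As a rough illustration of how the parameters above fit together, the sketch below assembles an arguments object for `searchVisualContent`. The helper `build_search_arguments` is hypothetical and not part of the tool. The value types (integers for `maxResults` and `intervalSec`) and the placeholder `VIDEO_ID_OR_URL` are assumptions, since the schema above does not state types. How the arguments are sent to the server depends on your client and is not shown.

```python
import json

# Parameter names copied from the input schema table; anything else is rejected.
OPTIONAL_PARAMS = {
    "maxResults", "minScore", "autoIndexIfNeeded", "intervalSec", "maxFrames",
    "imageFormat", "width", "autoDownload", "downloadFormat",
    "includeGeminiDescriptions", "includeGeminiEmbeddings", "dryRun",
}


def build_search_arguments(query, video_id_or_url=None, **options):
    """Hypothetical helper: build the arguments dict for searchVisualContent."""
    # `query` is the only required parameter.
    if not query or not query.strip():
        raise ValueError("query is required")

    unknown = set(options) - OPTIONAL_PARAMS
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")

    args = {"query": query.strip()}
    if video_id_or_url is not None:
        # Scoping to one video lets the server auto-index it if needed.
        args["videoIdOrUrl"] = video_id_or_url

    # Omit unset options so the server can apply its own defaults.
    args.update({k: v for k, v in options.items() if v is not None})
    return args


example = build_search_arguments(
    "whiteboard diagram",
    video_id_or_url="VIDEO_ID_OR_URL",  # placeholder, not a real ID
    maxResults=5,      # assumed to be an integer
    intervalSec=10,    # assumed to be seconds between sampled frames
    dryRun=None,       # left unset, so it is dropped from the payload
)
print(json.dumps(example, indent=2))
```

Leaving unset options out of the payload, rather than sending nulls, keeps the request minimal and defers to whatever defaults the server defines.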