get_video_preview
Generate a contact-sheet preview of a YouTube video, sampling frames evenly across the video or a time window, with timestamps for each tile to identify scenes and pick moments for closer inspection.
Instructions
Get a visual overview of a YouTube video as ONE tiled contact-sheet image: a grid of frames
sampled evenly across the video (or across a start..end window of it), plus a text legend
mapping each tile to its mm:ss timestamp.
Use this to see what's going on across a video (talking head vs slides vs demo footage, scene changes, "is there a chart anywhere?") and to pick moments worth a closer look.
To inspect one part in more detail, call it again with start/end around that part -- but pick the window from the transcript, chapters, or get_most_replayed first and zoom once. Don't binary-search the video with repeated sheets, since every returned image stays in context.
Read each tile's timestamp from the legend -- do not count grid cells yourself. Tiles are too small to read: to read a slide, caption, or UI, follow up with get_video_frame(video, at=<that tile's timestamp>).
Windows under ~1 minute may return near-duplicate tiles, because sampled frames land on keyframes, which are a few seconds apart.
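The even sampling and the mm:ss legend can be pictured with a short sketch. This is illustrative only: the function names are hypothetical, and sampling at the midpoint of each equal slice of the window is an assumption, not necessarily how the tool picks its frames.

```python
def sample_timestamps(start: float, end: float, tiles: int = 12) -> list[float]:
    """Return `tiles` timestamps spread evenly over the window [start, end].

    Assumption: each timestamp sits at the midpoint of one equal slice.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    step = (end - start) / tiles
    return [start + step * (i + 0.5) for i in range(tiles)]


def format_mmss(seconds: float) -> str:
    """Format a number of seconds as mm:ss (minutes are not wrapped at 60)."""
    total = int(seconds)
    return f"{total // 60:02d}:{total % 60:02d}"


def build_legend(start: float, end: float, tiles: int = 12) -> list[str]:
    """Map each tile number (1-based) to its mm:ss timestamp."""
    stamps = sample_timestamps(start, end, tiles)
    return [f"tile {i}: {format_mmss(t)}" for i, t in enumerate(stamps, start=1)]


if __name__ == "__main__":
    # A 2-minute window with 4 tiles: one tile every 30 seconds.
    for line in build_legend(0, 120, tiles=4):
        print(line)
```

With a 40-second window and the default 12 tiles, the same arithmetic gives a step of about 3.3 seconds, which is on the order of a typical keyframe interval. That is why short windows tend to produce near-duplicate tiles.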
Requires ffmpeg on the server (the system binary, or the one bundled by the [media] extra).
Args:
video: A YouTube URL (watch, youtu.be, shorts, embed, live) or an 11-character video ID.
tiles: How many frames to sample (clamped 4..24; default 12).
tile_width: Width in pixels of each tile (clamped 160..480; default 320). The whole sheet stays around 1000-1300 px wide at the defaults -- cheap on a vision model's image budget while keeping tiles recognizable.
start: Optional window start -- seconds (e.g. 90) or a "mm:ss" / "h:mm:ss" string. Defaults to the beginning of the video.
end: Optional window end, same forms. Defaults to (and is clamped to) the video's end.
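The clamping and time-format rules for these arguments can be sketched as follows. This is a minimal illustration under stated assumptions: parse_time and normalize_args are hypothetical helper names, and the video duration is passed in as a known value rather than looked up.

```python
def clamp(value: int, low: int, high: int) -> int:
    """Restrict value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def parse_time(value) -> float:
    """Accept seconds (e.g. 90) or a "mm:ss" / "h:mm:ss" string; return seconds."""
    if isinstance(value, (int, float)):
        return float(value)
    parts = value.strip().split(":")
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unrecognized time value: {value!r}")
    seconds = 0.0
    for part in parts:
        # Each additional field shifts the running total up by a factor of 60.
        seconds = seconds * 60 + float(part)
    return seconds


def normalize_args(duration: float, tiles: int = 12, tile_width: int = 320,
                   start=None, end=None) -> dict:
    """Apply the documented defaults and clamps (hypothetical helper)."""
    start_s = 0.0 if start is None else parse_time(start)
    # end defaults to the video's end and is clamped to it.
    end_s = duration if end is None else min(parse_time(end), duration)
    return {
        "tiles": clamp(tiles, 4, 24),
        "tile_width": clamp(tile_width, 160, 480),
        "start": start_s,
        "end": end_s,
    }


if __name__ == "__main__":
    # Out-of-range values are pulled back into the documented limits.
    print(normalize_args(600, tiles=50, tile_width=100, start="1:30", end="20:00"))
```

In this sketch, a request for 50 tiles at 100 px on a 10-minute video resolves to 24 tiles at 160 px, with the window running from 90 seconds to the end of the video.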
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
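| video | Yes | A YouTube URL (watch, youtu.be, shorts, embed, live) or an 11-character video ID. | |
| tiles | No | How many frames to sample (clamped 4..24). | 12 |
| tile_width | No | Width in pixels of each tile (clamped 160..480). | 320 |
| start | No | Window start: seconds (e.g. 90) or a "mm:ss" / "h:mm:ss" string. | Beginning of the video |
| end | No | Window end, same forms as start; clamped to the video's end. | End of the video |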