scroll_capture
Capture entire webpages or long documents by scrolling and stitching multiple screenshots into a single image for full-length overviews.
Instructions
Purpose: Scroll a window top-to-bottom (or left-to-right) and stitch all frames into one image, for full-length webpages or documents that exceed a single screenshot.

Details: Focuses the target window, jumps to the top with Ctrl+Home, then captures one frame per Page Down (mouse scroll when direction='right') until two consecutive frames are identical or maxScrolls is reached. Pixel-overlap detection removes duplicated rows at the seams; check overlapMode in the response, where 'mixed-with-failures' means some seams may still contain duplicate rows. Output is capped at ~700KB raw (MCP base64 encoding inflates this to ~933KB, approaching the 1MB message limit). When sizeReduced=true appears in the response, iterative WebP downscaling was applied (up to 3 passes at 0.75× each); reduce maxScrolls or add grayscale=true to avoid truncation.

Prefer: Use only when the goal is a whole-page overview of content too long for one screenshot. For partial verification or locating a specific section, prefer scroll + screenshot(detail='text'): it returns actionable[] with coords and costs only per-viewport tokens. scroll_capture returns a stitched image (not clickable elements) that stays expensive in tokens regardless of the 1MB guard.

Caveats: When sizeReduced=true, stitched image pixels do NOT match screen coords; use the image for reading only, not for mouse_click. When overlapMode='mixed-with-failures', expect occasional duplicate content rows near frame boundaries. Increase scrollDelayMs for pages with animations or lazy-loaded images.
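The size figures and response flags above can be checked with a small Python sketch. It assumes the tool response is available as a plain dict carrying the sizeReduced and overlapMode fields named above, and that "700KB" means 700 × 1024 bytes; the helper names (base64_size, linear_scale_after, review_response) are hypothetical and not part of the tool.

```python
import base64
import math

RAW_CAP_BYTES = 700 * 1024   # ~700KB raw cap; reading KB as 1024 bytes is an assumption
DOWNSCALE_FACTOR = 0.75      # per-pass scale stated in the instructions
MAX_PASSES = 3               # maximum number of downscale passes stated in the instructions


def base64_size(raw_bytes: int) -> int:
    """Length of padded base64 output for raw_bytes bytes of input."""
    return 4 * math.ceil(raw_bytes / 3)


def linear_scale_after(passes: int) -> float:
    """Linear (width/height) scale remaining after the given number of passes."""
    return DOWNSCALE_FACTOR ** passes


def review_response(response: dict) -> list[str]:
    """Collect warnings from a response dict (the dict shape is an assumption)."""
    notes = []
    if response.get("sizeReduced"):
        notes.append("image was downscaled: do not use its pixels as click coords")
    if response.get("overlapMode") == "mixed-with-failures":
        notes.append("some seams may contain duplicate rows")
    return notes


# A payload at the raw cap encodes to roughly 933 KiB, just under 1 MiB.
encoded = base64.b64encode(bytes(RAW_CAP_BYTES))
print(len(encoded) == base64_size(RAW_CAP_BYTES))

# After all three passes each edge is about 42% of its original length.
print(linear_scale_after(MAX_PASSES))

print(review_response({"sizeReduced": True, "overlapMode": "mixed-with-failures"}))
```

If review_response returns any notes, treating the stitched image as read-only and falling back to scroll + screenshot(detail='text') for clicks is the safer path.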
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| windowTitle | Yes | Partial title of the window to capture (case-insensitive match) | |
| direction | No | Scroll direction: 'down' (vertical, uses the Page Down key) or 'right' (horizontal, uses mouse scroll) | down |
| maxScrolls | No | Maximum scroll iterations before stopping (max 30) | 10 |
| scrollDelayMs | No | Milliseconds to wait after each scroll for rendering to settle. Increase for slow or animated pages. | 400 |
| maxWidth | No | Maximum size of the short edge of the final image. For 'down': caps the image width; height is unconstrained. For 'right': caps the image height; width is unconstrained. | 1280 |
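As an illustration of the schema, the following Python sketch builds an argument object with the documented defaults and limits. The helper build_scroll_capture_args is hypothetical, the lower bound of 1 for maxScrolls is an assumption (the schema states only the maximum), and how the arguments are sent to the server depends on the MCP client in use.

```python
from typing import Any

VALID_DIRECTIONS = {"down", "right"}
MAX_SCROLLS_LIMIT = 30  # upper bound stated in the schema


def build_scroll_capture_args(
    window_title: str,
    direction: str = "down",
    max_scrolls: int = 10,
    scroll_delay_ms: int = 400,
    max_width: int = 1280,
) -> dict[str, Any]:
    """Hypothetical helper: validate inputs and return the tool arguments."""
    if not window_title:
        raise ValueError("windowTitle is required")
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(VALID_DIRECTIONS)}")
    # Lower bound of 1 is an assumption; the schema only documents the max.
    if not 1 <= max_scrolls <= MAX_SCROLLS_LIMIT:
        raise ValueError(f"maxScrolls must be between 1 and {MAX_SCROLLS_LIMIT}")
    return {
        "windowTitle": window_title,
        "direction": direction,
        "maxScrolls": max_scrolls,
        "scrollDelayMs": scroll_delay_ms,
        "maxWidth": max_width,
    }


# Example: a slow, lazy-loading page gets a longer settle delay.
args = build_scroll_capture_args("Release Notes", max_scrolls=15, scroll_delay_ms=800)
print(args)
```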