take_screenshot
Capture browser screenshots as base64 PNG images to visually analyze web pages before interacting with elements. Supports full-page captures, viewport shots, or specific element targeting for web automation tasks.
Instructions
Capture a screenshot of the current browser page as a base64-encoded PNG image. Essential for visual feedback to understand what's displayed before deciding which buttons to click or forms to fill. Supports capturing the visible viewport, entire scrollable page, or specific elements. Returns the image as base64 string and data URL for easy display.
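As a rough illustration of consuming the returned base64 string, the sketch below decodes it back into bytes and checks for the standard 8-byte PNG signature. `decodeScreenshot` is a hypothetical client-side helper, not part of the server; it only assumes the result carries the `screenshotBase64` field described above.

```javascript
// Hypothetical client-side helper (not part of the server).
// Every PNG file begins with this fixed 8-byte signature.
const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

function decodeScreenshot(result) {
  if (!result || typeof result.screenshotBase64 !== "string") {
    throw new Error("Result has no screenshotBase64 field");
  }
  // Turn the base64 text back into raw image bytes.
  const bytes = Buffer.from(result.screenshotBase64, "base64");
  // Sanity check: do the first 8 bytes match the PNG signature?
  const isPng = bytes.length >= 8 && bytes.subarray(0, 8).equals(PNG_SIGNATURE);
  return { bytes, isPng };
}
```

The `bytes` buffer can then be written to disk or handed to an image library; the `isPng` flag is a cheap guard against a truncated or malformed payload.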
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| fullPage | No | Capture the entire scrollable page content (true) or just visible viewport (false). Use true for long pages | false |
| selector | No | Optional CSS selector to capture only a specific element (e.g., '.chat-container', '#main-content') | |
| sessionId | Yes | Session ID obtained from initialize_session | |
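The three capture modes map onto argument objects like the ones sketched below. The session ID value is a placeholder (a real one comes from `initialize_session`), and `validateArgs` is a hypothetical client-side pre-check that mirrors the table; the server itself only verifies that `sessionId` is present.

```javascript
// Placeholder value; a real ID is returned by initialize_session.
const sessionId = "session-123";

// Visible viewport only (fullPage defaults to false).
const viewportShot = { sessionId };

// Entire scrollable page.
const fullPageShot = { sessionId, fullPage: true };

// A single element matched by a CSS selector.
const elementShot = { sessionId, selector: "#main-content" };

// Hypothetical pre-check mirroring the schema table; returns a list of problems.
function validateArgs(args) {
  const errors = [];
  if (typeof args.sessionId !== "string" || args.sessionId.length === 0) {
    errors.push("sessionId is required and must be a non-empty string");
  }
  if (args.fullPage !== undefined && typeof args.fullPage !== "boolean") {
    errors.push("fullPage must be a boolean");
  }
  if (args.selector !== undefined && typeof args.selector !== "string") {
    errors.push("selector must be a string");
  }
  return errors;
}
```

An empty array from `validateArgs` means the arguments match the documented shape; it says nothing about whether the session or the selector actually exists.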
Implementation Reference
- src/tools/visualInspection.js:10-40 (handler) The core handler function that executes the screenshot logic using Puppeteer: gets the browser page from session, captures screenshot of page or element, converts to base64 PNG, and returns structured response with image data URL.

  ```javascript
  export async function takeScreenshot(
    sessionId,
    fullPage = false,
    selector = null
  ) {
    const session = getSession(sessionId);
    const { page } = session;

    let screenshotBuffer;

    if (selector) {
      const element = await page.$(selector);
      if (!element) {
        throw new Error(`Element with selector "${selector}" not found`);
      }
      screenshotBuffer = await element.screenshot();
    } else {
      screenshotBuffer = await page.screenshot({ fullPage });
    }

    const base64Image = screenshotBuffer.toString("base64");

    return {
      success: true,
      sessionId,
      currentUrl: page.url(),
      title: await page.title(),
      screenshotBase64: base64Image,
      screenshotDataUrl: `data:image/png;base64,${base64Image}`,
      message: "Screenshot captured successfully",
    };
  }
  ```
- src/index.js:99-123 (schema) The input schema and tool metadata (name, description) registered for the listTools request in the MCP server.

  ```javascript
  {
    name: "take_screenshot",
    description:
      "Capture a screenshot of the current browser page as a base64-encoded PNG image. Essential for visual feedback to understand what's displayed before deciding which buttons to click or forms to fill. Supports capturing the visible viewport, entire scrollable page, or specific elements. Returns the image as base64 string and data URL for easy display.",
    inputSchema: {
      type: "object",
      properties: {
        sessionId: {
          type: "string",
          description: "Session ID obtained from initialize_session",
        },
        fullPage: {
          type: "boolean",
          description:
            "Capture the entire scrollable page content (true) or just visible viewport (false). Use true for long pages (default: false)",
          default: false,
        },
        selector: {
          type: "string",
          description:
            "Optional CSS selector to capture only a specific element (e.g., '.chat-container', '#main-content')",
        },
      },
      required: ["sessionId"],
    },
  ```
- src/index.js:429-438 (registration) The dispatch case in CallToolRequestSchema handler that validates arguments and invokes the takeScreenshot function.

  ```javascript
  case "take_screenshot": {
    const { sessionId, fullPage = false, selector } = args;
    if (!sessionId) {
      throw new McpError(
        ErrorCode.InvalidParams,
        "sessionId parameter is required"
      );
    }
    result = await takeScreenshot(sessionId, fullPage, selector);
    break;
  ```
- src/index.js:12-13 (registration) Import of the takeScreenshot handler from the tools module.

  ```javascript
  reverseEngineerChat,
  takeScreenshot,
  ```
- src/tools/reverseEngineer.js:13-13 (helper) Re-export of takeScreenshot from visualInspection.js to the central tools index.

  ```javascript
  export { takeScreenshot, getCurrentPageInfo } from "./visualInspection.js";
  ```
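The handler's branching (element capture when a selector is given, otherwise a page capture governed by `fullPage`) can be exercised without a browser. The sketch below is not the project's code: `captureFromPage` is a hypothetical variant that takes the page object directly instead of looking up a session, and `stubPage` is a stand-in that returns fixed bytes rather than pixels. It wraps the result in `Buffer.from(...)` on the assumption that some Puppeteer versions return a `Uint8Array` rather than a `Buffer`.

```javascript
// Hypothetical variant of the handler: same branching, but the page is
// passed in directly so it can run against a stub.
async function captureFromPage(page, { fullPage = false, selector = null } = {}) {
  let buffer;
  if (selector) {
    // A selector takes precedence; fullPage is not consulted on this path.
    const element = await page.$(selector);
    if (!element) {
      throw new Error(`Element with selector "${selector}" not found`);
    }
    buffer = await element.screenshot();
  } else {
    buffer = await page.screenshot({ fullPage });
  }
  // Buffer.from accepts both a Buffer and a Uint8Array.
  const base64 = Buffer.from(buffer).toString("base64");
  return {
    screenshotBase64: base64,
    screenshotDataUrl: `data:image/png;base64,${base64}`,
  };
}

// Stub standing in for a Puppeteer page; returns labelled bytes, not images.
const stubPage = {
  async $(selector) {
    return selector === "#exists"
      ? { screenshot: async () => Buffer.from("element") }
      : null;
  },
  async screenshot({ fullPage }) {
    return Buffer.from(fullPage ? "full" : "viewport");
  },
};
```

Calling `captureFromPage(stubPage, { selector: "#exists", fullPage: true })` takes the element path, which shows that `fullPage` has no effect once a selector is supplied.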