screenshot
Capture webpage screenshots with options for full-page or viewport shots. Specify a URL and define post-load actions to automate precise captures for documentation or analysis.
Instructions
Take a screenshot of a webpage
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| full_page | No | Whether to capture the full page or just the viewport | false |
| steps | No | Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit") | |
| url | Yes | The URL to navigate to | |
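
As a rough illustration of how the parameters above fit together, the sketch below builds a JSON-RPC `tools/call` request, the message shape MCP clients use to invoke a tool. The helper name `build_screenshot_call` and the `request_id` value are hypothetical; only the tool name and the three argument names come from the schema.

```python
import json


def build_screenshot_call(url, full_page=False, steps=None, request_id=1):
    """Build a JSON-RPC `tools/call` request for the screenshot tool.

    Hypothetical helper for illustration; not part of the server's code.
    """
    if not url:
        # The schema marks `url` as the only required argument.
        raise ValueError("url is required")

    arguments = {"url": url, "full_page": bool(full_page)}
    if steps:
        # `steps` is a single string, not a list of actions.
        arguments["steps"] = steps

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "screenshot", "arguments": arguments},
    }


request = build_screenshot_call(
    "https://example.com",
    full_page=True,
    steps="click #submit, scroll down",
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally assemble this envelope itself; the sketch only shows which fields the tool expects.
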
Implementation Reference
- src/browser_handler.py:149-192 (handler) Core implementation of the screenshot tool: validates input, constructs a task for the AI agent to navigate to the URL and perform the steps, runs the agent, captures the screenshot (honoring the full_page option), saves a PNG file, and returns the base64 data and file path.
  ```python
  if command == 'screenshot':
      if not args.get('url'):
          return {
              'success': False,
              'error': 'URL is required for screenshot command'
          }

      task = f"1. Go to {args['url']}"
      if args.get('steps'):
          steps = args['steps'].split(',')
          for i, step in enumerate(steps, 2):
              task += f"\n{i}. {step.strip()}"
          task += f"\n{len(steps) + 2}. Take a screenshot"
      else:
          task += "\n2. Take a screenshot"

      if args.get('full_page'):
          task += " of the full page"

      print(f"[DEBUG] Creating agent for task: {task}")
      use_vision = os.getenv('USE_VISION', 'false').lower() == 'true'
      agent = Agent(task=task, llm=llm, use_vision=use_vision, browser_context=context)
      print("[DEBUG] Running agent")
      await agent.run()
      print("[DEBUG] Agent run completed")

      # Get the screenshot from the browser context
      try:
          # await context.navigate_to(args['url'])
          screenshot_base64 = await context.take_screenshot(full_page=args.get('full_page', False))
          filename = f"screenshot_{int(time.time())}.png"
          filepath = os.path.join(SCREENSHOT_DIR, filename)

          # Decode base64 and save image
          screenshot_bytes = base64.b64decode(screenshot_base64)
          with open(filepath, 'wb') as f:
              f.write(screenshot_bytes)

          return {
              'success': True,
              'screenshot': screenshot_base64,  # Keep base64 for potential direct display
              'filepath': os.path.abspath(filepath)  # Include full file path in response
          }
      finally:
          await context.close()
  ```
- src/index.ts:151-172 (schema) MCP tool schema definition for 'screenshot', specifying the input parameters: url (string, required), full_page (boolean, default false), and steps (string).
  ```typescript
  {
    name: 'screenshot',
    description: 'Take a screenshot of a webpage',
    inputSchema: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'The URL to navigate to',
        },
        full_page: {
          type: 'boolean',
          description: 'Whether to capture the full page or just the viewport',
          default: false,
        },
        steps: {
          type: 'string',
          description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
        },
      },
      required: ['url'],
    },
  ```
- src/index.ts:149-233 (registration) Registers the 'screenshot' tool (among others) with the MCP server via the ListToolsRequestSchema handler, making it discoverable by clients.
  ```typescript
  this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [
      {
        name: 'screenshot',
        description: 'Take a screenshot of a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            full_page: {
              type: 'boolean',
              description: 'Whether to capture the full page or just the viewport',
              default: false,
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url'],
        },
      },
      {
        name: 'get_html',
        description: 'Get the HTML content of a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url'],
        },
      },
      {
        name: 'execute_js',
        description: 'Execute JavaScript code on a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            script: {
              type: 'string',
              description: 'The JavaScript code to execute',
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url', 'script'],
        },
      },
      {
        name: 'get_console_logs',
        description: 'Get the console logs of a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url'],
        },
      },
    ],
  }));
  ```
- src/index.ts:269-281 (handler) TypeScript wrapper handler for the screenshot tool call: dispatches to the Python script via runPythonScript and formats the response with a status and file path.
  ```typescript
  if (request.params.name === 'screenshot') {
    return {
      content: [
        {
          type: 'text',
          text: JSON.stringify({
            status: `Screenshot successful.`,
            path: result.filepath,
            // screenshot: 'Data: ' + result.screenshot
          })
        },
      ],
    };
  ```
- src/browser_handler.py:19-25 (helper) Helper that defines and creates the screenshots directory used for saving screenshot files.
  ```python
  SCREENSHOT_DIR = os.path.join('.', 'screenshots')

  async def handle_command(command, args):
      """Handle different browser commands"""
      # Ensure screenshot directory exists
      os.makedirs(SCREENSHOT_DIR, exist_ok=True)
  ```
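
The two browser-independent parts of the handler listed above, building the numbered agent task and persisting the base64 payload, can be tried in isolation. The following is a minimal sketch that mirrors that logic; the function names `build_task` and `save_screenshot`, the explicit `timestamp` argument, and the stand-in payload are assumptions made for illustration and do not appear in the source.

```python
import base64
import os
import tempfile


def build_task(url, steps=None, full_page=False):
    """Mirror the numbered task string the handler passes to the agent."""
    task = f"1. Go to {url}"
    if steps:
        parts = steps.split(",")
        # Step numbering starts at 2 because step 1 is the navigation.
        for i, step in enumerate(parts, 2):
            task += f"\n{i}. {step.strip()}"
        task += f"\n{len(parts) + 2}. Take a screenshot"
    else:
        task += "\n2. Take a screenshot"
    if full_page:
        task += " of the full page"
    return task


def save_screenshot(screenshot_base64, directory, timestamp):
    """Decode a base64 payload and write it to a timestamped .png file."""
    os.makedirs(directory, exist_ok=True)
    filepath = os.path.join(directory, f"screenshot_{int(timestamp)}.png")
    with open(filepath, "wb") as f:
        f.write(base64.b64decode(screenshot_base64))
    return os.path.abspath(filepath)


task = build_task("https://example.com", "click #submit, scroll down", full_page=True)
print(task)

# Stand-in bytes, not a real PNG; the handler gets its payload from the browser context.
fake_payload = base64.b64encode(b"not a real image").decode("ascii")
saved_path = save_screenshot(fake_payload, tempfile.mkdtemp(), timestamp=1700000000)
print(saved_path)
```

Two consequences of this logic are worth keeping in mind. Because `steps` is split on every comma, a sentence-style step that itself contains a comma becomes two numbered steps. And because the file name uses a whole-second timestamp, two captures taken within the same second resolve to the same file name.
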