screenshot

Capture webpage screenshots for documentation, testing, or sharing by automating browser actions after page load.

Instructions

Take a screenshot of a webpage

Input Schema

TableJSON Schema

Name	Required	Description
`url`	Yes	The URL to navigate to
`full_page`	No	Whether to capture the full page or just the viewport
`steps`	No	Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")

Implementation Reference

src/browser_handler.py:149-192 (handler)
Core handler for the 'screenshot' tool: validates URL, constructs task for Agent to navigate and perform steps, runs the agent, captures screenshot (full_page option), saves to file, returns base64 and filepath.
if command == 'screenshot': if not args.get('url'): return { 'success': False, 'error': 'URL is required for screenshot command' } task = f"1. Go to {args['url']}" if args.get('steps'): steps = args['steps'].split(',') for i, step in enumerate(steps, 2): task += f"\n{i}. {step.strip()}" task += f"\n{len(steps) + 2}. Take a screenshot" else: task += "\n2. Take a screenshot" if args.get('full_page'): task += " of the full page" print(f"[DEBUG] Creating agent for task: {task}") use_vision = os.getenv('USE_VISION', 'false').lower() == 'true' agent = Agent(task=task, llm=llm, use_vision=use_vision, browser_context=context) print("[DEBUG] Running agent") await agent.run() print("[DEBUG] Agent run completed") # Get the screenshot from the browser context try: # await context.navigate_to(args['url']) screenshot_base64 = await context.take_screenshot(full_page=args.get('full_page', False)) filename = f"screenshot_{int(time.time())}.png" filepath = os.path.join(SCREENSHOT_DIR, filename) # Decode base64 and save image screenshot_bytes = base64.b64decode(screenshot_base64) with open(filepath, 'wb') as f: f.write(screenshot_bytes) return { 'success': True, 'screenshot': screenshot_base64, # Keep base64 for potential direct display 'filepath': os.path.abspath(filepath) # Include full file path in response } finally: await context.close()
src/index.ts:151-173 (schema)
Tool schema definition including name, description, and input schema with required 'url', optional 'full_page' and 'steps'.
{ name: 'screenshot', description: 'Take a screenshot of a webpage', inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL to navigate to', }, full_page: { type: 'boolean', description: 'Whether to capture the full page or just the viewport', default: false, }, steps: { type: 'string', description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")', }, }, required: ['url'], }, },
src/index.ts:149-233 (registration)
Registers the 'screenshot' tool (among others) in the MCP server's ListToolsRequestSchema handler.
this.server.setRequestHandler(ListToolsRequestSchema, async () => ({ tools: [ { name: 'screenshot', description: 'Take a screenshot of a webpage', inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL to navigate to', }, full_page: { type: 'boolean', description: 'Whether to capture the full page or just the viewport', default: false, }, steps: { type: 'string', description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")', }, }, required: ['url'], }, }, { name: 'get_html', description: 'Get the HTML content of a webpage', inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL to navigate to', }, steps: { type: 'string', description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")', }, }, required: ['url'], }, }, { name: 'execute_js', description: 'Execute JavaScript code on a webpage', inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL to navigate to', }, script: { type: 'string', description: 'The JavaScript code to execute', }, steps: { type: 'string', description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")', }, }, required: ['url', 'script'], }, }, { name: 'get_console_logs', description: 'Get the console logs of a webpage', inputSchema: { type: 'object', properties: { url: { type: 'string', description: 'The URL to navigate to', }, steps: { type: 'string', description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")', }, }, required: ['url'], }, }, ], }));
src/index.ts:269-281 (handler)
MCP CallToolRequestSchema handler specific response formatting for screenshot tool results from Python subprocess.
if (request.params.name === 'screenshot') { return { content: [ { type: 'text', text: JSON.stringify({ status: `Screenshot successful.`, path: result.filepath, // screenshot: 'Data: ' + result.screenshot }) }, ], };
src/browser_handler.py:19-25 (helper)
Helper: Defines and ensures existence of screenshots directory for saving screenshot files.
SCREENSHOT_DIR = os.path.join('.', 'screenshots') async def handle_command(command, args): """Handle different browser commands""" # Ensure screenshot directory exists os.makedirs(SCREENSHOT_DIR, exist_ok=True)

Browser Use Server

screenshot

Instructions

Input Schema

Implementation Reference

Other Tools

Latest Blog Posts

MCP directory API