by ztobs

screenshot

Capture webpage screenshots with options for full-page or viewport shots. Specify a URL and define post-load actions to automate precise captures for documentation or analysis.

Instructions

Take a screenshot of a webpage

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| full_page | No | Whether to capture the full page or just the viewport | false |
| steps | No | Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit") | |
| url | Yes | The URL to navigate to | |
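
The parameters above can be checked before a call is dispatched. The following is a minimal sketch of such a check, not part of the server itself; the helper name `validate_screenshot_args` is hypothetical, and the `false` default for `full_page` is taken from the tool schema shown in the Implementation Reference.

```python
from typing import Any, Dict


def validate_screenshot_args(args: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: check arguments against the input schema and apply defaults."""
    url = args.get("url")
    if not isinstance(url, str) or not url:
        raise ValueError("url is required and must be a non-empty string")

    # full_page is optional and falls back to False (viewport-only capture).
    full_page = args.get("full_page", False)
    if not isinstance(full_page, bool):
        raise ValueError("full_page must be a boolean")

    # steps is optional; when present it must be a single string.
    steps = args.get("steps")
    if steps is not None and not isinstance(steps, str):
        raise ValueError("steps must be a string")

    validated: Dict[str, Any] = {"url": url, "full_page": full_page}
    if steps is not None:
        validated["steps"] = steps
    return validated


if __name__ == "__main__":
    print(validate_screenshot_args({"url": "https://example.com"}))
```

A call with only `url` set comes back with `full_page` filled in as `False`; a call without `url` raises `ValueError`.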

Implementation Reference

  • Core implementation of the screenshot tool: handles input validation, constructs task for AI agent to navigate URL and perform steps, runs the agent, captures screenshot (full_page option), saves PNG file, returns base64 data and file path.
    if command == 'screenshot':
        if not args.get('url'):
            return {
                'success': False,
                'error': 'URL is required for screenshot command'
            }

        task = f"1. Go to {args['url']}"
        if args.get('steps'):
            steps = args['steps'].split(',')
            for i, step in enumerate(steps, 2):
                task += f"\n{i}. {step.strip()}"
            task += f"\n{len(steps) + 2}. Take a screenshot"
        else:
            task += "\n2. Take a screenshot"
        if args.get('full_page'):
            task += " of the full page"

        print(f"[DEBUG] Creating agent for task: {task}")
        use_vision = os.getenv('USE_VISION', 'false').lower() == 'true'
        agent = Agent(task=task, llm=llm, use_vision=use_vision, browser_context=context)
        print("[DEBUG] Running agent")
        await agent.run()
        print("[DEBUG] Agent run completed")

        # Get the screenshot from the browser context
        try:
            # await context.navigate_to(args['url'])
            screenshot_base64 = await context.take_screenshot(full_page=args.get('full_page', False))
            filename = f"screenshot_{int(time.time())}.png"
            filepath = os.path.join(SCREENSHOT_DIR, filename)

            # Decode base64 and save image
            screenshot_bytes = base64.b64decode(screenshot_base64)
            with open(filepath, 'wb') as f:
                f.write(screenshot_bytes)

            return {
                'success': True,
                'screenshot': screenshot_base64,  # Keep base64 for potential direct display
                'filepath': os.path.abspath(filepath)  # Include full file path in response
            }
        finally:
            await context.close()
  • MCP tool schema definition for 'screenshot', specifying input parameters: url (required), full_page (boolean, default false), steps (string).
    {
      name: 'screenshot',
      description: 'Take a screenshot of a webpage',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The URL to navigate to',
          },
          full_page: {
            type: 'boolean',
            description: 'Whether to capture the full page or just the viewport',
            default: false,
          },
          steps: {
            type: 'string',
            description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
          },
        },
        required: ['url'],
      },
  • src/index.ts:149-233 (registration)
    Registers the 'screenshot' tool (among others) with the MCP server via ListToolsRequestSchema handler, making it discoverable by clients.
    this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        {
          name: 'screenshot',
          description: 'Take a screenshot of a webpage',
          inputSchema: {
            type: 'object',
            properties: {
              url: {
                type: 'string',
                description: 'The URL to navigate to',
              },
              full_page: {
                type: 'boolean',
                description: 'Whether to capture the full page or just the viewport',
                default: false,
              },
              steps: {
                type: 'string',
                description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
              },
            },
            required: ['url'],
          },
        },
        {
          name: 'get_html',
          description: 'Get the HTML content of a webpage',
          inputSchema: {
            type: 'object',
            properties: {
              url: {
                type: 'string',
                description: 'The URL to navigate to',
              },
              steps: {
                type: 'string',
                description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
              },
            },
            required: ['url'],
          },
        },
        {
          name: 'execute_js',
          description: 'Execute JavaScript code on a webpage',
          inputSchema: {
            type: 'object',
            properties: {
              url: {
                type: 'string',
                description: 'The URL to navigate to',
              },
              script: {
                type: 'string',
                description: 'The JavaScript code to execute',
              },
              steps: {
                type: 'string',
                description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
              },
            },
            required: ['url', 'script'],
          },
        },
        {
          name: 'get_console_logs',
          description: 'Get the console logs of a webpage',
          inputSchema: {
            type: 'object',
            properties: {
              url: {
                type: 'string',
                description: 'The URL to navigate to',
              },
              steps: {
                type: 'string',
                description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
              },
            },
            required: ['url'],
          },
        },
      ],
    }));
  • TypeScript wrapper handler for screenshot tool call: dispatches to Python script via runPythonScript, formats response with status and filepath.
    if (request.params.name === 'screenshot') {
      return {
        content: [
          {
            type: 'text',
            text: JSON.stringify({
              status: `Screenshot successful.`,
              path: result.filepath,
              // screenshot: 'Data: ' + result.screenshot
            })
          },
        ],
      };
  • Helper defining and creating the screenshots directory used for saving screenshot files.
    SCREENSHOT_DIR = os.path.join('.', 'screenshots')

    async def handle_command(command, args):
        """Handle different browser commands"""
        # Ensure screenshot directory exists
        os.makedirs(SCREENSHOT_DIR, exist_ok=True)
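
The task-building and file-saving steps of the core implementation can be tried without a browser or an LLM. The sketch below pulls them out as two standalone functions; the names `build_task` and `save_screenshot` are hypothetical and do not appear in the project, and the agent run and `context.take_screenshot` call are left out entirely.

```python
import base64
import os
import time
from typing import Optional


def build_task(url: str, steps: Optional[str] = None, full_page: bool = False) -> str:
    """Build the numbered task prompt the same way the screenshot handler does."""
    task = f"1. Go to {url}"
    if steps:
        # Each comma-separated fragment becomes its own numbered step.
        parts = steps.split(",")
        for i, step in enumerate(parts, 2):
            task += f"\n{i}. {step.strip()}"
        task += f"\n{len(parts) + 2}. Take a screenshot"
    else:
        task += "\n2. Take a screenshot"
    if full_page:
        task += " of the full page"
    return task


def save_screenshot(screenshot_base64: str, directory: str) -> str:
    """Decode a base64 PNG payload, write it to `directory`, return the absolute path."""
    os.makedirs(directory, exist_ok=True)
    filename = f"screenshot_{int(time.time())}.png"
    filepath = os.path.join(directory, filename)
    with open(filepath, "wb") as f:
        f.write(base64.b64decode(screenshot_base64))
    return os.path.abspath(filepath)


if __name__ == "__main__":
    print(build_task("https://example.com", "click #submit, scroll down", full_page=True))
    # 1. Go to https://example.com
    # 2. click #submit
    # 3. scroll down
    # 4. Take a screenshot of the full page
```

Because `steps` is split on commas, a free-form sentence that contains a comma is turned into several numbered steps. Filenames use a one-second timestamp, so two captures saved within the same second would share a name and the later one would overwrite the earlier.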

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ztobs/cline-browser-use-mcp'
