get_html

Retrieve HTML content from webpages for browser automation tasks, enabling data extraction and page analysis through URL navigation and post-load actions.

Instructions

Get the HTML content of a webpage

Input Schema

TableJSON Schema

Name	Required	Description	Default
`url`	Yes	The URL to navigate to
`steps`	No	Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")

Implementation Reference

src/browser_handler.py:194-220 (handler)

Core handler logic for 'get_html' command: validates URL, uses an AI agent to navigate and perform optional steps, then retrieves and returns the page HTML.

elif command == 'get_html':
    if not args.get('url'):
        return {
            'success': False,
            'error': 'URL is required for get_html command'
        }
        
    task = f"1. Go to {args['url']}"
    if args.get('steps'):
        steps = args['steps'].split(',')
        for i, step in enumerate(steps, 2):
            task += f"\n{i}. {step.strip()}"
        task += f"\n{len(steps) + 2}. Get the page HTML"
    else:
        task += "\n2. Get the page HTML"
    use_vision = os.getenv('USE_VISION', 'false').lower() == 'true'
    agent = Agent(task=task, llm=llm, use_vision=use_vision, browser_context=context)
    await agent.run()
    
    try:
        html = await context.get_page_html()
        return {
            'success': True,
            'html': html
        }
    finally:
        await context.close()

src/index.ts:175-190 (schema)

Input schema definition for the 'get_html' tool, specifying required 'url' and optional 'steps' parameters.

name: 'get_html',
description: 'Get the HTML content of a webpage',
inputSchema: {
  type: 'object',
  properties: {
    url: {
      type: 'string',
      description: 'The URL to navigate to',
    },
    steps: {
      type: 'string',
      description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
    },
  },
  required: ['url'],
},

src/index.ts:282-290 (handler)

MCP server handler that processes the 'get_html' tool call response from Python backend and formats it as MCP content.

} else if (request.params.name === 'get_html') {
  return {
    content: [
      {
        type: 'text',
        text: result.html,
      },
    ],
  };

src/index.ts:141-233 (registration)

Registers the 'get_html' tool with the MCP server using addTool, including its schema.

        const stdout = Buffer.concat(stdoutChunks).toString('utf8');
        reject(new Error(`Failed to parse Python script output: ${error}\nOutput was: ${stdout}`));
      }
    });
  });
}

private setupToolHandlers() {
  this.server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [
      {
        name: 'screenshot',
        description: 'Take a screenshot of a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            full_page: {
              type: 'boolean',
              description: 'Whether to capture the full page or just the viewport',
              default: false,
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url'],
        },
      },
      {
        name: 'get_html',
        description: 'Get the HTML content of a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url'],
        },
      },
      {
        name: 'execute_js',
        description: 'Execute JavaScript code on a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            script: {
              type: 'string',
              description: 'The JavaScript code to execute',
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url', 'script'],
        },
      },
      {
        name: 'get_console_logs',
        description: 'Get the console logs of a webpage',
        inputSchema: {
          type: 'object',
          properties: {
            url: {
              type: 'string',
              description: 'The URL to navigate to',
            },
            steps: {
              type: 'string',
              description: 'Comma-separated actions or sentences describing steps to take after page load (e.g., "click #submit, scroll down" or "Fill the login form and submit")',
            },
          },
          required: ['url'],
        },
      },
    ],
  }));

Browser Use Server

get_html

Instructions

Input Schema

Implementation Reference

Other Tools

Latest Blog Posts

MCP directory API