browser.get_html
Extract HTML source or plain text from web pages. Configure to capture full pages or visible viewport content for automation tasks.
Instructions
Get the HTML source of the current page. Set text_only=true to strip tags and return plain text. Set full_page=false (default) for visible viewport only.
Input Schema
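| Name | Required | Description | Default |
|---|---|---|---|
| session_id | Yes | Identifier of the browser session whose current page is read. | |
| full_page | No | Per the tool description, false limits the result to the visible viewport. | false |
| text_only | No | When true, strips tags and returns plain text instead of HTML. | false |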
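As a rough illustration of how these inputs fit together, the sketch below builds a call payload for the tool. It assumes the gateway is reached through a standard MCP `tools/call` JSON-RPC request, which the page does not state explicitly. The helper name `build_get_html_request` and the session id `sess-123` are hypothetical.

```python
import json
from typing import Any


def build_get_html_request(
    session_id: str,
    *,
    text_only: bool = False,
    full_page: bool = False,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 request that calls browser.get_html.

    Assumption: the server accepts MCP-style `tools/call` requests.
    The defaults mirror the schema table (both flags default to false).
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "browser.get_html",
            "arguments": {
                "session_id": session_id,
                "full_page": full_page,
                "text_only": text_only,
            },
        },
    }


if __name__ == "__main__":
    # "sess-123" is a placeholder; a real id would come from the session manager.
    request = build_get_html_request("sess-123", text_only=True)
    print(json.dumps(request, indent=2))
```

Only `session_id` is required; leaving the two flags out of `arguments` should behave the same as sending them as false, given the defaults in the schema.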
Implementation Reference
- controller/app/tool_gateway.py:1165-1174 (handler) The _get_html method in McpToolGateway is the implementation handler for the browser.get_html tool.

      async def _get_html(self, payload: GetPageHtmlInput) -> dict[str, Any]:
          session = await self.manager.get_session(payload.session_id)
          if payload.text_only:
              text = await session.page.evaluate(
                  "() => document.body ? document.body.innerText : ''"
              )
              return {"session_id": payload.session_id, "content": text, "type": "text"}
          html = await session.page.content()
          return {"session_id": payload.session_id, "content": html, "type": "html"}

- controller/app/tool_gateway.py:682-690 (registration) Registration of browser.get_html tool within McpToolGateway._tools.

      name="browser.get_html",
      description=(
          "Get the HTML source of the current page. "
          "Set text_only=true to strip tags and return plain text. "
          "Set full_page=false (default) for visible viewport only."
      ),
      input_model=GetPageHtmlInput,
      handler=self._get_html,
      ),

- Input schema definition for browser.get_html tool.

      class GetPageHtmlInput(SessionIdInput):
          full_page: bool = False
          text_only: bool = False
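To make the two return shapes concrete, here is a minimal, self-contained sketch that mirrors the branching of the handler against a stub page. It is not the project's code: `FakePage`, the dataclass stand-in for `GetPageHtmlInput`, the free function `get_html`, and `demo` are all hypothetical, and the stub returns canned strings instead of driving a browser. It also shows that, in the handler as listed, `full_page` is never read, so toggling it does not change the result.

```python
import asyncio
from dataclasses import dataclass
from typing import Any


@dataclass
class GetPageHtmlInput:
    """Stand-in for the real input model; fields and defaults follow the listing."""
    session_id: str
    full_page: bool = False
    text_only: bool = False


class FakePage:
    """Hypothetical stub exposing the two page methods the handler calls."""

    async def evaluate(self, script: str) -> str:
        # A real page would run the script in the browser; this returns canned text.
        return "Hello world"

    async def content(self) -> str:
        return "<html><body><p>Hello world</p></body></html>"


async def get_html(page: FakePage, payload: GetPageHtmlInput) -> dict[str, Any]:
    """Mirror the listed handler: branch on text_only, never consult full_page."""
    if payload.text_only:
        text = await page.evaluate(
            "() => document.body ? document.body.innerText : ''"
        )
        return {"session_id": payload.session_id, "content": text, "type": "text"}
    html = await page.content()
    return {"session_id": payload.session_id, "content": html, "type": "html"}


def demo() -> list[dict[str, Any]]:
    """Run the sketch for the default call, text_only=True, and full_page=True."""
    page = FakePage()

    async def run() -> list[dict[str, Any]]:
        return [
            await get_html(page, GetPageHtmlInput("sess-123")),
            await get_html(page, GetPageHtmlInput("sess-123", text_only=True)),
            await get_html(page, GetPageHtmlInput("sess-123", full_page=True)),
        ]

    return asyncio.run(run())


if __name__ == "__main__":
    for result in demo():
        print(result["type"], "->", result["content"])
```

Callers can switch on the `type` field of the result ("html" or "text") to decide how to parse `content`.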