get_page_html
Extract clean HTML from web pages for automation testing by removing scripts and styles while preserving interactive element annotations.
Instructions
Get the HTML of the web page in the active playback window, with interactive elements annotated. Scripts and styles are removed.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| tab_id | No | Optional tab/window ID. Use "all" for all tabs. Omit for active tab. | |
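As a minimal sketch of the three argument shapes the table describes, the snippet below builds the `arguments` object for each case. The helper `buildGetPageHtmlArgs` and the tab ID `"3"` are hypothetical and used only for illustration; the only facts taken from the schema are that `tab_id` is an optional string and that `"all"` selects every tab.

```typescript
// Hypothetical helper, not part of the Selenix codebase: builds the
// arguments object for a get_page_html call.
type GetPageHtmlArgs = { tab_id?: string };

function buildGetPageHtmlArgs(tabId?: string): GetPageHtmlArgs {
  // Omitting tab_id targets the active tab, per the schema description.
  if (tabId === undefined) {
    return {};
  }
  return { tab_id: tabId };
}

// Active tab: no tab_id key at all.
const activeTabArgs = buildGetPageHtmlArgs();

// All tabs: the literal string "all".
const allTabsArgs = buildGetPageHtmlArgs("all");

// A specific tab: "3" is a made-up ID for illustration.
const singleTabArgs = buildGetPageHtmlArgs("3");
```

Leaving the key out, rather than sending `null` or an empty string, matches the schema's "Omit for active tab" wording.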
Implementation Reference
- src/bridge-client.ts:24-72 (handler): The `BridgeClient.call` method is the handler for all MCP tools defined in this codebase. It builds an HTTP request to the Selenix API, using the tool name as the endpoint path (`/api/${endpoint}`). `get_page_html` is therefore handled by a POST request to `/api/get_page_html`.
    export class BridgeClient {
      async call(endpoint: string, body: Record<string, unknown> = {}): Promise<unknown> {
        // Re-read config on every call so we pick up new tokens after Selenix restarts
        const config = readConfig()

        return new Promise((resolve, reject) => {
          const data = JSON.stringify(body)

          const req = http.request(
            {
              hostname: '127.0.0.1',
              port: config.port,
              path: `/api/${endpoint}`,
              method: 'POST',
              headers: {
                'Content-Type': 'application/json',
                Authorization: `Bearer ${config.token}`,
                'Content-Length': Buffer.byteLength(data),
              },
              timeout: 180000, // 3 minutes for long-running operations like run_test
            },
            (res) => {
              let responseData = ''
              res.on('data', (chunk: string) => (responseData += chunk))
              res.on('end', () => {
                try {
                  resolve(JSON.parse(responseData))
                } catch {
                  resolve({ raw: responseData })
                }
              })
            }
          )

          req.on('error', (err) =>
            reject(
              new Error(
                `Cannot connect to Selenix bridge at 127.0.0.1:${config.port}. ` +
                  `Is Selenix running with MCP Server enabled? (${err.message})`
              )
            )
          )

          req.on('timeout', () => {
            req.destroy()
            reject(new Error('Request timed out after 180 seconds'))
          })

          req.write(data)
          req.end()
        })
      }
    }

- src/tools.ts:14-28 (registration): Definition and registration of the `get_page_html` tool, including its schema and description.
    {
      name: 'get_page_html',
      description:
        'Get the HTML of the web page in the active playback window, with interactive elements annotated. Scripts and styles are removed.',
      inputSchema: {
        type: 'object' as const,
        properties: {
          tab_id: {
            type: 'string',
            description: 'Optional tab/window ID. Use "all" for all tabs. Omit for active tab.',
          },
        },
      },
    },
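To make the mapping from tool name to HTTP request concrete, here is a rough sketch that computes the request a `get_page_html` call would produce, without opening a connection. The function `describeBridgeRequest`, the `BridgeConfig` and `BridgeRequest` types, and the port and token values are all assumptions made for illustration; the real handler reads its config via `readConfig()`, also sets `Content-Length`, and sends the request with `http.request`.

```typescript
// Illustrative types; the real config shape is only known to have
// `port` and `token` fields, as used by BridgeClient.call.
interface BridgeConfig {
  port: number;
  token: string;
}

interface BridgeRequest {
  hostname: string;
  port: number;
  path: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: mirrors how the handler derives the path and
// headers from the tool name, but returns a plain object instead of
// sending anything over the network.
function describeBridgeRequest(
  endpoint: string,
  args: Record<string, unknown>,
  config: BridgeConfig
): BridgeRequest {
  return {
    hostname: "127.0.0.1",
    port: config.port,
    path: `/api/${endpoint}`, // tool name becomes the endpoint path
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${config.token}`,
    },
    body: JSON.stringify(args),
  };
}

// Port 4000 and "example-token" are placeholder values.
const exampleRequest = describeBridgeRequest(
  "get_page_html",
  { tab_id: "all" },
  { port: 4000, token: "example-token" }
);
```

Because every tool goes through the same path template, the only tool-specific parts of the request are the endpoint name and the JSON body.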