scrape_as_html
Extract HTML content from any webpage URL while bypassing bot detection and CAPTCHA protection for reliable web scraping.
Instructions
Scrape a single webpage URL with advanced options for content extraction and get back the results in HTML. This tool can unlock any webpage even if it uses bot detection or CAPTCHA.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL of the webpage to scrape; must be a valid URL string. | |
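
The input is a single required `url` field. As a minimal sketch of that check, the snippet below approximates it with the built-in `URL` parser rather than the Zod schema the server uses, so it may accept or reject slightly different strings. `validateInput` is a hypothetical helper and is not part of server.js.

```javascript
// Hypothetical helper (not from server.js): approximates the tool's input
// check using the built-in URL parser instead of Zod.
function validateInput(input) {
  if (input === null || typeof input !== 'object') {
    return { ok: false, error: 'input must be an object' };
  }
  if (typeof input.url !== 'string') {
    return { ok: false, error: 'url is required and must be a string' };
  }
  try {
    new URL(input.url); // throws a TypeError when the string cannot be parsed
  } catch {
    return { ok: false, error: 'url must be a valid URL' };
  }
  // Return only the field the tool declares.
  return { ok: true, value: { url: input.url } };
}
```

Calling `validateInput({ url: 'https://example.com' })` yields an `ok: true` result, while a missing or unparseable `url` yields `ok: false` with a short error message.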
Implementation Reference
- server.js:184-205 (registration): Full registration of the 'scrape_as_html' tool using addTool, including name, description, input schema, and execute handler.

  ```javascript
  addTool({
      name: 'scrape_as_html',
      description: 'Scrape a single webpage URL with advanced options for '
          +'content extraction and get back the results in HTML. '
          +'This tool can unlock any webpage even if it uses bot detection or '
          +'CAPTCHA.',
      parameters: z.object({url: z.string().url()}),
      execute: tool_fn('scrape_as_html', async({url})=>{
          let response = await axios({
              url: 'https://api.brightdata.com/request',
              method: 'POST',
              data: {
                  url,
                  zone: unlocker_zone,
                  format: 'raw',
              },
              headers: api_headers(),
              responseType: 'text',
          });
          return response.data;
      }),
  });
  ```

- server.js:191-204 (handler): The core handler logic wrapped in tool_fn. It performs an HTTP POST to the Bright Data API (/request) with the target URL and unlocker zone, requests the raw HTML format, and returns the response data.

  ```javascript
  execute: tool_fn('scrape_as_html', async({url})=>{
      let response = await axios({
          url: 'https://api.brightdata.com/request',
          method: 'POST',
          data: {
              url,
              zone: unlocker_zone,
              format: 'raw',
          },
          headers: api_headers(),
          responseType: 'text',
      });
      return response.data;
  }),
  ```

- server.js:190-190 (schema): Zod schema for the tool input. It requires a single 'url' parameter that must be a valid URL string.

  ```javascript
  parameters: z.object({url: z.string().url()}),
  ```
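
The handler described above can be sketched outside the server as a plain request builder plus a thin wrapper. This is an illustration under stated assumptions, not the project's code: the excerpts do not show `api_headers()` or `unlocker_zone`, so the bearer-token `Authorization` header and the `apiToken` and `zone` options are assumptions, as are the hypothetical names `buildScrapeRequest` and `scrapeAsHtml`. It uses the global `fetch` (Node 18+) in place of axios.

```javascript
// Hypothetical builder: mirrors the POST that the handler sends to /request.
// The Authorization header format is an assumption; server.js gets its
// headers from api_headers(), which is not shown in the excerpts.
function buildScrapeRequest(targetUrl, { apiToken, zone }) {
  return {
    url: 'https://api.brightdata.com/request',
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${apiToken}`, // assumption, see note above
    },
    // Same three fields as the handler's `data` object.
    body: JSON.stringify({ url: targetUrl, zone, format: 'raw' }),
  };
}

// Hypothetical wrapper: sends the request and returns the body as text,
// matching the handler's responseType of 'text'.
async function scrapeAsHtml(targetUrl, options) {
  const { url, ...init } = buildScrapeRequest(targetUrl, options);
  const response = await fetch(url, init);
  if (!response.ok) {
    // axios rejects on non-2xx responses by default; fetch needs an explicit check.
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.text();
}
```

Separating the builder from the network call keeps the payload easy to inspect and test without contacting the API. A call such as `await scrapeAsHtml('https://example.com', { apiToken, zone })` would then return the page's HTML as a string, assuming valid credentials and an existing zone.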