novada_unblock
Retrieve raw HTML from websites that use anti-bot protection or rely heavily on JavaScript. The page is fully rendered with Web Unblocker or Chromium, which bypasses blocks and executes JavaScript.
Instructions
Use when you need the raw rendered HTML of a blocked or JS-heavy page. Forces JS rendering via Web Unblocker or Browser API. Returns raw HTML, not cleaned text.
Best for:
When you need raw HTML (not cleaned text) for custom DOM parsing.
When novada_extract with render="render" still fails.
Returns the full JS-rendered HTML source.
Tip: For most anti-bot pages, try novada_extract with render="render" first; it returns clean text. Use novada_unblock when you specifically need the raw HTML source.
Not for: Reading cleaned text (use novada_extract with render="render"), structured platform data (use novada_scrape).
Methods: "render" (Web Unblocker, faster/cheaper), "browser" (full Chromium CDP, handles complex SPAs).
Wait hint: Use wait_for to specify a CSS selector to wait for before capturing HTML.
Note: wait_ms, block_resources, and auto_runs are accepted but not yet implemented. They have no effect in the current version.
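The routing guidance above can be sketched as a small helper. This is only an illustration: `choose_tool` and the shape of its return value are hypothetical and not part of the Novada tools. The arguments for novada_scrape are left empty because they are not described here.

```python
def choose_tool(need_raw_html: bool, need_structured_data: bool = False) -> dict:
    """Pick a tool name and starting arguments (hypothetical helper).

    Mirrors the guidance in this section:
    - structured platform data -> novada_scrape
    - raw rendered HTML        -> novada_unblock (start with the cheaper "render" method)
    - cleaned, readable text   -> novada_extract with render="render"
    """
    if need_structured_data:
        # Arguments for novada_scrape are not documented in this section.
        return {"tool": "novada_scrape", "arguments": {}}
    if need_raw_html:
        return {"tool": "novada_unblock", "arguments": {"method": "render"}}
    return {"tool": "novada_extract", "arguments": {"render": "render"}}


if __name__ == "__main__":
    print(choose_tool(need_raw_html=True))
    print(choose_tool(need_raw_html=False))
```

Note that the two tools name the parameter differently: novada_extract takes `render`, while novada_unblock takes `method`.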
Common mistakes:
This tool returns RAW HTML, not parsed/cleaned text. Passing the output directly to an LLM expecting markdown will produce garbled, token-heavy responses.
For extracted content from bot-protected pages, use novada_extract (it calls the unblocker internally with render='render').
Do not use novada_unblock for simple static pages — it adds 9-16 seconds of latency vs 112ms for novada_extract.
When to use:
You need the original DOM structure for CSS selector parsing in a processing pipeline.
You are feeding the HTML into a downstream parser, not directly to an LLM.
You need a page's complete HTML as it is before novada_extract applies its content selection.
Not for:
Getting readable content from protected pages — use novada_extract with render='render'.
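As an illustration of the downstream-parser use case, the sketch below pulls the text of elements with a given class out of raw HTML using only the Python standard library. The sample HTML, the `ClassTextCollector` class, and the `extract_by_class` helper are all made up for this example; a real pipeline would more likely use a full CSS-selector library.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ClassTextCollector(HTMLParser):
    """Collect the text of every element whose class list contains target_class."""

    def __init__(self, target_class: str) -> None:
        super().__init__()
        self.target_class = target_class
        self.depth = 0                  # > 0 while inside a matching element
        self.results: list[str] = []
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.depth > 0:
            self.depth += 1             # nested element inside a match
        elif self.target_class in classes:
            self.depth = 1              # entering a new match
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self.depth == 0:
            return
        self.depth -= 1
        if self.depth == 0:             # leaving the matching element
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self.depth > 0:
            self._buffer.append(data)


def extract_by_class(raw_html: str, target_class: str) -> list[str]:
    """Return the text content of each element carrying target_class."""
    parser = ClassTextCollector(target_class)
    parser.feed(raw_html)
    parser.close()
    return parser.results


if __name__ == "__main__":
    # Stand-in for HTML returned by novada_unblock.
    sample = '<div><span class="price sale">$19.99</span><p class="price">$5 <b>now</b></p></div>'
    print(extract_by_class(sample, "price"))
```

Only the extracted values, not the raw HTML, would then be passed on to an LLM or other consumer.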
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL of the page to retrieve. | |
| method | Yes | Rendering method. 'render': JS rendering via Web Unblocker (requires NOVADA_WEB_UNBLOCKER_KEY). 'browser': full Chromium CDP (requires NOVADA_BROWSER_WS). Unlike novada_extract which uses 'render=', this tool uses 'method='. | render |
| country | No | ISO 2-letter country code for geo-targeted rendering. | |
| wait_for | No | CSS selector to wait for before capturing HTML. E.g. '.price', '#product-title'. | |
| wait_ms | No | [NOT_IMPLEMENTED — reserved for future use] Max time in ms to wait for page to fully load before capture. Use when wait_for selector is unavailable. Max 100000ms. | |
| block_resources | No | [NOT_IMPLEMENTED — reserved for future use] Block images, CSS, and video loading for faster captures. Reduces bandwidth and latency on image-heavy pages. | |
| auto_runs | No | [NOT_IMPLEMENTED — reserved for future use] Number of retry attempts if the page load fails or returns incomplete content. Default 2, max 10. | |
| timeout | Yes | Timeout in ms. Max 120000. | 30000 |
| max_chars | No | Maximum characters of raw HTML to return (max: 500000). When content exceeds this limit, it is truncated and a notice is appended. Raw HTML is typically much larger than extracted text, so increase this if you need the full DOM. | 100000 |
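A minimal sketch of assembling and checking the arguments from the table before sending a call. The helper name `build_unblock_args` is hypothetical, and the checks only restate the limits listed above; the server may validate differently. The not-yet-implemented parameters (wait_ms, block_resources, auto_runs) are deliberately left out because they have no effect.

```python
ALLOWED_METHODS = {"render", "browser"}
MAX_TIMEOUT_MS = 120_000
MAX_CHARS_LIMIT = 500_000


def build_unblock_args(url, method="render", country=None, wait_for=None,
                       timeout=30_000, max_chars=100_000):
    """Build an arguments dict for novada_unblock (hypothetical helper)."""
    if not url:
        raise ValueError("url is required")
    if method not in ALLOWED_METHODS:
        raise ValueError(f"method must be one of {sorted(ALLOWED_METHODS)}")
    if not 0 < timeout <= MAX_TIMEOUT_MS:
        raise ValueError(f"timeout must be between 1 and {MAX_TIMEOUT_MS} ms")
    if not 0 < max_chars <= MAX_CHARS_LIMIT:
        raise ValueError(f"max_chars must be between 1 and {MAX_CHARS_LIMIT}")

    args = {"url": url, "method": method, "timeout": timeout, "max_chars": max_chars}

    if country is not None:
        # The table asks for an ISO 2-letter code; only the shape is checked here.
        if len(country) != 2 or not country.isalpha():
            raise ValueError("country must be a 2-letter code")
        args["country"] = country
    if wait_for:
        args["wait_for"] = wait_for     # CSS selector, e.g. ".price"
    return args


if __name__ == "__main__":
    print(build_unblock_args("https://example.com/product", wait_for=".price"))
```

Validating on the client side avoids spending a slow rendering call on a request that would be rejected anyway.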