unbrowser
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| UNBROWSER_SHIMS | No | Opt-in enhanced browser-environment shims. Set to 'enhanced' to enable. | |
| UNBROWSER_TIMEOUT_MS | No | Outer watchdog timeout in milliseconds, clamped to 1s..10min. | 30000 |
| UNBROWSER_COOKIE_SERVICE_URL | No | URL of the local cookie solver service. | Not set |
| UNBROWSER_SCRIPT_EVAL_BUDGET_MS | No | Budget for script evaluation in milliseconds. | 5000 |
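
The default-and-clamp behaviour described for UNBROWSER_TIMEOUT_MS can be sketched as follows. This is a minimal illustration, not the server's actual code: the function name `resolve_timeout_ms` is hypothetical, and falling back to the default on an empty or unparseable value is an assumption, since the table does not say how such values are handled.

```python
import os

DEFAULT_TIMEOUT_MS = 30_000   # documented default
MIN_TIMEOUT_MS = 1_000        # documented lower clamp (1 s)
MAX_TIMEOUT_MS = 600_000      # documented upper clamp (10 min)


def resolve_timeout_ms(env=None):
    """Return the watchdog timeout in ms, applying the default and the clamp.

    `env` is any mapping of environment variables; it defaults to os.environ.
    """
    env = os.environ if env is None else env
    raw = env.get("UNBROWSER_TIMEOUT_MS")
    if raw is None or not raw.strip():
        return DEFAULT_TIMEOUT_MS
    try:
        value = int(raw)
    except ValueError:
        # Assumption: unparseable input falls back to the default.
        return DEFAULT_TIMEOUT_MS
    # Clamp into the documented 1 s .. 10 min range.
    return max(MIN_TIMEOUT_MS, min(MAX_TIMEOUT_MS, value))


if __name__ == "__main__":
    print(resolve_timeout_ms({"UNBROWSER_TIMEOUT_MS": "500"}))
```

Passing the environment as a mapping keeps the helper easy to check without mutating the real process environment.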
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| navigate | Fetch a URL with Chrome-fingerprinted HTTP using the active profile. Parses HTML, seeds the JS DOM, and returns the BlockMap inline. With … Auto-extract: when the page embeds JSON-bearing tags (density.json_scripts > 0, which covers application/json, application/ld+json, text/x-magento-init, text/x-shopify-app, etc.), navigate auto-runs … Tool advice: navigate also returns … Auto-solve: Reddit's JS proof-of-work challenge (provider: reddit_js_challenge) is solved transparently: the challenge is detected, the GET solution URL is computed (solution = hex_value + hex_value), and the real page is returned in one navigate call. challenge:null on the result means the real page was served. Subsequent navigations in the same session carry the clearance cookie and skip the challenge entirely. |
| query | Run a CSS selector against the current page's parsed DOM. Returns matching elements as [{ref, tag, attrs, text, text_chars, text_truncated}]. Element refs (e:NN) are stable handles for use with click/type/submit. Selector engine supports tag, id, class, attribute matchers (=, ^=, $=, *=, ~=), all four combinators (descendant, >, +, ~), pseudo-classes (:first/last/nth-child including An+B formulas, :first/last/nth-of-type, :only-child/of-type), :not(), and :has(). |
| query_debug | Diagnose why a CSS selector did or did not match. Returns matched_count, sample matches, DOM summary counts, selector hints (top tags/classes/data attrs/ids), and actionable hints for selector_miss, thin_shell, or embedded_json. Use this when query() returns [] and you need to distinguish a bad selector from an empty/browser-rendered DOM. |
| text | Get the textContent of the FIRST element matching the selector (default: body). Note: on Wikipedia/MDN/news sites, the first match is often a hatnote or image caption, not the lead paragraph; prefer … |
| text_main | Get the textContent of the page's main content area, excluding chrome (header/nav/footer/aside). Tries …, then [role=main], then a single …, then falls back to the longest non-chrome subtree. Use this for reading article body, docs page, or blog post content. |
| text_clean | Return chrome-stripped, JSON-stripped, whitespace-collapsed text from a selector or the best content root. Drops script/style/noscript/svg and page chrome (nav/header/footer/aside) plus obvious hidden widgets and repeated boilerplate. |
| find_text | Find localized text matches and return [{ref, tag, attrs, before, match, after, text}]. Ranks article/main/content matches above nav/header/footer boilerplate. |
| text_around | Return cleaned surrounding text around an element ref or the best ranked text match. Returns {ref, before, match, after, text}. |
| query_text | Find elements by visible text content. Returns the smallest/deepest element whose textContent matches the needle, with chrome (header/nav/footer/aside) skipped. Anchor-promotion: a span/strong/etc. inside an anchor resolves to that anchor (so click() targets the actionable element). Right tool when CSS selectors are unstable (React-rendered pages with hashed class names) but the visible label is reliable, e.g. find a 'Sign in' button without knowing its class. |
| blockmap | Recompute the BlockMap for the current page. Use after eval'd JS or click/type modifies the DOM. Same shape as the inline blockmap from navigate. |
| page_model | Render the current page into semantic, task-discoverable JSON objects. Reconstructs page structure as search_form, nav_link, article_card, course_card, model_card, product_card, table, answer_block, and limitation objects with actions, normalized fields, goal-based scoring, and provenance. Prefer this as the first planning tool after navigate when raw links/text are too wide. |
| route_discover | Find page-owned navigation/search routes for a goal. Returns ranked visible links, forms with controls/query_url previews, and inferred URLs derived from page-owned routes plus goal terms. Use before guessing URLs manually. |
| discover | High-level cheap-first information discovery. Optionally navigates to a URL, runs light JS, merges DOM routes, inferred form/query URLs, and network JSON routes into one ranked graph with provenance plus route-level escalation hints. Use this when the task is to find where information lives before extracting it. |
| network_extract | Parse captured JSON/API/network responses into semantic objects with fields, scores, matched query terms, and capture/path provenance. Use after navigate or activate when network_stores shows JSON/GraphQL/NDJSON captures and raw body_preview is too noisy. |
| extract | Auto-strategy structured-data extraction. Tries JSON-LD (schema.org) → NEXT_DATA → Nuxt → JSON-in-script (Magento, Shopify, BigCommerce custom-typed scripts) → OpenGraph/meta → microdata → text_main fallback, returns the highest-confidence hit as {strategy, confidence, data, tried}. Use this as the one-shot 'give me the data, you figure out how' call when you don't want to plan the strategy yourself. Pass strategy='json_ld' (or any of the names above) to force a specific extractor. |
| extract_table | Pull a table into {headers, rows, row_count}. Headers come from ... if present, else the first row's cells. Each subsequent row's cells become a row dict keyed by header (or 'col_N' if no header for that column). Right tool for pricing tables, specs, and finance/listings tables; saves writing the per-cell mapping eval. |
| table_to_json | Alias for extract_table with a first-table default. Pulls a table into {headers, rows, row_count}; selector defaults to 'table'. Use this when an agent expects a table-to-JSON convenience tool. |
| extract_list | Pull a repeated card pattern into [{...}, {...}]. Right tool for HN-style lists, search results, product grids; collapses per-site eval boilerplate. Field spec shapes: 'css selector' (text content), 'css selector @attr' (attribute), or ['css selector', '@attr'] (tuple form). If a sub-selector returns null, the field value is null. |
| extract_cards | Auto-detect repeated article/card/product/course/listing blocks and return normalized items [{title, price, condition, url, availability, snippet, meta, image_alt, score}]. Prefer this over extract_list when the page has semantically ambiguous recipe, course, product, or model cards and you do not already know field selectors. Optional selector scopes detection to known card nodes; kind can bias scoring (recipe, course, product, listing). |
| settle | Drain the JS event loop: alternately runs queued microtasks (Promise resolutions) and fires expired setTimeout/setInterval callbacks, sleeping to the next deadline when only timers remain. Returns when the queue is empty OR max_ms elapses OR max_iters iterations complete. Defaults: max_ms=2000, max_iters=50. Use after seeding the DOM (or after eval'd code that schedules timers) to let pending callbacks run. |
| click | Dispatch a click event on the element at … |
| activate | Higher-level action probe. Clicks an element by ref or visible action text, settles, and returns before/after URL, BlockMap/page_model summaries, network counts, hashes, and classification: navigated, dom_changed, network_changed, no_effect, or unsupported. |
| type | Set the value of an input/textarea (referenced by … |
| submit | Submit a form by gathering input/textarea/select values and navigating to the resolved action URL. Supports GET and application/x-www-form-urlencoded POST. Checked checkbox/radio values are serialized; multipart upload forms are not supported. |
| body | Return the raw HTML body of the last navigation. Use as a fallback when the BlockMap or selectors aren't enough, but note that the response can be large (often 100KB+). |
| eval | Run arbitrary JavaScript in the embedded QuickJS runtime against the current page's parsed DOM. Returns the JSON-stringified result. Power tool: prefer query/text/blockmap when the CSS selector engine can express what you need. Canonical param is code; raw JSON-RPC also accepts script or expression aliases and errors if no code-like param is present. |
| cookies_set | Add cookies to the session jar. Each item is an object {name, value, domain, path?, secure?, http_only?, url?} or a raw Set-Cookie string. Used to replay clearance cookies (e.g. PerimeterX _px3) lifted from a real Chrome session, bypassing bot detection without running the challenge JS. |
| cookies_get | Return all cookies currently in the jar as [{name, value, domain, path, secure, http_only}]. Use this to export cookies to disk for a later session. |
| cookies_clear | Drop all cookies from the jar. |
| report_outcome | Bind a task outcome (success/failure/quality) to a previous navigation_id from a navigate() call. Used by the policy framework's outcome protocol (see docs/probabilistic-policy.md §4.5). v0 emits an outcome_reported NDJSON event for the navigation; no posterior updates yet. Drivers should call this once per agent task so future Bayesian phases (B/D-2) can attribute extraction success/failure to specific policy decisions. |
| network_stores | Return content-bearing fetch/XHR responses captured during navigate, ranked by likely content value. SPAs often keep their data in API responses (JSON, GraphQL, NDJSON, Next/Nuxt route data) that are cleaner than the rendered DOM; this tool surfaces them directly. Each entry has capture_id, URL, status, content-type, body_preview (truncated to 256 KB), body_bytes (full size), body_truncated flag, navigation_id, and a heuristic score. Bodies for trackers/ads/CSS/HTML/media are NOT captured. The navigate result already contains a top-5 summary scoped to that navigation; use this tool to get more entries, filter by host, or pull captures from a different navigation. |
| network_stores_clear | Drop all captured network responses from the session's network store. Use this between unrelated navigations if you don't want earlier captures showing up in later network_stores calls. |
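
As a rough illustration of how a client might invoke these tools, the sketch below builds MCP `tools/call` requests as JSON-RPC 2.0 messages. The `settle` arguments (`max_ms`, `max_iters`) and the `eval` parameter `code` come from the descriptions above. The `url` argument name for navigate, the example URL, the `document.title` expression, and the helper `tool_call` are assumptions made for illustration only. A real session would also need the MCP initialize handshake first, which is omitted here.

```python
import json
from itertools import count

_ids = count(1)  # simple monotonically increasing request ids


def tool_call(name, arguments=None):
    """Build a JSON-RPC 2.0 'tools/call' request for an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


requests = [
    # Assumption: navigate takes its target under a 'url' argument.
    tool_call("navigate", {"url": "https://example.com/"}),
    # Documented defaults for settle, passed explicitly.
    tool_call("settle", {"max_ms": 2000, "max_iters": 50}),
    # 'code' is the canonical eval parameter; the expression is hypothetical.
    tool_call("eval", {"code": "document.title"}),
]

if __name__ == "__main__":
    # Emit one JSON message per line.
    for req in requests:
        print(json.dumps(req))
```

The sequence mirrors the guidance in the table: navigate first, settle to let pending callbacks run, then fall back to eval only when the selector-based tools cannot express the query.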
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/protostatis/unbrowser'