| browser_launch | Launch stealth Chrome via nodriver. Creates persistent profile by default. Args:
url: initial URL to load
headless: run without UI (many sites detect headless — prefer False)
proxy: "http://user:pass@host:port" or "socks5://host:port"
user_agent: override UA string
window_width, window_height: viewport size
persistent: reuse profile at ~/.mcp-stealth/profile
lang: browser language
extra_args: additional Chromium flags
storage_state_path: load cookies/localStorage from JSON before first nav
testing_mode: 2-5× faster startup+nav for perf/regression testing —
disables image loading, background throttling dampers, translate,
notifications, media autoplay. WARNING: reduces stealth — not for
anti-bot work (sites can detect missing images as automation signal).
auto_verify: if True (default), automatically detect Cloudflare /
Turnstile challenges after the initial load and dispatch a
CDP-level click on the checkbox. Capped at 2 attempts (~6s total),
never loops. Set False to opt out.
user_data_dir: launch Chrome against an EXISTING user profile root
(e.g. "~/Library/Application Support/Google/Chrome"). Overrides
persistent + the default MCP profile. The target Chrome instance
(if any) MUST be closed first — locked profiles are detected
upfront and refused with the lock-holder PID. Supports ~ expansion.
profile_directory: when paired with user_data_dir, picks a sub-profile
inside it (e.g. "Default", "Profile 21"). Without this, Chrome
uses "Default". Helpful to drive a specific persona without
cloning the profile. Use list_chrome_profiles to enumerate.
|
| browser_close | Close the browser and free the profile lock. |
| browser_recover | Force-recover from a stuck browser state. Escape hatch when browser_close() can't run (graceful shutdown depends
on internal state that may be corrupt). Steps:
1. Best-effort browser.stop() — ignore any errors
2. Kill orphan Chrome PIDs whose argv references the active profile
(SIGTERM, then SIGKILL after 2s if still alive) — only PIDs spawned
against THIS MCP profile, never the user's daily Chrome
3. Reset BrowserState (clears tabs, instances, network index, locks)
4. Clear devtools / dialog caches
5. Wipe stale Singleton* lock files in all known profile dirs
Always succeeds — never raises. Use this when normal close hangs or
returns an error you can't diagnose. After this, browser_launch()
again to start fresh.
|
| navigate | Navigate the active tab to url. Args:
wait_until: load\|none
auto_verify: if True (default), automatically detect Cloudflare /
Turnstile challenges after load and click the checkbox naturally
(CDP-level click, max 2 attempts). Set False to opt out.
focus: if True (default), bring the tab's window to the OS foreground
after navigating. Without this, programmatic navigation in Chrome
does NOT raise the window — so on systems with the user's own
Chrome already open, MCP-driven navigation can be invisible to the
eye even though it succeeded internally. Set False to navigate
silently (background scraping flows).
|
| tab_focus | ⭐ Bring the active tab's browser window to the OS foreground. Programmatic CDP navigation does not raise Chrome to the front, so
on a desktop where the user has their own Chrome already open you
may see only the original window even though MCP successfully
drove a different window/tab. Call this when you want to *see*
what MCP is doing.
Common reasons MCP's window is hidden:
- Per-PID profile fallback: another Chrome already held
~/.mcp-stealth/profile/, so MCP launched into
~/.mcp-stealth/profile-pid<N>/ — that's a SEPARATE Chrome window.
Run server_status to confirm (profile_dir field).
- OAuth popup opened a new tab/window MCP now drives.
- macOS Spaces / minimized window / behind other apps.
|
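| go_back | Navigate back one entry in the active tab's history. |
| go_forward | Navigate forward one entry in the active tab's history. |
| reload | Reload the active tab. |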
| browser_snapshot | Inject SNAPSHOT_JS and return a ref-indexed list of interactive elements. Refs (e0, e1, ...) are attached via data-mcp-ref and valid until next nav.
Modes (performance vs completeness tradeoff):
full — default; same shape as mcp-camoufox (computed-style visibility + full attrs)
fast — skip getComputedStyle + minimal attrs (2-3× faster, less info per element)
viewport — full fidelity but only elements inside current scroll viewport
(5-10× faster on long feeds/SERPs, pair with scroll for segment-by-segment)
diff_from_last=True caches a DOM hash per tab; if the hash matches the previous
call on the same URL, returns "unchanged" without re-serializing the element list
(near-instant for re-check loops).
|
| screenshot | Screenshot active tab. Saves to ~/.mcp-stealth/screenshots/. Args:
filename: output name (default timestamped)
full_page: stitch entire page height (slower, larger file)
return_base64: append base64 body to response (useful for vision models)
format: "auto" (from extension, default), "png" (lossless), or "jpeg" (smaller)
quality: JPEG quality 1-100 (default 80) — ignored for PNG
region: clip to {x, y, width, height} — uses CDP Page.captureScreenshot clip
(skips full-viewport paint, 2-5× faster for small crops)
max_dimension: if either width or height exceeds this (px), the image is
resized proportionally via OpenCV INTER_AREA. Default 1920 keeps output
under the 2000 px per-side limit that LLM image tools (Claude/GPT) enforce
— prevents "image exceeds dimension limit" failures on long full_page
captures or hi-DPR device emulation. Pass 0 to disable resizing.
|
| get_text | Return innerText of element (by selector or ref) or whole document. |
| get_html | Return innerHTML (or outerHTML) of element or whole document. |
| get_url | Return current URL of active tab. |
| save_pdf | Save current page as PDF via CDP Page.printToPDF. |
| click | Click an element by ref (from snapshot) or CSS selector. JS fallback on failure. |
| click_text | Find and click element whose text matches. |
| click_role | Click by ARIA role (e.g. button, link, textbox), optional accessible name. |
| hover | Hover the mouse over an element. |
| fill | Fill input/textarea via set_value (fast, works for standard inputs). |
| type_text | Type into focused element (keystroke-by-keystroke). Use humanize for Gaussian delays. |
| press_key | Press a single key (Enter, Escape, Tab, ArrowDown, a, etc). |
| select_option | Select by value or label. |
| check | Tick a checkbox/radio (idempotent). |
| uncheck | Untick a checkbox. |
| upload_file | |
| mouse_click_xy | Click at raw viewport coordinates. |
| mouse_move | Move cursor to raw coordinates. humanize=True uses Bezier path. |
| drag_and_drop | Drag from (start_x, start_y) to (end_x, end_y). |
| wait_for | Wait until selector exists or text appears on page. |
| wait_for_navigation | Wait until the page finishes loading. |
| wait_for_url | Wait until URL matches a regex pattern. |
| wait_for_response | Wait for a network response whose URL matches regex. |
| tab_list | List all open tabs with index, URL, title. |
| tab_new | Open a new tab and make it active. |
| tab_select | Switch to tab at given index (from tab_list). |
| tab_close | Close tab at index (defaults to active). |
| cookie_list | List all cookies (optionally filtered by URL). |
| cookie_set | Set a cookie on the browser. |
| cookie_delete | Delete cookies matching name (optionally scoped to domain). |
| cookie_import | Bulk-import cookies. Three input modes, pick whichever is easiest:
1. cookies=[{...}] inline array of dicts (DevTools / EditThisCookie shape)
2. file_path="..." JSON file (array OR storage_state {"cookies":[...]})
3. raw_text="..." paste any of these and we auto-parse:
• JSON array / object (same shapes as 1+2)
• Header string: "name=val; name2=val2" or
"Cookie: name=val; name2=val2"
• Netscape cookies.txt (tab-separated)
• curl --cookie / -b argument string
NOTE: header / netscape formats lack domain — pass default_domain=".example.com"
(or call this AFTER navigate so the active tab's URL provides it).
Per-cookie fields (when JSON):
{"name":"...", "value":"...", "domain":".example.com", "path":"/",
"expires":1234567890, "secure":true, "httpOnly":false, "sameSite":"Lax"}
Args:
cookies: inline array
file_path: JSON file
raw_text: any cookie text — format auto-detected
default_domain: fallback domain for header / netscape cookies (or auto-uses
current tab URL's host)
clear_first: wipe all existing cookies before import (default False)
For full cookies + localStorage + sessionStorage restore, use storage_state_load.
|
| cookie_export | Export cookies to a JSON file. Cookies-only (use storage_state_save for full session). Output is a plain JSON array compatible with cookie_import / EditThisCookie /
Playwright cookies format. Saved to ~/.mcp-stealth/storage-states/.
|
| localstorage_get | Get localStorage — all keys or one specific key. |
| localstorage_set | Set a localStorage entry. |
| localstorage_clear | Clear all localStorage for current origin. |
| sessionstorage_get | Get sessionStorage — all keys or one. |
| sessionstorage_set | Set a sessionStorage entry. |
| sessionstorage_clear | Clear all sessionStorage for current origin (parity with localstorage_clear). |
| cache_clear | Clear the browser HTTP cache (CDP Network.clearBrowserCache). Mirrors DevTools → Application → Clear storage → Clear site data (cache).
Does NOT touch cookies, localStorage, or IndexedDB — use dedicated tools
or browser_launch(persistent=False) for a full wipe.
|
| indexeddb_list | List IndexedDB databases for the current origin. Reads via CDP IndexedDB.requestDatabaseNames. Use indexeddb_delete(name)
to drop one. Useful for clearing SPA state (many PWAs store auth / drafts
in IndexedDB rather than localStorage).
|
| indexeddb_delete | Delete an IndexedDB database by name (scoped to current origin). |
| evaluate | Execute arbitrary JS expression in page context. Returns stringified result. |
| inject_init_script | Register a script that runs before page scripts on every navigation of
the CURRENT tab. Scope is the active tab/target only — it is NOT auto-applied
to other open tabs or to tabs opened later; re-run per tab if needed. |
| inspect_element | Return tag, attributes, position, text for an element. |
| get_attribute | Get attribute value of element. |
| query_selector_all | Return count + attrs of all elements matching CSS selector. |
| get_links | |
| list_frames | List all iframes and their URLs. |
| frame_evaluate | Run JS inside an iframe matching URL pattern. Same-origin frames only: cross-origin iframes (reCAPTCHA bframe, payment
widgets, third-party embeds) block contentWindow.eval and return an error.
|
| batch_actions | Execute a list of actions sequentially. Each action: {type: click\|fill\|type\|wait\|press\|navigate, ...params}
Example: [{"type":"click","ref":"e3"},{"type":"fill","ref":"e4","value":"x"}]
|
| fill_form | Fill multiple fields then optionally submit. fields: [{ref: "e1", value: "..."}, {selector: "#email", value: "..."}]
|
| navigate_and_snapshot | Navigate then immediately snapshot — common pattern. |
| get_viewport_size | Return current window dimensions. |
| set_viewport_size | Resize the browser window. |
| scroll | Scroll page via REAL mouseWheel CDP events (not JS scrollBy).
humanize=True (default): variable chunks of 50-150px + micro-pauses + a 20%
reading-pause chance; aimed at DataDome/PerimeterX behavioral detection.
humanize=False: instant scroll (faster, less stealthy).
Directions: up \| down \| top \| bottom
|
| scroll_to | Smooth-scroll a specific element into viewport. Args:
ref: snapshot ref (e.g. "e7") from browser_snapshot
selector: CSS selector alternative
block: "start" \| "center" \| "end" \| "nearest" (vertical alignment)
smooth: CSS smooth scroll (default) vs instant jump
Works even if element is far off-screen (pages of scroll away).
|
| dialog_handle | Pre-arm handler for next alert/confirm/prompt. Call BEFORE action that triggers it. |
| dialog_auto_handle | ⭐ Install a PERSISTENT auto-handler for native browser dialogs.
Unlike dialog_handle (one-shot, action baked in at arm time), this
one stays armed across many dialogs and reads its config at fire
time — call again with new action/types to update without re-arming. Args:
action: "accept" (Leave / OK) or "dismiss" (Cancel / Stay)
enabled: True to arm, False to disable (config preserved)
types: optional list to scope handling — any of:
["alert", "confirm", "prompt", "beforeunload"]
None (default) = handle all types.
text: prompt response when action="accept" on prompt() dialogs;
also basic-auth "user:pass" for HTTP 401.
Common patterns:
# Form pages with "unsaved changes" guard — auto-leave forever
dialog_auto_handle(action="accept", types=["beforeunload"])
# Pages that spam alert() — auto-OK
dialog_auto_handle(action="accept", types=["alert"])
# Disable when done
dialog_auto_handle(enabled=False)
Native dialogs only (Chrome's own card UI). HTML/CSS modal overlays
are regular DOM — use click_text("Cancel") / click_role for those.
|
| accessibility_snapshot | Return ARIA accessibility tree of current page. |
| console_start | Begin capturing console messages of active tab. |
| console_get | Retrieve captured console messages (chronological: oldest first within the last limit, newest last). |
| network_start | Begin capturing network requests + responses with full headers. Args:
capture_bodies: if True (default), also indexes by request_id so
network_get(include_body=True) can fetch response bodies on
demand via CDP Network.getResponseBody.
|
| network_get | Retrieve captured network events. Args:
limit: max entries returned (chronological: oldest first within the last `limit`, newest last)
filter_url: substring filter on URL
include_body: fetch response bodies via CDP Network.getResponseBody
for each matching entry. Bodies are truncated to max_body_bytes.
Requires network_start(capture_bodies=True) (default).
max_body_bytes: cap per-body length (default 10000)
full: if True, return entries with full headers + body fields
from network_index (use this once you've called network_start).
Default False = legacy flat event stream (backward compat).
|
| server_status | Diagnostic info about the server and browser. |
| get_page_errors | Retrieve JS errors caught on active tab. |
| export_har | Export captured network traffic to HAR-like JSON file. |
| detect_content_pattern | Heuristically detect the most likely repeating container on page. Useful for scraping job listings, product cards, search results.
Returns top-3 candidate CSS selectors ranked by child-similarity. |
| extract_structured | Extract structured data from repeating containers.
fields: [{name: "title", selector: ".job-title", attribute: "text\|href\|src\|..."}]
Only direct text nodes of element are captured for "text" (prevents child-field mixing).
|
| extract_table | Extract an HTML table as JSON rows with optional header keys. |
| scrape_page | Clean readable text extraction — drops nav, footer, scripts, styles. Smart-truncates at paragraph boundary (not mid-word). |
| storage_state_save | ⭐ Save cookies + localStorage of current origin to JSON. DIFFERENTIATOR: session reuse is the most reliable way to
avoid Cloudflare Turnstile, because the challenge does not trigger while the saved session is still valid.
Log in manually once → save state → reuse until the session expires.
|
| storage_state_load | ⭐ Load cookies + localStorage from a saved JSON file. Call BEFORE navigating to protected site so session is ready. |
| solve_captcha | Solve a CAPTCHA via CapSolver HTTP API.
kind: turnstile \| recaptcha_v2 \| recaptcha_v3 \| hcaptcha
Needs CAPSOLVER_KEY env var (or pass api_key). Returns solved token.
If inject_selector given, also injects token into that form field
(e.g. input[name='cf-turnstile-response']).
|
| verify_cf | ⭐ Use nodriver's built-in Cloudflare challenge verification. Uses OpenCV template matching to find the Turnstile checkbox on a screenshot
and click it. template_image is a path to a cropped image of the checkbox;
without it, the bundled English default is used.
Works on simple CF interstitials. For managed-mode Turnstile (ChatGPT-level),
combine with storage_state or solve_captcha.
|
| fingerprint_rotate | Override fingerprint vectors for active tab: user_agent, accept_language,
platform (Win32/MacIntel/Linux x86_64), timezone (Asia/Jakarta, etc).
Applied via CDP. Persists until next tab creation.
|
| humanize_click | ⭐ Click with Bezier-curve mouse approach + randomized dwell. |
| humanize_type | ⭐ Type with Gaussian-distributed keystroke delays. |
| click_turnstile | Auto-find and click the Cloudflare Turnstile checkbox. Three-tier detection strategy:
1. Primary selectors: iframe[src*=challenges.cloudflare.com], [data-sitekey], .cf-turnstile
2. Secondary: .turnstile, input[name=cf-turnstile-response] → nearest sized container
3. Fallback (if fallback_template=True): OpenCV template match via verify_cf
— covers out-of-process iframe cases (e.g. nopecha.com/captcha/turnstile)
Args:
offset_x: pixels from widget left edge (default 30, calibrated for CF checkbox)
offset_y: vertical offset (default = container center)
fallback_template: if selectors fail, try OpenCV template click (default True)
Known to work on: 2captcha.com/demo/cloudflare-turnstile, dash.cloudflare.com login,
nopecha.com/captcha/turnstile (via template fallback).
Does NOT work on: Cloudflare managed-mode interstitials ("Just a moment..." full-page
challenges) — use solve_captcha or storage_state_load for those.
|
| click_element_offset | Click inside element at percentage position (not center). Examples:
x_percent=8 → checkbox at left edge of label
x_percent=90 → right-side toggle slider
y_percent=20 → top portion of a card
|
| click_at_corner | Click at a corner of element (close X buttons, delete icons, dismiss).
corner: top-left \| top-right \| bottom-left \| bottom-right
offset: inset pixels from corner (default 8px — works for most X buttons)
|
| find_by_image | ⭐ Find an image on the current page via OpenCV template matching. Takes a fresh screenshot, matches against template_path image, returns
(x, y) center of best match. Use for finding visual buttons/icons when
DOM selectors aren't available.
Returns JSON: {"found": true, "x": ..., "y": ..., "score": ..., "template": "..."}
|
| click_at_image | ⭐ Find image via template matching, then click its center. Combines find_by_image + humanize_move + mouse_click. Useful for visual
CAPTCHAs, custom buttons without reliable selectors, or interacting with
canvas-based UIs.
|
| mouse_drift | ⭐ Simulate idle mouse wandering to pass behavioral ML. Random Bezier segments across the viewport — mimics a user thinking.
Call BEFORE a critical interaction (form submit, button click) to
establish 'human' behavior pattern before the deterministic action. |
| mouse_record | ⭐ Record real mouse movements from the page for later replay. Injects a listener that captures mousemove events during duration. Move
your mouse naturally in the Chrome window while this runs. The recorded
path can then be played back via mouse_replay() — highest-stealth
behavioral pattern (indistinguishable from human).
Returns: JSON array of {t, x, y} events.
|
| mouse_replay | ⭐ Replay a recorded mouse path (from mouse_record). Args:
path_json: JSON array of {t, x, y} from mouse_record
speed: 1.0 = original speed, 2.0 = twice as fast, 0.5 = half speed
|
| solve_recaptcha_ai | Solve reCAPTCHA v2 image challenge using a vision-enabled LLM. Supports Anthropic (Claude) OR any OpenAI-compatible API (gpt-4o, gpt-5.x,
Groq llama3.2-vision, local Ollama llava, Together.ai, Fireworks, etc).
⚠️ MODEL MUST BE MULTIMODAL (vision-capable) — text-only models fail silently.
✅ Supported: gpt-4o, gpt-5.x, claude-opus-4-7, llava, llama-3.2-90b-vision-preview
❌ NOT: gpt-3.5-turbo, llama3 (non-vision)
Env vars (OpenAI SDK standard — priority checked if args omitted):
OPENAI_API_KEY + OPENAI_BASE_URL + OPENAI_MODEL → OpenAI-compat
ANTHROPIC_API_KEY + ANTHROPIC_MODEL → Claude
AI_VISION_* (legacy, DEPRECATED — removed v0.2.0) → backward-compat
Explicit override:
provider="anthropic" \| "openai"
base_url="https://your-provider.example.com/v1"
api_key="..."
model="gpt-4o" \| "claude-opus-4-7" \| ...
Cost: varies by provider (~$0.005-0.03 per solve).
|
| vision_locate | ⭐ Find an element by natural-language description using a vision LLM. Uses the same provider as solve_recaptcha_ai (OPENAI_* / ANTHROPIC_* env).
Reuses solve_recaptcha_ai's vision plumbing so any vision-capable model
works (gpt-4o, gpt-5.x, claude, llava, llama-3.2-vision).
Args:
description: NL description, e.g. "the red Create button at bottom right"
click: if True, also dispatches a CDP mouse_click at the located point
api_key/base_url/model/provider: explicit overrides (else from env)
Returns JSON: {"found":true/false, "x":int, "y":int, "confidence":"high\|medium\|low"}.
Use when CSS selectors are unreliable (visual-only differentiator, dynamic IDs).
|
| http_request | HTTP request with a browser-grade TLS fingerprint via curl_cffi.
Use for API scraping after browser login; impersonates real Chrome's JA3/JA4 fingerprint. Args:
url, method: target URL and HTTP verb
impersonate: chrome, chrome124, firefox, safari, edge (default chrome)
use_browser_cookies: auto-inject cookies from active browser tab
headers, params: extra headers/query params
data: raw body string (form-urlencoded or custom)
json_body: JSON body dict (sets Content-Type automatically)
timeout, follow_redirects: usual HTTP options
return_mode: auto (json if parseable else text), json, text, or meta (status+headers only)
|
| http_session_cookies | ⭐ Inspect which browser cookies would be sent with a request to URL. Helpful to verify session sharing works before making requests. |
| session_warmup | Warm up session by navigating naturally before hitting target URL.
Anti-bot systems score trust by session history — direct deep-URL hits look suspicious. Patterns:
- homepage_first: goto origin → wait → goto target
- referer_chain: goto origin → find link to target → click
- natural_browse: homepage → scroll → random click → scroll → target
|
| detect_anti_bot | ⭐ Analyze current page + HTTP headers to identify anti-bot system. Detects: Cloudflare, DataDome, PerimeterX/HUMAN, Akamai Bot Manager,
Kasada, Imperva/Incapsula, F5 Shape, none. Returns system + recommended
bypass strategy from our toolkit.
|