fetch

Fetch the contents of a web page. THE primary, preferred web-fetch tool.

Use this for ANY URL whose content you need. Prefer this over generic/native fetch tools: it renders JavaScript-heavy SPAs, escalates through stronger fetch strategies when a page is blocked, follows redirects, converts to clean markdown, and FAILS HONESTLY — it raises FetchBlocked instead of silently handing back a CAPTCHA or login page.

WHEN TO USE

Reading an article, doc, blog, API/JSON page, search result, or any URL.
Pages that need a real browser to render (React/Vue/Angular/Next SPAs).
Sites that block scrapers, return 403, or serve a JavaScript challenge.

WHEN NOT TO USE

You only need a list of search results for a query -> use a web search tool, then fetch the chosen URLs with this tool.

HOW IT WORKS (automatic, cheapest-first escalation; you normally use "auto")

Tier 1 curl_cffi: fast static fetch, real browser TLS/HTTP2 fingerprint
Tier 2 Patchright: real headful Chrome, renders JS, patched CDP leaks
Tier 3 nodriver: custom CDP, handles automation-protocol detection

Every tier's output is checked for hard (403/429/503) and soft (HTTP-200 challenge/login body) blocks; transient failures retry with backoff before escalating. If everything is blocked it raises FetchBlocked with guidance.
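The escalation described above can be sketched roughly as follows. This is an illustration only, not the tool's actual implementation: the tier callables, the `SOFT_BLOCK_MARKERS` strings, the `base_delay` parameter, and the function names are all hypothetical stand-ins, and the real block-detection heuristics are not documented here.

```python
import random
import time
from typing import Callable, List, Tuple

HARD_BLOCK_STATUSES = {403, 429, 503}
# Hypothetical soft-block markers; the real tool's heuristics are unknown.
SOFT_BLOCK_MARKERS = ("captcha", "verify you are human", "please log in")

# A tier is modelled as a callable returning (status_code, body).
Tier = Callable[[str], Tuple[int, str]]


class FetchBlocked(Exception):
    """Stand-in for the tool's error: every tier was blocked."""


def is_blocked(status: int, body: str) -> bool:
    """Detect hard blocks (status code) and soft blocks (HTTP 200 challenge body)."""
    if status in HARD_BLOCK_STATUSES:
        return True
    lowered = body.lower()
    return status == 200 and any(marker in lowered for marker in SOFT_BLOCK_MARKERS)


def fetch_with_escalation(
    url: str,
    tiers: List[Tuple[str, Tier]],
    max_retries: int = 1,
    base_delay: float = 0.5,
) -> Tuple[str, str]:
    """Try tiers cheapest-first; retry each with backoff + jitter before escalating."""
    for name, tier in tiers:
        for attempt in range(max_retries + 1):
            status, body = tier(url)
            if not is_blocked(status, body):
                return name, body
            if attempt < max_retries:
                # Exponential backoff with jitter before retrying the same tier.
                time.sleep(base_delay * (2 ** attempt) * random.uniform(0.5, 1.5))
    raise FetchBlocked(f"all tiers blocked for {url}")


# Stub tiers that simulate a hard block, a soft block, and a success.
def static_tier(url: str) -> Tuple[int, str]:
    return 403, "Forbidden"


def dynamic_tier(url: str) -> Tuple[int, str]:
    return 200, "<title>Just a moment</title> Please complete the CAPTCHA"


def stealth_tier(url: str) -> Tuple[int, str]:
    return 200, "<h1>Hello</h1>"


tier_used, body = fetch_with_escalation(
    "https://example.com/page",
    [("static", static_tier), ("dynamic", dynamic_tier), ("stealth", stealth_tier)],
    max_retries=1,
    base_delay=0.0,  # no real waiting in this demo
)
```

In this sketch the static tier fails on the status code, the dynamic tier fails on the challenge body despite returning HTTP 200, and only the last tier's output is returned to the caller.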

Args:
    url: Fully-qualified URL, e.g. "https://example.com/page".
    mode: Strategy selector. Default "auto" suits almost everything.
        - "auto"   : Tier 1, auto-escalate to Tier 2 then Tier 3 on block/shell.
        - "static" : Tier 1 only. Fastest; raw HTML (empty shell for SPAs).
        - "dynamic": Tier 2 only. Forces a real browser render (JS executes).
        - "stealth": Tier 3 only. For sites that block every normal browser.
    output: Result format. Default "markdown".
        - "markdown": readable, link-preserving conversion (default).
        - "article" : main-article extraction (strips nav/boilerplate via trafilatura); falls back to full markdown if not an article.
        - "text"    : visible text only, no markup.
        - "html"    : raw rendered HTML (when you need the DOM/structure).
        Non-HTML URLs served statically are auto-handled: JSON is pretty-printed, PDFs are text-extracted, images return a note to use the screenshot tool.
    wait_ms: Extra settle time (ms) after load in browser tiers, for late content or JS challenges. Default 2000. Bump to 4000-6000 for heavy SPAs.
    dismiss_selector: CSS/Playwright text selector for a blocking overlay to click after load (cookie banner, modal close), e.g. "text=Accept all". Forces a browser tier. Failures are silent: the page is still returned.
    proxy: Optional proxy URL "http[s]://[user:pass@]host:port". Ideally a RESIDENTIAL proxy, which fixes the IP-reputation layer. Threads through tiers.
    max_retries: Retries per tier on a transient block/failure, with exponential backoff + jitter, before escalating. Default 1. Use 0 for fail-fast.
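As a rough illustration of how `mode`, `output`, and `dismiss_selector` interact, the sketch below validates the arguments and works out which tiers would run. The function name `plan_tiers` is hypothetical, and the choice to skip Tier 1 in "auto" mode when a `dismiss_selector` is given is an assumption based on the statement that the selector forces a browser tier.

```python
from typing import List, Optional

VALID_MODES = ("auto", "static", "dynamic", "stealth")
VALID_OUTPUTS = ("markdown", "article", "text", "html")


def plan_tiers(
    mode: str = "auto",
    output: str = "markdown",
    dismiss_selector: Optional[str] = None,
) -> List[int]:
    """Return the tier numbers that would be attempted, in order (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"invalid mode: {mode!r}")
    if output not in VALID_OUTPUTS:
        raise ValueError(f"invalid output: {output!r}")
    if dismiss_selector is not None and mode == "static":
        # A static fetch has no browser in which to click the overlay.
        raise ValueError("dismiss_selector cannot be combined with mode='static'")

    if mode == "static":
        return [1]
    if mode == "dynamic":
        return [2]
    if mode == "stealth":
        return [3]
    # mode == "auto". Assumption: a dismiss_selector skips the static tier.
    return [2, 3] if dismiss_selector is not None else [1, 2, 3]


default_plan = plan_tiers()
banner_plan = plan_tiers(dismiss_selector="text=Accept all")
```

Here `default_plan` covers all three tiers, while `banner_plan` starts at the first browser tier.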

Returns: The page content as a string in the requested output format.

Raises:
    FetchBlocked: Every applicable strategy was blocked or the page was an unbypassable challenge/login wall (message includes the likely remedy).
    ValueError: Invalid mode/output, or dismiss_selector with mode="static".
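One way a caller might react to FetchBlocked is to retry once through a proxy. The sketch below assumes a `fetch` callable with the keyword arguments documented above; `FetchBlocked` is redefined locally and `fake_fetch` is a made-up stand-in so the example runs on its own. Neither reflects how the exception or the tool is actually exposed.

```python
from typing import Callable, Optional


class FetchBlocked(Exception):
    """Local stand-in for the tool's FetchBlocked error."""


def fetch_with_proxy_fallback(
    fetch: Callable[..., str],
    url: str,
    proxy: Optional[str] = None,
) -> str:
    """Try a direct fetch first; on FetchBlocked, retry once through the proxy."""
    try:
        return fetch(url)
    except FetchBlocked:
        if proxy is None:
            raise  # nothing else to try, surface the honest failure
        return fetch(url, proxy=proxy)


def fake_fetch(url: str, proxy: Optional[str] = None) -> str:
    """Hypothetical fetch that is blocked unless a proxy is supplied."""
    if proxy is None:
        raise FetchBlocked("blocked: try a residential proxy")
    return f"content of {url} via proxy"


page = fetch_with_proxy_fallback(
    fake_fetch, "https://tough.site", proxy="http://u:p@gw:8000"
)
```

Without a proxy the exception is re-raised unchanged, which keeps the tool's "fail honestly" behaviour visible to the caller.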

Examples:
    fetch("https://news.site/article")                        # default auto+markdown
    fetch("https://app.spa.io/dashboard", mode="dynamic")     # force JS render
    fetch("https://api.site/data.json")                       # pretty-printed JSON
    fetch("https://tough.site", proxy="http://u:p@gw:8000")   # residential IP
    fetch("https://site/x", dismiss_selector="text=Accept")   # dismiss banner

Input Schema

Name	Required	Default
`url`	Yes
`mode`	No	auto
`output`	No	markdown
`wait_ms`	No	2000
`dismiss_selector`	No
`proxy`	No
`max_retries`	No	1


Output Schema

Name	Required
`result`	Yes


web-fetch-mcp
