Scrapfly MCP

web_scrape

Extract structured data from websites by scraping URLs with customizable HTTP methods, headers, proxies, JavaScript rendering, and anti-bot protection.

Instructions

Scrape a URL with full control. Call the scraping_instruction_enhanced tool before using this tool. Prefer web_get_page for a quick fetch.
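
The ordering rule in these instructions can be sketched as a small helper. This is only an illustration: `plan_tool_calls` and its `needs_full_control` flag are hypothetical names, not part of the Scrapfly MCP server; only the three tool names come from this page.

```python
def plan_tool_calls(needs_full_control: bool) -> list[str]:
    """Return the tool names to call, in order, following the instructions."""
    if not needs_full_control:
        # Quick fetch: the lighter web_get_page tool is preferred.
        return ["web_get_page"]
    # Full control: read the scraping instructions first, then scrape.
    return ["scraping_instruction_enhanced", "web_scrape"]
```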

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | The target URL to scrape. | |
| method | No | The HTTP method to use for the request. | GET |
| body | No | Request body for POST/PUT/PATCH requests. | |
| headers | No | HTTP headers to send. | |
| country | No | The country to use for the proxy. Supports ISO 3166-1 alpha-2 country codes. | |
| proxy_pool | No | The proxy pool to use. Supports public_datacenter_pool and public_residential_pool. | public_datacenter_pool |
| render_js | No | Enable JavaScript rendering with a headless browser. | |
| rendering_wait | No | Wait for this number of milliseconds before returning the response. | |
| asp | No | Enable Anti Scraping Protection. | |
| cache | No | Enable caching of the response. | |
| cache_ttl | No | Cache TTL in seconds when cache is true. | |
| cache_clear | No | If true, bypass and clear the cache for this URL. | |
| retry | No | If false, disable automatic retry on transient errors. | |
| wait_for_selector | No | (Prefer rendering_wait.) Wait for this CSS selector to appear in the page when rendering JS. | |
| lang | No | Languages to use for the request (Accept-Language header). Leave empty for auto-detection/proxy location alignment. | |
| cookies | No | Cookies to send with the request. | |
| format | No | The desired output format for the content. Supports clean_html, markdown, text, and json. | markdown |
| format_options | No | Additional options (only available for the markdown and text formats). | |
| js | No | JavaScript to execute on the page. | |
| js_scenario | No | A schema for validating a sequence of browser actions (JS Scenario) for the Scrapfly API. | |
| screenshots | No | Screenshots with a target (fullpage, selector). Example: [{ 'name': 'my_screenshot', 'target': 'fullpage' }, { 'name': 'my_screenshot2', 'target': 'selector', 'css_selector': '#price' }] | |
| screenshot_flags | No | Screenshot flags to use for the screenshot. | |
| timeout | No | Server-side timeout in milliseconds. (Prefer rendering_wait + timeout.) | |
| extraction_prompt | No | (Avoid if the LLM is thinking and can process the data itself.) If the current LLM cannot be assumed to handle data extraction, an AI prompt that adds an LLM-assisted data extraction step. | |
| extraction_model | No | The extraction model to use for the offloaded extraction. Exclusive with extraction_template and extraction_prompt. | |
| pow | Yes | Use the scraping_instruction_enhanced tool for instructions. | |
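
To make the schema concrete, here is a minimal sketch of how a client might assemble a call to this tool. The `tools/call` envelope follows the MCP JSON-RPC convention; the parameter names come from the input schema on this page. The helper `build_web_scrape_call`, the client-side checks, the example URL, and the `pow` placeholder are all assumptions made for illustration. The server may validate differently, and the real `pow` value has to be obtained by following the scraping_instruction_enhanced tool.

```python
import json

# Parameter names and allowed values are taken from the input schema;
# the checks themselves are illustrative, not the server's own validation.
REQUIRED = ("url", "pow")
BODY_METHODS = {"POST", "PUT", "PATCH"}
FORMATS = {"clean_html", "markdown", "text", "json"}
PROXY_POOLS = {"public_datacenter_pool", "public_residential_pool"}


def build_web_scrape_call(arguments: dict, request_id: int = 1) -> dict:
    """Check the arguments and wrap them in an MCP tools/call request."""
    missing = [name for name in REQUIRED if not arguments.get(name)]
    if missing:
        raise ValueError(f"missing required arguments: {', '.join(missing)}")

    method = arguments.get("method", "GET").upper()
    if "body" in arguments and method not in BODY_METHODS:
        raise ValueError("body is only meaningful for POST/PUT/PATCH requests")

    if "cache_ttl" in arguments and not arguments.get("cache"):
        raise ValueError("cache_ttl only applies when cache is true")

    fmt = arguments.get("format", "markdown")
    if fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt}")
    if "format_options" in arguments and fmt not in {"markdown", "text"}:
        raise ValueError("format_options requires the markdown or text format")

    pool = arguments.get("proxy_pool", "public_datacenter_pool")
    if pool not in PROXY_POOLS:
        raise ValueError(f"unsupported proxy_pool: {pool}")

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "web_scrape", "arguments": arguments},
    }


# Example: render a JavaScript-heavy page and return markdown.
request = build_web_scrape_call({
    "url": "https://example.com/products",  # hypothetical target
    "pow": "<value obtained via scraping_instruction_enhanced>",  # placeholder
    "render_js": True,
    "rendering_wait": 2000,
    "format": "markdown",
})
print(json.dumps(request, indent=2))
```

Checking on the client side before sending keeps obviously inconsistent combinations (for example, a cache_ttl without cache) from costing a round trip, but the server remains the authority on what it accepts.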

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/scrapfly/scrapfly-mcp'
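
For callers who prefer a script to curl, the same request can be sketched with Python's standard library. The URL is the one shown above; the function names are hypothetical, and decoding the body as JSON is an assumption about the response format rather than something this page states.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/scrapfly/scrapfly-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build a GET request for the directory API without sending it."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body as JSON (assumed format)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network call):
#   info = fetch_server_info()
#   print(sorted(info))
```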

If you have feedback or need assistance with the MCP directory API, please join our Discord server.