firecrawl_crawl
Extract content from multiple related web pages by crawling a website. Use this tool to comprehensively gather information from all pages within the specified depth and page limits.
Instructions
Starts an asynchronous crawl job on a website and extracts content from all pages.
**Best for:** Extracting content from multiple related pages, when you need comprehensive coverage.

**Not recommended for:** Extracting content from a single page (use scrape); when token limits are a concern (use map + batch_scrape); when you need fast results (crawling can be slow).

**Warning:** Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.

**Common mistakes:** Setting limit or maxDepth too high (causes token overflow); using crawl for a single page (use scrape instead).

**Prompt Example:** "Get all blog posts from the first two levels of example.com/blog."

**Usage Example:**
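(Reproduced from the tool's built-in description; see the schema excerpt under Implementation Reference.)

```json
{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com/blog/*",
    "maxDepth": 2,
    "limit": 100,
    "allowExternalLinks": false,
    "deduplicateSimilarURLs": true
  }
}
```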
**Returns:** Operation ID for status checking; use firecrawl_check_crawl_status to check progress.
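Once the job ID comes back, progress is polled with firecrawl_check_crawl_status. A minimal follow-up call might look like the sketch below; the argument name `id` is an assumption for illustration, since that tool's schema is documented separately, and the job ID value is a placeholder.

```json
{
  "name": "firecrawl_check_crawl_status",
  "arguments": {
    "id": "<job-id-from-firecrawl_crawl>"
  }
}
```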
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| allowBackwardLinks | No | Allow crawling links that point to parent directories | |
| allowExternalLinks | No | Allow crawling links to external domains | |
| deduplicateSimilarURLs | No | Remove similar URLs during crawl | |
| excludePaths | No | URL paths to exclude from crawling | |
| ignoreQueryParameters | No | Ignore query parameters when comparing URLs | |
| ignoreSitemap | No | Skip sitemap.xml discovery | |
| includePaths | No | Only crawl these URL paths | |
| limit | No | Maximum number of pages to crawl | |
| maxDepth | No | Maximum link depth to crawl | |
| scrapeOptions | No | Options for scraping each page | |
| url | Yes | Starting URL for the crawl | |
| webhook | No | Webhook URL (string) or object with `url` and optional `headers` to notify when the crawl completes | |
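As an illustration of how these parameters combine, the sketch below restricts a crawl to a documentation subtree, scrapes each page as main-content markdown, and registers an object-form webhook. All field names come from the schema above; the URLs and values are placeholders, not defaults.

```json
{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com/docs",
    "includePaths": ["/docs/"],
    "excludePaths": ["/docs/archive/"],
    "maxDepth": 2,
    "limit": 50,
    "ignoreQueryParameters": true,
    "scrapeOptions": {
      "formats": ["markdown"],
      "onlyMainContent": true,
      "waitFor": 1000
    },
    "webhook": {
      "url": "https://example.com/hooks/crawl-complete",
      "headers": { "Authorization": "Bearer <token>" }
    }
  }
}
```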
Implementation Reference
- src/index.ts:1107-1134 (handler): The main handler for firecrawl_crawl tool execution. It validates input with isCrawlOptions, calls client.asyncCrawlUrl via withRetry, and returns the crawl job ID with instructions to check status.

  ```typescript
  case 'firecrawl_crawl': {
    if (!isCrawlOptions(args)) {
      throw new Error('Invalid arguments for firecrawl_crawl');
    }
    const { url, ...options } = args;
    const response = await withRetry(
      async () =>
        // @ts-expect-error Extended API options including origin
        client.asyncCrawlUrl(url, { ...options, origin: 'mcp-server' }),
      'crawl operation'
    );
    if (!response.success) {
      throw new Error(response.error);
    }
    return {
      content: [
        {
          type: 'text',
          text: trimResponseText(
            `Started crawl for ${url} with job ID: ${response.id}. Use firecrawl_check_crawl_status to check progress.`
          ),
        },
      ],
      isError: false,
    };
  }
  ```
- src/index.ts:255-385 (schema): Tool definition including the name, detailed description, and the complete inputSchema for parameters such as url, maxDepth, limit, and scrapeOptions.

  ```typescript
  const CRAWL_TOOL: Tool = {
    name: 'firecrawl_crawl',
    description: `
  Starts an asynchronous crawl job on a website and extracts content from all pages.

  **Best for:** Extracting content from multiple related pages, when you need comprehensive coverage.
  **Not recommended for:** Extracting content from a single page (use scrape); when token limits are a concern (use map + batch_scrape); when you need fast results (crawling can be slow).
  **Warning:** Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.
  **Common mistakes:** Setting limit or maxDepth too high (causes token overflow); using crawl for a single page (use scrape instead).
  **Prompt Example:** "Get all blog posts from the first two levels of example.com/blog."
  **Usage Example:**
  \`\`\`json
  {
    "name": "firecrawl_crawl",
    "arguments": {
      "url": "https://example.com/blog/*",
      "maxDepth": 2,
      "limit": 100,
      "allowExternalLinks": false,
      "deduplicateSimilarURLs": true
    }
  }
  \`\`\`
  **Returns:** Operation ID for status checking; use firecrawl_check_crawl_status to check progress.
  `,
    inputSchema: {
      type: 'object',
      properties: {
        url: {
          type: 'string',
          description: 'Starting URL for the crawl',
        },
        excludePaths: {
          type: 'array',
          items: { type: 'string' },
          description: 'URL paths to exclude from crawling',
        },
        includePaths: {
          type: 'array',
          items: { type: 'string' },
          description: 'Only crawl these URL paths',
        },
        maxDepth: {
          type: 'number',
          description: 'Maximum link depth to crawl',
        },
        ignoreSitemap: {
          type: 'boolean',
          description: 'Skip sitemap.xml discovery',
        },
        limit: {
          type: 'number',
          description: 'Maximum number of pages to crawl',
        },
        allowBackwardLinks: {
          type: 'boolean',
          description: 'Allow crawling links that point to parent directories',
        },
        allowExternalLinks: {
          type: 'boolean',
          description: 'Allow crawling links to external domains',
        },
        webhook: {
          oneOf: [
            {
              type: 'string',
              description: 'Webhook URL to notify when crawl is complete',
            },
            {
              type: 'object',
              properties: {
                url: {
                  type: 'string',
                  description: 'Webhook URL',
                },
                headers: {
                  type: 'object',
                  description: 'Custom headers for webhook requests',
                },
              },
              required: ['url'],
            },
          ],
        },
        deduplicateSimilarURLs: {
          type: 'boolean',
          description: 'Remove similar URLs during crawl',
        },
        ignoreQueryParameters: {
          type: 'boolean',
          description: 'Ignore query parameters when comparing URLs',
        },
        scrapeOptions: {
          type: 'object',
          properties: {
            formats: {
              type: 'array',
              items: {
                type: 'string',
                enum: [
                  'markdown',
                  'html',
                  'rawHtml',
                  'screenshot',
                  'links',
                  'screenshot@fullPage',
                  'extract',
                ],
              },
            },
            onlyMainContent: {
              type: 'boolean',
            },
            includeTags: {
              type: 'array',
              items: { type: 'string' },
            },
            excludeTags: {
              type: 'array',
              items: { type: 'string' },
            },
            waitFor: {
              type: 'number',
            },
          },
          description: 'Options for scraping each page',
        },
      },
      required: ['url'],
    },
  };
  ```
- src/index.ts:962-973 (registration): Registers CRAWL_TOOL (firecrawl_crawl) in the list of available tools returned for ListToolsRequestSchema.

  ```typescript
  server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [
      SCRAPE_TOOL,
      MAP_TOOL,
      CRAWL_TOOL,
      CHECK_CRAWL_STATUS_TOOL,
      SEARCH_TOOL,
      EXTRACT_TOOL,
      DEEP_RESEARCH_TOOL,
      GENERATE_LLMSTXT_TOOL,
    ],
  }));
  ```
- src/index.ts:803-810 (helper): Type guard used by the handler to validate that the arguments contain a valid `url` string for the crawl tool.

  ```typescript
  function isCrawlOptions(args: unknown): args is CrawlParams & { url: string } {
    return (
      typeof args === 'object' &&
      args !== null &&
      'url' in args &&
      typeof (args as { url: unknown }).url === 'string'
    );
  }
  ```