firecrawl_crawl
Extract content from multiple website pages by starting a crawl job. Use for comprehensive coverage of related pages, with options to control depth and scope.
Instructions
Starts a crawl job on a website and extracts content from all pages.
- **Best for:** extracting content from multiple related pages, when you need comprehensive coverage.
- **Not recommended for:** extracting content from a single page (use scrape); when token limits are a concern (use map + batch_scrape); when you need fast results (crawling can be slow).
- **Warning:** crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.
- **Common mistakes:** setting limit or maxDiscoveryDepth too high (causes token overflow) or too low (causes missing pages); using crawl for a single page (use scrape instead); using a /* wildcard (not recommended).
- **Prompt example:** "Get all blog posts from the first two levels of example.com/blog."
- **Usage example:**
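A tool-call argument sketch matching the prompt example above; the specific values (path pattern, depth, page limit) are illustrative assumptions, not recommended settings:

```json
{
  "url": "https://example.com/blog",
  "includePaths": ["/blog/.*"],
  "maxDiscoveryDepth": 2,
  "limit": 50
}
```

Keeping `limit` and `maxDiscoveryDepth` small on a first run makes it easier to gauge response size before widening the crawl.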
Returns: an operation ID; pass it to firecrawl_check_crawl_status to monitor crawl progress.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Starting URL for the crawl | |
| prompt | No | Natural-language prompt from which crawl parameters can be derived | |
| excludePaths | No | URL path patterns to exclude from the crawl | |
| includePaths | No | URL path patterns to include; non-matching paths are skipped | |
| maxDiscoveryDepth | No | Maximum link depth to follow from the starting URL | |
| sitemap | No | Sitemap handling mode | |
| limit | No | Maximum number of pages to crawl | |
| allowExternalLinks | No | Follow links to external domains | |
| allowSubdomains | No | Follow links to subdomains of the main domain | |
| crawlEntireDomain | No | Follow sibling and parent URLs, not only child paths | |
| delay | No | Delay in seconds between scrapes | |
| maxConcurrency | No | Maximum number of concurrent scrapes | |
| webhook | No | Webhook URL (or config object) notified of crawl events | |
| deduplicateSimilarURLs | No | Skip URLs that differ only trivially (e.g. trailing slash) | |
| ignoreQueryParameters | No | Treat URLs differing only in query parameters as the same page | |
| scrapeOptions | No | Scrape options applied to each crawled page | |
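As a minimal sketch of how the schema above composes into a request body, the helper below builds a crawl payload from a few of the listed parameters. The function name and its default values are illustrative assumptions, not part of the tool's API:

```python
import json

def build_crawl_payload(url, limit=20, max_discovery_depth=2,
                        include_paths=None, exclude_paths=None):
    """Assemble a crawl request body from the schema fields above.

    Defaults here are illustrative; keep limit and maxDiscoveryDepth
    small to avoid oversized responses.
    """
    payload = {
        "url": url,                               # required: starting URL
        "limit": limit,                           # cap the number of pages
        "maxDiscoveryDepth": max_discovery_depth, # cap link-follow depth
    }
    # Optional path filters are only included when provided.
    if include_paths:
        payload["includePaths"] = include_paths
    if exclude_paths:
        payload["excludePaths"] = exclude_paths
    return payload

payload = build_crawl_payload("https://example.com/blog",
                              include_paths=["/blog/.*"])
print(json.dumps(payload, indent=2))
```

Omitted optional fields are left out of the payload entirely rather than sent as null, so the crawler's own defaults apply.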