Firecrawl MCP Server

by ampcome-mcps
MIT License

firecrawl_crawl

Extract content from multiple related web pages by crawling a website. Use this tool to comprehensively gather information from all pages within specified depth and URL limits.

Instructions

Starts an asynchronous crawl job on a website and extracts content from all pages.

Best for: extracting content from multiple related pages when you need comprehensive coverage.

Not recommended for: extracting content from a single page (use scrape); when token limits are a concern (use map + batch_scrape); when you need fast results (crawling can be slow).

Warning: crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.

Common mistakes: setting limit or maxDepth too high (causes token overflow); using crawl for a single page (use scrape instead).

Prompt example: "Get all blog posts from the first two levels of example.com/blog."

Usage example:

{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com/blog/*",
    "maxDepth": 2,
    "limit": 100,
    "allowExternalLinks": false,
    "deduplicateSimilarURLs": true
  }
}

Returns: Operation ID for status checking; use firecrawl_check_crawl_status to check progress.
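The start-then-poll workflow can be sketched as follows. This is an illustrative sketch only: the `FakeClient` stub, the `run_crawl` helper, and the exact response shapes are assumptions standing in for a real MCP client, not the server's actual API.

```python
# Sketch of the async crawl workflow: firecrawl_crawl returns an operation ID,
# which is then polled via firecrawl_check_crawl_status until the job finishes.
# FakeClient is a stand-in for a real MCP client (assumption for illustration).
import time

class FakeClient:
    """Stub client that simulates a crawl finishing after two status polls."""
    def __init__(self):
        self._polls = 0

    def call_tool(self, name, arguments):
        if name == "firecrawl_crawl":
            return {"id": "crawl-123"}  # operation ID for status checking
        if name == "firecrawl_check_crawl_status":
            self._polls += 1
            status = "completed" if self._polls >= 2 else "scraping"
            return {"status": status, "completed": self._polls * 50, "total": 100}
        raise ValueError(f"unknown tool: {name}")

def run_crawl(client, url, max_depth=2, limit=100, poll_interval=0.0):
    """Start a crawl and poll until it completes or fails; return final status."""
    job = client.call_tool("firecrawl_crawl", {
        "url": url, "maxDepth": max_depth, "limit": limit,
    })
    while True:
        status = client.call_tool("firecrawl_check_crawl_status", {"id": job["id"]})
        if status["status"] in ("completed", "failed"):
            return status
        time.sleep(poll_interval)  # back off between polls

result = run_crawl(FakeClient(), "https://example.com/blog/*")
```

With a real client, the same loop applies unchanged; only `call_tool` is replaced by the client library's actual invocation method.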

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| allowBackwardLinks | No | Allow crawling links that point to parent directories | |
| allowExternalLinks | No | Allow crawling links to external domains | |
| deduplicateSimilarURLs | No | Remove similar URLs during crawl | |
| excludePaths | No | URL paths to exclude from crawling | |
| ignoreQueryParameters | No | Ignore query parameters when comparing URLs | |
| ignoreSitemap | No | Skip sitemap.xml discovery | |
| includePaths | No | Only crawl these URL paths | |
| limit | No | Maximum number of pages to crawl | |
| maxDepth | No | Maximum link depth to crawl | |
| scrapeOptions | No | Options for scraping each page | |
| url | Yes | Starting URL for the crawl | |
| webhook | No | Webhook URL (string), or an object with a required url and optional headers, notified when the crawl completes | |

Input Schema (JSON Schema)

{
  "properties": {
    "allowBackwardLinks": {
      "description": "Allow crawling links that point to parent directories",
      "type": "boolean"
    },
    "allowExternalLinks": {
      "description": "Allow crawling links to external domains",
      "type": "boolean"
    },
    "deduplicateSimilarURLs": {
      "description": "Remove similar URLs during crawl",
      "type": "boolean"
    },
    "excludePaths": {
      "description": "URL paths to exclude from crawling",
      "items": { "type": "string" },
      "type": "array"
    },
    "ignoreQueryParameters": {
      "description": "Ignore query parameters when comparing URLs",
      "type": "boolean"
    },
    "ignoreSitemap": {
      "description": "Skip sitemap.xml discovery",
      "type": "boolean"
    },
    "includePaths": {
      "description": "Only crawl these URL paths",
      "items": { "type": "string" },
      "type": "array"
    },
    "limit": {
      "description": "Maximum number of pages to crawl",
      "type": "number"
    },
    "maxDepth": {
      "description": "Maximum link depth to crawl",
      "type": "number"
    },
    "scrapeOptions": {
      "description": "Options for scraping each page",
      "properties": {
        "excludeTags": { "items": { "type": "string" }, "type": "array" },
        "formats": {
          "items": {
            "enum": [
              "markdown",
              "html",
              "rawHtml",
              "screenshot",
              "links",
              "screenshot@fullPage",
              "extract"
            ],
            "type": "string"
          },
          "type": "array"
        },
        "includeTags": { "items": { "type": "string" }, "type": "array" },
        "onlyMainContent": { "type": "boolean" },
        "waitFor": { "type": "number" }
      },
      "type": "object"
    },
    "url": {
      "description": "Starting URL for the crawl",
      "type": "string"
    },
    "webhook": {
      "oneOf": [
        {
          "description": "Webhook URL to notify when crawl is complete",
          "type": "string"
        },
        {
          "properties": {
            "headers": {
              "description": "Custom headers for webhook requests",
              "type": "object"
            },
            "url": {
              "description": "Webhook URL",
              "type": "string"
            }
          },
          "required": ["url"],
          "type": "object"
        }
      ]
    }
  },
  "required": ["url"],
  "type": "object"
}
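A payload can be sanity-checked client-side against this schema before calling the tool. The sketch below is a minimal hand-rolled check (an assumption, not the server's actual validation): it verifies the one required field and the primitive types of the top-level parameters.

```python
# Minimal client-side sanity check for firecrawl_crawl arguments, based on the
# JSON schema above: "url" is required; other fields must match their declared
# primitive types. This is an illustrative sketch, not the server's validator.
SCHEMA_TYPES = {
    "allowBackwardLinks": bool,
    "allowExternalLinks": bool,
    "deduplicateSimilarURLs": bool,
    "ignoreQueryParameters": bool,
    "ignoreSitemap": bool,
    "limit": (int, float),       # JSON "number"
    "maxDepth": (int, float),    # JSON "number"
    "url": str,
    "excludePaths": list,        # JSON "array" of strings
    "includePaths": list,
}

def validate_crawl_args(args):
    """Return a list of problems; an empty list means the payload looks valid."""
    problems = []
    if "url" not in args:
        problems.append("missing required field: url")
    for key, expected in SCHEMA_TYPES.items():
        if key in args and not isinstance(args[key], expected):
            problems.append(
                f"{key}: expected {expected}, got {type(args[key]).__name__}"
            )
    return problems

good = {"url": "https://example.com/blog/*", "maxDepth": 2, "limit": 100}
bad = {"maxDepth": "2"}  # missing url, and maxDepth is a string
```

For production use, a full JSON Schema validator (e.g. the third-party jsonschema package) would replace this hand-rolled check.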
