Unstructured API MCP Server

Official

invoke_firecrawl_crawlhtml

Initiate an asynchronous web crawl to extract HTML content from a specified URL. Results are uploaded to an S3 bucket, and an optional limit caps the number of pages crawled.

Instructions

Start an asynchronous web crawl job using Firecrawl to retrieve HTML content.

Args:
    url: URL to crawl
    s3_uri: S3 URI where results will be uploaded
    limit: Maximum number of pages to crawl (default: 100)

Returns:
    Dictionary with crawl job information including the job ID
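As a rough sketch of how a client might call this tool, the snippet below builds an MCP `tools/call` JSON-RPC request carrying the three arguments. The helper name `build_crawl_request`, the example URL, and the bucket path are hypothetical and only for illustration; the tool name and argument names are taken from the description above. Sending the request over a transport is left out.

```python
import json
from typing import Any


def build_crawl_request(
    url: str, s3_uri: str, limit: int = 100, request_id: int = 1
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 'tools/call' message for invoke_firecrawl_crawlhtml.

    Hypothetical helper: it only assembles the message, it does not send it.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "invoke_firecrawl_crawlhtml",
            "arguments": {"url": url, "s3_uri": s3_uri, "limit": limit},
        },
    }


# Example values are placeholders, not real resources.
request = build_crawl_request(
    "https://example.com", "s3://example-bucket/crawl-output/", limit=25
)
print(json.dumps(request, indent=2))
```

Because the crawl is asynchronous, the response is expected to carry job information (including the job ID) rather than the crawled HTML itself.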

Input Schema

Name      Required   Description                                  Default
limit     No         Maximum number of pages to crawl             100
s3_uri    Yes        S3 URI where results will be uploaded
url       Yes        URL to crawl

Input Schema (JSON Schema)

{
  "properties": {
    "limit": {
      "default": 100,
      "title": "Limit",
      "type": "integer"
    },
    "s3_uri": {
      "title": "S3 Uri",
      "type": "string"
    },
    "url": {
      "title": "Url",
      "type": "string"
    }
  },
  "required": [
    "url",
    "s3_uri"
  ],
  "title": "invoke_firecrawl_crawlhtmlArguments",
  "type": "object"
}
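A client may want to check its arguments against this schema before calling the tool. The sketch below is a minimal, standard-library-only check that covers just what this schema uses (required keys, `string` and `integer` types, and the `limit` default); it is not a full JSON Schema validator, and the function name `validate_arguments` is a hypothetical choice.

```python
from typing import Any

# Subset of the schema above that this sketch relies on.
SCHEMA: dict[str, Any] = {
    "properties": {
        "limit": {"default": 100, "type": "integer"},
        "s3_uri": {"type": "string"},
        "url": {"type": "string"},
    },
    "required": ["url", "s3_uri"],
}

# Maps the JSON Schema type names used here to Python types.
TYPE_MAP = {"string": str, "integer": int}


def validate_arguments(args: dict[str, Any]) -> dict[str, Any]:
    """Check required keys and basic types, then fill in schema defaults."""
    missing = [key for key in SCHEMA["required"] if key not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")

    result = dict(args)  # keys not named in the schema pass through unchanged
    for name, spec in SCHEMA["properties"].items():
        if name in result:
            value = result[name]
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, TYPE_MAP[spec["type"]]):
                raise TypeError(f"{name} must be of type {spec['type']}")
        elif "default" in spec:
            result[name] = spec["default"]
    return result


# Placeholder values for illustration only.
checked = validate_arguments(
    {"url": "https://example.com", "s3_uri": "s3://example-bucket/crawl-output/"}
)
print(checked["limit"])
```

Omitting `limit` here yields the schema default of 100, while omitting `url` or `s3_uri` raises a `ValueError`.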