Unstructured API MCP Server

Official

Overview InspectNew Schema Related Servers Reviews Score

invoke_firecrawl_crawlhtml

Initiate an asynchronous web crawl to extract HTML content from a specified URL. Results are stored in an S3 bucket, with control over the maximum number of pages to crawl.

Instructions

Start an asynchronous web crawl job using Firecrawl to retrieve HTML content.

Args:
    url: URL to crawl
    s3_uri: S3 URI where results will be uploaded 
    limit: Maximum number of pages to crawl (default: 100)

Returns:
    Dictionary with crawl job information including the job ID

Input Schema

Name	Required	Description	Default
`limit`	No
`s3_uri`	Yes
`url`	Yes

Input Schema (JSON Schema)

{
  "properties": {
    "limit": {
      "default": 100,
      "title": "Limit",
      "type": "integer"
    },
    "s3_uri": {
      "title": "S3 Uri",
      "type": "string"
    },
    "url": {
      "title": "Url",
      "type": "string"
    }
  },
  "required": [
    "url",
    "s3_uri"
  ],
  "title": "invoke_firecrawl_crawlhtmlArguments",
  "type": "object"
}

Install Server

HTTP connection URL

Related Tools

crawl
@josemartinrodriguezmortaloni/webSearch-Tools
fetch_urls
@everford/fetcher-mcp
search
@apify/mcp-server-rag-web-browser
get_html
@ztobs/cline-browser-use-mcp
fetch_url
@jae-jae/fetcher-mcp
fetch_html
@zcaceres/fetch-mcp

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Unstructured-IO/UNS-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server