invoke_firecrawl_llmtxt

Crawl a website to generate a standardized markdown file (llmfull.txt) for LLM inference, extracting data via GPT-4 optimization. Results are uploaded to a specified S3 URI for processing.

Instructions

Start an asynchronous llmfull.txt generation job using Firecrawl. This file is a standardized markdown file containing information to help LLMs use a website at inference time. The llmstxt endpoint leverages Firecrawl to crawl your website and extracts data using gpt-4o-mini Args: url: URL to crawl s3_uri: S3 URI where results will be uploaded max_urls: Maximum number of pages to crawl (1-100, default: 10)

Returns:
    Dictionary with job information including the job ID

Input Schema

Name	Required	Description	Default
`max_urls`	No
`s3_uri`	Yes
`url`	Yes

Input Schema (JSON Schema)

{
  "properties": {
    "max_urls": {
      "default": 10,
      "title": "Max Urls",
      "type": "integer"
    },
    "s3_uri": {
      "title": "S3 Uri",
      "type": "string"
    },
    "url": {
      "title": "Url",
      "type": "string"
    }
  },
  "required": [
    "url",
    "s3_uri"
  ],
  "title": "invoke_firecrawl_llmtxtArguments",
  "type": "object"
}

Unstructured API MCP Server