invoke_firecrawl_llmtxt
Crawl a website to generate a standardized markdown file (llmfull.txt) for LLM inference. Data is extracted using gpt-4o-mini, and the results are uploaded to a specified S3 URI.
Instructions
Start an asynchronous llmfull.txt generation job using Firecrawl. This file is a standardized markdown file containing information to help LLMs use a website at inference time. The llmstxt endpoint leverages Firecrawl to crawl the website and extracts data using gpt-4o-mini.

Args:
- url: URL to crawl
- s3_uri: S3 URI where results will be uploaded
- max_urls: Maximum number of pages to crawl (1-100, default: 10)
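Because this runs as an MCP tool, an invocation ultimately takes the shape of a JSON-RPC `tools/call` request carrying the arguments above. A minimal sketch of such a request follows; the URL, bucket path, and request id are placeholder values, not real endpoints:

```python
import json

# Hypothetical MCP "tools/call" request for invoke_firecrawl_llmtxt.
# The url and s3_uri values below are illustrative placeholders.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "invoke_firecrawl_llmtxt",
        "arguments": {
            "url": "https://example.com",
            "s3_uri": "s3://my-bucket/llmfull/",
            "max_urls": 25,  # optional; defaults to 10, capped at 100
        },
    },
}

# Serialize for sending over the MCP transport.
print(json.dumps(request, indent=2))
```

The job is asynchronous, so the response returns a job identifier rather than the finished file; progress can then be polled with the companion check_llmtxt_status tool listed below.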
Input Schema
Name | Required | Description | Default |
---|---|---|---|
max_urls | No | Maximum number of pages to crawl (1-100) | 10 |
s3_uri | Yes | S3 URI where results will be uploaded | |
url | Yes | URL to crawl | |
Input Schema (JSON Schema)
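The schema itself is not reproduced on this page. A sketch of what it likely looks like, inferred from the parameter table above (the field names come from the table; types, bounds, and overall structure are assumptions):

```json
{
  "type": "object",
  "properties": {
    "url": {
      "type": "string",
      "description": "URL to crawl"
    },
    "s3_uri": {
      "type": "string",
      "description": "S3 URI where results will be uploaded"
    },
    "max_urls": {
      "type": "integer",
      "description": "Maximum number of pages to crawl",
      "minimum": 1,
      "maximum": 100,
      "default": 10
    }
  },
  "required": ["url", "s3_uri"]
}
```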
You must be authenticated.
Other Tools from Unstructured API MCP Server
- cancel_crawlhtml_job
- cancel_job
- check_crawlhtml_status
- check_llmtxt_status
- create_astradb_destination
- create_azure_source
- create_gdrive_source
- create_neo4j_destination
- create_s3_destination
- create_s3_source
- create_weaviate_destination
- create_workflow
- delete_astradb_destination
- delete_azure_source
- delete_gdrive_source
- delete_neo4j_destination
- delete_s3_destination
- delete_s3_source
- delete_weaviate_destination
- delete_workflow
- get_destination_info
- get_job_info
- get_source_info
- get_workflow_info
- invoke_firecrawl_crawlhtml
- invoke_firecrawl_llmtxt
- list_destinations
- list_jobs
- list_sources
- list_workflows
- run_workflow
- update_astradb_destination
- update_azure_source
- update_gdrive_source
- update_neo4j_destination
- update_s3_destination
- update_s3_source
- update_weaviate_destination
- update_workflow
Related Tools
- @JoeBuildsStuff/mcp-jina-ai
- @apify/mcp-server-rag-web-browser
- @spences10/mcp-jinaai-reader
- @spences10/mcp-jinaai-search
- @apappascs/tavily-search-mcp-server