
Firecrawl MCP Server

by ampcome-mcps

firecrawl_generate_llmstxt

Generate a standardized llms.txt file for any website to define how AI models should interact with the site, creating machine-readable permission guidelines for large language models.

Instructions

Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.

Best for: Creating machine-readable permission guidelines for AI models.

Not recommended for: General content extraction or research.

Arguments:

  • url (string, required): The base URL of the website to analyze.

  • maxUrls (number, optional): Max number of URLs to include (default: 10).

  • showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.

Prompt Example: "Generate an LLMs.txt file for example.com."

Usage Example:

{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}

Returns: LLMs.txt file contents (and optionally llms-full.txt).
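
The Usage Example above is the JSON body of an MCP tools/call request. A minimal sketch of issuing that call programmatically, assuming a client built with the official @modelcontextprotocol/sdk that is already connected to this server (transport setup is omitted, and the function name is illustrative):

import { Client } from '@modelcontextprotocol/sdk/client/index.js';

// Assumes `client` is already connected to the Firecrawl MCP server over
// some transport (stdio, SSE, ...); transport setup is not shown here.
async function generateLlmsTxt(client: Client) {
  // Mirrors the Usage Example above: a tools/call request with the documented arguments.
  const result = await client.callTool({
    name: 'firecrawl_generate_llmstxt',
    arguments: {
      url: 'https://example.com',
      maxUrls: 20,
      showFullText: true,
    },
  });

  // On success, the result's content array carries the llms.txt text
  // (and, when showFullText is true, the llms-full.txt text as well).
  return result;
}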

Input Schema

Name          Required  Description                                               Default
maxUrls       No        Maximum number of URLs to process (1-100, default: 10)   10
showFullText  No        Whether to show the full LLMs-full.txt in the response   false
url           Yes       The URL to generate LLMs.txt from                         -
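
Note that the 1-100 range and the defaults above are documented in the schema but are not enforced by the server's type guard (shown under Implementation Reference), which only verifies that url is a string. A caller that wants to apply the documented defaults and bounds before invoking the tool could use a small helper along these lines; the function name and clamping behaviour here are illustrative, not part of the server:

// Hypothetical client-side helper; the name and clamping behaviour are
// illustrative only and not part of the Firecrawl MCP server.
interface GenerateLlmsTxtArgs {
  url: string;
  maxUrls?: number;
  showFullText?: boolean;
}

function normalizeGenerateLlmsTxtArgs(
  args: GenerateLlmsTxtArgs
): Required<GenerateLlmsTxtArgs> {
  // Apply the documented defaults (maxUrls: 10, showFullText: false)
  // and keep maxUrls inside the documented 1-100 range.
  const maxUrls = Math.min(100, Math.max(1, Math.round(args.maxUrls ?? 10)));
  return {
    url: args.url,
    maxUrls,
    showFullText: args.showFullText ?? false,
  };
}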

Implementation Reference

  • The main handler for the 'firecrawl_generate_llmstxt' tool. Validates input using isGenerateLLMsTextOptions, calls FirecrawlApp.generateLLMsText with parameters and origin, handles response formatting including optional llms-full.txt, logs performance, and manages errors with retry logic via withRetry (a sketch of what that wrapper might look like follows this list).
    case 'firecrawl_generate_llmstxt': {
      if (!isGenerateLLMsTextOptions(args)) {
        throw new Error('Invalid arguments for firecrawl_generate_llmstxt');
      }
      try {
        const { url, ...params } = args;
        const generateStartTime = Date.now();
        safeLog('info', `Starting LLMs.txt generation for URL: ${url}`);

        // Start the generation process
        const response = await withRetry(
          async () =>
            // @ts-expect-error Extended API options including origin
            client.generateLLMsText(url, { ...params, origin: 'mcp-server' }),
          'LLMs.txt generation'
        );

        if (!response.success) {
          throw new Error(response.error || 'LLMs.txt generation failed');
        }

        // Log performance metrics
        safeLog(
          'info',
          `LLMs.txt generation completed in ${Date.now() - generateStartTime}ms`
        );

        // Format the response
        let resultText = '';
        if ('data' in response) {
          resultText = `LLMs.txt content:\n\n${response.data.llmstxt}`;
          if (args.showFullText && response.data.llmsfulltxt) {
            resultText += `\n\nLLMs-full.txt content:\n\n${response.data.llmsfulltxt}`;
          }
        }

        return {
          content: [{ type: 'text', text: trimResponseText(resultText) }],
          isError: false,
        };
      } catch (error) {
        const errorMessage = error instanceof Error ? error.message : String(error);
        return {
          content: [{ type: 'text', text: trimResponseText(errorMessage) }],
          isError: true,
        };
      }
    }
  • Tool schema definition for 'firecrawl_generate_llmstxt', including name, detailed description, and inputSchema with properties for url (required), maxUrls, and showFullText.
    const GENERATE_LLMSTXT_TOOL: Tool = {
      name: 'firecrawl_generate_llmstxt',
      description: `
    Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.

    **Best for:** Creating machine-readable permission guidelines for AI models.
    **Not recommended for:** General content extraction or research.
    **Arguments:**
    - url (string, required): The base URL of the website to analyze.
    - maxUrls (number, optional): Max number of URLs to include (default: 10).
    - showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.
    **Prompt Example:** "Generate an LLMs.txt file for example.com."
    **Usage Example:**
    \`\`\`json
    {
      "name": "firecrawl_generate_llmstxt",
      "arguments": {
        "url": "https://example.com",
        "maxUrls": 20,
        "showFullText": true
      }
    }
    \`\`\`
    **Returns:** LLMs.txt file contents (and optionally llms-full.txt).
    `,
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The URL to generate LLMs.txt from',
          },
          maxUrls: {
            type: 'number',
            description: 'Maximum number of URLs to process (1-100, default: 10)',
          },
          showFullText: {
            type: 'boolean',
            description: 'Whether to show the full LLMs-full.txt in the response',
          },
        },
        required: ['url'],
      },
    };
  • src/index.ts:962-973 (registration)
    Registration of the firecrawl_generate_llmstxt tool (as GENERATE_LLMSTXT_TOOL) in the list of available tools returned by ListToolsRequestSchema handler.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        SCRAPE_TOOL,
        MAP_TOOL,
        CRAWL_TOOL,
        CHECK_CRAWL_STATUS_TOOL,
        SEARCH_TOOL,
        EXTRACT_TOOL,
        DEEP_RESEARCH_TOOL,
        GENERATE_LLMSTXT_TOOL,
      ],
    }));
  • Type guard function used in the handler to validate arguments for the firecrawl_generate_llmstxt tool.
    function isGenerateLLMsTextOptions(
      args: unknown
    ): args is { url: string } & Partial<GenerateLLMsTextParams> {
      return (
        typeof args === 'object' &&
        args !== null &&
        'url' in args &&
        typeof (args as { url: unknown }).url === 'string'
      );
    }
  • TypeScript interface defining optional parameters (maxUrls, showFullText, __experimental_stream) for the generateLLMsText operation used by the tool.
    interface GenerateLLMsTextParams {
      /**
       * Maximum number of URLs to process (1-100)
       * @default 10
       */
      maxUrls?: number;
      /**
       * Whether to show the full LLMs-full.txt in the response
       * @default false
       */
      showFullText?: boolean;
      /**
       * Experimental flag for streaming
       */
      __experimental_stream?: boolean;
    }
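
The withRetry wrapper used by the handler above is referenced but not shown on this page. As a rough sketch only (the real helper in src/index.ts may use different attempt counts, backoff, and logging, such as the safeLog helper seen above), a retry wrapper of that shape could look like this:

// Illustrative sketch of a retry wrapper like the withRetry used above;
// not the server's actual implementation.
async function withRetry<T>(
  operation: () => Promise<T>,
  context: string,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Simple linear backoff between attempts.
        const delayMs = 1000 * attempt;
        console.error(`${context} failed (attempt ${attempt}/${maxAttempts}), retrying in ${delayMs}ms`);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError instanceof Error ? lastError : new Error(String(lastError));
}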
