
Firecrawl MCP Server

by ampcome-mcps

firecrawl_generate_llmstxt

Generate a standardized llms.txt file for any website to define how AI models should interact with the site, creating machine-readable permission guidelines for large language models.

Instructions

Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.

Best for: Creating machine-readable permission guidelines for AI models.

Not recommended for: General content extraction or research.

Arguments:

  • url (string, required): The base URL of the website to analyze.

  • maxUrls (number, optional): Max number of URLs to include (default: 10).

  • showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.

Prompt Example: "Generate an LLMs.txt file for example.com."

Usage Example:

{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}

Returns: LLMs.txt file contents (and optionally llms-full.txt).
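
The Usage Example above is the JSON body of an MCP tools/call request. A minimal sketch of issuing that call programmatically, assuming a client built with the official @modelcontextprotocol/sdk that is already connected to this server (transport setup is omitted, and the function name is illustrative):

import { Client } from '@modelcontextprotocol/sdk/client/index.js';

// Assumes `client` is already connected to the Firecrawl MCP server over
// some transport (stdio, SSE, ...); transport setup is not shown here.
async function generateLlmsTxt(client: Client) {
  // Mirrors the Usage Example above: a tools/call request with the documented arguments.
  const result = await client.callTool({
    name: 'firecrawl_generate_llmstxt',
    arguments: {
      url: 'https://example.com',
      maxUrls: 20,
      showFullText: true,
    },
  });

  // On success, the result's content array carries the llms.txt text
  // (and, when showFullText is true, the llms-full.txt text as well).
  return result;
}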

Input Schema

Name          Required  Description                                               Default
maxUrls       No        Maximum number of URLs to process (1-100, default: 10)   10
showFullText  No        Whether to show the full LLMs-full.txt in the response   false
url           Yes       The URL to generate LLMs.txt from                         -
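
Note that the 1-100 range and the defaults above are documented in the schema but are not enforced by the server's type guard (shown under Implementation Reference), which only verifies that url is a string. A caller that wants to apply the documented defaults and bounds before invoking the tool could use a small helper along these lines; the function name and clamping behaviour here are illustrative, not part of the server:

// Hypothetical client-side helper; the name and clamping behaviour are
// illustrative only and not part of the Firecrawl MCP server.
interface GenerateLlmsTxtArgs {
  url: string;
  maxUrls?: number;
  showFullText?: boolean;
}

function normalizeGenerateLlmsTxtArgs(
  args: GenerateLlmsTxtArgs
): Required<GenerateLlmsTxtArgs> {
  // Apply the documented defaults (maxUrls: 10, showFullText: false)
  // and keep maxUrls inside the documented 1-100 range.
  const maxUrls = Math.min(100, Math.max(1, Math.round(args.maxUrls ?? 10)));
  return {
    url: args.url,
    maxUrls,
    showFullText: args.showFullText ?? false,
  };
}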

Implementation Reference

  • The main handler for the 'firecrawl_generate_llmstxt' tool. Validates input using isGenerateLLMsTextOptions, calls FirecrawlApp.generateLLMsText with parameters and origin, handles response formatting including optional llms-full.txt, logs performance, and manages errors with retry logic via withRetry (a sketch of what that wrapper might look like follows this list).
    case 'firecrawl_generate_llmstxt': {
      if (!isGenerateLLMsTextOptions(args)) {
        throw new Error('Invalid arguments for firecrawl_generate_llmstxt');
      }
      try {
        const { url, ...params } = args;
        const generateStartTime = Date.now();
        safeLog('info', `Starting LLMs.txt generation for URL: ${url}`);

        // Start the generation process
        const response = await withRetry(
          async () =>
            // @ts-expect-error Extended API options including origin
            client.generateLLMsText(url, { ...params, origin: 'mcp-server' }),
          'LLMs.txt generation'
        );

        if (!response.success) {
          throw new Error(response.error || 'LLMs.txt generation failed');
        }

        // Log performance metrics
        safeLog(
          'info',
          `LLMs.txt generation completed in ${Date.now() - generateStartTime}ms`
        );

        // Format the response
        let resultText = '';
        if ('data' in response) {
          resultText = `LLMs.txt content:\n\n${response.data.llmstxt}`;
          if (args.showFullText && response.data.llmsfulltxt) {
            resultText += `\n\nLLMs-full.txt content:\n\n${response.data.llmsfulltxt}`;
          }
        }

        return {
          content: [{ type: 'text', text: trimResponseText(resultText) }],
          isError: false,
        };
      } catch (error) {
        const errorMessage = error instanceof Error ? error.message : String(error);
        return {
          content: [{ type: 'text', text: trimResponseText(errorMessage) }],
          isError: true,
        };
      }
    }
  • Tool schema definition for 'firecrawl_generate_llmstxt', including name, detailed description, and inputSchema with properties for url (required), maxUrls, and showFullText.
    const GENERATE_LLMSTXT_TOOL: Tool = {
      name: 'firecrawl_generate_llmstxt',
      description: `
    Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.

    **Best for:** Creating machine-readable permission guidelines for AI models.
    **Not recommended for:** General content extraction or research.
    **Arguments:**
    - url (string, required): The base URL of the website to analyze.
    - maxUrls (number, optional): Max number of URLs to include (default: 10).
    - showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.
    **Prompt Example:** "Generate an LLMs.txt file for example.com."
    **Usage Example:**
    \`\`\`json
    {
      "name": "firecrawl_generate_llmstxt",
      "arguments": {
        "url": "https://example.com",
        "maxUrls": 20,
        "showFullText": true
      }
    }
    \`\`\`
    **Returns:** LLMs.txt file contents (and optionally llms-full.txt).
    `,
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The URL to generate LLMs.txt from',
          },
          maxUrls: {
            type: 'number',
            description: 'Maximum number of URLs to process (1-100, default: 10)',
          },
          showFullText: {
            type: 'boolean',
            description: 'Whether to show the full LLMs-full.txt in the response',
          },
        },
        required: ['url'],
      },
    };
  • src/index.ts:962-973 (registration)
    Registration of the firecrawl_generate_llmstxt tool (as GENERATE_LLMSTXT_TOOL) in the list of available tools returned by ListToolsRequestSchema handler.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        SCRAPE_TOOL,
        MAP_TOOL,
        CRAWL_TOOL,
        CHECK_CRAWL_STATUS_TOOL,
        SEARCH_TOOL,
        EXTRACT_TOOL,
        DEEP_RESEARCH_TOOL,
        GENERATE_LLMSTXT_TOOL,
      ],
    }));
  • Type guard function used in the handler to validate arguments for the firecrawl_generate_llmstxt tool.
    function isGenerateLLMsTextOptions(
      args: unknown
    ): args is { url: string } & Partial<GenerateLLMsTextParams> {
      return (
        typeof args === 'object' &&
        args !== null &&
        'url' in args &&
        typeof (args as { url: unknown }).url === 'string'
      );
    }
  • TypeScript interface defining optional parameters (maxUrls, showFullText, __experimental_stream) for the generateLLMsText operation used by the tool.
    interface GenerateLLMsTextParams {
      /**
       * Maximum number of URLs to process (1-100)
       * @default 10
       */
      maxUrls?: number;
      /**
       * Whether to show the full LLMs-full.txt in the response
       * @default false
       */
      showFullText?: boolean;
      /**
       * Experimental flag for streaming
       */
      __experimental_stream?: boolean;
    }
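
The withRetry wrapper used by the handler above is referenced but not shown on this page. As a rough sketch only (the real helper in src/index.ts may use different attempt counts, backoff, and logging, such as the safeLog helper seen above), a retry wrapper of that shape could look like this:

// Illustrative sketch of a retry wrapper like the withRetry used above;
// not the server's actual implementation.
async function withRetry<T>(
  operation: () => Promise<T>,
  context: string,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Simple linear backoff between attempts.
        const delayMs = 1000 * attempt;
        console.error(`${context} failed (attempt ${attempt}/${maxAttempts}), retrying in ${delayMs}ms`);
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  throw lastError instanceof Error ? lastError : new Error(String(lastError));
}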
