
Firecrawl MCP Server

by NYO2008

firecrawl_generate_llmstxt

Generate machine-readable permission guidelines for AI models by creating standardized LLMs.txt files that define how large language models should interact with websites.

Instructions

Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.

Best for: Creating machine-readable permission guidelines for AI models.

Not recommended for: General content extraction or research.

Arguments:

  • url (string, required): The base URL of the website to analyze.

  • maxUrls (number, optional): Max number of URLs to include (default: 10).

  • showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.

Prompt Example: "Generate an LLMs.txt file for example.com."

Usage Example:

{
  "name": "firecrawl_generate_llmstxt",
  "arguments": {
    "url": "https://example.com",
    "maxUrls": 20,
    "showFullText": true
  }
}

Returns: LLMs.txt file contents (and optionally llms-full.txt).

Input Schema

| Name | Required | Description | Default |
| ---- | -------- | ----------- | ------- |
| url | Yes | The URL to generate LLMs.txt from | |
| maxUrls | No | Maximum number of URLs to process (1-100) | 10 |
| showFullText | No | Whether to show the full LLMs-full.txt in the response | false |

Implementation Reference

  • The primary handler logic for the 'firecrawl_generate_llmstxt' tool. It performs input validation, invokes the Firecrawl client's generateLLMsText method with retry logic, processes the response (including optional llms-full.txt), and formats the output or handles errors.
    case 'firecrawl_generate_llmstxt': {
      if (!isGenerateLLMsTextOptions(args)) {
        throw new Error('Invalid arguments for firecrawl_generate_llmstxt');
      }
    
      try {
        const { url, ...params } = args;
        const generateStartTime = Date.now();
    
        safeLog('info', `Starting LLMs.txt generation for URL: ${url}`);
    
        // Start the generation process
        const response = await withRetry(
          async () =>
            // @ts-expect-error Extended API options including origin
            client.generateLLMsText(url, { ...params, origin: 'mcp-server' }),
          'LLMs.txt generation'
        );
    
        if (!response.success) {
          throw new Error(response.error || 'LLMs.txt generation failed');
        }
    
        // Log performance metrics
        safeLog(
          'info',
          `LLMs.txt generation completed in ${Date.now() - generateStartTime}ms`
        );
    
        // Format the response
        let resultText = '';
    
        if ('data' in response) {
          resultText = `LLMs.txt content:\n\n${response.data.llmstxt}`;
    
          if (args.showFullText && response.data.llmsfulltxt) {
            resultText += `\n\nLLMs-full.txt content:\n\n${response.data.llmsfulltxt}`;
          }
        }
    
        return {
          content: [{ type: 'text', text: trimResponseText(resultText) }],
          isError: false,
        };
      } catch (error) {
        const errorMessage =
          error instanceof Error ? error.message : String(error);
        return {
          content: [{ type: 'text', text: trimResponseText(errorMessage) }],
          isError: true,
        };
      }
    }
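The handler wraps the API call in a `withRetry` helper whose implementation is not shown here. A minimal sketch of what such a helper might look like, assuming retry with exponential backoff (the attempt count, delays, and logging in the real server may differ):

```typescript
// Hypothetical sketch of a withRetry helper; the actual implementation
// in the server may use a different backoff strategy or retry count.
async function withRetry<T>(
  operation: () => Promise<T>,
  context: string,
  maxAttempts = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Exponential backoff: baseDelayMs, 2x, 4x, ...
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw new Error(`${context} failed after ${maxAttempts} attempts: ${lastError}`);
}
```

With this shape, a transient failure in `client.generateLLMsText` is retried before the handler's `catch` block ever sees an error.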
  • Tool schema definition including name, detailed description, and inputSchema for parameter validation (url required, maxUrls and showFullText optional).
    const GENERATE_LLMSTXT_TOOL: Tool = {
      name: 'firecrawl_generate_llmstxt',
      description: `
    Generate a standardized llms.txt (and optionally llms-full.txt) file for a given domain. This file defines how large language models should interact with the site.
    
    **Best for:** Creating machine-readable permission guidelines for AI models.
    **Not recommended for:** General content extraction or research.
    **Arguments:**
    - url (string, required): The base URL of the website to analyze.
    - maxUrls (number, optional): Max number of URLs to include (default: 10).
    - showFullText (boolean, optional): Whether to include llms-full.txt contents in the response.
    **Prompt Example:** "Generate an LLMs.txt file for example.com."
    **Usage Example:**
    \`\`\`json
    {
      "name": "firecrawl_generate_llmstxt",
      "arguments": {
        "url": "https://example.com",
        "maxUrls": 20,
        "showFullText": true
      }
    }
    \`\`\`
    **Returns:** LLMs.txt file contents (and optionally llms-full.txt).
    `,
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The URL to generate LLMs.txt from',
          },
          maxUrls: {
            type: 'number',
            description: 'Maximum number of URLs to process (1-100, default: 10)',
          },
          showFullText: {
            type: 'boolean',
            description: 'Whether to show the full LLMs-full.txt in the response',
          },
        },
        required: ['url'],
      },
    };
  • src/index.ts:955-966 (registration)
    Registration of the tool in the list returned by ListToolsRequestSchema handler, making it discoverable by MCP clients.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        SCRAPE_TOOL,
        MAP_TOOL,
        CRAWL_TOOL,
        CHECK_CRAWL_STATUS_TOOL,
        SEARCH_TOOL,
        EXTRACT_TOOL,
        DEEP_RESEARCH_TOOL,
        GENERATE_LLMSTXT_TOOL,
      ],
    }));
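The `case 'firecrawl_generate_llmstxt'` block shown earlier presumably sits inside the server's tool-call request handler, which dispatches on the tool name. A simplified, hypothetical sketch of that dispatch shape (everything except the tool name and result shape is illustrative):

```typescript
// Hypothetical dispatch shape; the real handler in src/index.ts also
// routes the other registered tools and handles MCP request plumbing.
type ToolResult = {
  content: Array<{ type: 'text'; text: string }>;
  isError: boolean;
};

async function handleToolCall(name: string, args: unknown): Promise<ToolResult> {
  switch (name) {
    case 'firecrawl_generate_llmstxt': {
      // ... validation, generation, and formatting logic shown above ...
      return {
        content: [{ type: 'text', text: 'LLMs.txt content: ...' }],
        isError: false,
      };
    }
    default:
      return {
        content: [{ type: 'text', text: `Unknown tool: ${name}` }],
        isError: true,
      };
  }
}
```

Returning an `isError: true` result rather than throwing lets MCP clients surface tool failures in-band.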
  • Type guard helper function used in the handler to validate input arguments for the tool.
    function isGenerateLLMsTextOptions(
      args: unknown
    ): args is { url: string } & Partial<GenerateLLMsTextParams> {
      return (
        typeof args === 'object' &&
        args !== null &&
        'url' in args &&
        typeof (args as { url: unknown }).url === 'string'
      );
    }
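Note that the guard only verifies that `url` is present and is a string; the optional parameters pass through unchecked and are typed via `Partial<GenerateLLMsTextParams>`. A quick self-contained demonstration (the guard and a trimmed-down interface are reproduced from above):

```typescript
// Self-contained demo of the type guard's narrowing behavior.
interface GenerateLLMsTextParams {
  maxUrls?: number;
  showFullText?: boolean;
}

function isGenerateLLMsTextOptions(
  args: unknown
): args is { url: string } & Partial<GenerateLLMsTextParams> {
  return (
    typeof args === 'object' &&
    args !== null &&
    'url' in args &&
    typeof (args as { url: unknown }).url === 'string'
  );
}

const good: unknown = { url: 'https://example.com', maxUrls: 20 };
const bad: unknown = { maxUrls: 20 };

console.log(isGenerateLLMsTextOptions(good)); // true
console.log(isGenerateLLMsTextOptions(bad)); // false
```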
  • TypeScript interface defining the optional parameters for LLMs.txt generation (maxUrls, showFullText, __experimental_stream), used by the type guard and handler.
    interface GenerateLLMsTextParams {
      /**
       * Maximum number of URLs to process (1-100)
       * @default 10
       */
      maxUrls?: number;
      /**
       * Whether to show the full LLMs-full.txt in the response
       * @default false
       */
      showFullText?: boolean;
      /**
       * Experimental flag for streaming
       */
      __experimental_stream?: boolean;
    }
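Both the success and error paths of the handler pass their output through `trimResponseText`, which is not shown in this reference. A plausible sketch, assuming it simply caps very large responses (the limit and truncation marker here are assumptions, not the server's actual values):

```typescript
// Hypothetical sketch: cap tool output so a large llms-full.txt payload
// doesn't exceed client message limits. The real helper may use a
// different limit or truncation behavior.
const MAX_RESPONSE_CHARS = 50_000; // assumed limit

function trimResponseText(text: string): string {
  if (text.length <= MAX_RESPONSE_CHARS) return text;
  return text.slice(0, MAX_RESPONSE_CHARS) + '\n\n[Response truncated]';
}
```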
