
Firecrawl MCP Server

by Krieg2065

firecrawl_batch_scrape

Scrape multiple web pages simultaneously and extract their content in formats such as markdown, HTML, or screenshots; returns a job ID for status tracking.

Instructions

Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.

Input Schema

Name     Required  Description
urls     Yes       List of URLs to scrape
options  No        Scrape options (see Implementation Reference for the full schema)
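For illustration, a valid arguments object under this schema might look like the following sketch; the URLs and option values are placeholders, not taken from the source:

```typescript
// Illustrative arguments for firecrawl_batch_scrape; URLs and option
// values are placeholders chosen for this example.
const args = {
  urls: ['https://example.com/page-1', 'https://example.com/page-2'],
  options: {
    formats: ['markdown'],
    onlyMainContent: true,
    waitFor: 1000, // milliseconds to wait before scraping
  },
};
console.log(`${args.urls.length} URLs queued`);
```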

Implementation Reference

  • Handler for firecrawl_batch_scrape tool: validates input, queues batch scrape operation using client.asyncBatchScrapeUrls via processBatchOperation, returns job ID.
    case 'firecrawl_batch_scrape': {
      if (!isBatchScrapeOptions(args)) {
        throw new Error('Invalid arguments for firecrawl_batch_scrape');
      }
    
      try {
        const operationId = `batch_${++operationCounter}`;
        const operation: QueuedBatchOperation = {
          id: operationId,
          urls: args.urls,
          options: args.options,
          status: 'pending',
          progress: {
            completed: 0,
            total: args.urls.length,
          },
        };
    
        batchOperations.set(operationId, operation);
    
        // Queue the operation
        batchQueue.add(() => processBatchOperation(operation));
    
        safeLog(
          'info',
          `Queued batch operation ${operationId} with ${args.urls.length} URLs`
        );
    
        return {
          content: [
            {
              type: 'text',
              text: trimResponseText(
                `Batch operation queued with ID: ${operationId}. Use firecrawl_check_batch_status to check progress.`
              ),
            },
          ],
          isError: false,
        };
      } catch (error) {
        const errorMessage =
          error instanceof Error
            ? error.message
            : `Batch operation failed: ${JSON.stringify(error)}`;
        return {
          content: [{ type: 'text', text: trimResponseText(errorMessage) }],
          isError: true,
        };
      }
    }
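The `batchQueue` used by the handler is a task queue; the actual server likely relies on a library such as p-queue, but a minimal serial stand-in can sketch the behavior. This is illustrative only, not the server's real queue:

```typescript
// Minimal sketch of a serial task queue in the spirit of the batchQueue
// used above. Illustrative only: the real queue (and its concurrency
// settings) may differ.
class SimpleQueue {
  private chain: Promise<void> = Promise.resolve();

  // Tasks run one at a time, in the order added; a failing task does
  // not block the tasks queued after it.
  add(task: () => Promise<void>): Promise<void> {
    this.chain = this.chain.then(task).catch(() => {});
    return this.chain;
  }
}
```

Because each queued batch operation mutates its own `QueuedBatchOperation` record, serializing the tasks this way keeps status updates ordered without extra locking.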
  • Tool schema definition including input schema for batch URLs and scrape options.
    const BATCH_SCRAPE_TOOL: Tool = {
      name: 'firecrawl_batch_scrape',
      description:
        'Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.',
      inputSchema: {
        type: 'object',
        properties: {
          urls: {
            type: 'array',
            items: { type: 'string' },
            description: 'List of URLs to scrape',
          },
          options: {
            type: 'object',
            properties: {
              formats: {
                type: 'array',
                items: {
                  type: 'string',
                  enum: [
                    'markdown',
                    'html',
                    'rawHtml',
                    'screenshot',
                    'links',
                    'screenshot@fullPage',
                    'extract',
                  ],
                },
              },
              onlyMainContent: {
                type: 'boolean',
              },
              includeTags: {
                type: 'array',
                items: { type: 'string' },
              },
              excludeTags: {
                type: 'array',
                items: { type: 'string' },
              },
              waitFor: {
                type: 'number',
              },
            },
          },
        },
        required: ['urls'],
      },
    };
  • src/index.ts:962-973 (registration)
    Registration of firecrawl_batch_scrape tool (as BATCH_SCRAPE_TOOL) in the listTools response.
        SCRAPE_TOOL,
        MAP_TOOL,
        CRAWL_TOOL,
        BATCH_SCRAPE_TOOL,
        CHECK_BATCH_STATUS_TOOL,
        CHECK_CRAWL_STATUS_TOOL,
        SEARCH_TOOL,
        EXTRACT_TOOL,
        DEEP_RESEARCH_TOOL,
        GENERATE_LLMSTXT_TOOL,
      ],
    }));
  • Helper function that performs the actual batch scrape using client.asyncBatchScrapeUrls, handles retries, tracks credit usage, and updates the operation status.
    async function processBatchOperation(
      operation: QueuedBatchOperation
    ): Promise<void> {
      try {
        operation.status = 'processing';
        let totalCreditsUsed = 0;
    
        // Use library's built-in batch processing
        const response = await withRetry(
          async () =>
            client.asyncBatchScrapeUrls(operation.urls, operation.options),
          `batch ${operation.id} processing`
        );
    
        if (!response.success) {
          throw new Error(response.error || 'Batch operation failed');
        }
    
        // Track credits if using cloud API
        if (!FIRECRAWL_API_URL && hasCredits(response)) {
          totalCreditsUsed += response.creditsUsed;
          await updateCreditUsage(response.creditsUsed);
        }
    
        operation.status = 'completed';
        operation.result = response;
    
        // Log final credit usage for the batch
        if (!FIRECRAWL_API_URL) {
          safeLog(
            'info',
            `Batch ${operation.id} completed. Total credits used: ${totalCreditsUsed}`
          );
        }
      } catch (error) {
        operation.status = 'failed';
        operation.error = error instanceof Error ? error.message : String(error);
    
        safeLog('error', `Batch ${operation.id} failed: ${operation.error}`);
      }
    }
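The `withRetry` helper wrapped around `client.asyncBatchScrapeUrls` is referenced but not shown here. A hedged sketch of what such a helper could look like, assuming a simple exponential-backoff policy (the actual attempt count and delays may differ):

```typescript
// Hedged sketch of a withRetry helper; the server's real implementation
// is not shown in this excerpt and may differ.
async function withRetry<T>(
  fn: () => Promise<T>,
  label: string,
  attempts = 3,
  baseDelayMs = 1000
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < attempts) {
        // Exponential backoff: baseDelayMs, 2x, 4x, ...
        const delay = baseDelayMs * 2 ** (attempt - 1);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw new Error(`${label} failed after ${attempts} attempts: ${lastError}`);
}
```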
  • Type guard function for validating batch scrape input arguments.
    function isBatchScrapeOptions(args: unknown): args is BatchScrapeOptions {
      return (
        typeof args === 'object' &&
        args !== null &&
        'urls' in args &&
        Array.isArray((args as { urls: unknown }).urls) &&
        (args as { urls: unknown[] }).urls.every((url) => typeof url === 'string')
      );
    }
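Note that the guard validates only `urls`; the optional `options` object passes through unchecked. For illustration, a self-contained copy of the guard behaves as follows (the `BatchScrapeOptions` shape here is an assumption inferred from the guard, not the server's actual type):

```typescript
// Self-contained copy of the guard for illustration; the real
// BatchScrapeOptions type may carry more structure.
interface BatchScrapeOptions {
  urls: string[];
  options?: Record<string, unknown>;
}

function isBatchScrapeOptions(args: unknown): args is BatchScrapeOptions {
  return (
    typeof args === 'object' &&
    args !== null &&
    'urls' in args &&
    Array.isArray((args as { urls: unknown }).urls) &&
    (args as { urls: unknown[] }).urls.every((url) => typeof url === 'string')
  );
}

console.log(isBatchScrapeOptions({ urls: ['https://example.com'] })); // true
console.log(isBatchScrapeOptions({ urls: [42] }));                    // false
console.log(isBatchScrapeOptions('not an object'));                   // false
```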
