firecrawl_batch_scrape

Scrape multiple URLs simultaneously to extract content in various formats like markdown, HTML, or screenshots, returning a job ID for status tracking.

Instructions

Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.

Input Schema

Name      Required   Description              Default
urls      Yes        List of URLs to scrape   -
options   No         -                        -
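
For orientation, the following is a minimal sketch of how an MCP client might invoke this tool with a valid arguments payload. It assumes a Client from @modelcontextprotocol/sdk that is already connected to the Firecrawl MCP server; the URLs and option values are placeholders, and only urls is required.

    import { Client } from '@modelcontextprotocol/sdk/client/index.js';

    // Queue a batch scrape and print the server's reply.
    async function queueBatchScrape(client: Client): Promise<void> {
      const result = await client.callTool({
        name: 'firecrawl_batch_scrape',
        arguments: {
          urls: ['https://example.com', 'https://example.org/docs'], // placeholders
          options: {
            formats: ['markdown'],
            onlyMainContent: true,
          },
        },
      });
      // The tool replies with a text message such as:
      // "Batch operation queued with ID: batch_1. Use firecrawl_check_batch_status to check progress."
      console.log(result.content);
    }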

Implementation Reference

  • Defines the tool metadata, description, and input schema for validating batch scrape requests.
    const BATCH_SCRAPE_TOOL: Tool = {
      name: 'firecrawl_batch_scrape',
      description:
        'Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.',
      inputSchema: {
        type: 'object',
        properties: {
          urls: {
            type: 'array',
            items: { type: 'string' },
            description: 'List of URLs to scrape',
          },
          options: {
            type: 'object',
            properties: {
              formats: {
                type: 'array',
                items: {
                  type: 'string',
                  enum: [
                    'markdown',
                    'html',
                    'rawHtml',
                    'screenshot',
                    'links',
                    'screenshot@fullPage',
                    'extract',
                  ],
                },
              },
              onlyMainContent: { type: 'boolean' },
              includeTags: { type: 'array', items: { type: 'string' } },
              excludeTags: { type: 'array', items: { type: 'string' } },
              waitFor: { type: 'number' },
            },
          },
        },
        required: ['urls'],
      },
    };
  • src/index.ts:862-874 (registration)
    Registers the firecrawl_batch_scrape tool (as BATCH_SCRAPE_TOOL) in the MCP server's listTools handler.
    server.setRequestHandler(ListToolsRequestSchema, async () => ({
      tools: [
        SCRAPE_TOOL,
        MAP_TOOL,
        CRAWL_TOOL,
        BATCH_SCRAPE_TOOL,
        CHECK_BATCH_STATUS_TOOL,
        CHECK_CRAWL_STATUS_TOOL,
        SEARCH_TOOL,
        EXTRACT_TOOL,
        DEEP_RESEARCH_TOOL,
      ],
    }));
  • Primary tool handler in the CallToolRequestSchema switch statement. Validates input with isBatchScrapeOptions, creates a QueuedBatchOperation under a new operation ID, adds it to the PQueue for asynchronous processing via processBatchOperation, and returns the job ID immediately (a sketch of the queue state these lines assume appears after this list).
    case 'firecrawl_batch_scrape': {
      if (!isBatchScrapeOptions(args)) {
        throw new Error('Invalid arguments for firecrawl_batch_scrape');
      }
      try {
        const operationId = `batch_${++operationCounter}`;
        const operation: QueuedBatchOperation = {
          id: operationId,
          urls: args.urls,
          options: args.options,
          status: 'pending',
          progress: {
            completed: 0,
            total: args.urls.length,
          },
        };
        batchOperations.set(operationId, operation);

        // Queue the operation
        batchQueue.add(() => processBatchOperation(operation));

        server.sendLoggingMessage({
          level: 'info',
          data: `Queued batch operation ${operationId} with ${args.urls.length} URLs`,
        });

        return {
          content: [
            {
              type: 'text',
              text: `Batch operation queued with ID: ${operationId}. Use firecrawl_check_batch_status to check progress.`,
            },
          ],
          isError: false,
        };
      } catch (error) {
        const errorMessage =
          error instanceof Error
            ? error.message
            : `Batch operation failed: ${JSON.stringify(error)}`;
        return {
          content: [{ type: 'text', text: errorMessage }],
          isError: true,
        };
      }
    }
  • Asynchronous processor for batch operations, invoked by the PQueue. Calls the Firecrawl client's asyncBatchScrapeUrls with retry logic, tracks credit usage on the cloud API, and updates the operation's status, result, and error in the shared batchOperations map (hedged sketches of the withRetry and hasCredits helpers appear after this list).
    async function processBatchOperation(
      operation: QueuedBatchOperation
    ): Promise<void> {
      try {
        operation.status = 'processing';
        let totalCreditsUsed = 0;

        // Use library's built-in batch processing
        const response = await withRetry(
          async () =>
            client.asyncBatchScrapeUrls(operation.urls, operation.options),
          `batch ${operation.id} processing`
        );

        if (!response.success) {
          throw new Error(response.error || 'Batch operation failed');
        }

        // Track credits if using cloud API
        if (!FIRECRAWL_API_URL && hasCredits(response)) {
          totalCreditsUsed += response.creditsUsed;
          await updateCreditUsage(response.creditsUsed);
        }

        operation.status = 'completed';
        operation.result = response;

        // Log final credit usage for the batch
        if (!FIRECRAWL_API_URL) {
          server.sendLoggingMessage({
            level: 'info',
            data: `Batch ${operation.id} completed. Total credits used: ${totalCreditsUsed}`,
          });
        }
      } catch (error) {
        operation.status = 'failed';
        operation.error =
          error instanceof Error ? error.message : String(error);

        server.sendLoggingMessage({
          level: 'error',
          data: `Batch ${operation.id} failed: ${operation.error}`,
        });
      }
    }
  • Type guard function used in the handler to validate that arguments contain a valid 'urls' array of strings.
    function isBatchScrapeOptions(args: unknown): args is BatchScrapeOptions {
      return (
        typeof args === 'object' &&
        args !== null &&
        'urls' in args &&
        Array.isArray((args as { urls: unknown }).urls) &&
        (args as { urls: unknown[] }).urls.every((url) => typeof url === 'string')
      );
    }
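
The handler and processor above rely on a few module-level types and values that are not shown in the excerpts. The following is a minimal sketch of what they appear to be, reconstructed from how they are used; the actual definitions in src/index.ts (field types, the PQueue concurrency, the shape of the options object) may differ.

    import PQueue from 'p-queue';

    // Arguments accepted by the tool; `options` is forwarded to Firecrawl as scrape options.
    interface BatchScrapeOptions {
      urls: string[];
      options?: Record<string, unknown>;
    }

    // One queued batch job, tracked for the lifetime of the server process.
    interface QueuedBatchOperation {
      id: string;
      urls: string[];
      options?: Record<string, unknown>;
      status: 'pending' | 'processing' | 'completed' | 'failed';
      progress: { completed: number; total: number };
      result?: unknown; // set by processBatchOperation on success
      error?: string;   // set by processBatchOperation on failure
    }

    let operationCounter = 0;                                        // used to build `batch_${n}` job IDs
    const batchOperations = new Map<string, QueuedBatchOperation>(); // shared job registry
    const batchQueue = new PQueue({ concurrency: 1 });               // concurrency value is an assumption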
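
processBatchOperation also calls two helpers, withRetry and hasCredits, whose definitions are not excerpted. The sketches below are reconstructed from their call sites and are assumptions: the real retry count, backoff, logging, and response shape may differ.

    // Retry an async action a few times before giving up; `context` labels the operation in errors.
    async function withRetry<T>(
      action: () => Promise<T>,
      context: string,
      attempts = 3
    ): Promise<T> {
      let lastError: unknown;
      for (let attempt = 1; attempt <= attempts; attempt++) {
        try {
          return await action();
        } catch (error) {
          lastError = error;
          if (attempt < attempts) {
            // Simple linear backoff between attempts (illustrative only).
            await new Promise((resolve) => setTimeout(resolve, 1000 * attempt));
          }
        }
      }
      throw new Error(`${context} failed after ${attempts} attempts: ${String(lastError)}`);
    }

    // Narrows a batch response to one that reports cloud-API credit usage.
    function hasCredits(response: unknown): response is { creditsUsed: number } {
      return (
        typeof response === 'object' &&
        response !== null &&
        'creditsUsed' in response &&
        typeof (response as { creditsUsed: unknown }).creditsUsed === 'number'
      );
    }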
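
The job ID returned by the handler is meant to be passed to firecrawl_check_batch_status, whose implementation is not shown on this page. Purely as an illustration of the lifecycle, a status lookup could be as simple as reading the operation back from the batchOperations map sketched above; the response wording here is an assumption, not the server's actual output.

    // Illustrative only: summarize a queued batch operation by ID.
    function formatBatchStatus(id: string): { text: string; isError: boolean } {
      const operation = batchOperations.get(id);
      if (!operation) {
        return { text: `No batch operation found with ID: ${id}`, isError: true };
      }
      const summary =
        `Batch ${operation.id}: ${operation.status} ` +
        `(${operation.progress.completed}/${operation.progress.total} URLs)` +
        (operation.error ? `, error: ${operation.error}` : '');
      return { text: summary, isError: false };
    }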
