firecrawl_batch_scrape
Scrape multiple URLs simultaneously to extract content in various formats, returning a job ID for status tracking.
Instructions
Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| urls | Yes | List of URLs to scrape | |
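| options | No | Optional scrape settings: `formats`, `onlyMainContent`, `includeTags`, `excludeTags`, `waitFor` | |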
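
To make the input shape concrete, here is a minimal sketch that builds a Model Context Protocol `tools/call` request for this tool. The `BatchScrapeArgs` type and the `buildBatchScrapeRequest` helper are hypothetical names introduced for illustration, and the URLs are placeholders. Only the field names and the allowed `formats` values are taken from the tool's input schema.

```typescript
// Hypothetical types mirroring the tool's input schema (names are illustrative).
type ScrapeFormat =
  | 'markdown'
  | 'html'
  | 'rawHtml'
  | 'screenshot'
  | 'links'
  | 'screenshot@fullPage'
  | 'extract';

interface BatchScrapeArgs {
  urls: string[]; // required
  options?: {
    formats?: ScrapeFormat[];
    onlyMainContent?: boolean;
    includeTags?: string[];
    excludeTags?: string[];
    waitFor?: number;
  };
}

// Builds a JSON-RPC 2.0 "tools/call" request, the message MCP clients send
// to invoke a tool. This helper is an assumption, not part of the server.
function buildBatchScrapeRequest(id: number, args: BatchScrapeArgs) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'firecrawl_batch_scrape', arguments: args },
  };
}

const request = buildBatchScrapeRequest(1, {
  urls: ['https://example.com', 'https://example.org'],
  options: { formats: ['markdown'], onlyMainContent: true },
});

console.log(JSON.stringify(request, null, 2));
```

Only `urls` is required, so a request with no `options` object is also valid against the schema.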
Implementation Reference
- src/index.ts:328-377 (schema): Input schema and description for the firecrawl_batch_scrape tool.

  ```typescript
  const BATCH_SCRAPE_TOOL: Tool = {
    name: 'firecrawl_batch_scrape',
    description:
      'Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.',
    inputSchema: {
      type: 'object',
      properties: {
        urls: {
          type: 'array',
          items: { type: 'string' },
          description: 'List of URLs to scrape',
        },
        options: {
          type: 'object',
          properties: {
            formats: {
              type: 'array',
              items: {
                type: 'string',
                enum: [
                  'markdown',
                  'html',
                  'rawHtml',
                  'screenshot',
                  'links',
                  'screenshot@fullPage',
                  'extract',
                ],
              },
            },
            onlyMainContent: {
              type: 'boolean',
            },
            includeTags: {
              type: 'array',
              items: { type: 'string' },
            },
            excludeTags: {
              type: 'array',
              items: { type: 'string' },
            },
            waitFor: {
              type: 'number',
            },
          },
        },
      },
      required: ['urls'],
    },
  };
  ```
- src/index.ts:960-973 (registration): Registration of the firecrawl_batch_scrape tool (as BATCH_SCRAPE_TOOL) in the list of available tools.

  ```typescript
  server.setRequestHandler(ListToolsRequestSchema, async () => ({
    tools: [
      SCRAPE_TOOL,
      MAP_TOOL,
      CRAWL_TOOL,
      BATCH_SCRAPE_TOOL,
      CHECK_BATCH_STATUS_TOOL,
      CHECK_CRAWL_STATUS_TOOL,
      SEARCH_TOOL,
      EXTRACT_TOOL,
      DEEP_RESEARCH_TOOL,
      GENERATE_LLMSTXT_TOOL,
    ],
  }));
  ```
- src/index.ts:1096-1145 (handler): Main handler for firecrawl_batch_scrape. It validates the input, stores the operation in the batchOperations map, queues it using PQueue, and returns the job ID immediately.

  ```typescript
  case 'firecrawl_batch_scrape': {
    if (!isBatchScrapeOptions(args)) {
      throw new Error('Invalid arguments for firecrawl_batch_scrape');
    }

    try {
      const operationId = `batch_${++operationCounter}`;
      const operation: QueuedBatchOperation = {
        id: operationId,
        urls: args.urls,
        options: args.options,
        status: 'pending',
        progress: {
          completed: 0,
          total: args.urls.length,
        },
      };

      batchOperations.set(operationId, operation);

      // Queue the operation
      batchQueue.add(() => processBatchOperation(operation));

      safeLog(
        'info',
        `Queued batch operation ${operationId} with ${args.urls.length} URLs`
      );

      return {
        content: [
          {
            type: 'text',
            text: trimResponseText(
              `Batch operation queued with ID: ${operationId}. Use firecrawl_check_batch_status to check progress.`
            ),
          },
        ],
        isError: false,
      };
    } catch (error) {
      const errorMessage =
        error instanceof Error
          ? error.message
          : `Batch operation failed: ${JSON.stringify(error)}`;
      return {
        content: [{ type: 'text', text: trimResponseText(errorMessage) }],
        isError: true,
      };
    }
  }
  ```
- src/index.ts:917-957 (helper): Helper function that runs the batch scrape by calling the Firecrawl client's asyncBatchScrapeUrls method with retries, tracks credit usage, and updates the operation status.

  ```typescript
  async function processBatchOperation(
    operation: QueuedBatchOperation
  ): Promise<void> {
    try {
      operation.status = 'processing';
      let totalCreditsUsed = 0;

      // Use library's built-in batch processing
      const response = await withRetry(
        async () =>
          client.asyncBatchScrapeUrls(operation.urls, operation.options),
        `batch ${operation.id} processing`
      );

      if (!response.success) {
        throw new Error(response.error || 'Batch operation failed');
      }

      // Track credits if using cloud API
      if (!FIRECRAWL_API_URL && hasCredits(response)) {
        totalCreditsUsed += response.creditsUsed;
        await updateCreditUsage(response.creditsUsed);
      }

      operation.status = 'completed';
      operation.result = response;

      // Log final credit usage for the batch
      if (!FIRECRAWL_API_URL) {
        safeLog(
          'info',
          `Batch ${operation.id} completed. Total credits used: ${totalCreditsUsed}`
        );
      }
    } catch (error) {
      operation.status = 'failed';
      operation.error = error instanceof Error ? error.message : String(error);

      safeLog('error', `Batch ${operation.id} failed: ${operation.error}`);
    }
  }
  ```
- src/index.ts:710-718 (helper): Type guard function to validate input arguments for the batch scrape tool.

  ```typescript
  function isBatchScrapeOptions(args: unknown): args is BatchScrapeOptions {
    return (
      typeof args === 'object' &&
      args !== null &&
      'urls' in args &&
      Array.isArray((args as { urls: unknown }).urls) &&
      (args as { urls: unknown[] }).urls.every((url) => typeof url === 'string')
    );
  }
  ```
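
The handler and helper listed above follow a queue-then-poll pattern: register the operation, return its ID right away, and let a background task update the status. The sketch below reproduces that flow in isolation. It is a simplified illustration rather than the server's code: `fakeScrape`, `startBatch`, and `processOperation` are hypothetical names, `fakeScrape` stands in for the Firecrawl client call, and a deferred promise replaces the PQueue-based queue, retries, and credit tracking.

```typescript
type Status = 'pending' | 'processing' | 'completed' | 'failed';

interface Operation {
  id: string;
  urls: string[];
  status: Status;
  result?: string[];
  error?: string;
}

// In-memory registry of operations, keyed by operation ID.
const operations = new Map<string, Operation>();
let counter = 0;

// Hypothetical stand-in for the real scrape call; rejects non-http(s) URLs.
async function fakeScrape(urls: string[]): Promise<string[]> {
  for (const url of urls) {
    if (!/^https?:\/\//.test(url)) throw new Error(`Invalid URL: ${url}`);
  }
  return urls.map((url) => `content of ${url}`);
}

// Runs in the background and records the outcome on the operation object.
async function processOperation(op: Operation): Promise<void> {
  try {
    op.status = 'processing';
    op.result = await fakeScrape(op.urls);
    op.status = 'completed';
  } catch (error) {
    op.status = 'failed';
    op.error = error instanceof Error ? error.message : String(error);
  }
}

// Registers the operation and returns its ID without awaiting the work.
// The work is deferred to a microtask, so the status is still 'pending'
// when this function returns.
function startBatch(urls: string[]): { id: string; done: Promise<void> } {
  const op: Operation = { id: `batch_${++counter}`, urls, status: 'pending' };
  operations.set(op.id, op);
  const done = Promise.resolve().then(() => processOperation(op));
  return { id: op.id, done };
}

async function main(): Promise<void> {
  const ok = startBatch(['https://example.com']);
  const bad = startBatch(['not-a-url']);
  console.log(ok.id, operations.get(ok.id)?.status); // status right after queuing
  await Promise.all([ok.done, bad.done]);
  console.log(ok.id, operations.get(ok.id)?.status);
  console.log(bad.id, operations.get(bad.id)?.status, operations.get(bad.id)?.error);
}

void main();
```

Because failures are caught and recorded on the operation, a caller polling by ID sees a `failed` status with an error message instead of an unhandled rejection, which mirrors how `processBatchOperation` reports errors.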