firecrawl_batch_scrape
Scrape multiple web pages simultaneously to extract content in various formats like markdown, HTML, or screenshots, returning a job ID for status tracking.
Instructions
Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| urls | Yes | List of URLs to scrape | |
| options | No |
Implementation Reference
- src/index.ts:1096-1145 (handler)Handler for firecrawl_batch_scrape tool: validates input, queues batch scrape operation using client.asyncBatchScrapeUrls via processBatchOperation, returns job ID.case 'firecrawl_batch_scrape': { if (!isBatchScrapeOptions(args)) { throw new Error('Invalid arguments for firecrawl_batch_scrape'); } try { const operationId = `batch_${++operationCounter}`; const operation: QueuedBatchOperation = { id: operationId, urls: args.urls, options: args.options, status: 'pending', progress: { completed: 0, total: args.urls.length, }, }; batchOperations.set(operationId, operation); // Queue the operation batchQueue.add(() => processBatchOperation(operation)); safeLog( 'info', `Queued batch operation ${operationId} with ${args.urls.length} URLs` ); return { content: [ { type: 'text', text: trimResponseText( `Batch operation queued with ID: ${operationId}. Use firecrawl_check_batch_status to check progress.` ), }, ], isError: false, }; } catch (error) { const errorMessage = error instanceof Error ? error.message : `Batch operation failed: ${JSON.stringify(error)}`; return { content: [{ type: 'text', text: trimResponseText(errorMessage) }], isError: true, }; } }
- src/index.ts:328-377 (schema)Tool schema definition including input schema for batch URLs and scrape options.const BATCH_SCRAPE_TOOL: Tool = { name: 'firecrawl_batch_scrape', description: 'Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.', inputSchema: { type: 'object', properties: { urls: { type: 'array', items: { type: 'string' }, description: 'List of URLs to scrape', }, options: { type: 'object', properties: { formats: { type: 'array', items: { type: 'string', enum: [ 'markdown', 'html', 'rawHtml', 'screenshot', 'links', 'screenshot@fullPage', 'extract', ], }, }, onlyMainContent: { type: 'boolean', }, includeTags: { type: 'array', items: { type: 'string' }, }, excludeTags: { type: 'array', items: { type: 'string' }, }, waitFor: { type: 'number', }, }, }, }, required: ['urls'], }, };
- src/index.ts:962-973 (registration)Registration of firecrawl_batch_scrape tool (as BATCH_SCRAPE_TOOL) in the listTools response.SCRAPE_TOOL, MAP_TOOL, CRAWL_TOOL, BATCH_SCRAPE_TOOL, CHECK_BATCH_STATUS_TOOL, CHECK_CRAWL_STATUS_TOOL, SEARCH_TOOL, EXTRACT_TOOL, DEEP_RESEARCH_TOOL, GENERATE_LLMSTXT_TOOL, ], }));
- src/index.ts:917-957 (helper)Helper function that performs the actual batch scrape using client.asyncBatchScrapeUrls, handles retries, credits, and updates operation status.async function processBatchOperation( operation: QueuedBatchOperation ): Promise<void> { try { operation.status = 'processing'; let totalCreditsUsed = 0; // Use library's built-in batch processing const response = await withRetry( async () => client.asyncBatchScrapeUrls(operation.urls, operation.options), `batch ${operation.id} processing` ); if (!response.success) { throw new Error(response.error || 'Batch operation failed'); } // Track credits if using cloud API if (!FIRECRAWL_API_URL && hasCredits(response)) { totalCreditsUsed += response.creditsUsed; await updateCreditUsage(response.creditsUsed); } operation.status = 'completed'; operation.result = response; // Log final credit usage for the batch if (!FIRECRAWL_API_URL) { safeLog( 'info', `Batch ${operation.id} completed. Total credits used: ${totalCreditsUsed}` ); } } catch (error) { operation.status = 'failed'; operation.error = error instanceof Error ? error.message : String(error); safeLog('error', `Batch ${operation.id} failed: ${operation.error}`); } }
- src/index.ts:710-718 (helper)Type guard function for validating batch scrape input arguments.function isBatchScrapeOptions(args: unknown): args is BatchScrapeOptions { return ( typeof args === 'object' && args !== null && 'urls' in args && Array.isArray((args as { urls: unknown }).urls) && (args as { urls: unknown[] }).urls.every((url) => typeof url === 'string') ); }