firecrawl_batch_scrape

Scrape multiple URLs simultaneously to extract web content in various formats, returning a job ID for status tracking.

Instructions

Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| urls | Yes | List of URLs to scrape | |
| options | No | Scrape options: `formats`, `onlyMainContent`, `includeTags`, `excludeTags`, `waitFor` | |
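Based on the schema above, a call to this tool might pass arguments shaped like the following (the URLs and option values here are illustrative, not from the source):

```typescript
// Illustrative arguments for firecrawl_batch_scrape, shaped to match the
// input schema: `urls` is required, `options` is optional.
const batchArgs = {
  urls: ['https://example.com', 'https://example.org/docs'],
  options: {
    formats: ['markdown', 'links'], // must come from the schema's enum
    onlyMainContent: true,          // skip navigation and boilerplate
    waitFor: 1000,                  // milliseconds to wait before scraping
  },
};
```

Note that the tool responds with a job ID rather than scraped content; progress is checked separately with `firecrawl_check_batch_status`.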

Implementation Reference

  • Defines the Tool object for 'firecrawl_batch_scrape' including name, description, and detailed inputSchema for batch scraping URLs.

```typescript
const BATCH_SCRAPE_TOOL: Tool = {
  name: 'firecrawl_batch_scrape',
  description:
    'Scrape multiple URLs in batch mode. Returns a job ID that can be used to check status.',
  inputSchema: {
    type: 'object',
    properties: {
      urls: {
        type: 'array',
        items: { type: 'string' },
        description: 'List of URLs to scrape',
      },
      options: {
        type: 'object',
        properties: {
          formats: {
            type: 'array',
            items: {
              type: 'string',
              enum: [
                'markdown',
                'html',
                'rawHtml',
                'screenshot',
                'links',
                'screenshot@fullPage',
                'extract',
              ],
            },
          },
          onlyMainContent: { type: 'boolean' },
          includeTags: { type: 'array', items: { type: 'string' } },
          excludeTags: { type: 'array', items: { type: 'string' } },
          waitFor: { type: 'number' },
        },
      },
    },
    required: ['urls'],
  },
};
```
  • src/index.ts:862-874 (registration)
    Registers the 'firecrawl_batch_scrape' tool (as BATCH_SCRAPE_TOOL) in the list of available tools returned by ListToolsRequestSchema.
```typescript
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    SCRAPE_TOOL,
    MAP_TOOL,
    CRAWL_TOOL,
    BATCH_SCRAPE_TOOL,
    CHECK_BATCH_STATUS_TOOL,
    CHECK_CRAWL_STATUS_TOOL,
    SEARCH_TOOL,
    EXTRACT_TOOL,
    DEEP_RESEARCH_TOOL,
  ],
}));
```
  • Tool handler in the CallToolRequestSchema switch statement. Validates input with isBatchScrapeOptions, creates a QueuedBatchOperation, stores it in the batchOperations map, queues processBatchOperation on the PQueue (which in turn calls client.asyncBatchScrapeUrls), and returns the job ID immediately for async processing.

```typescript
case 'firecrawl_batch_scrape': {
  if (!isBatchScrapeOptions(args)) {
    throw new Error('Invalid arguments for firecrawl_batch_scrape');
  }
  try {
    const operationId = `batch_${++operationCounter}`;
    const operation: QueuedBatchOperation = {
      id: operationId,
      urls: args.urls,
      options: args.options,
      status: 'pending',
      progress: {
        completed: 0,
        total: args.urls.length,
      },
    };
    batchOperations.set(operationId, operation);

    // Queue the operation
    batchQueue.add(() => processBatchOperation(operation));

    server.sendLoggingMessage({
      level: 'info',
      data: `Queued batch operation ${operationId} with ${args.urls.length} URLs`,
    });

    return {
      content: [
        {
          type: 'text',
          text: `Batch operation queued with ID: ${operationId}. Use firecrawl_check_batch_status to check progress.`,
        },
      ],
      isError: false,
    };
  } catch (error) {
    const errorMessage =
      error instanceof Error
        ? error.message
        : `Batch operation failed: ${JSON.stringify(error)}`;
    return {
      content: [{ type: 'text', text: errorMessage }],
      isError: true,
    };
  }
}
```
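The handler's queue-and-return pattern can be sketched in isolation. The names below (`Job`, `enqueue`, `jobs`) are hypothetical; the real server uses PQueue and a shared batchOperations map, but the idea is the same: register the operation, start the work in the background, and hand back an ID immediately.

```typescript
// Minimal sketch of the async-job pattern (hypothetical names, no
// Firecrawl or PQueue dependency).
type Job = { id: string; status: 'pending' | 'processing' | 'completed' | 'failed' };

const jobs = new Map<string, Job>();
let counter = 0;

function enqueue(work: (job: Job) => Promise<void>): string {
  const job: Job = { id: `batch_${++counter}`, status: 'pending' };
  jobs.set(job.id, job);
  // Fire and forget: the caller polls `jobs` later instead of awaiting.
  void (async () => {
    job.status = 'processing';
    try {
      await work(job);
      job.status = 'completed';
    } catch {
      job.status = 'failed';
    }
  })();
  return job.id;
}

const id = enqueue(async () => { /* scrape URLs here */ });
console.log(id); // the ID is available before the work finishes
```

This is why the tool's reply is just a job ID: the scrape continues after the response is sent, and clients poll for completion.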
  • Async helper function that executes the actual batch scrape by calling Firecrawl client.asyncBatchScrapeUrls, handles retries, credit tracking, updates operation status/result/error, and logs progress.
```typescript
async function processBatchOperation(
  operation: QueuedBatchOperation
): Promise<void> {
  try {
    operation.status = 'processing';
    let totalCreditsUsed = 0;

    // Use library's built-in batch processing
    const response = await withRetry(
      async () =>
        client.asyncBatchScrapeUrls(operation.urls, operation.options),
      `batch ${operation.id} processing`
    );

    if (!response.success) {
      throw new Error(response.error || 'Batch operation failed');
    }

    // Track credits if using cloud API
    if (!FIRECRAWL_API_URL && hasCredits(response)) {
      totalCreditsUsed += response.creditsUsed;
      await updateCreditUsage(response.creditsUsed);
    }

    operation.status = 'completed';
    operation.result = response;

    // Log final credit usage for the batch
    if (!FIRECRAWL_API_URL) {
      server.sendLoggingMessage({
        level: 'info',
        data: `Batch ${operation.id} completed. Total credits used: ${totalCreditsUsed}`,
      });
    }
  } catch (error) {
    operation.status = 'failed';
    operation.error =
      error instanceof Error ? error.message : String(error);
    server.sendLoggingMessage({
      level: 'error',
      data: `Batch ${operation.id} failed: ${operation.error}`,
    });
  }
}
```
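The withRetry helper is referenced above but not shown on this page. A minimal sketch of such a helper, assuming simple exponential backoff (the server's actual implementation may differ):

```typescript
// Hypothetical sketch of a withRetry helper: retry a failing async
// action a few times, backing off exponentially between attempts.
async function withRetry<T>(
  action: () => Promise<T>,
  label: string,
  attempts = 3,
  baseDelayMs = 100
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await action();
    } catch (error) {
      lastError = error;
      if (i < attempts - 1) {
        // Back off: 100 ms, 200 ms, 400 ms, ...
        await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
      }
    }
  }
  throw new Error(`${label} failed after ${attempts} attempts: ${String(lastError)}`);
}

// Usage sketch: the action fails twice, then succeeds on the third attempt.
(async () => {
  let calls = 0;
  const value = await withRetry(async () => {
    if (++calls < 3) throw new Error('transient failure');
    return 'ok';
  }, 'demo');
  console.log(value, calls); // 'ok' 3
})();
```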
  • Type guard function to validate input arguments for batch scrape tool.
```typescript
function isBatchScrapeOptions(args: unknown): args is BatchScrapeOptions {
  return (
    typeof args === 'object' &&
    args !== null &&
    'urls' in args &&
    Array.isArray((args as { urls: unknown }).urls) &&
    (args as { urls: unknown[] }).urls.every((url) => typeof url === 'string')
  );
}
```
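Note that the guard only validates the urls field. A stricter variant could also reject a malformed options field; the sketch below is an assumption, not the server's code, and redeclares the types so it runs standalone:

```typescript
// Hypothetical stricter variant of the type guard above: additionally
// verifies that `options`, when present, is a plain object.
interface BatchScrapeOptions {
  urls: string[];
  options?: Record<string, unknown>;
}

function isBatchScrapeOptionsStrict(args: unknown): args is BatchScrapeOptions {
  if (typeof args !== 'object' || args === null) return false;
  const a = args as Record<string, unknown>;
  if (!Array.isArray(a.urls) || !a.urls.every((u) => typeof u === 'string')) {
    return false;
  }
  // `options` is optional, but if supplied it must be an object (not an array).
  return (
    a.options === undefined ||
    (typeof a.options === 'object' && a.options !== null && !Array.isArray(a.options))
  );
}
```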
