firecrawl_process

Extract web content through scraping, crawling, mapping, or structured data extraction to process URLs and discover information.

Instructions

Extract web content with Firecrawl. Modes: scrape (single page), crawl (deep crawl), map (URL discovery), extract (structured data), actions (interactive).

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | URL(s) | |
| mode | Yes | Processing mode | |
| extract_depth | No | Extraction depth | |

Implementation Reference

  • MCP server tool registration and handler for 'firecrawl_process'. Defines the tool schema, description, and execution logic that delegates to the UnifiedFirecrawlProcessingProvider's process_content method.
    server.tool(
    	{
    		name: 'firecrawl_process',
    		description: this.firecrawl_process_provider.description,
    		schema: v.object({
    			url: v.pipe(
    				v.union([v.string(), v.array(v.string())]),
    				v.description('URL(s)'),
    			),
    			mode: v.pipe(
    				v.union([
    					v.literal('scrape'),
    					v.literal('crawl'),
    					v.literal('map'),
    					v.literal('extract'),
    					v.literal('actions'),
    				]),
    				v.description('Processing mode'),
    			),
    			extract_depth: v.optional(
    				v.pipe(
    					v.union([v.literal('basic'), v.literal('advanced')]),
    					v.description('Extraction depth'),
    				),
    			),
    		}),
    	},
    	async ({ url, mode, extract_depth }) => {
    		try {
    			const result =
    				await this.firecrawl_process_provider!.process_content(
    					url,
    					extract_depth,
    					mode as any,
    				);
    			const safe_result = handle_large_result(
    				result,
    				'firecrawl_process',
    			);
    			return {
    				content: [
    					{
    						type: 'text' as const,
    						text: JSON.stringify(safe_result, null, 2),
    					},
    				],
    			};
    		} catch (error) {
    			const error_response = create_error_response(
    				error as Error,
    			);
    			return {
    				content: [
    					{
    						type: 'text' as const,
    						text: error_response.error,
    					},
    				],
    				isError: true,
    			};
    		}
    	},
    );
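The handler above wraps every result in an MCP text-content envelope and flags failures with `isError`. A minimal self-contained sketch of that envelope pattern, assuming a size cap for large results (the helper names `truncate_result` and `MAX_CHARS` are illustrative, not the server's actual `handle_large_result` / `create_error_response` internals):

```typescript
// Illustrative sketch of the result-envelope pattern: successful results are
// serialized into a text content block; errors set the isError flag.
const MAX_CHARS = 10_000; // assumed cap, not the server's real limit

type TextContent = { type: 'text'; text: string };
type ToolResponse = { content: TextContent[]; isError?: boolean };

function truncate_result(result: unknown, max = MAX_CHARS): string {
	const json = JSON.stringify(result, null, 2);
	return json.length > max ? json.slice(0, max) + '[truncated]' : json;
}

function to_tool_response(result: unknown): ToolResponse {
	// Success path: serialize and wrap in a single text block.
	return { content: [{ type: 'text', text: truncate_result(result) }] };
}

function to_error_response(error: Error): ToolResponse {
	// Error path: surface the message and mark the response as an error.
	return {
		content: [{ type: 'text', text: `Error: ${error.message}` }],
		isError: true,
	};
}
```

Keeping both paths in the same `{ content: [...] }` shape lets an MCP client render success and failure uniformly, branching only on `isError`.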
  • Core handler logic in UnifiedFirecrawlProvider.process_content. Selects sub-provider based on mode (scrape/crawl/etc.) and delegates processing.
    async process_content(
    	url: string | string[],
    	extract_depth: 'basic' | 'advanced' = 'basic',
    	mode: FirecrawlMode = 'scrape',
    ): Promise<ProcessingResult> {
    	if (!mode) {
    		throw new ProviderError(
    			ErrorType.INVALID_INPUT,
    			'Mode parameter is required',
    			this.name,
    		);
    	}
    
    	const selectedProvider = this.providers.get(mode);
    
    	if (!selectedProvider) {
    		throw new ProviderError(
    			ErrorType.INVALID_INPUT,
    			`Invalid mode: ${mode}. Valid options: ${Array.from(this.providers.keys()).join(', ')}`,
    			this.name,
    		);
    	}
    
    	return selectedProvider.process_content(url, extract_depth);
    }
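The dispatch logic above can be reduced to a small standalone sketch: sub-providers live in a `Map` keyed by mode, and an unknown mode raises an error that lists the valid options. The `Provider` shape and the two registered stubs are hypothetical stand-ins for the real scrape/crawl/map/extract/actions providers:

```typescript
// Minimal sketch of mode-based provider dispatch (stub providers, assumed shapes).
type FirecrawlMode = 'scrape' | 'crawl' | 'map' | 'extract' | 'actions';

interface Provider {
	process_content(url: string | string[], depth?: 'basic' | 'advanced'): string;
}

// Only two stub providers registered here; the real server registers all five.
const providers = new Map<FirecrawlMode, Provider>([
	['scrape', { process_content: (url) => `scraped ${url}` }],
	['map', { process_content: (url) => `mapped ${url}` }],
]);

function process_content(
	url: string | string[],
	mode: FirecrawlMode,
	depth: 'basic' | 'advanced' = 'basic',
): string {
	const provider = providers.get(mode);
	if (!provider) {
		// Mirror the error style above: name the bad mode, list valid ones.
		throw new Error(
			`Invalid mode: ${mode}. Valid options: ${[...providers.keys()].join(', ')}`,
		);
	}
	return provider.process_content(url, depth);
}
```

The payoff of this pattern is that adding a new mode is a one-line `Map` entry, and the error message stays accurate because it is derived from the registry rather than hard-coded.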
  • Valibot schema defining input parameters for the 'firecrawl_process' tool: url, mode, extract_depth.
    schema: v.object({
    	url: v.pipe(
    		v.union([v.string(), v.array(v.string())]),
    		v.description('URL(s)'),
    	),
    	mode: v.pipe(
    		v.union([
    			v.literal('scrape'),
    			v.literal('crawl'),
    			v.literal('map'),
    			v.literal('extract'),
    			v.literal('actions'),
    		]),
    		v.description('Processing mode'),
    	),
    	extract_depth: v.optional(
    		v.pipe(
    			v.union([v.literal('basic'), v.literal('advanced')]),
    			v.description('Extraction depth'),
    		),
    	),
    }),
  • TypeScript interface defining the contract for the firecrawl_process provider, including process_content signature.
    export interface UnifiedFirecrawlProcessingProvider {
    	name: string;
    	description: string;
    	process_content(
    		url: string | string[],
    		extract_depth?: 'basic' | 'advanced',
    		mode?: FirecrawlMode,
    	): Promise<ProcessingResult>;
    }
  • Registers the UnifiedFirecrawlProvider instance if any Firecrawl API key is valid, enabling the 'firecrawl_process' tool.
    if (has_firecrawl) {
    	register_firecrawl_process_provider(
    		new UnifiedFirecrawlProvider(),
    	);
    }
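The conditional registration above gates the tool on Firecrawl credentials. A hedged sketch of that gating, assuming the key arrives via a `FIRECRAWL_API_KEY` environment variable (the registry and env-var name are illustrative; the actual server's key detection may differ):

```typescript
// Sketch of credential-gated tool registration (assumed env-var name).
const registered_tools = new Set<string>();

function register_firecrawl_process_provider(tool_name: string): void {
	registered_tools.add(tool_name);
}

function register_if_configured(env: Record<string, string | undefined>): void {
	// Only expose the tool when a Firecrawl API key is present,
	// so clients never see a tool they cannot successfully call.
	const has_firecrawl = Boolean(env.FIRECRAWL_API_KEY);
	if (has_firecrawl) {
		register_firecrawl_process_provider('firecrawl_process');
	}
}
```

Gating registration (rather than failing at call time) keeps the tool list honest: an agent inspecting available tools only sees `firecrawl_process` when it is actually usable.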
Behavior 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure but fails to deliver. It names five modes but doesn't explain what each one actually does: what 'deep crawl' entails, what 'structured data' extraction means, what 'interactive' actions involve, or any operational constraints such as rate limits, authentication needs, or potentially destructive effects. The description is purely functional, with no behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: just two sentences that directly mirror the title. While efficient and front-loaded, it may be overly terse for a tool of this complexity, with five distinct modes. Every word serves a purpose, but a multi-mode tool with significant behavioral variation would benefit from more elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with three parameters, five operational modes, no annotations, and no output schema, the description is insufficiently complete. It doesn't explain the behavioral differences between modes, doesn't describe what the tool returns, doesn't mention any constraints or requirements, and provides no examples. Given the complexity and the lack of structured documentation elsewhere, the description should do much more of the heavy lifting.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so all three parameters (url, mode, extract_depth) are already documented with their types and constraints. The description adds no parameter semantics beyond the schema: it doesn't explain what the modes actually do, what the extraction depth levels mean, or what URL formats are valid. The baseline score of 3 reflects adequate but minimal added value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

Tautological: the description merely restates the tool's name and title.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. While it lists five modes, it doesn't explain when to choose scrape vs crawl vs map vs extract vs actions, nor does it mention when to prefer Firecrawl over sibling tools like tavily_extract_process, web_search, or other content extraction tools. The agent receives no contextual usage instructions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
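To make the critique concrete, here is one possible rewrite of the description that addresses the gaps the review identifies (mode behavior, output shape, prerequisites, and mode-selection guidance). This is an illustrative draft only, not the server's actual text, and the specific mode behaviors stated are assumptions drawn from the mode names and this page's own summary:

```typescript
// Hypothetical improved description demonstrating the review's guidance.
const improved_description = [
	'Extract web content with Firecrawl (requires a Firecrawl API key).',
	'Modes: scrape fetches a single page; crawl follows links from a start',
	'URL and returns many pages; map discovers URLs without fetching their',
	'content; extract returns structured data from pages; actions drives',
	'interactive pages before extraction.',
	'Returns JSON; very large results may be truncated.',
	'Prefer map for URL discovery and scrape for single known pages;',
	'crawl is slower and more expensive, so use it only for multi-page jobs.',
].join(' ');
```

Even an informal draft like this gives an agent the three things the rubric asks for: what each mode does, what comes back, and when to pick one mode over another.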
