firecrawl

firecrawl-mcp-server

firecrawl_agent

Autonomously searches the web, follows links, and reads pages to extract structured data from multiple sources based on your prompt. Asynchronous; polling required.

Instructions

Autonomous web research agent. This is a separate AI agent layer that independently browses the internet, searches for information, navigates through pages, and extracts structured data based on your query. You describe what you need, and the agent figures out where to find it.

How it works: The agent performs web searches, follows links, reads pages, and gathers data autonomously. This runs asynchronously - it returns a job ID immediately, and you poll firecrawl_agent_status to check when complete and retrieve results.

IMPORTANT - Async workflow with patient polling:

  1. Call firecrawl_agent with your prompt/schema → returns job ID immediately

  2. Poll firecrawl_agent_status with the job ID to check progress

  3. Keep polling for at least 2-3 minutes - agent research typically takes 1-5 minutes for complex queries

  4. Poll every 15-30 seconds until status is "completed" or "failed"

  5. Do NOT give up after just a few polling attempts - the agent needs time to research

Expected wait times:

  • Simple queries with provided URLs: 30 seconds - 1 minute

  • Complex research across multiple sites: 2-5 minutes

  • Deep research tasks: 5+ minutes
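The start-then-poll workflow and the wait times above can be sketched as a small Python helper. This is only an illustration under stated assumptions: `call_tool` is a hypothetical callable standing in for however your MCP client invokes tools, and the `id` and `status` field names are guesses, since the source does not specify the shape of the responses. Only the tool names, the 15-30 second interval, and the "completed" / "failed" statuses come from the description.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical signature: (tool name, arguments) -> result dict.
# How tools are really invoked depends on your MCP client.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_agent_and_wait(
    call_tool: ToolCaller,
    prompt: str,
    poll_interval_s: float = 20.0,  # inside the documented 15-30 second range
    max_wait_s: float = 600.0,      # generous budget for 5+ minute deep research
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start firecrawl_agent, then poll firecrawl_agent_status until it finishes."""
    started = call_tool("firecrawl_agent", {"prompt": prompt})
    job_id = started["id"]  # ASSUMPTION: field name of the returned job ID

    waited = 0.0
    while True:
        # ASSUMPTION: the status tool takes the job ID under the key "id".
        status = call_tool("firecrawl_agent_status", {"id": job_id})
        if status.get("status") in ("completed", "failed"):
            return status
        if waited >= max_wait_s:
            raise TimeoutError(f"agent job {job_id} still running after {waited:.0f}s")
        sleep(poll_interval_s)
        waited += poll_interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real waiting; in normal use the default `time.sleep` applies.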

Best for: Complex research tasks where you don't know the exact URLs; multi-source data gathering; finding information scattered across the web; extracting data from JavaScript-heavy SPAs that fail with regular scrape.

Not recommended for:

  • Single-page extraction when you have a URL (use firecrawl_scrape, faster and cheaper)

  • Web search (use firecrawl_search first)

  • Interactive page tasks like clicking, filling forms, login, or navigating JS-heavy SPAs (use firecrawl_scrape + firecrawl_interact)

  • Extracting specific data from a known page (use firecrawl_scrape with JSON format)
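One way to read the guidance above is as a short decision procedure. The sketch below encodes it in Python; the `Task` fields are hypothetical flags invented for illustration, and the ordering of the checks is a judgment call rather than anything the description prescribes.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical flags describing the job; not part of any Firecrawl API.
    known_single_page: bool = False   # you already have the URL of the one page you need
    needs_interaction: bool = False   # clicking, filling forms, logging in
    search_only: bool = False         # a plain web search would answer it


def suggest_tool(task: Task) -> str:
    """Map the 'Best for' / 'Not recommended for' guidance to a tool name."""
    if task.needs_interaction:
        return "firecrawl_scrape + firecrawl_interact"
    if task.search_only:
        return "firecrawl_search"
    if task.known_single_page:
        return "firecrawl_scrape"
    # Unknown URLs, multiple sources, or scattered information.
    return "firecrawl_agent"
```

For example, `suggest_tool(Task())` falls through to `firecrawl_agent`, while `suggest_tool(Task(known_single_page=True))` points at the cheaper `firecrawl_scrape`.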

Arguments:

  • prompt: Natural language description of the data you want (required, max 10,000 characters)

  • urls: Optional array of URLs to focus the agent on specific pages

  • schema: Optional JSON schema for structured output
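The constraints in the argument list can be checked client-side before a call is made. The following Python sketch is an assumption-laden illustration: `check_agent_arguments` is a hypothetical helper, rejecting blank prompts is an added assumption, and the server may count the 10,000 characters differently from Python's `len`.

```python
from typing import Any, Dict, List

MAX_PROMPT_CHARS = 10_000  # limit stated in the argument list above


def check_agent_arguments(arguments: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look valid."""
    problems: List[str] = []

    prompt = arguments.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        # ASSUMPTION: a blank prompt is treated the same as a missing one.
        problems.append("prompt is required and must be a non-empty string")
    elif len(prompt) > MAX_PROMPT_CHARS:
        problems.append(f"prompt exceeds {MAX_PROMPT_CHARS} characters")

    urls = arguments.get("urls")
    if urls is not None and not (
        isinstance(urls, list) and all(isinstance(u, str) for u in urls)
    ):
        problems.append("urls must be an array of strings")

    schema = arguments.get("schema")
    if schema is not None and not isinstance(schema, dict):
        problems.append("schema must be a JSON object")

    return problems
```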

Prompt Example: "Find the founders of Firecrawl and their backgrounds"

Usage Example (start agent, then poll patiently for results):

{
  "name": "firecrawl_agent",
  "arguments": {
    "prompt": "Find the top 5 AI startups founded in 2024 and their funding amounts",
    "schema": {
      "type": "object",
      "properties": {
        "startups": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": { "type": "string" },
              "funding": { "type": "string" },
              "founded": { "type": "string" }
            }
          }
        }
      }
    }
  }
}

Then poll with firecrawl_agent_status every 15-30 seconds for at least 2-3 minutes.

Usage Example (with URLs - agent focuses on specific pages):

{
  "name": "firecrawl_agent",
  "arguments": {
    "urls": ["https://docs.firecrawl.dev", "https://firecrawl.dev/pricing"],
    "prompt": "Compare the features and pricing information from these pages"
  }
}

Returns: Job ID for status checking. Use firecrawl_agent_status to poll for results.

Input Schema

Name    Required  Description  Default
urls    No        -            -
prompt  Yes       -            -
schema  No        -            -
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate openWorldHint=true. Description adds crucial behavioral context: async execution with job ID return, polling requirement, typical duration, and autonomy in navigation. No contradictions with annotations (readOnlyHint=false, destructiveHint=false).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is well-structured with markdown headers and sections, front-loading key purpose. While lengthy, it earns its length given async complexity. Could be slightly more concise, but all information is useful and clearly organized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given complexity (async, 3 params, no output schema, nested objects), description covers async workflow, polling guidance, timing expectations, argument details, examples, and return value (job ID). No gaps remain for effective tool use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0% (no property descriptions), but description compensates fully. It explains prompt as 'natural language description of the data you want' with max length, urls as optional focus array, schema as optional JSON schema for structured output. Includes examples.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it is an 'Autonomous web research agent' that independently browses and extracts data. It distinguishes from siblings like firecrawl_scrape (single-page extraction) and firecrawl_search (web search), providing specific guidance on when to use which.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit 'Best for' and 'Not recommended for' sections list when to use this tool vs alternatives (e.g., firecrawl_scrape, firecrawl_search). Detailed async workflow with polling intervals (15-30 seconds) and expected wait times (30s to 5+ min) is provided.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/firecrawl/firecrawl-mcp-server'
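For callers who prefer Python to curl, the same GET request can be sketched with the standard library. The URL pattern (`<owner>/<server-name>` under the base path) is inferred from the single example above, and the assumption that the response body is JSON is not confirmed by the source; `server_url` and `fetch_server` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any
from urllib.parse import quote

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory URL for one server (pattern inferred from the curl example)."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner: str, name: str, timeout_s: float = 10.0) -> Any:
    """Perform the GET request and decode the body."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        # ASSUMPTION: the API responds with a UTF-8 JSON document.
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server("firecrawl", "firecrawl-mcp-server")` issues the same request as the curl command.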

If you have feedback or need assistance with the MCP directory API, please join our Discord server.