Glama

Server Details

40+ web scraping tools from Firecrawl, Bright Data, Jina, Olostep, ScrapeGraph, Notte, and Riveter. Scrape, crawl, screenshot, and extract from any website. Starts at $0.01/call. Get your API key at app.xpay.sh or xpay.tools

Status: Healthy
Transport: Streamable HTTP

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: B

Average 3.8/5 across 13 of 13 tools scored. Lowest: 2.4/5.

Server Coherence: C

Disambiguation: 2/5

There is significant overlap and ambiguity among tools, particularly in the web scraping domain. For example, firecrawl_scrape, scrape_as_markdown, and read_url all appear to extract content from single web pages, with unclear distinctions in their use cases. Similarly, firecrawl_search, search_engine, and search_web all perform web searches, causing confusion about which tool to select for a given task. The descriptions provide some guidance, but the boundaries between tools are poorly defined, leading to potential misselection.

Naming Consistency: 2/5

Naming conventions are inconsistent. All names are snake_case (e.g., capture_screenshot_url, extract_pdf), but verb styles vary (e.g., firecrawl_crawl vs. firecrawl_extract vs. firecrawl_map). Some tools use generic names like read_url or search_web, while others carry a firecrawl_ prefix, creating a fragmented pattern. There is no discernible overall naming scheme, making the tool set harder to navigate and predict.

Tool Count: 3/5

With 13 tools, the count is borderline but reasonable for a web scraping collection. However, the tools feel redundant rather than complementary, as many overlap in functionality (e.g., multiple search and single-page scraping tools). This suggests the count could be optimized by merging similar tools, but it is not extreme enough to score lower, as the domain of web scraping can justify a moderate number of tools.

Completeness: 4/5

The tool set covers a broad range of web scraping and data extraction tasks, including single-page scraping, batch processing, crawling, mapping, searching, and specialized functions like PDF extraction and screenshot capture. There are minor gaps, such as no explicit tool for updating or deleting scraped data, but these are not critical for the domain. The surface is largely complete for web scraping workflows, allowing agents to perform most common operations.

Available Tools

13 tools
capture_screenshot_url: A

Capture high-quality screenshots of web pages in base64 encoded JPEG format. Use this tool when you need to visually inspect a website, take a snapshot for analysis, or show users what a webpage looks like.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | The complete HTTP/HTTPS URL of the webpage to capture (e.g., 'https://example.com') | |
| return_url | No | Set to true to return screenshot URLs instead of downloading images as base64 | |
| firstScreenOnly | No | Set to true for a single screen capture (faster), false for full page capture including content below the fold | |
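
As a rough illustration of how these three parameters fit together, the Python sketch below builds an MCP `tools/call` request for capture_screenshot_url and checks that a decoded payload starts with the JPEG start-of-image marker. The helper names are hypothetical, and where the base64 string sits in the tool result is an assumption the listing does not confirm.

```python
import base64


def build_screenshot_call(url: str, first_screen_only: bool = True,
                          return_url: bool = False, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for capture_screenshot_url."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "capture_screenshot_url",
            "arguments": {
                "url": url,
                "firstScreenOnly": first_screen_only,
                "return_url": return_url,
            },
        },
    }


def decode_jpeg(b64_data: str) -> bytes:
    """Decode a base64 string and confirm it looks like a JPEG."""
    raw = base64.b64decode(b64_data, validate=True)
    # Every JPEG file begins with the two-byte SOI marker FF D8.
    if not raw.startswith(b"\xff\xd8"):
        raise ValueError("payload does not start with the JPEG SOI marker")
    return raw
```

Setting firstScreenOnly to true keeps the payload smaller, which matters when the base64 string is returned inline rather than as a URL.
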

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the output format (base64 JPEG) and quality level, but omits other behavioral traits like timeout behavior, JavaScript execution policy, rate limits, or error handling for invalid URLs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero waste: first establishes functionality and format, second establishes use cases. Information is front-loaded and every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Appropriate for a 3-parameter tool with simple schema and no output schema. Covers the core value proposition (visual capture) and default return format. Could mention the URL return alternative or error scenarios, but schema handles parameter details adequately.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing a baseline of 3. The description implies the default output behavior (base64) but does not add significant semantic meaning beyond what the schema already provides for each parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verb 'Capture' with resource 'screenshots of web pages' and specifies the output format 'base64 encoded JPEG'. This clearly distinguishes it from text-extraction siblings like scrape_as_markdown or firecrawl_scrape.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear when-to-use guidance ('visually inspect a website', 'take a snapshot for analysis', 'show users what a webpage looks like'). However, it does not explicitly name text-based alternatives (e.g., scrape_as_markdown) for non-visual extraction scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

extract_pdf: A

Extract figures, tables, and equations from PDF documents using layout detection. Perfect for extracting visual elements from academic papers on arXiv or any PDF URL. Returns base64-encoded images of detected elements with metadata.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| id | No | arXiv paper ID (e.g., '2301.12345' or 'hep-th/9901001'). Either id or url is required. | |
| url | No | Direct PDF URL. Either id or url is required. | |
| type | No | Filter by float types (comma-separated): figure, table, equation. If not specified, returns all types. | |
| max_edge | No | Maximum edge size for extracted images in pixels (default: 1024) | |
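
The "either id or url" rule and the comma-separated type filter are easy to get wrong, so here is a minimal Python sketch of an argument builder that enforces both before a call is made. The function name is hypothetical, and passing both id and url through when both are given is an assumption, since the schema only says that one of them is required.

```python
from typing import Optional, Sequence

# Float types named in the schema description for the `type` parameter.
ALLOWED_TYPES = ("figure", "table", "equation")


def build_extract_pdf_args(arxiv_id: Optional[str] = None,
                           url: Optional[str] = None,
                           types: Optional[Sequence[str]] = None,
                           max_edge: int = 1024) -> dict:
    """Build the arguments object for an extract_pdf call."""
    if not arxiv_id and not url:
        raise ValueError("either id or url is required")
    args: dict = {}
    if arxiv_id:
        args["id"] = arxiv_id
    if url:
        args["url"] = url
    if types:
        unknown = [t for t in types if t not in ALLOWED_TYPES]
        if unknown:
            raise ValueError(f"unsupported float types: {unknown}")
        # The schema asks for a comma-separated string, not a list.
        args["type"] = ",".join(types)
    args["max_edge"] = max_edge
    return args
```
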

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the output format (base64-encoded images with metadata) and method (layout detection), but omits operational details like rate limits, handling of scanned vs. text-based PDFs, or error behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences with zero waste. The first establishes core functionality, the second covers the use case, and the third states the output format. Information density is high with no redundant phrases.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of output schema, the description appropriately explains return values (base64 images with metadata). All four parameters are documented in the schema. It could be improved by mentioning error handling or specific metadata fields returned.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing a baseline of 3. The description adds domain context by mentioning 'arXiv' (relating to the 'id' parameter) and specific element types (relating to the 'type' parameter), but does not elaborate on parameter syntax or the conditional requirement logic mentioned in schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action (extract) and resources (figures, tables, equations from PDFs) using layout detection. It effectively distinguishes itself from sibling web scraping tools by focusing on visual element extraction from PDFs specifically.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage context ('Perfect for extracting visual elements from academic papers on arXiv'), suggesting when to use this over general scraping tools. However, it lacks explicit guidance on when not to use it or specific prerequisites (e.g., PDF accessibility requirements).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

firecrawl_crawl: A

Starts a crawl job on a website and extracts content from all pages.

Best for: Extracting content from multiple related pages, when you need comprehensive coverage.

Not recommended for: Extracting content from a single page (use scrape); when token limits are a concern (use map + batch_scrape); when you need fast results (crawling can be slow).

Warning: Crawl responses can be very large and may exceed token limits. Limit the crawl depth and number of pages, or use map + batch_scrape for better control.

Common mistakes: Setting limit or maxDiscoveryDepth too high (causes token overflow) or too low (causes missing pages); using crawl for a single page (use scrape instead). Using a /* wildcard is not recommended.

Prompt Example: "Get all blog posts from the first two levels of example.com/blog."

Usage Example:

{
  "name": "firecrawl_crawl",
  "arguments": {
    "url": "https://example.com/blog/*",
    "maxDiscoveryDepth": 5,
    "limit": 20,
    "allowExternalLinks": false,
    "deduplicateSimilarURLs": true,
    "sitemap": "include"
  }
}

Returns: Operation ID for status checking; use firecrawl_check_crawl_status to check progress.

Safe Mode: Read-only crawling. Webhooks and interactive actions are disabled for security.
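
Because the crawl returns an operation ID rather than content, a client has to poll for completion. The Python sketch below shows one way to do that, under stated assumptions: `call_tool` is a hypothetical callable that sends a tool call and returns the parsed result, and the `id`, `status`, `completed`, and `failed` names are guesses at the response shape, not confirmed by this listing.

```python
import time
from typing import Any, Callable

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed result.
ToolCaller = Callable[[str, dict], Any]


def wait_for_crawl(call_tool: ToolCaller, crawl_args: dict,
                   poll_seconds: float = 2.0, max_polls: int = 30,
                   sleep: Callable[[float], None] = time.sleep) -> dict:
    """Start a crawl, then poll its status until it finishes or times out."""
    started = call_tool("firecrawl_crawl", crawl_args)
    job_id = started["id"]  # assumed field name for the operation ID
    for _ in range(max_polls):
        status = call_tool("firecrawl_check_crawl_status", {"id": job_id})
        state = status.get("status")  # assumed field name and values
        if state == "completed":
            return status
        if state == "failed":
            raise RuntimeError(f"crawl {job_id} failed")
        sleep(poll_seconds)
    raise TimeoutError(f"crawl {job_id} did not finish after {max_polls} polls")
```

Capping max_polls keeps a slow crawl from blocking an agent indefinitely, which fits the description's warning that crawling can be slow.
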

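Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | | |
| delay | No | | |
| limit | No | | |
| prompt | No | | |
| sitemap | No | | |
| excludePaths | No | | |
| includePaths | No | | |
| scrapeOptions | No | | |
| maxConcurrency | No | | |
| allowSubdomains | No | | |
| crawlEntireDomain | No | | |
| maxDiscoveryDepth | No | | |
| allowExternalLinks | No | | |
| ignoreQueryParameters | No | | |
| deduplicateSimilarURLs | No | | |

Behavior: 5/5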

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and excels: it warns about token overflow risks from large responses, declares the async pattern ('Returns: Operation ID'), explains 'Safe Mode' security constraints, and notes performance characteristics ('crawling can be slow').

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with markdown headers that facilitate scanning. Every section adds distinct value (warnings, mistakes, examples). Slightly verbose given the length, but justified by the tool's complexity and lack of annotations/schema descriptions.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complex async operation (15 params, nested objects, no output schema), the description is comprehensive. It explains the return value mechanism (Operation ID), references the status-checking tool, covers security (Safe Mode), and provides concrete usage patterns.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%, requiring the description to compensate. It partially succeeds by explaining critical parameters through 'Common mistakes' (limit, maxDiscoveryDepth) and usage examples (url patterns, sitemap, allowExternalLinks). However, with 15 total parameters including complex nested objects (scrapeOptions), many parameters like delay, prompt, and excludePaths remain undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a precise action ('Starts a crawl job') and scope ('extracts content from all pages'). It clearly distinguishes from sibling tools by contrasting with 'scrape' for single pages and 'map + batch_scrape' for token-limited scenarios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly structured with 'Best for' and 'Not recommended for' sections that name specific alternatives (scrape, map + batch_scrape). It identifies exact conditions for selection: multiple related pages vs. single pages, token limit concerns, and speed requirements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

firecrawl_extract: A

Extract structured information from web pages using LLM capabilities. Supports both cloud AI and self-hosted LLM extraction.

Best for: Extracting specific structured data like prices, names, details from web pages.

Not recommended for: When you need the full content of a page (use scrape); when you're not looking for specific structured data.

Arguments:

  • urls: Array of URLs to extract information from

  • prompt: Custom prompt for the LLM extraction

  • schema: JSON schema for structured data extraction

  • allowExternalLinks: Allow extraction from external links

  • enableWebSearch: Enable web search for additional context

  • includeSubdomains: Include subdomains in extraction

Prompt Example: "Extract the product name, price, and description from these product pages."

Usage Example:

{
  "name": "firecrawl_extract",
  "arguments": {
    "urls": ["https://example.com/page1", "https://example.com/page2"],
    "prompt": "Extract product information including name, price, and description",
    "schema": {
      "type": "object",
      "properties": {
        "name": { "type": "string" },
        "price": { "type": "number" },
        "description": { "type": "string" }
      },
      "required": ["name", "price"]
    },
    "allowExternalLinks": false,
    "enableWebSearch": false,
    "includeSubdomains": false
  }
}

Returns: Extracted structured data as defined by your schema.
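
Since extraction is LLM-driven, it can be worth checking each returned record against the schema that was sent. The Python sketch below is a deliberately small check covering only `required` and top-level primitive `type` keywords. It is not a full JSON Schema validator, the function name is hypothetical, and the assumption that each result is a plain dictionary is not confirmed by the listing.

```python
from typing import Any

# The schema from the usage example above.
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "price": {"type": "number"},
        "description": {"type": "string"},
    },
    "required": ["name", "price"],
}

# JSON Schema primitive names mapped to Python types (small subset only).
_TYPES = {"string": str, "number": (int, float), "boolean": bool,
          "array": list, "object": dict}


def check_record(record: dict, schema: dict) -> list:
    """Return a list of human-readable problems; empty means the record passed."""
    problems = [f"missing required field: {key}"
                for key in schema.get("required", []) if key not in record]
    for key, spec in schema.get("properties", {}).items():
        if key not in record:
            continue
        value: Any = record[key]
        expected = _TYPES.get(spec.get("type"))
        # bool is a subclass of int in Python, so reject it for "number".
        bool_as_number = spec.get("type") == "number" and isinstance(value, bool)
        if expected is not None and (not isinstance(value, expected) or bool_as_number):
            problems.append(f"{key}: expected {spec['type']}, got {type(value).__name__}")
    return problems
```
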

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| urls | Yes | | |
| prompt | No | | |
| schema | No | | |
| enableWebSearch | No | | |
| includeSubdomains | No | | |
| allowExternalLinks | No | | |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses LLM usage ('cloud AI and self-hosted LLM extraction'), which implies probabilistic, non-deterministic behavior. However, it omits a safety classification (read-only vs. destructive), rate limits, error handling, and idempotency details, all of which would be essential for an LLM-based tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with markdown headers front-loading critical info. The JSON usage example is lengthy but justified by complexity. Arguments section efficiently lists parameters without redundancy. No filler text, though slightly verbose compared to minimal ideal.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given high complexity (6 params, nested objects, LLM integration), zero schema coverage, and no annotations, the description adequately covers functionality, parameters, and return values. Minor gap: lacks error handling description or LLM-specific constraints (token limits, cost).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage, but description fully compensates by documenting all 6 parameters ('urls: Array of URLs...', 'prompt: Custom prompt...', etc.) including the nested schema object. Includes concrete prompt example and full JSON usage example.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description opens with specific verb ('Extract') + resource ('structured information from web pages') + method ('using LLM capabilities'). Explicitly distinguishes from sibling scrape tools via 'Not recommended for: When you need the full content of a page (use scrape)'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Contains explicit '**Best for:**' and '**Not recommended for:**' sections with clear alternatives named (scrape). Provides concrete guidance on when to use LLM extraction vs. full page retrieval.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

firecrawl_map: A

Map a website to discover all indexed URLs on the site.

Best for: Discovering URLs on a website before deciding what to scrape; finding specific sections or pages within a large site; locating the correct page when scrape returns empty or incomplete results.

Not recommended for: When you already know which specific URL you need (use scrape); when you need the content of the pages (use scrape after mapping).

Common mistakes: Using crawl to discover URLs instead of map; jumping straight to firecrawl_agent when scrape fails instead of using map first to find the right page.

IMPORTANT - Use map before agent: If firecrawl_scrape returns empty, minimal, or irrelevant content, use firecrawl_map with the search parameter to find the specific page URL containing your target content. This is faster and cheaper than using firecrawl_agent. Only use the agent as a last resort after map+scrape fails.

Prompt Example: "Find the webhook documentation page on this API docs site."

Usage Example (discover all URLs):

{
  "name": "firecrawl_map",
  "arguments": {
    "url": "https://example.com"
  }
}

Usage Example (search for specific content - RECOMMENDED when scrape fails):

{
  "name": "firecrawl_map",
  "arguments": {
    "url": "https://docs.example.com/api",
    "search": "webhook events"
  }
}

Returns: Array of URLs found on the site, filtered by search query if provided.
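
The map-then-scrape workflow described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the server's client API: `call_tool` is a hypothetical callable that returns parsed tool results, the map result is assumed to be a plain list of URL strings, and the scrape result is assumed to expose its text under a `markdown` key.

```python
from typing import Any, Callable, Optional, Tuple

# Hypothetical signature: call_tool(tool_name, arguments) -> parsed result.
ToolCaller = Callable[[str, dict], Any]


def scrape_via_map(call_tool: ToolCaller, site_url: str, search: str,
                   max_candidates: int = 3) -> Optional[Tuple[str, str]]:
    """Find candidate pages with firecrawl_map, then scrape until one has content."""
    urls = call_tool("firecrawl_map", {"url": site_url, "search": search})
    for candidate in urls[:max_candidates]:
        page = call_tool("firecrawl_scrape", {
            "url": candidate,
            "formats": ["markdown"],
            "onlyMainContent": True,
        })
        text = page.get("markdown", "")  # assumed result key
        if text.strip():
            return candidate, text
    # Nothing usable was found; only now would the agent be worth trying.
    return None
```
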

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | | |
| limit | No | | |
| search | No | | |
| sitemap | No | | |
| includeSubdomains | No | | |
| ignoreQueryParameters | No | | |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses return format ('Array of URLs'), performance characteristics ('faster and cheaper than using firecrawl_agent'), and workflow behavior. Minor gap: doesn't explicitly state if operation is read-only or idempotent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Excellent information density with clear markdown structure. Front-loaded purpose followed by decision-making sections (Best for/Not recommended), workflow rules, and concrete JSON examples. No wasted sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description adequately explains the return value ('Array of URLs'). However, with 6 parameters and 0% schema coverage, leaving 4 parameters undocumented (limit, sitemap, includeSubdomains, ignoreQueryParameters) creates clear gaps for a tool of this complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage. Description compensates partially by demonstrating 'url' and 'search' parameters in examples and explicitly mentioning 'search' in the workflow guidance. However, 'limit', 'sitemap', 'includeSubdomains', and 'ignoreQueryParameters' remain completely undocumented.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Opens with specific verb+resource ('Map a website to discover all indexed URLs') and immediately distinguishes from siblings by contrasting with 'scrape' (when you know the URL), 'crawl' (common mistakes section), and 'firecrawl_agent' (cheaper/faster alternative).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Exceptional guidance with explicit 'Best for' and 'Not recommended for' sections naming alternatives (scrape, agent). The 'IMPORTANT' section provides clear workflow sequencing (use map before agent when scrape fails).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

firecrawl_scrape: A

Scrape content from a single URL with advanced options. This is the most powerful, fastest, and most reliable scraper tool; if it is available, you should always default to it for any web scraping needs.

Best for: Single page content extraction, when you know exactly which page contains the information.

Not recommended for: Multiple pages (call scrape multiple times or use crawl), unknown page location (use search).

Common mistakes: Using markdown format when extracting specific data points (use JSON instead).

Other Features: Use 'branding' format to extract brand identity (colors, fonts, typography, spacing, UI components) for design analysis or style replication.

CRITICAL - Format Selection (you MUST follow this): When the user asks for SPECIFIC data points, you MUST use JSON format with a schema. Only use markdown when the user needs the ENTIRE page content.

Use JSON format when user asks for:

  • Parameters, fields, or specifications (e.g., "get the header parameters", "what are the required fields")

  • Prices, numbers, or structured data (e.g., "extract the pricing", "get the product details")

  • API details, endpoints, or technical specs (e.g., "find the authentication endpoint")

  • Lists of items or properties (e.g., "list the features", "get all the options")

  • Any specific piece of information from a page

Use markdown format ONLY when:

  • User wants to read/summarize an entire article or blog post

  • User needs to see all content on a page without specific extraction

  • User explicitly asks for the full page content

Handling JavaScript-rendered pages (SPAs): If JSON extraction returns empty, minimal, or just navigation content, the page is likely JavaScript-rendered or the content is on a different URL. Try these steps IN ORDER:

  1. Add waitFor parameter: Set waitFor to a value between 5000 and 10000 (milliseconds) to allow JavaScript to render before extraction

  2. Try a different URL: If the URL has a hash fragment (#section), try the base URL or look for a direct page URL

  3. Use firecrawl_map to find the correct page: Large documentation sites or SPAs often spread content across multiple URLs. Use firecrawl_map with a search parameter to discover the specific page containing your target content, then scrape that URL directly. Example: If scraping "https://docs.example.com/reference" fails to find webhook parameters, use firecrawl_map with {"url": "https://docs.example.com/reference", "search": "webhook"} to find URLs like "/reference/webhook-events", then scrape that specific page.

  4. Use firecrawl_agent: As a last resort for heavily dynamic pages where map+scrape still fails, use the agent which can autonomously navigate and research
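
Steps 1 and 2 of the list above amount to retrying the same scrape with progressively different arguments. As a sketch only, the Python helper below turns a base arguments object into that ordered list of attempts. The function name and the choice of exactly two wait values are assumptions made for illustration. Steps 3 and 4 involve other tools and are left out.

```python
from urllib.parse import urldefrag


def escalation_plan(base_args: dict, wait_values=(5000, 10000)) -> list:
    """Return scrape arguments to try in order: as-is, with waitFor, without the hash."""
    plan = [dict(base_args)]
    # Step 1: give JavaScript time to render before extraction.
    for wait_ms in wait_values:
        plan.append({**base_args, "waitFor": wait_ms})
    # Step 2: if the URL carries a #fragment, fall back to the base URL.
    stripped, fragment = urldefrag(base_args["url"])
    if fragment:
        plan.append({**base_args, "url": stripped})
    return plan
```

A caller would run each entry through firecrawl_scrape and stop at the first result that contains real content.
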

Usage Example (JSON format - REQUIRED for specific data extraction):

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com/api-docs",
    "formats": ["json"],
    "jsonOptions": {
      "prompt": "Extract the header parameters for the authentication endpoint",
      "schema": {
        "type": "object",
        "properties": {
          "parameters": {
            "type": "array",
            "items": {
              "type": "object",
              "properties": {
                "name": { "type": "string" },
                "type": { "type": "string" },
                "required": { "type": "boolean" },
                "description": { "type": "string" }
              }
            }
          }
        }
      }
    }
  }
}

Prefer markdown format by default. You can read and reason over the full page content directly — no need for an intermediate query step. Use markdown for questions about page content, factual lookups, and any task where you need to understand the page.

Use JSON format when user needs:

  • Structured data with specific fields (extract all products with name, price, description)

  • Data in a specific schema for downstream processing

Use query format only when:

  • The page is extremely long and you need a single targeted answer without processing the full content

  • You want a quick factual answer and don't need to retain the page content

Usage Example (markdown format - default for most tasks):

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com/article",
    "formats": ["markdown"],
    "onlyMainContent": true
  }
}

Usage Example (branding format - extract brand identity):

{
  "name": "firecrawl_scrape",
  "arguments": {
    "url": "https://example.com",
    "formats": ["branding"]
  }
}

Branding format: Extracts comprehensive brand identity (colors, fonts, typography, spacing, logo, UI components) for design analysis or style replication.

Performance: Add maxAge parameter for 500% faster scrapes using cached data.

Returns: JSON structured data, markdown, branding profile, or other formats as specified.

Safe Mode: Read-only content extraction. Interactive actions (click, write, executeJavascript) are disabled for security.

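Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | | |
| proxy | No | | |
| maxAge | No | | |
| mobile | No | | |
| formats | No | | |
| parsers | No | | |
| profile | No | | |
| waitFor | No | | |
| location | No | | |
| pdfOptions | No | | |
| excludeTags | No | | |
| includeTags | No | | |
| jsonOptions | No | | |
| queryOptions | No | | |
| storeInCache | No | | |
| onlyMainContent | No | | |
| screenshotOptions | No | | |
| zeroDataRetention | No | | |
| removeBase64Images | No | | |
| skipTlsVerification | No | | |

Behavior: 5/5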

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and excels: it discloses 'Safe Mode' read-only restrictions, performance characteristics ('500% faster' with maxAge), return value types, JavaScript rendering limitations, and data retention options (zeroDataRetention mentioned). It provides comprehensive behavioral context beyond basic functionality.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but well-structured with bold headers, clear sections (Best for/Not recommended), and front-loaded critical information. While the verbosity is justified by the lack of schema documentation and complex decision logic (format selection), it could be tightened by reducing redundant examples or collapsing the branding section which repeats earlier content.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the high complexity (20 parameters, nested objects, 0% schema coverage) and lack of output schema or annotations, the description provides strong contextual completeness through usage examples, return value descriptions, and error handling guidance. It falls short of perfect only by leaving many parameters (proxy, mobile, location, etc.) without explicit semantic description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description compensates by providing extensive semantic detail for complex parameters: 'formats' (via detailed JSON/markdown/branding examples and selection logic), 'waitFor' (in the SPA section), and 'maxAge' (performance note). However, it omits explanations for simpler parameters like 'proxy', 'mobile', 'location', and 'parsers', leaving them to inference.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a clear, specific verb ('Scrape content from a single URL') and immediately distinguishes itself from siblings by stating it is 'most powerful' for single pages vs. using 'crawl' for multiple pages, 'search' for unknown locations, and 'firecrawl_map' for discovery. This provides precise scope definition.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Exceptional guidance includes explicit 'Best for' and 'Not recommended for' sections naming alternatives (crawl, search). The 'CRITICAL - Format Selection' section provides unambiguous rules for choosing between JSON and markdown formats, and the SPA handling section gives ordered troubleshooting steps involving other tools (firecrawl_map, firecrawl_agent).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

read_url (grade A)

Extract and convert web page content to clean, readable markdown format. Perfect for reading articles, documentation, blog posts, or any web content. Use this when you need to analyze text content from websites, bypass paywalls, or get structured data.

Parameters
Name | Required | Description | Default
url | Yes | The complete URL of the webpage or PDF file to read and convert (e.g., 'https://example.com/article'). Can be a single URL string or an array of URLs for parallel reading. | -
withAllLinks | No | Set to true to extract and return all hyperlinks found on the page as structured data | -
withAllImages | No | Set to true to extract and return all images found on the page as structured data | -
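
As a rough illustration of how the three parameters listed above fit together, the sketch below builds a JSON-RPC `tools/call` request of the kind an MCP client sends. The helper name `build_read_url_call` and the request id are hypothetical; only the tool name and the parameter names come from the listing.

```python
import json


def build_read_url_call(url, with_all_links=False, with_all_images=False, request_id=1):
    """Build an MCP tools/call request for read_url (illustrative helper)."""
    # The schema text says `url` may be a single string or an array of URLs.
    if not isinstance(url, (str, list)):
        raise TypeError("url must be a string or a list of strings")
    arguments = {"url": url}
    # Send the optional flags only when they are switched on.
    if with_all_links:
        arguments["withAllLinks"] = True
    if with_all_images:
        arguments["withAllImages"] = True
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "read_url", "arguments": arguments},
    }


request = build_read_url_call("https://example.com/article", with_all_links=True)
print(json.dumps(request, indent=2))
```

The listing does not document the response shape, so the sketch stops at the request.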
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full disclosure burden. It valuably mentions 'bypass paywalls' as a specific capability, but omits other critical behavioral traits such as JavaScript execution support, error handling patterns, rate limiting, or whether the operation is idempotent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of three efficiently structured sentences with zero redundancy. It is properly front-loaded with the core action ('Extract and convert') in the first sentence, followed by use cases and invocation guidelines, making every sentence earn its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a three-parameter tool with simple boolean flags and no output schema, the description adequately covers the primary value proposition (markdown conversion, paywall bypassing). It could be improved by mentioning the return format or error behavior, but it is sufficient for agent selection given the schema completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the structured documentation already explains all three parameters (url, withAllLinks, withAllImages) adequately. The description mentions 'structured data' which loosely alludes to the link/image extraction flags but does not add specific semantic guidance beyond the schema definitions, meeting the baseline expectation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool extracts and converts web content to markdown format, with specific use cases like articles and documentation. However, it does not explicitly differentiate from the sibling tool 'scrape_as_markdown' which appears to have identical functionality, creating potential selection ambiguity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit positive guidance ('Use this when you need to analyze text content...'), but lacks negative constraints or alternative recommendations. Given the numerous sibling scraping tools (firecrawl_scrape, scrape_as_markdown, etc.), it should explicitly state when to prefer this over alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scrape_as_markdown (grade C)

Scrape a single webpage URL with advanced options for content extraction and get back the results in MarkDown language. This tool can unlock any webpage even if it uses bot detection or CAPTCHA.

Parameters
Name | Required | Description | Default
url | Yes | - | -
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses the anti-bot/CAPTCHA bypass capability, which is critical behavioral context. However, it omits rate limits, authentication requirements, and the misleading 'advanced options' claim creates uncertainty about actual behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact at two sentences and front-loads the core function. However, the inclusion of the inaccurate 'advanced options' claim wastes space and misleads, preventing a higher score.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description appropriately specifies the return format (MarkDown). For a single-parameter tool, the description is almost sufficient, but the discrepancy between claimed 'advanced options' and the actual schema limits completeness, and no information is provided about error handling or rate limiting.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, requiring the description to compensate. While it implies the URL parameter via 'single webpage URL', it falsely suggests additional parameters ('advanced options') exist when they do not. It provides no format guidance or constraints for the url parameter beyond the schema's URI format.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the core action (scrape URL) and output format (MarkDown), distinguishing it from generic scraping siblings. However, it inaccurately claims the tool accepts 'advanced options for content extraction' when the input schema only contains a single 'url' parameter with additionalProperties: false, creating confusion about actual capabilities.
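
To make the mismatch concrete, here is a minimal sketch of how a strict schema treats an argument that the description's "advanced options" wording might lead an agent to guess. The schema dictionary is reconstructed from the review text (a single `url` property in URI format, `additionalProperties: false`) and is an assumption, as are the helper name `check_arguments` and the guessed option `onlyMainContent`.

```python
# Schema shape assumed from the review: one required "url" property
# and additionalProperties set to false.
ASSUMED_SCHEMA = {
    "type": "object",
    "properties": {"url": {"type": "string", "format": "uri"}},
    "required": ["url"],
    "additionalProperties": False,
}


def check_arguments(arguments, schema=ASSUMED_SCHEMA):
    """Return a list of problems; an empty list means the call looks valid."""
    problems = []
    for name in schema["required"]:
        if name not in arguments:
            problems.append(f"missing required parameter: {name}")
    # With additionalProperties false, any key outside "properties" is rejected.
    if schema.get("additionalProperties") is False:
        for name in arguments:
            if name not in schema["properties"]:
                problems.append(f"unexpected parameter: {name}")
    return problems


print(check_arguments({"url": "https://example.com"}))
# "onlyMainContent" is a hypothetical option an agent might guess at.
print(check_arguments({"url": "https://example.com", "onlyMainContent": True}))
```

This checks only key presence, not value types or URI syntax; a full JSON Schema validator would cover those.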

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description mentions CAPTCHA/bot detection bypass, implicitly suggesting when to use this tool over simpler alternatives like read_url. However, it fails to explicitly compare against siblings (firecrawl_scrape, scrape_batch) or state when NOT to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

scrape_batch (grade C)

Scrape multiple webpages URLs with advanced options for content extraction and get back the results in MarkDown language. This tool can unlock any webpage even if it uses bot detection or CAPTCHA.

Parameters
Name | Required | Description | Default
urls | Yes | Array of URLs to scrape (max 10) | -
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses anti-bot/CAPTCHA bypass capabilities and Markdown output format (valuable given no output schema). However, it falsely claims 'advanced options for content extraction' which do not exist in the input schema, misleading agents to expect configuration parameters that aren't available.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Compact two-sentence structure, but the claim about 'advanced options' wastes space and creates confusion since no such options are exposed in the tool interface. Front-loading is adequate but the second sentence makes unsupported promises.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Inadequate for a scraping tool with no annotations or output schema. Fails to explain the return data structure (how results map to input URLs), error handling for invalid URLs, or resolve the discrepancy between 'advanced options' marketing and the simple single-parameter interface.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While the schema has 100% coverage for the single 'urls' parameter, the description detracts from that coverage by referencing 'advanced options' that aren't in the schema. It adds no clarifying details about URL format requirements or the 10-item limit beyond what the schema already states.
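
Since the only documented constraint is the 10-URL cap, a caller with a longer list has to split it before calling the tool. The sketch below shows one way to do that; `chunk_urls` is a hypothetical helper, and the example URLs are placeholders.

```python
def chunk_urls(urls, max_per_call=10):
    """Split a URL list into groups no larger than the documented limit."""
    if max_per_call < 1:
        raise ValueError("max_per_call must be at least 1")
    return [urls[i:i + max_per_call] for i in range(0, len(urls), max_per_call)]


urls = [f"https://example.com/page/{n}" for n in range(23)]
batches = chunk_urls(urls)

# One scrape_batch call per group; only the "urls" argument is sent.
calls = [{"name": "scrape_batch", "arguments": {"urls": batch}} for batch in batches]
print([len(call["arguments"]["urls"]) for call in calls])  # [10, 10, 3]
```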

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the core action (scrape multiple webpages), output format (Markdown), and anti-bot capabilities. However, it fails to differentiate from numerous siblings like 'scrape_as_markdown', 'firecrawl_scrape', or 'read_url' that likely overlap in functionality.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides no guidance on when to use this tool versus the 12+ sibling tools available (e.g., when to use batch vs single scrape, or vs firecrawl variants). No prerequisites or error conditions mentioned despite the anti-bot claims.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_engine (grade B)

Scrape search results from Google, Bing or Yandex. Returns SERP results in JSON or Markdown (URL, title, description), Ideal for gathering current information, news, and detailed search results.

Parameters
Name | Required | Description | Default
query | Yes | - | -
cursor | No | Pagination cursor for next page | -
engine | No | - | google
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses the output format (JSON or Markdown) and structure (URL, title, description), compensating somewhat for the missing output schema. However, it omits critical operational details: rate limits, read-only safety status, error handling behavior, and pagination mechanics despite mentioning the cursor parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with two information-dense sentences. It front-loads the core action and engines, immediately follows with output format specifics, and concludes with use-case guidance. No filler words or redundant phrases are present.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of annotations and output schema, the description adequately compensates by describing the return structure and supported engines. However, for a 3-parameter search tool with pagination support, it lacks crucial context about cursor usage patterns, rate limiting, and the critical distinction from the batch variant sibling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is only 33% (only 'cursor' has a description). The description compensates partially by mentioning Google, Bing, and Yandex (mapping to the 'engine' enum), but fails to explicitly document the 'query' parameter (beyond implying it via 'search results') or explain pagination behavior for the cursor. Given the low coverage, more explicit parameter documentation is needed.
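
A small sketch of the parameter handling the description leaves implicit: it validates the engine against the three names the tool description mentions and attaches a cursor only when one is available. The helper name is hypothetical, and treating the cursor as an opaque value taken from a previous response is an assumption, since the listing does not explain pagination.

```python
SUPPORTED_ENGINES = ("google", "bing", "yandex")  # engines named in the tool description


def build_search_arguments(query, engine="google", cursor=None):
    """Assemble arguments for a search_engine call (illustrative helper)."""
    if not query or not query.strip():
        raise ValueError("query must be a non-empty string")
    if engine not in SUPPORTED_ENGINES:
        raise ValueError(f"engine must be one of {SUPPORTED_ENGINES}")
    arguments = {"query": query, "engine": engine}
    # Assumption: the cursor is an opaque value returned by an earlier call.
    if cursor is not None:
        arguments["cursor"] = cursor
    return arguments


print(build_search_arguments("mcp server registry"))
print(build_search_arguments("mcp server registry", engine="bing", cursor="page-2-token"))
```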

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool scrapes search results from specific engines (Google, Bing, Yandex) and returns SERP data. However, it fails to explicitly differentiate from similar siblings like 'search_engine_batch', 'firecrawl_search', and 'search_web', which could lead to confusion about which search tool to select.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While it mentions the tool is 'Ideal for gathering current information, news, and detailed search results,' it provides no explicit guidance on when to use this single-query tool versus 'search_engine_batch' (for multiple queries) or versus 'firecrawl_search'/'search_web'. No prerequisites or exclusions are stated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_engine_batch (grade A)

Run multiple search queries simultaneously. Returns JSON for Google, Markdown for Bing/Yandex.

Parameters
Name | Required | Description | Default
queries | Yes | - | -
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses critical behavioral differences in return formats ('Returns JSON for Google, Markdown for Bing/Yandex'). However, it omits other behavioral traits like rate limits, error handling for partial failures, or pagination behavior despite the cursor parameter.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of two efficient sentences. The first front-loads the action and batch nature; the second provides essential output format information. No sentences are wasted or redundant.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the 0% schema coverage and lack of output schema or annotations, the description covers the essential batch functionality and output formats. However, it leaves significant gaps regarding parameter semantics (engine selection, cursor usage) and operational constraints that would be necessary for complete understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, requiring the description to compensate. While it references 'multiple search queries' implying the queries array structure, it fails to document the engine enum options (google/bing/yandex), the cursor parameter's purpose for pagination, or the fact that queries is an array of objects with nested properties.
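
The nested structure described here can be sketched as follows. The exact schema is not shown in the listing, so the property names inside each query object (`query`, `engine`, `cursor`) are inferred from the review and should be treated as assumptions, as should the helper name `build_batch_arguments`.

```python
def build_batch_arguments(queries, default_engine="google"):
    """Turn strings or dicts into the assumed queries-array shape."""
    items = []
    for entry in queries:
        # Accept a bare string or a dict carrying per-query overrides.
        if isinstance(entry, str):
            entry = {"query": entry}
        item = {"query": entry["query"], "engine": entry.get("engine", default_engine)}
        if "cursor" in entry:
            item["cursor"] = entry["cursor"]
        items.append(item)
    return {"queries": items}


arguments = build_batch_arguments([
    "firecrawl pricing",
    {"query": "bright data serp api", "engine": "bing"},
])
print(arguments)
```

Per the tool description, results for the Google entry would come back as JSON and the Bing entry as Markdown, so a caller mixing engines needs to handle both formats.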

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the core action ('Run multiple search queries simultaneously'), identifies the resource (search queries), and distinguishes from single-query siblings via 'multiple' and 'simultaneously'. It also differentiates output formats by engine.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies batch usage through 'multiple' and 'simultaneously', suggesting when to use this over single-query alternatives like search_engine. However, it lacks explicit guidance on when NOT to use it or direct comparisons to siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_web (grade A)

Search the entire web for current information, news, articles, and websites. Use this when you need up-to-date information, want to find specific websites, research topics, or get the latest news. Ideal for answering questions about recent events, finding resources, or discovering relevant content.

Parameters
Name | Required | Description | Default
gl | No | Country code, e.g., 'dz' for Algeria | -
hl | No | Language code, e.g., 'zh-cn' for Simplified Chinese | -
num | No | Maximum number of search results to return, between 1-100 | -
tbs | No | Time-based search parameter, e.g., 'qdr:h' for past hour, can be qdr:h, qdr:d, qdr:w, qdr:m, qdr:y | -
query | Yes | Search terms or keywords to find relevant web content (e.g., 'climate change news 2024', 'best pizza recipe'). Can be a single query string or an array of queries for parallel search. | -
location | No | Location for search results, e.g., 'London', 'New York', 'Tokyo' | -
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It fails to disclose critical behavioral traits: result format (snippets vs full pages?), pagination behavior, rate limits, or whether this is a destructive operation (though implied safe). Does not mention what the tool returns since no output schema exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences with zero waste: first defines capability, second gives usage trigger, third specifies ideal scenarios. Information is front-loaded and appropriately sized for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a 6-parameter search tool, but gaps exist: no output schema is provided, yet the description fails to explain what gets returned (URL list, snippets, content?). Sibling differentiation is missing given the crowded search tool namespace.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema adequately documents all 6 parameters (query, num, tbs, hl, gl, location). The main description adds no parameter-specific context, relying entirely on the schema. Baseline 3 is appropriate when structured data does the heavy lifting.
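
Because the schema does carry the constraints here, a caller can check them before sending a request. The sketch below encodes the two constraints the parameter descriptions state (the 1-100 range for `num` and the five `tbs` values); the helper name is hypothetical, and whether the server enforces these limits itself is not stated in the listing.

```python
VALID_TBS = {"qdr:h", "qdr:d", "qdr:w", "qdr:m", "qdr:y"}  # values listed in the schema text


def build_search_web_arguments(query, num=None, tbs=None, gl=None, hl=None, location=None):
    """Assemble search_web arguments, checking the documented constraints."""
    if num is not None and not 1 <= num <= 100:
        raise ValueError("num must be between 1 and 100")
    if tbs is not None and tbs not in VALID_TBS:
        raise ValueError(f"tbs must be one of {sorted(VALID_TBS)}")
    arguments = {"query": query}
    optional = {"num": num, "tbs": tbs, "gl": gl, "hl": hl, "location": location}
    # Leave out anything the caller did not set.
    arguments.update({key: value for key, value in optional.items() if value is not None})
    return arguments


print(build_search_web_arguments("climate change news 2024", num=5, tbs="qdr:w", location="London"))
```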

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool searches the 'entire web' for current information, news, articles, and websites. However, it does not explicitly differentiate from siblings like 'search_engine', 'search_engine_batch', or 'firecrawl_search', leaving ambiguity about which search tool to use.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit positive guidance with 'Use this when you need up-to-date information...' and lists ideal scenarios (recent events, finding resources). Lacks negative guidance or explicit alternatives for when NOT to use this versus sibling search tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


Resources