Glama

scrape

Extract content from any webpage as markdown, LLM-optimized text, plain text, or JSON. Automatically bypasses bot protection and handles JavaScript rendering for reliable data collection.

Instructions

Scrape a single URL and extract its content as markdown, LLM-optimized text, plain text, or full JSON. Automatically falls back to the webclaw cloud API when bot protection or JS rendering is detected.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| browser | No | Browser profile: "chrome", "firefox", or "random" | "chrome" |
| exclude_selectors | No | CSS selectors to exclude from output | |
| format | No | Output format: "markdown", "llm", "text", or "json" | "markdown" |
| include_selectors | No | CSS selectors to include (only extract matching elements) | |
| only_main_content | No | If true, extract only the main content (article/main element) | |
| url | Yes | URL to scrape | |
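For concreteness, a hypothetical argument payload for this tool, following the schema above. The MCP call envelope is omitted, and the schema excerpt does not show whether the selector parameters take a single string or a list, so a list is assumed here:

```json
{
  "url": "https://example.com/article",
  "format": "llm",
  "only_main_content": true,
  "exclude_selectors": [".ad-banner", "nav"]
}
```

Omitting `browser` and `format` would fall back to their documented defaults ("chrome" and "markdown").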
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. Adds valuable operational context about automatic fallback to webclaw cloud API for bot protection and JS rendering, but omits rate limits, authentication requirements, timeout behavior, and error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero waste. First sentence establishes core functionality and output options; second provides critical implementation detail about fallback behavior. Well-structured and appropriately front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a 6-parameter tool with complete schema coverage. The description compensates for the missing output schema by detailing return-format options, though it could be strengthened with error-behavior or rate-limit documentation, given the web-scraping domain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage (baseline 3). Description adds semantic value by clarifying that 'llm' format means 'LLM-optimized text' and elaborating on format intentions beyond the schema's terse descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The clear, specific verb 'scrape' with the scope 'single URL' explicitly distinguishes this tool from siblings like 'batch' and 'crawl'. Listing the output formats (markdown, LLM-optimized text, plain text, JSON) clarifies its extraction capabilities.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implicitly differentiates from batch/crawl via 'single URL' phrasing, but lacks explicit when-to-use guidance versus siblings like 'extract' or 'map'. No mention of prerequisites such as URL accessibility or when to prefer local vs cloud fallback.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/0xMassi/webclaw'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.