Glama

Extract Web Page HTML

web.scrape.extract
Read-only · Idempotent

Extract raw HTML from any URL for data extraction, content analysis, or price monitoring. Automatically handles anti-bot protection.

Instructions

⚡ ACTION: Extract raw HTML from any URL — cheapest web scraping API ($0.00013 for simple sites). Returns decoded HTML content, HTTP status code, and content length. Use for data extraction, content analysis, or price monitoring. Handles anti-bot protection automatically (Zyte)
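As a rough illustration of the quoted price, the sketch below estimates the cost of a batch of fetches. It assumes the $0.00013 figure is charged per request and applies only to what the provider calls "simple sites"; pricing for other sites is not stated here, and the function name `estimate_cost_usd` is hypothetical.

```python
from decimal import Decimal

# Quoted price for "simple sites"; treating it as a per-request charge is an assumption.
SIMPLE_SITE_PRICE_USD = Decimal("0.00013")


def estimate_cost_usd(num_requests: int,
                      price_per_request: Decimal = SIMPLE_SITE_PRICE_USD) -> Decimal:
    """Return the estimated total cost in USD for `num_requests` fetches."""
    if num_requests < 0:
        raise ValueError("num_requests must be non-negative")
    # Decimal avoids binary floating-point rounding on small currency amounts.
    return price_per_request * num_requests


if __name__ == "__main__":
    print(estimate_cost_usd(10_000))
```

Under these assumptions, 10,000 simple-site fetches would come to about $1.30.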

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| url | Yes | URL to scrape; returns raw HTML content. Fast and cheap ($0.00013 for simple sites). Example: "https://example.com" | |
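A minimal sketch of how a client might build a call to this tool, assuming the server speaks standard MCP JSON-RPC 2.0 and exposes the tool under the name `web.scrape.extract` shown above. The helper `build_extract_request` and the request id are illustrative, the URL check is an added safety assumption rather than part of the schema, and the transport (stdio or HTTP) is left out.

```python
import json
from urllib.parse import urlparse


def build_extract_request(url: str, request_id: int = 1) -> dict:
    """Build an MCP `tools/call` request for web.scrape.extract (hypothetical helper)."""
    parsed = urlparse(url)
    # The schema only says `url` is required; rejecting non-HTTP(S) values is an assumption.
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "web.scrape.extract",
            "arguments": {"url": url},  # the single required parameter
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_extract_request("https://example.com"), indent=2))
```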

Output Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| result | No | Tool response payload. Shape varies per tool; consult the tool description and inputSchema. May be an object, array, string, or number depending on the upstream provider response. | |
| error | No | Present only when the call failed. Includes error code, message, request_id, and any provider-specific extras. | |

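Since `result` and `error` are both optional, a caller has to check which one came back. The sketch below shows one way to unwrap that envelope. It assumes the payload has already been parsed into a Python dict and that `error` is an object keyed by `code`, `message`, and `request_id` as the table describes; the class and function names are hypothetical, and the shape of `result` is deliberately left opaque.

```python
class ToolCallError(Exception):
    """Raised when the tool response carries an `error` field (hypothetical class)."""

    def __init__(self, code, message, request_id=None, extras=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.request_id = request_id
        self.extras = extras or {}


def unwrap_tool_response(payload: dict):
    """Return `result` on success, or raise ToolCallError if `error` is present."""
    error = payload.get("error")
    if error is not None:
        known = {"code", "message", "request_id"}
        raise ToolCallError(
            code=error.get("code"),
            message=error.get("message"),
            request_id=error.get("request_id"),
            # Anything else is kept as provider-specific extras.
            extras={k: v for k, v in error.items() if k not in known},
        )
    # `result` may be an object, array, string, or number; pass it through unchanged.
    return payload.get("result")
```

Keeping `request_id` on the exception makes it easier to quote when reporting a failed call.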
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, non-destructive, idempotent behavior. The description adds valuable behavioral details: returns decoded HTML, HTTP status code, and content length; automatically handles anti-bot protection; and mentions pricing ($0.00013 for simple sites). No contradictions exist between description and annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loaded with a clear action emoji and purpose. Each sentence earns its place: action/cost, return values/use cases, and anti-bot feature. No redundant or extraneous information. Exceptionally concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (single parameter, output schema exists but not needed in description), the description covers all essential aspects: what it does, what it returns, use cases, and key features. An agent can confidently select and invoke this tool without further clarification. Complete for its complexity level.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage for the single parameter 'url', which already includes a description mentioning fast and cheap scraping with an example. The tool description adds no new semantic information beyond reinforcing the URL parameter. Under the scoring rules, high schema coverage sets the baseline at 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Extract raw HTML from any URL' with a specific verb and resource. It distinguishes itself from siblings like web.scrape.browser (dynamic scraping) and web.scrape.screenshot (screenshot capture) by focusing on raw HTML extraction. It also notes cheap pricing and automatic anti-bot protection, which differentiates it further.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides context for when to use: 'Use for data extraction, content analysis, or price monitoring.' It also highlights the cheap cost and anti-bot handling, implying suitability for simple static sites. However, it does not explicitly exclude use cases or mention alternatives like diffbot or browser-based tools. Still, the guidance is clear and contextually informative.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/whiteknightonhorse/APIbase'
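The same request can be sketched in Python with only the standard library. The URL is the one from the curl command above; the `Accept` header, the helper names, and the expectation that the endpoint returns JSON are assumptions, not something this page confirms.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
SERVER_URL = "https://glama.ai/api/mcp/v1/servers/whiteknightonhorse/APIbase"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request without sending it (hypothetical helper)."""
    return urllib.request.Request(
        url,
        method="GET",
        headers={"Accept": "application/json"},  # assumed; curl sends no such header by default
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0):
    """Send the request and decode the body, assuming the response is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a network call):
#   info = fetch_server_info()
#   print(json.dumps(info, indent=2))
```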

If you have feedback or need assistance with the MCP directory API, please join our Discord server.