
scrape_phones

Extract phone numbers from web pages with optional filters for type, area code, or keywords. Handles JavaScript-rendered content and batch processes up to 50 URLs.

Instructions

Call ping first

For a new session, or whenever you are unsure the extension is online, call ping first, then any scrape* tool. If you get EXTENSION_NOT_CONNECTED: call ping again, then repair the WebSocket connection using error.details.bridge, the MCP stderr output, and ~/.lionscraper/port, then retry.
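The retry workflow above can be sketched as follows. This is a hypothetical illustration: `call_tool` stands in for your MCP client's tool-call method, and the exact error envelope (`error.code`) is assumed from the EXTENSION_NOT_CONNECTED behavior described here, not taken from the server's schema.

```python
def scrape_with_ping(call_tool, urls, lang="en-US"):
    """Ping before scraping; retry the ping once if the extension bridge is down."""
    for attempt in range(2):
        pong = call_tool("ping", {"lang": lang})
        if not pong.get("error"):
            break  # extension is online; safe to scrape
        if pong["error"].get("code") == "EXTENSION_NOT_CONNECTED" and attempt == 0:
            # Diagnose via error.details.bridge, MCP stderr, and
            # ~/.lionscraper/port, then retry the ping once.
            continue
        return pong  # surface any other (or persistent) error to the caller
    return call_tool("scrape_phones", {"url": urls, "lang": lang})
```

The point of the sketch is the ordering: ping, optionally ping again after fixing the bridge, and only then issue the scrape* call.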

lang (optional)

Accepts en-US | zh-CN; controls the language of human-readable errors for this call. If omitted, errors are in English. Chinese-speaking users should pass lang: "zh-CN" on each call.

Do not substitute raw HTTP

When you need a real browser DOM, logged-in session cookies, JS-rendered content (SPAs), extension-side pagination or multi-URL scheduling, or structured field extraction, do not substitute WebFetch, curl, wget, or cookie-less IDE fetch tools for this server's ping + scrape*. Consider a plain HTTP client only when the page is fully public and mostly static and the user clearly wants a trivial GET of raw HTML.
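The guidance above reduces to a simple predicate. The sketch below is illustrative only; the page-property names are invented for the example and are not part of the server's API.

```python
def should_use_plain_http(page):
    """Return True only when the fallback to curl/WebFetch-style clients is allowed."""
    needs_browser = (
        page.get("js_rendered")     # SPAs, late-rendered DOM
        or page.get("needs_login")  # logged-in session cookies
        or page.get("paginated")    # extension-side pagination
        or page.get("multi_url")    # multi-URL scheduling
        or page.get("structured")   # structured field extraction
    )
    # Only a fully public, mostly static page wanting a trivial raw-HTML
    # GET may fall back to a plain HTTP client.
    return not needs_browser and page.get("public_static", False)
```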

Purpose

Extract phone numbers from HTML with simple type hints; optional filters.

When to use

Business or support pages when the user wants callable numbers.

Returns

MultiUrlResult; success data is { number, type }[].
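A consumer of that return value might flatten it like this. Note the per-URL envelope fields (`results`, `url`, `success`, `data`) are an assumption inferred from the MultiUrlResult name; the source only guarantees that success data is a list of { number, type } objects.

```python
def callable_numbers(result, wanted_type=None):
    """Collect phone numbers from successful entries of a MultiUrlResult-like dict."""
    numbers = []
    for item in result.get("results", []):
        if not item.get("success"):
            continue  # skip per-URL failures
        for entry in item.get("data", []):
            if wanted_type is None or entry.get("type") == wanted_type:
                numbers.append(entry["number"])
    return numbers
```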

Parameters

filter: optional type, areaCode, keyword, limit. For multiple URLs, see common params (min interval 500ms, concurrency cap 3 in extension).

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | One http(s) URL or string[] (batch). Max 50 URLs per request (extension-enforced). Server forwards as-is. | — |
| lang | No | BCP 47 tag for human-readable errors on this call: en-US \| zh-CN. Omitted → English. Pass zh-CN when the user works in Chinese; the server cannot infer chat language. | — |
| delay | No | Milliseconds to wait after load before extraction. Use for late-rendered DOM. | 0 |
| waitForScroll | No | Scroll the page (or a container) before extraction to trigger lazy-loaded content. Distinct from top-level scrollSpeed. | — |
| timeoutMs | No | Per-URL task timeout for the extension. Not the MCP WebSocket wait; see bridgeTimeoutMs. | 60000 |
| bridgeTimeoutMs | No | MCP Server only: max ms to wait for one tool call on the WebSocket bridge (capped). Omitted → derived from URL count, maxPages, timeoutMs, and scrapeInterval. Stripped before forwarding to the extension. | derived |
| includeHtml | No | If true, include document.documentElement.outerHTML in result meta for that URL. | — |
| includeText | No | If true, include document.body.innerText in result meta for that URL. | — |
| scrapeInterval | No | Ms between starting tasks in a multi-URL run. Omitted → extension default; extension enforces a 500 ms minimum. | extension default |
| concurrency | No | Parallel tabs for multi-URL runs. Omitted → extension default; extension enforces a maximum of 3. | extension default |
| scrollSpeed | No | Optional global scroll speed (px) for batch tuning; not the same as waitForScroll.scrollSpeed. | — |
| filter | No | Post-extraction filter for the phone list. | — |
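Putting the parameters together, a batch request might look like the following. Field names match the schema above; the concrete URLs and the filter values (e.g. "mobile") are illustrative examples, not values confirmed by the server's documentation.

```python
request = {
    "url": [
        "https://example.com/contact",
        "https://example.com/support",
    ],
    "lang": "zh-CN",           # Chinese error messages for this call
    "delay": 500,              # wait 500 ms after load for late-rendered DOM
    "scrapeInterval": 500,     # extension-enforced 500 ms minimum between tasks
    "concurrency": 3,          # extension-enforced maximum of 3 parallel tabs
    "timeoutMs": 60000,        # per-URL task timeout in the extension
    "filter": {
        "type": "mobile",      # hypothetical value; valid types are extension-defined
        "areaCode": "415",
        "limit": 10,
    },
}

# Sanity checks mirroring the extension-enforced limits
assert len(request["url"]) <= 50
assert request["scrapeInterval"] >= 500
assert request["concurrency"] <= 3
```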
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, description carries full burden and discloses: extension/WebSocket bridge architecture, rate limiting ('min interval 500ms', 'concurrency cap 3'), error handling ('EXTENSION_NOT_CONNECTED'), return structure ('MultiUrlResult... { number, type }[]'), and DOM requirements ('real browser DOM, logged-in session cookies, JS-rendered content'). Lacks explicit destructive/idempotency statement.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with headers (###), but it violates front-loading principles by opening with operational instructions (the ping prerequisite) before stating the tool's purpose. The lengthy HTTP-substitution warning, while valuable, consumes significant space before the core function description. The density is justified by the tool's complexity, but the organization could be tighter.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 12 parameters, nested objects, zero annotations, and no output schema, description achieves completeness: specifies return type structure (MultiUrlResult), explains extension-side enforcement (50 URL max, 500ms intervals), documents prerequisite error handling workflow, and distinguishes from raw HTTP alternatives.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (baseline 3). Description adds usage context beyond schema: instructs Chinese users to pass 'zh-CN' on each call, references filter sub-fields by name in prose, and explains infrastructure constraints ('extension-enforced', 'Server forwards as-is'). Elevates above baseline by clarifying expected values and operational semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly states specific verb 'Extract' with resource 'phone numbers' from source 'HTML' plus features 'simple type hints; optional filters'. Clearly distinguishes from siblings (scrape_article, scrape_emails, etc.) by specifying phone number extraction domain.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use ('Business or support pages when the user wants callable numbers'), explicit prerequisite workflow ('Call ping first... then any scrape*'), and clear anti-guidance contrasting with alternatives ('do not use WebFetch, curl, wget... Only if the page is fully public and mostly static... may you consider a plain HTTP client').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
