Glama

s_fetch_page

Fetch web page content with pagination support and bot-detection avoidance. Retrieve website data in HTML or markdown format with configurable modes for different complexity levels.

Instructions

Fetches a complete web page with pagination support. Retrieves content from websites with bot-detection avoidance. For best performance, start with 'basic' mode (fastest), then only escalate to 'stealth' or 'max-stealth' modes if basic mode fails. Content is returned as 'METADATA: {json}\n\n[content]' where metadata includes length information and truncation status.

Args:

- `url`: URL to fetch
- `mode`: Fetching mode (`basic`, `stealth`, or `max-stealth`)
- `format`: Output format (`html` or `markdown`)
- `max_length`: Maximum number of characters to return
- `start_index`: Return output starting at this character index; useful if a previous fetch was truncated and more content is required
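Since responses follow the `'METADATA: {json}\n\n[content]'` shape described above, a client can split off the metadata header to decide whether another paginated fetch is needed. The sketch below assumes the metadata JSON carries fields like `total_length`, `returned_length`, `is_truncated`, and `start_index`; the exact key names are illustrative, so inspect a real response before relying on them.

```python
import json

def parse_fetch_result(result: str) -> tuple[dict, str]:
    """Split a 'METADATA: {json}\\n\\n[content]' response into
    its metadata dict and the page content."""
    header, _, content = result.partition("\n\n")
    assert header.startswith("METADATA: ")
    metadata = json.loads(header[len("METADATA: "):])
    return metadata, content

# Example response shaped like the documented format (field names here
# are assumptions, not confirmed output of the tool).
sample = (
    'METADATA: {"total_length": 12000, "returned_length": 5000, '
    '"is_truncated": true, "start_index": 0}\n\n<page content...>'
)
meta, body = parse_fetch_result(sample)
if meta["is_truncated"]:
    # Pass this as start_index on the next s_fetch_page call.
    next_start = meta["start_index"] + meta["returned_length"]
```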

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | URL to fetch | |
| mode | No | Fetching mode: basic, stealth, or max-stealth | basic |
| format | No | Output format: html or markdown | markdown |
| max_length | No | Maximum number of characters to return | 5000 |
| start_index | No | Character index at which to start output | 0 |

Implementation Reference

  • The main handler and registration point for the 's_fetch_page' tool using @mcp.tool() decorator. Defines input parameters (schema) and delegates execution to the core implementation.
```python
@mcp.tool()
async def s_fetch_page(
    url: str,
    mode: str = "basic",
    format: str = "markdown",
    max_length: int = 5000,
    start_index: int = 0,
) -> str:
    """Fetches a complete web page with pagination support.

    Retrieves content from websites with bot-detection avoidance. For best
    performance, start with 'basic' mode (fastest), then only escalate to
    'stealth' or 'max-stealth' modes if basic mode fails. Content is
    returned as 'METADATA: {json}\\n\\n[content]' where metadata includes
    length information and truncation status.

    Args:
        url: URL to fetch
        mode: Fetching mode (basic, stealth, or max-stealth)
        format: Output format (html or markdown)
        max_length: Maximum number of characters to return.
        start_index: Return output starting at this character index, useful
            if a previous fetch was truncated and more content is required.
    """
    try:
        result = await fetch_page_impl(url, mode, format, max_length, start_index)
        return result
    except Exception as e:
        logger = getLogger("scrapling_fetch_mcp")
        logger.error("DETAILED ERROR IN s_fetch_page: %s", str(e))
        logger.error("TRACEBACK: %s", format_exc())
        raise
```
  • Core implementation of the fetching logic delegated by the handler. Fetches the page using browse_url, optionally converts to markdown, applies truncation and pagination, and formats output with metadata.
```python
async def fetch_page_impl(
    url: str, mode: str, format: str, max_length: int, start_index: int
) -> str:
    page = await browse_url(url, mode)
    is_markdown = format == "markdown"
    full_content = (
        _html_to_markdown(page.html_content) if is_markdown else page.html_content
    )
    total_length = len(full_content)
    truncated_content = full_content[start_index : start_index + max_length]
    is_truncated = total_length > (start_index + max_length)
    metadata_json = _create_metadata(
        total_length, len(truncated_content), is_truncated, start_index
    )
    return f"METADATA: {metadata_json}\n\n{truncated_content}"
```


MCP directory API

We provide all the information about MCP servers via our MCP API.

```shell
curl -X GET 'https://glama.ai/api/mcp/v1/servers/cyberchitta/scrapling-fetch-mcp'
```

If you have feedback or need assistance with the MCP directory API, please join our Discord server.