s_fetch_page

Fetch web page content with pagination support and bot-detection avoidance. Retrieve website data in HTML or markdown format with configurable modes for different complexity levels.

Instructions

Fetches a complete web page with pagination support. Retrieves content from websites with bot-detection avoidance. For best performance, start with 'basic' mode (fastest), then only escalate to 'stealth' or 'max-stealth' modes if basic mode fails. Content is returned as 'METADATA: {json}\n\n[content]' where metadata includes length information and truncation status.

Args:
    url: URL to fetch
    mode: Fetching mode (basic, stealth, or max-stealth)
    format: Output format (html or markdown)
    max_length: Maximum number of characters to return.
    start_index: On return output starting at this character index, useful if a previous fetch was truncated and more content is required.

Input Schema

Name	Required	Default
`url`	Yes
`mode`	No	basic
`format`	No	markdown
`max_length`	No	5000
`start_index`	No	0

Implementation Reference

src/scrapling_fetch_mcp/mcp.py:14-38 (handler)
The main handler and registration point for the 's_fetch_page' tool using @mcp.tool() decorator. Defines input parameters (schema) and delegates execution to the core implementation.
@mcp.tool()
async def s_fetch_page(
    url: str,
    mode: str = "basic",
    format: str = "markdown",
    max_length: int = 5000,
    start_index: int = 0,
) -> str:
    """Fetches a complete web page with pagination support. Retrieves content from websites with bot-detection avoidance.
    For best performance, start with 'basic' mode (fastest), then only escalate to 'stealth' or 'max-stealth' modes if basic mode fails.

    Content is returned as 'METADATA: {json}\\n\\n[content]' where metadata includes length information and truncation status.

    Args:
        url: URL to fetch
        mode: Fetching mode (basic, stealth, or max-stealth)
        format: Output format (html or markdown)
        max_length: Maximum number of characters to return.
        start_index: On return output starting at this character index, useful if a previous fetch was truncated and more content is required.
    """
    try:
        result = await fetch_page_impl(url, mode, format, max_length, start_index)
        return result
    except Exception as e:
        logger = getLogger("scrapling_fetch_mcp")
        logger.error("DETAILED ERROR IN s_fetch_page: %s", str(e))
        logger.error("TRACEBACK: %s", format_exc())
        raise
src/scrapling_fetch_mcp/_fetcher.py:75-91 (helper)
Core implementation of the fetching logic delegated by the handler. Fetches the page using browse_url, optionally converts to markdown, applies truncation and pagination, and formats output with metadata.
async def fetch_page_impl(
    url: str, mode: str, format: str, max_length: int, start_index: int
) -> str:
    page = await browse_url(url, mode)
    is_markdown = format == "markdown"
    full_content = (
        _html_to_markdown(page.html_content) if is_markdown else page.html_content
    )

    total_length = len(full_content)
    truncated_content = full_content[start_index : start_index + max_length]
    is_truncated = total_length > (start_index + max_length)

    metadata_json = _create_metadata(
        total_length, len(truncated_content), is_truncated, start_index
    )
    return f"METADATA: {metadata_json}\n\n{truncated_content}"
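The truncation arithmetic in `fetch_page_impl` can be checked in isolation. The sketch below reproduces only the slicing and the `is_truncated` comparison from the helper above; `paginate` is a hypothetical name, and the 12,000-character string is made-up sample data.

```python
from typing import Tuple


def paginate(full_content: str, start_index: int, max_length: int) -> Tuple[str, bool]:
    """Return (chunk, is_truncated) using the same slicing as fetch_page_impl."""
    total_length = len(full_content)
    chunk = full_content[start_index:start_index + max_length]
    is_truncated = total_length > (start_index + max_length)
    return chunk, is_truncated


if __name__ == "__main__":
    text = "x" * 12000  # sample content, 12,000 characters long
    for start in (0, 5000, 10000):
        chunk, more = paginate(text, start, 5000)
        print(start, len(chunk), more)
```

With the default `max_length` of 5000, the first two windows are full and flagged as truncated, and the third window returns the remaining 2,000 characters with the flag cleared.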
