mcp-server-requests

by coucya

fetch

Retrieve web page content from any URL and process it into raw HTML, cleaned HTML, or readable Markdown format for analysis and integration.

Instructions

Fetch web page content

Function/Features:

  • Retrieves web page content from any HTTP/HTTPS URL

Args:

  • url (str): The URL to fetch content from.

  • return_content ('raw' | 'basic_clean' | 'strict_clean' | 'markdown', optional): Processing format for HTML content. Defaults to "markdown".

      - "raw": Returns unmodified HTML content with full response headers
      - "basic_clean": Removes non-displaying tags (script, style, meta, etc.) while preserving structure
      - "strict_clean": Removes non-displaying tags and most HTML attributes, keeping only essential structure
      - "markdown": Converts HTML content to clean, readable Markdown format

Examples:

    // Returns content as markdown
    fetch({url: "https://example.com"})

    // Returns raw HTML content
    fetch({url: "https://api.example.com/data", return_content: "raw"})

Input Schema

Name            Required  Description                                                            Default
url             Yes       (require) The URL to fetch content from
return_content  No        (optional, Defaults to "markdown") processing format for HTML content  markdown
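
As a minimal sketch of how a caller might assemble arguments that satisfy this schema, the snippet below builds the argument object for a `fetch` call. The helper name `build_fetch_arguments` and the client-side checks are assumptions made for illustration; they are not part of mcp-server-requests, and the server may validate its input differently.

```python
from typing import Literal, get_args

# The four values the schema lists for return_content.
ReturnContent = Literal["raw", "basic_clean", "strict_clean", "markdown"]


def build_fetch_arguments(url: str, return_content: str = "markdown") -> dict:
    """Build the arguments object for a 'fetch' tool call (hypothetical helper)."""
    # Assumption: reject non-HTTP(S) URLs early, since the tool documents HTTP/HTTPS only.
    if not url.startswith(("http://", "https://")):
        raise ValueError(f"expected an HTTP/HTTPS URL, got {url!r}")
    if return_content not in get_args(ReturnContent):
        raise ValueError(f"unsupported return_content: {return_content!r}")
    return {"url": url, "return_content": return_content}
```

Calling `build_fetch_arguments("https://example.com")` fills in the documented default, so the result carries `return_content` set to `"markdown"`.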

Output Schema

Name    Required  Description  Default
result  Yes

Implementation Reference

  • The 'fetch' tool handler function, registered via @mcp.tool(). It performs an HTTP GET request to the specified URL and returns the content in the requested format (raw HTML, cleaned HTML, or Markdown). The input schema is declared through Annotated types, and the function carries a comprehensive docstring.
    @mcp.tool()
    def fetch(
        url: Annotated[str, "(require) The URL to fetch content from"],
        return_content: Annotated[Literal['raw', 'basic_clean', 'strict_clean', 'markdown'], "(optional, Defaults to \"markdown\") processing format for HTML content"] = "markdown",
    ) -> str:
        """Fetch web page content
    
        Function/Features:
        - Retrieves web page content from any HTTP/HTTPS URL
    
        Args:
            url (str): The URL to fetch content from.
            return_content ('raw' | 'basic_clean' | 'strict_clean' | 'markdown', optional):
                Processing format for HTML content. Defaults to "markdown".
                - "raw": Returns unmodified HTML content with full response headers
                - "basic_clean": Removes non-displaying tags (script, style, meta, etc.) while preserving structure
                - "strict_clean": Removes non-displaying tags and most HTML attributes, keeping only essential structure
                - "markdown": Converts HTML content to clean, readable Markdown format
    
        Examples:
            // Returns content as markdown
            fetch({url: "https://example.com"})
    
            // Returns raw HTML content
            fetch({url: "https://api.example.com/data", return_content: "raw"})  
        """
        return mcp_http_request("GET", url, return_content=return_content, user_agent=ua, force_user_agnet=ua_force, format_headers=False)
  • Helper function that executes the actual HTTP request, then formats either the response or, if an exception is raised, the error. Called by the 'fetch' tool.
    def mcp_http_request(
        method: str,
        url: str,
        *,
        query: Optional[dict] = None,
        data: Optional[str | bytes | bytearray] = None,
        json: Optional[dict] = None,
        headers: Optional[dict] = None,
        user_agent: Optional[str] = None,
        force_user_agnet: Optional[bool] = None,
        format_status: bool = True,
        format_headers: bool = True,
        return_content: Literal['raw', 'basic_clean', 'strict_clean', 'markdown'] = "raw",
    ) -> str:
        hs = {}
    
        if headers:
            hs.update(headers)
    
        if force_user_agnet:
            if user_agent:
                hs["User-Agent"] = user_agent
        else:
            if "User-Agent" not in hs and user_agent:
                hs["User-Agent"] = user_agent
    
        try:
            response = http_request(
                method, url,
                query=query,
                headers=hs,
                data=data,
                json_=json
            )
    
            return format_response_result(
                response,
                format_status=format_status,
                format_headers=format_headers,
                return_content=return_content
            )
        except Exception as e:
            return format_error_result(e)
  • Helper function that formats the HTTP response content according to the specified return_content type, handling HTML cleaning and markdown conversion. Used by mcp_http_request which is called by 'fetch'.
    def format_response_result(
        response: Response,
        *,
        format_status: bool | None = None,
        format_headers: bool | None = None,
        return_content: Literal["raw", "basic_clean", "strict_clean", "markdown"] = "raw",
    ) -> str:
        http_version = response.version
        status = response.status_code
        reason = response.reason
        headers = response.headers
        content = response.content
        content_type = response.content_type
    
        if not isinstance(content_type, str):
            content_type = 'application/octet-stream'
    
        if content_type.startswith("text/") or content_type.startswith("application/json"):
            try:
                if isinstance(content, (bytes, bytearray)):
                    content = content.decode('utf-8')
                else:
                    content = str(content)
            except UnicodeDecodeError as e:
                err_message = f"response content type is \"{content_type}\", but not utf-8 encoded"
                raise ResponseError(response, err_message) from e
            except Exception as e:
                err_message = f"response content type is \"{content_type}\", but cannot be converted to a string"
                raise ResponseError(response, err_message) from e
        else:
            err_message = f'response content type is "{content_type}", cannot be converted to a string'
            raise ResponseError(response, err_message)
    
        if content_type.startswith("text/html"):
            if return_content == "raw":
                pass
            elif return_content == "basic_clean":
                content = clean_html(content, allowed_attrs=True)
            elif return_content == "strict_clean":
                content = clean_html(content, allowed_attrs=("id", "src", "href"))
            elif return_content == "markdown":
                content = html_to_markdown(content)
    
        strs = []
    
        if format_status:
            strs.append(f"{http_version} {status} {reason}\r\n")
        if format_headers:
            response_header_str = "\r\n".join(f"{k}: {v}" for k, v in headers)
            strs.append(response_header_str)
        if len(strs) > 0:
            strs.append("\r\n\r\n")
        strs.append(content)
    
        return "\r\n".join(strs)
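
Two details of the helpers above are easy to misread: how the User-Agent is chosen, and how the final string is assembled. The following is a standalone paraphrase of just those two pieces, not the library's code. The names `resolve_headers` and `assemble` are hypothetical, and the status line and header pairs are made-up sample values.

```python
from typing import Iterable, Optional, Tuple


def resolve_headers(headers: Optional[dict], user_agent: Optional[str],
                    force: Optional[bool]) -> dict:
    """Mirror the User-Agent precedence in mcp_http_request (paraphrase)."""
    hs = dict(headers or {})
    # A configured user agent wins when forced; otherwise it only fills a gap.
    if user_agent and (force or "User-Agent" not in hs):
        hs["User-Agent"] = user_agent
    return hs


def assemble(status_line: str, header_pairs: Iterable[Tuple[str, str]], content: str,
             *, format_status: bool = True, format_headers: bool = True) -> str:
    """Mirror the string assembly at the end of format_response_result (paraphrase)."""
    strs = []
    if format_status:
        strs.append(f"{status_line}\r\n")
    if format_headers:
        strs.append("\r\n".join(f"{k}: {v}" for k, v in header_pairs))
    if len(strs) > 0:
        strs.append("\r\n\r\n")
    strs.append(content)
    # Every part is joined with CRLF, on top of the CRLFs already in the parts.
    return "\r\n".join(strs)


# 'fetch' passes format_headers=False and leaves format_status at its default of True.
out = assemble("HTTP/1.1 200 OK", [], "body", format_headers=False)
```

Under these assumptions, `out` is the status line, five CRLF sequences, and then the body, so a consumer that parses the result should tolerate several blank lines before the content.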
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It clearly describes what the tool does (fetching and processing web content) and includes important details about the different processing formats available. However, it doesn't mention potential behavioral aspects like rate limits, authentication requirements, error handling, timeout behavior, or what happens with non-HTML content.
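
Although the description is silent on non-HTML content, the implementation reference above suggests an answer. The sketch below restates the content-type branching of `format_response_result` as a small decision function. The function name `classify_content_type` and its return labels are invented for illustration. Since `mcp_http_request` wraps the call in `except Exception`, the "error" case presumably surfaces as a formatted error string rather than a raised exception, assuming `ResponseError` derives from `Exception`.

```python
def classify_content_type(content_type, return_content="markdown"):
    """Return which branch a response would take (hypothetical labels).

    'error'       -> not text/* or application/json; the helper raises ResponseError
    'passthrough' -> decoded text returned without HTML processing
    otherwise     -> the name of the HTML processing mode that is applied
    """
    # A missing or non-string content type is treated as binary.
    if not isinstance(content_type, str):
        content_type = "application/octet-stream"
    if not (content_type.startswith("text/") or content_type.startswith("application/json")):
        return "error"
    # Cleaning and Markdown conversion apply to text/html responses only.
    if content_type.startswith("text/html") and return_content != "raw":
        return return_content
    return "passthrough"
```

For example, `classify_content_type("application/json")` yields `"passthrough"`: with the default `"markdown"` setting, a JSON response is still returned as plain decoded text, while `classify_content_type("image/png")` yields `"error"`.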

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Function/Features, Args, Examples) and front-loads the core purpose. Every sentence adds value: the opening statement establishes purpose, the features section clarifies scope, the args section provides parameter context, and the examples demonstrate practical usage. There's no wasted text or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given that an output schema exists (so return values don't need explanation in the description), the description provides good coverage of the tool's functionality. It explains what the tool does, documents the parameters meaningfully, and includes helpful examples. The main gap is the lack of behavioral context around error conditions, performance characteristics, or limitations that would be important for a web fetching tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already documents both parameters thoroughly. The description adds meaningful context by explaining the semantic differences between the four 'return_content' options with clear definitions of what each format does, which goes beyond the enum values listed in the schema. This helps the agent understand when to choose each processing option.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Fetch web page content') and resource ('from any HTTP/HTTPS URL'), making the purpose immediately apparent. It distinguishes itself from sibling tools like 'fetch_to_file' and 'http_request' by focusing specifically on retrieving and processing web content rather than saving to files or making general HTTP requests.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool (retrieving web page content from URLs) and includes examples that demonstrate different use cases. However, it doesn't explicitly state when NOT to use this tool or provide direct comparisons with sibling alternatives like 'fetch_to_file' or 'http_request'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/coucya/mcp-server-requests'

If you have feedback or need assistance with the MCP directory API, please join our Discord server