Fetch JSONPath MCP

fetch-text

Extract text content from URLs using HTTP methods, converting HTML to Markdown format for readable output.

Instructions

Fetch text content from a URL using various HTTP methods. Defaults to converting HTML to Markdown format.

Input Schema

Name	Required	Description	Default
`url`	Yes	The URL to get text content from
`method`	No	HTTP method to use (GET, POST, PUT, DELETE, PATCH, etc.). Default is GET.	GET
`data`	No	Request body data for POST/PUT/PATCH requests. Can be a JSON object or string.
`headers`	No	Additional HTTP headers to include in the request
`output_format`	No	Output format: 'markdown' (default), 'clean_text', or 'raw_html'.	markdown

Implementation Reference

src/jsonrpc_mcp/server.py:215-225 (handler)

Handler logic for the 'fetch-text' tool within the @server.call_tool() dispatcher. Validates the URL argument and invokes fetch_url_content with as_json=False for text processing.

elif tool_name == "fetch-text":
    url = args.get("url")
    if not url or not isinstance(url, str):
        result = "Failed to call tool, error: Missing required property: url"
    else:
        method = args.get("method", "GET")
        data = args.get("data")
        headers = args.get("headers")
        output_format = args.get("output_format", "markdown")
        result = await fetch_url_content(url, as_json=False, method=method, data=data, headers=headers, output_format=output_format)
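
One detail of the branch above is easy to miss: a missing or non-string `url` is reported as a plain result string rather than raised as an exception. The following standalone sketch reproduces that behavior under stated assumptions. `handle_fetch_text` and `fake_fetch_url_content` are hypothetical names, and the fake fetcher stands in for the real `fetch_url_content` so the example runs without network access.

```python
import asyncio
from typing import Any


async def fake_fetch_url_content(url: str, **kwargs: Any) -> str:
    """Stand-in for fetch_url_content; it only echoes what it was asked to do."""
    return f"fetched {url} as {kwargs['output_format']} via {kwargs['method']}"


async def handle_fetch_text(args: dict[str, Any]) -> str:
    """Mirror the handler branch: validate url, then apply the defaults."""
    url = args.get("url")
    if not url or not isinstance(url, str):
        # The error is returned as text, not raised.
        return "Failed to call tool, error: Missing required property: url"
    return await fake_fetch_url_content(
        url,
        as_json=False,
        method=args.get("method", "GET"),
        data=args.get("data"),
        headers=args.get("headers"),
        output_format=args.get("output_format", "markdown"),
    )


missing = asyncio.run(handle_fetch_text({"method": "GET"}))
ok = asyncio.run(handle_fetch_text({"url": "https://example.com"}))
print(missing)
print(ok)
```

A caller that needs to distinguish failures from content would therefore have to inspect the returned text, at least for this validation path.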

src/jsonrpc_mcp/server.py:60-93 (registration)

Tool registration in @server.list_tools(), defining the name, description, and input schema for 'fetch-text'.

types.Tool(
    name="fetch-text",
    description="Fetch text content from a URL using various HTTP methods. Defaults to converting HTML to Markdown format.",
    inputSchema={
        "type": "object",
        "properties": {
            "url": {
                "type": "string",
                "description": "The URL to get text content from",
            },
            "method": {
                "type": "string",
                "description": "HTTP method to use (GET, POST, PUT, DELETE, PATCH, etc.). Default is GET.",
                "default": "GET"
            },
            "data": {
                "type": ["object", "string", "null"],
                "description": "Request body data for POST/PUT/PATCH requests. Can be a JSON object or string.",
            },
            "headers": {
                "type": "object",
                "description": "Additional HTTP headers to include in the request",
                "additionalProperties": {"type": "string"}
            },
            "output_format": {
                "type": "string",
                "description": "Output format: 'markdown' (default), 'clean_text', or 'raw_html'.",
                "enum": ["markdown", "clean_text", "raw_html"],
                "default": "markdown"
            }
        },
        "required": ["url"],
    },
),

src/jsonrpc_mcp/server.py:63-93 (schema)

Input schema definition for the 'fetch-text' tool, specifying parameters like url (required), method, data, headers, and output_format.

    inputSchema={
        "type": "object",
        "properties": {
            "url": {
                "type": "string",
                "description": "The URL to get text content from",
            },
            "method": {
                "type": "string",
                "description": "HTTP method to use (GET, POST, PUT, DELETE, PATCH, etc.). Default is GET.",
                "default": "GET"
            },
            "data": {
                "type": ["object", "string", "null"],
                "description": "Request body data for POST/PUT/PATCH requests. Can be a JSON object or string.",
            },
            "headers": {
                "type": "object",
                "description": "Additional HTTP headers to include in the request",
                "additionalProperties": {"type": "string"}
            },
            "output_format": {
                "type": "string",
                "description": "Output format: 'markdown' (default), 'clean_text', or 'raw_html'.",
                "enum": ["markdown", "clean_text", "raw_html"],
                "default": "markdown"
            }
        },
        "required": ["url"],
    },

src/jsonrpc_mcp/utils.py:232-343 (helper)

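Primary helper implementing the URL-fetching logic for the 'fetch-text' tool (called with as_json=False). Validates the URL, sends the request with httpx using the requested method, raises on HTTP error status codes, rejects responses larger than the configured maximum (10 MiB unless configured otherwise), and converts the response text via extract_text_content.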

async def fetch_url_content(
    url: str, 
    as_json: bool = True, 
    method: str = "GET", 
    data: dict | str | None = None,
    headers: dict[str, str] | None = None,
    output_format: str = "markdown"
) -> str:
    """
    Fetch content from a URL using different HTTP methods.
    
    Args:
        url: URL to fetch content from
        as_json: If True, validates content as JSON; if False, returns text content
        method: HTTP method (GET, POST, PUT, DELETE, etc.)
        data: Request body data (for POST/PUT requests)
        headers: Additional headers to include in the request
        output_format: If as_json=False, output format - "markdown", "clean_text", or "raw_html"
        
    Returns:
        String content from the URL (JSON, Markdown, clean text, or raw HTML)
        
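    Raises:
        httpx.RequestError: For network-related errors
        httpx.HTTPStatusError: If the response has an error status code
        json.JSONDecodeError: If as_json=True and content is not valid JSON
        ValueError: If URL is invalid or unsafe, or the response exceeds the size limit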
    """
    # Validate URL first
    validate_url(url)
    
    config = await get_http_client_config()
    max_size = config.pop("max_size", 10 * 1024 * 1024)  # Remove from client config
    
    # Merge additional headers with config headers (user headers override defaults)
    if headers:
        if config.get("headers"):
            config["headers"].update(headers)
        else:
            config["headers"] = headers
    
    async with httpx.AsyncClient(**config) as client:
        # Handle different HTTP methods
        method = method.upper()
        
        if method == "GET":
            response = await client.get(url)
        elif method == "POST":
            if isinstance(data, dict):
                response = await client.post(url, json=data)
            else:
                response = await client.post(url, content=data)
        elif method == "PUT":
            if isinstance(data, dict):
                response = await client.put(url, json=data)
            else:
                response = await client.put(url, content=data)
        elif method == "DELETE":
            response = await client.delete(url)
        elif method == "PATCH":
            if isinstance(data, dict):
                response = await client.patch(url, json=data)
            else:
                response = await client.patch(url, content=data)
        elif method == "HEAD":
            response = await client.head(url)
        elif method == "OPTIONS":
            response = await client.options(url)
        else:
            # For any other method, use the generic request method
            if isinstance(data, dict):
                response = await client.request(method, url, json=data)
            else:
                response = await client.request(method, url, content=data)
        
        response.raise_for_status()
        
        # Check response size
        content_length = len(response.content)
        if content_length > max_size:
            raise ValueError(f"Response size ({content_length} bytes) exceeds maximum allowed ({max_size} bytes)")
        
        if as_json:
            # For JSON responses, use response.text directly (no compression expected)
            content_to_parse = response.text
            if not content_to_parse:
                # If response.text is empty, try decoding content directly
                try:
                    content_to_parse = response.content.decode('utf-8')
                except UnicodeDecodeError:
                    content_to_parse = ""
            
            if content_to_parse:
                try:
                    json.loads(content_to_parse)
                    return content_to_parse
                except json.JSONDecodeError:
                    # If text parsing fails, try content decoding as fallback
                    if content_to_parse == response.text:
                        try:
                            fallback_content = response.content.decode('utf-8')
                            json.loads(fallback_content)
                            return fallback_content
                        except (json.JSONDecodeError, UnicodeDecodeError):
                            pass
                    raise json.JSONDecodeError("Response is not valid JSON", content_to_parse, 0)
            else:
                # Empty response
                return ""
        else:
            # For text content, apply format conversion
            return extract_text_content(response.text, output_format)

src/jsonrpc_mcp/utils.py:62-121 (helper)

Supporting utility for formatting fetched HTML content into markdown, clean text, or raw HTML. Invoked by fetch_url_content for 'fetch-text' tool.

def extract_text_content(html_content: str, output_format: str = "markdown") -> str:
    """
    Extract text content from HTML in different formats.
    
    Args:
        html_content: Raw HTML content
        output_format: Output format - "markdown" (default), "clean_text", or "raw_html"
    
    Returns:
        Extracted content in the specified format
    """
    if output_format == "raw_html":
        return html_content
    
    try:
        from markdownify import markdownify as md
        
        if output_format == "markdown":
            # Convert HTML to Markdown
            markdown_text = md(html_content, 
                               heading_style="ATX",  # Use # for headings
                               bullets="*",          # Use * for bullets
                               strip=["script", "style", "noscript"])
            
            # Clean up extra whitespace
            lines = (line.rstrip() for line in markdown_text.splitlines())
            markdown_text = '\n'.join(line for line in lines if line.strip() or not line)
            
            return markdown_text.strip()
            
        elif output_format == "clean_text":
            # Parse HTML with BeautifulSoup
            soup = BeautifulSoup(html_content, 'html.parser')
            
            # Remove script and style elements
            for script in soup(["script", "style", "noscript"]):
                script.decompose()
            
            # Get text content
            text = soup.get_text()
            
            # Break into lines and remove leading and trailing space on each
            lines = (line.strip() for line in text.splitlines())
            
            # Split each line on double spaces into separate phrases
            chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
            
            # Drop empty chunks and join the rest with single spaces
            text = ' '.join(chunk for chunk in chunks if chunk)
            
            return text
            
        else:
            # Unknown format, return raw HTML
            return html_content
            
    except Exception:
        # If processing fails, return original content
        return html_content
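
The whitespace handling in the `clean_text` branch can be tried on its own, without BeautifulSoup. The sketch below copies only the line and chunk logic into a hypothetical function named `collapse_whitespace` and applies it to a made-up sample string. It assumes the input is text that has already been extracted from HTML.

```python
def collapse_whitespace(text: str) -> str:
    """Apply the clean_text line/chunk normalization to already-extracted text."""
    # Strip leading and trailing whitespace from every line.
    lines = (line.strip() for line in text.splitlines())
    # Split each line on double spaces, then strip each resulting phrase.
    chunks = (phrase.strip() for line in lines for phrase in line.split("  "))
    # Drop empty chunks and join everything into a single space-separated line.
    return " ".join(chunk for chunk in chunks if chunk)


sample = "  Breaking   News  \n\n   Markets rally  today \n"
print(collapse_whitespace(sample))
```

Note that the result is a single line: paragraph breaks in the source page do not survive the `clean_text` format, which may matter when choosing between it and `markdown`.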

Tool Definition Quality

C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'various HTTP methods' and 'defaults to converting HTML to Markdown format,' which adds some context about functionality. However, it doesn't cover critical aspects like error handling, rate limits, authentication needs, or what happens with non-HTML content, leaving significant gaps for a tool that interacts with external URLs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that front-loads the core purpose. It avoids unnecessary details, though it could be slightly more structured by explicitly separating key points. Overall, it's concise with minimal waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, no annotations, no output schema), the description is incomplete. It lacks details on behavioral traits, error handling, and output specifics, which are crucial for a tool fetching content from URLs. The schema covers parameters well, but the description doesn't compensate for missing annotations or output schema, leaving the agent with insufficient context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds minimal value beyond the schema, mentioning 'various HTTP methods' and 'defaults to converting HTML to Markdown format,' which loosely relates to 'method' and 'output_format' parameters but doesn't provide additional semantics. Baseline 3 is appropriate as the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Fetch text content from a URL using various HTTP methods.' It specifies the resource (URL) and action (fetch text content), but doesn't explicitly differentiate from sibling tools like 'fetch-json' or 'batch-fetch-text' beyond mentioning 'text content' and 'Markdown format.' This makes it clear but not fully sibling-distinctive.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'fetch-json' or 'batch-fetch-text.' It mentions 'various HTTP methods' and 'defaults to converting HTML to Markdown format,' which implies some context, but lacks explicit when-to-use or when-not-to-use statements, leaving the agent to infer usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ackness/fetch-jsonpath-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.