fetch
Convert web page HTML content to clean markdown format for easier reading and processing. Use this tool to extract and transform webpage text from any URL.
Instructions
Scrape the HTML content of a web page and return it as markdown using the Jina API.
Args:
url: The URL of the web page to fetch
Returns:
text: The page content converted to markdown
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | | |
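As a minimal sketch of how a client might supply the single required `url` argument, the snippet below builds a JSON-RPC 2.0 `tools/call` request of the kind MCP clients send. The helper name `build_fetch_call` and the example URL are assumptions made for illustration; they are not part of the server's code.

```python
import json


def build_fetch_call(url: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the `fetch` tool.

    Hypothetical helper: it only assembles the payload and sends nothing.
    """
    if not isinstance(url, str) or not url.strip():
        raise ValueError("url must be a non-empty string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # `url` is the only argument listed in the input schema.
        "params": {"name": "fetch", "arguments": {"url": url}},
    }


# Placeholder URL, used only to show the payload shape.
request = build_fetch_call("https://example.com")
print(json.dumps(request, indent=2))
```

An MCP client library would normally assemble this payload itself; writing it out by hand only makes the shape of the `arguments` object explicit.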
Implementation Reference
- main.py:155-170 (handler) The handler function for the 'fetch' MCP tool. Validates the input URL, fetches content using the helper fetch_url, and returns markdown text.

      async def fetch(url: str):
          """
          Scrape the HTML content of a web page and return it as markdown using the Jina API.

          Args:
              url: The URL of the web page to fetch
          Returns:
              text: The page content converted to markdown
          """
          # Reject non-string and blank input before making any request.
          if not isinstance(url, str) or not url.strip():
              raise ValueError("url must be a non-empty string")
          text = await fetch_url(url)
          return text
- main.py:154-154 (registration) MCP decorator registering the 'fetch' function as a tool.

      @mcp.tool()
- main.py:156-164 (schema) Docstring providing the input (url: str) and output (markdown text) schema for the fetch tool.

      """
      Scrape the HTML content of a web page and return it as markdown using the Jina API.

      Args:
          url: The URL of the web page to fetch
      Returns:
          text: The page content converted to markdown
      """
- main.py:59-80 (helper) Helper function implementing the core fetching logic: uses Jina AI for HTML-to-markdown conversion and, if that request times out, falls back to fetching the raw HTML and extracting its text.

      import httpx
      from bs4 import BeautifulSoup

      async def fetch_url(url: str):
          jina_timeout = 15.0
          raw_html_timeout = 5.0
          # Keep the original URL so the fallback can request the page directly.
          jina_url = f"https://r.jina.ai/{url}"
          async with httpx.AsyncClient() as client:
              try:
                  print(f"fetching result from\n{jina_url}")
                  # Use the Jina API to convert the HTML to markdown.
                  response = await client.get(jina_url, timeout=jina_timeout)
                  return response.text
              except httpx.TimeoutException:
                  try:
                      print("Jina API timed out, fetching raw HTML...")
                      # Fall back to the raw HTML and extract its plain text.
                      response = await client.get(url, timeout=raw_html_timeout)
                      soup = BeautifulSoup(response.text, "html.parser")
                      return soup.get_text()
                  except httpx.TimeoutException:
                      return "Timeout error"
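The timeout fallback described for the helper can be tried without network access. The sketch below is not the server's code: it swaps httpx's per-request timeout for `asyncio.wait_for` and uses stand-in coroutines, so `try_in_order`, `fake_jina`, and `fake_raw` are hypothetical names chosen for illustration. It shows only the control flow: try the primary source, move to the next one on timeout, and return the "Timeout error" string if every attempt times out.

```python
import asyncio
from typing import Awaitable, Callable, Sequence, Tuple

# One attempt = a coroutine factory plus the timeout (in seconds) allowed for it.
Attempt = Tuple[Callable[[], Awaitable[str]], float]


async def try_in_order(attempts: Sequence[Attempt]) -> str:
    """Return the first result that arrives within its timeout."""
    for make_request, timeout in attempts:
        try:
            return await asyncio.wait_for(make_request(), timeout)
        except asyncio.TimeoutError:
            # This source was too slow; fall through to the next one.
            continue
    return "Timeout error"


async def fake_jina() -> str:
    # Stand-in for a slow markdown conversion request.
    await asyncio.sleep(0.2)
    return "# markdown"


async def fake_raw() -> str:
    # Stand-in for a fast raw-HTML request.
    return "plain text"


# The slow primary times out, so the fallback result is returned.
result = asyncio.run(try_in_order([(fake_jina, 0.05), (fake_raw, 0.05)]))
# Every attempt times out, so the sentinel string is returned.
all_slow = asyncio.run(try_in_order([(fake_jina, 0.05), (fake_jina, 0.05)]))
print(result, "|", all_slow)
```

Returning a sentinel string mirrors the helper's behavior; a caller that needs to tell a timeout apart from real page text may prefer raising an exception instead.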