
URL Text Fetcher MCP Server

by billallison

fetch_page_links

Extract all links from a specified webpage to analyze or navigate its content. Given a URL, the tool returns the hyperlinks found on the page (absolute http(s) links and root-relative paths, capped at the first 100) for further processing or research.

Instructions

Return a list of all links on the page.

Args:
    url: The URL to fetch links from
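
For context, an MCP client invokes this tool through a standard tools/call request. The snippet below is a minimal sketch; the example.com URL and the sample links in the comment are illustrative, not output from a real run.

    # Minimal sketch of an MCP tools/call request for this tool.
    # The target URL and the sample response below are illustrative only.
    request = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "fetch_page_links",
            "arguments": {"url": "https://example.com"},
        },
    }

    # The handler returns a single text block shaped like:
    #
    #   Links found on https://example.com (2 total, showing first 100):
    #
    #   - https://www.iana.org/domains/example
    #   - /about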

Input Schema

Name    Required    Description                     Default
url     Yes         The URL to fetch links from     —
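
FastMCP derives the tool's input schema from the Python type hints, so clients typically see a JSON Schema equivalent to the sketch below (reconstructed here from the url: str parameter, not copied from the server).

    # Approximate JSON Schema for the tool input, reconstructed from the
    # `url: str` parameter; the server's exact output may differ.
    input_schema = {
        "type": "object",
        "properties": {
            "url": {
                "type": "string",
                "description": "The URL to fetch links from",
            }
        },
        "required": ["url"],
    }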

Implementation Reference

  • Primary handler implementation using the @mcp.tool() decorator. Fetches the webpage with safety and size checks, parses the HTML with BeautifulSoup, extracts and filters links from <a> tags, limits output to the first 100, and formats the response. (The sanitize_url and is_safe_url helpers it relies on are not shown in the reference; a sketch follows this list.)
    @mcp.tool()
    async def fetch_page_links(url: str) -> str:
        """Return a list of all links on the page.

        Args:
            url: The URL to fetch links from
        """
        # Sanitize URL input
        url = sanitize_url(url)
        if not url:
            return "Error: Invalid URL format"

        # Validate URL safety
        if not is_safe_url(url):
            logger.warning(f"Blocked unsafe URL for link fetching: {url}")
            return "Error: URL not allowed for security reasons"

        try:
            logger.info(f"Fetching page links: {url}")
            resp = requests.get(url, headers=HEADERS, timeout=REQUEST_TIMEOUT, stream=True)
            resp.raise_for_status()

            # Check content length
            content_length = resp.headers.get('Content-Length')
            if content_length and int(content_length) > MAX_RESPONSE_SIZE:
                return f"Error: Page too large ({content_length} bytes)"

            # Read content with size limit
            content_chunks = []
            total_size = 0
            for chunk in resp.iter_content(chunk_size=8192, decode_unicode=True):
                if chunk:
                    total_size += len(chunk)
                    if total_size > MAX_RESPONSE_SIZE:
                        return "Error: Page content too large"
                    content_chunks.append(chunk)

            html_content = ''.join(content_chunks)
            soup = BeautifulSoup(html_content, "html.parser")
            links = [a.get('href') for a in soup.find_all('a', href=True) if a.get('href')]

            # Filter and clean links
            valid_links = []
            for link in links:
                if link.startswith(('http://', 'https://', '/')):
                    valid_links.append(link)

            links_text = "\n".join(f"- {link}" for link in valid_links[:100])  # Limit to 100 links
            return f"Links found on {url} ({len(valid_links)} total, showing first 100):\n\n{links_text}"
        except requests.RequestException as e:
            logger.error(f"Request failed for {url}: {e}")
            return "Error: Unable to fetch page"
        except Exception as e:
            logger.error(f"Unexpected error fetching links from {url}: {e}", exc_info=True)
            return "Error: Unable to process page"
  • Alternative (non-async) handler implementation with a Pydantic Field schema for input validation, from the fastmcp variant. Only the signature and docstring differ; the body is identical to the async handler above (fetch, parse, extract, filter, and format links).
    @mcp.tool()
    def fetch_page_links(url: str = Field(description="The URL to fetch links from")) -> str:
        """Return a list of all links on the page"""
        # ... body identical to the async handler above ...
  • Pydantic Field-based input schema definition for the tool parameter in the fastmcp variant.
    def fetch_page_links(url: str = Field(description="The URL to fetch links from")) -> str:
  • FastMCP tool registration decorator for the fetch_page_links handler (minimal server wiring around this decorator is sketched after this list).
    @mcp.tool()
  • FastMCP tool registration decorator for the fetch_page_links handler in fastmcp variant.
    @mcp.tool()
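
Both handlers depend on sanitize_url, is_safe_url, and the HEADERS, REQUEST_TIMEOUT, and MAX_RESPONSE_SIZE constants, none of which appear in the reference above. The following is a minimal sketch of what such helpers might look like; the server's actual implementations (for example, which address ranges they block or which size limits they use) may differ.

    # Hypothetical sketch of the helpers the handlers above rely on;
    # the real implementations in the server may differ.
    import ipaddress
    import socket
    from urllib.parse import urlparse

    HEADERS = {"User-Agent": "url-text-fetcher-mcp/1.0"}   # assumed value
    REQUEST_TIMEOUT = 10            # seconds, assumed value
    MAX_RESPONSE_SIZE = 5_000_000   # bytes, assumed value

    def sanitize_url(url: str) -> str:
        """Trim whitespace and require an http(s) URL with a host; return '' if invalid."""
        url = (url or "").strip()
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            return ""
        return url

    def is_safe_url(url: str) -> bool:
        """Reject URLs whose host resolves to a private, loopback, or link-local address (basic SSRF guard)."""
        host = urlparse(url).hostname
        if not host:
            return False
        try:
            addr = ipaddress.ip_address(socket.gethostbyname(host))
        except (socket.gaierror, ValueError):
            return False
        return not (addr.is_private or addr.is_loopback or addr.is_link_local)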

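The @mcp.tool() decorator only registers the handler once a FastMCP server instance exists. Below is a minimal sketch of that wiring, assuming the official MCP Python SDK and a server name of "url-text-fetcher"; both are assumptions, not confirmed by this page.

    # Hypothetical wiring around the handler; the server name and module layout are assumptions.
    from mcp.server.fastmcp import FastMCP

    mcp = FastMCP("url-text-fetcher")

    @mcp.tool()
    def fetch_page_links(url: str) -> str:
        """Return a list of all links on the page."""
        ...  # handler body as shown in the Implementation Reference above

    if __name__ == "__main__":
        mcp.run()  # serves the tool over stdio by default
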

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/billallison/brsearch-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.