Why this server?
This server directly enables AI models to scrape and extract data from any website, including rendering JavaScript content and outputting structured data in HTML or Markdown format, which perfectly matches the user's request.
License: - · Quality: - · Maintenance: -
Enables AI models to scrape and extract data from any website globally using Thordata's 195+ country proxy network. Bypasses anti-bot systems and renders JavaScript content, outputting structured data in Markdown, HTML, or Links format.

Why this server?
This server explicitly offers 'web scraping and crawling capabilities' for LLM clients, supporting single-page and multi-page crawling, and providing output in HTML format.
License: A · Quality: - · Maintenance: B
Enables web scraping and crawling capabilities for LLM clients, supporting single-page scraping, multi-page website crawling, and web search with multiple engines (Playwright, Cheerio, Puppeteer) and flexible output formats including Markdown, HTML, text, and screenshots.
Last updated · 76 · MIT

Why this server?
This server is a dedicated web scraping tool that provides multiple export formats and content extraction rules, specifically designed for both static and dynamic websites to extract HTML content.
License: A · Quality: A · Maintenance: C
A TypeScript-based web scraping server built on the Model Context Protocol that offers multiple export formats, content extraction rules, and support for both static and dynamic (SPA) websites.
Last updated · 741 · MIT

Why this server?
This server allows fetching content from URLs, explicitly supporting HTML, JSON, text, and images, making it suitable for retrieving raw HTML from websites.
License: A · Quality: A · Maintenance: C
A Model Context Protocol (MCP) server that enables Claude or other LLMs to fetch content from URLs, supporting HTML, JSON, text, and images with configurable request parameters.
Last updated · 33 · MIT

Why this server?
Described as providing 'advanced search and retrieval for web crawler data' and enabling AI clients to filter and analyze web content, this server is well-suited for scraping website HTML.
License: F · Quality: - · Maintenance: B
Bridge the gap between your web crawl and AI language models. With mcp-server-webcrawl, your AI client filters and analyzes web content under your direction or autonomously, extracting insights from your web content. Supports WARC, wget, InterroBot, Katana, and SiteOne crawlers.
Last updated · 39 · Python

Why this server?
This server retrieves and processes web content by fetching URLs and converting HTML to Markdown, which is a common use case after scraping HTML to make it LLM-friendly.
License: A · Quality: B · Maintenance: C
Enables LLMs to retrieve and process web content by fetching URLs and converting HTML to Markdown format. Supports chunked reading of large pages and can access both public websites and local networks.
Last updated · 1 · MIT

Why this server?
This server enables AI applications to automate web browsers for 'content extraction' and navigation, making it a powerful tool for scraping dynamic website HTML.
License: F · Quality: - · Maintenance: C
Enables AI assistants to automate web browsers through Playwright, providing capabilities for navigation, content extraction, form filling, screenshot capture, and JavaScript execution. Supports multiple browser engines with comprehensive error handling and security features.
Last updated · 1

Why this server?
This server explicitly provides 'web search, content extraction, web crawling, and scraping capabilities,' making it highly relevant for the user's need to scrape website HTML.
License: F · Quality: C · Maintenance: C
Built as a Model Context Protocol (MCP) server that provides advanced web search, content extraction, web crawling, and scraping capabilities using the Firecrawl API.
Last updated · 41

Why this server?
This server is a 'powerful text extraction service' that converts web content into clean, LLM-optimized Markdown, implying the ability to first scrape HTML content.

Skrape MCP Server (official)
License: A · Quality: B · Maintenance: D
This server converts webpages into clean, structured Markdown optimized for language model consumption, removing unnecessary content and supporting JavaScript rendering.
Last updated · 112 · MIT