Why this server?
This server directly enables AI models to scrape and extract data from any website, including rendering JavaScript content and outputting structured data in HTML or Markdown format, which perfectly matches the user's request.
Security: -, License: -, Quality: -
Enables AI models to scrape and extract data from any website globally using Thordata's 195+ country proxy network. Bypasses anti-bot systems and renders JavaScript content, outputting structured data in Markdown, HTML, or Links format.
Last updated · MIT

Why this server?
This server explicitly offers 'web scraping and crawling capabilities' for LLM clients, supporting single-page and multi-page crawling, and providing output in HTML format.
Security: -, License: F, Quality: -
Enables web scraping and crawling capabilities for LLM clients, supporting single-page scraping, multi-page website crawling, and web search with multiple engines (Playwright, Cheerio, Puppeteer) and flexible output formats including Markdown, HTML, text, and screenshots.
Last updated · 36

Why this server?
This server is a dedicated web scraping tool that provides multiple export formats and content extraction rules, specifically designed for both static and dynamic websites to extract HTML content.
Security: A, License: A, Quality: -
A TypeScript-based web scraping server built on the Model Context Protocol that offers multiple export formats, content extraction rules, and support for both static and dynamic (SPA) websites.
Last updated · 731 · MIT

Why this server?
This server allows fetching content from URLs, explicitly supporting HTML, JSON, text, and images, making it suitable for retrieving raw HTML from websites.
Security: A, License: A, Quality: -
A Model Context Protocol (MCP) server that enables Claude or other LLMs to fetch content from URLs, supporting HTML, JSON, text, and images with configurable request parameters.
Last updated · 32 · MIT

Why this server?
Described as providing 'advanced search and retrieval for web crawler data' and enabling AI clients to filter and analyze web content, this server is well-suited for scraping website HTML.
Security: -, License: F, Quality: -
Bridge the gap between your web crawl and AI language models. With mcp-server-webcrawl, your AI client filters and analyzes web content under your direction or autonomously, extracting insights from your web content. Supports WARC, wget, InterroBot, Katana, and SiteOne crawlers.
Last updated · 38 · Python

Why this server?
This server retrieves and processes web content by fetching URLs and converting HTML to Markdown, which is a common use case after scraping HTML to make it LLM-friendly.
Security: A, License: A, Quality: -
Enables LLMs to retrieve and process web content by fetching URLs and converting HTML to Markdown format. Supports chunked reading of large pages and can access both public websites and local networks.
Last updated · 1 · MIT

Why this server?
This server enables AI applications to automate web browsers for 'content extraction' and navigation, making it a powerful tool for scraping dynamic website HTML.
Security: -, License: F, Quality: -
Enables AI assistants to automate web browsers through Playwright, providing capabilities for navigation, content extraction, form filling, screenshot capture, and JavaScript execution. Supports multiple browser engines with comprehensive error handling and security features.
Last updated · 1

Why this server?
This server explicitly provides 'web search, content extraction, web crawling, and scraping capabilities,' making it highly relevant for the user's need to scrape website HTML.
Security: A, License: F, Quality: -
Built as a Model Context Protocol (MCP) server that provides advanced web search, content extraction, web crawling, and scraping capabilities using the Firecrawl API.
Last updated · 41

Why this server?
This server is a 'powerful text extraction service' that converts web content into clean, LLM-optimized Markdown, implying the ability to first scrape HTML content.

Skrape MCP Server (official)
Security: A, License: A, Quality: -
This server converts webpages into clean, structured Markdown optimized for language model consumption, removing unnecessary content and supporting JavaScript rendering.
Last updated · 112 · MIT