Web scraping and content extraction

Why this server?
This server is ideal for 'scraping web page content' as its core function is to scrape and extract structured data from any website, bypassing anti-bot systems and handling JavaScript content.
Thordata MCP Server
Web Scraping, Browser Automation
xja1023789-collab
License: -, Quality: -, Maintenance: -
Enables AI models to scrape and extract data from any website globally using Thordata's 195+ country proxy network. Bypasses anti-bot systems and renders JavaScript content, outputting structured data in Markdown, HTML, or Links format.
Last updated 2025-09-23
Why this server?
This server specifically provides tools for 'web search, content extraction, web crawling, and scraping capabilities,' directly matching the user's need for retrieving webpage content.
WebSearch
Web Scraping, Browser Automation, Search
josemartinrodriguezmortaloni
License: F, Quality: C, Maintenance: C
A Model Context Protocol (MCP) server that provides advanced web search, content extraction, web crawling, and scraping capabilities using the Firecrawl API.
Last updated 2026-06-17
4
1
Why this server?
Designed to scrape and extract data from single pages or perform multi-page website crawling, making it highly effective for collecting webpage content and outputting structured data.
AnyCrawl MCP Server
Web Scraping, Browser Automation
any4ai
License: A, Quality: -, Maintenance: C
Enables web scraping and crawling capabilities for LLM clients, supporting single-page scraping, multi-page website crawling, and web search with multiple engines (Playwright, Cheerio, Puppeteer) and flexible output formats including markdown, HTML, text, and screenshots.
Last updated 2026-03-19
5
6
MIT
Why this server?
A powerful tool enabling AI-powered web scraping to transform web pages into markdown, specifically for extracting structured data and content from webpages.
ScrapeGraph MCP Server (official)
Web Scraping, RAG Systems, Browser Automation
ScrapeGraphAI
License: A, Quality: A, Maintenance: B
A production-ready Model Context Protocol server that enables language models to leverage AI-powered web scraping capabilities, offering tools for transforming webpages to markdown, extracting structured data, and executing AI-powered web searches.
Last updated 2026-07-17
8
89
MIT
Why this server?
This server specializes in web scraping of difficult-to-access websites, including those with bot detection or captchas, ensuring content can be reliably extracted.
ScrAPI MCP Server
Web Scraping, App Automation
DevEnterpriseSoftware
License: A, Quality: A, Maintenance: A
A server that enables web scraping of difficult-to-access websites affected by bot detection, captchas, or geolocation restrictions, returning results in either HTML or Markdown format.
Last updated 2026-07-02
2
45
18
MIT
Why this server?
Focuses on fetching and analyzing web content from URLs, supporting content extraction, summarization, and extracting metadata, which is key for gathering webpage content.
URL Fetcher MCP Server
Browser Automation, Web Scraping, Text Summarization
lucoo01
License: F, Quality: -, Maintenance: D
Enables AI assistants to fetch and analyze web content from URLs through the Model Context Protocol (MCP). Supports batch processing, content extraction, summarization, and metadata extraction with intelligent filtering of ads and navigation elements.
Last updated 2025-10-11
Why this server?
Enables robust browser automation and direct interaction with web pages using Playwright, which is a common method for dynamically retrieving content from JavaScript-heavy sites.
Playwright MCP
Browser Automation, Web Scraping
mattreya
License: A, Quality: -, Maintenance: D
Enables LLMs to perform browser automation and web page interactions using Playwright's accessibility tree instead of screenshots. Provides fast, deterministic web automation through structured data without requiring vision models.
Last updated 2025-09-22
6,451,720
Apache 2.0
Why this server?
Specifically designed to fetch clean web content and convert it into markdown format for LLMs, indicating strong capabilities in webpage content extraction.
pure.md MCP server (official)
Web Scraping, Search, Browser Automation
puremd
License: A, Quality: D, Maintenance: D
An MCP server that enables AI clients like Cursor, Windsurf, and Claude Desktop to access web content in markdown format, providing web unblocking and searching capabilities.
Last updated 2025-04-01
2
19
60
MIT
Why this server?
Converts entire webpages into clean, structured Markdown by removing non-essential elements, making it an excellent tool for extracting the main content of a webpage.
Skrape MCP Server (official)
Web Scraping, RAG Systems
skrapeai
License: A, Quality: B, Maintenance: F
This server converts webpages into clean, structured Markdown optimized for language model consumption, removing unnecessary content and supporting JavaScript rendering.
Last updated 2025-07-30
1
12
MIT
