This MCP server enables AI assistants to retrieve text content from bot-protected websites and extract specific information using regex patterns.
Core Capabilities:
Web Page Fetching: Retrieve complete web pages with pagination support, optimized for text-based documentation and reference materials
Pattern Extraction: Search and extract specific content using regular expressions with configurable context around matches
Bot Detection Bypass: Three protection modes (basic, stealth, max-stealth) that automatically escalate when sites block access
Flexible Output: Content delivered in HTML or Markdown format with configurable length limits and continuation from specific positions
Intelligent Integration: Claude automatically selects appropriate tools based on natural language requests without requiring technical commands
Primarily designed for low-volume retrieval of documentation, articles, and reference materials from websites that implement bot detection.
The server can be installed from PyPI's package repository, with version tracking and dependency management.
Scrapling Fetch MCP
An MCP server that helps AI assistants access text content from websites that implement bot detection, bridging the gap between what you can see in your browser and what the AI can access.
Intended Use
This tool is optimized for low-volume retrieval of documentation and reference materials (text/HTML only) from websites that implement bot detection. It has not been designed or tested for general-purpose site scraping or data harvesting.
Note: This project was developed in collaboration with Claude Sonnet 3.7, using LLM Context.
Installation
Requirements:
Python 3.10+
uv package manager
Install dependencies and the tool:
Setup with Claude
Add this configuration to your Claude client's MCP server configuration:
Available Tools
This package provides two distinct tools:
s-fetch-page: Retrieves complete web pages with pagination support
s-fetch-pattern: Extracts content matching regex patterns with surrounding context
Example Usage
Fetching a Complete Page
Extracting Specific Content with Pattern Matching
Functionality Options
Protection Levels:
basic: Fast retrieval (1-2 seconds) but lower success with heavily protected sites
stealth: Balanced protection (3-8 seconds) that works with most sites
max-stealth: Maximum protection (10+ seconds) for heavily protected sites
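The escalation from one protection level to the next can be sketched roughly as follows. This is an illustration only, not the server's implementation: the `Blocked` exception, the `fetch` callable, and `fetch_with_escalation` are hypothetical names, and the only details taken from the text above are the three mode names and their order.

```python
from typing import Callable

# Mode names from the list above, ordered from fastest to most protective.
MODES = ("basic", "stealth", "max-stealth")


class Blocked(Exception):
    """Raised by a fetcher when the site refuses the request (hypothetical)."""


def fetch_with_escalation(
    url: str,
    fetch: Callable[[str, str], str],
    start_mode: str = "basic",
) -> tuple[str, str]:
    """Try each mode from start_mode upward; return (mode_used, content)."""
    last_error: Exception | None = None
    for mode in MODES[MODES.index(start_mode):]:
        try:
            return mode, fetch(url, mode)
        except Blocked as exc:
            # Remember the failure and move on to the next, slower mode.
            last_error = exc
    raise Blocked(f"all modes failed for {url}") from last_error


def fake_fetch(url: str, mode: str) -> str:
    """Stand-in fetcher for demonstration: blocks only in basic mode."""
    if mode == "basic":
        raise Blocked("challenge page")
    return f"<html>fetched with {mode}</html>"
```

Starting from the cheapest mode keeps typical requests fast, and the slower modes are only paid for when a site actually blocks access.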
Content Targeting Options:
s-fetch-page: Retrieve entire pages with pagination support (using start_index and max_length)
s-fetch-pattern: Extract specific content using regular expressions (with search_pattern and context_chars)
Results include position information for follow-up queries with s-fetch-page
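As a rough illustration of pattern extraction with surrounding context, the sketch below uses Python's standard `re` module on text that has already been fetched. The function name `find_with_context` and the shape of the returned dictionaries are assumptions made for this example. Only the parameter names `search_pattern` and `context_chars` come from the description above.

```python
import re


def find_with_context(text: str, search_pattern: str, context_chars: int = 200) -> list[dict]:
    """Return each regex match with up to context_chars of text on either side.

    Hypothetical helper: the real tool's output format may differ.
    """
    results = []
    for match in re.finditer(search_pattern, text):
        # Clamp the context window to the bounds of the text.
        start = max(0, match.start() - context_chars)
        end = min(len(text), match.end() + context_chars)
        results.append(
            {
                "position": start,  # where the excerpt begins in the full text
                "match": match.group(0),
                "excerpt": text[start:end],
            }
        )
    return results
```

The `position` field plays the role of the position information mentioned above: a follow-up request could pass it as a start index to read more of the page from that point.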
Tips for Best Results
Start with basic mode and only escalate to higher protection levels if needed
For large documents, use the pagination parameters with s-fetch-page
Use s-fetch-pattern when looking for specific information on large pages
The AI will automatically adjust its approach based on the site's protection level
Limitations
Designed only for text content: Specifically for documentation, articles, and reference materials
Not designed for high-volume scraping or data harvesting
May not work with sites requiring authentication
Performance varies by site complexity
License
Apache 2
hybrid server
The server is able to function both locally and remotely, depending on the configuration or use case.