Question 1

What can you do with this server?

Accepted Answer

The Scrapy MCP Server is a web scraping platform aimed at commercial use. It combines several scraping backends with data extraction, anti-detection, and monitoring features.

Core Scraping Capabilities:

* Multiple scraping methods: HTTP requests, Scrapy framework, Selenium, or Playwright with intelligent method selection
* Concurrent processing: Scrape multiple URLs simultaneously with exponential backoff retry mechanisms
* JavaScript support: Render dynamic, JavaScript-heavy websites in a full browser
* Advanced data extraction: Configure flexible extraction rules using simple or advanced selectors, or automatically extract structured data like contact information, social media links, product details, and addresses
* Link extraction: Specialized link extraction with domain filtering and internal/external link options
* Form interaction: Automatically fill and submit various form types including text inputs, checkboxes, and file uploads
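
The concurrent processing bullet above can be illustrated with a small sketch. This is not the server's code: it only shows one common way to combine bounded concurrency with exponential backoff. The names `fetch_with_backoff` and `scrape_many`, the default delays, and the caller-supplied `fetch` coroutine are all assumptions made for this example.

```python
import asyncio
import random


async def fetch_with_backoff(fetch, url, retries=3, base_delay=0.5, max_delay=8.0):
    """Call `fetch(url)`, retrying failures with exponentially growing delays."""
    for attempt in range(retries + 1):
        try:
            return await fetch(url)
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Delay doubles per attempt (base, 2*base, 4*base, ...), capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Jitter keeps concurrent tasks from retrying in lockstep.
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))


async def scrape_many(fetch, urls, max_concurrency=5, **retry_kwargs):
    """Fetch all URLs concurrently, at most `max_concurrency` at a time."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url):
        async with semaphore:
            try:
                return url, await fetch_with_backoff(fetch, url, **retry_kwargs)
            except Exception as exc:
                return url, exc  # report the failure instead of aborting the batch

    return dict(await asyncio.gather(*(worker(u) for u in urls)))
```

Any coroutine that takes a URL and returns a result can be passed as `fetch`. URLs that still fail after the last retry map to the exception that was raised, so one bad URL does not cancel the rest of the batch.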

Anti-Detection & Performance:

* Stealth techniques: Bypass anti-bot measures using undetected-chromedriver, Playwright stealth, random User-Agent rotation, and proxy support
* Performance optimization: In-memory caching, rate limiting, and intelligent request handling to prevent server overload
* Monitoring tools: Track server metrics including request counts, success rates, cache statistics, and detailed performance monitoring
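
To make the caching and rate limiting bullet concrete, here is a minimal sketch of a time-limited in-memory cache with hit/miss counters, plus a limiter that enforces a minimum gap between requests. How the server actually implements these is not described here; the class names `TTLCache` and `MinIntervalLimiter`, the default values, and the injectable `clock` are assumptions for illustration.

```python
import time


class TTLCache:
    """Tiny in-memory cache whose entries expire after `ttl` seconds."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._store = {}  # key -> (expiry_time, value)
        self.hits = 0
        self.misses = 0

    def get(self, key):
        entry = self._store.get(key)
        if entry is not None and entry[0] > self.clock():
            self.hits += 1
            return entry[1]
        self._store.pop(key, None)  # drop the expired entry, if any
        self.misses += 1
        return None

    def set(self, key, value):
        self._store[key] = (self.clock() + self.ttl, value)

    def clear(self):
        self._store.clear()


class MinIntervalLimiter:
    """Enforces a minimum gap of `min_interval` seconds between requests."""

    def __init__(self, min_interval=1.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock
        self._next_allowed = 0.0

    def reserve(self):
        """Book the next request slot and return how long to sleep before using it."""
        now = self.clock()
        wait = max(0.0, self._next_allowed - now)
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return wait
```

A caller would check `cache.get(url)` first, and on a miss sleep for `limiter.reserve()` seconds before fetching and storing the result. The `hits` and `misses` counters are the kind of cache statistics a metrics endpoint could report.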

Enterprise Features:

* Ethical compliance: Check robots.txt files for responsible data collection
* Error handling: Robust error classification and handling mechanisms
* Cache management: Clear scraping results cache and manage server resources
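
As an illustration of the robots.txt check, the sketch below uses Python's standard `urllib.robotparser` module. It parses robots.txt content that has already been downloaded, so it runs without network access. The helper name `is_allowed` and the sample rules are assumptions for this example and do not reflect the server's internal implementation.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if `user_agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for demonstration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(is_allowed(SAMPLE_ROBOTS, "https://example.com/pricing"))
print(is_allowed(SAMPLE_ROBOTS, "https://example.com/private/data"))
```

With these sample rules, the first URL is permitted and the second is blocked by the `Disallow: /private/` line.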

Question 2

Which integrations are available for this server?

Accepted Answer

Scrapy: Provides web scraping capabilities using the Scrapy framework for large-scale data extraction, with support for concurrent requests, custom pipelines, and advanced crawling features.

Selenium: Enables browser automation and scraping of JavaScript-heavy websites through Selenium WebDriver, with support for form filling, element waiting, and dynamic content extraction.

Question 3

How do I use Scrapy MCP Server?

Accepted Answer

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Scrapy MCP Server scrape the pricing page from example.com and convert it to markdown"

That's it! The server will respond to your query, and you can continue using it as needed.
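
For readers curious what happens behind a chat instruction like the one in step 3: an MCP client typically turns it into a `tools/call` request, which is a JSON-RPC 2.0 message. The sketch below builds such a message for the `convert_webpage_to_markdown` tool. The helper `build_tool_call`, the `/pricing` path, and the boolean value for `extract_main_content` are assumptions; the exact arguments the server accepts may differ.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Argument values are illustrative; only the parameter names come from the tool list.
request = build_tool_call(
    "convert_webpage_to_markdown",
    {"url": "https://example.com/pricing", "extract_main_content": True},
)
print(json.dumps(request, indent=2))
```

In normal use the chat client constructs and sends this message for you, so the sketch is only useful for debugging or for writing your own client.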


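The server provides 14 core tools in three groups: web page, PDF document, and service management.

Web page tools:

| Tool name | Description | Main parameters |
| --- | --- | --- |
| scrape_webpage | Single-page scraping | `url`, `method` (auto-selected), `extract_config` (selector configuration), `wait_for_element` (CSS selector) |
| scrape_multiple_webpages | Batch page scraping | `urls` (list), `method` (one method for all URLs), `extract_config` (global configuration) |
| scrape_with_stealth | Anti-detection scraping | `url`, `method` (selenium/playwright), `scroll_page` (scroll to load content), `wait_for_element` |
| fill_and_submit_form | Form automation | `url`, `form_data` (selector: value), `submit` (whether to submit), `submit_button_selector` |
| extract_links | Specialized link extraction | `url`, `filter_domains` (domain filter), `exclude_domains` (domains to exclude), `internal_only` (internal links only) |
| extract_structured_data | Structured data extraction | `url`, `data_type` (all/contact/social/content/products/addresses) |
| get_page_info | Page information | `url` (target URL); returns title, status code, and metadata |
| check_robots_txt | Crawl rule check | `url` (domain URL); checks robots.txt rules |
| convert_webpage_to_markdown | Page to Markdown conversion | `url`, `method`, `extract_main_content` (extract main content), `embed_images` (embed images), `formatting_options` |
| batch_convert_webpages_to_markdown | Batch Markdown conversion | `urls` (list), `method`, `extract_main_content`, `embed_images`, `embed_options` |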
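
The `extract_links` parameters suggest a filtering pipeline along the following lines. This is a sketch using only the Python standard library, not the tool's actual code. It assumes that `filter_domains` acts as an allowlist of exact hostnames, that `internal_only` compares hostnames with the page's own, and that only `http` and `https` links are kept. The function name `extract_links_sketch` is hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class _LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def extract_links_sketch(html, base_url, filter_domains=None,
                         exclude_domains=None, internal_only=False):
    """Return absolute, de-duplicated links from `html` after domain filtering."""
    collector = _LinkCollector()
    collector.feed(html)
    base_host = urlparse(base_url).hostname
    links = []
    for href in collector.hrefs:
        absolute = urljoin(base_url, href)  # resolve relative links
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar schemes
        host = parsed.hostname
        if internal_only and host != base_host:
            continue
        if filter_domains and host not in filter_domains:
            continue
        if exclude_domains and host in exclude_domains:
            continue
        if absolute not in links:
            links.append(absolute)
    return links
```

Exact hostname matching keeps the sketch simple, which means a subdomain such as `cdn.example.com` counts as external to `example.com`. A real implementation might instead compare registered domains.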

PDF document tools:

| Tool name | Description | Main parameters |
| --- | --- | --- |
| convert_pdf_to_markdown | PDF to Markdown conversion | `pdf_source` (URL/path), `method` (auto/pymupdf/pypdf), `page_range`, `output_format` |
| batch_convert_pdfs_to_markdown | Batch PDF conversion | `pdf_sources` (list), `method`, `page_range`, `output_format`, `include_metadata` |
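
The `method` parameter accepts `auto`, `pymupdf`, or `pypdf`. One plausible reading of `auto` is "use whichever supported library is installed". The sketch below shows that idea only; the preference order (PyMuPDF first), the function name `pick_pdf_backend`, and the error behavior are assumptions rather than documented behavior of the server.

```python
from importlib.util import find_spec

# Import names: PyMuPDF is imported as `fitz`, pypdf as `pypdf`.
BACKEND_MODULES = {"pymupdf": "fitz", "pypdf": "pypdf"}


def _installed(module_name):
    """Return True if the module can be imported in this environment."""
    return find_spec(module_name) is not None


def pick_pdf_backend(method="auto", is_installed=_installed):
    """Resolve `method` to a concrete backend name, or raise if none is usable."""
    if method == "auto":
        # Assumed preference order: try PyMuPDF first, then fall back to pypdf.
        for backend, module_name in BACKEND_MODULES.items():
            if is_installed(module_name):
                return backend
        raise RuntimeError("no supported PDF library is installed")
    if method not in BACKEND_MODULES:
        raise ValueError(f"unknown method: {method!r}")
    if not is_installed(BACKEND_MODULES[method]):
        raise RuntimeError(f"{method} is not installed")
    return method
```

The `is_installed` argument exists so the selection logic can be exercised without either library being present.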

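Service management tools:

| Tool name | Description | Main parameters |
| --- | --- | --- |
| get_server_metrics | Performance metrics monitoring | No parameters; returns request statistics, performance metrics, and cache status |
| clear_cache | Cache management | No parameters; clears all cached data |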
