@robot-resources/scraper-mcp

MCP server for Scraper — context compression for AI agents.

Converts web pages and crawled website content into token-efficient Markdown: clean, compressed text optimized for AI agent context windows.
What is Robot Resources?
Human Resources, but for your AI agents.
Robot Resources gives AI agents two superpowers:
Router — Routes each LLM call to the cheapest capable model. 60-90% cost savings across OpenAI, Anthropic, and Google.
Scraper — Compresses web pages to clean markdown. 70-80% fewer tokens per page.
Both run locally. Your API keys never leave your machine. Free, unlimited, no tiers.
Install the full suite
```bash
npx robot-resources
```

One command sets up everything. Learn more at robotresources.ai.
About this MCP server
This package gives AI agents two tools, exposed via the Model Context Protocol, for compressing web content into token-efficient markdown: single-page compression (scraper_compress_url) and multi-page BFS crawling (scraper_crawl_url).
Installation
```bash
npx @robot-resources/scraper-mcp
```

Or install globally:

```bash
npm install -g @robot-resources/scraper-mcp
```

Claude Desktop Configuration
Add to your claude_desktop_config.json:
```json
{
  "mcpServers": {
    "scraper": {
      "command": "npx",
      "args": ["-y", "@robot-resources/scraper-mcp"]
    }
  }
}
```

Tools
scraper_compress_url
Compress a single web page into markdown with 70-90% fewer tokens.
Parameters:
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | string | yes | — | URL to compress |
| `mode` | string | no | `auto` | Fetch mode: see Fetch Modes below |
| `timeout` | number | no |  | Fetch timeout in milliseconds |
| `retries` | number | no |  | Max retry attempts (0-10) |
Example prompt: "Compress https://docs.example.com/getting-started"
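Outside Claude Desktop, any MCP client can call this tool programmatically. Below is a minimal sketch using the official TypeScript SDK; the tool name comes from this README, while the `url` argument name follows the reconstructed parameter table above and should be verified against the server's live tool schema.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Spawn the scraper MCP server over stdio, the same way Claude Desktop does.
const transport = new StdioClientTransport({
  command: "npx",
  args: ["-y", "@robot-resources/scraper-mcp"],
});

const client = new Client({ name: "example-client", version: "1.0.0" });
await client.connect(transport);

// Call the single-page compression tool; `url` is assumed from the table above.
const result = await client.callTool({
  name: "scraper_compress_url",
  arguments: { url: "https://docs.example.com/getting-started" },
});

console.log(result.content); // compressed markdown, ready for a context window
await client.close();
```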
scraper_crawl_url
Crawl multiple pages from a starting URL using BFS link discovery.
Parameters:
| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| `url` | string | yes | — | Starting URL to crawl |
| `maxPages` | number | no |  | Max pages to crawl (1-100) |
| `maxDepth` | number | no |  | Max link depth (0-5) |
| `mode` | string | no | `auto` | Fetch mode: see Fetch Modes below |
| `includePatterns` | string[] | no | — | URL patterns to include (glob) |
| `excludePatterns` | string[] | no | — | URL patterns to exclude (glob) |
| `timeout` | number | no |  | Per-page timeout in milliseconds |
Example prompt: "Crawl the docs at https://docs.example.com with max 20 pages"
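Conceptually, the crawl is a bounded breadth-first search over same-site links. The sketch below is not this package's implementation; it only illustrates how `maxPages` caps the number of fetched pages and `maxDepth` caps link-following distance (parameter names per the table above).

```typescript
// Illustrative BFS crawl with page and depth limits (not the package's code).
async function bfsCrawl(start: string, maxPages = 20, maxDepth = 2): Promise<string[]> {
  const visited = new Set<string>([start]);
  const queue: Array<{ url: string; depth: number }> = [{ url: start, depth: 0 }];
  const pages: string[] = [];

  while (queue.length > 0 && pages.length < maxPages) {
    const { url, depth } = queue.shift()!;
    const html = await (await fetch(url)).text();
    pages.push(html); // the real tool would compress each page to markdown here

    if (depth >= maxDepth) continue; // stop discovering links past the depth limit
    for (const link of extractLinks(html, url)) {
      if (!visited.has(link)) {
        visited.add(link);
        queue.push({ url: link, depth: depth + 1 });
      }
    }
  }
  return pages;
}

// Naive same-origin link extraction; the real crawler would also apply the
// include/exclude glob patterns from the parameter table.
function extractLinks(html: string, base: string): string[] {
  const links: string[] = [];
  for (const m of html.matchAll(/href="([^"#]+)"/g)) {
    try {
      const resolved = new URL(m[1], base);
      if (resolved.origin === new URL(base).origin) links.push(resolved.href);
    } catch {
      // skip malformed URLs
    }
  }
  return links;
}
```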
Fetch Modes
| Mode | How | Use when |
|------|-----|----------|
| `fetch` | Plain HTTP | Default sites, APIs, docs |
| `stealth` | TLS fingerprint impersonation | Anti-bot protected sites |
| `render` | Headless browser (Playwright) | JS-rendered SPAs |
| `auto` | Fast → stealth fallback on 403/challenge | Unknown sites (default) |
`stealth` requires `impit` and `render` requires `playwright` as peer dependencies of `@robot-resources/scraper`.
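The `auto` mode can be pictured as a simple fallback chain: try plain HTTP first and escalate to the stealth path only when the response looks blocked. A hedged sketch under that assumption (the real package's escalation signals and helpers may differ):

```typescript
// Illustrative fallback chain for auto mode (not the package's actual logic).
async function autoFetch(url: string): Promise<string> {
  const res = await fetch(url);
  if (res.ok) return res.text();

  // 403s and rate-limit challenges are the usual signs that plain HTTP was blocked.
  if (res.status === 403 || res.status === 429) return stealthFetch(url);
  throw new Error(`fetch failed with status ${res.status}`);
}

// Hypothetical stealth path; per the note above, the real package delegates
// TLS fingerprint impersonation to impit.
async function stealthFetch(url: string): Promise<string> {
  throw new Error(`stealth fetch not implemented in this sketch: ${url}`);
}
```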
Requirements
Node.js 18+
Related
@robot-resources/scraper - Core compression library
@robot-resources/router-mcp - MCP server for LLM cost optimization
Robot Resources - Human Resources, but for your AI agents
License
MIT