@just-every/mcp-read-website-fast
Fast, token-efficient web content extraction for AI agents - converts websites to clean Markdown.
Overview
Existing MCP web crawlers are slow and consume large quantities of tokens. This stalls the development process and produces incomplete results, because the LLM has to parse entire web pages.
This MCP package fetches web pages locally, strips noise, and converts content to clean Markdown while preserving links. Designed for Claude Code, IDEs and LLM pipelines with minimal token footprint. Crawl sites locally with minimal dependencies.
Features
- Fast startup using official MCP SDK with lazy loading for optimal performance
- Content extraction using Mozilla Readability (same as Firefox Reader View)
- HTML to Markdown conversion with Turndown + GFM support
- Smart caching with SHA-256 hashed URLs
- Polite crawling with robots.txt support and rate limiting
- Concurrent fetching with configurable depth crawling
- Stream-first design for low memory usage
- Link preservation for knowledge graphs
- Optional chunking for downstream processing
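To illustrate the "SHA-256 hashed URLs" caching feature, here is a minimal sketch of how a URL could be mapped to a cache file. The helper name `cachePathFor`, the `.md` extension, and the flat directory layout are assumptions made for illustration and may not match the package's actual cache format. Only the `.cache` default directory comes from the CLI options in this README.

```typescript
import { createHash } from "node:crypto";
import { join } from "node:path";

// Hypothetical helper: derive a stable cache file path from a URL.
// Hashing gives a fixed-length, filesystem-safe name for any URL.
// The ".md" extension and flat layout are assumptions, not the package's format.
function cachePathFor(url: string, cacheDir: string = ".cache"): string {
  const digest = createHash("sha256").update(url).digest("hex"); // 64 hex chars
  return join(cacheDir, `${digest}.md`);
}
```

The same URL always maps to the same path, so a repeated fetch can be answered from disk, while two different URLs will in practice never collide.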
Installation
Claude Code
VS Code
Cursor
JetBrains IDEs
Settings → Tools → AI Assistant → Model Context Protocol (MCP) → Add
Choose “As JSON” and paste:
Or, in the chat window, type /add and fill in the same JSON. Both paths add the server in a single step.
Raw JSON (works in any MCP client)
Drop this into your client’s mcp.json (e.g. .vscode/mcp.json, ~/.cursor/mcp.json, or .mcp.json for Claude).
Available Tools
read_website_fast
- Fetches a webpage and converts it to clean Markdown
- Parameters:
  - url (required): The HTTP/HTTPS URL to fetch
  - depth (optional): Crawl depth (0 = single page)
  - respectRobots (optional): Whether to respect robots.txt
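As a rough sketch of how a client might invoke this tool, the snippet below builds a JSON-RPC 2.0 `tools/call` request, which is the standard MCP method for calling tools. The `ReadWebsiteArgs` interface and the `buildToolCall` helper are hypothetical names introduced here. The client-side validation is an assumption about reasonable input checks, not a description of what the server itself enforces.

```typescript
// Argument shape taken from the parameter list above.
interface ReadWebsiteArgs {
  url: string;            // required: HTTP/HTTPS URL to fetch
  depth?: number;         // optional: crawl depth (0 = single page)
  respectRobots?: boolean; // optional: whether to respect robots.txt
}

// Hypothetical helper: build an MCP "tools/call" request for read_website_fast.
function buildToolCall(id: number, args: ReadWebsiteArgs) {
  const parsed = new URL(args.url); // throws on a malformed URL
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  if (args.depth !== undefined && (!Number.isInteger(args.depth) || args.depth < 0)) {
    throw new Error("depth must be a non-negative integer");
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "read_website_fast", arguments: args },
  };
}
```

For example, `buildToolCall(1, { url: "https://example.com", depth: 0 })` yields a request object that can be serialized with `JSON.stringify` and sent over whichever transport the MCP client uses.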
Available Resources
- read-website-fast://status - Get cache statistics
- read-website-fast://clear-cache - Clear the cache directory
Development Usage
Install
Single page fetch
Crawl with depth
Output formats
CLI Options
- -d, --depth <number>: Crawl depth (0 = single page, default: 0)
- -c, --concurrency <number>: Max concurrent requests (default: 3)
- --no-robots: Ignore robots.txt
- --all-origins: Allow cross-origin crawling
- -u, --user-agent <string>: Custom user agent
- --cache-dir <path>: Cache directory (default: .cache)
- -t, --timeout <ms>: Request timeout in milliseconds (default: 30000)
- -o, --output <format>: Output format: json, markdown, or both (default: markdown)
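The interaction between --depth and --all-origins can be sketched as a breadth-first walk over a link graph. This is only an illustration under stated assumptions: `planCrawl` and `LinkGraph` are hypothetical names, the graph is an in-memory stand-in for links discovered on real pages, and the reading of "depth" as "number of link hops from the start page" is an interpretation of the option description rather than confirmed crawler behavior.

```typescript
// In-memory stand-in for "links found on each page".
type LinkGraph = Record<string, string[]>;

// Hypothetical helper: list the URLs a crawl would visit, in breadth-first order.
// depth = 0 visits only the start page; each extra level follows one more hop.
function planCrawl(
  start: string,
  links: LinkGraph,
  depth: number = 0,
  allOrigins: boolean = false,
): string[] {
  const origin = new URL(start).origin;
  const seen = new Set<string>([start]);
  let frontier: string[] = [start];
  for (let level = 0; level < depth; level++) {
    const next: string[] = [];
    for (const page of frontier) {
      for (const link of links[page] ?? []) {
        if (seen.has(link)) continue; // never visit a page twice
        if (!allOrigins && new URL(link).origin !== origin) continue; // stay on-site
        seen.add(link);
        next.push(link);
      }
    }
    frontier = next;
  }
  return Array.from(seen); // Set keeps insertion order, so this is BFS order
}
```

With a start page `https://a.test/` that links to `https://a.test/docs` and `https://b.test/`, a depth of 0 returns only the start page, a depth of 1 adds `https://a.test/docs`, and a depth of 1 with `allOrigins` set to true also adds `https://b.test/`.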
Clear cache
Architecture
Development
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
Troubleshooting
Cache Issues
Timeout Errors
- Increase the timeout with the -t flag
- Check network connectivity
- Verify the URL is accessible
Content Not Extracted
- Some sites block automated access
- Try a custom user agent with the -u flag
- Check if the site requires JavaScript (not supported)
License
MIT