Skip to main content
Glama
README.mdβ€’6.11 kB
<p align="center"> <img src="images/product.jpg" alt="Thordata"> </p> <h1 align="center" style="border-bottom: none;"> Thordata MCP Server </h1> <p align="center"> <em>Built on a 195+ country proxy network, Thordata MCP breaks through web data barriers, delivering pure, structured, globally unlimited real-time information streams to AI models</em> </p> <div align="center"> [![Licence](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE) <br/> </div> --- ## πŸ“– Overview ScraperMCP server seamlessly bridges AI and web ecosystems, providing one-click access to any website worldwide, real-time JavaScript rendering, intelligent anti-crawling mechanism bypass, and outputting AI-ready structured data content. ## πŸ› οΈ MCP Tools Thordata MCP supports dual-channel data acquisition through unlocker and regular proxies, fully compatible with multiple data formats including MarkDown, HTML, and Links. ### Web Scraper API Tool Thordata MCP provides the parse_with_ai_selectors tool, leveraging Thordata Web Scraper API to implement intelligent scraping of any website. ## βœ… Prerequisites Before deployment, please ensure you have: - **Thordata Web Scraper API Account**: Visit [thordata](https://www.thordata.com/) to obtain your exclusive username and password; ## πŸ“¦ Configuration ### Environment Variables Thordata MCP server supports the following environment variable configurations: | Name | Description | Default Value | |----------------------------|-----------------------------------------------|---------------| | `UNLOCKER_PROXY_LOGIN` | Unlocker username | | | `UNLOCKER_PROXY_PASSWORD` | Unlocker password | | | `UNLOCKER_PROXY_URL` | Unlocker proxy address | | | `DEFAULT_PROXY_LOGIN` | Regular proxy username | | | `DEFAULT_PROXY_PASSWORD` | Regular proxy password | | | `DEFAULT_PROXY_URL` | Regular proxy address | | ### Using uv Configuration - Install uv package manager: ```bash # macOS and Linux curl -LsSf https://astral.sh/uv/install.sh | sh ``` Or: ```bash # Windows powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex" ``` - Use the following configuration: ```json { "mcpServers": { "Scraper": { "command": "uv", "args": [ "--directory", "<absolute folder path>", # e.g., D:\\ScraperMcp "run", "Scraper.py" ] } } } ``` ### Startup Command fastmcp run Scraper.py:mcp ### πŸ–₯️ Manual Setup Guide #### Claude Desktop Configuration 1. Open Claude application 2. Navigate to **Settings β†’ Developer β†’ Edit Configuration** 3. Add the above configuration to the `claude_desktop_config.json` file #### Cursor AI Configuration 1. Open Cursor editor 2. Navigate to **Settings β†’ Cursor Settings β†’ MCP** 3. Click **Add New Global MCP Server** 4. Configure corresponding parameters #### Cline Configuration 1. Open Cline settings 2. Navigate to **MCP Server Settings β†’ Installed** 3. Click **Configure MCP Server** 4. Configure corresponding parameters ### Manual Setup: Cline Settings β†’ MCP Server Settings β†’ Installed β†’ Click Configure MCP Server and configure corresponding parameters ## πŸ›‘οΈ License Open source distribution under MIT License - see [LICENSE](LICENSE) file for details. --- ## About Thordata Thordata, as a market-leading web intelligence collection platform, adheres to the highest business ethics and compliance standards, empowering global enterprises to uncover data-driven business insights. <div align="center"> <sub> Made by <a href="https://www.thordata.com/">Thordata</a>, if MCP saves you valuable time, we invite you to give ⭐ support. </sub> </div> ## ✨ Core Features <details> <summary><strong>Global Website Content Scraping</strong></summary> <br> - Supports data extraction from any URL, including complex single-page applications - Complete JavaScript rendering capability, ensuring perfect presentation of dynamic content - Flexible rendering mode selection: full JS rendering, pure HTML, or no rendering </details> <details> <summary><strong>Intelligent AI Data Preprocessing</strong></summary> <br> - Automated HTML cleaning and conversion to highly readable Markdown - Intelligent extraction of valid and usable links, optimizing data structure - Native HTML format support, maintaining data integrity </details> <details> <summary><strong>Global Network Barrier-Free Access</strong></summary> <br> - Efficiently bypasses complex anti-crawling protection systems - Stable scraping of high-difficulty website content - 195+ country IP pool automatic rotation, breaking geographical restrictions </details> <details> <summary><strong>Cross-Platform Flexible Deployment</strong></summary> <br> - Customizable rendering and parsing parameter configuration - Seamless integration with AI models and analysis tools - Full support for macOS, Windows, and Linux systems </details> --- ## Why Choose Thordata MCP?&nbsp;πŸ•ΈοΈ ➜ πŸ“¦ ➜ πŸ€– Just tell the LLM *"Summarize the latest discussions about MCP on Hacker News"* and get precise answers immediately. MCP (Multi-Client Protocol) handles all the tedious steps for you: | Thordata MCP Core Value | Benefits for You | |-------------------------------------------------------------------|-------------------------------------------| | **Thordata global proxy network intelligently bypasses anti-bot detection** | Ensures access availability and identity anonymity | | **One-click data acquisition solution** | Easily handles complex single-page applications | | **Multi-format output support (MarkDown/HTML/Links)** | Precisely matches your data requirements |

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/xja1023789-collab/ScraperMcp_el'

If you have feedback or need assistance with the MCP directory API, please join our Discord server