# MCP WebScout
A Model Context Protocol (MCP) server providing web search (DuckDuckGo) and intelligent content extraction with LLM-powered analysis.
## Features
- **search**: Search the web using DuckDuckGo
- **fetch**: Advanced web fetching with Crawl4AI and LLM extraction
## System Requirements
| Requirement | Version | Notes |
| ---------------- | ------- | ------------------------------------------- |
| Python | >= 3.10 | Required runtime environment |
| pip | latest | Package manager (included with Python) |
| Playwright | latest | Required by Crawl4AI for browser automation |
| DeepSeek API Key | - | Required for LLM extraction mode |
| Proxy (optional) | - | Required for users in mainland China |
### Core Python Dependencies
| Package | Version | Purpose |
| ----------------- | -------- | ------------------------------ |
| mcp | >=1.0.0 | MCP protocol implementation |
| duckduckgo-search | >=3.0.0 | DuckDuckGo search API |
| requests | >=2.32.0 | HTTP requests |
| beautifulsoup4 | >=4.12.0 | HTML parsing |
| openai | >=1.30.0 | OpenAI API client for DeepSeek |
| crawl4ai | >=0.5.0 | Advanced web scraping |
## Quick Start
Get started in 5 steps:
### 1. Clone and Setup Environment
```bash
git clone <repository>
cd mcp-webscout
python -m venv .venv
```
On Windows:
```powershell
.venv\Scripts\activate
```
On macOS/Linux:
```bash
source .venv/bin/activate
```
### 2. Install Dependencies
```bash
pip install -e ".[dev]"
```
### 3. Install Playwright Browsers
```bash
playwright install chromium
```
### 4. Configure Environment Variables
```bash
cp .env.example .env
```
Edit `.env` and add your configuration:
```env
# Required for LLM extraction
DEEPSEEK_API_KEY=sk-your-actual-key-here
# Required for mainland China users
PROXY_URL=http://127.0.0.1:7890
USE_PROXY=true
```
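The server reads these variables at startup. A minimal sketch of how they might be consumed (the variable names come from the example above; the reading logic itself is an assumption, not the server's actual code):

```python
import os

def load_settings() -> dict:
    """Read WebScout-related settings from the environment,
    falling back to defaults when a variable is unset."""
    use_proxy = os.getenv("USE_PROXY", "false").lower() == "true"
    return {
        "api_key": os.getenv("DEEPSEEK_API_KEY"),  # None if not configured
        "proxy_url": os.getenv("PROXY_URL") if use_proxy else None,
        "max_length": int(os.getenv("DEFAULT_MAX_LENGTH", "5000")),
    }
```

Note that `USE_PROXY` gates `PROXY_URL`: setting a proxy URL alone has no effect unless `USE_PROXY=true`.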
### 5. Verify Installation
```bash
# Run tests
pytest tests/ -v
# Test the server
python -m mcp_webscout --help
```
## Detailed Configuration
For detailed environment setup instructions, see [ENV_SETUP.md](ENV_SETUP.md).
## Usage
### As a Command
```bash
mcp-webscout
```
### As a Python Module
```bash
python -m mcp_webscout
```
### With Claude Desktop
Add to your `claude_desktop_config.json`:
#### Basic Configuration
```json
{
"mcpServers": {
"webscout": {
"command": "mcp-webscout"
}
}
}
```
#### With Environment Variables (Recommended)
```json
{
"mcpServers": {
"webscout": {
"command": "mcp-webscout",
"env": {
"DEEPSEEK_API_KEY": "sk-your-key-here",
"PROXY_URL": "http://127.0.0.1:7890",
"USE_PROXY": "true",
"DEFAULT_MAX_LENGTH": "5000",
"PYTHONUTF8": "1"
}
}
}
}
```
#### Windows Configuration
```json
{
"mcpServers": {
"webscout": {
"command": "python",
"args": ["-m", "mcp_webscout"],
"env": {
"DEEPSEEK_API_KEY": "sk-your-key-here",
"PROXY_URL": "http://127.0.0.1:7890",
"USE_PROXY": "true",
"PYTHONUTF8": "1"
}
}
}
}
```
## Tools
### search
Search the web using DuckDuckGo.
**Parameters:**
| Name | Type | Required | Description |
| ----------- | ------- | -------- | ---------------------------------- |
| query | string | Yes | Search query |
| max_results | integer | No | Maximum results (1-10, default: 5) |
**Returns:**
Formatted search results with titles, URLs, and snippets.
**Example:**
```json
{
"query": "Python programming",
"max_results": 3
}
```
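A sketch of how the formatted output might be produced (the function name is illustrative, and the result keys `title`, `href`, and `body` assume the shape returned by the duckduckgo-search library):

```python
def format_results(results: list[dict]) -> str:
    """Render search hits as numbered, human-readable entries
    with title, URL, and snippet on separate lines."""
    lines = []
    for i, hit in enumerate(results, start=1):
        lines.append(f"{i}. {hit['title']}\n   {hit['href']}\n   {hit['body']}")
    return "\n\n".join(lines)
```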
### fetch
Advanced web fetching with Crawl4AI and LLM extraction.
**Parameters:**
| Name | Type | Required | Description |
| ---------- | ------- | -------- | ---------------------------------------------- |
| url | string | Yes | URL to fetch |
| mode       | string  | No       | Extraction mode: `simple` or `llm` (default: `simple`) |
| prompt | string | No | Custom extraction prompt for LLM mode |
| max_length | integer | No | Maximum characters (default: 5000) |
| use_proxy | boolean | No | Use proxy (default: true) |
**Returns:**
Fetched and optionally extracted content.
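**Example:** (illustrative values only, using the parameters listed above)

```json
{
  "url": "https://example.com/article",
  "mode": "llm",
  "prompt": "Summarize the main points",
  "max_length": 3000
}
```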