MD Webcrawl MCP

# MD MCP Webcrawler Project A Python-based MCP (https://modelcontextprotocol.io/introduction) web crawler for extracting and saving website content. ## Features - Extract website content and save as markdown files - Map website structure and links - Batch processing of multiple URLs - Configurable output directory ## Installation 1. Clone the repository: ```bash git clone https://github.com/yourusername/webcrawler.git cd webcrawler ``` 2. Install dependencies: ```bash pip install -r requirements.txt ``` 3. Optional: Configure environment variables: ```bash export OUTPUT_PATH=./output # Set your preferred output directory ``` ## Output Crawled content is saved in markdown format in the specified output directory. ## Configuration The server can be configured through environment variables: - `OUTPUT_PATH`: Default output directory for saved files - `MAX_CONCURRENT_REQUESTS`: Maximum parallel requests (default: 5) - `REQUEST_TIMEOUT`: Request timeout in seconds (default: 30) ## Claude Set-Up Install with FastMCP ``` fastmcp install server.py ``` or user custom settings to run with fastmcp directly ```` "Crawl Server": { "command": "fastmcp", "args": [ "run", "/Users/mm22/Dev_Projekte/servers-main/src/Webcrawler/server.py" ], "env": { "OUTPUT_PATH": "/Users/user/Webcrawl" } ```` ## Development ### Live Development ```bash fastmcp dev server.py --with-editable . ``` ### Debug It helps to use https://modelcontextprotocol.io/docs/tools/inspector for debugging ## Examples ### Example 1: Extract and Save Content ```bash mcp call extract_content --url "https://example.com" --output_path "example.md" ``` ### Example 2: Create Content Index ```bash mcp call scan_linked_content --url "https://example.com" | \ mcp call create_index --content_map - --output_path "index.md" ``` ## Contributing 1. Fork the repository 2. Create a feature branch (`git checkout -b feature/AmazingFeature`) 3. Commit your changes (`git commit -m 'Add some AmazingFeature'`) 4. Push to the branch (`git push origin feature/AmazingFeature`) 5. Open a Pull Request ## License Distributed under the MIT License. See `LICENSE` for more information. ## Requirements - Python 3.7+ - FastMCP (uv pip install fastmcp) - Dependencies listed in requirements.txt