MCP Web Research Agent
A powerful MCP (Model Context Protocol) tool for automated web research, scraping, and intelligence gathering.
A web research automation tool that converts an existing scraper into an MCP-compatible agent for AI workflows. It is suited to competitive intelligence, market research, and automated data collection.
Features
Intelligent Scraping: Recursive web crawling with configurable depth
Search Integration: Multi-engine search with result processing
Database Storage: Persistent SQLite storage with advanced querying
Multiple Export Formats: JSON, Markdown, and CSV exports
MCP Integration: Seamless integration with AI assistants
Async Ready: Built for concurrent operations
Configurable: Adjustable settings for any use case
Installation
Prerequisites
Python 3.8+
MCP-compatible client (Claude Desktop, etc.)
Quick Install
MCP Client Configuration
Add to your MCP client configuration:
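The configuration snippet is not reproduced in this copy. As a rough sketch only, assuming a Claude Desktop-style `mcpServers` entry, the following Python builds and prints such a fragment. The server name `web-research-agent` and the script path `/path/to/server.py` are placeholders chosen for illustration, not values taken from the project.

```python
import json


def build_client_config(server_name: str, script_path: str) -> dict:
    """Build an `mcpServers` entry that launches a local server with Python.

    Both arguments are placeholders; substitute the real name and path.
    """
    return {
        "mcpServers": {
            server_name: {
                "command": "python",
                "args": [script_path],
            }
        }
    }


# Hypothetical values for illustration only.
config = build_client_config("web-research-agent", "/path/to/server.py")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the client's configuration file; the exact file location depends on the client.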
Usage
Available Tools
scrape_url
Scrape a single URL for specific keywords
search_and_scrape
Search the web and automatically scrape results
get_scraping_results
Query the database for previous scraping results
export_results
Export results to various formats
get_scraping_stats
Get current statistics and status
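To make the idea behind scrape_url concrete, here is a minimal sketch of scanning a page for keywords using only the Python standard library. It is not the project's implementation: the function name `find_keywords` and its return shape are assumptions, and a real scraper would also fetch the page over HTTP.

```python
from html.parser import HTMLParser
from typing import Dict, List


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def find_keywords(html: str, keywords: List[str]) -> Dict[str, int]:
    """Return a case-insensitive occurrence count for each keyword."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks).lower()
    return {kw: text.count(kw.lower()) for kw in keywords}


sample = (
    "<html><head><style>pricing{}</style></head><body>"
    "<h1>Pricing</h1><p>Our pricing beats competitor pricing.</p>"
    "<script>var pricing = 1;</script></body></html>"
)
counts = find_keywords(sample, ["pricing", "Competitor", "missing"])
```

Counting substrings is the simplest possible matcher; word-boundary matching or stemming would be a reasonable refinement.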
Database Schema
The agent uses SQLite with the following structure:
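The schema listing is not reproduced in this copy. The sketch below shows how SQLite storage, querying, and JSON/CSV export could fit together; the `results` table, its columns, and all function names are hypothetical and should not be read as the project's actual schema or API.

```python
import csv
import io
import json
import sqlite3

# Hypothetical schema for illustration only; the real tables may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS results (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT NOT NULL,
    keyword TEXT NOT NULL,
    match_count INTEGER NOT NULL,
    scraped_at TEXT DEFAULT CURRENT_TIMESTAMP
)
"""


def open_store(path=":memory:"):
    """Open (or create) the database and ensure the table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row
    conn.execute(SCHEMA)
    return conn


def save_result(conn, url, keyword, match_count):
    """Insert one row inside a transaction."""
    with conn:
        conn.execute(
            "INSERT INTO results (url, keyword, match_count) VALUES (?, ?, ?)",
            (url, keyword, match_count),
        )


def query_results(conn, keyword):
    """Return rows for a keyword, highest match count first."""
    rows = conn.execute(
        "SELECT url, keyword, match_count FROM results "
        "WHERE keyword = ? ORDER BY match_count DESC, url",
        (keyword,),
    )
    return [dict(row) for row in rows]


def export_json(rows):
    return json.dumps(rows, indent=2)


def export_csv(rows):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["url", "keyword", "match_count"])
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


conn = open_store()
save_result(conn, "https://example.com/a", "pricing", 2)
save_result(conn, "https://example.com/b", "pricing", 5)
save_result(conn, "https://example.com/a", "roadmap", 1)
rows = query_results(conn, "pricing")
```

Passing a file path such as `scraper_results.db` to `open_store` instead of `:memory:` would persist the data between runs.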
Configuration
Default Settings
Max Depth: 3 levels of recursive crawling
Request Delay: 1 second between requests
User Agent: Modern Chrome browser simulation
Database: scraper_results.db (auto-created)
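As an illustration of how the depth limit and request delay interact, here is a small breadth-first crawl sketch. It assumes the start page counts as depth 0 and that pages at the maximum depth are fetched but their links are not followed; the `crawl` function and its `fetch_links` callback are hypothetical names, not the project's API.

```python
import time
from collections import deque
from typing import Callable, Dict, Iterable


def crawl(
    start_url: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_depth: int = 3,
    request_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, int]:
    """Visit pages breadth-first and return each visited URL with its depth.

    `fetch_links` stands in for "download the page and extract its links".
    """
    depths = {start_url: 0}
    queue = deque([start_url])
    requests_made = 0
    while queue:
        url = queue.popleft()
        if requests_made:
            sleep(request_delay)  # pause between consecutive requests
        links = fetch_links(url)
        requests_made += 1
        if depths[url] >= max_depth:
            continue  # depth limit reached: do not follow links further
        for link in links:
            if link not in depths:  # skip pages already seen
                depths[link] = depths[url] + 1
                queue.append(link)
    return depths


# A tiny in-memory "site" so the sketch runs without network access.
site = {"a": ["b", "c"], "b": ["d"], "c": ["a"], "d": ["e"], "e": ["f"], "f": []}
delays = []
visited = crawl("a", lambda u: site.get(u, []), max_depth=2, sleep=delays.append)
```

Injecting `fetch_links` and `sleep` keeps the traversal logic testable; a real crawler would replace them with an HTTP fetch and `time.sleep`.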
Customization
Modify settings in the MCPWebScraper constructor:
Development
Running Tests
Example Usage
Project Structure
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built on the Model Context Protocol
Inspired by modern web scraping best practices
Thanks to the open-source community for amazing tools
Built with ❤️ for the MCP ecosystem