Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@MCP Web Research Agent search for the latest AI research papers and scrape content for 'machine learning' keywords".
That's it! The server will respond to your query, and you can continue using it as needed.
MCP Web Research Agent
A powerful MCP (Model Context Protocol) tool for automated web research, scraping, and intelligence gathering.
A sophisticated web research automation tool that converts your existing scraper into an MCP-compatible agent for enhanced AI workflows. Perfect for competitive intelligence, market research, and automated data collection.
Features
Intelligent Scraping: Recursive web crawling with configurable depth
Search Integration: Multi-engine search with result processing
Database Storage: Persistent SQLite storage with advanced querying
Multiple Export Formats: JSON, Markdown, and CSV exports
MCP Integration: Seamless integration with AI assistants
Async Ready: Built for concurrent operations
Configurable: Adjustable settings for any use case
Installation
Prerequisites
Python 3.8+
MCP-compatible client (Claude Desktop, etc.)
Quick Install
# Clone the repository
git clone https://github.com/yourusername/mcp-web-research-agent.git
cd mcp-web-research-agent
# Install dependencies
pip install -e .
MCP Client Configuration
Add to your MCP client configuration:
{
  "mcpServers": {
    "web-research-agent": {
      "command": "python",
      "args": ["/path/to/mcp-web-research-agent/server.py"]
    }
  }
}
Usage
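The tool examples in this section are written as Python-style awaitable calls. An MCP client normally issues them as JSON-RPC `tools/call` requests on your behalf. The following is a minimal sketch of what such a request could look like, assuming the server registers its tools under the names listed below and accepts the parameters shown in the examples. `build_tool_call` is a hypothetical helper for illustration and is not part of this project.

```python
import json
from itertools import count

# Monotonically increasing JSON-RPC request ids (illustrative only).
_request_ids = count(1)


def build_tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method.

    Hypothetical helper: real MCP clients construct and send this
    message for you.
    """
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Arguments mirror the scrape_url example in this README.
request = build_tool_call(
    "scrape_url",
    {
        "url": "https://example.com",
        "keywords": ["python", "automation", "scraping"],
        "extract_links": False,
        "max_depth": 1,
    },
)

print(json.dumps(request, indent=2))
```

In practice you rarely build this message by hand. It is shown only to clarify how the parameters in the examples map onto a tool call.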
Available Tools
scrape_url
Scrape a single URL for specific keywords
result = await scrape_url(
    url="https://example.com",
    keywords=["python", "automation", "scraping"],
    extract_links=False,
    max_depth=1
)
search_and_scrape
Search the web and automatically scrape results
result = await search_and_scrape(
    query="web scraping best practices",
    keywords=["python", "beautifulsoup", "requests"],
    search_engine_url="https://searx.gophernuttz.us/search/",
    max_results=10
)
get_scraping_results
Query the database for previous scraping results
result = await get_scraping_results(
    keyword_filter="python",
    limit=50
)
export_results
Export results to various formats
result = await export_results(
    format="markdown",
    keyword_filter="python",
    output_path="/path/to/output.md"
)
get_scraping_stats
Get current statistics and status
result = await get_scraping_stats()
Database Schema
The agent uses SQLite with the following structure:
-- URLs table
CREATE TABLE urls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT UNIQUE NOT NULL,
    title TEXT,
    content TEXT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
);
-- Keywords table
CREATE TABLE keywords (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    keyword TEXT UNIQUE NOT NULL
);
-- URL-Keyword relationships
CREATE TABLE url_keywords (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url_id INTEGER,
    keyword_id INTEGER,
    matches INTEGER DEFAULT 1,
    context TEXT,
    FOREIGN KEY (url_id) REFERENCES urls (id),
    FOREIGN KEY (keyword_id) REFERENCES keywords (id),
    UNIQUE(url_id, keyword_id)
);
Configuration
Default Settings
Max Depth: 3 levels of recursive crawling
Request Delay: 1 second between requests
User Agent: Modern Chrome browser simulation
Database: scraper_results.db (auto-created)
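Because results are stored in a plain SQLite file, they can also be inspected directly with Python's built-in `sqlite3` module. The sketch below recreates the schema from the Database Schema section in an in-memory database and runs a keyword lookup across the three tables. The `results_for_keyword` function and the sample row are illustrative assumptions, not the project's actual query code. To read real data, connect to `scraper_results.db` instead of `:memory:`.

```python
import sqlite3

# Schema copied from the Database Schema section of this README.
SCHEMA = """
CREATE TABLE urls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url TEXT UNIQUE NOT NULL,
    title TEXT,
    content TEXT,
    timestamp DATETIME DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE keywords (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    keyword TEXT UNIQUE NOT NULL
);
CREATE TABLE url_keywords (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    url_id INTEGER,
    keyword_id INTEGER,
    matches INTEGER DEFAULT 1,
    context TEXT,
    FOREIGN KEY (url_id) REFERENCES urls (id),
    FOREIGN KEY (keyword_id) REFERENCES keywords (id),
    UNIQUE(url_id, keyword_id)
);
"""


def results_for_keyword(conn, keyword_filter, limit=50):
    """Return (url, title, keyword, matches, context) rows for a keyword.

    Hypothetical query; the real get_scraping_results tool may differ.
    """
    query = """
        SELECT u.url, u.title, k.keyword, uk.matches, uk.context
        FROM url_keywords AS uk
        JOIN urls AS u ON u.id = uk.url_id
        JOIN keywords AS k ON k.id = uk.keyword_id
        WHERE k.keyword LIKE ?
        ORDER BY uk.matches DESC
        LIMIT ?
    """
    return conn.execute(query, (f"%{keyword_filter}%", limit)).fetchall()


# In-memory database with one made-up sample row for demonstration.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
url_id = conn.execute(
    "INSERT INTO urls (url, title) VALUES (?, ?)",
    ("https://example.com", "Example"),
).lastrowid
keyword_id = conn.execute(
    "INSERT INTO keywords (keyword) VALUES (?)", ("python",)
).lastrowid
conn.execute(
    "INSERT INTO url_keywords (url_id, keyword_id, matches, context) "
    "VALUES (?, ?, ?, ?)",
    (url_id, keyword_id, 3, "intro to python scraping"),
)

rows = results_for_keyword(conn, "python")
for row in rows:
    print(row)
```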
Customization
Modify settings in the MCPWebScraper constructor:
scraper = MCPWebScraper(
    db_manager=db_manager,
    max_depth=5,  # Increase crawl depth
    delay=0.5     # Faster requests
)
Development
Running Tests
python test_mcp_scraper.py
Example Usage
python example_usage.py
Project Structure
mcp-web-research-agent/
├── server.py              # MCP server implementation
├── scraper.py             # Core scraping logic
├── database.py            # Database management
├── requirements.txt       # Python dependencies
├── pyproject.toml         # Package configuration
├── test_mcp_scraper.py    # Unit tests
├── example_usage.py       # Usage examples
└── README.md              # This file
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built on the Model Context Protocol
Inspired by modern web scraping best practices
Thanks to the open-source community for amazing tools
Built with ❤️ for the MCP ecosystem