Webpage MCP Server
A Model Context Protocol (MCP) server for querying webpages and page contents from a specific website.
Overview
This MCP server provides tools to list and retrieve webpage content by parsing a sitemap.xml file and fetching HTML content from specified URLs. It includes built-in rate limiting to protect against abuse.
Features
List Pages: Parse sitemap.xml to get all available webpage paths
Get Page Content: Fetch HTML content from any webpage
Sitemap Resource: Access the raw sitemap.xml file
Rate Limiting: 10 requests per minute per user to prevent abuse
Installation
Configuration
The server uses environment variables for configuration:
| Variable | Description | Default |
| | The base URL of the website to query | |
| | Server host address | |
| | Server port number | |
Environment Setup
Create a .env.local file in the project root:
Sitemap Configuration
Place your sitemap.xml file in the assets/ directory. The server automatically reads it from assets/sitemap.xml.
Usage
Running the Server
STDIO Mode (for MCP clients):
HTTP Mode:
Test Mode:
Running Tests
Available Tools
1. list_pages()
Lists all webpage paths from the sitemap.
Parameters: None
Returns: List of page paths
Example:
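A minimal sketch of how page paths could be extracted from a sitemap, assuming the file follows the standard sitemaps.org schema. The function name `paths_from_sitemap` and the sample URLs are illustrative and are not taken from the server's code.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def paths_from_sitemap(xml_text: str) -> list[str]:
    """Return the URL path of every <loc> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    paths = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        if loc.text and loc.text.strip():
            # Keep only the path component; an empty path means the site root.
            paths.append(urlparse(loc.text.strip()).path or "/")
    return paths


# Hypothetical sitemap content used only for this demonstration.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/blog/post-1</loc></url>
</urlset>"""

print(paths_from_sitemap(sample))
```

Returning paths rather than full URLs matches the path argument that get_page expects.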
2. get_page(path, user_id=None)
Fetches HTML content from a webpage.
Parameters:
path (str): The webpage path (e.g., "/blog/post-1")
user_id (str, optional): User identifier for rate limiting
Returns: Dictionary with HTML content and metadata
Example:
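A rough sketch of how a page fetch could be put together, assuming the configured base URL is joined with the requested path. The names `BASE_URL`, `fetch_html`, and `get_page_sketch`, as well as the dictionary keys, are assumptions made for illustration; the server's actual return fields may differ.

```python
from typing import Callable
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "https://example.com"  # assumption: stands in for the configured base URL


def fetch_html(url: str) -> str:
    """Download a page and decode it as UTF-8 (performs a real HTTP request)."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def get_page_sketch(path: str, fetch: Callable[[str], str] = fetch_html) -> dict:
    """Fetch `path` relative to BASE_URL and wrap the result in a dictionary."""
    url = urljoin(BASE_URL + "/", path.lstrip("/"))
    try:
        html = fetch(url)
    except OSError as exc:  # URLError and HTTPError are subclasses of OSError
        return {"error": f"Failed to fetch {url}: {exc}"}
    # Hypothetical metadata fields; the real tool's keys are not documented here.
    return {"path": path, "url": url, "html": html, "content_length": len(html)}


# Offline demonstration with a stub fetcher instead of a real request.
result = get_page_sketch("/blog/post-1", fetch=lambda url: "<html>hi</html>")
print(result["url"])
```

Passing the fetcher as a parameter keeps the sketch testable without network access.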
Rate Limit Response:
Available Resources
sitemap://sitemap.xml
Access the raw sitemap.xml content.
Example:
Deployment
Deploy to Dedalus
Use with Dedalus SDK
Rate Limiting
The server implements rate limiting to protect against abuse:
Limit: 10 requests per minute per user
Window: 60-second rolling window
Identifier: Uses the user_id parameter, or 'default' if not provided
When the rate limit is exceeded, the server returns an error response with:
Time until the limit resets
Total number of allowed requests
Architecture
Error Handling
The server handles various error conditions gracefully:
Missing Sitemap: Returns error if assets/sitemap.xml doesn't exist
Invalid Path: Returns error for malformed paths
Failed HTTP Request: Returns error with details when webpage fetch fails
Rate Limit: Returns structured error with reset time
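Two of the conditions listed above could be handled along these lines. This is a sketch under assumptions: the helper names, the error dictionary shape, and the exact definition of a malformed path are illustrative and not taken from the server.

```python
from pathlib import Path

SITEMAP_PATH = Path("assets") / "sitemap.xml"


def load_sitemap(sitemap_path: Path = SITEMAP_PATH) -> dict:
    """Return the sitemap text, or an error dictionary if the file is missing."""
    if not sitemap_path.is_file():
        return {"error": f"Sitemap not found at {sitemap_path}"}
    return {"content": sitemap_path.read_text(encoding="utf-8")}


def validate_path(path: str) -> dict:
    """Accept only site-relative paths such as "/blog/post-1".

    Assumption: a path is treated as malformed if it is not rooted at "/",
    is protocol-relative ("//host/..."), or contains a ".." segment.
    """
    if (
        not path.startswith("/")
        or path.startswith("//")
        or ".." in path.split("/")
    ):
        return {"error": f"Invalid path: {path!r}"}
    return {"path": path}


print(load_sitemap(Path("no-such-dir") / "sitemap.xml"))
print(validate_path("/blog/post-1"))
print(validate_path("https://elsewhere.example/"))
```

Returning error dictionaries instead of raising lets an MCP client receive a structured result in every case.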
Development
Adding New Tools
To add a new tool, use the @mcp.tool() decorator:
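A minimal sketch of a new tool, assuming the `mcp` object is a `FastMCP` instance from the official MCP Python SDK. The server name and the `count_words` tool are hypothetical, and the fallback class exists only so the sketch runs where the SDK is not installed.

```python
try:
    # Assumption: the official MCP Python SDK, which provides FastMCP.
    from mcp.server.fastmcp import FastMCP
except ImportError:
    class FastMCP:  # minimal stand-in so the sketch runs without the SDK
        def __init__(self, name: str) -> None:
            self.name = name
            self.tools = {}

        def tool(self):
            def register(fn):
                self.tools[fn.__name__] = fn
                return fn
            return register

mcp = FastMCP("webpage-server")  # hypothetical server name


@mcp.tool()
def count_words(text: str) -> int:
    """Count whitespace-separated words in a piece of text."""
    return len(text.split())


print(count_words("hello from a new tool"))
```

The type hints and docstring matter here, since FastMCP uses them to describe the tool to clients.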
Adding New Resources
To add a new resource, use the @mcp.resource() decorator:
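A minimal sketch of a new resource, again assuming `mcp` is a `FastMCP` instance from the official MCP Python SDK. The `status://health` URI and the `health` function are hypothetical, and the fallback class exists only so the sketch runs where the SDK is not installed.

```python
try:
    # Assumption: the official MCP Python SDK, which provides FastMCP.
    from mcp.server.fastmcp import FastMCP
except ImportError:
    class FastMCP:  # minimal stand-in so the sketch runs without the SDK
        def __init__(self, name: str) -> None:
            self.name = name
            self.resources = {}

        def resource(self, uri: str):
            def register(fn):
                self.resources[uri] = fn
                return fn
            return register

mcp = FastMCP("webpage-server")  # hypothetical server name


@mcp.resource("status://health")  # hypothetical resource URI
def health() -> str:
    """Return a short status string for the server."""
    return "ok"


print(health())
```

The URI passed to the decorator is what clients use to request the resource, in the same way as sitemap://sitemap.xml above.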
License
MIT License - See LICENSE file for details