# Broken Link Checker MCP Server
An MCP (Model Context Protocol) server that provides broken link checking capabilities using the [broken-link-checker](https://github.com/stevenvachon/broken-link-checker) library.
## Features
- **Check Single Page Links**: Scan all links on a single HTML page for broken links
- **Check Entire Site**: Recursively crawl and check all links across an entire website
- **Detailed Reporting**: HTTP status codes, the reason each link is broken, and link metadata
- **Crawl Options**: Optionally exclude external links and respect robots.txt
## Installation
```bash
npm install
```
## Usage with Claude Desktop
### Step 1: Configure Claude Desktop
Add this server to your Claude Desktop configuration file:
- **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
- **Windows**: `%APPDATA%\Claude\claude_desktop_config.json`
```json
{
  "mcpServers": {
    "broken-link-checker": {
      "command": "node",
      "args": ["/Users/davinoishi/Documents/Projects-AI/BLC/index.js"]
    }
  }
}
```
Make sure to update the path to match your actual installation directory.
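If the configuration file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that merge on an already-parsed config object. The helper name `addBrokenLinkChecker` and both paths are hypothetical placeholders, not part of this project.

```javascript
// Hypothetical helper: merge this server's entry into an existing
// Claude Desktop config object without dropping other servers.
function addBrokenLinkChecker(config, indexPath) {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "broken-link-checker": { command: "node", args: [indexPath] },
    },
  };
}

// Example config that already contains another (made-up) server.
const existing = {
  mcpServers: { other: { command: "node", args: ["/path/to/other.js"] } },
};

// Placeholder path: replace it with the absolute path to your index.js.
const updated = addBrokenLinkChecker(existing, "/absolute/path/to/BLC/index.js");
console.log(JSON.stringify(updated, null, 2));
```

The function returns a new object and leaves the input untouched, so the result can be inspected before anything is written back to disk.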
### Step 2: Restart Claude Desktop
After updating the configuration, restart Claude Desktop for the changes to take effect.
### Step 3: Use the Tools
The MCP server provides two main tools:
#### 1. `check_page_links`
Check all links on a single HTML page.
**Parameters**:
- `url` (required): The URL of the page to check
- `excludeExternalLinks` (optional): If true, only check internal links (default: false)
- `honorRobotExclusions` (optional): If true, respect robots.txt (default: true)
**Example**:
```
Can you check the links on https://example.com for any broken links?
```
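Behind a prompt like the one above, the MCP client sends the server a JSON-RPC `tools/call` request. The following sketch builds such a request for `check_page_links`. The envelope follows the MCP specification, while the helper name `buildToolCall`, the request `id`, and the argument values are illustrative assumptions.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request in the shape MCP defines.
// buildToolCall is a hypothetical helper used only for illustration.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Ask the server to check one page, skipping external links.
const request = buildToolCall(1, "check_page_links", {
  url: "https://example.com",
  excludeExternalLinks: true,
});

console.log(JSON.stringify(request, null, 2));
```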
#### 2. `check_site`
Recursively crawl and check all links across an entire website.
**Parameters**:
- `url` (required): The starting URL of the site to check
- `excludeExternalLinks` (optional): If true, only check internal links (default: false)
- `honorRobotExclusions` (optional): If true, respect robots.txt (default: true)
- `maxSocketsPerHost` (optional): Maximum concurrent requests per host (default: 1)
**Example**:
```
Can you crawl https://example.com and check all pages for broken links?
```
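The documented defaults can be read as a simple "fill in what the caller omitted" step. The sketch below applies the defaults listed above to a `check_site` argument object and rejects a missing or malformed `url`. It is an assumption about how such handling might look, not the code in `index.js`, and `normalizeCheckSiteArgs` is a hypothetical name.

```javascript
// Defaults as documented in the parameter list for check_site.
const DEFAULTS = {
  excludeExternalLinks: false,
  honorRobotExclusions: true,
  maxSocketsPerHost: 1,
};

// Hypothetical helper: validate the required url and fill in defaults.
function normalizeCheckSiteArgs(args) {
  if (typeof args?.url !== "string" || args.url.length === 0) {
    throw new TypeError("url is required");
  }
  new URL(args.url); // throws a TypeError if the URL cannot be parsed

  return {
    url: args.url,
    excludeExternalLinks: args.excludeExternalLinks ?? DEFAULTS.excludeExternalLinks,
    honorRobotExclusions: args.honorRobotExclusions ?? DEFAULTS.honorRobotExclusions,
    maxSocketsPerHost: args.maxSocketsPerHost ?? DEFAULTS.maxSocketsPerHost,
  };
}

const options = normalizeCheckSiteArgs({
  url: "https://example.com",
  excludeExternalLinks: true,
});
console.log(options);
```

Using `??` rather than `||` keeps an explicit `false` from the caller instead of silently replacing it with the default.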
## Using with ngrok (Public URL)
If you need to expose this service publicly, you can use ngrok:
```bash
# Install ngrok if you haven't already
npm install -g ngrok
# The MCP server communicates over stdio by default, so there is no port for ngrok to forward.
# To expose it via ngrok, first modify the server to listen over HTTP.
```
**Note**: The current implementation uses the stdio transport, which is designed for local use with Claude Desktop. For public access via ngrok, you would need to modify the server to use an HTTP/SSE transport instead.
## Output Format
Both tools return JSON with the following structure:
```json
{
  "summary": {
    "totalLinks": 100,
    "brokenLinks": 5,
    "workingLinks": 95
  },
  "brokenLinks": [
    {
      "url": "https://example.com/broken-page",
      "base": "https://example.com",
      "broken": true,
      "brokenReason": "HTTP_404",
      "http": {
        "statusCode": 404
      }
    }
  ]
}
```
```
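The relationship between the `summary` counts and the `brokenLinks` array can be sketched as a small aggregation over per-link results. The example assumes each result carries the fields shown above (`url`, `base`, `broken`, `brokenReason`, `http`). The helper `buildReport` and the sample data are hypothetical, and the actual server may assemble its report differently.

```javascript
// Hypothetical helper: turn a flat list of per-link results into the
// report structure described above.
function buildReport(results) {
  const brokenLinks = results.filter((result) => result.broken);
  return {
    summary: {
      totalLinks: results.length,
      brokenLinks: brokenLinks.length,
      workingLinks: results.length - brokenLinks.length,
    },
    brokenLinks,
  };
}

// Made-up sample results for illustration only.
const sampleResults = [
  { url: "https://example.com/", base: "https://example.com", broken: false, brokenReason: null, http: { statusCode: 200 } },
  { url: "https://example.com/about", base: "https://example.com", broken: false, brokenReason: null, http: { statusCode: 200 } },
  { url: "https://example.com/broken-page", base: "https://example.com", broken: true, brokenReason: "HTTP_404", http: { statusCode: 404 } },
];

const report = buildReport(sampleResults);
console.log(JSON.stringify(report, null, 2));
```

By construction, `totalLinks` always equals `brokenLinks + workingLinks`, which is a useful sanity check when consuming the output.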
## Development
The main server code is in `index.js`. The server uses:
- `@modelcontextprotocol/sdk` for MCP protocol implementation
- `broken-link-checker` for link checking functionality
## License
MIT