Broken Link Checker MCP Server

by davinoishi
# Broken Link Checker MCP Server

An MCP (Model Context Protocol) server that provides broken link checking capabilities using the [broken-link-checker](https://github.com/stevenvachon/broken-link-checker) library.

## Features

- **Check Single Page Links**: Scan all links on a single HTML page for broken links
- **Check Entire Site**: Recursively crawl and check all links across an entire website
- Detailed reporting including HTTP status codes, broken reasons, and link metadata
- Support for excluding external links and respecting robots.txt
- **Two deployment modes**: Local stdio or Remote HTTP/SSE

## Installation

```bash
npm install
```

## Deployment Options

### Option 1: Local Usage (stdio transport)

Use `index.js` for local Claude Desktop integration.

### Option 2: Remote Usage (HTTP/SSE transport)

Use `server.js` for remote deployment with ngrok or similar proxy services.

## Usage with Claude Desktop (Local)

### Step 1: Configure Claude Desktop

Add this server to your Claude Desktop configuration file:

**MacOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`

**Windows**: `%APPDATA%/Claude/claude_desktop_config.json`

```json
{
  "mcpServers": {
    "broken-link-checker": {
      "command": "node",
      "args": ["/Users/davinoishi/Documents/Projects-AI/BLC/index.js"]
    }
  }
}
```

Make sure to update the path to match your actual installation directory.

### Step 2: Restart Claude Desktop

After updating the configuration, restart Claude Desktop for the changes to take effect.

### Step 3: Use the Tools

The MCP server provides two main tools:

#### 1. `check_page_links`

Check all links on a single HTML page.

**Parameters**:

- `url` (required): The URL of the page to check
- `excludeExternalLinks` (optional): If true, only check internal links (default: false)
- `honorRobotExclusions` (optional): If true, respect robots.txt (default: true)

**Example**:

```
Can you check the links on https://example.com for any broken links?
```

#### 2. `check_site`

Recursively crawl and check all links across an entire website.

**Parameters**:

- `url` (required): The starting URL of the site to check
- `excludeExternalLinks` (optional): If true, only check internal links (default: false)
- `honorRobotExclusions` (optional): If true, respect robots.txt (default: true)
- `maxSocketsPerHost` (optional): Maximum concurrent requests per host (default: 1)

**Example**:

```
Can you crawl https://example.com and check all pages for broken links?
```

## Remote Deployment with HTTP/SSE Transport

For remote deployments (e.g., deploying on a VPS and connecting via ngrok), use the HTTP/SSE server:

### Step 1: Start the HTTP Server

```bash
# Start the HTTP/SSE server (default port 3000)
npm run start:http

# Or specify a custom port
PORT=8080 npm run start:http
```

The server will start on `http://localhost:3000` (or your specified port).

### Step 2: Expose with ngrok (or alternative)

```bash
# Install ngrok if you haven't already
npm install -g ngrok

# Expose your local server
ngrok http 3000
```

ngrok will provide you with a public URL like: `https://abc123.ngrok.io`

### Step 3: Configure Claude Desktop for Remote Connection

Update your Claude Desktop configuration to use the HTTP/SSE transport:

**MacOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`

**Windows**: `%APPDATA%/Claude/claude_desktop_config.json`

```json
{
  "mcpServers": {
    "broken-link-checker": {
      "url": "https://your-ngrok-url.ngrok.io/sse"
    }
  }
}
```

Replace `your-ngrok-url.ngrok.io` with your actual ngrok URL.

### Step 4: Test the Connection

1. Check the health endpoint: `https://your-ngrok-url.ngrok.io/health`
2. Restart Claude Desktop
3. Ask Claude to check links on a webpage

### Environment Variables

You can configure the server using environment variables:

```bash
# Copy the example environment file
cp .env.example .env

# Edit .env with your settings
PORT=3000
HOST=0.0.0.0
```

### Production Deployment

For production deployments, consider:

1. **Use a process manager** (PM2, systemd):
   ```bash
   npm install -g pm2
   pm2 start server.js --name broken-link-checker-mcp
   pm2 save
   pm2 startup
   ```
2. **Use a reverse proxy** (nginx, Caddy) for HTTPS
3. **Add authentication** if exposing publicly
4. **Monitor logs and resource usage**

## Output Format

Both tools return JSON with the following structure:

```json
{
  "summary": {
    "totalLinks": 100,
    "brokenLinks": 5,
    "workingLinks": 95
  },
  "brokenLinks": [
    {
      "url": "https://example.com/broken-page",
      "base": "https://example.com",
      "broken": true,
      "brokenReason": "HTTP_404",
      "http": {
        "statusCode": 404
      }
    }
  ]
}
```

## Development

The main server code is in `index.js`. The server uses:

- `@modelcontextprotocol/sdk` for MCP protocol implementation
- `broken-link-checker` for link checking functionality

## License

MIT
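
## Example: Reading the Tool Output

The JSON structure described under Output Format can be post-processed with a few lines of JavaScript. The following is a minimal sketch, not part of the server itself: `formatSummary` and `groupBrokenLinks` are hypothetical helper names, and the sample object simply repeats the example values shown in the Output Format section.

```javascript
// Hypothetical helper: builds a one-line summary from the `summary` object.
// Assumes `report` has the shape shown under "Output Format".
function formatSummary(report) {
  const { totalLinks, brokenLinks, workingLinks } = report.summary;
  return `${brokenLinks} of ${totalLinks} links broken (${workingLinks} working)`;
}

// Hypothetical helper: groups broken link URLs by their `brokenReason` value.
function groupBrokenLinks(report) {
  const groups = {};
  for (const link of report.brokenLinks || []) {
    const reason = link.brokenReason || "UNKNOWN";
    if (!groups[reason]) {
      groups[reason] = [];
    }
    groups[reason].push(link.url);
  }
  return groups;
}

// Sample input copied from the Output Format example.
const sample = {
  summary: { totalLinks: 100, brokenLinks: 5, workingLinks: 95 },
  brokenLinks: [
    {
      url: "https://example.com/broken-page",
      base: "https://example.com",
      broken: true,
      brokenReason: "HTTP_404",
      http: { statusCode: 404 },
    },
  ],
};

console.log(formatSummary(sample));
console.log(groupBrokenLinks(sample));
```

Grouping by `brokenReason` makes it easier to separate, for example, 404 responses from other failure types when a site crawl returns many broken links.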
