check_page_links
Check all links on a web page for broken links, returning HTTP status codes and failure reasons. Supports filtering external links and respecting robots.txt exclusions.
Instructions
Check all links on a single HTML page for broken links. Returns detailed information about each link found including broken status, HTTP status codes, and reasons for failure.
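As an illustration, a call to this tool might look like the request below. This is a minimal sketch, assuming the server is reached through MCP's JSON-RPC `tools/call` method; the `id` value and the URL are placeholders, not values from this project.

```javascript
// Sketch of a JSON-RPC request an MCP client might send to invoke the tool.
// The id and the URL are placeholders chosen for illustration.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "check_page_links",
    arguments: {
      url: "https://example.com/",
      // Optional flag: skip links that point to other sites.
      excludeExternalLinks: true,
    },
  },
};

console.log(JSON.stringify(request, null, 2));
```

Only `url` is required; the optional flags can be left out entirely.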
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| excludeExternalLinks | No | If true, only check internal links | false |
| honorRobotExclusions | No | If true, respect robots.txt and meta robots tags | true |
| url | Yes | The URL of the page to check for broken links | |
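The way missing flags fall back to their defaults can be sketched as follows. The two expressions mirror the ones in the handler listed under Implementation Reference; the function name `resolveOptions` and the sample URL are hypothetical and exist only for this example.

```javascript
// Hypothetical helper: resolves the optional flags the same way the handler does.
function resolveOptions(args) {
  return {
    // A missing or falsy value falls back to false.
    excludeExternalLinks: args.excludeExternalLinks || false,
    // Anything other than an explicit false keeps robots handling enabled.
    honorRobotExclusions: args.honorRobotExclusions !== false,
  };
}

// Sample arguments; the URL is a placeholder.
const onlyUrl = resolveOptions({ url: "https://example.com/" });
const internalOnly = resolveOptions({
  url: "https://example.com/",
  excludeExternalLinks: true,
  honorRobotExclusions: false,
});

console.log(onlyUrl);
console.log(internalOnly);
```

Because `honorRobotExclusions` is compared against `false` rather than tested for truthiness, omitting it leaves robots exclusions honored.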
Implementation Reference
- index.js:172-202 (handler) MCP CallToolRequest handler for the 'check_page_links' tool. It constructs options from input arguments, invokes the checkPageLinks helper, filters broken links, computes a summary, and formats the response as JSON text content.

      if (name === "check_page_links") {
        const options = {
          excludeExternalLinks: args.excludeExternalLinks || false,
          honorRobotExclusions: args.honorRobotExclusions !== false,
        };

        const result = await checkPageLinks(args.url, options);

        const brokenLinks = result.results.filter((link) => link.broken);
        const summary = {
          totalLinks: result.results.length,
          brokenLinks: brokenLinks.length,
          workingLinks: result.results.length - brokenLinks.length,
        };

        return {
          content: [
            {
              type: "text",
              text: JSON.stringify(
                {
                  summary,
                  brokenLinks,
                  allLinks: result.results,
                },
                null,
                2
              ),
            },
          ],
        };

- index.js:108-128 (schema) Input schema definition for the check_page_links tool, specifying the required 'url' parameter and optional boolean flags for excluding external links and honoring robot exclusions.

      type: "object",
      properties: {
        url: {
          type: "string",
          description: "The URL of the page to check for broken links",
        },
        excludeExternalLinks: {
          type: "boolean",
          description: "If true, only check internal links (default: false)",
          default: false,
        },
        honorRobotExclusions: {
          type: "boolean",
          description:
            "If true, respect robots.txt and meta robots tags (default: true)",
          default: true,
        },
      },
      required: ["url"],
      },

- index.js:103-129 (registration) Registration of the check_page_links tool in the ListToolsRequestHandler response, including name, description, and input schema.

      {
        name: "check_page_links",
        description:
          "Check all links on a single HTML page for broken links. Returns detailed information about each link found including broken status, HTTP status codes, and reasons for failure.",
        inputSchema: {
          type: "object",
          properties: {
            url: {
              type: "string",
              description: "The URL of the page to check for broken links",
            },
            excludeExternalLinks: {
              type: "boolean",
              description: "If true, only check internal links (default: false)",
              default: false,
            },
            honorRobotExclusions: {
              type: "boolean",
              description:
                "If true, respect robots.txt and meta robots tags (default: true)",
              default: true,
            },
          },
          required: ["url"],
        },
      },

- index.js:26-56 (helper) Core implementation helper function that uses broken-link-checker's HtmlUrlChecker to check all links on the given URL, collecting detailed results for each link including broken status, reasons, and HTTP status.

      function checkPageLinks(url, options = {}) {
        return new Promise((resolve, reject) => {
          const results = [];
          const errors = [];

          const htmlChecker = new HtmlUrlChecker(options, {
            link: (result) => {
              results.push({
                url: result.url.resolved,
                base: result.base.resolved,
                html: {
                  tagName: result.html.tagName,
                  text: result.html.text,
                },
                broken: result.broken,
                brokenReason: result.brokenReason,
                excluded: result.excluded,
                excludedReason: result.excludedReason,
                http: {
                  statusCode: result.http?.response?.statusCode,
                },
              });
            },
            complete: () => {
              resolve({ results, errors });
            },
          });

          htmlChecker.enqueue(url);
        });
      }
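
The data flow from the helper to the handler can be tried without any network access. The sketch below is not the project's code: `toLinkRecord` and `buildResponse` are hypothetical names that repeat the mapping from the `link` callback and the summary logic from the handler, and the input objects are hand-written stand-ins rather than captured broken-link-checker output. The `brokenReason` strings are illustrative values only.

```javascript
// Hand-written stand-ins for the objects passed to the `link` callback.
// They are NOT real library output; only the fields read below are included.
const rawResults = [
  {
    url: { resolved: "https://example.com/about" },
    base: { resolved: "https://example.com/" },
    html: { tagName: "a", text: "About" },
    broken: false,
    brokenReason: null,
    excluded: false,
    excludedReason: null,
    http: { response: { statusCode: 200 } },
  },
  {
    url: { resolved: "https://example.com/missing" },
    base: { resolved: "https://example.com/" },
    html: { tagName: "a", text: "Old page" },
    broken: true,
    brokenReason: "HTTP_404", // illustrative value
    excluded: false,
    excludedReason: null,
    http: { response: { statusCode: 404 } },
  },
  {
    url: { resolved: "https://unreachable.invalid/" },
    base: { resolved: "https://example.com/" },
    html: { tagName: "a", text: "Partner" },
    broken: true,
    brokenReason: "ERRNO_ENOTFOUND", // illustrative value
    excluded: false,
    excludedReason: null,
    http: {}, // no response was received
  },
];

// Hypothetical mapper that repeats the field selection of the `link` callback.
function toLinkRecord(result) {
  return {
    url: result.url.resolved,
    base: result.base.resolved,
    html: { tagName: result.html.tagName, text: result.html.text },
    broken: result.broken,
    brokenReason: result.brokenReason,
    excluded: result.excluded,
    excludedReason: result.excludedReason,
    // Optional chaining yields undefined when there is no response object.
    http: { statusCode: result.http?.response?.statusCode },
  };
}

// Hypothetical function that repeats the handler's summary and response shape.
function buildResponse(records) {
  const brokenLinks = records.filter((link) => link.broken);
  const summary = {
    totalLinks: records.length,
    brokenLinks: brokenLinks.length,
    workingLinks: records.length - brokenLinks.length,
  };
  const text = JSON.stringify({ summary, brokenLinks, allLinks: records }, null, 2);
  return { content: [{ type: "text", text }] };
}

const records = rawResults.map(toLinkRecord);
const response = buildResponse(records);

// A client gets the structured data back by parsing the text content.
const payload = JSON.parse(response.content[0].text);
console.log(payload.summary);
```

One detail worth noting: when a link has no HTTP response, `statusCode` is `undefined`, and `JSON.stringify` drops that key, so the entry's `http` object arrives at the client empty.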