check_page_links
Check all links on a single HTML page for broken links, returning HTTP status codes and failure reasons. Supports excluding external links and respecting robots.txt exclusions.
Instructions
Check all links on a single HTML page for broken links. Returns detailed information about each link found including broken status, HTTP status codes, and reasons for failure.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | The URL of the page to check for broken links | |
| excludeExternalLinks | No | If true, only check internal links | false |
| honorRobotExclusions | No | If true, respect robots.txt and meta robots tags | true |
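
To make the parameters in the table above concrete, here is a minimal sketch of how a client might assemble a call to this tool. It assumes the standard MCP `tools/call` JSON-RPC request shape. The helper name `buildCheckPageLinksRequest` and the example URL are hypothetical and do not come from the project.

```javascript
// Hypothetical helper (not part of the project): builds a JSON-RPC 2.0
// "tools/call" request for the check_page_links tool.
function buildCheckPageLinksRequest(id, url, options = {}) {
  // `url` is the only required parameter.
  if (typeof url !== "string" || url.length === 0) {
    throw new TypeError("url is required and must be a non-empty string");
  }

  // Fall back to the defaults listed in the input schema table.
  const { excludeExternalLinks = false, honorRobotExclusions = true } = options;

  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "check_page_links",
      arguments: { url, excludeExternalLinks, honorRobotExclusions },
    },
  };
}

// Example: check only internal links on a placeholder page.
const request = buildCheckPageLinksRequest(1, "https://example.com/", {
  excludeExternalLinks: true,
});
console.log(JSON.stringify(request, null, 2));
```

Spelling out both optional flags is a design choice in this sketch; omitting them should have the same effect, since the schema declares the same defaults.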
Implementation Reference
- index.js:26-56 (handler): The core handler function that checks the links on a single page using HtmlUrlChecker from the broken-link-checker library. It enqueues the URL, collects each link result, and resolves with `{ results, errors }`. In the code shown, nothing is ever pushed to `errors` and `reject` is never called.

  ```javascript
  function checkPageLinks(url, options = {}) {
    return new Promise((resolve, reject) => {
      const results = [];
      const errors = [];

      const htmlChecker = new HtmlUrlChecker(options, {
        link: (result) => {
          results.push({
            url: result.url.resolved,
            base: result.base.resolved,
            html: {
              tagName: result.html.tagName,
              text: result.html.text,
            },
            broken: result.broken,
            brokenReason: result.brokenReason,
            excluded: result.excluded,
            excludedReason: result.excludedReason,
            http: {
              statusCode: result.http?.response?.statusCode,
            },
          });
        },
        complete: () => {
          resolve({ results, errors });
        },
      });

      htmlChecker.enqueue(url);
    });
  }
  ```
- index.js:107-128 (schema): Input schema definition for the check_page_links tool, specifying the required 'url' parameter and optional boolean flags for link filtering and robot exclusion respect.

  ```javascript
  inputSchema: {
    type: "object",
    properties: {
      url: {
        type: "string",
        description: "The URL of the page to check for broken links",
      },
      excludeExternalLinks: {
        type: "boolean",
        description: "If true, only check internal links (default: false)",
        default: false,
      },
      honorRobotExclusions: {
        type: "boolean",
        description: "If true, respect robots.txt and meta robots tags (default: true)",
        default: true,
      },
    },
    required: ["url"],
  },
  ```
- index.js:103-129 (registration): Tool registration in the ListToolsRequestSchema handler, defining the tool's name, description, and input schema.

  ```javascript
  {
    name: "check_page_links",
    description: "Check all links on a single HTML page for broken links. Returns detailed information about each link found including broken status, HTTP status codes, and reasons for failure.",
    inputSchema: {
      type: "object",
      properties: {
        url: {
          type: "string",
          description: "The URL of the page to check for broken links",
        },
        excludeExternalLinks: {
          type: "boolean",
          description: "If true, only check internal links (default: false)",
          default: false,
        },
        honorRobotExclusions: {
          type: "boolean",
          description: "If true, respect robots.txt and meta robots tags (default: true)",
          default: true,
        },
      },
      required: ["url"],
    },
  },
  ```
- server.js:38-68 (handler): Identical core handler function for the HTTP/SSE server version, performing link checking using HtmlUrlChecker.

  ```javascript
  function checkPageLinks(url, options = {}) {
    return new Promise((resolve, reject) => {
      const results = [];
      const errors = [];

      const htmlChecker = new HtmlUrlChecker(options, {
        link: (result) => {
          results.push({
            url: result.url.resolved,
            base: result.base.resolved,
            html: {
              tagName: result.html.tagName,
              text: result.html.text,
            },
            broken: result.broken,
            brokenReason: result.brokenReason,
            excluded: result.excluded,
            excludedReason: result.excludedReason,
            http: {
              statusCode: result.http?.response?.statusCode,
            },
          });
        },
        complete: () => {
          resolve({ results, errors });
        },
      });

      htmlChecker.enqueue(url);
    });
  }
  ```
- server.js:119-140 (schema): Input schema for the check_page_links tool in the server.js version, matching the stdio version.

  ```javascript
  inputSchema: {
    type: "object",
    properties: {
      url: {
        type: "string",
        description: "The URL of the page to check for broken links",
      },
      excludeExternalLinks: {
        type: "boolean",
        description: "If true, only check internal links (default: false)",
        default: false,
      },
      honorRobotExclusions: {
        type: "boolean",
        description: "If true, respect robots.txt and meta robots tags (default: true)",
        default: true,
      },
    },
    required: ["url"],
  },
  ```
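
The handler listed above resolves with a `results` array whose entries carry `url`, `broken`, `brokenReason`, `excluded`, `excludedReason`, `html`, and `http.statusCode`. As a rough sketch of how a caller might consume that output, the snippet below reduces it to a short report. The helper `summarizeResults` and the sample data are hypothetical and are not part of the project. The sample values are made up for illustration.

```javascript
// Hypothetical helper (not part of the project): condenses the `results`
// array produced by checkPageLinks into counts plus a list of broken links.
function summarizeResults(results) {
  const broken = results.filter((r) => r.broken);
  const excluded = results.filter((r) => r.excluded);

  return {
    total: results.length,
    brokenCount: broken.length,
    excludedCount: excluded.length,
    broken: broken.map((r) => ({
      url: r.url,
      reason: r.brokenReason,
      // statusCode is undefined when no HTTP response was received.
      statusCode: r.http?.statusCode ?? null,
      linkText: r.html?.text ?? "",
    })),
  };
}

// Made-up sample data in the shape the handler pushes onto `results`.
const sample = [
  {
    url: "https://example.com/about",
    html: { tagName: "a", text: "About" },
    broken: false,
    brokenReason: null,
    excluded: false,
    excludedReason: null,
    http: { statusCode: 200 },
  },
  {
    url: "https://example.com/missing",
    html: { tagName: "a", text: "Old page" },
    broken: true,
    brokenReason: "HTTP_404",
    excluded: false,
    excludedReason: null,
    http: { statusCode: 404 },
  },
  {
    url: "https://other.example.org/",
    html: { tagName: "a", text: "Partner" },
    broken: false,
    brokenReason: null,
    excluded: true,
    excludedReason: "BLC_EXTERNAL",
    http: { statusCode: undefined },
  },
];

console.log(summarizeResults(sample));
```

Excluded links are counted separately here because they were never requested, so their `broken` flag says nothing about whether the target is reachable.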