Skip to main content
Glama
by bsmi021

check-links

Validates all links on a specified webpage by extracting anchor tags and checking their reachability via HEAD requests, identifying broken, valid, or invalid URLs for efficient link management.

Instructions

Fetches a given URL, extracts all anchor ('a') links, and checks each linked URL for validity (reachability via HEAD request). Returns a list of checked links with their status ('valid', 'broken', or 'invalid_url' if the href attribute couldn't be resolved to a full URL).

Input Schema

NameRequiredDescriptionDefault
urlYesThe fully qualified URL of the web page to check for broken links. Must be a valid HTTP or HTTPS URL.

Input Schema (JSON Schema)

{ "$schema": "http://json-schema.org/draft-07/schema#", "additionalProperties": false, "properties": { "url": { "description": "The fully qualified URL of the web page to check for broken links. Must be a valid HTTP or HTTPS URL.", "format": "uri", "type": "string" } }, "required": [ "url" ], "type": "object" }

Implementation Reference

  • The handler function for the 'check-links' tool. It processes the input arguments, calls the CheckLinksService.checkLinksOnPage method, formats the results as MCP content, and handles errors appropriately.
    const processRequest = async (args: CheckLinksToolArgs) => { logger.debug(`Received ${TOOL_NAME} request`, { args }); // Input validation is implicitly handled by Zod in TOOL_PARAMS when the server calls this. // If more complex validation or transformation is needed, it can be done here. try { // Call the service method with the validated URL const results = await serviceInstance.checkLinksOnPage(args.url); // Format the successful output for MCP return { content: [{ type: "text" as const, text: JSON.stringify(results, null, 2) // Pretty print JSON }] }; } catch (error) { // Combine error and args into a single context object for logging const logContext = { args, errorDetails: error instanceof Error ? { name: error.name, message: error.message, stack: error.stack } : String(error) }; logger.error(`Error processing ${TOOL_NAME}`, logContext); // Pass combined context // Map service-specific errors to McpError if (error instanceof ValidationError) { throw new McpError( ErrorCode.InvalidParams, `Validation failed: ${error.message}`, error.details // Pass details if available ); } if (error instanceof ServiceError) { // Use the message from the ServiceError throw new McpError( ErrorCode.InternalError, error.message, // Pass service error message directly error.details ); } if (error instanceof McpError) { throw error; // Re-throw existing McpErrors } // Catch-all for unexpected errors throw new McpError( ErrorCode.InternalError, error instanceof Error ? `An unexpected error occurred in ${TOOL_NAME}: ${error.message}` : `An unexpected error occurred in ${TOOL_NAME}.` ); } };
  • Registers the 'check-links' tool with the MCP server using server.tool(), providing name, description, Zod schema, and handler.
    // Register the tool with the server server.tool( TOOL_NAME, TOOL_DESCRIPTION, TOOL_PARAMS, // Pass the Zod schema directly processRequest );
  • Defines the tool name, description, and input schema (Zod object with 'url' parameter) for the 'check-links' tool.
    export const TOOL_NAME = "check-links"; export const TOOL_DESCRIPTION = `Fetches a given URL, extracts all anchor ('a') links, and checks each linked URL for validity (reachability via HEAD request). Returns a list of checked links with their status ('valid', 'broken', or 'invalid_url' if the href attribute couldn't be resolved to a full URL).`; export const TOOL_PARAMS = { url: z.string().url().describe("The fully qualified URL of the web page to check for broken links. Must be a valid HTTP or HTTPS URL."), };
  • The core helper method in CheckLinksService that implements the link checking logic: fetches HTML, extracts 'a' links, resolves absolute URLs, checks reachability concurrently, and returns results.
    public async checkLinksOnPage(pageUrl: string): Promise<LinkCheckResult[]> { if (!pageUrl || typeof pageUrl !== 'string') { throw new ValidationError('Invalid input: pageUrl string is required.'); } logger.info(`Starting link check for page: ${pageUrl}`); const results: LinkCheckResult[] = []; const checkedUrls = new Set<string>(); // Keep track of URLs already checked in this run try { // Fetch the HTML content and Cheerio object const { $ } = await fetchHtml(pageUrl); logger.debug(`Successfully fetched HTML for ${pageUrl}`); const linkElements = $('a[href]').toArray(); logger.debug(`Found ${linkElements.length} anchor elements on ${pageUrl}`); // Process links concurrently for better performance const checkPromises = linkElements.map(async (element) => { const href = $(element).attr('href'); // Basic filtering for href attribute if (!href || href.startsWith('#') || href.startsWith('mailto:') || href.startsWith('tel:')) { logger.debug(`Skipping invalid or local href: ${href}`); return null; // Skip this link } let absoluteUrl: string; try { // Resolve the relative URL against the page URL absoluteUrl = new URL(href, pageUrl).toString(); } catch (e) { logger.warn(`Could not resolve href '${href}' on page ${pageUrl}`, { error: e instanceof Error ? e.message : String(e) }); // Add result for invalid URL format return { url: href, status: 'invalid_url' } as LinkCheckResult; } // Check if this absolute URL has already been processed in this run if (checkedUrls.has(absoluteUrl)) { logger.debug(`Skipping already checked URL: ${absoluteUrl}`); return null; // Skip duplicate check } checkedUrls.add(absoluteUrl); // Mark as checked for this run // Check the validity (reachability) of the absolute URL try { const isValid = await isValidUrl(absoluteUrl); logger.debug(`Checked URL: ${absoluteUrl} - Status: ${isValid ? 'valid' : 'broken'}`); return { url: absoluteUrl, status: isValid ? 'valid' : 'broken' } as LinkCheckResult; } catch (checkError) { // Log error during isValidUrl check, but still report as broken logger.error(`Error checking validity of URL ${absoluteUrl}`, { error: checkError instanceof Error ? checkError.message : String(checkError) }); return { url: absoluteUrl, status: 'broken' } as LinkCheckResult; } }); // Wait for all checks to complete and filter out nulls (skipped links) const completedResults = (await Promise.all(checkPromises)).filter(result => result !== null) as LinkCheckResult[]; results.push(...completedResults); } catch (fetchError) { // Handle errors during the initial fetchHtml call logger.error(`Failed to fetch or process page ${pageUrl}`, { error: fetchError instanceof Error ? fetchError.message : String(fetchError) }); // Wrap fetch error in a ServiceError throw new ServiceError(`Failed to fetch or process the page at ${pageUrl}: ${fetchError instanceof Error ? fetchError.message : String(fetchError)}`, fetchError); } logger.info(`Finished link check for page: ${pageUrl}. Found ${results.length} results.`); return results; }
  • TypeScript interface defining the expected input arguments for the 'check-links' tool.
    export interface CheckLinksArgs { url: string; }

Other Tools

Related Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/bsmi021/mcp-server-webscan'

If you have feedback or need assistance with the MCP directory API, please join our Discord server