MCP Webscan Server

extract-links

Extracts and analyzes all hyperlinks from a web page and returns them in a structured format with URLs, anchor text, and contextual information. The tool is performance-optimized, using stream processing and worker threads to handle large pages efficiently. It works with either a direct URL or raw HTML content, and an optional base URL parameter lets it resolve relative URLs correctly alongside absolute ones. Results can be capped with limit (default 100, maximum 5000) so that link-dense pages do not produce overwhelming output. The returned link inventory includes destination URLs, link text, titles (if available), and whether each link is internal or external to the source domain. Useful for site mapping, content analysis, broken link checking, SEO analysis, and as a preparatory step for targeted crawling.
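The behavior described above (resolving relative links against a base URL, capping the result count, and flagging links as internal or external) can be illustrated with a minimal Python sketch. This is not the server's implementation: the names LinkCollector and extract_links, the output field names, and the rule that "internal" means "same host as the base URL" are all assumptions made for illustration, and the sketch uses neither streaming nor worker threads.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Hypothetical helper: collects href, title, and anchor text of each <a>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._current = None  # link being built while inside <a>...</a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            attr = dict(attrs)
            if attr.get("href"):
                self._current = {"href": attr["href"], "title": attr.get("title"), "text": ""}

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            # Collapse runs of whitespace in the anchor text.
            self._current["text"] = " ".join(self._current["text"].split())
            self.links.append(self._current)
            self._current = None


def extract_links(html, base_url, limit=100):
    """Return up to `limit` links, resolved against base_url (assumed output shape)."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    base_host = urlparse(base_url).netloc.lower()
    results = []
    for link in parser.links[:limit]:
        absolute = urljoin(base_url, link["href"])  # relative -> absolute
        host = urlparse(absolute).netloc.lower()
        results.append({
            "url": absolute,
            "text": link["text"],
            "title": link["title"],
            "internal": host == base_host,  # assumption: same host means internal
        })
    return results


SAMPLE_HTML = """
<p><a href="/docs/intro" title="Intro">Getting  started</a>
<a href="https://other.example.org/page">Elsewhere</a>
<a href="contact.html">Contact</a></p>
"""

links = extract_links(SAMPLE_HTML, "https://example.com/guide/index.html", limit=2)
for link in links:
    print(link)
```

With limit=2, only the first two of the three sample links are returned. The first resolves against the base host and is marked internal; the second points at a different host and is marked external.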

Input Schema

Name      Required   Description   Default
baseUrl   No         -             -
limit     No         -             100
url       Yes        -             -

Input Schema (JSON Schema)

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "baseUrl": {
      "format": "uri",
      "type": "string"
    },
    "limit": {
      "default": 100,
      "maximum": 5000,
      "minimum": 1,
      "type": "number"
    },
    "url": {
      "format": "uri",
      "type": "string"
    }
  },
  "required": [
    "url"
  ],
  "type": "object"
}
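As a rough illustration of how a client might use this schema, the Python sketch below checks an arguments object against the constraints listed above and wraps it in an MCP tools/call request. The helpers validate_arguments and build_request are hypothetical, the URI check is deliberately loose (it only requires a scheme), and a real client would more likely rely on a full JSON Schema validator.

```python
import json
from urllib.parse import urlparse


def validate_arguments(args):
    """Hand-rolled check mirroring the schema's constraints; returns a list of problems."""
    problems = []
    allowed = {"url", "baseUrl", "limit"}
    for key in args:
        if key not in allowed:  # additionalProperties: false
            problems.append(f"unexpected property: {key}")
    if "url" not in args:  # required: ["url"]
        problems.append("missing required property: url")
    for key in ("url", "baseUrl"):
        if key in args:
            value = args[key]
            # Loose stand-in for format: uri (assumption: a scheme is enough).
            if not isinstance(value, str) or not urlparse(value).scheme:
                problems.append(f"{key} must be a URI string")
    if "limit" in args:
        limit = args["limit"]
        is_number = isinstance(limit, (int, float)) and not isinstance(limit, bool)
        if not is_number or not 1 <= limit <= 5000:
            problems.append("limit must be a number from 1 to 5000")
    return problems


def build_request(args, request_id=1):
    """Wrap validated arguments in a JSON-RPC 2.0 tools/call request."""
    problems = validate_arguments(args)
    if problems:
        raise ValueError("; ".join(problems))
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "extract-links", "arguments": args},
    }


request = build_request({"url": "https://example.com/", "limit": 250})
print(json.dumps(request, indent=2))
```

Omitting limit leaves the schema default of 100 to apply. Because additionalProperties is false, any key other than url, baseUrl, or limit is reported as a problem.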