web_extract_links
Extract links from a webpage to analyze content structure, gather references, or compile resource lists for web development and research. The handler returns only absolute links (href values beginning with "http"), de-duplicated and capped at 30.
Instructions
Extract all links from a webpage
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL to extract links from | |
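
As a rough illustration of the schema above, a call to this tool carries a single `url` argument. The sketch below assumes the server is reached over a Model Context Protocol style JSON-RPC `tools/call` request, which the `server.tool` registration suggests but this page does not state; the `id` value and the example URL are placeholders.

```typescript
// Assumed MCP-style JSON-RPC request; `id` and the URL are placeholders.
const request = {
  jsonrpc: "2.0",
  id: 1,
  method: "tools/call",
  params: {
    name: "web_extract_links",
    arguments: { url: "https://example.com" },
  },
};

// The schema requires `url` to be a valid URL string. `new URL` throws on
// malformed input; it only approximates the server-side validation.
const parsed = new URL(request.params.arguments.url);

console.log(JSON.stringify(request, null, 2));
```

A value that is not a well-formed URL (for example `"example"`) would be expected to fail the schema check before the handler runs.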
Implementation Reference
- src/modules/web.ts:20-33 (handler): The handler for the 'web_extract_links' tool. It fetches the URL's HTML, uses a regex to find every 'href' attribute whose value starts with 'http', de-duplicates the matches, and returns up to 30 unique links.
    server.tool(
      "web_extract_links",
      "Extract all links from a webpage",
      { url: z.string().url().describe("URL to extract links from") },
      async ({ url }) => {
        const html = await safeFetchText(url);
        const links: string[] = [];
        const regex = /href=["']([^"']+)["']/gi;
        let match;
        while ((match = regex.exec(html)) !== null) {
          const href = match[1];
          if (href.startsWith("http")) links.push(href);
        }
        const unique = [...new Set(links)].slice(0, 30);
        return {
          content: [{
            type: "text",
            text: `**Links from** ${url}\n\n${unique.map((l, i) => `${i + 1}. ${l}`).join("\n")}\n\nTotal: ${unique.length} unique links`,
          }],
        };
      }
    );
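
To see the extraction step in isolation, the sketch below pulls the regex loop out of the handler into a standalone function. The name `extractLinks`, its `limit` parameter, and the sample HTML are hypothetical and not part of the source; the regex, the "http" prefix check, and the default cap of 30 follow the handler above.

```typescript
// Hypothetical standalone helper mirroring the handler's extraction logic.
function extractLinks(html: string, limit = 30): string[] {
  const links: string[] = [];
  // Capture the value of every href="..." or href='...' attribute.
  const regex = /href=["']([^"']+)["']/gi;
  let match: RegExpExecArray | null;
  while ((match = regex.exec(html)) !== null) {
    const href = match[1];
    // Keep only values beginning with "http"; relative links are skipped.
    if (href.startsWith("http")) links.push(href);
  }
  // De-duplicate while preserving first-seen order, then cap the result.
  return Array.from(new Set(links)).slice(0, limit);
}

// Made-up sample input: one duplicate, one relative link.
const sample = `
  <a href="https://example.com/a">A</a>
  <a href='https://example.com/a'>A again</a>
  <a href="/relative/path">Relative</a>
  <a href="http://example.org/b">B</a>
`;

const found = extractLinks(sample);
console.log(found); // [ 'https://example.com/a', 'http://example.org/b' ]
```

Because the check is a plain prefix match, relative links such as `/relative/path` and non-HTTP schemes such as `mailto:` are dropped, and unquoted `href` values are never matched by the regex.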