web_read

Extract and process content from web pages or HTML strings to retrieve information for analysis and research.

Instructions

Alias of web.read

Input Schema


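Name	Required	Description	Default
`url`	Yes	String. Document URL handed to JSDOM as the page URL for the parsed HTML.	
`html`	No	String. HTML to parse. When omitted, the handler parses an empty document.	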
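To make the schema concrete, here is a minimal sketch of a hand-written check that mirrors it: `url` must be a string, and `html` may be absent but must be a string when present. The names `WebReadArgs` and `checkWebReadArgs` are hypothetical and do not appear in the server source, which uses Zod for this.

```typescript
// Hypothetical mirror of the tool input: `url` required, `html` optional.
interface WebReadArgs {
  url: string;
  html?: string;
}

// Returns a list of problems; an empty list means the arguments look valid.
function checkWebReadArgs(input: unknown): string[] {
  if (typeof input !== "object" || input === null) {
    return ["arguments must be an object"];
  }
  const problems: string[] = [];
  const args = input as Record<string, unknown>;
  if (typeof args["url"] !== "string") {
    problems.push("url is required and must be a string");
  }
  if (args["html"] !== undefined && typeof args["html"] !== "string") {
    problems.push("html, when present, must be a string");
  }
  return problems;
}

// Example arguments: the URL and HTML here are placeholders.
const okArgs: WebReadArgs = {
  url: "https://example.com/post",
  html: "<html><body><article><p>Hello</p></article></body></html>",
};
console.log(checkWebReadArgs(okArgs)); // []
console.log(checkWebReadArgs({ html: "<p>x</p>" })); // one problem: url is missing
```

A check like this only covers types. It does not verify that `url` is a well-formed URL or that `html` contains an extractable article.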

Implementation Reference

src/tools/webRead.ts:4-30 (handler)
Core handler function for webRead tool: parses HTML with JSDOM and Readability to extract readable article content, metadata, links, language, and word count.
export function webRead(args: { url: string, html?: string }) {
  const { url, html } = args;
  const doc = new JSDOM(html || '', { url });
  const reader = new Readability(doc.window.document);
  const art = reader.parse();
  if (!art) return { title: '', byline: '', lang: '', text: '', wordCount: 0, links: [], meta: {} };
  const links: Array<{text: string, url: string}> = [];
  const anchorEls = doc.window.document.querySelectorAll('a[href]');
  anchorEls.forEach(a => {
    const href = (a as HTMLAnchorElement).href;
    const text = (a as HTMLElement).textContent?.trim() || '';
    if (href) links.push({ text, url: href });
  });
  const meta: Record<string,string> = {};
  const metas = doc.window.document.querySelectorAll('meta[name], meta[property]');
  metas.forEach((m:any) => {
    const key = m.getAttribute('name') || m.getAttribute('property');
    const val = m.getAttribute('content');
    if (key && val) meta[key] = val;
  });
  return {
    title: art.title || '',
    byline: art.byline || '',
    lang: (doc.window.document.documentElement.getAttribute('lang') || '').toLowerCase(),
    text: art.textContent || '',
    wordCount: (art.textContent || '').split(/\s+/).filter(Boolean).length,
    links,
    meta
  };
}
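Two small pieces of the handler can be tried in isolation, without JSDOM or Readability. The sketch below reuses the handler's word-count expression and approximates link resolution with the standard `URL` constructor. The helper names `countWords` and `resolveHref` are hypothetical, and treating `new URL(href, base)` as equivalent to what JSDOM returns for an anchor's `href` is an assumption (it ignores `<base>` elements, for instance).

```typescript
// Same whitespace-splitting rule as the handler's wordCount expression:
// split on runs of whitespace, then drop the empty strings at the edges.
function countWords(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Assumption: resolving a relative href against the document URL behaves
// like the WHATWG URL constructor with a base argument.
function resolveHref(href: string, baseUrl: string): string {
  return new URL(href, baseUrl).href;
}

console.log(countWords("  Hello   brave\nnew world ")); // 4
console.log(resolveHref("/about", "https://example.com/blog/post")); // https://example.com/about
console.log(resolveHref("next", "https://example.com/blog/post")); // https://example.com/blog/next
```

This suggests why `url` is required even when `html` is supplied: without a page URL there is nothing to resolve relative links against.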
src/server.ts:87-87 (schema)
Zod schema defining the input parameters for the web_read tool: url (required string) and optional html.
const webReadShape = { url: z.string(), html: z.string().optional() };
src/server.ts:95-101 (registration)
Registration of the 'web_read' tool name, using webReadShape schema and webRead handler, returning JSON stringified result.
server.tool('web_read', 'Alias of web.read', webReadShape, OPEN, async ({ url, html }) => {
  const res = webRead({ url, html });
  return { content: [{ type: 'text', text: JSON.stringify(res) }] };
});
src/server.ts:88-94 (registration)
Primary registration of the related 'web.read' tool, sharing the same schema and handler as 'web_read' alias.
server.tool('web.read', 'Extract readable content from given HTML (or pass html from web.fetch).', webReadShape, OPEN, async ({ url, html }) => {
  const res = webRead({ url, html });
  return { content: [{ type: 'text', text: JSON.stringify(res) }] };
});
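Both registrations return the handler result as a JSON string inside a single text content item, so a caller has to parse it back into an object. A minimal client-side sketch follows, assuming the response shape shown in the registrations above. The names `WebReadResult`, `ToolResponse`, and `parseWebReadResponse` are hypothetical, and the response is simulated rather than fetched from a running server.

```typescript
// Result fields taken from the handler's return statement.
interface WebReadResult {
  title: string;
  byline: string;
  lang: string;
  text: string;
  wordCount: number;
  links: Array<{ text: string; url: string }>;
  meta: Record<string, string>;
}

// Shape of the value both registrations return.
interface ToolResponse {
  content: Array<{ type: string; text: string }>;
}

// Hypothetical helper: take the first text item and parse its JSON payload.
function parseWebReadResponse(response: ToolResponse): WebReadResult {
  const item = response.content.filter((c) => c.type === "text")[0];
  if (!item) {
    throw new Error("no text content in tool response");
  }
  return JSON.parse(item.text) as WebReadResult;
}

// Simulated response carrying the empty result the handler returns
// when Readability finds no article.
const emptyResponse: ToolResponse = {
  content: [{
    type: "text",
    text: JSON.stringify({ title: "", byline: "", lang: "", text: "", wordCount: 0, links: [], meta: {} }),
  }],
};
const parsedResult = parseWebReadResponse(emptyResponse);
console.log(parsedResult.wordCount, parsedResult.links.length); // 0 0
```

A result with `wordCount` of 0 and an empty `text` is worth checking for on the caller side, since the handler returns it instead of raising an error when no article is extracted.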

TOOL4LM
