Glama

web_read

Extract and process web page content or HTML data for structured analysis. Enables local LLMs to retrieve and interpret online information without API dependencies.

Instructions

Alias of web.read

Input Schema

Name    Required    Description    Default
html    No          -              -
url     Yes         -              -
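
As a rough illustration of the schema above, the arguments could be modelled and checked as in the sketch below. The names `WebReadArgs`, `isWebReadArgs`, and `exampleArgs` are hypothetical and do not come from the server's source, and the hand-written check only stands in for whatever validation the server actually performs.

```typescript
// Hypothetical type mirroring the input schema: `url` is required, `html` is optional.
interface WebReadArgs {
  url: string;
  html?: string;
}

// Hand-written runtime check (illustration only, not the server's own validation).
function isWebReadArgs(value: unknown): value is WebReadArgs {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.url !== 'string') return false;
  return v.html === undefined || typeof v.html === 'string';
}

// Example arguments; the URL and markup are placeholders.
const exampleArgs: WebReadArgs = {
  url: 'https://example.com/post',
  html: '<html lang="en"><body><p>Hello world</p></body></html>',
};
```

Passing `html` alongside `url` lets the tool work on markup that has already been fetched, which matches the "without API dependencies" description.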

Implementation Reference

  • Core handler function for webRead that uses JSDOM and Readability to extract readable content, title, byline, language, text, word count, links, and meta from HTML.
    import { JSDOM } from 'jsdom';
    import { Readability } from '@mozilla/readability';

    export function webRead(args: { url: string, html?: string }) {
      const { url, html } = args;
      // Parse the supplied HTML (an empty document if none is given); `url` is used to resolve relative links.
      const doc = new JSDOM(html || '', { url });
      const reader = new Readability(doc.window.document);
      const art = reader.parse();
      // Readability found no article: return an empty result.
      if (!art) return { title: '', byline: '', lang: '', text: '', wordCount: 0, links: [], meta: {} };

      // Collect every anchor that has an href, with its trimmed link text.
      const links: Array<{text: string, url: string}> = [];
      const anchorEls = doc.window.document.querySelectorAll('a[href]');
      anchorEls.forEach(a => {
        const href = (a as HTMLAnchorElement).href;
        const text = (a as HTMLElement).textContent?.trim() || '';
        if (href) links.push({ text, url: href });
      });

      // Collect <meta name=...> and <meta property=...> tags as key/value pairs.
      const meta: Record<string,string> = {};
      const metas = doc.window.document.querySelectorAll('meta[name], meta[property]');
      metas.forEach((m:any) => {
        const key = m.getAttribute('name') || m.getAttribute('property');
        const val = m.getAttribute('content');
        if (key && val) meta[key] = val;
      });

      return {
        title: art.title || '',
        byline: art.byline || '',
        lang: (doc.window.document.documentElement.getAttribute('lang') || '').toLowerCase(),
        text: art.textContent || '',
        // Word count: split on whitespace and drop empty strings.
        wordCount: (art.textContent || '').split(/\s+/).filter(Boolean).length,
        links,
        meta
      };
    }
  • Zod schema defining input parameters for the web_read tool: url (required string) and optional html.
    import { z } from 'zod';

    const webReadShape = { url: z.string(), html: z.string().optional() };
  • src/server.ts:95-101 (registration)
    Registration of the 'web_read' tool in the MCP server, which is an alias calling the webRead handler.
    server.tool('web_read', 'Alias of web.read', webReadShape, OPEN, async ({ url, html }) => {
      const res = webRead({ url, html });
      return { content: [{ type: 'text', text: JSON.stringify(res) }] };
    });
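
The registration above serializes the handler's result with `JSON.stringify` and returns it as a single text content item, so a caller has to decode that string again. A minimal sketch of that decoding step follows. The names `WebReadResult`, `ToolTextResponse`, `parseWebReadResult`, and `countWords` are hypothetical; the field types are inferred from the handler's return value, not taken from a published type.

```typescript
// Hypothetical result type, inferred from the object returned by webRead.
interface WebReadResult {
  title: string;
  byline: string;
  lang: string;
  text: string;
  wordCount: number;
  links: Array<{ text: string; url: string }>;
  meta: Record<string, string>;
}

// Assumed minimal shape of the tool response: a list of typed content items.
interface ToolTextResponse {
  content: Array<{ type: string; text: string }>;
}

// Find the first text item and parse its JSON payload.
function parseWebReadResult(response: ToolTextResponse): WebReadResult {
  for (const item of response.content) {
    if (item.type === 'text') {
      return JSON.parse(item.text) as WebReadResult;
    }
  }
  throw new Error('no text content in tool response');
}

// Same word-count rule as the handler: split on whitespace, drop empty strings.
function countWords(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}
```

The cast after `JSON.parse` is unchecked, so a stricter client might validate the parsed object before trusting it.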

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/khanhs-234/tool4lm'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.