Glama

web_read

Extract and process content from web pages or HTML strings to retrieve information for analysis and research.

Instructions

Alias of web.read

Input Schema

Name    Required    Description    Default
url     Yes
html    No
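
The table above only says that url is required and html is optional. In the handler shown under Implementation Reference, url is passed to JSDOM as the document URL, so it appears to act as the base for relative links found in html. The sketch below mirrors that input shape with a hand-written check and approximates the link resolution with the standard URL constructor. The names WebReadArgs, isWebReadArgs and resolveLink are hypothetical and do not come from the server's source.

```typescript
// Hypothetical names: WebReadArgs, isWebReadArgs and resolveLink are
// illustrative and do not appear in the server's source.

// Mirrors the documented shape: url is a required string, html is optional.
interface WebReadArgs {
  url: string;
  html?: string;
}

// Hand-written stand-in for the Zod check, so the sketch has no dependencies.
function isWebReadArgs(value: unknown): value is WebReadArgs {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v['url'] !== 'string') return false;
  return v['html'] === undefined || typeof v['html'] === 'string';
}

// Approximates how a relative href resolves against the document URL.
function resolveLink(href: string, baseUrl: string): string {
  return new URL(href, baseUrl).href;
}

const args: unknown = {
  url: 'https://example.com/blog/post',
  html: '<a href="../about">About</a>',
};

if (isWebReadArgs(args)) {
  // prints "https://example.com/about"
  console.log(resolveLink('../about', args.url));
}
```

Like z.string(), the check accepts any string for url; it does not verify that the value is a well-formed URL.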

Implementation Reference

  • Core handler function for the webRead tool: parses HTML with JSDOM and Readability to extract readable article content, metadata, links, language, and word count. The handler does not fetch url itself; url is passed to JSDOM as the document URL, and an empty document is parsed when html is omitted.
    // Imports assumed from the libraries named above; the excerpt omits them.
    import { JSDOM } from 'jsdom';
    import { Readability } from '@mozilla/readability';

    export function webRead(args: { url: string, html?: string }) {
      const { url, html } = args;
      // url becomes the document URL, so relative hrefs resolve against it.
      const doc = new JSDOM(html || '', { url });
      const reader = new Readability(doc.window.document);
      const art = reader.parse();
      // No article found: return an empty result.
      if (!art) return { title: '', byline: '', lang: '', text: '', wordCount: 0, links: [], meta: {} };

      // Collect every anchor that carries an href.
      const links: Array<{text: string, url: string}> = [];
      const anchorEls = doc.window.document.querySelectorAll('a[href]');
      anchorEls.forEach(a => {
        const href = (a as HTMLAnchorElement).href;
        const text = (a as HTMLElement).textContent?.trim() || '';
        if (href) links.push({ text, url: href });
      });

      // Collect name/property and content pairs from meta tags.
      const meta: Record<string,string> = {};
      const metas = doc.window.document.querySelectorAll('meta[name], meta[property]');
      metas.forEach((m:any) => {
        const key = m.getAttribute('name') || m.getAttribute('property');
        const val = m.getAttribute('content');
        if (key && val) meta[key] = val;
      });

      return {
        title: art.title || '',
        byline: art.byline || '',
        lang: (doc.window.document.documentElement.getAttribute('lang') || '').toLowerCase(),
        text: art.textContent || '',
        // Word count: whitespace-separated, non-empty tokens.
        wordCount: (art.textContent || '').split(/\s+/).filter(Boolean).length,
        links,
        meta
      };
    }
  • Zod schema defining the input parameters for the web_read tool: url (required string) and html (optional string).
    const webReadShape = { url: z.string(), html: z.string().optional() };
  • src/server.ts:95-101 (registration)
    Registration of the 'web_read' tool name, using the webReadShape schema and the webRead handler, and returning the JSON-stringified result.
    server.tool('web_read', 'Alias of web.read', webReadShape, OPEN, async ({ url, html }) => {
      const res = webRead({ url, html });
      return { content: [{ type: 'text', text: JSON.stringify(res) }] };
    });
  • src/server.ts:88-94 (registration)
    Primary registration of the related 'web.read' tool, sharing the same schema and handler as the 'web_read' alias.
    server.tool('web.read', 'Extract readable content from given HTML (or pass html from web.fetch).', webReadShape, OPEN, async ({ url, html }) => {
      const res = webRead({ url, html });
      return { content: [{ type: 'text', text: JSON.stringify(res) }] };
    });
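
Both registrations return the handler result as a JSON string inside a single text content item, so a caller has to parse that string before using the fields. The sketch below shows one way to do this, assuming the response has exactly the shape seen in the registration code. The field names follow the object returned by webRead; the type and function names (WebReadResult, ToolResponse, parseWebReadResponse, countWords) are hypothetical, and the sample payload is made up for illustration.

```typescript
// Field names follow the object returned by webRead. The type and function
// names in this sketch are hypothetical and not part of the server.
interface WebReadResult {
  title: string;
  byline: string;
  lang: string;
  text: string;
  wordCount: number;
  links: Array<{ text: string; url: string }>;
  meta: Record<string, string>;
}

// Minimal view of the response shape built by the registration code.
interface ToolResponse {
  content: Array<{ type: string; text: string }>;
}

// Reads the first content item and decodes its JSON payload.
function parseWebReadResponse(response: ToolResponse): WebReadResult {
  const first = response.content[0];
  if (!first || first.type !== 'text') {
    throw new Error('expected a text content item');
  }
  return JSON.parse(first.text) as WebReadResult;
}

// Same counting rule as the handler: whitespace-separated, non-empty tokens.
function countWords(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}

// Made-up sample payload, shaped like a web_read result.
const sample: ToolResponse = {
  content: [{
    type: 'text',
    text: JSON.stringify({
      title: 'Example',
      byline: '',
      lang: 'en',
      text: 'Hello  world\nfrom web_read',
      wordCount: 4,
      links: [{ text: 'About', url: 'https://example.com/about' }],
      meta: { description: 'An example page' },
    }),
  }],
};

const result = parseWebReadResponse(sample);
// prints "Example true"
console.log(result.title, result.wordCount === countWords(result.text));
```

The cast after JSON.parse does no runtime validation, so a stricter client might check each field before trusting it.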

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/khanhs-234/tool4lm'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.