Glama

web_read

Extract and process content from web pages or HTML strings to retrieve information for analysis and research.

Instructions

Alias of web.read

Input Schema

Name    Required    Description    Default
url     Yes         -              -
html    No          -              -

Implementation Reference

  • Core handler function for webRead tool: parses HTML with JSDOM and Readability to extract readable article content, metadata, links, language, and word count.
    import { JSDOM } from 'jsdom';
    import { Readability } from '@mozilla/readability';

    export function webRead(args: { url: string, html?: string }) {
      const { url, html } = args;
      // `url` only sets the document's base URL; nothing is fetched here.
      const doc = new JSDOM(html || '', { url });
      // Readability.parse() modifies the document it is given, so the
      // link and meta queries below run against the post-parse DOM.
      const reader = new Readability(doc.window.document);
      const art = reader.parse();
      if (!art) return { title: '', byline: '', lang: '', text: '', wordCount: 0, links: [], meta: {} };
      // Collect every anchor that has an href; `href` is resolved against the base URL.
      const links: Array<{text: string, url: string}> = [];
      const anchorEls = doc.window.document.querySelectorAll('a[href]');
      anchorEls.forEach(a => {
        const href = (a as HTMLAnchorElement).href;
        const text = (a as HTMLElement).textContent?.trim() || '';
        if (href) links.push({ text, url: href });
      });
      // Collect <meta name=...> and <meta property=...> tags as key/value pairs.
      const meta: Record<string,string> = {};
      const metas = doc.window.document.querySelectorAll('meta[name], meta[property]');
      metas.forEach((m:any) => {
        const key = m.getAttribute('name') || m.getAttribute('property');
        const val = m.getAttribute('content');
        if (key && val) meta[key] = val;
      });
      return {
        title: art.title || '', byline: art.byline || '',
        lang: (doc.window.document.documentElement.getAttribute('lang') || '').toLowerCase(),
        // Word count: split the extracted text on whitespace and drop empty strings.
        text: art.textContent || '', wordCount: (art.textContent || '').split(/\s+/).filter(Boolean).length,
        links, meta
      };
    }
  • Zod schema defining the input parameters for the web_read tool: url (required string) and optional html.
    import { z } from 'zod';
    const webReadShape = { url: z.string(), html: z.string().optional() };
  • src/server.ts:95-101 (registration)
    Registration of the 'web_read' tool name, using webReadShape schema and webRead handler, returning JSON stringified result.
    server.tool('web_read', 'Alias of web.read',
      webReadShape, OPEN,
      async ({ url, html }) => {
        const res = webRead({ url, html });
        return { content: [{ type: 'text', text: JSON.stringify(res) }] };
      }
    );
  • src/server.ts:88-94 (registration)
    Primary registration of the related 'web.read' tool, sharing the same schema and handler as 'web_read' alias.
    server.tool('web.read', 'Extract readable content from given HTML (or pass html from web.fetch).',
      webReadShape, OPEN,
      async ({ url, html }) => {
        const res = webRead({ url, html });
        return { content: [{ type: 'text', text: JSON.stringify(res) }] };
      }
    );
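Both registrations above return the handler's result as a JSON string inside a single text content item, so a caller has to parse that string to get back to the structured fields. The following is a minimal sketch of that step, assuming the response shape shown in the registrations. The names `WebReadResult`, `ToolTextResponse`, `parseWebReadResult`, and `countWords` are hypothetical and introduced only for illustration; they are not part of the server.

```typescript
// Shape of the object returned by webRead, as read from the handler above.
interface WebReadResult {
  title: string;
  byline: string;
  lang: string;
  text: string;
  wordCount: number;
  links: Array<{ text: string; url: string }>;
  meta: Record<string, string>;
}

// Envelope produced by the registration callbacks (hypothetical type name).
interface ToolTextResponse {
  content: Array<{ type: string; text: string }>;
}

// Hypothetical helper: find the first text item and parse its JSON payload.
function parseWebReadResult(response: ToolTextResponse): WebReadResult {
  const item = response.content.find((c) => c.type === 'text');
  if (!item) throw new Error('web_read response has no text content item');
  return JSON.parse(item.text) as WebReadResult;
}

// Same word-count rule as the handler: split on whitespace, drop empty strings.
function countWords(text: string): number {
  return text.split(/\s+/).filter(Boolean).length;
}
```

The cast in `parseWebReadResult` trusts the server's output; a stricter client could validate each field before using it.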
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds no behavioral context beyond what the openWorldHint annotation already provides. While it doesn't contradict the annotation (which suggests this tool can access external resources), it fails to disclose any additional behavioral traits such as rate limits, authentication requirements, or what specific type of web reading operation it performs.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise at just three words, with zero wasted language. While this conciseness comes at the expense of informational value, the description is perfectly front-loaded and every word serves a purpose in establishing the tool's relationship to another tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 2 parameters (one required), 0% schema description coverage, no output schema, and only a basic openWorldHint annotation, the description is incomplete. It doesn't explain what the tool does, when to use it, what the parameters mean, or what behavior to expect, leaving the agent with too little context to use the tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage for both parameters (url and html), the description provides no semantic information about what these parameters mean or how they should be used. The description 'Alias of web.read' doesn't explain the purpose of either parameter or their relationship to each other, leaving the agent with only the bare schema information.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
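As an illustration of what parameter descriptions could add here, the sketch below annotates the two inputs in plain JSON Schema form. The description strings are suggestions inferred from the handler code in the Implementation Reference (where `url` is passed to JSDOM as the base URL and `html` defaults to an empty string); they are not text from the server. `webReadInputSchema` and `isWebReadInput` are hypothetical names.

```typescript
// Hypothetical annotated input schema for web_read.
// Description wording is an assumption based on the handler, not the server's own text.
const webReadInputSchema = {
  type: 'object',
  properties: {
    url: {
      type: 'string',
      description:
        'URL the HTML came from. Used as the document base URL so relative ' +
        'links resolve against it. The handler does not fetch this URL.',
    },
    html: {
      type: 'string',
      description:
        'HTML markup to parse, for example the output of web.fetch. ' +
        'If omitted, an empty document is parsed and there is no content to extract.',
    },
  },
  required: ['url'],
} as const;

// Minimal structural check mirroring the schema:
// url must be a string, html must be a string when present.
function isWebReadInput(value: unknown): value is { url: string; html?: string } {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.url === 'string' && (v.html === undefined || typeof v.html === 'string');
}
```

In the server's own Zod shape, comparable text could presumably be attached per field, but that is a design choice for the server author rather than something the source shows.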

Purpose: 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Alias of web.read' is a tautology that merely restates the tool name without explaining what the tool actually does. It doesn't specify what resource it operates on or what action it performs. While it hints at a relationship with 'web.read', it fails to articulate the tool's purpose independently.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. With sibling tools like 'web.fetch', 'web.search', and 'web.read' available, there's no indication of what distinguishes 'web_read' from these other web-related tools or when one should be preferred over another.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/khanhs-234/tool4lm'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.