MCP Node Fetch

extract-html-fragment

Extract specific HTML elements from webpages using CSS selectors to retrieve targeted content for data processing or analysis.

Instructions

Extract a specific HTML fragment from a webpage using CSS selectors

Input Schema

TableJSON Schema

Name	Required	Description	Default
`url`	Yes	URL to fetch
`selector`	Yes	CSS selector for the HTML fragment to extract
`anchorId`	No	Optional anchor ID to locate a specific fragment
`method`	No	HTTP method	GET
`headers`	No	HTTP headers
`body`	No	Request body for POST requests
`timeout`	No	Request timeout in milliseconds
`followRedirects`	No	Whether to follow redirects
`fragmentSelector`	No	CSS selector for the HTML fragment to extract (when responseType is html-fragment)

Implementation Reference

src/index.ts:288-378 (handler)

The primary handler function for the 'extract-html-fragment' tool. Fetches the URL, parses HTML using JSDOM, locates elements via CSS selector (optionally using anchorId), extracts outerHTML of matching elements, and returns a structured JSON result including status, match count, and HTML content.

// Handler for extract-html-fragment tool
async function handleExtractHtmlFragment(args: ExtractHtmlFragmentArgs): Promise<z.infer<typeof CallToolResultSchema>> {
  try {
    const { url, selector, anchorId, method, headers, body, timeout, followRedirects } = args;
    
    console.error(`Extracting HTML fragment from ${url} using selector "${selector}"`); 
    
    // Create request options
    const options: any = {
      method,
      headers: headers || {},
      body: body || undefined,
      redirect: followRedirects ? "follow" : "manual",
    };
    
    if (timeout) {
      // @ts-ignore - undici specific options
      options.bodyTimeout = timeout;
      // @ts-ignore - undici specific options
      options.headersTimeout = timeout;
    }
    
    // Perform the request
    const response = await fetch(url, options);
    
    if (!response.ok) {
      throw new Error(`Failed to fetch URL: HTTP ${response.status} ${response.statusText}`);
    }
    
    // Get the HTML content
    const html = await response.text();
    
    // Parse the HTML using JSDOM
    const dom = new JSDOM(html);
    const document = dom.window.document;
    
    // Handle anchor if provided
    if (anchorId) {
      // Scroll to the anchor element
      const anchorElement = document.getElementById(anchorId);
      if (!anchorElement) {
        throw new Error(`Anchor element with ID "${anchorId}" not found`);
      }
    }
    
    // Find the element(s) matching the selector
    const elements = document.querySelectorAll(selector);
    
    if (elements.length === 0) {
      throw new Error(`No elements found matching selector "${selector}"`);
    }
    
    // Create a result object
    const result: Record<string, any> = {
      status: response.status,
      statusText: response.statusText,
      url: response.url,
      selector: selector,
      matchCount: elements.length,
    };
    
    // Get the HTML content of the selected elements
    if (elements.length === 1) {
      // If only one element found, return its outer HTML
      result.html = elements[0].outerHTML;
    } else {
      // If multiple elements found, return an array of their outer HTML
      result.html = Array.from(elements).map(el => el.outerHTML);
    }
    
    return {
      content: [
        {
          type: "text",
          text: JSON.stringify(result, null, 2)
        }
      ]
    };
  } catch (error) {
    console.error(`Error extracting HTML fragment:`, error);
    return {
      isError: true,
      content: [
        {
          type: "text",
          text: `Error extracting HTML fragment: ${(error as Error).message}`
        }
      ]
    };
  }
}

src/index.ts:27-36 (schema)

Zod schema for validating input arguments to the extract-html-fragment tool, used in the handler for parsing args.

const ExtractHtmlFragmentArgsSchema = z.object({
  url: z.string().url().describe("URL to fetch"),
  selector: z.string().describe("CSS selector for the HTML fragment to extract"),
  anchorId: z.string().optional().describe("Optional anchor ID to locate a specific fragment"),
  method: z.enum(["GET", "POST"]).default("GET").describe("HTTP method"),
  headers: z.record(z.string()).optional().describe("HTTP headers"),
  body: z.string().optional().describe("Request body for POST requests"),
  timeout: z.number().positive().optional().describe("Request timeout in milliseconds"),
  followRedirects: z.boolean().default(true).describe("Whether to follow redirects")
});

src/index.ts:50-83 (registration)

Tool registration in the TOOLS array, including name, description, and JSON inputSchema returned by listTools.

{
  name: "extract-html-fragment",
  description: "Extract a specific HTML fragment from a webpage using CSS selectors",
  inputSchema: {
    type: "object",
    properties: {
      url: { type: "string", description: "URL to fetch" },
      selector: { type: "string", description: "CSS selector for the HTML fragment to extract" },
      anchorId: { type: "string", description: "Optional anchor ID to locate a specific fragment" },
      method: { 
        type: "string", 
        enum: ["GET", "POST"], 
        default: "GET", 
        description: "HTTP method" 
      },
      headers: { 
        type: "object", 
        additionalProperties: { type: "string" }, 
        description: "HTTP headers" 
      },
      body: { type: "string", description: "Request body for POST requests" },
      timeout: { type: "number", description: "Request timeout in milliseconds" },
      followRedirects: { 
        type: "boolean", 
        default: true, 
        description: "Whether to follow redirects" 
      },
      fragmentSelector: { 
        type: "string", 
        description: "CSS selector for the HTML fragment to extract (when responseType is html-fragment)" 
      }
    },
    required: ["url", "selector"]
  }

src/index.ts:161-162 (registration)
Registration of the tool handler in the switch statement of the CallToolRequestSchema handler.
```
case "extract-html-fragment":
  return handleExtractHtmlFragment(ExtractHtmlFragmentArgsSchema.parse(args));
```

Tool Definition Quality

C2.9/5.0

Behavior2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden for behavioral disclosure. While 'extract' implies a read operation, it doesn't address important behaviors like error handling, rate limits, authentication needs, or what happens when selectors don't match. The description mentions CSS selectors but doesn't explain the extraction mechanism or output format.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that gets straight to the point with zero wasted words. It's appropriately sized for the tool's complexity and front-loads the core functionality without unnecessary elaboration.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 9 parameters, no annotations, and no output schema, the description is inadequate. It doesn't explain what the tool returns, how errors are handled, or provide context about the extraction process. The agent would need to guess about the output format and error conditions when using this tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all 9 parameters thoroughly. The description adds minimal value beyond what's in the schema - it mentions CSS selectors (which appears twice in the schema) but doesn't provide additional context about parameter interactions or usage patterns.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose as extracting HTML fragments from webpages using CSS selectors, which is a specific verb+resource combination. However, it doesn't distinguish this tool from its sibling tools (check-status and fetch-url), which likely have different functions but could overlap in webpage interaction scenarios.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like fetch-url or check-status. It doesn't mention prerequisites, limitations, or typical use cases, leaving the agent to infer usage context from the tool name and parameters alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Other Tools

Latest Blog Posts

Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
open source
OpenAI
Tool Definition Quality Score (TDQS)
By punkpeye on April 3, 2026.
mcp
The Hackers Who Tracked My Sleep Cycle
By punkpeye on March 26, 2026.
security

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mcollina/mcp-node-fetch'

If you have feedback or need assistance with the MCP directory API, please join our Discord server