fetch_chunk

Retrieve the plain-text content of a specific webpage section using a CSS anchor. Reduce token costs by fetching only relevant content through structure-aware selection.

Instructions

Fetch the plain-text content of a specific section of a webpage by its CSS anchor. Call get_manifest first to discover the available anchors. The anchor is resolved in three stages: getElementById, then querySelector, then a fuzzy match against heading text.

Input Schema


Name	Required	Description	Default
`url`	Yes	Fully-qualified URL of the webpage
`anchor`	Yes	CSS anchor of the target section (e.g. "#introduction" or "#wasp-003")
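
To make the two parameters concrete, here is a minimal sketch of the JSON-RPC 2.0 request an MCP client would send to invoke this tool through `tools/call`. The helper `buildFetchChunkRequest`, the interface names, and the example URL are illustrative assumptions and not part of wasp-mcp.

```typescript
// Arguments accepted by fetch_chunk, per the input schema above.
interface FetchChunkArgs {
  url: string;    // fully-qualified URL of the webpage
  anchor: string; // CSS anchor, e.g. "#introduction"
}

// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: FetchChunkArgs };
}

// Hypothetical helper: builds the request body for a fetch_chunk call.
function buildFetchChunkRequest(id: number, args: FetchChunkArgs): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch_chunk", arguments: args },
  };
}

// Example payload; the URL is a placeholder.
const exampleRequest = buildFetchChunkRequest(1, {
  url: "https://example.com/docs",
  anchor: "#introduction",
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice the anchor value would come from a prior get_manifest call rather than being hard-coded.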

Implementation Reference

chunks.ts:107-121 (handler)

The main handler function `fetchChunk` returns a chunk's plain-text content given a URL and a CSS anchor. It obtains the page HTML via getManifest, parses it with JSDOM, and resolves the anchor to an element (getElementById, then querySelector, then fuzzy heading match). If no element is found it returns an empty string. If the element is a heading it returns the text of the section under that heading; otherwise it returns the element's own trimmed text content.

export async function fetchChunk(url: string, anchor: string): Promise<string> {
  const { html } = await getManifest(url);
  const dom = new JSDOM(html, { url });
  const document = dom.window.document;

  const el = findElement(document, anchor);
  if (!el) return "";

  if (isHeading(el)) {
    return collectSectionText(document, el);
  }

  // Non-heading anchor (from native manifest) — return element's own text
  return el.textContent?.trim() ?? "";
}

index.ts:43-57 (schema)

The input schema definition for the fetch_chunk tool. It declares two required parameters: `url` (a fully-qualified URL string) and `anchor` (a CSS anchor string like '#introduction').

inputSchema: {
  type: "object",
  properties: {
    url: {
      type: "string",
      description: "Fully-qualified URL of the webpage",
    },
    anchor: {
      type: "string",
      description:
        'CSS anchor of the target section (e.g. "#introduction" or "#wasp-003")',
    },
  },
  required: ["url", "anchor"],
},

index.ts:101-112 (registration)

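The branch of the CallToolRequestSchema request handler that serves fetch_chunk. When name === 'fetch_chunk', it extracts url and anchor from args, calls fetchChunk from chunks.ts, and returns the result as a single text content item. If fetchChunk returns an empty string, the placeholder "(no content found for this anchor)" is returned instead.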

if (name === "fetch_chunk") {
  const { url, anchor } = args as { url: string; anchor: string };
  const content = await fetchChunk(url, anchor);
  return {
    content: [
      {
        type: "text",
        text: content || "(no content found for this anchor)",
      },
    ],
  };
}
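
The `args as { url: string; anchor: string }` cast above is a compile-time assertion only, so a malformed call would reach fetchChunk unchecked. The following is a sketch of how the arguments could be validated at runtime before the call. `parseFetchChunkArgs` and `ValidatedArgs` are hypothetical names, and the http(s)-only rule is an assumption about what "fully-qualified" means here; none of this is taken from index.ts.

```typescript
interface ValidatedArgs {
  url: string;
  anchor: string;
}

// Hypothetical runtime check for the fetch_chunk arguments.
function parseFetchChunkArgs(args: unknown): ValidatedArgs {
  if (typeof args !== "object" || args === null) {
    throw new Error("fetch_chunk: arguments must be an object");
  }
  const { url, anchor } = args as Record<string, unknown>;
  if (typeof url !== "string" || typeof anchor !== "string") {
    throw new Error("fetch_chunk: `url` and `anchor` must be strings");
  }
  if (anchor.trim() === "") {
    throw new Error("fetch_chunk: `anchor` must not be empty");
  }
  // Assumption: a fully-qualified URL means an absolute http(s) URL.
  if (!/^https?:\/\/\S+$/i.test(url)) {
    throw new Error("fetch_chunk: `url` must be an absolute http(s) URL");
  }
  return { url, anchor };
}
```

With such a helper, the handler branch could call `parseFetchChunkArgs(args)` in place of the cast and let the thrown error surface as a tool error.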

chunks.ts:35-51 (helper)

The `findElement` helper that resolves an anchor to a DOM element using a three-stage strategy: getElementById first, then querySelector, then fuzzy heading text match.

function findElement(document: Document, anchor: string): Element | null {
  // Stage 1: getElementById — handles CSS-special chars in IDs
  const id = anchor.startsWith("#") ? anchor.slice(1) : anchor;
  const byId = document.getElementById(id);
  if (byId) return byId;

  // Stage 2: querySelector — handles non-id selectors from native manifests
  try {
    const byQuery = document.querySelector(anchor);
    if (byQuery) return byQuery;
  } catch {
    // Invalid selector — fall through
  }

  // Stage 3: fuzzy slug match on heading text
  return fuzzyFindHeading(document, anchor);
}
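
`fuzzyFindHeading` is not included in this reference, so its exact matching rule is unknown. As an illustration of what a stage-3 slug match might look like, the sketch below works on plain heading strings rather than DOM nodes. `slugify`, `fuzzyMatchHeading`, and the exact-then-substring rule are assumptions and may differ from what chunks.ts does.

```typescript
// Hypothetical slug rule: lowercase, runs of non-alphanumerics become "-",
// leading and trailing dashes are dropped.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Returns the index of the first heading whose slug equals the anchor's slug,
// or failing that contains it; returns -1 when nothing matches.
function fuzzyMatchHeading(headings: string[], anchor: string): number {
  const target = slugify(anchor.replace(/^#/, ""));
  if (!target) return -1;
  const slugs = headings.map(slugify);
  const exact = slugs.indexOf(target);
  if (exact !== -1) return exact;
  return slugs.findIndex((s) => s.includes(target));
}

const headings = ["Getting Started", "API Reference: fetch_chunk", "FAQ"];
console.log(fuzzyMatchHeading(headings, "#getting-started")); // 0
console.log(fuzzyMatchHeading(headings, "#api-reference"));   // 1
console.log(fuzzyMatchHeading(headings, "#missing"));         // -1
```

Preferring an exact slug match before a substring match keeps a short anchor such as "#faq" from resolving to an earlier heading that merely contains the same letters.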

chunks.ts:53-105 (helper)

The `collectSectionText` helper extracts all text under a heading element up to the next heading at the same or a higher level (only h1 through h4 are considered as boundaries). It tries the Range API first and, if that throws or yields no text, falls back to a document-order walk over text nodes.

function collectSectionText(document: Document, headingEl: Element): string {
  const depth = getHeadingDepth(headingEl);

  // Find the next heading at the same depth or shallower (lower number = higher)
  const allHeadings = Array.from(document.querySelectorAll("h1,h2,h3,h4"));
  const idx = allHeadings.indexOf(headingEl);
  let nextStop: Element | null = null;
  for (let i = idx + 1; i < allHeadings.length; i++) {
    if (getHeadingDepth(allHeadings[i]) <= depth) {
      nextStop = allHeadings[i];
      break;
    }
  }

  // Primary: Range API (same approach as browser content.js)
  try {
    const range = document.createRange();
    range.setStartAfter(headingEl);
    if (nextStop) {
      range.setEndBefore(nextStop);
    } else {
      const body = document.body ?? document.documentElement;
      range.setEndAfter(body);
    }
    const text = range.toString().trim();
    if (text) return text;
  } catch {
    // Range API inconsistency in JSDOM: fall through to the text-node walk
  }

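  // Fallback: document-order text node walk using compareDocumentPosition.
  // FOLLOWING and PRECEDING are defined outside this excerpt; presumably they alias
  // Node.DOCUMENT_POSITION_FOLLOWING (4) and Node.DOCUMENT_POSITION_PRECEDING (2).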
  const body = document.body ?? document.documentElement;
  // SHOW_TEXT = 4
  const walker = document.createTreeWalker(body, 4);
  const parts: string[] = [];
  let node: Node | null = walker.nextNode();

  while (node) {
    const posFromHeading = headingEl.compareDocumentPosition(node);
    if (posFromHeading & FOLLOWING) {
      if (nextStop) {
        const posFromStop = nextStop.compareDocumentPosition(node);
        // Stop when node is no longer preceding nextStop
        if (!(posFromStop & PRECEDING)) break;
      }
      const text = node.textContent?.trim();
      if (text) parts.push(text);
    }
    node = walker.nextNode();
  }

  return parts.join(" ").trim();
}
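
The section-boundary rule used by collectSectionText (stop at the next heading whose depth is the same or shallower) can be shown on a flat list of tag names, without a DOM. `headingDepth` stands in for the `getHeadingDepth` helper, which is not shown in this reference, so its behavior here is an assumption; `findSectionEnd` is a hypothetical name.

```typescript
// Assumed depth rule: "h2" -> 2, "H3" -> 3; anything that is not h1..h6 -> 0.
function headingDepth(tagName: string): number {
  const m = /^h([1-6])$/i.exec(tagName);
  return m ? Number(m[1]) : 0;
}

// Index of the heading that ends the section starting at `idx`,
// or -1 when the section runs to the end of the document.
function findSectionEnd(tags: string[], idx: number): number {
  const depth = headingDepth(tags[idx] ?? "");
  for (let i = idx + 1; i < tags.length; i++) {
    // Same depth or shallower (lower number) closes the section.
    if (headingDepth(tags[i] ?? "") <= depth) return i;
  }
  return -1;
}

const outline = ["h1", "h2", "h3", "h3", "h2", "h1"];
console.log(findSectionEnd(outline, 1)); // 4: the h2 section ends at the next h2
console.log(findSectionEnd(outline, 2)); // 3: the first h3 ends at its sibling h3
console.log(findSectionEnd(outline, 5)); // -1: the last h1 runs to the end
```

Nested subsections (the two h3 entries under the first h2) are therefore included in the parent section's text.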
