fetch_chunk
Retrieve the plain-text content of a specific webpage section using a CSS anchor. Reduce token costs by fetching only relevant content through structure-aware selection.
Instructions
Fetch the plain-text content of a specific section of a webpage by its CSS anchor. Call get_manifest first to discover available anchors. The anchor is resolved in three stages, stopping at the first match: getElementById → querySelector → fuzzy heading match.
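The three-stage fallback can be read as an ordered list of resolvers where the first non-null result wins. The sketch below models that idea without a DOM. `Section`, `slugify`, `Resolver`, and `resolveAnchor` are hypothetical names, and the slug comparison is only one plausible reading of "fuzzy heading match", not necessarily the matcher the server uses.

```typescript
// Hypothetical, DOM-free model of a page section (not the server's types).
interface Section {
  id: string | null; // value of the element's id attribute, if any
  heading: string;   // visible heading text
}

// Assumed slug rule: lowercase, runs of non-alphanumerics become one hyphen.
function slugify(text: string): string {
  return text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

type Resolver = (sections: Section[], anchor: string) => Section | null;

// Stage 1: exact id match, with a leading "#" stripped.
const byId: Resolver = (sections, anchor) => {
  const id = anchor.startsWith("#") ? anchor.slice(1) : anchor;
  return sections.find((s) => s.id === id) ?? null;
};

// Stage 2 stand-in: a real implementation would call querySelector here and
// swallow the error thrown for an invalid selector. This model has no DOM,
// so the stage never matches.
const bySelector: Resolver = () => null;

// Stage 3: compare the slug of the anchor with the slug of each heading.
const byHeadingSlug: Resolver = (sections, anchor) => {
  const wanted = slugify(anchor);
  return sections.find((s) => slugify(s.heading) === wanted) ?? null;
};

// Try each stage in order and return the first hit.
function resolveAnchor(sections: Section[], anchor: string): Section | null {
  for (const stage of [byId, bySelector, byHeadingSlug]) {
    const hit = stage(sections, anchor);
    if (hit) return hit;
  }
  return null;
}

const page: Section[] = [
  { id: "introduction", heading: "Introduction" },
  { id: null, heading: "Getting Started" },
];
```

With this sample data, `resolveAnchor(page, "#introduction")` is answered by the id stage, while `resolveAnchor(page, "#getting-started")` falls through to the heading-slug stage because no element carries that id.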
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | Fully-qualified URL of the webpage | |
| anchor | Yes | CSS anchor of the target section (e.g. "#introduction" or "#wasp-003") | |
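Both parameters are required strings, so a caller may want to check its arguments before invoking the tool. The guard below is a minimal sketch of such a check. `FetchChunkArgs` and `isFetchChunkArgs` are hypothetical names, and the `http(s)://` test is an assumed reading of "fully-qualified URL"; the tool itself may accept more or less than this.

```typescript
// Shape described by the input schema table.
interface FetchChunkArgs {
  url: string;
  anchor: string;
}

// Hypothetical runtime guard: narrows unknown arguments to FetchChunkArgs.
function isFetchChunkArgs(args: unknown): args is FetchChunkArgs {
  if (typeof args !== "object" || args === null) return false;
  const { url, anchor } = args as Record<string, unknown>;
  return (
    typeof url === "string" &&
    /^https?:\/\/\S+$/i.test(url) && // assumption: only http(s) URLs count
    typeof anchor === "string" &&
    anchor.trim().length > 0 // an empty anchor cannot identify a section
  );
}
```

A call such as `isFetchChunkArgs({ url: "/docs", anchor: "#introduction" })` is rejected under this assumption because the URL is relative.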
Implementation Reference
- chunks.ts:107-121 (handler) The main handler function `fetchChunk` that fetches a chunk's plain-text content by URL and CSS anchor. It fetches the page HTML via getManifest, parses it with JSDOM, finds the element by anchor (using getElementById → querySelector → fuzzy heading match), and if it's a heading returns the section text, otherwise returns the element's text content.

      export async function fetchChunk(url: string, anchor: string): Promise<string> {
        const { html } = await getManifest(url);
        const dom = new JSDOM(html, { url });
        const document = dom.window.document;

        const el = findElement(document, anchor);
        if (!el) return "";

        if (isHeading(el)) {
          return collectSectionText(document, el);
        }

        // Non-heading anchor (from native manifest): return element's own text
        return el.textContent?.trim() ?? "";
      }

- index.ts:43-57 (schema) The input schema definition for the fetch_chunk tool. It declares two required parameters: `url` (a fully-qualified URL string) and `anchor` (a CSS anchor string like '#introduction').

      inputSchema: {
        type: "object",
        properties: {
          url: {
            type: "string",
            description: "Fully-qualified URL of the webpage",
          },
          anchor: {
            type: "string",
            description:
              'CSS anchor of the target section (e.g. "#introduction" or "#wasp-003")',
          },
        },
        required: ["url", "anchor"],
      },

- index.ts:101-112 (registration) The registration of the fetch_chunk tool handler in the CallToolRequestSchema. When name === 'fetch_chunk', it extracts url and anchor from args, calls fetchChunk from chunks.ts, and returns the content.

      if (name === "fetch_chunk") {
        const { url, anchor } = args as { url: string; anchor: string };
        const content = await fetchChunk(url, anchor);
        return {
          content: [
            {
              type: "text",
              text: content || "(no content found for this anchor)",
            },
          ],
        };
      }

- chunks.ts:35-51 (helper) The `findElement` helper that resolves an anchor to a DOM element using a three-stage strategy: getElementById first, then querySelector, then fuzzy heading text match.

      function findElement(document: Document, anchor: string): Element | null {
        // Stage 1: getElementById (handles CSS-special chars in IDs)
        const id = anchor.startsWith("#") ? anchor.slice(1) : anchor;
        const byId = document.getElementById(id);
        if (byId) return byId;

        // Stage 2: querySelector (handles non-id selectors from native manifests)
        try {
          const byQuery = document.querySelector(anchor);
          if (byQuery) return byQuery;
        } catch {
          // Invalid selector: fall through
        }

        // Stage 3: fuzzy slug match on heading text
        return fuzzyFindHeading(document, anchor);
      }

- chunks.ts:53-105 (helper) The `collectSectionText` helper that extracts all text content under a heading element until the next heading at the same or higher level. Uses Range API first, then falls back to a document-order text node walk.

      function collectSectionText(document: Document, headingEl: Element): string {
        const depth = getHeadingDepth(headingEl);

        // Find the next heading at the same depth or shallower (lower number = higher)
        const allHeadings = Array.from(document.querySelectorAll("h1,h2,h3,h4"));
        const idx = allHeadings.indexOf(headingEl);
        let nextStop: Element | null = null;
        for (let i = idx + 1; i < allHeadings.length; i++) {
          if (getHeadingDepth(allHeadings[i]) <= depth) {
            nextStop = allHeadings[i];
            break;
          }
        }

        // Primary: Range API (same approach as browser content.js)
        try {
          const range = document.createRange();
          range.setStartAfter(headingEl);
          if (nextStop) {
            range.setEndBefore(nextStop);
          } else {
            const body = document.body ?? document.documentElement;
            range.setEndAfter(body);
          }
          const text = range.toString().trim();
          if (text) return text;
        } catch {
          // Range API inconsistency in JSDOM: fall through to sibling walk
        }

        // Fallback: document-order text node walk using compareDocumentPosition
        const body = document.body ?? document.documentElement;
        // SHOW_TEXT = 4
        const walker = document.createTreeWalker(body, 4);
        const parts: string[] = [];
        let node: Node | null = walker.nextNode();
        while (node) {
          const posFromHeading = headingEl.compareDocumentPosition(node);
          if (posFromHeading & FOLLOWING) {
            if (nextStop) {
              const posFromStop = nextStop.compareDocumentPosition(node);
              // Stop when node is no longer preceding nextStop
              if (!(posFromStop & PRECEDING)) break;
            }
            const text = node.textContent?.trim();
            if (text) parts.push(text);
          }
          node = walker.nextNode();
        }
        return parts.join(" ").trim();
      }
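The snippets above reference `isHeading`, `getHeadingDepth`, `FOLLOWING`, and `PRECEDING` without showing them. The sketch below is one way they might look, not the contents of chunks.ts: the two bitmask values are the ones the DOM standard defines for `compareDocumentPosition`, while the helper bodies, the `Tagged` type, `isFollowing`, and `findSectionEnd` are assumptions made so the section-boundary rule can be run without a DOM.

```typescript
// Bitmask values defined by the DOM standard for compareDocumentPosition.
const PRECEDING = 0x02; // Node.DOCUMENT_POSITION_PRECEDING
const FOLLOWING = 0x04; // Node.DOCUMENT_POSITION_FOLLOWING

// True when a compareDocumentPosition result has the FOLLOWING bit set.
function isFollowing(position: number): boolean {
  return (position & FOLLOWING) !== 0;
}

// Structural stand-in so the sketch runs without a DOM; a real Element fits too.
interface Tagged {
  tagName: string;
}

// Assumed helper: 1 for <h1> through 6 for <h6>, 0 for anything else.
function getHeadingDepth(el: Tagged): number {
  const match = /^H([1-6])$/i.exec(el.tagName);
  return match ? Number(match[1]) : 0;
}

// Assumed helper: an element is a heading when it has a non-zero depth.
function isHeading(el: Tagged): boolean {
  return getHeadingDepth(el) > 0;
}

// Index of the heading that closes the section opened at `start`:
// the next heading at the same depth or shallower, or -1 if none follows.
function findSectionEnd(headings: Tagged[], start: number): number {
  const opener = headings[start];
  if (!opener) return -1;
  const depth = getHeadingDepth(opener);
  for (let i = start + 1; i < headings.length; i++) {
    const candidate = headings[i];
    if (candidate && getHeadingDepth(candidate) <= depth) return i;
  }
  return -1;
}

// Document-order outline: h1, h2, h3, h2.
const outline: Tagged[] = [
  { tagName: "H1" },
  { tagName: "H2" },
  { tagName: "H3" },
  { tagName: "H2" },
];
```

In this outline the first `H2` section ends at index 3: the nested `H3` is deeper and therefore part of the section, and the second `H2` is the stop. The `H1` section has no stop, which corresponds to the branch in `collectSectionText` that extends the range to the end of the body.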