fetch_txt

Extracts plain text from a web page: fetches the URL, converts the HTML to readable text, and returns a window of that text with a configurable starting index and maximum length.

Instructions

Fetch a website, convert the content to plain text (no HTML)

Input Schema

Name	Required	Description
`url`	Yes	URL of the website to fetch
`headers`	No	Optional headers to include in the request
`max_length`	No	Maximum number of characters to return (default: 5000)
`start_index`	No	Start content from this character index (default: 0)
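
Together, `max_length` and `start_index` let a client read a long page in consecutive windows. The following is a minimal sketch of that pattern, not part of the server: `CallFetchTxt` is a hypothetical stand-in for whatever MCP client call invokes the tool, and the loop assumes the tool returns a shorter (or empty) string once the end of the text is reached.

```typescript
// Arguments accepted by fetch_txt, per the input schema table.
interface FetchTxtArgs {
  url: string;
  headers?: Record<string, string>;
  max_length?: number;
  start_index?: number;
}

// Hypothetical stand-in for the MCP client call that invokes fetch_txt
// and resolves to the returned text.
type CallFetchTxt = (args: FetchTxtArgs) => Promise<string>;

// Read a page in windows of `pageSize` characters.
// Assumption: a window shorter than `pageSize` means the end was reached.
async function readAllText(
  call: CallFetchTxt,
  url: string,
  pageSize = 5000,
): Promise<string> {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new RangeError("pageSize must be a positive integer");
  }
  let text = "";
  let start = 0;
  while (true) {
    const chunk = await call({ url, max_length: pageSize, start_index: start });
    text += chunk;
    if (chunk.length < pageSize) break; // last window
    start += pageSize;
  }
  return text;
}
```

Each request re-fetches the page, so a smaller `pageSize` trades response size for more HTTP round trips.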

Implementation Reference

src/Fetcher.ts:92-125 (handler)
Core handler for 'fetch_txt': fetches HTML, strips scripts/styles using JSDOM, extracts and normalizes plain text, applies length limits.
    static async txt(requestPayload: RequestPayload) {
      try {
        const response = await this._fetch(requestPayload);
        const html = await response.text();

        const dom = new JSDOM(html);
        const document = dom.window.document;

        const scripts = document.getElementsByTagName("script");
        const styles = document.getElementsByTagName("style");
        Array.from(scripts).forEach((script) => script.remove());
        Array.from(styles).forEach((style) => style.remove());

        const text = document.body.textContent || "";
        let normalizedText = text.replace(/\s+/g, " ").trim();

        // Apply length limits
        normalizedText = this.applyLengthLimits(
          normalizedText,
          requestPayload.max_length ?? 5000,
          requestPayload.start_index ?? 0
        );

        return {
          content: [{ type: "text", text: normalizedText }],
          isError: false,
        };
      } catch (error) {
        return {
          content: [{ type: "text", text: (error as Error).message }],
          isError: true,
        };
      }
    }
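
The handler calls `applyLengthLimits`, whose body is not shown on this page. The sketch below illustrates the two text-shaping steps in isolation: the whitespace normalization is taken from the handler, while this `applyLengthLimits` is a hypothetical reconstruction that assumes simple substring semantics and may differ from the real helper.

```typescript
// Whitespace normalization as done in the handler:
// collapse every run of whitespace to one space, then trim the ends.
function normalize(text: string): string {
  return text.replace(/\s+/g, " ").trim();
}

// Hypothetical reconstruction (the real helper is not shown in the source).
// Assumption: return at most `maxLength` characters starting at `startIndex`,
// and an empty string when `startIndex` is past the end of the text.
function applyLengthLimits(
  text: string,
  maxLength: number,
  startIndex: number,
): string {
  if (startIndex >= text.length) return "";
  const end = Math.min(startIndex + maxLength, text.length);
  return text.substring(startIndex, end);
}

const sample = normalize("  Hello,\n\n   world!\tThis is   text. ");
console.log(sample); // "Hello, world! This is text."
console.log(applyLengthLimits(sample, 5, 7)); // "world"
```

Because limits are applied after normalization, `start_index` counts characters in the collapsed text, not in the original HTML.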
src/types.ts:5-10 (schema)
Zod validation schema for fetch_txt input parameters (shared across fetch tools). Used in index.ts for parsing args.
    export const RequestPayloadSchema = z.object({
      url: z.string().url(),
      headers: z.record(z.string()).optional(),
      max_length: z.number().int().min(0).optional().default(downloadLimit),
      start_index: z.number().int().min(0).optional().default(0),
    });
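
To make the schema's rules concrete, here is a dependency-free sketch that mirrors them by hand. It is not how the server validates input (the server uses Zod). `parsePayload` is a hypothetical name, and `downloadLimit` is assumed to be 5000 to match the default documented in the input table.

```typescript
interface RawPayload {
  url: string;
  headers?: Record<string, string>;
  max_length?: number;
  start_index?: number;
}

interface RequestPayload extends RawPayload {
  max_length: number;
  start_index: number;
}

// Assumption: the documented default of 5000; the real value is defined elsewhere.
const downloadLimit = 5000;

// Hand-written approximation of the schema: url must parse as a URL,
// numeric fields must be non-negative integers, and omitted fields get defaults.
function parsePayload(input: RawPayload): RequestPayload {
  new URL(input.url); // throws TypeError if the string is not a valid URL

  const max_length = input.max_length ?? downloadLimit;
  const start_index = input.start_index ?? 0;

  const numericFields: Array<[string, number]> = [
    ["max_length", max_length],
    ["start_index", start_index],
  ];
  for (const [name, value] of numericFields) {
    if (!Number.isInteger(value) || value < 0) {
      throw new RangeError(`${name} must be a non-negative integer`);
    }
  }
  return { ...input, max_length, start_index };
}

console.log(parsePayload({ url: "https://example.com" }));
// { url: "https://example.com", max_length: 5000, start_index: 0 }
```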
src/index.ts:82-108 (registration)
Tool registration in ListTools handler: defines name, description, and inputSchema for 'fetch_txt'.
    {
      name: "fetch_txt",
      description: "Fetch a website, convert the content to plain text (no HTML)",
      inputSchema: {
        type: "object",
        properties: {
          url: {
            type: "string",
            description: "URL of the website to fetch",
          },
          headers: {
            type: "object",
            description: "Optional headers to include in the request",
          },
          max_length: {
            type: "number",
            description: `Maximum number of characters to return (default: ${downloadLimit})`,
          },
          start_index: {
            type: "number",
            description: "Start content from this character index (default: 0)",
          },
        },
        required: ["url"],
      },
    },
src/index.ts:152-154 (registration)
Dispatch logic in CallToolRequest handler: routes 'fetch_txt' calls to Fetcher.txt implementation.
    if (request.params.name === "fetch_txt") {
      const fetchResult = await Fetcher.txt(validatedArgs);
      return fetchResult;
    }
src/Fetcher.ts:15-44 (helper)
Shared helper for performing secure HTTP fetches with private IP blocking and custom headers, used by all fetch tools including txt.
    private static async _fetch({
      url,
      headers,
    }: RequestPayload): Promise<Response> {
      try {
        if (is_ip_private(url)) {
          throw new Error(
            `Fetcher blocked an attempt to fetch a private IP ${url}. This is to prevent a security vulnerability where a local MCP could fetch privileged local IPs and exfiltrate data.`,
          );
        }
        const response = await fetch(url, {
          headers: {
            "User-Agent":
              "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
            ...headers,
          },
        });

        if (!response.ok) {
          throw new Error(`HTTP error: ${response.status}`);
        }
        return response;
      } catch (e: unknown) {
        if (e instanceof Error) {
          throw new Error(`Failed to fetch ${url}: ${e.message}`);
        } else {
          throw new Error(`Failed to fetch ${url}: Unknown error`);
        }
      }
    }
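
The helper relies on `is_ip_private`, which is imported from elsewhere and not shown here. To illustrate the kind of check involved, the sketch below tests a URL's host against the common private and loopback IPv4 ranges. `isPrivateIPv4Url` is a hypothetical, simplified function: it is not the implementation the server uses, and it ignores IPv6 and hostnames that resolve to private addresses.

```typescript
// Hypothetical, simplified check; NOT the is_ip_private used by the server.
// Covers "localhost" and literal IPv4 hosts only.
function isPrivateIPv4Url(rawUrl: string): boolean {
  const host = new URL(rawUrl).hostname;
  if (host === "localhost") return true;

  const match = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (!match) return false; // DNS names and IPv6 are out of scope for this sketch

  const a = Number(match[1]);
  const b = Number(match[2]);
  return (
    a === 10 || // 10.0.0.0/8
    a === 127 || // 127.0.0.0/8 loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // 169.254.0.0/16 link-local
  );
}

console.log(isPrivateIPv4Url("http://192.168.1.10/admin")); // true
console.log(isPrivateIPv4Url("https://example.com/")); // false
```

In `_fetch`, the caller's `headers` are spread after the default `User-Agent`, so a `User-Agent` supplied in the request overrides the built-in one.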
