Skip to main content
Glama
j0hanz

superFetch MCP Server

fetch-url

Fetch public webpages and convert HTML to clean Markdown for AI processing, with built-in content cleaning and secure web access.

Instructions

Web Content Extractor Fetch public webpages and convert HTML to clean Markdown.

  • READ-ONLY. No JavaScript execution.

  • GitHub/GitLab/Bitbucket URLs auto-transform to raw endpoints (check resolvedUrl).

  • If truncated=true, use cacheResourceUri with resources/read for full content.

  • For large pages/timeouts, use task mode (task: {}).

  • If error queue_full, retry with task mode.

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYesTarget URL. Max 2048 chars.
skipNoiseRemovalNoPreserve navigation/footers (disable noise filtering).
forceRefreshNoBypass cache and fetch fresh content.
maxInlineCharsNoInline markdown limit (0-10485760, 0=unlimited). Lower of this or global limit applies.

Implementation Reference

  • Main tool handler function - fetchUrlToolHandler is the primary entry point that wraps executeFetch with error handling and logging
    export async function fetchUrlToolHandler( input: FetchUrlInput, extra?: ToolHandlerExtra ): Promise<ToolResponseBase> { return executeFetch(input, extra).catch((error: unknown) => { logError('fetch-url tool error', toError(error)); return handleToolError(error, input.url, 'Failed to fetch URL'); }); }
  • Core execution logic - executeFetch function orchestrates the URL fetching, progress reporting, and response building
    async function executeFetch( input: FetchUrlInput, extra?: ToolHandlerExtra ): Promise<ToolResponseBase> { const { url } = input; const signal = buildToolAbortSignal(extra?.signal); const progress = createProgressReporter(extra); const contextStr = getUrlContext(url); progress.report(0, `fetch-url: ${contextStr} [starting]`); logDebug('Fetching URL', { url }); try { progress.report(1, `fetch-url: ${contextStr} [fetching HTML]`); const { pipeline, inlineResult } = await fetchPipeline( url, signal, progress, input.skipNoiseRemoval, input.forceRefresh, input.maxInlineChars ); if (pipeline.fromCache) { progress.report(3, `fetch-url: ${contextStr} [loaded from cache]`); } progress.report(4, `fetch-url: ${contextStr} • completed`); return buildResponse(pipeline, inlineResult, url); } catch (error) { const isAbort = isAbortError(error); progress.report( 4, `fetch-url: ${contextStr} • ${isAbort ? 'cancelled' : 'failed'}` ); throw error; } }
  • Input validation schema - defines the URL, skipNoiseRemoval, forceRefresh, and maxInlineChars parameters for fetch-url tool
    export const fetchUrlInputSchema = z.strictObject({ url: z .url({ protocol: /^https?$/i }) .min(1) .max(config.constants.maxUrlLength) .describe(`Target URL. Max ${config.constants.maxUrlLength} chars.`), skipNoiseRemoval: z .boolean() .optional() .describe('Preserve navigation/footers (disable noise filtering).'), forceRefresh: z .boolean() .optional() .describe('Bypass cache and fetch fresh content.'), maxInlineChars: z .number() .int() .min(0) .max(config.constants.maxHtmlSize) .optional() .describe( `Inline markdown limit (0-${config.constants.maxHtmlSize}, 0=unlimited). Lower of this or global limit applies.` ), });
  • Output validation schema - defines the structure of the response including url, markdown, metadata, cacheResourceUri, and truncation fields
    export const fetchUrlOutputSchema = z.strictObject({ url: z .string() .min(1) .max(config.constants.maxUrlLength) .describe('Fetched URL.'), inputUrl: z .string() .max(config.constants.maxUrlLength) .optional() .describe('Original requested URL.'), resolvedUrl: z .string() .max(config.constants.maxUrlLength) .optional() .describe('Final URL after raw-content transformations.'), finalUrl: z .string() .max(config.constants.maxUrlLength) .optional() .describe('Final URL after HTTP redirects.'), cacheResourceUri: z .string() .max(config.constants.maxUrlLength) .optional() .describe('URI for resources/read to get full markdown.'), title: z.string().max(512).optional().describe('Page title.'), metadata: z .strictObject({ title: z.string().max(512).optional().describe('Detected page title.'), description: z .string() .max(2048) .optional() .describe('Detected page description.'), author: z.string().max(512).optional().describe('Detected page author.'), image: z .string() .max(config.constants.maxUrlLength) .optional() .describe('Detected page preview image URL.'), favicon: z .string() .max(config.constants.maxUrlLength) .optional() .describe('Detected page favicon URL.'), publishedAt: z .string() .max(64) .optional() .describe('Detected publication date.'), modifiedAt: z .string() .max(64) .optional() .describe('Detected last modified date.'), }) .optional() .describe('Extracted page metadata.'), markdown: (config.constants.maxInlineContentChars > 0 ? z.string().max(config.constants.maxInlineContentChars) : z.string() ) .optional() .describe('Extracted Markdown. May be truncated (check truncated field).'), fromCache: z.boolean().optional().describe('True if served from cache.'), fetchedAt: z.string().max(64).optional().describe('ISO timestamp of fetch.'), contentSize: z .number() .int() .min(0) .max(config.constants.maxHtmlSize * 4) .optional() .describe('Full markdown size before truncation.'), truncated: z.boolean().optional().describe('True if markdown was truncated.'), });
  • Tool registration - registerTools function registers fetch-url with the MCP server, including input/output schemas and handler
    export function registerTools(server: McpServer): void { if (!config.tools.enabled.includes(FETCH_URL_TOOL_NAME)) { unregisterTaskCapableTool(FETCH_URL_TOOL_NAME); return; } registerTaskCapableTool({ name: FETCH_URL_TOOL_NAME, parseArguments: (args) => { const parsed = fetchUrlInputSchema.safeParse(args); if (!parsed.success) { throw new McpError( ErrorCode.InvalidParams, 'Invalid arguments for fetch-url' ); } return parsed.data; }, execute: fetchUrlToolHandler, }); const registeredTool = server.registerTool( TOOL_DEFINITION.name, { title: TOOL_DEFINITION.title, description: TOOL_DEFINITION.description, inputSchema: TOOL_DEFINITION.inputSchema, outputSchema: TOOL_DEFINITION.outputSchema, annotations: TOOL_DEFINITION.annotations, execution: TOOL_DEFINITION.execution, icons: [TOOL_ICON], } as { inputSchema: typeof fetchUrlInputSchema } & Record<string, unknown>, withRequestContextIfMissing(TOOL_DEFINITION.handler) ); // SDK typing gap workaround: preserve runtime `execution` metadata until the // registered tool type includes this field. applyRegisteredToolExecutionMetadata( registeredTool, TOOL_DEFINITION.execution ); }
Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/j0hanz/super-fetch-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server