fetch-url - superFetch MCP Server

Fetch URL

fetch-url

Read-onlyIdempotent

Fetch public webpages and convert HTML to clean Markdown for AI processing, with built-in content cleaning and secure web access.

Instructions

Web Content Extractor Fetch public webpages and convert HTML to clean Markdown.

READ-ONLY. No JavaScript execution.
GitHub/GitLab/Bitbucket URLs auto-transform to raw endpoints (check resolvedUrl).
If truncated=true, use cacheResourceUri with resources/read for full content.
For large pages/timeouts, use task mode (task: {}).
If error queue_full, retry with task mode.

Input Schema

TableJSON Schema

Name	Required	Description
`url`	Yes	Target URL. Max 2048 chars.
`skipNoiseRemoval`	No	Preserve navigation/footers (disable noise filtering).
`forceRefresh`	No	Bypass cache and fetch fresh content.
`maxInlineChars`	No	Inline markdown limit (0-10485760, 0=unlimited). Lower of this or global limit applies.

Implementation Reference

src/tools/fetch-url.ts:450-458 (handler)

Main tool handler function - fetchUrlToolHandler is the primary entry point that wraps executeFetch with error handling and logging

export async function fetchUrlToolHandler(
  input: FetchUrlInput,
  extra?: ToolHandlerExtra
): Promise<ToolResponseBase> {
  return executeFetch(input, extra).catch((error: unknown) => {
    logError('fetch-url tool error', toError(error));
    return handleToolError(error, input.url, 'Failed to fetch URL');
  });
}

src/tools/fetch-url.ts:411-448 (handler)

Core execution logic - executeFetch function orchestrates the URL fetching, progress reporting, and response building

async function executeFetch(
  input: FetchUrlInput,
  extra?: ToolHandlerExtra
): Promise<ToolResponseBase> {
  const { url } = input;
  const signal = buildToolAbortSignal(extra?.signal);
  const progress = createProgressReporter(extra);

  const contextStr = getUrlContext(url);
  progress.report(0, `fetch-url: ${contextStr} [starting]`);
  logDebug('Fetching URL', { url });

  try {
    progress.report(1, `fetch-url: ${contextStr} [fetching HTML]`);
    const { pipeline, inlineResult } = await fetchPipeline(
      url,
      signal,
      progress,
      input.skipNoiseRemoval,
      input.forceRefresh,
      input.maxInlineChars
    );

    if (pipeline.fromCache) {
      progress.report(3, `fetch-url: ${contextStr} [loaded from cache]`);
    }

    progress.report(4, `fetch-url: ${contextStr} • completed`);
    return buildResponse(pipeline, inlineResult, url);
  } catch (error) {
    const isAbort = isAbortError(error);
    progress.report(
      4,
      `fetch-url: ${contextStr} • ${isAbort ? 'cancelled' : 'failed'}`
    );
    throw error;
  }
}

src/schemas/inputs.ts:5-28 (schema)

Input validation schema - defines the URL, skipNoiseRemoval, forceRefresh, and maxInlineChars parameters for fetch-url tool

export const fetchUrlInputSchema = z.strictObject({
  url: z
    .url({ protocol: /^https?$/i })
    .min(1)
    .max(config.constants.maxUrlLength)
    .describe(`Target URL. Max ${config.constants.maxUrlLength} chars.`),
  skipNoiseRemoval: z
    .boolean()
    .optional()
    .describe('Preserve navigation/footers (disable noise filtering).'),
  forceRefresh: z
    .boolean()
    .optional()
    .describe('Bypass cache and fetch fresh content.'),
  maxInlineChars: z
    .number()
    .int()
    .min(0)
    .max(config.constants.maxHtmlSize)
    .optional()
    .describe(
      `Inline markdown limit (0-${config.constants.maxHtmlSize}, 0=unlimited). Lower of this or global limit applies.`
    ),
});

src/schemas/outputs.ts:5-80 (schema)

Output validation schema - defines the structure of the response including url, markdown, metadata, cacheResourceUri, and truncation fields

export const fetchUrlOutputSchema = z.strictObject({
  url: z
    .string()
    .min(1)
    .max(config.constants.maxUrlLength)
    .describe('Fetched URL.'),
  inputUrl: z
    .string()
    .max(config.constants.maxUrlLength)
    .optional()
    .describe('Original requested URL.'),
  resolvedUrl: z
    .string()
    .max(config.constants.maxUrlLength)
    .optional()
    .describe('Final URL after raw-content transformations.'),
  finalUrl: z
    .string()
    .max(config.constants.maxUrlLength)
    .optional()
    .describe('Final URL after HTTP redirects.'),
  cacheResourceUri: z
    .string()
    .max(config.constants.maxUrlLength)
    .optional()
    .describe('URI for resources/read to get full markdown.'),
  title: z.string().max(512).optional().describe('Page title.'),
  metadata: z
    .strictObject({
      title: z.string().max(512).optional().describe('Detected page title.'),
      description: z
        .string()
        .max(2048)
        .optional()
        .describe('Detected page description.'),
      author: z.string().max(512).optional().describe('Detected page author.'),
      image: z
        .string()
        .max(config.constants.maxUrlLength)
        .optional()
        .describe('Detected page preview image URL.'),
      favicon: z
        .string()
        .max(config.constants.maxUrlLength)
        .optional()
        .describe('Detected page favicon URL.'),
      publishedAt: z
        .string()
        .max(64)
        .optional()
        .describe('Detected publication date.'),
      modifiedAt: z
        .string()
        .max(64)
        .optional()
        .describe('Detected last modified date.'),
    })
    .optional()
    .describe('Extracted page metadata.'),
  markdown: (config.constants.maxInlineContentChars > 0
    ? z.string().max(config.constants.maxInlineContentChars)
    : z.string()
  )
    .optional()
    .describe('Extracted Markdown. May be truncated (check truncated field).'),
  fromCache: z.boolean().optional().describe('True if served from cache.'),
  fetchedAt: z.string().max(64).optional().describe('ISO timestamp of fetch.'),
  contentSize: z
    .number()
    .int()
    .min(0)
    .max(config.constants.maxHtmlSize * 4)
    .optional()
    .describe('Full markdown size before truncation.'),
  truncated: z.boolean().optional().describe('True if markdown was truncated.'),
});

src/tools/fetch-url.ts:571-611 (registration)

Tool registration - registerTools function registers fetch-url with the MCP server, including input/output schemas and handler

export function registerTools(server: McpServer): void {
  if (!config.tools.enabled.includes(FETCH_URL_TOOL_NAME)) {
    unregisterTaskCapableTool(FETCH_URL_TOOL_NAME);
    return;
  }

  registerTaskCapableTool({
    name: FETCH_URL_TOOL_NAME,
    parseArguments: (args) => {
      const parsed = fetchUrlInputSchema.safeParse(args);
      if (!parsed.success) {
        throw new McpError(
          ErrorCode.InvalidParams,
          'Invalid arguments for fetch-url'
        );
      }
      return parsed.data;
    },
    execute: fetchUrlToolHandler,
  });

  const registeredTool = server.registerTool(
    TOOL_DEFINITION.name,
    {
      title: TOOL_DEFINITION.title,
      description: TOOL_DEFINITION.description,
      inputSchema: TOOL_DEFINITION.inputSchema,
      outputSchema: TOOL_DEFINITION.outputSchema,
      annotations: TOOL_DEFINITION.annotations,
      execution: TOOL_DEFINITION.execution,
      icons: [TOOL_ICON],
    } as { inputSchema: typeof fetchUrlInputSchema } & Record<string, unknown>,
    withRequestContextIfMissing(TOOL_DEFINITION.handler)
  );
  // SDK typing gap workaround: preserve runtime `execution` metadata until the
  // registered tool type includes this field.
  applyRegisteredToolExecutionMetadata(
    registeredTool,
    TOOL_DEFINITION.execution
  );
}

superFetch MCP Server

Fetch URL

Instructions

Input Schema

Implementation Reference

Tool Definition Quality

Other Tools

Latest Blog Posts

MCP directory API