Glama
zcag
by zcag

fetch_markdown

Convert web pages to clean Markdown with metadata and token counts for LLM processing. Use this tool to extract article content from URLs with a lightweight, browserless approach.

Instructions

Fetch a web page and convert it to clean, LLM-optimized Markdown. Returns the article content with metadata (title, author, date) and token count. Much faster and lighter than browser-based solutions.

Input Schema

Name           | Required | Description                                                | Default
url            | Yes      | The URL of the web page to fetch and convert               |
include_header | No       | Include title/source/author header in output               | true
raw            | No       | Extract full page content instead of just the main article | false
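
As an illustration of how these parameters might be supplied, the sketch below builds a JSON-RPC `tools/call` request of the kind MCP clients send. The helper name `buildFetchMarkdownCall` and the two interfaces are hypothetical and not part of this server; most MCP client libraries construct this envelope for you, and the `new URL()` check is only a rough stand-in for the server's own validation.

```typescript
// Arguments accepted by fetch_markdown, mirroring the input schema.
interface FetchMarkdownArgs {
  url: string;
  include_header?: boolean; // server-side default: true
  raw?: boolean; // server-side default: false
}

// Shape of a JSON-RPC 2.0 "tools/call" request as used by MCP.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: FetchMarkdownArgs };
}

// Hypothetical helper: builds the request and fails early on an invalid URL.
function buildFetchMarkdownCall(id: number, args: FetchMarkdownArgs): ToolCallRequest {
  new URL(args.url); // throws a TypeError if the string is not a valid URL
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "fetch_markdown", arguments: args },
  };
}

// Ask for the full page instead of only the main article.
const request = buildFetchMarkdownCall(1, {
  url: "https://example.com/article",
  raw: true,
});
console.log(JSON.stringify(request, null, 2));
```

Omitted optional fields are left out of the request so that the server's own defaults apply.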

Implementation Reference

  • The async handler function that executes the fetch_markdown tool logic. It fetches the URL, converts HTML to Markdown using the readdown library, and returns the content with metadata (tokens, characters, title, author, date).
    async ({ url, include_header, raw }) => {
      try {
        const response = await fetch(url, {
          headers: {
            "User-Agent":
              "Mozilla/5.0 (compatible; readdown/0.1; +https://github.com/zcag/readdown)",
            Accept: "text/html,application/xhtml+xml",
          },
          signal: AbortSignal.timeout(15000),
        });
    
        if (!response.ok) {
          return {
            content: [
              {
                type: "text" as const,
                text: `Failed to fetch ${url}: HTTP ${response.status} ${response.statusText}`,
              },
            ],
            isError: true,
          };
        }
    
        const html = await response.text();
        const result = readdown(html, {
          url,
          includeHeader: include_header,
          raw,
        });
    
        const summary = [
          `Tokens: ~${result.tokens}`,
          `Characters: ${result.chars}`,
          result.metadata.title ? `Title: ${result.metadata.title}` : null,
          result.metadata.author ? `Author: ${result.metadata.author}` : null,
          result.metadata.date ? `Date: ${result.metadata.date}` : null,
        ]
          .filter(Boolean)
          .join(" | ");
    
        return {
          content: [
            { type: "text" as const, text: `[${summary}]\n\n${result.markdown}` },
          ],
        };
      } catch (err) {
        const message = err instanceof Error ? err.message : String(err);
        return {
          content: [
            {
              type: "text" as const,
              text: `Error fetching ${url}: ${message}`,
            },
          ],
          isError: true,
        };
      }
    }
  • Input schema definition using Zod. Defines three parameters: url (required string URL), include_header (optional boolean, default true), and raw (optional boolean, default false).
    {
      url: z.string().url().describe("The URL of the web page to fetch and convert"),
      include_header: z
        .boolean()
        .optional()
        .default(true)
        .describe("Include title/source/author header in output"),
      raw: z
        .boolean()
        .optional()
        .default(false)
        .describe("Extract full page content instead of just the main article"),
    },
  • src/index.ts:13-89 (registration)
    Tool registration using server.tool(). Registers the 'fetch_markdown' tool with its name, description, schema, and handler function.
    server.tool(
      "fetch_markdown",
      "Fetch a web page and convert it to clean, LLM-optimized Markdown. " +
        "Returns the article content with metadata (title, author, date) and token count. " +
        "Much faster and lighter than browser-based solutions.",
      {
        url: z.string().url().describe("The URL of the web page to fetch and convert"),
        include_header: z
          .boolean()
          .optional()
          .default(true)
          .describe("Include title/source/author header in output"),
        raw: z
          .boolean()
          .optional()
          .default(false)
          .describe("Extract full page content instead of just the main article"),
      },
      async ({ url, include_header, raw }) => {
        try {
          const response = await fetch(url, {
            headers: {
              "User-Agent":
                "Mozilla/5.0 (compatible; readdown/0.1; +https://github.com/zcag/readdown)",
              Accept: "text/html,application/xhtml+xml",
            },
            signal: AbortSignal.timeout(15000),
          });
    
          if (!response.ok) {
            return {
              content: [
                {
                  type: "text" as const,
                  text: `Failed to fetch ${url}: HTTP ${response.status} ${response.statusText}`,
                },
              ],
              isError: true,
            };
          }
    
          const html = await response.text();
          const result = readdown(html, {
            url,
            includeHeader: include_header,
            raw,
          });
    
          const summary = [
            `Tokens: ~${result.tokens}`,
            `Characters: ${result.chars}`,
            result.metadata.title ? `Title: ${result.metadata.title}` : null,
            result.metadata.author ? `Author: ${result.metadata.author}` : null,
            result.metadata.date ? `Date: ${result.metadata.date}` : null,
          ]
            .filter(Boolean)
            .join(" | ");
    
          return {
            content: [
              { type: "text" as const, text: `[${summary}]\n\n${result.markdown}` },
            ],
          };
        } catch (err) {
          const message = err instanceof Error ? err.message : String(err);
          return {
            content: [
              {
                type: "text" as const,
                text: `Error fetching ${url}: ${message}`,
              },
            ],
            isError: true,
          };
        }
      }
    );
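
On success the handler returns a single text item of the form `[Tokens: ~N | Characters: N | Title: ...]`, followed by a blank line and the Markdown. A client that wants the metadata separately could split that text back apart. The following is a minimal sketch under that assumption; `parseFetchMarkdownText` and `ParsedResult` are hypothetical names, and the parsing is a heuristic, since a title that itself contains something like ` | Author: ` cannot be told apart from a real field.

```typescript
interface ParsedResult {
  tokens?: number;
  chars?: number;
  title?: string;
  author?: string;
  date?: string;
  markdown: string;
}

// Labels the handler writes into the summary line, in order.
const KEYS = ["Tokens", "Characters", "Title", "Author", "Date"] as const;

function parseFetchMarkdownText(text: string): ParsedResult {
  const end = text.indexOf("]\n\n");
  if (!text.startsWith("[") || end === -1) {
    return { markdown: text }; // no summary line, e.g. an error message
  }
  const fields: Record<string, string> = {};
  let lastKey: string | undefined;
  for (const part of text.slice(1, end).split(" | ")) {
    const key = KEYS.find((k) => part.startsWith(`${k}: `));
    if (key) {
      fields[key] = part.slice(key.length + 2);
      lastKey = key;
    } else if (lastKey) {
      // A value (often a title) contained " | "; glue it back together.
      fields[lastKey] += ` | ${part}`;
    }
  }
  // Strip the leading "~" from the approximate token count before converting.
  const toNumber = (s?: string) =>
    s === undefined ? undefined : Number(s.replace(/^~/, ""));
  return {
    tokens: toNumber(fields["Tokens"]),
    chars: toNumber(fields["Characters"]),
    title: fields["Title"],
    author: fields["Author"],
    date: fields["Date"],
    markdown: text.slice(end + 3),
  };
}

const sample = "[Tokens: ~42 | Characters: 180 | Title: Hello | World]\n\n# Hello\n\nBody";
console.log(parseFetchMarkdownText(sample));
```

Error responses start with "Failed to fetch" or "Error fetching" rather than "[", so they fall through to the first branch and come back unchanged in `markdown`; callers should still check `isError` on the tool result.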

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zcag/readdown-mcp'
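
The same request could be issued from TypeScript. This is a sketch only: `serverUrl` and `getServerInfo` are hypothetical helpers, the `/servers/{owner}/{name}` path pattern is inferred from the single example above, and the response is assumed to be JSON whose shape is not described here, so it is typed as `unknown`.

```typescript
// Base URL taken from the curl example.
const API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Hypothetical helper: assumes other servers follow the same
// /servers/{owner}/{name} path pattern as the example.
function serverUrl(owner: string, name: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Fetches the directory entry; assumes the API responds with JSON.
async function getServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, name), {
    signal: AbortSignal.timeout(15000), // give up after 15 seconds
  });
  if (!response.ok) {
    throw new Error(`HTTP ${response.status} ${response.statusText}`);
  }
  return response.json();
}

console.log(serverUrl("zcag", "readdown-mcp"));
```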

If you have feedback or need assistance with the MCP directory API, please join our Discord server.