MCP Server Fetch TypeScript

by tatn

get_markdown_summary

Extracts and converts the main content of a web page into Markdown format, removing headers, footers, and navigation menus. Ideal for capturing essential content from articles, blogs, or documentation.

Instructions

Extracts and converts the main content area of a web page to Markdown format, automatically removing navigation menus, headers, footers, and other peripheral content. Perfect for capturing the core content of articles, blog posts, or documentation pages.

Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| url | Yes | URL of the web page whose main content should be extracted and converted to Markdown. | |
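
The schema above requires a single string field, `url`. As a minimal sketch of how a caller or handler might check arguments against it before use, the following helper could be written. The name `parseGetMarkdownSummaryArgs` and the interface are hypothetical and do not appear in the server's source; the non-empty check and the `new URL(...)` parse are extra assumptions that go beyond what the schema itself enforces.

```typescript
// Hypothetical argument shape matching the input schema: { url: string }.
interface GetMarkdownSummaryArgs {
  url: string;
}

// Hypothetical helper (not part of the server's source): validates raw
// tool arguments and returns them in typed form, or throws on bad input.
function parseGetMarkdownSummaryArgs(args: unknown): GetMarkdownSummaryArgs {
  if (typeof args !== 'object' || args === null) {
    throw new Error('arguments must be an object');
  }
  const { url } = args as { url?: unknown };
  if (typeof url !== 'string' || url.length === 0) {
    throw new Error('"url" is required and must be a non-empty string');
  }
  // Assumption: reject malformed URLs early. The schema only asks for a string.
  new URL(url); // throws a TypeError if the string is not a valid URL
  return { url };
}

const parsed = parseGetMarkdownSummaryArgs({ url: 'https://example.com/post' });
console.log(parsed.url);
```

Validating up front keeps a malformed request from reaching the fetch step, at the cost of being stricter than the published schema.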

Implementation Reference

  • src/index.ts:96-109 (registration)
    Registration of the get_markdown_summary tool in the ListToolsRequestSchema handler, including name, description, and input schema requiring a URL.
    {
      name: "get_markdown_summary",
      description: "Extracts and converts the main content area of a web page to Markdown format, automatically removing navigation menus, headers, footers, and other peripheral content. Perfect for capturing the core content of articles, blog posts, or documentation pages.",
      inputSchema: {
        type: "object",
        properties: {
          url: {
            type: "string",
            description: "URL of the web page whose main content should be extracted and converted to Markdown."
          }
        },
        required: ["url"]
      }
    },
  • Handler case in CallToolRequestSchema that executes the get_markdown_summary tool by calling the helper getMarkdownStringFromHtmlByTD(url, true) with mainOnly enabled for summary extraction.
    case "get_markdown_summary": {
      return {
        content: [{
          type: "text",
          text: (await getMarkdownStringFromHtmlByTD(url, true))
        }]
      };
    }
  • Primary helper function implementing the Markdown conversion logic using Turndown. It fetches the HTML via getHtmlString, removes script and style elements (plus header, footer, and nav elements when mainOnly is true), adds custom rules for tables and definition lists (dl), and converts the result to Markdown. The handler calls it with mainOnly=true for summary extraction.
    export async function getMarkdownStringFromHtmlByTD(
      request_url: string,
      mainOnly: boolean = false,
    ) {
      const htmlString = await getHtmlString(request_url);

      const turndownService = new Turndown({ headingStyle: 'atx' });
      turndownService.remove('script');
      turndownService.remove('style');
      if (mainOnly) {
        turndownService.remove('header');
        turndownService.remove('footer');
        turndownService.remove('nav');
      }

      turndownService.addRule('table', {
        filter: 'table',
        // eslint-disable-next-line @typescript-eslint/no-unused-vars
        replacement: function (content, node, _options) {
          // Process each row in the table
          const rows = Array.from(node.querySelectorAll('tr'));
          if (rows.length === 0) {
            return '';
          }

          const headerRow = rows[0];
          const headerCells = Array.from(
            headerRow.querySelectorAll('th, td'),
          ).map((cell) => cell.textContent?.trim() || '');
          const separator = headerCells.map(() => '---').join('|');

          // Header row and separator line
          let markdown = `\n| ${headerCells.join(' | ')} |\n|${separator}|`;

          // Process remaining rows
          for (let i = 1; i < rows.length; i++) {
            const row = rows[i];
            const rowCells = Array.from(row.querySelectorAll('th, td')).map(
              (cell) => cell.textContent?.trim() || '',
            );
            markdown += `\n| ${rowCells.join(' | ')} |`;
          }

          return markdown + '\n';
        },
      });

      turndownService.addRule('dl', {
        filter: 'dl',
        // eslint-disable-next-line @typescript-eslint/no-unused-vars
        replacement: function (content, node, _options) {
          let markdown = '\n\n';
          const items = Array.from(node.children);
          let currentDt: string = '';

          items.forEach((item) => {
            if (item.tagName === 'DT') {
              currentDt = item.textContent?.trim() || '';
              if (currentDt) {
                markdown += `**${currentDt}:**`;
              }
            } else if (item.tagName === 'DD') {
              const ddContent = item.textContent?.trim() || '';
              if (ddContent) {
                markdown += ` ${ddContent}\n`;
              }
            }
          });

          return markdown + '\n';
        },
      });

      const markdownString = turndownService.turndown(htmlString);
      return markdownString;
    }
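
To make the output of the custom table rule easier to see, here is a dependency-free sketch that mirrors only its string assembly. It takes plain string cells instead of DOM nodes, so it leaves out the `querySelectorAll` and `textContent` steps of the real rule. The function name `rowsToMarkdownTable` is hypothetical and is not part of the server's source.

```typescript
// Hypothetical helper: reproduces the string-building steps of the
// 'table' rule, with each row given as an array of already-trimmed cells.
function rowsToMarkdownTable(rows: string[][]): string {
  if (rows.length === 0) {
    return '';
  }

  // The first row is always treated as the header, as in the rule above.
  const [headerCells, ...bodyRows] = rows;
  const separator = headerCells.map(() => '---').join('|');

  // Header row and separator line
  let markdown = `\n| ${headerCells.join(' | ')} |\n|${separator}|`;

  // Remaining rows
  for (const rowCells of bodyRows) {
    markdown += `\n| ${rowCells.join(' | ')} |`;
  }

  return markdown + '\n';
}

const tableMarkdown = rowsToMarkdownTable([
  ['Name', 'Required'],
  ['url', 'Yes'],
]);
console.log(tableMarkdown);
```

As in the original rule, the first row becomes the header whether or not it used `th` cells, and pipe characters inside cell text are not escaped, so a cell containing `|` would produce an extra column in the rendered table.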

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/tatn/mcp-server-fetch-typescript'
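
The same request can be sketched in TypeScript with the global `fetch` available in Node 18+ and browsers. The helper names `buildServerUrl` and `fetchServerInfo` are hypothetical. The owner/name path layout is inferred from the URL in the curl command, and treating the response body as JSON is an assumption, since the response format is not described here.

```typescript
// Base path taken from the curl example above.
const API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Hypothetical helper: builds the server URL from an owner and a server name.
function buildServerUrl(owner: string, name: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Hypothetical helper: performs the GET request.
// Assumption: the endpoint returns JSON; the result is left as `unknown`
// because the response shape is not documented in this page.
async function fetchServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(buildServerUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

console.log(buildServerUrl('tatn', 'mcp-server-fetch-typescript'));
```

Calling `fetchServerInfo('tatn', 'mcp-server-fetch-typescript')` would issue the same GET request as the curl command.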

If you have feedback or need assistance with the MCP directory API, please join our Discord server.