Glama

Simple Document Processing MCP Server

html_to_markdown

Converts an HTML file to Markdown. Provide the path to the input HTML file and an output directory; the tool writes the generated Markdown file to that directory under a unique filename.

Instructions

Convert HTML to Markdown format

Input Schema

Name       Required  Description                                    Default
inputPath  Yes       Path to the input HTML file
outputDir  Yes       Directory where Markdown file should be saved

Input Schema (JSON Schema)

{
  "properties": {
    "inputPath": {
      "description": "Path to the input HTML file",
      "type": "string"
    },
    "outputDir": {
      "description": "Directory where Markdown file should be saved",
      "type": "string"
    }
  },
  "required": [
    "inputPath",
    "outputDir"
  ],
  "type": "object"
}
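As an illustration of what the schema requires, the sketch below checks a set of call arguments at runtime before they are passed to the tool. The `HtmlToMarkdownArgs` interface and the `isHtmlToMarkdownArgs` guard are hypothetical names introduced here for illustration; they are not part of the server's source.

```typescript
// Hypothetical type mirroring the input schema: both properties are required strings.
interface HtmlToMarkdownArgs {
  inputPath: string;
  outputDir: string;
}

// Hypothetical runtime guard (not part of the server): returns true only when
// both required properties are present and are strings.
function isHtmlToMarkdownArgs(value: unknown): value is HtmlToMarkdownArgs {
  if (typeof value !== "object" || value === null) {
    return false;
  }
  const candidate = value as Record<string, unknown>;
  return (
    typeof candidate.inputPath === "string" &&
    typeof candidate.outputDir === "string"
  );
}
```

A caller could run this guard on the raw arguments and reject the request early, rather than relying on a type assertion alone.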

Implementation Reference

  • Main handler function: reads the HTML file, converts it to Markdown using TurndownService, and saves the result to the output directory under a unique filename.
    export async function htmlToMarkdown(inputPath: string, outputDir: string) {
      try {
        console.error(`Starting HTML to Markdown conversion...`);
        console.error(`Input file: ${inputPath}`);
        console.error(`Output directory: ${outputDir}`);

        // Ensure the output directory exists
        try {
          await fs.access(outputDir);
          console.error(`Output directory exists: ${outputDir}`);
        } catch {
          console.error(`Creating output directory: ${outputDir}`);
          await fs.mkdir(outputDir, { recursive: true });
          console.error(`Created output directory: ${outputDir}`);
        }

        const uniqueId = generateUniqueId();
        const htmlContent = await fs.readFile(inputPath, "utf-8");
        const turndownService = new TurndownService();
        const markdown = turndownService.turndown(htmlContent);
        const outputPath = path.join(outputDir, `markdown_${uniqueId}.md`);
        await fs.writeFile(outputPath, markdown);
        console.error(`Written Markdown to ${outputPath}`);

        return {
          success: true,
          data: `Successfully converted HTML to Markdown: ${outputPath}`,
        };
      } catch (error) {
        console.error(`Error in htmlToMarkdown:`, error);
        return {
          success: false,
          error: error instanceof Error ? error.message : "Unknown error",
        };
      }
    }
  • Tool schema definition with name, description, and input schema requiring inputPath and outputDir.
    export const HTML_TO_MARKDOWN_TOOL: Tool = {
      name: "html_to_markdown",
      description: "Convert HTML to Markdown format",
      inputSchema: {
        type: "object",
        properties: {
          inputPath: {
            type: "string",
            description: "Path to the input HTML file",
          },
          outputDir: {
            type: "string",
            description: "Directory where Markdown file should be saved",
          },
        },
        required: ["inputPath", "outputDir"],
      },
    };
  • Imports HTML_TO_MARKDOWN_TOOL from htmlTools.js and includes it in the exported tools array for registration.
    import {
      HTML_CLEAN_TOOL,
      HTML_EXTRACT_RESOURCES_TOOL,
      HTML_FORMAT_TOOL,
      HTML_TO_MARKDOWN_TOOL,
      HTML_TO_TEXT_TOOL,
    } from "./htmlTools.js";
    import { PDF_MERGE_TOOL, PDF_SPLIT_TOOL } from "./pdfTools.js";
    import {
      TEXT_DIFF_TOOL,
      TEXT_ENCODING_CONVERT_TOOL,
      TEXT_FORMAT_TOOL,
      TEXT_SPLIT_TOOL,
    } from "./txtTools.js";

    export const tools = [
      DOCUMENT_READER_TOOL,
      PDF_MERGE_TOOL,
      PDF_SPLIT_TOOL,
      DOCX_TO_PDF_TOOL,
      DOCX_TO_HTML_TOOL,
      HTML_CLEAN_TOOL,
      HTML_TO_TEXT_TOOL,
      HTML_TO_MARKDOWN_TOOL,
      HTML_EXTRACT_RESOURCES_TOOL,
      HTML_FORMAT_TOOL,
      TEXT_DIFF_TOOL,
      TEXT_SPLIT_TOOL,
      TEXT_FORMAT_TOOL,
      TEXT_ENCODING_CONVERT_TOOL,
      EXCEL_READ_TOOL,
      FORMAT_CONVERTER_TOOL,
    ];
  • src/index.ts:186-202 (registration)
    Server request handler dispatches calls to 'html_to_markdown' by invoking the htmlToMarkdown function.
    if (name === "html_to_markdown") {
      const { inputPath, outputDir } = args as {
        inputPath: string;
        outputDir: string;
      };
      const result = await htmlToMarkdown(inputPath, outputDir);
      if (!result.success) {
        return {
          content: [{ type: "text", text: `Error: ${result.error}` }],
          isError: true,
        };
      }
      return {
        content: [{ type: "text", text: fileOperationResponse(result.data) }],
        isError: false,
      };
    }
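
The dispatch step above turns the handler's `{ success, data }` or `{ success, error }` result into a tool response with a `content` array and an `isError` flag. A minimal, self-contained sketch of that mapping follows. The `ToolResult` and `ToolResponse` types and the `toToolResponse` function are hypothetical names chosen for illustration, and because `fileOperationResponse` is not shown in the source, the success text is passed through unchanged here.

```typescript
// Hypothetical result type matching the two shapes returned by htmlToMarkdown.
type ToolResult =
  | { success: true; data: string }
  | { success: false; error: string };

// Hypothetical response type matching the objects returned by the dispatch code.
interface ToolResponse {
  content: { type: "text"; text: string }[];
  isError: boolean;
}

// Maps a handler result to a response. Assumption: the real server wraps the
// success text with fileOperationResponse, which is not reproduced here.
function toToolResponse(result: ToolResult): ToolResponse {
  if (result.success === false) {
    return {
      content: [{ type: "text", text: `Error: ${result.error}` }],
      isError: true,
    };
  }
  return {
    content: [{ type: "text", text: result.data }],
    isError: false,
  };
}
```

Modelling the result as a discriminated union lets the compiler check that `error` is only read on the failure branch and `data` only on the success branch.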

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cablate/mcp-doc-forge'
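
The same request can be sketched in TypeScript using the global `fetch` available in recent Node.js versions and browsers. The `buildServerUrl` and `fetchServerInfo` helpers are hypothetical names for illustration. The response is assumed to be JSON and is typed as `unknown`, since its shape is not described here.

```typescript
// Base URL taken from the curl example above.
const MCP_API_BASE = "https://glama.ai/api/mcp/v1";

// Hypothetical helper: builds the URL for a single server entry.
function buildServerUrl(owner: string, name: string): string {
  return `${MCP_API_BASE}/servers/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Hypothetical helper: performs the GET request and returns the parsed body.
// Assumption: the endpoint responds with JSON; the result is left as unknown.
async function fetchServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(buildServerUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}

// Usage (performs a network request, so it is left commented out):
// fetchServerInfo("cablate", "mcp-doc-forge").then((info) => console.log(info));
```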

If you have feedback or need assistance with the MCP directory API, please join our Discord server.