cablate

Simple Document Processing MCP Server

html_to_markdown

Convert HTML files to Markdown format for easier editing and documentation. Specify the input HTML file and an output directory, and the tool writes the converted Markdown file into that directory.

Instructions

Convert HTML to Markdown format

Input Schema

Name       Required  Description                                    Default
inputPath  Yes       Path to the input HTML file
outputDir  Yes       Directory where Markdown file should be saved
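
As a rough illustration of the schema above, a client could build an MCP `tools/call` request like the one below. This is a minimal sketch, not code from the server: the `buildToolCall` helper, the `HtmlToMarkdownArgs` interface, and both file paths are hypothetical names introduced here for the example.

```typescript
// Shape of the arguments required by the input schema.
interface HtmlToMarkdownArgs {
  inputPath: string; // path to the input HTML file
  outputDir: string; // directory where the Markdown file should be saved
}

// Hypothetical helper: wraps the arguments in a JSON-RPC 2.0 "tools/call" request.
function buildToolCall(id: number, args: HtmlToMarkdownArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "html_to_markdown", arguments: args },
  };
}

// Placeholder paths; replace them with real locations on the server's filesystem.
const request = buildToolCall(1, {
  inputPath: "/tmp/example/page.html",
  outputDir: "/tmp/example/out",
});

console.log(JSON.stringify(request, null, 2));
```

Both properties are required, so a request that omits either one does not satisfy the schema.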

Implementation Reference

  • The core handler function that reads an HTML file from inputPath, converts it to Markdown using TurndownService, generates a unique filename, saves it to outputDir, and returns success/error info.
    export async function htmlToMarkdown(inputPath: string, outputDir: string) {
      try {
        console.error(`Starting HTML to Markdown conversion...`);
        console.error(`Input file: ${inputPath}`);
        console.error(`Output directory: ${outputDir}`);

        // Ensure the output directory exists
        try {
          await fs.access(outputDir);
          console.error(`Output directory exists: ${outputDir}`);
        } catch {
          console.error(`Creating output directory: ${outputDir}`);
          await fs.mkdir(outputDir, { recursive: true });
          console.error(`Created output directory: ${outputDir}`);
        }

        const uniqueId = generateUniqueId();
        const htmlContent = await fs.readFile(inputPath, "utf-8");
        const turndownService = new TurndownService();
        const markdown = turndownService.turndown(htmlContent);
        const outputPath = path.join(outputDir, `markdown_${uniqueId}.md`);

        await fs.writeFile(outputPath, markdown);
        console.error(`Written Markdown to ${outputPath}`);

        return {
          success: true,
          data: `Successfully converted HTML to Markdown: ${outputPath}`,
        };
      } catch (error) {
        console.error(`Error in htmlToMarkdown:`, error);
        return {
          success: false,
          error: error instanceof Error ? error.message : "Unknown error",
        };
      }
    }
  • Defines the tool's metadata, name, description, and input schema (requiring inputPath and outputDir strings).
    export const HTML_TO_MARKDOWN_TOOL: Tool = {
      name: "html_to_markdown",
      description: "Convert HTML to Markdown format",
      inputSchema: {
        type: "object",
        properties: {
          inputPath: {
            type: "string",
            description: "Path to the input HTML file",
          },
          outputDir: {
            type: "string",
            description: "Directory where Markdown file should be saved",
          },
        },
        required: ["inputPath", "outputDir"],
      },
    };
  • Registers HTML_TO_MARKDOWN_TOOL in the central tools array exported for use in the MCP server.
    export const tools = [
      DOCUMENT_READER_TOOL,
      PDF_MERGE_TOOL,
      PDF_SPLIT_TOOL,
      DOCX_TO_PDF_TOOL,
      DOCX_TO_HTML_TOOL,
      HTML_CLEAN_TOOL,
      HTML_TO_TEXT_TOOL,
      HTML_TO_MARKDOWN_TOOL,
      HTML_EXTRACT_RESOURCES_TOOL,
      HTML_FORMAT_TOOL,
      TEXT_DIFF_TOOL,
      TEXT_SPLIT_TOOL,
      TEXT_FORMAT_TOOL,
      TEXT_ENCODING_CONVERT_TOOL,
      EXCEL_READ_TOOL,
      FORMAT_CONVERTER_TOOL,
    ];
  • Dispatch handler in the main MCP server that matches the tool name, extracts arguments, calls the htmlToMarkdown function, and formats the response.
    if (name === "html_to_markdown") {
      const { inputPath, outputDir } = args as {
        inputPath: string;
        outputDir: string;
      };
      const result = await htmlToMarkdown(inputPath, outputDir);
      if (!result.success) {
        return {
          content: [{ type: "text", text: `Error: ${result.error}` }],
          isError: true,
        };
      }
      return {
        content: [{ type: "text", text: fileOperationResponse(result.data) }],
        isError: false,
      };
    }
  • Helper function to generate unique IDs for output filenames, used in htmlToMarkdown.
    function generateUniqueId(): string {
      return randomBytes(9).toString("hex");
    }
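
The dispatch handler above casts `args` to the expected shape without checking it at runtime. The following sketch shows one way the two required string arguments could be validated before the conversion runs, and how the output filename pattern comes together. `parseHtmlToMarkdownArgs` and `buildOutputPath` are hypothetical helpers written for this illustration and are not part of the server's source; only `randomBytes(9).toString("hex")` and the `markdown_<id>.md` pattern are taken from the code listed above.

```typescript
import { randomBytes } from "node:crypto";
import * as path from "node:path";

interface HtmlToMarkdownArgs {
  inputPath: string;
  outputDir: string;
}

// Hypothetical helper: checks that both required properties are strings,
// matching the "required" list and property types in the input schema.
function parseHtmlToMarkdownArgs(args: unknown): HtmlToMarkdownArgs {
  if (typeof args !== "object" || args === null) {
    throw new Error("arguments must be an object");
  }
  const record = args as Record<string, unknown>;
  const { inputPath, outputDir } = record;
  if (typeof inputPath !== "string") {
    throw new Error("inputPath must be a string");
  }
  if (typeof outputDir !== "string") {
    throw new Error("outputDir must be a string");
  }
  return { inputPath, outputDir };
}

// Hypothetical helper: 9 random bytes encode to 18 hex characters, so the
// resulting filename looks like markdown_<18 hex chars>.md.
function buildOutputPath(outputDir: string): string {
  const uniqueId = randomBytes(9).toString("hex");
  return path.join(outputDir, `markdown_${uniqueId}.md`);
}

const parsed = parseHtmlToMarkdownArgs({
  inputPath: "page.html", // placeholder value
  outputDir: "out", // placeholder value
});
console.log(buildOutputPath(parsed.outputDir));
```

Because the filename is random rather than derived from the input name, repeated conversions of the same HTML file produce separate Markdown files instead of overwriting one another.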


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cablate/mcp-doc-forge'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.