document-converter-mcp

by guanweiqiang

Server Configuration

Describes the environment variables required to run the server.

Name | Required | Description | Default
DOC_CONVERTER_WORKSPACE | Yes | Absolute path to the workspace directory for safe file access. | (none)
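
The listing does not show how the server enforces "safe file access". As a rough sketch only, a workspace-relative path could be confined like this. The helper name `resolve_in_workspace` is hypothetical and not part of the server.

```python
import os
from pathlib import Path
from typing import Optional


def resolve_in_workspace(relative_path: str, workspace: Optional[str] = None) -> Path:
    """Resolve a workspace-relative path, rejecting anything outside the workspace.

    Hypothetical helper: the real server's checks may differ.
    """
    root_str = workspace if workspace is not None else os.environ.get("DOC_CONVERTER_WORKSPACE")
    if not root_str:
        raise ValueError("DOC_CONVERTER_WORKSPACE is not set")
    root = Path(root_str)
    if not root.is_absolute():
        raise ValueError("DOC_CONVERTER_WORKSPACE must be an absolute path")
    root = root.resolve()
    candidate = (root / relative_path).resolve()
    # Path.is_relative_to (Python 3.9+) is False when ".." segments or an
    # absolute input path lead outside the workspace root.
    if not candidate.is_relative_to(root):
        raise ValueError(f"{relative_path!r} escapes the workspace")
    return candidate
```

Resolving before comparing means that symlinks and `..` segments are normalized first, so the check applies to the real target rather than the literal string.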

Capabilities

Features and capabilities supported by this server

Capability | Details
tools | { "listChanged": true }

Tools

Functions exposed to the LLM to take actions

Name | Description
markdown_to_pdf

Convert a Markdown file to PDF format using Pandoc. Arguments:

  • inputPath (string, required): Path to the input Markdown file (relative to workspace)

  • outputPath (string, optional): Path for the output PDF. Defaults to same name with .pdf extension

  • title (string, optional): Document title for the PDF metadata

  • toc (boolean, optional): Include a table of contents

  • pageSize (enum, optional): Page size — 'A4' or 'Letter'. Defaults to 'A4'

  • theme (enum, optional): Theme — 'default', 'github', or 'academic'. Currently informational

  • pdfEngine (enum, optional): PDF engine — 'pdflatex', 'xelatex', 'lualatex', 'wkhtmltopdf', 'weasyprint', or 'typst'. Leave unset to let Pandoc choose

  • cjkMainFont (string, optional): CJK main font used by xelatex for Chinese/Japanese/Korean PDF output. Examples: 'Microsoft YaHei', 'SimSun', 'Noto Sans CJK SC'

  • preserveSource (boolean, optional): When true, save the original Markdown as a sidecar file (e.g. sample.pdf.source.md) for accurate PDF-to-Markdown recovery. Defaults to false

  • strictMarkdown (boolean, optional): If true, reject files with structural issues (unclosed code blocks). Defaults to false

  • overwrite (boolean, optional): Allow overwriting existing output. Defaults to false
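
To make the arguments above more concrete, here is one way they might translate into a Pandoc command line. This is an assumption about the mapping, not the server's actual code. The function `build_pandoc_pdf_args` is hypothetical. The Pandoc options it emits (`--toc`, `--metadata`, `--pdf-engine`, and the `papersize` and `CJKmainfont` template variables) are standard Pandoc features.

```python
from typing import List, Optional


def build_pandoc_pdf_args(
    input_path: str,
    output_path: str,
    title: Optional[str] = None,
    toc: bool = False,
    page_size: str = "A4",
    pdf_engine: Optional[str] = None,
    cjk_main_font: Optional[str] = None,
) -> List[str]:
    """Hypothetical mapping from markdown_to_pdf arguments to a Pandoc argv list."""
    args = ["pandoc", input_path, "-o", output_path]
    if title:
        args += ["--metadata", f"title={title}"]
    if toc:
        args.append("--toc")
    # Pandoc's LaTeX template reads the paper size from the `papersize` variable.
    args += ["-V", f"papersize={page_size.lower()}"]
    if cjk_main_font:
        # Assumption: default to xelatex when a CJK font is requested and no
        # engine was chosen, since the listing ties cjkMainFont to xelatex.
        pdf_engine = pdf_engine or "xelatex"
        args += ["-V", f"CJKmainfont={cjk_main_font}"]
    if pdf_engine:
        # When pdf_engine stays unset, no flag is added and Pandoc picks its default.
        args.append(f"--pdf-engine={pdf_engine}")
    return args
```

The returned list could be passed to `subprocess.run(args, check=True)` on a machine where Pandoc and the chosen PDF engine are installed.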

markdown_to_docx

Convert a Markdown file to DOCX (Word) format using Pandoc. Arguments:

  • inputPath (string, required): Path to the input Markdown file

  • outputPath (string, optional): Output path. Defaults to same name with .docx

  • referenceDocx (string, optional): Path to a reference DOCX template for styling

  • toc (boolean, optional): Include a table of contents in the DOCX

  • strictMarkdown (boolean, optional): If true, reject files with structural issues (unclosed code blocks). Defaults to false

  • overwrite (boolean, optional): Allow overwriting. Defaults to false
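
The outputPath default and the overwrite flag recur across the tools. A minimal sketch of how they could interact follows, assuming the default is derived by swapping the file extension. The name `choose_output_path` is hypothetical.

```python
from pathlib import Path
from typing import Optional


def choose_output_path(
    input_path: str,
    extension: str,
    output_path: Optional[str] = None,
    overwrite: bool = False,
) -> Path:
    """Pick the output file and refuse to clobber it unless overwrite is set.

    Hypothetical helper illustrating the documented defaults.
    """
    # Default: same directory and base name as the input, with the new extension.
    target = Path(output_path) if output_path else Path(input_path).with_suffix(extension)
    if target.exists() and not overwrite:
        raise FileExistsError(f"{target} already exists; set overwrite to true to replace it")
    return target
```

For example, `choose_output_path("docs/report.md", ".docx")` yields `docs/report.docx` when that file does not exist yet.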

docx_to_markdown

Convert a DOCX file to Markdown format using Pandoc or MarkItDown. Arguments:

  • inputPath (string, required): Path to the input DOCX file

  • outputPath (string, optional): Output path. Defaults to same name with .md

  • extractImages (boolean, optional): Extract embedded images. Defaults to false

  • imageDir (string, optional): Directory to store extracted images

  • engine (enum, optional): Conversion engine — 'pandoc' or 'markitdown'. Defaults to 'pandoc'

  • markdownFlavor (enum, optional): Markdown dialect for Pandoc output — 'gfm', 'commonmark', or 'pandoc'. Defaults to 'gfm'

  • cleanForLLM (boolean, optional): Clean up the Markdown for LLM consumption. Defaults to false

  • overwrite (boolean, optional): Allow overwriting. Defaults to false
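
An MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a message for docx_to_markdown. The file paths are made-up examples, and `make_tool_call` is a hypothetical helper rather than part of any SDK.

```python
import json
from typing import Any, Dict


def make_tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(request)


# Example arguments; the paths are illustrative and relative to the workspace.
message = make_tool_call(1, "docx_to_markdown", {
    "inputPath": "reports/q3.docx",
    "extractImages": True,
    "imageDir": "reports/q3-images",
    "engine": "pandoc",
    "markdownFlavor": "gfm",
    "cleanForLLM": True,
})
```

The other tools in this listing would be called the same way, with only `name` and `arguments` changing.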

pdf_to_markdown

Extract text content from a PDF file into Markdown format.

IMPORTANT: This is CONTENT EXTRACTION, not layout reconstruction.

  • Scanned PDFs, complex tables, two-column papers, and mathematical formulas may not convert reliably.

  • For scanned PDFs, an OCR engine is required (not included).

  • The default engine is MarkItDown (better text extraction). If MarkItDown is unavailable, the tool falls back to Pandoc.

Arguments:

  • inputPath (string, required): Path to the input PDF file

  • outputPath (string, optional): Output path. Defaults to same name with .md

  • engine (enum, optional): Engine — 'markitdown' (default) or 'pandoc'

  • cleanForLLM (boolean, optional): Clean up Markdown for LLM consumption

  • preferSourceSidecar (boolean, optional): When true (default), first check for a source sidecar file (sample.pdf.source.md) and return it instead of extracting PDF text. This is the only reliable way to recover original Markdown structure.

  • overwrite (boolean, optional): Allow overwriting. Defaults to false
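
The sidecar lookup described for preferSourceSidecar can be pictured roughly as follows. This is a sketch under the assumption that the sidecar sits next to the PDF and is named by appending `.source.md`, as in the `sample.pdf.source.md` example. Both function names are hypothetical.

```python
from pathlib import Path
from typing import Optional


def sidecar_path(pdf_path: str) -> Path:
    """Map sample.pdf to sample.pdf.source.md (naming taken from the listing's example)."""
    return Path(pdf_path + ".source.md")


def recover_source(pdf_path: str, prefer_source_sidecar: bool = True) -> Optional[str]:
    """Return the sidecar Markdown if allowed and present.

    None signals that the caller has to fall back to PDF text extraction.
    """
    sidecar = sidecar_path(pdf_path)
    if prefer_source_sidecar and sidecar.is_file():
        return sidecar.read_text(encoding="utf-8")
    return None
```

A sidecar only exists if the PDF was produced by markdown_to_pdf with preserveSource set to true, so PDFs from other sources always go through extraction.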

markdown_to_html

Convert a Markdown file to HTML format using Pandoc. Arguments:

  • inputPath (string, required): Path to the input Markdown file

  • outputPath (string, optional): Output path. Defaults to same name with .html

  • cssPath (string, optional): Path to a CSS stylesheet to embed

  • standalone (boolean, optional): Generate a complete HTML document with head/body. Defaults to true

  • strictMarkdown (boolean, optional): If true, reject files with structural issues (unclosed code blocks). Defaults to false

  • overwrite (boolean, optional): Allow overwriting. Defaults to false
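
The strictMarkdown option rejects files with unclosed code blocks. The listing does not say how the check is implemented. One simplified approach, loosely following CommonMark's fence rules and ignoring fences nested in lists or block quotes, could look like this. The function name is hypothetical.

```python
import re

# A fence line: up to 3 spaces of indent, then 3+ backticks or 3+ tildes.
_FENCE = re.compile(r"^ {0,3}(`{3,}|~{3,})(.*)$")


def has_unclosed_code_fence(markdown: str) -> bool:
    """Return True if a fenced code block is still open at end of input."""
    open_fence = None  # (fence character, fence length) of the block currently open
    for line in markdown.splitlines():
        match = _FENCE.match(line)
        if not match:
            continue
        marker, rest = match.group(1), match.group(2)
        if open_fence is None:
            # A backtick fence whose info string contains a backtick is not a fence.
            if marker[0] == "`" and "`" in rest:
                continue
            open_fence = (marker[0], len(marker))
        elif (
            marker[0] == open_fence[0]
            and len(marker) >= open_fence[1]
            and not rest.strip()
        ):
            # A closing fence uses the same character, is at least as long
            # as the opener, and carries no info string.
            open_fence = None
    return open_fence is not None
```

With a check like this, a strict mode would raise an error before invoking Pandoc, while the non-strict default would convert the file anyway.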

batch_convert

Convert all matching files in a directory from one format to another. Individual file failures do NOT abort the entire batch. Arguments:

  • inputDir (string, required): Source directory (relative to workspace)

  • outputDir (string, required): Destination directory (relative to workspace)

  • from (enum, required): Source format — 'md', 'markdown', 'docx', or 'pdf'

  • to (enum, required): Target format — 'md', 'markdown', 'docx', 'pdf', or 'html'

  • recursive (boolean, optional): Traverse subdirectories. Defaults to false

  • overwrite (boolean, optional): Overwrite existing files. Defaults to false

  • cleanForLLM (boolean, optional): Clean Markdown output for LLM consumption
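
The guarantee that individual failures do not abort the batch amounts to catching errors per file and reporting them at the end. A minimal sketch follows, assuming a caller-supplied `convert_one` function and a hypothetical report shape. The server's actual result format is not shown in this listing.

```python
from pathlib import Path
from typing import Callable, Dict, List


def batch_convert(
    input_dir: str,
    output_dir: str,
    from_ext: str,
    to_ext: str,
    convert_one: Callable[[Path, Path], None],
    recursive: bool = False,
) -> Dict[str, List[str]]:
    """Convert every matching file, collecting failures instead of stopping."""
    src_root, dst_root = Path(input_dir), Path(output_dir)
    pattern = f"*.{from_ext}"
    files = sorted(src_root.rglob(pattern) if recursive else src_root.glob(pattern))
    report: Dict[str, List[str]] = {"converted": [], "failed": []}
    for src in files:
        rel = src.relative_to(src_root)
        # Mirror the source layout under the output directory.
        dst = (dst_root / rel).with_suffix(f".{to_ext}")
        try:
            dst.parent.mkdir(parents=True, exist_ok=True)
            convert_one(src, dst)
            report["converted"].append(str(rel))
        except Exception as exc:
            # Record the failure and continue with the remaining files.
            report["failed"].append(f"{rel}: {exc}")
    return report
```

Catching `Exception` broadly is deliberate here: one malformed document should show up in the report rather than hide the results for every other file.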

Prompts

Interactive templates invoked by user choice

No prompts

Resources

Contextual data attached and managed by the client

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/guanweiqiang/document-convert-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.