Skip to main content
Glama

File Converter MCP

CI/CD Pipeline npm version License: MIT

A Model Context Protocol (MCP) server that aggregates various file conversion tools for quick formatting and file type transformations.

Features

Supported Conversions

  • PDF to Markdown - Convert PDF documents to markdown format

  • Image Format Conversion - Transform between common image formats (PNG, JPG, WebP, etc.)

  • Document Conversion - Convert between document formats (DOCX, TXT, HTML, etc.)

  • Spreadsheet Conversion - Transform spreadsheet formats (CSV, XLSX, JSON, etc.)

  • Code Format Conversion - Convert between code formats and syntax highlighting

  • Archive Operations - Extract and create archive files (ZIP, TAR, etc.)

Conversion Engines

  • PDF Engine: marker (recommended) and pymupdf4llm support

  • Image Engine: Sharp and ImageMagick integration

  • Document Engine: Pandoc integration for broad format support

  • Archive Engine: Built-in Node.js compression libraries

Installation

npm install -g file-converter-mcp

Dependencies

Install conversion engines based on your needs:

# PDF conversion engines pip install marker-pdf pymupdf4llm # Image processing (choose one) npm install sharp # OR brew install imagemagick # macOS apt-get install imagemagick # Ubuntu # Document conversion brew install pandoc # macOS apt-get install pandoc # Ubuntu # Archive tools (usually pre-installed) # zip, unzip, tar, gzip

Usage

MCP Configuration

Add to your MCP client configuration:

{ "mcpServers": { "file-converter": { "command": "file-converter-mcp", "args": [] } } }

Available Tools

PDF Conversion

  • convert_pdf_to_markdown - Convert PDF files to Markdown

  • extract_pdf_text - Extract plain text from PDF files

  • extract_pdf_images - Extract images from PDF files

Image Conversion

  • convert_image_format - Convert between image formats

  • resize_image - Resize images with quality options

  • compress_image - Reduce image file size

Document Conversion

  • convert_document - Convert between document formats using Pandoc

  • extract_document_text - Extract text from various document formats

  • convert_markdown_to_html - Convert Markdown to HTML with styling

Spreadsheet Conversion

  • convert_csv_to_json - Convert CSV data to JSON format

  • convert_json_to_csv - Convert JSON data to CSV format

  • convert_xlsx_to_csv - Extract CSV data from Excel files

Archive Operations

  • create_archive - Create ZIP or TAR archives from files/folders

  • extract_archive - Extract contents from archive files

  • list_archive_contents - List files in archive without extracting

Utility Tools

  • detect_file_type - Identify file format and encoding

  • validate_conversion - Check if conversion is supported

  • batch_convert - Convert multiple files in one operation

Examples

Basic PDF Conversion

// Convert PDF to Markdown await client.callTool("convert_pdf_to_markdown", { input_path: "/path/to/document.pdf", output_path: "/path/to/output.md", options: { engine: "marker", preserve_formatting: true } });

Image Format Conversion

// Convert PNG to WebP with compression await client.callTool("convert_image_format", { input_path: "/path/to/image.png", output_path: "/path/to/image.webp", options: { quality: 80, format: "webp" } });

Document Conversion

// Convert DOCX to Markdown using Pandoc await client.callTool("convert_document", { input_path: "/path/to/document.docx", output_path: "/path/to/document.md", options: { format: "markdown", preserve_styles: false } });

Batch Operations

// Convert multiple files at once await client.callTool("batch_convert", { input_directory: "/path/to/input/", output_directory: "/path/to/output/", conversions: [ { from: "pdf", to: "markdown" }, { from: "png", to: "webp" }, { from: "docx", to: "txt" } ] });

Configuration Options

Conversion Settings

interface ConversionOptions { engine?: string; // Conversion engine to use quality?: number; // Output quality (1-100) preserve_formatting?: boolean; // Maintain original formatting output_format?: string; // Specific output format compression_level?: number; // Compression level (0-9) custom_options?: Record<string, any>; // Engine-specific options }

Supported File Types

Input Formats

  • Documents: PDF, DOCX, DOC, RTF, TXT, HTML, XML

  • Images: PNG, JPG, JPEG, WebP, GIF, BMP, TIFF, SVG

  • Spreadsheets: CSV, XLSX, XLS, JSON, TSV

  • Archives: ZIP, TAR, GZ, 7Z, RAR (extract only)

  • Code: Various programming language files

Output Formats

  • Text: Markdown, HTML, TXT, RTF

  • Images: PNG, JPG, WebP, GIF, BMP

  • Data: JSON, CSV, XML, YAML

  • Archives: ZIP, TAR, GZ

Performance Considerations

  • Memory Usage: Large files are processed in chunks to prevent memory issues

  • Processing Speed: Different engines have different speed/quality tradeoffs

  • Batch Processing: More efficient for multiple file conversions

  • Caching: Converted files can be cached to avoid re-processing

Error Handling

The server provides comprehensive error handling:

  • Input file validation and format detection

  • Graceful fallback between conversion engines

  • Detailed error messages with suggested solutions

  • Progress tracking for long-running conversions

Development

# Clone repository git clone https://github.com/cordlesssteve/file-converter-mcp.git cd file-converter-mcp # Install dependencies npm install # Build project npm run build # Run development mode npm run dev # Run tests npm test

Contributing

  1. Fork the repository

  2. Create a feature branch

  3. Add support for new file formats or conversion engines

  4. Add tests for new functionality

  5. Submit a pull request

License

MIT License - see LICENSE file for details.

Support

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/cordlesssteve/document-organizer-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server