cordlesssteve

Document Organizer MCP Server

File Converter MCP


A Model Context Protocol (MCP) server that aggregates various file conversion tools for quick formatting and file type transformations.

Features

Supported Conversions

  • PDF to Markdown - Convert PDF documents to Markdown

  • Image Format Conversion - Transform between common image formats (PNG, JPG, WebP, etc.)

  • Document Conversion - Convert between document formats (DOCX, TXT, HTML, etc.)

  • Spreadsheet Conversion - Transform spreadsheet formats (CSV, XLSX, JSON, etc.)

  • Code Format Conversion - Convert between code formats and apply syntax highlighting

  • Archive Operations - Extract and create archive files (ZIP, TAR, etc.)

Conversion Engines

  • PDF Engine: marker (recommended) and pymupdf4llm support

  • Image Engine: Sharp and ImageMagick integration

  • Document Engine: Pandoc integration for broad format support

  • Archive Engine: Built-in Node.js compression libraries

Installation

npm install -g file-converter-mcp

Dependencies

Install conversion engines based on your needs:

# PDF conversion engines
pip install marker-pdf pymupdf4llm

# Image processing (choose one)
npm install sharp
# OR
brew install imagemagick  # macOS
sudo apt-get install imagemagick  # Ubuntu

# Document conversion
brew install pandoc  # macOS
sudo apt-get install pandoc  # Ubuntu

# Archive tools (usually pre-installed)
# zip, unzip, tar, gzip

Usage

MCP Configuration

Add to your MCP client configuration:

{
  "mcpServers": {
    "file-converter": {
      "command": "file-converter-mcp",
      "args": []
    }
  }
}
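
If your client already has a configuration file with other servers, the entry has to be merged rather than pasted over the whole file. The following is a minimal sketch of that merge, assuming the config is a plain object with an `mcpServers` map as shown above. The `withFileConverter` helper and the `other` server entry are hypothetical names used only for illustration.

```typescript
// Shape of one server entry, mirroring the JSON above.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: returns a copy of `config` with the file-converter
// entry added. Other servers are preserved; an existing "file-converter"
// entry is overwritten.
function withFileConverter(config: Partial<McpClientConfig>): McpClientConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "file-converter": { command: "file-converter-mcp", args: [] },
    },
  };
}

// Example: a config that already lists another (made-up) server.
const merged = withFileConverter({
  mcpServers: { other: { command: "other-server", args: ["--flag"] } },
});
console.log(JSON.stringify(merged, null, 2));
```

Where the configuration file lives depends on the MCP client, so reading and writing it is left out of the sketch.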

Available Tools

PDF Conversion

  • convert_pdf_to_markdown - Convert PDF files to Markdown

  • extract_pdf_text - Extract plain text from PDF files

  • extract_pdf_images - Extract images from PDF files

Image Conversion

  • convert_image_format - Convert between image formats

  • resize_image - Resize images with quality options

  • compress_image - Reduce image file size

Document Conversion

  • convert_document - Convert between document formats using Pandoc

  • extract_document_text - Extract text from various document formats

  • convert_markdown_to_html - Convert Markdown to HTML with styling

Spreadsheet Conversion

  • convert_csv_to_json - Convert CSV data to JSON format

  • convert_json_to_csv - Convert JSON data to CSV format

  • convert_xlsx_to_csv - Extract CSV data from Excel files

Archive Operations

  • create_archive - Create ZIP or TAR archives from files/folders

  • extract_archive - Extract contents from archive files

  • list_archive_contents - List files in archive without extracting

Utility Tools

  • detect_file_type - Identify file format and encoding

  • validate_conversion - Check if conversion is supported

  • batch_convert - Convert multiple files in one operation
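
To give a sense of what format identification can involve, here is a minimal sketch that checks a file's leading bytes against well-known signatures. This is an illustration only, not the server's `detect_file_type` implementation; the `sniffFileType` function and the `SIGNATURES` table are hypothetical names.

```typescript
// Leading "magic" bytes for a few of the supported formats.
const SIGNATURES: Array<{ type: string; bytes: number[] }> = [
  { type: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // "%PDF"
  { type: "png", bytes: [0x89, 0x50, 0x4e, 0x47] }, // "\x89PNG"
  { type: "jpg", bytes: [0xff, 0xd8, 0xff] },
  { type: "gif", bytes: [0x47, 0x49, 0x46, 0x38] }, // "GIF8"
  { type: "zip", bytes: [0x50, 0x4b, 0x03, 0x04] }, // "PK\x03\x04"
];

// Hypothetical helper: returns the first format whose signature matches the
// start of `head`, or undefined when nothing matches.
function sniffFileType(head: Uint8Array): string | undefined {
  for (const { type, bytes } of SIGNATURES) {
    if (bytes.length <= head.length && bytes.every((b, i) => head[i] === b)) {
      return type;
    }
  }
  return undefined;
}

// The first bytes of a PDF file are "%PDF-".
console.log(sniffFileType(new Uint8Array([0x25, 0x50, 0x44, 0x46, 0x2d])));
```

Signature checks have limits: DOCX and XLSX files are ZIP containers, so this sketch reports them as `zip`, and plain-text formats such as CSV or Markdown have no signature at all.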

Examples

Basic PDF Conversion

// Convert PDF to Markdown
await client.callTool("convert_pdf_to_markdown", {
  input_path: "/path/to/document.pdf",
  output_path: "/path/to/output.md",
  options: {
    engine: "marker",
    preserve_formatting: true
  }
});

Image Format Conversion

// Convert PNG to WebP with compression
await client.callTool("convert_image_format", {
  input_path: "/path/to/image.png",
  output_path: "/path/to/image.webp",
  options: {
    quality: 80,
    format: "webp"
  }
});

Document Conversion

// Convert DOCX to Markdown using Pandoc
await client.callTool("convert_document", {
  input_path: "/path/to/document.docx",
  output_path: "/path/to/document.md",
  options: {
    format: "markdown",
    preserve_styles: false
  }
});

Batch Operations

// Convert multiple files at once
await client.callTool("batch_convert", {
  input_directory: "/path/to/input/",
  output_directory: "/path/to/output/",
  conversions: [
    { from: "pdf", to: "markdown" },
    { from: "png", to: "webp" },
    { from: "docx", to: "txt" }
  ]
});
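
The `conversions` rules above pair a source extension with a target format. As a rough sketch of how such rules could be matched against a list of files, assuming matching is done purely on the file extension, the following builds an input/output plan. The `planBatch` function and the `EXTENSION_FOR` table (which maps the format name `markdown` to the extension `md`) are hypothetical and are not part of the server's API.

```typescript
interface ConversionRule {
  from: string;
  to: string;
}

interface PlannedConversion {
  input: string;
  output: string;
}

// Assumption: a target format name may differ from its file extension.
const EXTENSION_FOR: Record<string, string> = { markdown: "md" };

// Hypothetical helper: pairs each file with the first rule whose `from`
// matches its extension. Files without a matching rule are skipped.
function planBatch(
  files: string[],
  outputDir: string,
  rules: ConversionRule[],
): PlannedConversion[] {
  const dir = outputDir.replace(/\/+$/, ""); // drop trailing slashes
  const plan: PlannedConversion[] = [];
  for (const input of files) {
    const name = input.slice(input.lastIndexOf("/") + 1);
    const dot = name.lastIndexOf(".");
    if (dot <= 0) continue; // no extension, or a dotfile
    const ext = name.slice(dot + 1).toLowerCase();
    const rule = rules.find((r) => r.from === ext);
    if (!rule) continue;
    const outExt = EXTENSION_FOR[rule.to] ?? rule.to;
    plan.push({ input, output: `${dir}/${name.slice(0, dot)}.${outExt}` });
  }
  return plan;
}

const plan = planBatch(
  ["/in/report.pdf", "/in/logo.png", "/in/notes.txt"],
  "/out/",
  [
    { from: "pdf", to: "markdown" },
    { from: "png", to: "webp" },
  ],
);
console.log(plan);
```

In this example `notes.txt` is left out of the plan because no rule has `from: "txt"`.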

Configuration Options

Conversion Settings

interface ConversionOptions {
  engine?: string;                    // Conversion engine to use
  quality?: number;                   // Output quality (1-100)
  preserve_formatting?: boolean;      // Maintain original formatting
  output_format?: string;             // Specific output format
  compression_level?: number;         // Compression level (0-9)
  custom_options?: Record<string, any>; // Engine-specific options
}
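
The ranges in the comments above (quality 1-100, compression level 0-9) can be checked on the client side before a tool is called. The sketch below shows one way to do that. The `validateOptions` function and the `RangedOptions` type are hypothetical, and whether the server itself rejects out-of-range values is not stated here.

```typescript
// Only the two range-limited fields of ConversionOptions are needed here.
interface RangedOptions {
  quality?: number;
  compression_level?: number;
}

// Hypothetical helper: returns a list of problems, empty when the options
// are within the documented ranges. Unset fields are not checked.
function validateOptions(options: RangedOptions): string[] {
  const problems: string[] = [];
  const { quality, compression_level } = options;
  if (quality !== undefined && !(quality >= 1 && quality <= 100)) {
    problems.push("quality must be between 1 and 100");
  }
  if (
    compression_level !== undefined &&
    !(compression_level >= 0 && compression_level <= 9)
  ) {
    problems.push("compression_level must be between 0 and 9");
  }
  return problems;
}

console.log(validateOptions({ quality: 80, compression_level: 6 }));
console.log(validateOptions({ quality: 0 }));
```

Writing the checks as negated range tests means `NaN` is reported as a problem too.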

Supported File Types

Input Formats

  • Documents: PDF, DOCX, DOC, RTF, TXT, HTML, XML

  • Images: PNG, JPG, JPEG, WebP, GIF, BMP, TIFF, SVG

  • Spreadsheets: CSV, XLSX, XLS, JSON, TSV

  • Archives: ZIP, TAR, GZ; 7Z and RAR (extract only)

  • Code: Various programming language files

Output Formats

  • Text: Markdown, HTML, TXT, RTF

  • Images: PNG, JPG, WebP, GIF, BMP

  • Data: JSON, CSV, XML, YAML

  • Archives: ZIP, TAR, GZ
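
The two lists above can be turned into a quick pre-check. The sketch below only tests whether a source format appears in the input list and a target format appears in the output list. It is a necessary condition rather than a sufficient one: the lists do not say which specific pairs work, so the server's `validate_conversion` tool remains the authority. The `isListedPair` function is a hypothetical name.

```typescript
// Formats copied from the Input Formats list (lowercased).
const INPUT_FORMATS = new Set([
  "pdf", "docx", "doc", "rtf", "txt", "html", "xml",
  "png", "jpg", "jpeg", "webp", "gif", "bmp", "tiff", "svg",
  "csv", "xlsx", "xls", "json", "tsv",
  "zip", "tar", "gz", "7z", "rar",
]);

// Formats copied from the Output Formats list (lowercased).
const OUTPUT_FORMATS = new Set([
  "markdown", "html", "txt", "rtf",
  "png", "jpg", "webp", "gif", "bmp",
  "json", "csv", "xml", "yaml",
  "zip", "tar", "gz",
]);

// Hypothetical helper: true when `from` is a listed input format and `to`
// is a listed output format. Does not check that the pair makes sense.
function isListedPair(from: string, to: string): boolean {
  return (
    INPUT_FORMATS.has(from.toLowerCase()) && OUTPUT_FORMATS.has(to.toLowerCase())
  );
}

console.log(isListedPair("PDF", "markdown"));
console.log(isListedPair("png", "rar"));
```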

Performance Considerations

  • Memory Usage: Large files are processed in chunks to prevent memory issues

  • Processing Speed: Different engines have different speed/quality tradeoffs

  • Batch Processing: Converting several files in one batch_convert call is more efficient than converting them one at a time

  • Caching: Converted files can be cached to avoid re-processing
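
As an illustration of the caching idea, the sketch below keys cached results on the input path, the file's modification time, and the conversion options, so that a changed file or a changed option produces a cache miss. This is an assumption about how such a cache could work, not a description of the server's internals. `cacheKey` and `cachedOutputs` are hypothetical names.

```typescript
type Options = Record<string, unknown>;

// Hypothetical helper: builds a key that does not depend on the order of
// the top-level option keys. Nested objects are serialized as-is.
function cacheKey(inputPath: string, modifiedMs: number, options: Options): string {
  const sorted = Object.keys(options)
    .sort()
    .map((key) => [key, options[key]]);
  return JSON.stringify([inputPath, modifiedMs, sorted]);
}

// In-memory map from cache key to the path of the converted file.
const cachedOutputs = new Map<string, string>();

const key = cacheKey("/path/to/document.pdf", 1700000000000, {
  engine: "marker",
  preserve_formatting: true,
});
cachedOutputs.set(key, "/path/to/output.md");

// The same file and options in a different key order hit the cache.
const sameKey = cacheKey("/path/to/document.pdf", 1700000000000, {
  preserve_formatting: true,
  engine: "marker",
});
console.log(cachedOutputs.get(sameKey));
```

A content hash would be a more robust key than the modification time, at the cost of reading the whole file before each lookup.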

Error Handling

The server provides comprehensive error handling:

  • Input file validation and format detection

  • Graceful fallback between conversion engines

  • Detailed error messages with suggested solutions

  • Progress tracking for long-running conversions
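
The fallback behavior listed above follows a common pattern: try each engine in order of preference and move on when one fails. The sketch below shows that pattern in general form, assuming each engine exposes an async `convert` function. The `Engine` type and `convertWithFallback` function are hypothetical, and the stub engines stand in for real ones such as marker and pymupdf4llm.

```typescript
// Hypothetical engine shape: a name plus an async conversion function.
interface Engine {
  name: string;
  convert: (inputPath: string) => Promise<string>;
}

// Tries each engine in order and returns the first successful result.
// If every engine fails, throws one error listing each failure.
async function convertWithFallback(
  inputPath: string,
  engines: Engine[],
): Promise<{ engine: string; output: string }> {
  const failures: string[] = [];
  for (const engine of engines) {
    try {
      const output = await engine.convert(inputPath);
      return { engine: engine.name, output };
    } catch (err) {
      const reason = err instanceof Error ? err.message : String(err);
      failures.push(`${engine.name}: ${reason}`);
    }
  }
  throw new Error(`All engines failed for ${inputPath}: ${failures.join("; ")}`);
}

// Stub engines for illustration: the first always fails, the second succeeds.
const stubEngines: Engine[] = [
  { name: "marker", convert: async () => { throw new Error("not installed"); } },
  { name: "pymupdf4llm", convert: async (p) => `# Converted ${p}` },
];

convertWithFallback("/path/to/document.pdf", stubEngines).then((result) => {
  console.log(result.engine);
});
```

Collecting the per-engine reasons into the final error keeps the message detailed enough to point at a fix, such as a missing dependency.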

Development

# Clone repository
git clone https://github.com/cordlesssteve/file-converter-mcp.git
cd file-converter-mcp

# Install dependencies
npm install

# Build project
npm run build

# Run development mode
npm run dev

# Run tests
npm test

Contributing

  1. Fork the repository

  2. Create a feature branch

  3. Add support for new file formats or conversion engines

  4. Add tests for new functionality

  5. Submit a pull request

License

MIT License - see LICENSE file for details.
