Document Operations MCP Server
A universal MCP server for document processing, conversion, and automation. It handles PDF, DOCX, HTML, Markdown, and more through a unified API and toolset.
Demo
Video
https://github.com/user-attachments/assets/43dfeeec-8097-413e-8519-a7de98e31136
In this demo, we showcase how to:
- Configure doc-ops-mcp in MCP clients
- Convert DOCX documents to PDF format
- Add default watermarks to converted PDF files
Table of Contents
- Quick Start
- System Architecture
- Optional Integration
- Features
- Open Source Licenses
- Future Roadmap
- Docker Deployment
- Development Guide
- Troubleshooting
- Contributing
1. Quick Start
First, add the Document Operations MCP server to your MCP client.
Standard config works in most MCP clients:
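As a minimal sketch, assuming the common mcpServers layout that many MCP clients use and the npx -y doc-ops-mcp command, a standard entry might be built as follows. The server key "doc-ops-mcp" and the interface names are illustrative, not required names.

```typescript
// Sketch of a standard MCP client entry. The "mcpServers" layout and the
// "doc-ops-mcp" key are assumptions; check your client's documentation.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

const standardConfig: McpClientConfig = {
  mcpServers: {
    "doc-ops-mcp": {
      command: "npx",
      args: ["-y", "doc-ops-mcp"],
    },
  },
};

// Paste the printed JSON into the client's MCP configuration file.
console.log(JSON.stringify(standardConfig, null, 2));
```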
Follow the MCP install guide and use the standard config above.
Go to Cursor Settings -> MCP -> Add new MCP Server. Name it to your liking and use the command type with the command npx -y doc-ops-mcp.
For other MCP clients, use the standard config above and refer to your client's documentation for MCP server installation.
Configuration
The Document Operations MCP server supports configuration through environment variables. These can be provided in the MCP client configuration as part of the "env" object:
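A hedged sketch of attaching an "env" object to a server entry is shown below. The ServerEntry shape, the withEnv helper, and every path are illustrative assumptions; only the variable names come from this document.

```typescript
// Sketch: attach environment variables to a server entry via its "env" object.
// The entry shape, helper name, and all paths are illustrative assumptions.
interface ServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

function withEnv(entry: ServerEntry, env: Record<string, string>): ServerEntry {
  // Later keys override earlier ones; the input entry is left unchanged.
  return { ...entry, env: { ...(entry.env ?? {}), ...env } };
}

const configured = withEnv(
  { command: "npx", args: ["-y", "doc-ops-mcp"] },
  {
    OUTPUT_DIR: "/path/to/output", // hypothetical path
    CACHE_DIR: "/path/to/cache", // hypothetical path
    WATERMARK_IMAGE: "/path/to/logo.png", // hypothetical path
  },
);

console.log(JSON.stringify({ mcpServers: { "doc-ops-mcp": configured } }, null, 2));
```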
Supported Document Operations
Format | Convert to PDF | Convert to DOCX | Convert to HTML | Convert to Markdown | Content Rewriting | Watermark/QR Code |
---|---|---|---|---|---|---|
PDF | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ |
DOCX | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
HTML | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ |
Markdown | ✅ | ✅ | ✅ | ✅ | ✅ | ❌ |
Rewriting Features:
- Content Replacement: Support batch text replacement and regular expression replacement
- Format Adjustment: Modify document structure, heading levels, and style formatting
- Smart Rewriting: Content optimization while preserving original document format
Usage Examples
Format Conversion:
Document Rewriting:
PDF Enhancement:
Environment Variables
The server supports environment variables for controlling output paths and PDF enhancement features:
Core Directories
- OUTPUT_DIR: Controls where all generated files are saved (default: ~/Documents)
- CACHE_DIR: Directory for temporary and cache files (default: ~/.cache/doc-ops-mcp)
PDF Enhancement Features
- WATERMARK_IMAGE: Default watermark image path for PDF files
  - Automatically added to all PDF conversions
  - Supported formats: PNG, JPG
  - If not set, the default text watermark "doc-ops-mcp" is used
- QR_CODE_IMAGE: Default QR code image path for PDF files
  - Added to PDFs only when explicitly requested (addQrCode=true)
  - Supported formats: PNG, JPG
  - If not set, QR code functionality is unavailable
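The enhancement rules above can be sketched as a small decision function. This is an illustration under stated assumptions, not the server's code: the planEnhancements name and the result types are hypothetical, and raising an error when a QR code is requested without QR_CODE_IMAGE is one possible reading of "unavailable".

```typescript
// Sketch of the watermark and QR code rules; not the server's actual code.
type EnhancementEnv = Record<string, string | undefined>;

type Watermark =
  | { kind: "image"; path: string }
  | { kind: "text"; text: string };

interface Enhancements {
  watermark: Watermark;
  qrCodePath: string | null; // null means no QR code is added
}

function planEnhancements(env: EnhancementEnv, addQrCode: boolean): Enhancements {
  const image = env["WATERMARK_IMAGE"];
  const watermark: Watermark = image
    ? { kind: "image", path: image }
    : { kind: "text", text: "doc-ops-mcp" }; // documented fallback text

  const qr = env["QR_CODE_IMAGE"];
  if (addQrCode && !qr) {
    // Assumption: a missing QR_CODE_IMAGE is reported as an error.
    throw new Error("addQrCode=true requires QR_CODE_IMAGE to be set");
  }
  return { watermark, qrCodePath: addQrCode && qr ? qr : null };
}
```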
Output Path Rules:
- If outputPath is not provided → files are saved to OUTPUT_DIR with auto-generated names
- If outputPath is relative → it is resolved relative to OUTPUT_DIR
- If outputPath is absolute → it is used as-is, ignoring OUTPUT_DIR
See OUTPUT_PATH_CONTROL.md for detailed documentation.
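The three output path rules can be sketched as follows for POSIX-style paths. The resolveOutputPath name and the auto-generated file name format are placeholders; the server's real naming scheme is not described here.

```typescript
// Sketch of the output path rules for POSIX-style paths. The auto-generated
// name format is a placeholder; the server's real naming scheme may differ.
function resolveOutputPath(
  outputDir: string,
  outputPath: string | undefined,
  autoName: () => string = () => `output-${Date.now()}`,
): string {
  const base = outputDir.replace(/\/+$/, ""); // drop trailing slashes
  if (!outputPath) return `${base}/${autoName()}`; // rule 1: not provided
  if (outputPath.startsWith("/")) return outputPath; // rule 3: absolute, used as-is
  return `${base}/${outputPath.replace(/^\.\//, "")}`; // rule 2: relative to OUTPUT_DIR
}
```

In a Node.js code base the same checks would more likely use path.isAbsolute and path.resolve, which also handle Windows paths.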
2. System Architecture
Document Operations MCP Server uses a pure JavaScript architecture that provides complete document processing capabilities:
Architecture Overview
Core Features:
- Pure JavaScript implementation with no external system dependencies
- Complete document reading, conversion, and style processing capabilities
- Built-in PDF watermark and QR code addition functionality
- Intelligent conversion planning and path optimization
Conversion Flow:
- Direct Conversion: Supports direct conversion between most formats
- Multi-step Conversion: Complex conversions achieved through intermediate formats
- Style Preservation: Uses OOXML parser to ensure complete style integrity
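To illustrate the difference between direct and multi-step conversion, the sketch below finds the shortest route between two formats with a breadth-first search. The edge list is an assumption derived from the names of the dedicated conversion tools; it is not the planner's actual conversion graph, and findConversionPath is a hypothetical helper.

```typescript
// Sketch: shortest conversion path via breadth-first search.
// The edge list is illustrative; it is not the planner's real graph.
type Format = "pdf" | "docx" | "html" | "markdown";

const directEdges: Record<Format, Format[]> = {
  docx: ["pdf"],
  markdown: ["pdf", "html", "docx"],
  html: ["markdown"],
  pdf: [],
};

function findConversionPath(source: Format, target: Format): Format[] | null {
  if (source === target) return [source];
  const previous = new Map<Format, Format>();
  const visited = new Set<Format>([source]);
  const queue: Format[] = [source];

  while (queue.length > 0) {
    const current = queue.shift() as Format;
    for (const next of directEdges[current]) {
      if (visited.has(next)) continue;
      visited.add(next);
      previous.set(next, current);
      if (next === target) {
        // Walk back from target to source to rebuild the path.
        const path: Format[] = [target];
        let step: Format = target;
        while (step !== source) {
          step = previous.get(step) as Format;
          path.unshift(step);
        }
        return path;
      }
      queue.push(next);
    }
  }
  return null; // no route between the two formats
}
```

With this edge list, Markdown to PDF is a direct conversion, while HTML to DOCX goes through Markdown as an intermediate format.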
3. Optional Integration
This server can work with playwright-mcp for enhanced PDF conversion capabilities. Please refer to the official playwright-mcp documentation for detailed configuration.
🔧 PDF Conversion Workflow
This server supports complete PDF conversion functionality:
- Document Parsing: Use OOXML parser to ensure complete style preservation
- Format Conversion: Convert documents to high-quality HTML format
- PDF Generation: Built-in converter or optionally work with playwright-mcp
- Enhancement Processing: Automatically add watermarks and QR codes (if configured)
How It Works
This server uses an intelligent conversion architecture:
- Smart Planning: plan_conversion analyzes conversion requirements and selects optimal paths
- Format Conversion: Use specialized converters to handle various document formats
- Style Preservation: Ensure style integrity through OOXML parser
- Enhancement Processing: Automatically add watermarks, QR codes and other enhancements
- Optional Integration: Support working with playwright-mcp for enhanced capabilities
4. Features
MCP Tools
Core Document Tools
Tool Name | Description | Input Parameters | External Dependencies |
---|---|---|---|
read_document | Read document content | filePath: Document path; extractMetadata: Extract metadata; preserveFormatting: Preserve formatting | None |
write_document | Write document content | content: Document content; outputPath: Output file path; encoding: File encoding | None |
convert_document | Smart document conversion | inputPath: Input file path; outputPath: Output file path; preserveFormatting: Preserve formatting | None |
plan_conversion | Conversion planner | sourceFormat: Source format; targetFormat: Target format; preserveStyles: Preserve styles; quality: Conversion quality | None |
read_document
Read various document formats including PDF, DOCX, DOC, HTML, MD, and more.
Parameters:
- filePath (string, required) - Document path to read
- extractMetadata (boolean, optional) - Extract document metadata, defaults to false
- preserveFormatting (boolean, optional) - Preserve formatting (HTML output), defaults to false
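MCP clients normally issue tool calls for you, but it can help to see the message they send. The sketch below builds a JSON-RPC 2.0 tools/call request for read_document; the buildToolCall helper and the file path are illustrative.

```typescript
// Sketch: build an MCP "tools/call" request (JSON-RPC 2.0) for read_document.
// The helper name and the file path are placeholders.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const readRequest = buildToolCall(1, "read_document", {
  filePath: "/path/to/report.docx", // hypothetical file
  extractMetadata: true,
  preserveFormatting: false,
});

console.log(JSON.stringify(readRequest));
```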
write_document
Write content to document files in specified formats.
Parameters:
- content (string, required) - Content to write
- outputPath (string, optional) - Output file path (auto-generated if not provided)
- encoding (string, optional) - File encoding, defaults to utf-8
convert_document
Convert documents between formats with enhanced style preservation.
Parameters:
- inputPath (string, required) - Input file path
- outputPath (string, optional) - Output file path (auto-generated if not provided)
- preserveFormatting (boolean, optional) - Preserve formatting, defaults to true
- useInternalPlaywright (boolean, optional) - Use built-in Playwright for PDF conversion, defaults to false
convert_docx_to_pdf
Convert DOCX to PDF with automatic watermark addition (if configured).
Parameters:
- docxPath (string, required) - DOCX file path
- outputPath (string, optional) - Output PDF path (auto-generated if not provided)
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
- preserveFormatting (boolean, optional) - Preserve original formatting, defaults to true
- chineseFont (string, optional) - Chinese font, defaults to Microsoft YaHei
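As a small illustration of the defaults listed above, the sketch below fills in missing convert_docx_to_pdf arguments. The DocxToPdfArgs type and the withDocxToPdfDefaults helper are hypothetical; the server applies its own defaults internally.

```typescript
// Sketch: fill in the documented defaults for convert_docx_to_pdf arguments.
interface DocxToPdfArgs {
  docxPath: string;
  outputPath?: string;
  addQrCode?: boolean;
  preserveFormatting?: boolean;
  chineseFont?: string;
}

function withDocxToPdfDefaults(args: DocxToPdfArgs) {
  return {
    docxPath: args.docxPath,
    outputPath: args.outputPath, // undefined means the path is auto-generated
    addQrCode: args.addQrCode ?? false,
    preserveFormatting: args.preserveFormatting ?? true,
    chineseFont: args.chineseFont ?? "Microsoft YaHei",
  };
}
```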
convert_markdown_to_pdf
Convert Markdown to PDF with automatic watermark addition (if configured).
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output PDF path (auto-generated if not provided)
- theme (string, optional) - Theme style, defaults to "github"
- includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
convert_markdown_to_html
Convert Markdown to HTML.
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output HTML path (auto-generated if not provided)
- theme (string, optional) - Theme style, defaults to "github"
- includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
convert_markdown_to_docx
Convert Markdown to DOCX.
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output DOCX path (auto-generated if not provided)
convert_html_to_markdown
Convert HTML to Markdown.
Parameters:
- htmlPath (string, required) - HTML file path
- outputPath (string, optional) - Output Markdown path (auto-generated if not provided)
plan_conversion
🎯 Smart Conversion Planner - Analyze conversion requirements and generate optimal conversion plans.
Parameters:
- sourceFormat (string, required) - Source file format (pdf, docx, html, markdown, md, txt, doc)
- targetFormat (string, required) - Target file format (pdf, docx, html, markdown, md, txt, doc)
- sourceFile (string, optional) - Source file path (for generating specific conversion parameters)
- preserveStyles (boolean, optional) - Whether to preserve style formatting, defaults to true
- includeImages (boolean, optional) - Whether to include images, defaults to true
- theme (string, optional) - Conversion theme, defaults to github
- quality (string, optional) - Conversion quality requirement (fast, balanced, high), defaults to balanced
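The accepted values above lend themselves to a validation step before a plan is requested. The following is a minimal sketch, assuming arguments arrive as an untyped object; parsePlanArgs and its helper names are hypothetical, and the optional sourceFile parameter is left out for brevity.

```typescript
// Sketch: validate plan_conversion arguments against the documented values.
const FORMATS = ["pdf", "docx", "html", "markdown", "md", "txt", "doc"] as const;
const QUALITIES = ["fast", "balanced", "high"] as const;

type PlanFormat = (typeof FORMATS)[number];
type Quality = (typeof QUALITIES)[number];

interface PlanArgs {
  sourceFormat: PlanFormat;
  targetFormat: PlanFormat;
  preserveStyles: boolean;
  includeImages: boolean;
  theme: string;
  quality: Quality;
}

function isFormat(value: unknown): value is PlanFormat {
  return typeof value === "string" && (FORMATS as readonly string[]).includes(value);
}

function isQuality(value: unknown): value is Quality {
  return typeof value === "string" && (QUALITIES as readonly string[]).includes(value);
}

function boolOr(value: unknown, fallback: boolean): boolean {
  return typeof value === "boolean" ? value : fallback;
}

function parsePlanArgs(raw: Record<string, unknown>): PlanArgs {
  const sourceFormat = raw["sourceFormat"];
  const targetFormat = raw["targetFormat"];
  const quality = raw["quality"] ?? "balanced";
  const theme = raw["theme"];
  if (!isFormat(sourceFormat)) throw new Error(`Unsupported sourceFormat: ${String(sourceFormat)}`);
  if (!isFormat(targetFormat)) throw new Error(`Unsupported targetFormat: ${String(targetFormat)}`);
  if (!isQuality(quality)) throw new Error(`Unsupported quality: ${String(quality)}`);
  return {
    sourceFormat,
    targetFormat,
    preserveStyles: boolOr(raw["preserveStyles"], true),
    includeImages: boolOr(raw["includeImages"], true),
    theme: typeof theme === "string" ? theme : "github",
    quality,
  };
}
```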
process_pdf_post_conversion
Parameters:
- playwrightPdfPath (string, required) - Generated PDF file path
- targetPath (string, optional) - Target PDF file path (auto-generated if not provided)
- addWatermark (boolean, optional) - Whether to add watermark, defaults to false
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
- watermarkImage (string, optional) - Watermark image path
- qrCodePath (string, optional) - QR code image path
PDF Enhancement Tools
add_watermark
🎨 PDF Watermark Addition Tool - Add image or text watermarks to PDF documents.
Parameters:
- pdfPath (string, required) - PDF file path
- watermarkImage (string, optional) - Watermark image path (PNG/JPG)
- watermarkText (string, optional) - Watermark text content
- watermarkImageScale (number, optional) - Image scale ratio, defaults to 0.25
- watermarkImageOpacity (number, optional) - Image opacity, defaults to 0.6
- watermarkImagePosition (string, optional) - Image position, defaults to fullscreen
add_qrcode
📱 PDF QR Code Addition Tool - Add QR codes to PDF documents.
Parameters:
- pdfPath (string, required) - PDF file path
- qrCodePath (string, optional) - QR code image path
- qrScale (number, optional) - QR code scale ratio, defaults to 0.15
- qrOpacity (number, optional) - QR code opacity, defaults to 1.0
- qrPosition (string, optional) - QR code position, defaults to bottom-center
- addText (boolean, optional) - Whether to add explanatory text, defaults to true
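To make the default qrScale and qrPosition values concrete, the sketch below computes a bottom-center rectangle on a page. It rests on assumptions that this document does not confirm: the scale is treated as a fraction of the page width, the QR image is square, and a fixed margin (a hypothetical parameter) separates it from the bottom edge.

```typescript
// Sketch: compute a bottom-center rectangle for a QR code on a PDF page.
// Assumptions: qrScale is a fraction of the page width, the image is square,
// and a fixed margin separates it from the bottom edge.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

function bottomCenterRect(pageWidth: number, qrScale = 0.15, margin = 20): Rect {
  const size = pageWidth * qrScale;
  return {
    x: (pageWidth - size) / 2, // centered horizontally
    y: margin, // PDF user space has its origin at the bottom-left corner
    width: size,
    height: size,
  };
}
```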
System Requirements
- Node.js ≥ 18.0.0
- Zero external system dependencies - All processing via npm packages
- Optional Integration: playwright-mcp for enhanced PDF conversion
Core Technology Stack
- pdf-lib - PDF operations and enhancement
- word-extractor - DOCX document text extraction
- marked - Markdown parsing and rendering
- cheerio - HTML parsing and manipulation
- docx - DOCX document generation
- jszip - ZIP file processing
- xml2js - XML parsing and conversion
- Custom OOXML Parser - Advanced DOCX style preservation
Installation
Architecture Components
- MCP Server Core: Handles JSON-RPC 2.0 communication and tool registration
- Smart Router: Routes requests to optimal processing modules
- Conversion Engine: Contains specialized converters for different document types
- Style Processor: Ensures style preservation during format conversion
- Security Module: Provides path validation and content security handling
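The interplay between the MCP Server Core and the router can be illustrated with a minimal JSON-RPC 2.0 dispatcher. This is a simplified sketch, not the server's implementation: the registry, the echo tool, and the reduced tools/list result (names only) are all illustrative.

```typescript
// Sketch: a minimal JSON-RPC 2.0 dispatcher with a tool registry.
// Handler names and result shapes are illustrative, not the server's code.
type ToolHandler = (args: Record<string, unknown>) => unknown;

interface RpcRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: string;
  params?: { name?: string; arguments?: Record<string, unknown> };
}

type RpcResponse =
  | { jsonrpc: "2.0"; id: number | string; result: unknown }
  | { jsonrpc: "2.0"; id: number | string; error: { code: number; message: string } };

const registry = new Map<string, ToolHandler>();
registry.set("echo", (args) => ({ echoed: args })); // hypothetical tool

function dispatch(req: RpcRequest): RpcResponse {
  const { id } = req;
  if (req.method === "tools/list") {
    const tools = Array.from(registry.keys()).map((name) => ({ name }));
    return { jsonrpc: "2.0", id, result: { tools } };
  }
  if (req.method === "tools/call") {
    const handler = registry.get(req.params?.name ?? "");
    if (!handler) {
      // -32602 is the JSON-RPC 2.0 "Invalid params" code.
      return { jsonrpc: "2.0", id, error: { code: -32602, message: "Unknown tool" } };
    }
    return { jsonrpc: "2.0", id, result: handler(req.params?.arguments ?? {}) };
  }
  // -32601 is the JSON-RPC 2.0 "Method not found" code.
  return { jsonrpc: "2.0", id, error: { code: -32601, message: "Method not found" } };
}
```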
5. Open Source Licenses
Project License
- This Project: MIT License
- Compatibility: Available for commercial and non-commercial use
Third-Party Dependencies
Library | Version | License | Purpose |
---|---|---|---|
pdf-lib | ^1.17.1 | MIT | PDF document manipulation |
word-extractor | ^1.0.4 | MIT | DOCX document text extraction |
marked | ^15.0.12 | MIT | Markdown parsing and rendering |
cheerio | ^1.0.0-rc.12 | MIT | HTML parsing and manipulation |
docx | ^9.5.1 | Apache-2.0 | DOCX document generation |
jszip | ^3.10.1 | MIT | ZIP file processing |
xml2js | ^0.6.2 | MIT | XML parsing and conversion |
License Compatibility
- ✅ Commercial Use: All dependencies support commercial use
- ✅ Distribution: Free to distribute and modify
- ✅ Patent Protection: Apache-2.0 provides patent protection
- ⚠️ Notice: Original license notices must be retained
6. Future Roadmap
Core Features
- 🔄 Enhanced Conversion Quality: Improve style preservation for complex documents
- 📊 Excel Support: Complete Excel read/write and conversion functionality
- 🎨 Template System: Support for custom document templates
- 🔍 OCR Integration: Image text recognition capabilities
System Improvements
- 🌐 Multi-language Support: Internationalization and localization
- 🔐 Security Enhancements: Document encryption and access control
- ⚡ Performance Optimization: Large file handling and memory optimization
- 🔌 Plugin System: Extensible processor architecture
Version Roadmap
- v2.0: Complete Excel support and template system
- v3.0: OCR integration and multi-language support
- v4.0: Advanced security features and plugin system
7. Docker Deployment
Quick Start
Using Pre-built Image
Building from Source
Docker Compose Deployment
Create a docker-compose.yml file:
Environment Variables
Variable | Description | Default |
---|---|---|
PORT | Server port | 3000 |
NODE_ENV | Environment mode | production |
LOG_LEVEL | Logging level | info |
MAX_FILE_SIZE | Maximum file size (MB) | 50 |
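As a hedged illustration of how the variables in this table might be consumed, the sketch below reads them with the documented defaults and rejects obviously invalid values. The loadSettings function and ServerSettings type are hypothetical; the validation rules are assumptions.

```typescript
// Sketch: read the documented container settings with their defaults.
type ContainerEnv = Record<string, string | undefined>;

interface ServerSettings {
  port: number;
  nodeEnv: string;
  logLevel: string;
  maxFileSizeBytes: number;
}

function loadSettings(env: ContainerEnv): ServerSettings {
  const port = Number(env["PORT"] ?? "3000");
  const maxMb = Number(env["MAX_FILE_SIZE"] ?? "50");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${String(env["PORT"])}`);
  }
  if (!Number.isFinite(maxMb) || maxMb <= 0) {
    throw new Error(`Invalid MAX_FILE_SIZE: ${String(env["MAX_FILE_SIZE"])}`);
  }
  return {
    port,
    nodeEnv: env["NODE_ENV"] ?? "production",
    logLevel: env["LOG_LEVEL"] ?? "info",
    maxFileSizeBytes: maxMb * 1024 * 1024, // MAX_FILE_SIZE is given in MB
  };
}
```

In a Node.js process this would typically be called as loadSettings(process.env).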
Volume Mounts
Mount local directories for persistent storage:
Docker Configuration Examples
Production Deployment
Health Checks
The container includes built-in health checks:
8. Development Guide
Local Development
Project Structure
Adding New Tools
- Create a new tool file in src/tools/
- Implement the tool logic
- Register the tool in src/index.ts
- Add test cases
- Update documentation
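The steps above might look roughly like the following sketch. Everything in it is hypothetical: the ToolDefinition shape, the registerTool helper, the count_words tool, and the file name are illustrations only, so follow the patterns already used in src/tools/ and src/index.ts.

```typescript
// Sketch of adding a tool. The ToolDefinition shape, registerTool helper,
// and count_words tool are hypothetical; follow the project's own patterns.
interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: { type: "object"; properties: Record<string, unknown>; required: string[] };
  handler: (args: Record<string, unknown>) => unknown;
}

// Steps 1-2: a tool module (for example src/tools/countWords.ts).
const countWordsTool: ToolDefinition = {
  name: "count_words",
  description: "Count whitespace-separated words in a text string",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
  handler: (args) => {
    const raw = args["text"];
    const text = typeof raw === "string" ? raw : "";
    const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
    return { count: words.length };
  },
};

// Step 3: registration, rejecting duplicate tool names.
const tools = new Map<string, ToolDefinition>();

function registerTool(tool: ToolDefinition): void {
  if (tools.has(tool.name)) throw new Error(`Tool already registered: ${tool.name}`);
  tools.set(tool.name, tool);
}

registerTool(countWordsTool);
```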
9. Troubleshooting
Common Issues
- Port conflicts: Change the host port in docker-compose.yml
- Permission issues: Ensure volume mounts have correct permissions
- Memory issues: Increase Docker memory allocation
Debug Mode
10. Contributing
How to Contribute
- Fork the Project
- Create a Feature Branch (git checkout -b feature/AmazingFeature)
- Commit Your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Intellectual Property License
By submitting a Pull Request, you agree that all contributions submitted through Pull Requests will be licensed under the MIT License. This means:
- You grant the project maintainers and users the right to use, modify, and distribute your contributions under the MIT License
- You confirm that you have the right to make these contributions
- You understand that your contributions will become part of the open source project
- You waive any claims to exclusive ownership of the contributed code
If you cannot agree to these terms, please do not submit a Pull Request.
Code Standards
- Use TypeScript
- Follow ESLint configuration
- Add appropriate tests
- Update relevant documentation
Reporting Issues
- Use GitHub Issues
- Provide detailed error information and reproduction steps
- Include system environment information
License
This project is licensed under the MIT License - see the LICENSE file for details.