The Markdownify MCP Server - UTF-8 Enhanced converts various file types and web content into Markdown format, with enhanced UTF-8 support for multilingual content. You can:
Convert audio files to Markdown, including transcription if possible
Convert Bing search results pages to Markdown
Convert documents (DOCX, PDF, PPTX, XLSX) to Markdown
Convert images to Markdown, including metadata and description
Convert web pages to Markdown
Convert YouTube videos to Markdown, including transcript if available
Retrieve and process existing Markdown files by absolute path
Markdownify MCP Server - UTF-8 Enhanced
This is an enhanced version of the original Markdownify MCP project, with improved UTF-8 encoding support and optimized handling of multilingual content.
Enhancements
Added comprehensive UTF-8 encoding support
Optimized handling of multilingual content
Fixed encoding issues on Windows systems
Improved error handling mechanisms
Key Differences from Original Project
Enhanced Encoding Support:
Full UTF-8 support across all operations
Proper handling of Chinese, Japanese, Korean and other non-ASCII characters
Fixed Windows-specific encoding issues (cmd.exe and PowerShell compatibility)
Improved Error Handling:
Detailed error messages in both English and Chinese
Better exception handling for network issues
Graceful fallback mechanisms for conversion failures
Extended Functionality:
Added support for batch processing multiple files
Enhanced YouTube video transcript handling
Improved metadata extraction from various file formats
Better preservation of document formatting
Performance Optimizations:
Optimized memory usage for large file conversions
Faster processing of multilingual content
Reduced dependency conflicts
Better Development Experience:
Comprehensive debugging options
Detailed logging system
Environment-specific configuration support
Clear documentation in both English and Chinese
Features
Supports converting various file types to Markdown:
PDF files
Images (with metadata)
Audio (with transcription)
Word documents (DOCX)
Excel spreadsheets (XLSX)
PowerPoint presentations (PPTX)
Web content:
YouTube video transcripts
Search results
General web pages
Existing Markdown files
Quick Start
Clone this repository:
git clone https://github.com/JDJR2024/markdownify-mcp-utf8.git
cd markdownify-mcp-utf8
Install dependencies:
pnpm install
Note: This will also install uv and related Python dependencies.
Build the project:
pnpm run build
Start the server:
pnpm start
Requirements
Node.js 16.0 or higher
Python 3.8 or higher
pnpm package manager
Git
Detailed Installation Guide
1. Environment Setup
Install Node.js:
Download from Node.js official website
Verify installation:
node --version
Install pnpm:
npm install -g pnpm
pnpm --version
Install Python:
Download from Python official website
Ensure Python is added to PATH during installation
Verify installation:
python --version
(Windows Only) Configure UTF-8 Support:
# Persist UTF-8 for future sessions (setx without /M applies to the current user)
setx PYTHONIOENCODING UTF-8
# Set UTF-8 for the current session
set PYTHONIOENCODING=UTF-8
# Enable UTF-8 in command prompt
chcp 65001
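To see why these settings matter, the minimal Python sketch below checks whether a sample of Chinese, Japanese, and Korean text can be encoded under a given codec. The helper name `can_encode` and the sample string are illustrative and not part of this project; `cp1252` stands in for a typical legacy Windows code page.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Illustrative sample mixing Chinese, Japanese, and Korean text.
sample = "你好 こんにちは 안녕하세요"

if __name__ == "__main__":
    for codec in ("utf-8", "cp1252"):
        print(codec, can_encode(sample, codec))
```

A legacy code page cannot represent these characters, which is the kind of failure the UTF-8 settings above are meant to avoid.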
2. Project Setup
Clone the repository:
git clone https://github.com/JDJR2024/markdownify-mcp-utf8.git
cd markdownify-mcp-utf8
Create and activate Python virtual environment:
# Windows
python -m venv .venv
.venv\Scripts\activate
# Linux/macOS
python3 -m venv .venv
source .venv/bin/activate
Install project dependencies:
# Install Node.js dependencies
pnpm install
# Install Python dependencies (will be handled by setup.sh)
./setup.sh
Build the project:
pnpm run build
3. Verification
Start the server:
pnpm start
Test the installation:
# Convert a web page
python convert_utf8.py "https://example.com"
# Convert a local file
python convert_utf8.py "path/to/your/file.docx"
Usage Guide
Basic Usage
Converting Web Pages:
python convert_utf8.py "https://example.com"
The converted Markdown will be saved as converted_result.md
Converting Local Files:
# Convert DOCX
python convert_utf8.py "document.docx"
# Convert PDF
python convert_utf8.py "document.pdf"
# Convert PowerPoint
python convert_utf8.py "presentation.pptx"
# Convert Excel
python convert_utf8.py "spreadsheet.xlsx"
Converting YouTube Videos:
python convert_utf8.py "https://www.youtube.com/watch?v=VIDEO_ID"
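When reading the result back from Python, it can help to decode it explicitly as UTF-8 rather than relying on the platform default. The sketch below assumes the output is a UTF-8 file named converted_result.md in the working directory, as noted for web pages above; the helper name `read_markdown` is hypothetical.

```python
from pathlib import Path


def read_markdown(path: str = "converted_result.md") -> str:
    """Read a converted Markdown file, decoding it explicitly as UTF-8."""
    # Passing encoding= avoids falling back to the platform's locale
    # encoding, which is often not UTF-8 on Windows.
    return Path(path).read_text(encoding="utf-8")


if __name__ == "__main__":
    if Path("converted_result.md").exists():
        # Show the first few lines of the converted document.
        for line in read_markdown().splitlines()[:5]:
            print(line)
```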
Advanced Usage
Environment Variables:
# Set custom UV path
export UV_PATH="/custom/path/to/uv"
# Set custom output directory
export MARKDOWN_OUTPUT_DIR="/custom/output/path"
Batch Processing:
Create a batch file (e.g., convert_batch.txt) with URLs or file paths:
https://example1.com
https://example2.com
file1.docx
file2.pdf
Then run:
while read -r line; do python convert_utf8.py "$line"; done < convert_batch.txt
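On systems without a POSIX shell (for example, plain cmd.exe on Windows), the same loop can be approximated in Python. This is a minimal sketch, not part of the project: the helper names `read_targets` and `run_batch` are hypothetical, and it assumes convert_utf8.py sits in the current directory and honors MARKDOWN_OUTPUT_DIR as described above.

```python
import os
import subprocess
import sys
from typing import List, Optional, Tuple


def read_targets(text: str) -> List[str]:
    """Return the non-empty, stripped lines of a batch file's contents."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def run_batch(
    targets: List[str],
    output_dir: Optional[str] = None,
    script: str = "convert_utf8.py",
) -> List[Tuple[str, int]]:
    """Run the converter once per target; return (target, exit code) pairs."""
    # Force UTF-8 I/O in the child process, mirroring the setup section.
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    if output_dir is not None:
        env["MARKDOWN_OUTPUT_DIR"] = output_dir
    results = []
    for target in targets:
        completed = subprocess.run([sys.executable, script, target], env=env)
        results.append((target, completed.returncode))
    return results


if __name__ == "__main__":
    batch_file = "convert_batch.txt"
    if os.path.exists(batch_file):
        with open(batch_file, encoding="utf-8") as handle:
            targets = read_targets(handle.read())
        failed = [t for t, code in run_batch(targets) if code != 0]
        print(f"{len(targets) - len(failed)} succeeded, {len(failed)} failed")
```

Unlike the shell loop, this version skips blank lines and still processes a final line that lacks a trailing newline.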
Troubleshooting
Common Issues:
If you see encoding errors, ensure PYTHONIOENCODING is set to UTF-8 (on Windows, also run chcp 65001)
For permission issues on Windows, run as Administrator
For Python path issues, ensure the virtual environment is activated
Debugging:
# Enable debug output
export DEBUG=true
python convert_utf8.py "your_file.docx"
Usage
Command Line
Convert web page to Markdown:
python convert_utf8.py "https://example.com"
Convert local file:
python convert_utf8.py "path/to/your/file.docx"
Desktop App Integration
To integrate this server with a desktop app, add the following to your app's server configuration:
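The exact entry depends on the app. As a rough sketch, assuming the app reads the `mcpServers` JSON layout used by several MCP desktop clients, an entry could be generated as follows. The server name `markdownify`, the helper `build_server_config`, and the `dist/index.js` entry path are assumptions for illustration; only UV_PATH comes from this README.

```python
import json
from typing import Optional


def build_server_config(entry_point: str, uv_path: Optional[str] = None) -> dict:
    """Build an MCP server entry in the assumed "mcpServers" layout."""
    server = {"command": "node", "args": [entry_point]}
    if uv_path:
        # UV_PATH is the custom uv location described under Advanced Usage.
        server["env"] = {"UV_PATH": uv_path}
    return {"mcpServers": {"markdownify": server}}


if __name__ == "__main__":
    # Hypothetical path: replace with the absolute path to the built entry file.
    config = build_server_config(
        "/absolute/path/to/markdownify-mcp-utf8/dist/index.js",
        uv_path="/custom/path/to/uv",
    )
    print(json.dumps(config, indent=2))
```

The printed JSON can then be merged into the app's configuration file by hand, after checking the entry path against the actual build output.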
Troubleshooting
Encoding Issues
If you encounter character encoding issues, ensure the PYTHONIOENCODING environment variable is set to utf-8
Windows users may need to run chcp 65001 to enable UTF-8 support
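A quick way to confirm these settings took effect is to ask Python which encodings it is actually using. The following is a small diagnostic sketch, not a script shipped with the project; the helper name `is_utf8` is hypothetical.

```python
import codecs
import os
import sys


def is_utf8(setting: str) -> bool:
    """Return True if a setting such as "UTF-8" or "utf8:replace" names UTF-8."""
    # PYTHONIOENCODING may take the form "encoding:errors"; keep the encoding.
    name = setting.split(":", 1)[0]
    if not name:
        return False
    try:
        return codecs.lookup(name).name == "utf-8"
    except LookupError:
        return False


if __name__ == "__main__":
    env_value = os.environ.get("PYTHONIOENCODING", "")
    print("PYTHONIOENCODING:", env_value or "(not set)", "->", is_utf8(env_value))
    stdout_encoding = sys.stdout.encoding or ""
    print("stdout encoding:", stdout_encoding or "(unknown)", "->", is_utf8(stdout_encoding))
```

If either line reports False, revisit the environment variable and code page settings above before retrying the conversion.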
Permission Issues
Ensure you have sufficient file read/write permissions
On Windows, you may need to run as administrator
Acknowledgments
This project is based on the original work by Zach Caceres. Thanks to the original author for their outstanding contribution.
License
This project continues to be licensed under the MIT License. See the LICENSE file for details.
Contributing
Contributions are welcome! Before submitting a Pull Request, please:
Ensure your code follows the project's coding standards
Add necessary tests and documentation
Update relevant sections in the README
Contact
For issues or suggestions:
Submit an Issue: https://github.com/JDJR2024/markdownify-mcp-utf8/issues
Create a Pull Request: https://github.com/JDJR2024/markdownify-mcp-utf8/pulls
Email: jdidndosmmxmx@gmail.com