JaeHyeon-KAIST

pdf2zh-next-mcp

MCP server for PDF translation using pdf2zh-next as the PDF processing backend. Designed for Claude Desktop.

Instead of translating each segment independently (which loses context), this server extracts all segments at once and lets the LLM translate them together — preserving terminology consistency and context across the entire document.

Using Claude Code? Check out pdf2zh-next-skill — a lightweight skill-based approach without MCP overhead. It handles large PDFs better by leveraging Claude Code's direct file I/O and auto-continuation.

How it works

┌─────────────────────────────────────────────────┐
│  Claude Desktop                                  │
│                                                  │
│  1. extract_segments  ──→  segments + formulas   │
│  2. LLM translates all segments at once          │
│  3. assemble_translated  ──→  final PDF          │
└─────────────────────────────────────────────────┘

The LLM sees every segment before translating — so terminology stays consistent, cross-page sentences flow naturally, and formula placeholders are preserved correctly.
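
Because formula placeholders must survive translation, a quick check before assembling can catch segments where one was dropped or duplicated. The sketch below is only an illustration: the `{v1}`-style placeholder syntax, the segment-id dictionaries, and both function names are assumptions, not the format this server actually uses.

```python
import re
from collections import Counter

# Assumed placeholder syntax for illustration only, e.g. "{v1}".
# The real format returned by extract_segments may differ.
PLACEHOLDER = re.compile(r"\{v\d+\}")


def placeholders_preserved(source: str, translated: str) -> bool:
    """True if both strings contain the same placeholders, the same number of times."""
    return Counter(PLACEHOLDER.findall(source)) == Counter(PLACEHOLDER.findall(translated))


def find_broken(segments: dict[str, str], translations: dict[str, str]) -> list[str]:
    """Return ids of segments that are untranslated or whose placeholders changed."""
    return [
        seg_id
        for seg_id, text in segments.items()
        if seg_id not in translations
        or not placeholders_preserved(text, translations[seg_id])
    ]
```

Any ids returned by `find_broken` would be candidates for re-translation before the final PDF is generated.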

Prerequisites

pdf2zh-next must be installed separately:

uv tool install pdf2zh-next

Verify installation:

pdf2zh_next --version

You need uv to install both pdf2zh-next and this server.

Installation

From PyPI

uv tool install pdf2zh-next-mcp

From GitHub

uv tool install git+https://github.com/JaeHyeon-KAIST/pdf2zh-next-mcp

From source

git clone https://github.com/JaeHyeon-KAIST/pdf2zh-next-mcp
cd pdf2zh-next-mcp
uv sync

Setup

Add the server to your Claude Desktop MCP config:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

  • Windows: %APPDATA%\Claude\claude_desktop_config.json

If installed from PyPI or GitHub:

{
  "mcpServers": {
    "pdf-translate": {
      "command": "uvx",
      "args": ["pdf2zh-next-mcp"]
    }
  }
}

If running from source:

{
  "mcpServers": {
    "pdf-translate": {
      "command": "uv",
      "args": [
        "run",
        "--directory", "/path/to/pdf2zh-next-mcp",
        "python", "-m", "pdf2zh_next_mcp.main"
      ]
    }
  }
}

Tip: If Claude Desktop can't find uvx, use the absolute path (e.g., /opt/homebrew/bin/uvx on macOS, C:\Users\you\.local\bin\uvx.exe on Windows).
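
If you would rather not edit the JSON by hand, the PyPI/GitHub entry can be merged into an existing config with a short script. This is a minimal sketch, not part of the project: `add_server` is a hypothetical helper, and it assumes the config file is either absent, empty, or valid JSON. It resolves `uvx` to an absolute path when it can, which matches the tip above.

```python
import json
import shutil
from pathlib import Path


def add_server(config_path: Path, command: str = "uvx") -> dict:
    """Add the pdf-translate entry while keeping servers that are already configured."""
    text = config_path.read_text(encoding="utf-8") if config_path.exists() else ""
    config = json.loads(text) if text.strip() else {}

    # Prefer an absolute path so Claude Desktop does not depend on its own PATH.
    resolved = shutil.which(command) or command

    config.setdefault("mcpServers", {})["pdf-translate"] = {
        "command": resolved,
        "args": ["pdf2zh-next-mcp"],
    }

    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2), encoding="utf-8")
    return config
```

On macOS, for example, you could call `add_server(Path.home() / "Library/Application Support/Claude/claude_desktop_config.json")` and then restart Claude Desktop.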

Usage

Just ask:

"Translate this PDF to Korean: /path/to/paper.pdf"

Behind the scenes:

  1. extract_segments analyzes the PDF layout and returns all text segments

  2. The LLM translates everything at once (with full context)

  3. assemble_translated injects translations and generates the final PDF

Output files:

  • *-mono.pdf — translated PDF

  • *-dual.pdf — bilingual side-by-side

  • *-glossary.json — terminology glossary
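
To collect the generated files from a script, you can glob for the three suffixes listed above. The sketch assumes you already know which directory the outputs were written to; `find_outputs` is a hypothetical helper, not something the server provides.

```python
from pathlib import Path

# Suffixes taken from the output file list above.
OUTPUT_SUFFIXES = ("-mono.pdf", "-dual.pdf", "-glossary.json")


def find_outputs(directory: Path) -> dict[str, list[Path]]:
    """Map each output suffix to the matching files found in the directory."""
    return {suffix: sorted(directory.glob(f"*{suffix}")) for suffix in OUTPUT_SUFFIXES}
```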

Limitations

  • Large PDFs (~30+ pages): Claude Desktop has a per-turn output token limit. For documents with many segments, the translation may fail mid-process with "response could not be fully generated". For large PDFs, use pdf2zh-next-skill with Claude Code instead.

  • MCP tool result size: Segments are paginated to stay within Claude Desktop's 25K token limit per tool response. This is handled automatically.
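
The pagination idea can be pictured as greedy packing against a token budget. The following is a simplified sketch and not the server's actual implementation: the four-characters-per-token estimate and both function names are assumptions made for illustration.

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); not a real tokenizer."""
    return max(1, len(text) // 4)


def paginate(segments: list[str], budget: int = 25_000) -> list[list[str]]:
    """Greedily pack segments into pages whose estimated size stays within budget.

    A single segment larger than the budget still gets a page of its own.
    """
    pages: list[list[str]] = []
    current: list[str] = []
    used = 0
    for seg in segments:
        cost = estimate_tokens(seg)
        # Start a new page when adding this segment would exceed the budget.
        if current and used + cost > budget:
            pages.append(current)
            current, used = [], 0
        current.append(seg)
        used += cost
    if current:
        pages.append(current)
    return pages
```

Each page would then be returned as one tool response, with the client requesting the next page until all segments have been delivered.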

Troubleshooting

BabeldocError: cannot unpack non-iterable NoneType object

BabelDOC needs CMap files for font character mapping. If its automatic download times out, install them manually:

cd ~/Downloads
curl -L https://github.com/funstory-ai/BabelDOC-Assets/archive/refs/heads/main.zip -o BabelDOC-Assets.zip
unzip BabelDOC-Assets.zip
mkdir -p ~/.cache/babeldoc/cmap
cp BabelDOC-Assets-main/cmap/*.json ~/.cache/babeldoc/cmap/

This is a one-time setup. The cache path is the same on all platforms.
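
To confirm the manual install worked, you can check that the cache directory now contains CMap JSON files. This is a small sketch using the path from the commands above; `cmap_files` is a hypothetical helper name.

```python
from pathlib import Path

# Cache location used by the manual install commands above.
DEFAULT_CMAP_DIR = Path.home() / ".cache" / "babeldoc" / "cmap"


def cmap_files(cache_dir: Path = DEFAULT_CMAP_DIR) -> list[Path]:
    """Return the CMap JSON files in the cache, or an empty list if the directory is missing."""
    return sorted(cache_dir.glob("*.json")) if cache_dir.is_dir() else []
```

An empty result from `cmap_files()` suggests the copy step did not run or targeted a different directory.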

License

MIT
