pdf4vllm
PDF reading MCP server optimized for vision LLMs.
Problem
Approach | Issue |
Text extraction | Encoding corruption produces garbage output; text and images come out in the wrong order |
Image conversion | Token usage explodes, especially for documents with many pages |
Solution
pdf4vllm assumes PDFs are messy.
Detects corrupted text automatically and falls back to rendering the page as an image
Preserves reading order (text → table → image blocks in sequence)
Enforces page limits to prevent context overflow
Filters out unnecessary images (logos, lines, headers/footers)
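The corruption check in the first item can be pictured with a small heuristic. This is a minimal sketch of the idea only, not pdf4vllm's actual detector: the names `suspicious_ratio` and `looks_corrupted`, the choice of signals, and the 0.3 threshold are all assumptions made for illustration.

```python
import re
import unicodedata

# Hypothetical heuristic for illustration; not pdf4vllm's real implementation.
# "(cid:N)" is the placeholder pdfminer.six writes for glyphs it cannot map to Unicode.
CID_PATTERN = re.compile(r"\(cid:\d+\)")


def suspicious_ratio(text: str) -> float:
    """Return the share of characters that look like extraction debris."""
    if not text:
        return 0.0
    # Count every character that belongs to a "(cid:N)" placeholder.
    cid_chars = sum(len(m.group(0)) for m in CID_PATTERN.finditer(text))
    bad = 0
    for ch in text:
        if ch in "\n\r\t":
            continue  # ordinary whitespace is not a corruption signal
        category = unicodedata.category(ch)
        # U+FFFD replacement characters, control (Cc), private-use (Co),
        # and unassigned (Cn) code points rarely appear in clean text.
        if ch == "\ufffd" or category in ("Cc", "Co", "Cn"):
            bad += 1
    return (cid_chars + bad) / len(text)


def looks_corrupted(text: str, threshold: float = 0.3) -> bool:
    """Flag a page whose extracted text is mostly debris (assumed threshold)."""
    return suspicious_ratio(text) >= threshold
```

A page flagged this way would be rendered as an image instead of being returned as text. A real detector would likely combine more signals than these two.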
PDF Input
↓
Corruption Detection (pdfminer.six + pattern analysis)
↓
┌─────────────┬─────────────┐
│ Corrupted │ Clean │
│ → Image │ → Text + │
│ only │ Tables + │
│ │ Images │
└─────────────┴─────────────┘
↓
Ordered Blocks (JSON)
Install
pip install pdf4vllm-mcp
# or run without installing
uvx pdf4vllm-mcp
Claude Desktop Setup
git clone https://github.com/PyJudge/pdf4vllm-mcp.git
cd pdf4vllm-mcp
python scripts/install_mcp.py
Or manually edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "pdf4vllm": {
      "command": "/path/to/python",
      "args": ["/path/to/pdf4vllm-mcp/src/server.py"]
    }
  }
}
Claude Code Setup
Create .mcp.json in your project:
{
  "mcpServers": {
    "pdf4vllm": {
      "command": "uvx",
      "args": ["pdf4vllm-mcp"]
    }
  }
}
Tools
Tool | Description |
| Find PDF files with glob filtering ( |
| Extract PDF content as ordered blocks |
| Search text in PDFs using pdfgrep (requires |
Extraction Modes
Mode | Description |
| Try text extraction → switch to image if corrupted |
| Text/tables only, no images |
| Render pages as images only |
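How the three modes differ can be sketched as one small decision function. The mode names (`auto`, `text_only`, `image_only`), the `build_page` helper, and its arguments are hypothetical placeholders, since the table above does not show the real identifiers. The returned dictionaries follow the page shapes shown under Output Format; the shape used for image-only pages is an assumption.

```python
from enum import Enum
from typing import Any


class Mode(Enum):
    # Placeholder names; the server's real mode identifiers may differ.
    AUTO = "auto"
    TEXT_ONLY = "text_only"
    IMAGE_ONLY = "image_only"


def build_page(
    page_number: int,
    blocks: list[dict[str, Any]],
    corrupted: bool,
    mode: Mode,
    image_ref: str,
) -> dict[str, Any]:
    """Assemble one page entry according to the requested mode (illustrative only)."""
    if mode is Mode.IMAGE_ONLY:
        # Assumed shape: no blocks, just a reference to the rendered page.
        return {"page_number": page_number, "content_blocks": [], "page_image": image_ref}
    if mode is Mode.TEXT_ONLY:
        # Keep text and tables, drop image blocks.
        kept = [b for b in blocks if b["type"] in ("text", "table")]
        return {"page_number": page_number, "content_blocks": kept}
    # AUTO: use the extracted blocks unless the text was flagged as corrupted.
    if corrupted:
        return {
            "page_number": page_number,
            "content_blocks": [],
            "text_corrupted": True,
            "page_image": image_ref,
        }
    return {"page_number": page_number, "content_blocks": list(blocks)}
```

Keeping the fallback inside the auto branch means a caller only pays the image token cost for pages that actually need it.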
Output Format
{
  "pages": [
    {
      "page_number": 1,
      "content_blocks": [
        {"type": "text", "content": "..."},
        {"type": "table", "content": "| A | B |"},
        {"type": "image", "content": "[IMAGE_0]"}
      ]
    }
  ]
}
When text is corrupted:
{
  "page_number": 2,
  "content_blocks": [],
  "text_corrupted": true,
  "page_image": "[IMAGE_1]"
}
Configuration
Set options in config.json or through environment variables:
{
  "max_pages_per_request": 10,
  "max_image_dimension": 842,
  "page_image_dpi": 100
}
export PDF_MAX_PAGES=20
export PDF_PAGE_IMAGE_DPI=150
Test Server
pip install "pdf4vllm-mcp[test]"
python test_server.py
# → http://localhost:8000
License
MIT