ReadPDFx - OCR PDF MCP Server

INDONESIAN_INSTALLATION.md•3.63 KiB

# 🇮🇩 Indonesian OCR - Manual Installation Guide

## Status: Language Packs Downloaded ✅

The files were downloaded successfully to:

**📁 `C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9`**

## 📋 Downloaded Files:

- ✅ `ind.traineddata` - Indonesian (Bahasa Indonesia) - **REQUIRED**
- ✅ `msa.traineddata` - Malay (similar to Indonesian) - **RECOMMENDED**

## 🔧 Manual Installation Steps

### Option 1: File Explorer (RECOMMENDED)

1. **Open File Explorer as Administrator:**
   - Right-click File Explorer in the taskbar
   - Choose "Run as administrator"

2. **Navigate to the temp folder:**
   ```
   C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9
   ```

3. **Select the files:**
   - Select `ind.traineddata`
   - Select `msa.traineddata`
   - Copy (Ctrl+C)

4. **Navigate to the Tesseract tessdata folder:**
   ```
   C:\Program Files\Tesseract-OCR\tessdata
   ```

5. **Paste the files:**
   - Paste (Ctrl+V)
   - Allow administrator access if prompted

### Option 2: Command Prompt Administrator

1. **Run CMD as Administrator:**
   - Press Win+R
   - Type `cmd`
   - Press Ctrl+Shift+Enter

2. **Execute the copy command:**
   ```cmd
   copy "C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9\*.traineddata" "C:\Program Files\Tesseract-OCR\tessdata\"
   ```

## ✅ Verification Steps

### 1. Check Available Languages

```bash
tesseract --list-langs
```

**Expected Output:**

```
List of available languages in "C:\Program Files\Tesseract-OCR/tessdata/" (4):
eng
ind  # ← Indonesian MUST BE PRESENT!
msa  # ← Malay MUST BE PRESENT!
osd
```

### 2. Test Indonesian OCR

```bash
python test_indonesian_support.py
```

### 3. Test MCP Tools

```bash
python validate_tools.py
```

## 🎯 Usage with lang='eng+ind'

### MCP Tools Usage:

```python
# Smart PDF processing with Indonesian
result = await process_pdf_smart(pdf_path, language='eng+ind')

# OCR with combined English + Indonesian
ocr_result = await perform_ocr(file_path, language='eng+ind')

# Batch processing with Indonesian support
batch_result = await batch_process_pdfs(directory, language='eng+ind')
```

### Python Direct Usage:

```python
import pytesseract
from PIL import Image

# Load image
image = Image.open('dokumen_indonesia.png')

# OCR with Indonesian + English
text = pytesseract.image_to_string(image, lang='eng+ind')
```

### Command Line Usage:

```bash
# The tesseract CLI reads images, not PDFs: rasterize the PDF pages first,
# then OCR each page image (page.png is a placeholder for one such page).
# The output argument is a base name, so this writes output.txt
tesseract page.png output -l eng+ind

# OCR an image with Indonesian (writes result.txt)
tesseract dokumen.png result -l eng+ind
```

## 🌟 Optimal Language Combinations

| Use Case | Language Code | Description |
|----------|---------------|-------------|
| **Mixed Documents** | `eng+ind` | English + Indonesian (OPTIMAL) |
| **Fallback** | `eng` | English only (fallback) |

## 🚨 Troubleshooting

### Problem: Permission Denied

- **Solution**: Run File Explorer or Command Prompt as Administrator

### Problem: Files not found

- **Check**: The files stay in the temp folder until reboot
- **Location**: `C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9`

### Problem: `tesseract --list-langs` does not show `ind`

- **Solution**: Restart the terminal after copying the files
- **Verify**: Check that the files exist in `C:\Program Files\Tesseract-OCR\tessdata\`

## ⭐ After Installation Success

1. **Test Indonesian Support:**
   ```bash
   python test_indonesian_support.py
   ```

2. **Validate All Tools:**
   ```bash
   python validate_tools.py
   ```

3. **Test MCP Server:**
   ```bash
   python mcp_server_stdio.py
   # Should show: Indonesian language support: Check with list_ocr_languages tool
   ```

---

**Status**: 📦 Files Downloaded → 🔧 Manual Installation Required → 🎉 Ready for `lang='eng+ind'` Usage!
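The copy and verification steps in this guide can also be scripted. The following is a minimal sketch, not part of the ReadPDFx project: the function names `install_language_packs` and `missing_languages` are hypothetical, and it assumes the `.traineddata` files are in the temp folder and that the script runs from an elevated prompt so it can write to `tessdata`.

```python
import shutil
from pathlib import Path

# Language packs this guide installs (ind = Indonesian, msa = Malay).
REQUIRED_LANGS = ("ind", "msa")


def install_language_packs(src_dir, tessdata_dir, langs=REQUIRED_LANGS):
    """Copy <lang>.traineddata from src_dir into tessdata_dir.

    Hypothetical helper: returns the language codes that were copied and
    raises FileNotFoundError if a folder or a language pack is missing.
    """
    src_dir, tessdata_dir = Path(src_dir), Path(tessdata_dir)
    if not tessdata_dir.is_dir():
        raise FileNotFoundError(f"tessdata folder not found: {tessdata_dir}")

    copied = []
    for lang in langs:
        src = src_dir / f"{lang}.traineddata"
        if not src.is_file():
            raise FileNotFoundError(f"language pack not found: {src}")
        shutil.copy2(src, tessdata_dir / src.name)
        copied.append(lang)
    return copied


def missing_languages(list_langs_output, required=REQUIRED_LANGS):
    """Return the required codes absent from `tesseract --list-langs` output.

    The first line of that output is a header, so it is skipped.
    """
    lines = list_langs_output.splitlines()
    available = {line.strip() for line in lines[1:] if line.strip()}
    return [lang for lang in required if lang not in available]
```

To use it, call `install_language_packs` with the temp folder and the `tessdata` folder from the steps above, then pass the text printed by `tesseract --list-langs` (for example captured with `subprocess.run(..., capture_output=True, text=True).stdout`) to `missing_languages`. An empty list means both `ind` and `msa` are available.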
