# 🇮🇩 Indonesian OCR - Manual Installation Guide
## Status: Language Packs Downloaded ✅
The language packs were downloaded successfully to:
**📁 `C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9`**
## 📋 Downloaded Files:
- ✅ `ind.traineddata` - Indonesian (Bahasa Indonesia) - **REQUIRED**
- ✅ `msa.traineddata` - Malay (similar to Indonesian) - **RECOMMENDED**
## 🔧 Manual Installation Steps
### Option 1: File Explorer (RECOMMENDED)
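1. **Open File Explorer as Administrator:**
- Right-click File Explorer in the taskbar
- Select "Run as administrator" (if the option is not offered, a normal window also works: Windows asks for elevation when you paste in step 5)
2. **Navigate to the temp folder:**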
```
C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9
```
3. **Select files:**
- Select `ind.traineddata`
- Select `msa.traineddata`
- Copy (Ctrl+C)
4. **Navigate to the Tesseract `tessdata` folder:**
```
C:\Program Files\Tesseract-OCR\tessdata
```
5. **Paste files:**
- Paste (Ctrl+V)
- Allow administrator access if prompted
### Option 2: Command Prompt Administrator
1. **Run CMD as Administrator:**
- Press Win+R
- Type `cmd`
- Press Ctrl+Shift+Enter
2. **Execute copy command:**
```cmd
copy "C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9\*.traineddata" "C:\Program Files\Tesseract-OCR\tessdata\"
```
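The same copy step can also be scripted. The following is a minimal sketch, not part of the original tooling: the helper name `copy_traineddata` is hypothetical, and the script still has to be run from an elevated prompt to write into `C:\Program Files`.

```python
import shutil
from pathlib import Path


def copy_traineddata(src_dir, tessdata_dir):
    """Copy every *.traineddata file from src_dir into tessdata_dir.

    Hypothetical helper; returns the names of the files it copied.
    """
    src, dst = Path(src_dir), Path(tessdata_dir)
    if not dst.is_dir():
        raise FileNotFoundError(f"tessdata folder not found: {dst}")
    copied = []
    for path in sorted(src.glob("*.traineddata")):
        # copy2 overwrites an existing file of the same name
        shutil.copy2(path, dst / path.name)
        copied.append(path.name)
    return copied
```

Called with the temp folder and the `tessdata` folder from this guide, it should return `['ind.traineddata', 'msa.traineddata']`. Without administrator rights the copy is expected to fail with `PermissionError`.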
## ✅ Verification Steps
### 1. Check Available Languages
```bash
tesseract --list-langs
```
**Expected Output:**
```
List of available languages in "C:\Program Files\Tesseract-OCR/tessdata/" (4):
eng
ind # ← Indonesian MUST be listed!
msa # ← Malay MUST be listed!
osd
```
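To run this check from a script instead of reading the list by eye, the output can be parsed. This is a sketch under assumptions: the function names are hypothetical, and the parser relies on the header line starting with "List of available", as in the expected output above.

```python
import subprocess


def parse_list_langs(output):
    """Return the language codes found in `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    # Drop the "List of available languages ..." header line.
    return [line for line in lines if not line.startswith("List of available")]


def missing_languages(output, required=("eng", "ind")):
    """Return the required codes that the output does not list."""
    available = set(parse_list_langs(output))
    return [code for code in required if code not in available]


def list_langs_output(tesseract_cmd="tesseract"):
    """Run the CLI and return its text; needs Tesseract on PATH."""
    proc = subprocess.run(
        [tesseract_cmd, "--list-langs"],
        capture_output=True, text=True, check=True,
    )
    # Both streams are read in case a build prints the list to stderr.
    return proc.stdout + proc.stderr
```

An empty list from `missing_languages(list_langs_output())` means both `eng` and `ind` are installed.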
### 2. Test Indonesian OCR
```bash
python test_indonesian_support.py
```
### 3. Test MCP Tools
```bash
python validate_tools.py
```
## 🎯 Usage with lang='eng+ind'
### MCP Tools Usage:
```python
# These calls use `await`, so they must run inside an async function.
# Smart PDF processing with Indonesian
result = await process_pdf_smart(pdf_path, language='eng+ind')
# OCR with English + Indonesian combined
ocr_result = await perform_ocr(file_path, language='eng+ind')
# Batch processing with Indonesian support
batch_result = await batch_process_pdfs(directory, language='eng+ind')
```
### Python Direct Usage:
```python
import pytesseract
from PIL import Image
# Load image
image = Image.open('dokumen_indonesia.png')
# OCR with Indonesian + English
text = pytesseract.image_to_string(image, lang='eng+ind')
```
### Command Line Usage:
```bash
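# Tesseract does not read PDFs directly, so convert the pages to images first.
# Assumption: pdftoppm (Poppler) is installed; any PDF-to-image tool works.
pdftoppm -r 300 -png input.pdf page
# OCR the first page image; Tesseract appends .txt to the output base name
# (the page file name may be zero-padded, e.g. page-01.png, for longer PDFs)
tesseract page-1.png output -l eng+ind
# OCR an image with Indonesian; writes result.txt
tesseract dokumen.png result -l eng+ind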
```
## 🌟 Optimal Language Combinations
| Use Case | Language Code | Description |
|----------|---------------|-------------|
| **Mixed Documents** | `eng+ind` | English + Indonesian (OPTIMAL) |
| **Fallback** | `eng` | English only (fallback) |
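The choice between the two rows can be made at runtime. A minimal sketch, assuming a hypothetical helper `choose_language` that receives the list of installed language codes:

```python
def choose_language(available):
    """Pick 'eng+ind' when both packs are installed, else fall back to 'eng'."""
    if {"eng", "ind"} <= set(available):
        return "eng+ind"
    return "eng"
```

With pytesseract, the installed codes can be obtained from `pytesseract.get_languages()` (available in recent pytesseract releases), so `lang=choose_language(pytesseract.get_languages())` would select the combination automatically.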
## 🚨 Troubleshooting
### Problem: Permission Denied
- **Solution**: Run File Explorer or Command Prompt as Administrator
### Problem: Files not found
- **Check**: The files should still be in the temp folder, unless the folder has been cleaned up since the download
- **Location**: `C:\Users\RYZEN\AppData\Local\Temp\indonesian_ocr_kqacrwt9`
### Problem: `tesseract --list-langs` does not show `ind`
- **Solution**: Restart the terminal after copying the files
- **Verify**: Check that the files exist in `C:\Program Files\Tesseract-OCR\tessdata\`
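The file check can be done with a few lines of Python. This is a sketch with a hypothetical helper name; it only tests that the files exist, not that they are valid language packs.

```python
from pathlib import Path


def missing_traineddata(tessdata_dir, codes=("ind", "msa")):
    """Return the expected *.traineddata file names absent from tessdata_dir."""
    folder = Path(tessdata_dir)
    return [
        f"{code}.traineddata"
        for code in codes
        if not (folder / f"{code}.traineddata").is_file()
    ]
```

For example, `missing_traineddata(r"C:\Program Files\Tesseract-OCR\tessdata")` should return an empty list after a successful copy. If the files are present but `ind` is still not listed, one possible cause is a `TESSDATA_PREFIX` environment variable that points Tesseract at a different folder.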
## ⭐ After Installation Success
1. **Test Indonesian Support:**
```bash
python test_indonesian_support.py
```
2. **Validate All Tools:**
```bash
python validate_tools.py
```
3. **Test MCP Server:**
```bash
python mcp_server_stdio.py
# Should show: Indonesian language support: Check with list_ocr_languages tool
```
---
**Status**: 📦 Files Downloaded → 🔧 Manual Installation Required → 🎉 Ready for `lang='eng+ind'` Usage!