smart-read-pdf-legacy
Extract text from PDF files, automatically detecting whether a document is text-based or scanned and applying OCR, with support for multiple languages, when needed.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file | Yes | Path to the PDF file | |
| pages | No | Page range: 'all', 'first', 'last', a range such as '1-5', or a list such as '1,3,5' | 'all' |
| language_type | No | OCR language type, used when OCR is needed | CHN_ENG |
| force_ocr | No | Force the use of OCR | false |
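
The parameters above can be illustrated with a minimal Python sketch. It is not the tool's implementation: `build_arguments` and `resolve_pages` are hypothetical helpers, and the assumptions that page numbers are 1-based and that out-of-range pages are silently dropped are mine, not stated by the schema.

```python
import json


def build_arguments(file, pages="all", language_type="CHN_ENG", force_ocr=False):
    """Assemble an argument payload using the defaults listed in the table.

    Hypothetical helper: only the four field names and their defaults come
    from the schema; how the payload is sent to the tool is not shown here.
    """
    return {
        "file": file,
        "pages": pages,
        "language_type": language_type,
        "force_ocr": force_ocr,
    }


def resolve_pages(spec, total):
    """Expand a `pages` value into a list of page numbers.

    Assumes 1-based numbering and inclusive ranges; the tool's real parsing
    rules are not documented in the schema.
    """
    spec = spec.strip().lower()
    if spec == "all":
        candidates = list(range(1, total + 1))
    elif spec == "first":
        candidates = [1]
    elif spec == "last":
        candidates = [total]
    else:
        candidates = []
        for part in spec.split(","):
            part = part.strip()
            if "-" in part:
                start, end = (int(x) for x in part.split("-", 1))
                candidates.extend(range(start, end + 1))
            else:
                candidates.append(int(part))

    # Keep first occurrences only, and drop pages outside the document.
    seen = set()
    result = []
    for page in candidates:
        if 1 <= page <= total and page not in seen:
            seen.add(page)
            result.append(page)
    return result


if __name__ == "__main__":
    args = build_arguments("report.pdf", pages="1-5", force_ocr=True)
    print(json.dumps(args, indent=2))
    print(resolve_pages(args["pages"], total=12))
```

Only `file` is required, so `build_arguments("report.pdf")` alone yields a payload with every optional field at its documented default.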