ocr_image
Extract text from images using optical character recognition (OCR). Supports PNG, JPEG, and SVG input, with an adjustable detail level and two model modes: fast standard OCR or deeper document understanding.
Instructions
Extract text from an image using Florence-2 OCR.
Args:
- image_path: Absolute or relative path to the image file (supports PNG, JPEG, SVG).
- detail_level: 'normal' for plain OCR, 'high' for OCR with region info.
- model_mode: 'fast' for Florence-2 (default), 'deep' for MiniCPM-V 4.6 (better document understanding).
Returns: Dict with extracted text and optionally bounding regions.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
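| image_path | Yes | Absolute or relative path to the image file (PNG, JPEG, or SVG). | |
| model_mode | No | 'fast' for Florence-2, 'deep' for MiniCPM-V 4.6 (better document understanding). | fast |
| detail_level | No | 'normal' for plain OCR, 'high' for OCR with region info. | normal |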
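
As an illustration of the input schema, here is a minimal sketch that assembles and validates an arguments dict before it is sent to `ocr_image`. The helper name `build_ocr_arguments` and the extension check are assumptions made for this example, not part of the tool. How the dict is delivered to the tool depends on the client and is not shown.

```python
from pathlib import PurePath

# Values taken from the tool description above.
ALLOWED_DETAIL_LEVELS = {"normal", "high"}
ALLOWED_MODEL_MODES = {"fast", "deep"}
# Assumption: JPEG files may use either ".jpg" or ".jpeg".
ALLOWED_EXTENSIONS = {".png", ".jpg", ".jpeg", ".svg"}


def build_ocr_arguments(image_path, detail_level="normal", model_mode="fast"):
    """Hypothetical helper: return an arguments dict for ocr_image.

    Raises ValueError when a value falls outside the documented options.
    """
    suffix = PurePath(image_path).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        raise ValueError(f"unsupported image type: {suffix or '(none)'}")
    if detail_level not in ALLOWED_DETAIL_LEVELS:
        raise ValueError(f"detail_level must be one of {sorted(ALLOWED_DETAIL_LEVELS)}")
    if model_mode not in ALLOWED_MODEL_MODES:
        raise ValueError(f"model_mode must be one of {sorted(ALLOWED_MODEL_MODES)}")
    return {
        "image_path": str(image_path),
        "detail_level": detail_level,
        "model_mode": model_mode,
    }


# Only image_path is required; the other two fall back to their defaults.
default_args = build_ocr_arguments("scans/invoice.png")

# Request region info and the deeper document-understanding model.
detailed_args = build_ocr_arguments(
    "scans/invoice.png", detail_level="high", model_mode="deep"
)
```

Validating on the client side is optional; it only surfaces a typo in `detail_level` or `model_mode` before the call is made.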
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |