get_screen_text
Extract visible text from your screen using OCR. Returns text grouped by detected windows with bounding boxes. Use when you need code, terminal, chat, or document content without visual layout. Avoids sending data to the cloud.
Instructions
Run OCR over the current screen and return extracted text only.
Returns extracted text grouped by detected region (window/panel) with approximate bounding boxes.
USE WHEN: you need the textual content visible on screen (code, terminal, chat, docs) and visual layout doesn't matter. NOT FOR: visual content (diagrams, photos) — use get_screenshot. ALTERNATIVES: get_screenshot (raw pixels), search_history (OCR over past captures).
BEHAVIOR: captures the screen and runs OCR in one call; takes 200-800 ms depending on screen size. Cheaper in tokens than get_screenshot (roughly 200-700 tokens vs. about 1200). The result is also indexed into the OCR history (visible via search_history).
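The selection guidance above can be sketched as a small helper. This is only an illustration: `pick_tool` and its two parameters are hypothetical names and not part of the tool itself. Only the three tool names come from this page.

```python
def pick_tool(needs_visual_content: bool, wants_past_captures: bool = False) -> str:
    """Return the name of the tool suggested by the USE WHEN / NOT FOR notes.

    Hypothetical helper for illustration; the flags are assumptions.
    """
    if wants_past_captures:
        # OCR over past captures rather than the current screen.
        return "search_history"
    if needs_visual_content:
        # Diagrams, photos, or anything where layout matters need raw pixels.
        return "get_screenshot"
    # Plain text on the current screen: code, terminal, chat, docs.
    return "get_screen_text"
```

For example, reading a terminal error would map to `pick_tool(needs_visual_content=False)`, while inspecting a diagram would map to `pick_tool(needs_visual_content=True)`.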
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
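Because the tool takes no arguments, a call carries an empty arguments object. The sketch below builds such a request, assuming the tool is exposed through an MCP-style JSON-RPC `tools/call` method. That transport is an assumption, as this page does not state how the tool is invoked, and `build_call` is a hypothetical helper name.

```python
import json


def build_call(request_id: int = 1) -> str:
    """Serialize a JSON-RPC request that invokes get_screen_text.

    Assumes an MCP-style "tools/call" method; adjust to the actual client.
    """
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_screen_text",
            "arguments": {},  # the input schema lists no arguments
        },
    }
    return json.dumps(request)
```

The shape of the returned `result` is not specified on this page beyond text grouped by region with approximate bounding boxes, so the sketch stops at building the request.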
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |