read_file
Read DOCX, ODT, or Google Doc content with token-aware pagination. Returns excerpts with metadata for continued reading as needed.
Instructions
Read document content (DOCX, ODT, or Google Doc). Output is token-limited (~14k tokens) by default with pagination metadata (has_more, next_offset). Use offset/limit to paginate.
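As a rough illustration of the offset-based pagination described above, the sketch below walks a document page by page. It assumes a generic `call_tool(name, arguments)` client function and a response that exposes `has_more` and `next_offset` as top-level keys; both the client function and the `content` field are hypothetical stand-ins, so adapt them to whatever your client actually returns.

```python
from typing import Any, Callable, Dict, Iterator

# Hypothetical client signature: call_tool(tool_name, arguments) -> response dict.
ToolCall = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def iter_pages(call_tool: ToolCall, file_path: str) -> Iterator[Dict[str, Any]]:
    """Yield successive read_file pages until has_more is false."""
    offset = 1  # offsets are 1-based paragraph positions
    while True:
        page = call_tool("read_file", {"file_path": file_path, "offset": offset})
        yield page
        if not page.get("has_more"):
            return
        next_offset = page["next_offset"]
        if next_offset <= offset:
            # Defensive check so a malformed response cannot loop forever.
            raise RuntimeError("next_offset did not advance")
        offset = next_offset


def fake_call_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for a real client: serves 5 paragraphs, 2 per page."""
    paragraphs = [f"paragraph {i}" for i in range(1, 6)]
    start = args["offset"] - 1  # convert the 1-based offset to a list index
    chunk = paragraphs[start:start + 2]
    end = start + len(chunk)
    has_more = end < len(paragraphs)
    return {
        "content": chunk,  # assumed field name, for illustration only
        "has_more": has_more,
        "next_offset": end + 1 if has_more else None,
    }


pages = list(iter_pages(fake_call_tool, "report.docx"))
```

With a real client, pass its tool-invocation function in place of `fake_call_tool`. Omitting `limit` lets the server apply its own token budget to each page.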
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | No | Path to the DOCX or ODT file. | |
| google_doc_id | No | Google Doc ID or URL (alternative to file_path). Extract from URL: docs.google.com/document/d/{ID}/edit | |
| offset | No | 1-based paragraph offset for pagination. Negative values count from end. | |
| limit | No | Max paragraphs to return. When omitted, output is token-limited to ~14k tokens with pagination. | |
| node_ids | No | | |
| format | No | | |
| comment_rendering | No | How to render comments in read_file output. Use "paragraph_notes" for paragraph-local comment threads, "inline_markers" to add `[cm-start:N]`/`[cm-end:N]` milestones in TOON output (combined with the thread blocks), "endnotes" to collect threaded comments into a trailing #COMMENTS block in TOON output, or "none" for the legacy output with no comment rendering. | "paragraph_notes" |
| show_formatting | No | When true, shows inline formatting tags (`<b>`, `<i>`, `<u>`, `<highlighting>`, `<a>`). When false, emits plain text with no inline tags. | true |
| include_fingerprint | No | When true and format="json", include a portable content_fingerprint ("sha256:nfkc:<32hex>") on each paragraph. Read-only metadata derived from the paragraph's normalized visible text; NOT an edit anchor. Edit tools accept only `_bk_*` IDs. No effect on TOON/simple output. Ignored for Google Docs and ODT. | |
| include_footnotes | No | When true and format="json", attach a `footnotes` array ({id, display_number, text}) to each paragraph node for the footnotes anchored to it. Windowed to the returned slice (a paginated walk returns each footnote exactly once) and counted toward the read token budget. Footnotes with an empty body or no anchored paragraph are excluded; use get_footnotes for the authoritative full enumeration. No effect on TOON/simple output. Ignored for Google Docs and ODT. | false |
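
The `google_doc_id` parameter accepts either a bare ID or a full URL, so normalizing is optional. For callers that want to store only the ID, the following is a minimal sketch of the extraction rule given in the table (`docs.google.com/document/d/{ID}/edit`). The helper name `google_doc_id_from` and the sample ID are made up for illustration.

```python
import re

# Matches the {ID} segment in docs.google.com/document/d/{ID}/edit
_DOC_URL = re.compile(r"docs\.google\.com/document/d/([^/?#]+)")


def google_doc_id_from(id_or_url: str) -> str:
    """Return the document ID from a Google Docs URL, or the input if it is already an ID."""
    match = _DOC_URL.search(id_or_url)
    return match.group(1) if match else id_or_url.strip()


doc_id = google_doc_id_from("https://docs.google.com/document/d/abc123_XYZ/edit")
```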