read_file
Read files and URLs with line-based pagination, supporting text, Excel, PDF, DOCX, and images. Use offset and length for partial reads, negative offsets for tail lines, and format-specific parameters like sheet and range.
Instructions
Read contents from files and URLs.
Read PDF files and extract content as markdown and images.
Prefer this over 'execute_command' with cat/type for viewing files.
Supports partial file reading with:
- 'offset' (start line, default: 0)
  * Positive: Start from line N (0-based indexing)
  * Negative: Read last N lines from end (tail behavior)
- 'length' (max lines to read, default: configurable via 'fileReadLineLimit' setting, initially 1000)
  * Used with positive offsets for range reading
  * Ignored when offset is negative (reads all requested tail lines)
Examples:
- offset: 0, length: 10 → First 10 lines
- offset: 100, length: 5 → Lines 100-104
- offset: -20 → Last 20 lines
- offset: -5, length: 10 → Last 5 lines (length ignored)
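The offset/length rules above can be modelled on an in-memory list of lines. This is a minimal sketch of the documented semantics only, not the tool's implementation; `select_lines` is a hypothetical helper name, and the default of 1000 mirrors the initial 'fileReadLineLimit' value.

```python
def select_lines(lines, offset=0, length=1000):
    """Model the documented offset/length behaviour on a list of lines."""
    if offset < 0:
        # Negative offset: tail behaviour, 'length' is ignored.
        return lines[offset:]
    # Non-negative offset: 0-based start line, at most 'length' lines.
    return lines[offset:offset + length]


lines = [f"line {i}" for i in range(200)]
first_ten = select_lines(lines, offset=0, length=10)    # lines 0-9
middle = select_lines(lines, offset=100, length=5)      # lines 100-104
last_twenty = select_lines(lines, offset=-20)           # lines 180-199
last_five = select_lines(lines, offset=-5, length=10)   # lines 195-199, length ignored
```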
Performance optimizations:
- Large files with negative offsets use reverse reading for efficiency
- Large files with deep positive offsets use byte estimation
- Small files use fast readline streaming
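As an illustration of what "reverse reading" for negative offsets can look like, here is a minimal sketch of the general technique: read fixed-size blocks backwards from the end of the file until enough newlines have been seen. The function name `tail_lines` and the `block_size` parameter are assumptions made for this example; the tool's actual implementation may differ.

```python
import os


def tail_lines(path, n, block_size=4096):
    """Return the last n lines of a file, reading blocks backwards from the end."""
    if n <= 0:
        return []
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        pos = f.tell()
        data = b""
        # Keep reading earlier blocks until more than n newlines are buffered,
        # which guarantees the last n lines are complete, or the file start is reached.
        while pos > 0 and data.count(b"\n") <= n:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
    parts = data.split(b"\n")
    if parts and parts[-1] == b"":
        parts.pop()  # drop the empty piece after a trailing newline
    return [p.decode("utf-8", errors="replace").rstrip("\r") for p in parts[-n:]]
```

Only the tail of the file is read, so the cost depends on the number of lines requested and not on the total file size.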
When reading from the file system, only works within allowed directories.
Can fetch content from URLs when the isUrl parameter is set to true
(URLs are always read in full, regardless of offset/length).
FORMAT HANDLING (by extension):
- Text: Uses offset/length for line-based pagination
- Excel (.xlsx, .xls, .xlsm): Returns JSON 2D array
  * sheet: "Sheet1" (name) or "0" (index as string, 0-based)
  * range: ALWAYS use FROM:TO format (e.g., "A1:D100", "C1:C1", "B2:B50")
  * offset/length work as row pagination (optional fallback)
- Images (PNG, JPEG, GIF, WebP): Base64 encoded viewable content
- PDF: Extracts text content as markdown with page structure
  * offset/length work as page pagination (0-based)
  * Includes embedded images when available
- DOCX (.docx): Two modes depending on parameters:
  * DEFAULT (no offset/length): Returns a text-bearing outline: shows paragraphs with text,
    tables with cell content, styles, image refs. Skips shapes/drawings/SVG noise.
    Each element shows its body index [0], [1], etc.
  * WITH offset/length: Returns raw pretty-printed XML with line pagination.
    Use this to drill into specific sections or see the actual XML for editing.
  * EDITING WORKFLOW: 1) read_file to get outline, 2) read_file with offset/length
    to see raw XML around what you want to edit, 3) edit_block with old_string/new_string
    using XML fragments copied from the read output.
  * IMPORTANT: offset MUST be non-zero to get raw XML (use offset=1 to start from line 1).
    offset=0 always returns the outline regardless of length.
  * For BULK changes (translation, mass replacements): use start_process with Python
    zipfile module to find/replace all <w:t> elements at once.
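The bulk-change approach for DOCX could look roughly like the following sketch. It assumes the body text lives in word/document.xml and that each phrase to replace sits inside a single <w:t> run; `replace_in_docx` and its parameters are hypothetical names for this example, not part of the tool.

```python
import re
import zipfile
from xml.sax.saxutils import escape

# Matches <w:t> or <w:t attr="..."> (not <w:tbl>, <w:tab/>, or self-closing <w:t/>),
# then the run text, then the closing tag.
W_T = re.compile(r"(<w:t(?:\s[^>]*)?(?<!/)>)(.*?)(</w:t>)", re.DOTALL)


def replace_in_docx(src_path, dst_path, replacements):
    """Copy a .docx, applying plain-text replacements inside every <w:t> element."""
    def rewrite(match):
        text = match.group(2)
        for old, new in replacements.items():
            # Run text is XML-escaped in the file, so escape both sides.
            text = text.replace(escape(old), escape(new))
        return match.group(1) + text + match.group(3)

    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == "word/document.xml":
                data = W_T.sub(rewrite, data.decode("utf-8")).encode("utf-8")
            # Every other archive member is copied through unchanged.
            dst.writestr(item, data)
```

Writing to a separate destination file keeps the original intact if something goes wrong. Phrases that Word has split across several runs will not match and need the raw-XML editing workflow instead.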
IMPORTANT: Always use absolute paths for reliability. Paths are automatically normalized regardless of slash direction. Relative paths may fail as they depend on the current working directory. Tilde paths (~/...) might not work in all contexts. Unless the user explicitly asks for relative paths, use absolute paths.
This command can be referenced as "DC: ..." or "use Desktop Commander to ..." in your instructions.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| path | Yes | ||
| isUrl | No | ||
| offset | No | ||
| length | No | ||
| sheet | No | ||
| range | No | ||
| options | No | ||
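To make the parameters concrete, here is a sketch of argument payloads built from the rules described in the instructions. All paths and the URL are placeholders, and passing the URL through `path` together with `isUrl` is an assumption based on the schema; the `options` parameter is left out because its contents are not described here.

```python
import json

# Hypothetical read_file argument payloads; paths and URL are placeholders.
examples = {
    "first 10 lines": {"path": "/home/user/project/app.log", "offset": 0, "length": 10},
    "last 20 lines": {"path": "/home/user/project/app.log", "offset": -20},
    "excel range": {"path": "/home/user/data/report.xlsx", "sheet": "Sheet1", "range": "A1:D100"},
    "docx outline": {"path": "/home/user/docs/contract.docx"},
    "docx raw xml": {"path": "/home/user/docs/contract.docx", "offset": 1, "length": 200},
    "url": {"path": "https://example.com/page", "isUrl": True},
}

for label, args in examples.items():
    print(f"{label}: {json.dumps(args)}")
```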