Fetch a filing document. Small = inlined bytes; oversized = resource_link + navigation tools.
fetch_document
Retrieve the full content of a filing document from a supported jurisdiction, using its document ID. Returns structured text or PDF bytes when the document is under the size limit; for larger files, provides navigation tools to locate and extract specific pages.
Instructions
Primary tool for reading a filing's content. Pass a document_id from list_filings / get_financials. MANDATORY for any substantive answer — filing metadata (dates, form codes, descriptions) alone doesn't answer the user; the numbers and text live inside the document.
── RESPONSE SHAPES ──
• kind='embedded' (PDF up to ~20 MB; structured text up to max_bytes): returns bytes_base64 with the full document, source_url_official (evergreen registry URL for citation, auto-resolved), and source_url_direct (short-TTL signed proxy URL). For PDFs the host converts bytes into a document content block — you read it natively including scans.
• kind='resource_link' (document exceeds max_bytes): NO bytes_base64. Returns reason, next_steps, the two source URLs, plus index_preview for PDFs ({page_count, text_layer, outline_present, index_status}). Use the navigation tools below.
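
As a rough illustration of the two shapes above, a client might branch on `kind` as follows. This is a minimal sketch: it assumes the response arrives as a plain dict with the field names listed above, and `handle_fetch_result` is a hypothetical helper, not part of the tool.

```python
import base64


def handle_fetch_result(result: dict) -> dict:
    """Branch on the `kind` field of a fetch_document response (assumed to be a dict)."""
    kind = result.get("kind")
    if kind == "embedded":
        # The full document is inline: decode it and keep the citation URL.
        return {
            "action": "read_inline",
            "content": base64.b64decode(result["bytes_base64"]),
            "cite": result.get("source_url_official"),
        }
    if kind == "resource_link":
        # No bytes_base64 in this shape; index_preview is only present for PDFs.
        preview = result.get("index_preview") or {}
        return {
            "action": "navigate",
            "text_layer": preview.get("text_layer"),
            "page_count": preview.get("page_count"),
            "cite": result.get("source_url_official"),
        }
    raise ValueError(f"unexpected kind: {kind!r}")
```

The `navigate` branch only records what the preview says; locating and fetching pages is left to the navigation tools.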
── WORKFLOW FOR kind='resource_link' ──
1. Read index_preview.text_layer. Values: full (every page has real text), partial (mixed), none (scanned / image-only), oversized_skipped (indexing skipped), encrypted / failed.
2. If full / partial: call get_document_navigation (outline + previews + landmarks) and/or search_document to locate pages. If none / oversized_skipped: skip search.
3. Call fetch_document_pages(pages='N-M', format='pdf'|'text'|'png') to get actual content. Prefer pdf for citations, text for skim, png for scanned or oversized.
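
The workflow above can be sketched as a small decision function. `plan_navigation` and its `purpose` parameter are hypothetical names used for illustration. The source does not say what to do for `encrypted` or `failed`, so this sketch simply returns no page format for those values; treat that branch as an assumption.

```python
def plan_navigation(text_layer: str, purpose: str = "cite") -> dict:
    """Map index_preview.text_layer to locating tools and a fetch_document_pages format."""
    if text_layer in ("full", "partial"):
        # A text layer exists, so outline/search tools can locate pages.
        return {
            "locate_with": ["get_document_navigation", "search_document"],
            "page_format": "text" if purpose == "skim" else "pdf",
        }
    if text_layer in ("none", "oversized_skipped"):
        # Nothing to search: go straight to page images.
        return {"locate_with": [], "page_format": "png"}
    # encrypted / failed (or unknown): not covered by the source, so no format is chosen.
    return {"locate_with": [], "page_format": None}
```

For example, `plan_navigation("partial", purpose="skim")` selects both locating tools and the `text` format, while `plan_navigation("none")` skips search and selects `png`.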
── CRITICAL RULES ──
• Navigation-aids-only: previews, snippets, landmark matches, and outline titles returned by the navigation tools are for LOCATING pages. NEVER cite them as source material — quote only from fetch_document_pages output or this tool's inline bytes.
• No fallback to memory: if this tool fails (rate limit, 5xx, disconnect), do NOT fill in names / numbers / dates from training data. Tell the user what failed and offer retry or source_url_official.
• Don't reflexively retry with a larger max_bytes — for big PDFs the bytes are unreadable to you anyway. Use the navigation tools instead.
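
One way to honor the "no fallback to memory" rule in client code is to return an explicit failure report rather than any substitute content. The sketch below is illustrative only: `fetch_or_report` is a hypothetical wrapper, and `call_tool` stands in for whatever function actually invokes fetch_document.

```python
from typing import Callable, Optional


def fetch_or_report(
    call_tool: Callable[[], dict],
    source_url_official: Optional[str] = None,
) -> dict:
    """Run a fetch_document call; on failure, report it instead of guessing content."""
    try:
        return {"ok": True, "result": call_tool()}
    except Exception as exc:  # e.g. rate limit, 5xx, disconnect
        message = f"fetch_document failed ({exc}). I can retry"
        if source_url_official:
            message += f", or you can open the official source: {source_url_official}"
        # Deliberately no substitute values: nothing is filled in from memory.
        return {"ok": False, "result": None, "user_message": message + "."}
```

On failure the caller gets only a message to relay to the user, so there is no code path that produces names, numbers, or dates without a successful fetch.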
source_url_official is auto-resolved from a session-side cache populated by the most recent list_filings call. The optional company_id / transaction_id / filing_type / filing_description inputs are OVERRIDES for the rare case where document_id didn't come through list_filings. For per-country document availability, format, and pricing, call list_jurisdictions({jurisdiction:"<code>"}).
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| jurisdiction | Yes | ISO 3166-1 alpha-2 country code (uppercase), or an ISO 3166-2 subdivision code where one is listed (e.g. CA-BC). All registries are official government sources. Currently supported: AU, BE, CA, CA-BC, CA-NT, CH, CY, CZ, DE, ES, FI, FR, GB, HK, IE, IM, IS, IT, KR, KY, LI, MC, MX, MY, NL, NO, NZ, PL, RU, TW. Per-country capability, ID format, examples, status mapping, and caveats: call `list_jurisdictions({jurisdiction:'<code>'})`. To find which countries support a specific tool: `list_jurisdictions({supports_tool:'<tool>'})`. | |
| document_id | Yes | Document ID obtained from list_filings / get_financials. | |
| format | No | Optional preferred content type. Common: application/xhtml+xml, application/pdf, application/xml, application/json. Omit to let the adapter choose the most structured format available (recommended — XHTML > XML > JSON > PDF). | |
| max_bytes | No | Optional inline-size cutoff. Defaults to ~20 MB. Documents above this come back as kind='resource_link' (use navigation tools). Raising this is NOT the right way to read a big PDF — use fetch_document_pages instead. | |
| fresh | No | Set true to bypass the R2 cache and re-fetch from upstream. Use sparingly — CH filings are immutable, the cache is safe. | |
| company_id | No | OVERRIDE (rare use). Normally auto-resolved from the list_filings side-cache. Only pass this when invoking fetch_document on a document_id that did NOT come through list_filings in this session. | |
| transaction_id | No | OVERRIDE (rare use). Normally auto-resolved from the list_filings side-cache. Pass only to override the cache. | |
| filing_type | No | OVERRIDE (rare use). Normally auto-resolved. Pass only to override the cached value. | |
| filing_description | No | OVERRIDE (rare use). Normally auto-resolved. | |
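
A minimal sketch of assembling the arguments from the table above. `build_fetch_args` is a hypothetical helper and `doc-123` in the usage note is a placeholder ID, not a real document. The sketch omits `format` and `max_bytes` unless explicitly given, following the recommendation to let the adapter choose, and only accepts the four documented override fields.

```python
from typing import Optional

# The four override inputs documented in the schema.
ALLOWED_OVERRIDES = {"company_id", "transaction_id", "filing_type", "filing_description"}


def build_fetch_args(
    jurisdiction: str,
    document_id: str,
    *,
    content_format: Optional[str] = None,
    max_bytes: Optional[int] = None,
    fresh: bool = False,
    **overrides: str,
) -> dict:
    """Build a fetch_document argument dict, leaving optional fields out by default."""
    unknown = set(overrides) - ALLOWED_OVERRIDES
    if unknown:
        raise ValueError(f"unsupported override(s): {sorted(unknown)}")
    args: dict = {"jurisdiction": jurisdiction.upper(), "document_id": document_id}
    if content_format is not None:
        args["format"] = content_format  # e.g. "application/pdf"
    if max_bytes is not None:
        args["max_bytes"] = max_bytes
    if fresh:
        args["fresh"] = True  # bypasses the cache; use sparingly
    args.update(overrides)
    return args
```

In the common case only the two required fields are sent: `build_fetch_args("gb", "doc-123")` yields `{"jurisdiction": "GB", "document_id": "doc-123"}`.
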
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| queried_at | Yes | ISO-8601 + Europe/London timezone stamp for when the registry was queried. | |
| jurisdiction | No | | |
| document_id | No | | |
| source_url | No | | |
| available_formats | No | | |
| chosen_format | No | | |
| size_bytes | No | | |
| pages | No | | |
| bytes_base64 | No | | |
| bytes_omitted_reason | No | | |