Glama

Fetch a filing document. Small = inlined bytes; oversized = resource_link + navigation tools.

fetch_document
Read-only · Idempotent

Retrieve the full content of a filing document from a supported jurisdiction, using its document ID. Returns structured text or PDF bytes when the document is under the size limit; for larger files, it points to navigation tools for locating and extracting specific pages.

Instructions

Primary tool for reading a filing's content. Pass a document_id from list_filings / get_financials. MANDATORY for any substantive answer — filing metadata (dates, form codes, descriptions) alone doesn't answer the user; the numbers and text live inside the document.

── RESPONSE SHAPES ──

• kind='embedded' (PDF up to ~20 MB; structured text up to max_bytes): returns bytes_base64 with the full document, source_url_official (evergreen registry URL for citation, auto-resolved), and source_url_direct (short-TTL signed proxy URL). For PDFs the host converts bytes into a document content block, which you read natively, including scans.

• kind='resource_link' (document exceeds max_bytes): NO bytes_base64. Returns reason, next_steps, the two source URLs, plus index_preview for PDFs ({page_count, text_layer, outline_present, index_status}). Use the navigation tools below.
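The two response shapes can be told apart by the kind field. The following is a minimal sketch, assuming the tool result has already been parsed into a plain dict with the field names listed above at the top level; the function name and the returned dict layout are hypothetical, and the real MCP response envelope may differ.

```python
import base64
from typing import Any, Dict


def handle_fetch_document_result(result: Dict[str, Any]) -> Dict[str, Any]:
    """Dispatch on the 'kind' field of a parsed fetch_document result.

    Assumption: `result` is a plain dict using the field names from the
    description (kind, bytes_base64, source_url_official, ...).
    """
    kind = result.get("kind")
    if kind == "embedded":
        # The full document is inline; decode it for reading.
        return {
            "action": "read_inline",
            "data": base64.b64decode(result["bytes_base64"]),
            "cite": result.get("source_url_official"),
        }
    if kind == "resource_link":
        # No bytes_base64 here; continue with the navigation tools.
        return {
            "action": "navigate",
            "index_preview": result.get("index_preview"),
            "next_steps": result.get("next_steps"),
            "cite": result.get("source_url_official"),
        }
    raise ValueError(f"unexpected kind: {kind!r}")
```

Raising on an unknown kind is a design choice of this sketch, not documented behavior; it avoids silently treating an unrecognized shape as readable content.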

── WORKFLOW FOR kind='resource_link' ──

  1. Read index_preview.text_layer. Values: full (every page has real text), partial (mixed), none (scanned / image-only), oversized_skipped (indexing skipped), encrypted / failed.

  2. If full / partial: call get_document_navigation (outline + previews + landmarks) and/or search_document to locate pages. If none / oversized_skipped: skip search.

  3. Call fetch_document_pages(pages='N-M', format='pdf'|'text'|'png') to get actual content. Prefer pdf for citations, text for skim, png for scanned or oversized.
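The three workflow steps above can be sketched as a small planning function. This is an illustration only: the function name, the purpose parameter, and the returned dict are hypothetical, and the handling of encrypted / failed is an assumption, since the description does not say what to do for those values.

```python
from typing import Any, Dict


def plan_resource_link_steps(index_preview: Dict[str, Any], purpose: str = "cite") -> Dict[str, Any]:
    """Pick locating tools and a page format from index_preview.text_layer."""
    text_layer = index_preview.get("text_layer")
    if text_layer in ("full", "partial"):
        # A text layer exists, so outline and search can locate pages.
        return {
            "locate_with": ["get_document_navigation", "search_document"],
            # pdf is preferred for citations, text for skimming.
            "page_format": "text" if purpose == "skim" else "pdf",
        }
    if text_layer in ("none", "oversized_skipped"):
        # Scanned or unindexed: skip search and request page images.
        return {"locate_with": [], "page_format": "png"}
    # encrypted / failed: no guidance in the description (assumption: stop here).
    return {"locate_with": [], "page_format": None}
```

The chosen page_format would then be passed as the format argument of fetch_document_pages together with a pages range such as 'N-M'.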

── CRITICAL RULES ──

• Navigation-aids-only: previews, snippets, landmark matches, and outline titles returned by the navigation tools are for LOCATING pages. NEVER cite them as source material; quote only from fetch_document_pages output or this tool's inline bytes.

• No fallback to memory: if this tool fails (rate limit, 5xx, disconnect), do NOT fill in names / numbers / dates from training data. Tell the user what failed and offer retry or source_url_official.

• Don't reflexively retry with a larger max_bytes: for big PDFs the bytes are unreadable to you anyway. Use the navigation tools instead.
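The "no fallback to memory" rule can be illustrated with a thin wrapper that reports a failure instead of substituting content. This is a sketch under stated assumptions: call_tool is a hypothetical callable standing in for whatever client invokes the tool, and the message wording is illustrative.

```python
from typing import Any, Callable, Dict, Optional


def fetch_or_report(
    call_tool: Callable[[str, Dict[str, Any]], Dict[str, Any]],
    args: Dict[str, Any],
    known_official_url: Optional[str] = None,
) -> Dict[str, Any]:
    """Call fetch_document; on failure, return a report rather than guessed data."""
    try:
        return {"ok": True, "result": call_tool("fetch_document", args)}
    except Exception as exc:  # rate limit, 5xx, disconnect, ...
        message = f"fetch_document failed ({exc}). I can retry"
        if known_official_url:
            # Offer the evergreen registry URL if one is already known.
            message += f", or you can open the official source: {known_official_url}"
        return {"ok": False, "message": message + "."}
```

The failure branch never fabricates document content; it only states what failed and which options remain.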

source_url_official is auto-resolved from a session-side cache populated by the most recent list_filings call. The optional company_id / transaction_id / filing_type / filing_description inputs are OVERRIDES for the rare case where document_id didn't come through list_filings. For per-country document availability, format, and pricing, call list_jurisdictions({jurisdiction:"<code>"}).

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| jurisdiction | Yes | ISO 3166-1 alpha-2 country code (uppercase). All registries are official government sources. Currently supported: AU, BE, CA, CA-BC, CA-NT, CH, CY, CZ, DE, ES, FI, FR, GB, HK, IE, IM, IS, IT, KR, KY, LI, MC, MX, MY, NL, NO, NZ, PL, RU, TW. Per-country capability, ID format, examples, status mapping, and caveats: call `list_jurisdictions({jurisdiction:'<code>'})`. To find which countries support a specific tool: `list_jurisdictions({supports_tool:'<tool>'})`. | |
| document_id | Yes | | |
| format | No | Optional preferred content type. Common: application/xhtml+xml, application/pdf, application/xml, application/json. Omit to let the adapter choose the most structured format available (recommended; preference order is XHTML > XML > JSON > PDF). | |
| max_bytes | No | Optional inline-size cutoff. Defaults to ~20 MB. Documents above this come back as kind='resource_link' (use navigation tools). Raising this is NOT the right way to read a big PDF; use fetch_document_pages instead. | |
| fresh | No | Set true to bypass the R2 cache and re-fetch from upstream. Use sparingly: CH filings are immutable, so the cache is safe. | |
| company_id | No | OVERRIDE (rare use). Normally auto-resolved from the list_filings side-cache. Only pass this when invoking fetch_document on a document_id that did NOT come through list_filings in this session. | |
| transaction_id | No | OVERRIDE (rare use). Normally auto-resolved from the list_filings side-cache. Pass only to override the cache. | |
| filing_type | No | OVERRIDE (rare use). Normally auto-resolved. Pass only to override the cached value. | |
| filing_description | No | OVERRIDE (rare use). Normally auto-resolved. | |
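As a rough illustration of how the inputs above fit together, the sketch below assembles an arguments dict and omits every optional field that was not set, so the server-side defaults apply. The helper name and the document ID used in the usage note are hypothetical; only the parameter names come from the schema.

```python
from typing import Any, Dict, Optional

# The four override inputs named in the schema.
OVERRIDE_KEYS = {"company_id", "transaction_id", "filing_type", "filing_description"}


def build_fetch_document_args(
    jurisdiction: str,
    document_id: str,
    *,
    format: Optional[str] = None,
    max_bytes: Optional[int] = None,
    fresh: bool = False,
    **overrides: str,
) -> Dict[str, Any]:
    """Build the fetch_document arguments, leaving unset options out."""
    if jurisdiction != jurisdiction.upper():
        raise ValueError("jurisdiction must be uppercase, e.g. 'GB'")
    unknown = set(overrides) - OVERRIDE_KEYS
    if unknown:
        raise ValueError(f"unknown override(s): {sorted(unknown)}")
    args: Dict[str, Any] = {"jurisdiction": jurisdiction, "document_id": document_id}
    if format is not None:
        args["format"] = format
    if max_bytes is not None:
        args["max_bytes"] = max_bytes
    if fresh:
        args["fresh"] = True
    args.update(overrides)  # rare: only when document_id bypassed list_filings
    return args
```

In the common case only the two required fields are passed, for example build_fetch_document_args("GB", "doc-123") with a made-up ID, which leaves format selection and the size cutoff to the adapter.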

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| queried_at | Yes | ISO-8601 + Europe/London timezone stamp for when the registry was queried. | |
| jurisdiction | No | | |
| document_id | No | | |
| source_url | No | | |
| available_formats | No | | |
| chosen_format | No | | |
| size_bytes | No | | |
| pages | No | | |
| bytes_base64 | No | | |
| bytes_omitted_reason | No | | |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=true and idempotentHint=true, so the description adds value by detailing the two response shapes (embedded vs. resource_link), caching behavior (the fresh parameter), and auto-resolved IDs. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with sections for shapes, workflow, and rules. Uses markdown headings and bullet points for clarity. No fluff; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (9 parameters, plus a workflow), the description is thorough: it covers response shapes, the navigation workflow, and critical rules, and cross-references sibling tools. An output schema exists, but the description still explains return value constraints.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 89% schema description coverage, the baseline is 3; the description adds significant value by explaining the overrides (company_id, etc.) and providing guidance on max_bytes usage. Slight deduction because some schema descriptions are already detailed, so the marginal addition is moderate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that fetch_document is the primary tool for reading filing content, with a specific verb ('fetch') and resource ('document'). It distinguishes itself from siblings like fetch_document_pages and get_document_navigation by explaining workflow and when to use each.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance on when to use the tool: it is mandatory for substantive answers. Alternatives are listed: get_document_navigation and fetch_document_pages for oversized documents. Includes critical rules such as not falling back to memory and not reflexively retrying with a larger max_bytes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sophymarine/openregistry'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.