openregistry

Fetch a subset of pages from a cached PDF

fetch_document_pages

Read-onlyIdempotent

Return specific pages of a cached PDF document in pdf (authoritative), text (skimmable), or png (scanned) format. Up to 100 pages per call.

Instructions

Return specific pages of a PDF in one of three formats: • format='pdf' — pdf-lib page slice, preserves the original text layer and fonts (no re-encoding). This is the ONLY format that gives you byte-exact, citation-grade content. Use this for financial numbers, legal quotes, and any answer requiring precision. • format='text' — raw extracted text from pdfjs. Machine-readable but NOT authoritative — OCR errors on bad-quality text layers can silently garble digits. Use only for summarisation / light reading, and cross-check numbers by re-fetching with format='pdf'. • format='png' — page rasterization via Cloudflare Browser Rendering, for documents with text_layer='none' (scanned PDFs). Phase 6 — may return 'not implemented' in current deployment.

The response includes at most 100 pages (Anthropic document-block hard cap). Split larger ranges into multiple calls.

Requires the document's bytes to already be cached — call fetch_document on the full document first if this is a new filing.

Input Schema

TableJSON Schema

Name	Required	Description	Default
`jurisdiction`	Yes	ISO 3166-1 alpha-2 country code (uppercase). All registries are official government sources. Currently supported: AU, BE, CA, CA-BC, CA-NT, CH, CY, CZ, DE, ES, FI, FR, GB, HK, IE, IM, IS, IT, KR, KY, LI, MC, MX, MY, NL, NO, NZ, PL, RU, TW. Per-country capability, ID format, examples, status mapping, and caveats: call `list_jurisdictions({jurisdiction:'<code>'})`. To find which countries support a specific tool: `list_jurisdictions({supports_tool:'<tool>'})`.
`document_id`	Yes
`pages`	Yes	Page spec like '1-5', '3,7,9', or '1,3-5'. 1-based. Max 100 pages per call.
`format`	No	Output format. Use 'pdf' for authoritative content (default), 'text' for quick skimming, 'png' for scanned documents.	pdf
`dpi`	No	DPI for format='png'. Default 150. 72 for thumbnails, 200+ for high-detail reading.
`company_id`	No	OVERRIDE (rare use). Normally auto-resolved from the list_filings side-cache.
`transaction_id`	No	OVERRIDE (rare use). Normally auto-resolved.

Output Schema

TableJSON Schema

Name	Required	Description
`queried_at`	Yes	ISO-8601 + Europe/London timezone stamp for when the registry was queried.
`jurisdiction`	No
`document_id`	No
`pages_requested`	No
`chosen_format`	No
`size_bytes`	No
`bytes_base64`	No
`text`	No

Tool Definition Quality

A4.5/5.0

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare idempotent and read-only, so bar is lower. Description adds valuable details: byte-exact pdf output, potential OCR errors on text format, and 'not implemented' possibility for png. Minor deduction for not discussing rate limits or caching behavior beyond the prerequisite.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with bullet points for formats, clear sentences, and no wasted words. Critical information front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite the presence of an output schema, the description provides comprehensive guidance on usage, format caveats, and prerequisites. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 86%, so baseline is 3. Description adds some value by clarifying format behaviors ('authoritative', 'quick skimming', 'scanned documents') and DPI use cases, but mostly repeats schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it returns specific pages of a PDF in three formats (pdf, text, png). Distinguishes from sibling fetch_document by specifying it works on cached PDFs and returns subsets of pages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells when to use each format ('pdf' for precision, 'text' for summarization, 'png' for scanned PDFs), mentions hard cap of 100 pages and splitting, and notes prerequisite of calling fetch_document first for new filings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Install Server

Latest Blog Posts

Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
open source
OpenAI
Tool Definition Quality Score (TDQS)
By punkpeye on April 3, 2026.
mcp
The Hackers Who Tracked My Sleep Cycle
By punkpeye on March 26, 2026.
security

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sophymarine/openregistry'

If you have feedback or need assistance with the MCP directory API, please join our Discord server