Document to Markdown

talonic_to_markdown

Convert any document—PDF, scan, or image—into clean markdown text using OCR. Accepts document ID, file upload, local path, or URL for flexible input.

Instructions

STATUS: stable.

Get the OCR-converted markdown for a document. Accepts an existing document_id, raw file bytes (base64), a local file path, or a URL. When given a raw file, the tool ingests it via extract first and then returns the markdown.

USE WHEN:

  • The user wants the full text content of a document for summarisation, translation, or analysis.

  • A previous tool call returned a document_id and you want to inspect its content.

  • The user asks 'what does the document say' or 'summarise this PDF' (call this tool first, then summarise the result).

  • The user has a raw PDF / scan / image and wants markdown directly without designing a schema first.

DO NOT USE WHEN:

  • The user wants specific structured fields (use talonic_extract with a schema).

INPUTS (provide EXACTLY ONE; never combine, e.g. do NOT pass both file_data and file_path):

  • document_id: id of an already-ingested document. Cheapest path, one API call.

  • file_data + filename: base64-encoded file bytes plus the original filename (with extension). RECOMMENDED for local-stdio installs (Claude Desktop, Cursor, Cline, Continue, Cowork). WARNING for hosted-MCP via Claude.ai connectors: Claude.ai imposes a hard size limit on tool-call arguments (effectively under ~1KB), so file_data CANNOT carry a real PDF through Claude.ai's pipeline. The bytes get truncated before reaching the MCP server. For files larger than a trivial test, use file_url or document_id instead when running through Claude.ai. Local stdio installs do NOT have this limit.

  • file_path: local path to a document file. Only works if the MCP server has read access to that path; in sandboxed chat clients use file_data instead.

  • file_url: a URL the Talonic API will fetch directly. Use for documents already on the public web. Best path for Claude.ai users dealing with files larger than the parameter cap.
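The exactly-one rule above can be checked on the client side before the tool is called. The following is a minimal sketch, not the server's own validation: the helper name `build_to_markdown_args` is hypothetical, and only the parameter names are taken from the description above.

```python
from typing import Optional


def build_to_markdown_args(
    document_id: Optional[str] = None,
    file_data: Optional[str] = None,
    filename: Optional[str] = None,
    file_path: Optional[str] = None,
    file_url: Optional[str] = None,
) -> dict:
    """Build an arguments dict for talonic_to_markdown with exactly one input source.

    Hypothetical client-side helper; the server may validate differently.
    """
    sources = {
        "document_id": document_id,
        "file_data": file_data,
        "file_path": file_path,
        "file_url": file_url,
    }
    # Keep only the input sources the caller actually supplied.
    provided = {name: value for name, value in sources.items() if value is not None}
    if len(provided) != 1:
        raise ValueError(
            f"provide exactly one input source, got {sorted(provided) or 'none'}"
        )
    # filename is not an input source of its own; it only accompanies file_data.
    if file_data is not None and not filename:
        raise ValueError("filename is required when file_data is provided")
    if filename and file_data is None:
        raise ValueError("filename is only meaningful together with file_data")

    args = dict(provided)
    if file_data is not None:
        args["filename"] = filename
    return args
```

For example, `build_to_markdown_args(document_id="doc_123")` returns `{"document_id": "doc_123"}`, while passing both `file_data` and `file_path` raises `ValueError`. The id `doc_123` is a placeholder, not a real Talonic document id.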

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| document_id | No | The Talonic document id whose markdown you want. Get this from a previous talonic_extract or talonic_search response. | |
| file_data | No | Base64-encoded file bytes. Recommended path when the agent already has the file in memory (e.g., the user attached a PDF to the conversation). Pair with `filename` so MIME type can be inferred. | |
| filename | No | Original filename including extension, e.g. 'invoice.pdf'. Used to infer MIME type when uploading via `file_data`. Required when `file_data` is provided. | |
| file_path | No | Local path to a document file. Only works if the MCP server has read access to that path. In sandboxed chat clients (Claude Desktop, Cowork) use `file_data` instead. | |
| file_url | No | URL to a document file. The Talonic API fetches it server-side. | |
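When the agent already holds the file bytes in memory, the `file_data` and `filename` pair can be prepared roughly as follows. This is a sketch under stated assumptions: the helper name `encode_for_file_data` and the optional `max_encoded_bytes` guard are hypothetical. The guard only illustrates the size warning for hosted connectors given earlier; the actual limit is described there as approximate, so no default value is assumed here.

```python
import base64
from typing import Optional


def encode_for_file_data(
    raw: bytes,
    filename: str,
    max_encoded_bytes: Optional[int] = None,
) -> dict:
    """Return the file_data + filename arguments for in-memory file bytes.

    Hypothetical helper. max_encoded_bytes is an optional client-side guard
    for environments that cap tool-call argument size.
    """
    # The schema asks for the original filename including its extension,
    # because the MIME type is inferred from it.
    if "." not in filename:
        raise ValueError("filename should include an extension, e.g. 'invoice.pdf'")

    encoded = base64.b64encode(raw).decode("ascii")

    if max_encoded_bytes is not None and len(encoded) > max_encoded_bytes:
        raise ValueError(
            "encoded file exceeds the argument size limit; "
            "use file_url or document_id instead"
        )
    return {"file_data": encoded, "filename": filename}
```

Base64 output is about one third larger than the raw bytes, so the guard compares the encoded length, which is what actually travels in the tool-call arguments.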

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| document_id | Yes | ID of the document the markdown was extracted from. | |
| markdown | Yes | OCR-converted markdown text content of the document. | |
| cost | No | Per-call cost and post-call balance from the underlying extract step, parsed from the X-Talonic-* response headers. `null` when the document was already ingested (document_id path) and no extract call ran. Not always present on legacy clients. | |
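A caller consuming this output has to handle `cost` being either `null` or missing entirely. The sketch below shows one way to do that; the `MarkdownResult` class and `parse_to_markdown_result` function are hypothetical names, and the inner structure of `cost` is not specified in the schema above, so it is kept as an opaque value.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class MarkdownResult:
    document_id: str
    markdown: str
    # None covers both cases: cost is null (document_id path) or the
    # field is absent (e.g. legacy clients).
    cost: Optional[Any] = None


def parse_to_markdown_result(payload: dict) -> MarkdownResult:
    """Validate the required output fields and normalise the optional cost.

    Hypothetical helper; assumes payload is the already-decoded tool result.
    """
    missing = [key for key in ("document_id", "markdown") if key not in payload]
    if missing:
        raise KeyError(f"missing required output fields: {missing}")
    return MarkdownResult(
        document_id=payload["document_id"],
        markdown=payload["markdown"],
        cost=payload.get("cost"),
    )
```

Treating a null and an absent `cost` the same way keeps downstream code to a single `if result.cost is not None` check.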
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint=false, destructiveHint=false), the description discloses that raw files are ingested via 'extract' first, warns about file size limits for hosted MCP, and notes the cheapest path for document_id. This adds rich behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (status, use cases, input instructions). Every sentence is purposeful and adds necessary detail without redundancy. The length is justified by the complexity of multiple input methods.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's multiple input options, a comprehensive description is needed. This description covers all input methods, their constraints, use cases, and even explains the ingestion process for raw files. With an output schema existing, no explanation of return values is required.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Despite 100% schema coverage, the description adds extensive value: it explicitly states 'provide EXACTLY ONE' input, advises on when to use each parameter (e.g., recommended for local-stdio, warning for Claude.ai), and explains the relationship between file_data and filename.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool converts documents to markdown via OCR. It specifies the verb 'Get' and the resource 'OCR-converted markdown', and distinguishes from sibling 'talonic_extract' by noting that for structured fields, that tool should be used instead.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes explicit 'USE WHEN' and 'DO NOT USE WHEN' sections, listing concrete scenarios such as the user wanting the full text or asking "what does the document say", and directly names the alternative tool 'talonic_extract' for structured fields.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/talonicdev/talonic-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.