Schema | talonic-mcp

talonic-mcp

Official

by talonicdev

TypeScript

Remote

Server Configuration

Environment variables used to configure the server.

Name	Required	Description	Default
`TALONIC_API_KEY`	Yes	Your Talonic API key. Starts with `tlnc_`.	(none)
`TALONIC_BASE_URL`	No	Override the API base URL.	https://api.talonic.com
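
The table above can be turned into a small startup check. The following is a minimal sketch, not the server's own code: `loadTalonicConfig` and `TalonicConfig` are hypothetical names, and the prefix check simply mirrors the "Starts with tlnc_" note.

```typescript
// Hypothetical helper, not part of talonic-mcp: reads the two documented
// environment variables and applies the documented default base URL.
interface TalonicConfig {
  apiKey: string;
  baseUrl: string;
}

const DEFAULT_BASE_URL = "https://api.talonic.com";
const KEY_PREFIX = "tlnc_";

function loadTalonicConfig(env: Record<string, string | undefined>): TalonicConfig {
  const apiKey = env.TALONIC_API_KEY;
  if (!apiKey) {
    throw new Error("TALONIC_API_KEY is required");
  }
  // The listing states that keys start with tlnc_; reject anything else early.
  if (apiKey.indexOf(KEY_PREFIX) !== 0) {
    throw new Error("TALONIC_API_KEY should start with " + KEY_PREFIX);
  }
  // TALONIC_BASE_URL is optional; fall back to the documented default.
  const baseUrl = env.TALONIC_BASE_URL || DEFAULT_BASE_URL;
  return { apiKey, baseUrl };
}
```

In a Node process this could be called as `loadTalonicConfig(process.env)`; passing the environment in as a parameter keeps the check easy to test.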

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }
`resources`	{ "listChanged": true }

Tools

Functions exposed to the LLM to take actions

Name	Description
talonic_list_schemas	STATUS: stable. List all saved schemas in the user's Talonic workspace. Returns each schema with its id (UUID), short_id (SCH-XXXXXXXX), name, description, version, field count, and full JSON Schema definition. Either id form is accepted by talonic_extract's `schema_id` parameter. USE WHEN: The user asks what schemas they have, or asks to see existing schemas. You want to discover existing schemas before designing a new one. Before recommending the user create a schema, check if one already covers the use case. The user asks to extract from a known document type and you want to find a matching schema. DO NOT USE WHEN: The user just wants to extract data from a document and provides an inline schema (call talonic_extract directly). TIP: Pair this with talonic_extract by passing the chosen schema's id as `schema_id`.
talonic_save_schema	STATUS: stable. Save a schema definition to the user's Talonic workspace so it can be reused across future extractions. Returns the saved schema with its newly assigned id and short_id. USE WHEN: The user asks to save a schema, store a template, or reuse the schema across docs. You have iterated on a schema with the user and they confirmed it should be saved. The user wants to standardise extraction across many documents of the same type. DO NOT USE WHEN: The user just wants to extract once with an inline schema (call talonic_extract directly with the schema inline). The user has not confirmed the schema design (avoid creating clutter in their workspace). DEFINITION FORMATS: JSON Schema (most reliable): { type: "object", properties: { vendor_name: { type: "string" } } }. Flat key-type map: { vendor_name: "string", invoice_total: "number" }. The API normalises the flat map server-side. If you get a "no fields" error from the API, fall back to JSON Schema. TIP: After saving, call talonic_extract with `schema_id` set to the returned id (UUID or SCH- short id) for consistent results.
talonic_get_document	STATUS: stable. Fetch full metadata for a single document already in the user's Talonic workspace. Returns id, filename, page count, detected document type, language, processing log, and link URLs (self, extractions, dashboard). USE WHEN: You need details about a specific document the user already extracted or uploaded. You have a document_id from a previous extract or search call and want more context. The user asks 'tell me about document X' or similar. DO NOT USE WHEN: The user wants the document's full text content (use talonic_to_markdown for OCR markdown). The user wants extracted structured data (use talonic_extract with a schema, or fetch the extraction by id). The user has a file but no document_id yet (call talonic_extract first to ingest the document).
talonic_search	STATUS: stable. Search the user's Talonic workspace for documents, fields, sources, or schemas matching a query. Returns ranked results across all entity types in one call. USE WHEN: The user wants to find documents but does not know the exact filename or id. The query is conceptual ('contracts mentioning indemnification', 'Acme invoices'). You need to narrow a large workspace before calling talonic_extract or talonic_filter. The user asks 'do I have any docs about X' or 'find anything related to X'. DO NOT USE WHEN: The user has a specific document_id (use talonic_get_document instead). The user wants to apply structured field-value filters like 'amount > 1000' (use talonic_filter). The user wants to extract data from a brand-new document (use talonic_extract). TIP: The result includes documents, fieldMatches, sources, schemas, and fields. Both fields[] and fieldMatches[] include a `filterable` boolean. Only entries with filterable: true can be used with talonic_filter. Fields with filterable: false exist in a schema but have no extracted data yet. Pick the entity type the user actually needs.
talonic_filter	STATUS: stable. Field-name resolution is server-side. Filter the user's Talonic documents by extracted field values using composable conditions. Conditions accept either a canonical field name (e.g. 'vendor.name', 'policy.0_coverage_type') or a field UUID. The Talonic API resolves names to ids server-side. USE WHEN: The user wants documents matching specific structured criteria, like 'invoices over 1000 EUR' or 'contracts expiring before 2026-12-31' or 'COIs from Acme'. The query is value-based on extracted fields, not a free-text concept search. You need to retrieve a sortable, paginated list filtered by field conditions. DO NOT USE WHEN: The user wants conceptual / free-text search across content (use talonic_search). The user is looking for a single document by id (use talonic_get_document). The user wants extracted data from a new document (use talonic_extract). OPERATORS: eq, neq: equality / inequality. gt, gte, lt, lte: numeric or date comparisons. between: requires both `value` and `value_to`. contains: substring match on string fields. is_empty: presence check, no value needed. Returns documents where the field is null or missing. is_not_empty: presence check, no value needed. Returns documents where the field has a materialized value. Results reflect data within seconds of extraction completing. SCHEMA TYPING: Numeric operators (`gt`, `gte`, `lt`, `lte`, `between`) only resolve correctly when the schema field is typed as `number`. A field typed as `string` that holds numeric content (e.g. '€1,500.00') will silently return zero matches even after extraction. Pick the right type at schema design time. If the response contains a `warnings` array, surface its `message` (and `suggestion`, if present) to the user verbatim. These warnings explain why a query returned zero or unexpected results and typically suggest a schema-design change (e.g. switching a field's `data_type` from `string` to `number`) that will make subsequent filter calls work correctly. Do not silently retry without flagging the warning. TIPS: To discover available field names, call talonic_search first with a related query. Only use fields[] entries where filterable is true; their canonicalName is what to pass as `field` here. Fields with filterable: false have no extracted data yet. fieldMatches[].resolvedFieldId is only valid when filterable is true. Entries with filterable: false have resolvedFieldId: null and cannot be used for filtering. Both `field` (name) and `field_id` (UUID) reach the API as `fieldId`. Either is fine.
talonic_to_markdown	STATUS: stable. Get the OCR-converted markdown for a document. Accepts an existing document_id, raw file bytes (base64), a local file path, or a URL. When given a raw file, the tool ingests it via extract first and then returns the markdown. USE WHEN: The user wants the full text content of a document for summarisation, translation, or analysis. A previous tool call returned a document_id and you want to inspect its content. The user asks 'what does the document say' or 'summarise this PDF' (you call this then summarise). The user has a raw PDF / scan / image and wants markdown directly without designing a schema first. DO NOT USE WHEN: The user wants specific structured fields (use talonic_extract with a schema). INPUTS (provide EXACTLY ONE; never combine, e.g. do NOT pass both file_data and file_path): document_id: id of an already-ingested document. Cheapest path, one API call. file_data + filename: base64-encoded file bytes plus the original filename (with extension). RECOMMENDED for local-stdio installs (Claude Desktop, Cursor, Cline, Continue, Cowork). WARNING for hosted-MCP via Claude.ai connectors: Claude.ai imposes a hard size limit on tool-call arguments (effectively under ~1KB), so file_data CANNOT carry a real PDF through Claude.ai's pipeline. The bytes get truncated before reaching the MCP server. For files larger than a trivial test, use file_url or document_id instead when running through Claude.ai. Local stdio installs do NOT have this limit. file_path: local path to a document file. Only works if the MCP server has read access to that path; in sandboxed chat clients use file_data instead. file_url: a URL the Talonic API will fetch directly. Use for documents already on the public web. Best path for Claude.ai users dealing with files larger than the parameter cap.
talonic_extract	STATUS: stable. Production-safe when called with a schema. Schema-less extraction is disabled at the MCP layer. Extract structured, schema-validated data from a document using Talonic. Returns clean JSON matching the schema, with per-field confidence scores and metadata about the document (detected type, language, page count). USE WHEN: The user has a document (PDF, image, scan, DOCX, etc.) and wants specific fields pulled out. You need structured data (vendor name, total amount, dates, parties, terms) rather than free text. The user uploads or references any invoice, contract, certificate, statement, or form. You want validated JSON instead of trying to OCR + parse with raw LLM calls. DO NOT USE WHEN: The user just wants the full text content (use talonic_to_markdown after extracting once). The user wants to find documents matching a query (use talonic_search or talonic_filter). FILE SOURCES (provide EXACTLY ONE; never combine, e.g. do NOT pass both file_data and file_path): file_data + filename: base64-encoded file bytes plus the original filename (with extension). RECOMMENDED for local-stdio installs (Claude Desktop, Cursor, Cline, Continue, Cowork). WARNING for hosted-MCP via Claude.ai connectors: Claude.ai imposes a hard size limit on tool-call arguments (effectively under ~1KB), so file_data CANNOT carry a real PDF through Claude.ai's pipeline. The bytes get truncated before reaching the MCP server. For files larger than a trivial test, use file_url or document_id instead when running through Claude.ai. Local stdio installs do NOT have this limit. file_path: a local path to the document. Only works if the MCP server process can read that path on its own filesystem. Chat clients (Claude Desktop, Claude.ai, Cowork) store user uploads in a sandbox the MCP server cannot access, so file_path is only useful when the agent explicitly knows a path on the same machine as the MCP server. file_url: a URL the Talonic API will fetch directly. Use for documents already on the public web. Best path for Claude.ai users dealing with files larger than the parameter cap. document_id: re-extract a document already in the workspace. Cheapest option when the document is already uploaded via app.talonic.com or a previous extract call. SCHEMA (REQUIRED, provide exactly one of `schema` or `schema_id`): JSON Schema (RECOMMENDED): { type: "object", properties: { vendor_name: { type: "string" } } }. Flat key-type map: { vendor_name: "string", invoice_total: "number" }. Accepted, but if you get a "no fields" error, fall back to JSON Schema. schema_id: id of a saved schema from talonic_list_schemas. Accepts UUID or SCH-XXXXXXXX short id. Calls without `schema` or `schema_id` are rejected with a validation error before they hit the API, to prevent unreliable schema-free extractions reaching production. RESPONSE SHAPE (key fields): data: the structured extracted JSON, shaped by your schema. confidence.overall: 0..1 confidence for the extraction as a whole. confidence.fields: per-field confidence map. Treat fields below ~0.7 as needing human review. document.id, document.filename, document.pages, document.type_detected, document.language_detected. extraction_id, request_id: stable identifiers for support and re-fetch. processing.duration_ms, processing.region: useful for debugging and capacity planning. markdown: present only when `include_markdown: true`. provenance: present only when `include_provenance: true`. Per-field source evidence: { field_name: { source_text, section, page } }. Useful for audit trails and citations. Cost, EUR price, and remaining credit balance are not surfaced in v0.1 and may appear in a later version.
talonic_get_balance	STATUS: stable. Read the user's current Talonic credit balance, EUR value, 30-day burn rate, projected runway, tier, and next-tier-reset timestamp. Use this to make budget-aware decisions before kicking off large batches or re-extractions. USE WHEN: The user asks how many credits or how much budget they have left. You are about to run a large or expensive operation (batch extract, re-extract many documents) and want to confirm budget headroom first. The user asks how long their balance will last at the current rate. DO NOT USE WHEN: The user just wants the per-call cost of a single extraction (that is already on the talonic_extract response under `cost`). The user wants to top up credits (route them to the dashboard; auto top-up is guarded by a separate scope). RESPONSE SHAPE: balance_credits: current credit balance. balance_eur: current balance in EUR (rounded to two decimals). burn_rate_30d_credits: total credits consumed in the trailing 30 days. projected_runway_days: days of runway at the current 30-day average burn. `-1` indicates no consumption in the trailing window (cannot compute runway). tier: API tier of the workspace (`free`, `pro`, `enterprise`, etc.). tier_resets_at: ISO 8601 timestamp of the next monthly tier reset.
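
The talonic_extract rules above (exactly one file source, exactly one of `schema` or `schema_id`, review anything below roughly 0.7 confidence) can be checked on the client before and after a call. This is a rough sketch under assumptions: the argument names come from the tool description, but `ExtractArgs`, `validateExtractArgs`, and `fieldsNeedingReview` are hypothetical helpers and do not reflect how the server itself validates input.

```typescript
// Illustrative argument shape; field names follow the tool description,
// the interface itself is an assumption.
interface ExtractArgs {
  file_data?: string;
  filename?: string;
  file_path?: string;
  file_url?: string;
  document_id?: string;
  schema?: Record<string, unknown>;
  schema_id?: string;
}

// Returns a list of problems; an empty list means the arguments follow
// the "exactly one source, exactly one schema form" rules.
function validateExtractArgs(args: ExtractArgs): string[] {
  const errors: string[] = [];

  const sources = [args.file_data, args.file_path, args.file_url, args.document_id]
    .filter((value) => value !== undefined);
  if (sources.length !== 1) {
    errors.push("provide exactly one of file_data, file_path, file_url, document_id");
  }
  // file_data must be accompanied by the original filename (with extension).
  if (args.file_data !== undefined && !args.filename) {
    errors.push("file_data requires filename");
  }

  const schemaForms = (args.schema !== undefined ? 1 : 0) + (args.schema_id !== undefined ? 1 : 0);
  if (schemaForms !== 1) {
    errors.push("provide exactly one of schema or schema_id");
  }
  return errors;
}

// Lists fields whose confidence falls below the threshold (default 0.7),
// given a map shaped like the confidence.fields entry of the response.
function fieldsNeedingReview(fieldConfidence: Record<string, number>, threshold = 0.7): string[] {
  return Object.keys(fieldConfidence).filter((name) => fieldConfidence[name] < threshold);
}
```

For example, `fieldsNeedingReview({ vendor_name: 0.95, invoice_total: 0.55 })` would flag only `invoice_total` for human review.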

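The operator rules for talonic_filter can also be captured in a type plus a small pre-flight check. The sketch below is illustrative only: `field`, `value`, and `value_to` are named in the tool description, while the `operator` property name, the `FilterCondition` shape, and `checkCondition` are assumptions made for the example.

```typescript
// Operators listed in the talonic_filter description.
type FilterOperator =
  | "eq" | "neq"
  | "gt" | "gte" | "lt" | "lte"
  | "between"
  | "contains"
  | "is_empty" | "is_not_empty";

// Hypothetical condition shape; the real request format may differ.
interface FilterCondition {
  field: string;            // canonical field name, e.g. 'vendor.name'
  operator: FilterOperator; // property name is an assumption
  value?: string | number;
  value_to?: string | number;
}

// Returns null when the condition is well-formed, otherwise a short reason.
function checkCondition(condition: FilterCondition): string | null {
  // Presence checks take no value.
  if (condition.operator === "is_empty" || condition.operator === "is_not_empty") {
    return null;
  }
  if (condition.value === undefined) {
    return condition.operator + " requires value";
  }
  // between needs both ends of the range.
  if (condition.operator === "between" && condition.value_to === undefined) {
    return "between requires both value and value_to";
  }
  return null;
}
```

A check like this catches malformed conditions locally; it does not address the schema-typing caveat, where a numeric comparison against a field typed as `string` returns zero matches.
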
Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
`talonic-schemas`	All schemas saved in the user's Talonic workspace, with their full JSON Schema definitions.
`talonic-webhooks-reference`	Webhook event types, delivery behavior, signature verification algorithms, and retry policies.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/talonicdev/talonic-mcp'