MCP-Upstage-Server

Official
by UpstageAI

extract_information

Extract structured data from documents using a custom or auto-generated schema. Works with PDF, image, and Office document formats.

Instructions

Extract structured information from documents using Upstage Universal Information Extraction.

This tool can extract key information from any document type without pre-training. You can either provide a schema defining what information to extract, or let the system automatically generate an appropriate schema based on the document content.

Supported file formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX
Max file size: 50MB
Max pages: 100
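The limits above can be checked locally before calling the tool. The following is a minimal sketch, not part of the server: `check_file` is a hypothetical helper, the `.jpg` and `.tif` aliases are assumptions, and "50MB" is assumed to mean 50 * 1024 * 1024 bytes. The page limit is not checked because counting pages depends on the file format.

```python
from pathlib import Path

# Extensions derived from the supported formats listed above.
# ".jpg" and ".tif" are assumed aliases, not stated in the tool description.
SUPPORTED_EXTENSIONS = {
    ".jpeg", ".jpg", ".png", ".bmp", ".pdf", ".tiff", ".tif",
    ".heic", ".docx", ".pptx", ".xlsx",
}

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def check_file(file_path):
    """Return a list of problems found for file_path (empty if none)."""
    path = Path(file_path)
    if not path.is_file():
        return [f"file not found: {path}"]

    problems = []
    if path.suffix.lower() not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported extension: {path.suffix!r}")
    if path.stat().st_size > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than the assumed 50MB limit")
    return problems
```

An empty list means the file passed these local checks only; the server may still reject it for other reasons.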

SCHEMA FORMAT: When auto_generate_schema is false, provide schema in this exact format:

{
  "type": "json_schema",
  "json_schema": {
    "name": "document_schema",
    "schema": {
      "type": "object",
      "properties": {
        "field_name": {
          "type": "string|number|array|object",
          "description": "What to extract"
        }
      }
    }
  }
}

Example schema_json: {"type":"json_schema","json_schema":{"name":"document_schema","schema":{"type":"object","properties":{"company_name":{"type":"string","description":"Company name"},"invoice_number":{"type":"string","description":"Invoice number"},"total_amount":{"type":"number","description":"Total amount"}}}}}
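Writing the schema_json string by hand is error-prone, so it may help to generate it. The sketch below builds the same wrapper structure from a small field mapping; `build_schema_json` is a hypothetical helper introduced here for illustration, not something the server provides.

```python
import json


def build_schema_json(fields, name="document_schema"):
    """Serialize a schema in the format described above.

    fields maps each field name to a (json_type, description) pair.
    """
    properties = {
        field: {"type": json_type, "description": description}
        for field, (json_type, description) in fields.items()
    }
    wrapper = {
        "type": "json_schema",
        "json_schema": {
            "name": name,
            "schema": {"type": "object", "properties": properties},
        },
    }
    # Compact separators produce a single-line string with no extra spaces.
    return json.dumps(wrapper, separators=(",", ":"))


schema_json = build_schema_json({
    "company_name": ("string", "Company name"),
    "invoice_number": ("string", "Invoice number"),
    "total_amount": ("number", "Total amount"),
})
```

With the three fields shown, the resulting string matches the example schema_json above.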

Input Schema

Name                  Required  Description  Default
file_path             Yes
schema_path           No
schema_json           No
auto_generate_schema  No
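As an illustration of how these four parameters fit together, the sketch below assembles an arguments dictionary for a tool call. `build_arguments` is a hypothetical helper. It only enforces the rule stated in the instructions (a schema is needed when auto_generate_schema is false); because the default of auto_generate_schema is not listed, nothing is enforced when that parameter is omitted.

```python
import json


def build_arguments(file_path, schema_path=None, schema_json=None,
                    auto_generate_schema=None):
    """Build the arguments for extract_information, omitting unset options."""
    if auto_generate_schema is False and not (schema_path or schema_json):
        raise ValueError(
            "auto_generate_schema is false, so schema_path or schema_json "
            "must be provided"
        )
    if schema_json is not None:
        # Fail early if the schema string is not valid JSON.
        json.loads(schema_json)

    arguments = {"file_path": file_path}
    optional = {
        "schema_path": schema_path,
        "schema_json": schema_json,
        "auto_generate_schema": auto_generate_schema,
    }
    # Only send the optional parameters the caller actually set.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments
```

For example, `build_arguments("invoice.pdf", auto_generate_schema=True)` yields a dictionary containing only file_path and auto_generate_schema.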

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/UpstageAI/mcp-upstage-server'
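The same request can be made from Python with only the standard library. This is a rough sketch under two assumptions not confirmed by the text above: that the URL pattern generalizes to other owner/name pairs, and that the response body is JSON. `server_url` and `fetch_server` are hypothetical names.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory URL for a server (assumed owner/name pattern)."""
    owner_part = urllib.parse.quote(owner, safe="")
    name_part = urllib.parse.quote(name, safe="")
    return f"{BASE_URL}/{owner_part}/{name_part}"


def fetch_server(owner, name, timeout=10.0):
    """GET the server entry and decode it, assuming a JSON response."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage (performs a network request):
# info = fetch_server("UpstageAI", "mcp-upstage-server")
```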

If you have feedback or need assistance with the MCP directory API, please join our Discord server.