generate_schema
Analyzes documents to automatically create JSON schemas for structured data extraction, enabling consistent field definitions across similar documents.
Instructions
Generate an extraction schema for a document using Upstage AI's schema generation API.
This tool analyzes a document and automatically generates a JSON schema that defines the structure and fields that can be extracted from similar documents. The generated schema can then be used with the extract_information tool when auto_generate_schema is set to false.
This is useful when you want to:
- Create a reusable schema for multiple similar documents
- Have more control over the extraction fields
- Ensure consistent field naming and structure across extractions
Supported file formats: JPEG, PNG, BMP, PDF, TIFF, HEIC, DOCX, PPTX, XLSX
Max file size: 50MB
Max pages: 100
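The format and size limits above can be checked locally before a document is sent to the tool. The following is a minimal sketch, not part of the tool itself: the helper name `check_document` is hypothetical, the `.jpg` and `.tif` aliases are assumed to be accepted alongside JPEG and TIFF, and "50MB" is assumed to mean 50 × 1024 × 1024 bytes. The page limit is not checked, since counting pages depends on the file format.

```python
import os

# Extensions derived from the supported formats listed above.
# Assumption: ".jpg" and ".tif" are accepted as aliases of JPEG and TIFF.
SUPPORTED_EXTENSIONS = {
    ".jpeg", ".jpg", ".png", ".bmp", ".pdf", ".tiff", ".tif",
    ".heic", ".docx", ".pptx", ".xlsx",
}

# Assumption: "50MB" is interpreted as 50 MiB.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024


def check_document(file_path):
    """Return a list of problems; an empty list means the file passes these checks."""
    problems = []

    # Compare the extension case-insensitively against the supported set.
    ext = os.path.splitext(file_path)[1].lower()
    if ext not in SUPPORTED_EXTENSIONS:
        problems.append(f"unsupported file extension: {ext or '(none)'}")

    # Only check the size if the file actually exists.
    if not os.path.isfile(file_path):
        problems.append("file does not exist")
    elif os.path.getsize(file_path) > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 50MB")

    return problems
```

Calling `check_document("invoice.pdf")` returns an empty list when the file exists, has a supported extension, and is within the size limit.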
The tool returns both a readable schema object and a schema_json string that can be directly copied and used with the extract_information tool.
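As an illustration of how the returned schema_json string might be reused, the sketch below builds an argument set for a later extract_information call. Several things here are assumptions: the example schema contents are invented, the helper name `build_extract_arguments` is hypothetical, and the extract_information parameter names `file_path` and `schema_json` are guesses. Only `auto_generate_schema` and the `schema` / `schema_json` result fields are named in the description above.

```python
import json

# Invented example schema; the real schema produced by the tool may look different.
example_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
    },
}

# Hypothetical generate_schema result with the two fields described above.
generate_schema_result = {
    "schema": example_schema,
    "schema_json": json.dumps(example_schema),
}


def build_extract_arguments(file_path, schema_result):
    """Build arguments for extract_information (parameter names are assumed)."""
    schema_json = schema_result["schema_json"]

    # Fail early if the string is not valid JSON (raises json.JSONDecodeError).
    json.loads(schema_json)

    return {
        "file_path": file_path,
        "auto_generate_schema": False,  # use the supplied schema instead of generating one
        "schema_json": schema_json,
    }


arguments = build_extract_arguments("invoice_002.pdf", generate_schema_result)
```

Passing the string through unchanged, rather than re-serializing the schema object, keeps the field names identical across every document extracted with the same schema.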
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes |