Train PyLaia Model
transkribus_pylaia_train
Initiate PyLaia HTR model training for a collection with optimized default parameters that improve model quality over server defaults. Customize training settings or use the provided defaults.
Instructions
Start PyLaia HTR model training for a collection. By default, the tool sends training parameters that match the Transkribus UI defaults (textFeatsCfg with TextFeats preprocessing, use_masked_conv=True, max_epochs=100). Without these parameters, the server falls back to a different preprocessing configuration (trpPreprocPars), which produces significantly worse models. Set noTrainingDefaults=true to send no training parameters and rely on the server defaults instead.
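As a rough illustration of the two modes described above, the following sketch builds the argument object for each. The collection ID, model name, and the choice of an integer for collId are assumptions made for the example; how the arguments are actually submitted depends on the client that calls the tool.

```python
import json

# Hypothetical values for illustration; replace with a real collection ID.
default_args = {
    "collId": 12345,               # assumed to be numeric
    "modelName": "example-model",  # hypothetical model name
    "language": "deu",
}

# Same request, but opting out of the UI-default training parameters,
# so the server applies its own defaults.
server_default_args = {**default_args, "noTrainingDefaults": True}

print(json.dumps(default_args, indent=2))
print(json.dumps(server_default_args, indent=2))
```

Leaving noTrainingDefaults out entirely keeps the UI-matching defaults in effect, which is the behaviour the instructions recommend.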
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| collId | Yes | Collection ID | |
| modelName | No | Name for the new model | |
| description | No | Description of the model | |
| baseModelId | No | Base model ID for transfer learning | |
| provider | No | Training provider (default: "PyLaia") | PyLaia |
| language | No | Language code (e.g. "rus", "deu", "eng") | |
| trainList | No | Training page list | |
| trainListFile | No | Absolute path to a JSON file containing the training page list as an array of {docId, pageId} objects. Example: /tmp/transkribus-training/train_list.json | |
| testList | No | Test page list | |
| testListFile | No | Absolute path to a JSON file containing the test page list as an array of {docId, pageId} objects. Example: /tmp/transkribus-training/test_list.json | |
| omitLinesByTag | No | Tags of lines to omit from training | |
| reverseText | No | Whether to reverse text direction | |
| imgType | No | Image type | |
| customAbbrevsTraining | No | Enable custom abbreviations training | |
| customTagTraining | No | Enable custom tag training | |
| trainProperties | No | Enable training properties | |
| textFeatsCfg | No | TextFeats preprocessing config override. Merged with UI defaults (normheight=64, deslope/deslant=true, enh=true). Only specify fields you want to change. | |
| createModelPars | No | Model architecture parameters as key-value pairs (e.g. {"--rnn_units": "512"}). Merged with UI defaults (use_masked_conv=True, cnn_poolsize="2 2 0 2", etc.). Only specify parameters you want to override. | |
| trainCtcPars | No | CTC training parameters as key-value pairs (e.g. {"--max_epochs": "200"}). Merged with UI defaults (max_epochs=100, learning_rate=3.0E-4, batch_size=24, etc.). Only specify parameters you want to override. | |
| max_epochs | No | Maximum training epochs. Shortcut for trainCtcPars --max_epochs. | 100 |
| max_nondecreasing_epochs | No | Early stopping patience. Shortcut for trainCtcPars --max_nondecreasing_epochs. | 20 |
| learning_rate | No | Learning rate. Shortcut for trainCtcPars --learning_rate. | 3.0E-4 |
| batch_size | No | Batch size. Shortcut for trainCtcPars --batch_size. | 24 |
| noTrainingDefaults | No | If true, do NOT apply UI-default training parameters. The server will use its own defaults (which differ from the UI and may produce worse models). | |
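The merge behaviour of trainCtcPars and the format of the page list files can be sketched as follows. This is an illustration under stated assumptions, not the tool's implementation: the helper names are hypothetical, the defaults dictionary contains only the four values named in the table (the real default set has more keys), the defaults are assumed to use the same "--flag": "string" format as the override examples, and the table does not say whether a shortcut or an explicit trainCtcPars entry wins when both set the same key, so the sketch simply applies shortcuts last.

```python
import json
import tempfile
from pathlib import Path

# Only the defaults named in the table; assumed key/value format.
UI_TRAIN_CTC_DEFAULTS = {
    "--max_epochs": "100",
    "--max_nondecreasing_epochs": "20",
    "--learning_rate": "3.0E-4",
    "--batch_size": "24",
}


def merge_train_ctc_pars(overrides=None, **shortcuts):
    """Hypothetical helper: overlay overrides, then shortcuts, on the defaults."""
    merged = dict(UI_TRAIN_CTC_DEFAULTS)
    merged.update(overrides or {})
    # Assumption: shortcuts such as batch_size map to "--batch_size" and win.
    for name, value in shortcuts.items():
        if value is not None:
            merged[f"--{name}"] = str(value)
    return merged


def write_page_list(path, pages):
    """Write (docId, pageId) pairs as a JSON array of {docId, pageId} objects."""
    path = Path(path)
    entries = [{"docId": doc_id, "pageId": page_id} for doc_id, page_id in pages]
    path.write_text(json.dumps(entries, indent=2), encoding="utf-8")
    return str(path.resolve())  # the tool expects an absolute path


# Override one parameter explicitly and one through a shortcut.
pars = merge_train_ctc_pars({"--max_epochs": "200"}, batch_size=16)

# Hypothetical document and page IDs, written to a temporary directory.
train_file = write_page_list(
    Path(tempfile.mkdtemp()) / "train_list.json",
    [(1001, 1), (1001, 2)],
)
```

Only the overridden keys change; every other default is carried over untouched, which is why the table advises specifying only the parameters you want to change. The returned path could then be passed as trainListFile.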