predict_duration
Predict audio duration in seconds for a given text to estimate credit cost before synthesis. Accepts text and optional voice parameters, applying the same limits as speech generation.
Instructions
Predict the expected output audio duration in seconds for a given text WITHOUT producing any audio file. Accepts the same parameters as text_to_speech and applies the same 300-character limit. Use this to estimate credit cost before synthesizing; credit usage is proportional to the predicted duration.
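As a rough illustration of the workflow described above, the sketch below checks text against the 300-character limit before calling the tool and converts a predicted duration into a credit estimate. The helper names `check_text` and `estimate_credits` are hypothetical, and `credits_per_second` is an assumed input: the source only states that credit usage is proportional to duration, not what the rate is.

```python
MAX_TEXT_CHARS = 300  # limit stated in the tool description


def check_text(text: str) -> str:
    """Return text unchanged, or raise ValueError if it exceeds the limit.

    Hypothetical client-side pre-check; the tool itself enforces the limit.
    """
    if len(text) > MAX_TEXT_CHARS:
        raise ValueError(
            f"text has {len(text)} characters; the limit is {MAX_TEXT_CHARS}"
        )
    return text


def estimate_credits(predicted_seconds: float, credits_per_second: float) -> float:
    """Estimate credit cost as duration times a rate.

    credits_per_second is an assumption supplied by the caller; the actual
    rate is not given in this document.
    """
    if predicted_seconds < 0:
        raise ValueError("predicted_seconds must be non-negative")
    return predicted_seconds * credits_per_second
```

For example, if predict_duration reports 4.0 seconds and the caller assumes a rate of 2.5 credits per second, `estimate_credits(4.0, 2.5)` gives 10.0.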
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | | |
| voice_id | No | | |
| language | No | | |
| output_format | No | | |
| model | No | | |
| speed | No | | |
| pitch_shift | No | | |
| style | No | | |
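The schema above can be mirrored by a small argument builder: `text` is always included, and optional parameters are sent only when set. This is a minimal sketch; `build_arguments` is a hypothetical helper, and because the schema does not document value types or defaults, the values are passed through untouched.

```python
from typing import Any, Dict

# Optional parameter names, taken from the Input Schema table.
OPTIONAL_FIELDS = (
    "voice_id",
    "language",
    "output_format",
    "model",
    "speed",
    "pitch_shift",
    "style",
)


def build_arguments(text: str, **options: Any) -> Dict[str, Any]:
    """Build an arguments dict for predict_duration (hypothetical helper).

    Rejects names that are not in the schema and omits options set to None.
    """
    unknown = sorted(set(options) - set(OPTIONAL_FIELDS))
    if unknown:
        raise TypeError(f"unknown parameter(s): {', '.join(unknown)}")
    arguments: Dict[str, Any] = {"text": text}
    for name, value in options.items():
        if value is not None:
            arguments[name] = value
    return arguments
```

Since predict_duration accepts the same parameters as text_to_speech, the same dict can be reused for the synthesis call once the estimate is acceptable.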
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |