async_text_to_speech
Convert text to speech with customizable speaker, audio format, speed, and pitch settings. Save synthesized audio files to a specified directory or default location.
Instructions
Async version of text_to_speech. Convert text to speech with a given speaker and save the output audio file to a given directory. The directory is optional; if it is not provided, the output file is saved to $HOME/Desktop. Choose a speaker with the speaker parameter; if it is not provided, the default speaker (xiaoyi_meet) is used.
⚠️ COST WARNING: This tool makes an API call to the Mobvoi TTS service, which may incur costs. Only use it when explicitly requested by the user.
Args:
text (str): The text to convert to speech.
speaker (str): Determine which speaker's voice is used to synthesize the audio.
audio_type (str): Determine the format of the synthesized audio. Value can be chosen from [pcm/mp3/speex-wb-10/wav].
speed (float): Control the speed of the synthesized audio. Values range from 0.5 to 2.0, with 1.0 being the default speed. Lower values create slower, more deliberate speech while higher values produce faster-paced speech. Extreme values can impact the quality of the generated speech; 0.7 to 1.2 is the conservative range.
rate (int): Control the sampling rate of the synthesized audio. Value can be chosen from [8000/16000/24000], with 24000 being the default rate.
volume (float): Control the volume of the synthesized audio. Values range from 0.1 to 1.0, with 1.0 being the default volume.
pitch (float): Control the pitch of the synthesized audio. Values range from -10 to 10, with 0 being the default pitch. Values below 0 lower the pitch; values above 0 raise it.
streaming (bool): Whether to output in a streaming manner. The default value is false.
output_directory (str): Directory where files should be saved.
Defaults to $HOME/Desktop if not provided.
Returns:
Text content with the path to the output file and name of the speaker used.
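The limits listed under Args can be checked on the client side before a billable call is made. The following is a minimal sketch, not part of the tool: `build_tts_arguments` is a hypothetical helper, it assumes the 0.5 to 2.0 range for speed, and it resolves a missing output directory to the documented `$HOME/Desktop` location.

```python
from pathlib import Path

# Allowed values copied from the Args description; the helper is hypothetical.
AUDIO_TYPES = {"pcm", "mp3", "speex-wb-10", "wav"}
SAMPLE_RATES = {8000, 16000, 24000}


def build_tts_arguments(text, speaker="xiaoyi_meet", audio_type="mp3",
                        speed=1.0, rate=24000, volume=1.0, pitch=0.0,
                        streaming=False, output_directory=None):
    """Validate values against the documented limits and return an arguments dict."""
    if not text:
        raise ValueError("text is required")
    if audio_type not in AUDIO_TYPES:
        raise ValueError(f"audio_type must be one of {sorted(AUDIO_TYPES)}")
    if not 0.5 <= speed <= 2.0:  # assumption: the 0.5 to 2.0 range applies
        raise ValueError("speed must be between 0.5 and 2.0")
    if rate not in SAMPLE_RATES:
        raise ValueError(f"rate must be one of {sorted(SAMPLE_RATES)}")
    if not 0.1 <= volume <= 1.0:
        raise ValueError("volume must be between 0.1 and 1.0")
    if not -10 <= pitch <= 10:
        raise ValueError("pitch must be between -10 and 10")
    if output_directory is None:
        # Mirror the documented default location.
        output_directory = str(Path.home() / "Desktop")
    return {
        "text": text,
        "speaker": speaker,
        "audio_type": audio_type,
        "speed": speed,
        "rate": rate,
        "volume": volume,
        "pitch": pitch,
        "streaming": streaming,
        "output_directory": output_directory,
    }
```

Rejecting out-of-range values locally keeps an invalid request from reaching the paid service at all.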
Input Schema
Name | Required | Description | Default |
---|---|---|---|
audio_type | No | | mp3 |
output_directory | No | | |
pitch | No | | 0 |
rate | No | | 24000 |
speaker | No | | xiaoyi_meet |
speed | No | | 1 |
streaming | No | | false |
text | Yes | | |
volume | No | | 1 |
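Because only `text` is required, a minimal call carries a single argument. The sketch below assumes the tool is served over the Model Context Protocol, which this page does not state explicitly, and builds the JSON-RPC `tools/call` request a client would send. `make_tool_call` is a hypothetical helper name.

```python
import json


def make_tool_call(arguments, request_id=1):
    """Build a JSON-RPC 2.0 request for an MCP tools/call (assumed transport)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "async_text_to_speech",
            "arguments": arguments,
        },
    }


# Only the required field is supplied; the server fills in the rest.
request = make_tool_call({"text": "Hello, world."})
print(json.dumps(request, indent=2))
```

Optional parameters such as `speaker` or `speed` go in the same `arguments` object.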
Input Schema (JSON Schema)
{
  "properties": {
    "audio_type": {
      "default": "mp3",
      "title": "Audio Type",
      "type": "string"
    },
    "output_directory": {
      "anyOf": [
        {
          "type": "string"
        },
        {
          "type": "null"
        }
      ],
      "default": null,
      "title": "Output Directory"
    },
    "pitch": {
      "default": 0,
      "title": "Pitch",
      "type": "number"
    },
    "rate": {
      "default": 24000,
      "title": "Rate",
      "type": "integer"
    },
    "speaker": {
      "default": "xiaoyi_meet",
      "title": "Speaker",
      "type": "string"
    },
    "speed": {
      "default": 1,
      "title": "Speed",
      "type": "number"
    },
    "streaming": {
      "default": false,
      "title": "Streaming",
      "type": "boolean"
    },
    "text": {
      "title": "Text",
      "type": "string"
    },
    "volume": {
      "default": 1,
      "title": "Volume",
      "type": "number"
    }
  },
  "required": [
    "text"
  ],
  "title": "async_text_to_speechArguments",
  "type": "object"
}
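To see what a call resolves to when optional fields are omitted, the defaults in the schema can be merged with the caller's arguments. This is a rough sketch using only the standard library: `SCHEMA` is a trimmed copy of the schema above, `apply_defaults` is a hypothetical helper, and the server's own handling may differ.

```python
# Trimmed copy of the schema above: only the fields this sketch reads.
SCHEMA = {
    "properties": {
        "audio_type": {"default": "mp3"},
        "output_directory": {"default": None},
        "pitch": {"default": 0},
        "rate": {"default": 24000},
        "speaker": {"default": "xiaoyi_meet"},
        "speed": {"default": 1},
        "streaming": {"default": False},
        "text": {},
        "volume": {"default": 1},
    },
    "required": ["text"],
}


def apply_defaults(arguments, schema=SCHEMA):
    """Check required fields, then fill every omitted field that has a default."""
    missing = [name for name in schema["required"] if name not in arguments]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    resolved = {
        name: spec["default"]
        for name, spec in schema["properties"].items()
        if "default" in spec
    }
    resolved.update(arguments)  # caller-supplied values override defaults
    return resolved


resolved = apply_defaults({"text": "Hello", "speed": 1.2})
print(resolved)
```

Here `speed` keeps the caller's value, while `rate`, `speaker`, and the other omitted fields take their schema defaults.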