vocametrix_synthesize_speech_with_timing
Synthesize speech with per-character timing alignment for lip-sync, subtitles, and karaoke. Supports plain text and SSML input.
Instructions
Synthesize speech via ElevenLabs v2 with per-character timing alignment. Returns audio data and a character-level timing map, which is useful for lip-sync, subtitles, and karaoke. Accepts plain text or SSML markup as input.
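As an illustration of how a character-level timing map can drive subtitles or karaoke highlighting, here is a minimal sketch that groups per-character timings into word-level cues. The response format of this tool is not documented here, so the input shape (parallel lists of characters, start times, and end times in seconds) and the helper name `chars_to_word_cues` are assumptions made for the example only.

```python
from typing import Dict, List


def chars_to_word_cues(
    chars: List[str], starts: List[float], ends: List[float]
) -> List[Dict]:
    """Group per-character timings into word-level cues.

    Assumed (hypothetical) input: three parallel lists, one entry per
    character, with start and end times in seconds.
    """
    cues: List[Dict] = []
    word = ""
    word_start = 0.0
    word_end = 0.0
    for ch, start, end in zip(chars, starts, ends):
        if ch.isspace():
            # Whitespace closes the current word, if there is one.
            if word:
                cues.append({"word": word, "start": word_start, "end": word_end})
                word = ""
            continue
        if not word:
            # First character of a new word sets the cue start time.
            word_start = start
        word += ch
        word_end = end
    if word:
        # Flush the final word when the text does not end in whitespace.
        cues.append({"word": word, "start": word_start, "end": word_end})
    return cues


# Made-up sample timings, not real tool output.
chars = list("Hi all")
starts = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
ends = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
cues = chars_to_word_cues(chars, starts, ends)
```

Each cue spans from the start of a word's first character to the end of its last character, which is usually the granularity a subtitle or karaoke renderer needs. If the actual timing map uses different field names or units, only the unpacking step would need to change.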
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | Text or SSML to synthesize (max 2500 characters) | |
| isSSML | No | Set true if input is SSML markup | |
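The schema above can be checked on the client side before the tool is called. The following is a minimal sketch, not part of the tool itself: `build_arguments` is a hypothetical helper, and it assumes that omitting `isSSML` is treated as plain text and that the 2500-character limit applies to the whole input string, SSML tags included.

```python
MAX_CHARS = 2500  # limit stated in the input schema


def build_arguments(text: str, is_ssml: bool = False) -> dict:
    """Build the arguments object for the tool call (hypothetical helper)."""
    if not text:
        raise ValueError("text is required")
    if len(text) > MAX_CHARS:
        raise ValueError(
            f"text is {len(text)} characters; the maximum is {MAX_CHARS}"
        )
    args = {"text": text}
    if is_ssml:
        # Only send the optional flag when the input is SSML (assumption:
        # leaving it out means plain text).
        args["isSSML"] = True
    return args


plain_args = build_arguments("Hello there.")
ssml_args = build_arguments("<speak>Hello <break time='300ms'/> there.</speak>", is_ssml=True)
```

How the resulting dictionary is sent depends on the client in use; it simply mirrors the two parameters in the table.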