transcribe_base64
Transcribe base64-encoded audio into text using Whisper models, with automatic language detection and optional GPT post-processing for spelling and grammar corrections.
Instructions
Transcribe audio provided as a base64-encoded string.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| audio_base64 | Yes | Base64-encoded audio data. | |
| extension | No | File extension for the temp file (mp3, wav, m4a, ogg, etc.). | mp3 |
| language | No | Language code. Auto-detected if not provided. | |
| model_size | No | Local model size. Ignored when using the OpenAI backend. | |
| post_process | No | If True, passes the transcription through GPT to fix spelling, grammar, and punctuation. Requires the openai package. | |
| post_process_prompt | No | Custom system prompt for post-processing. Use this to provide domain-specific context, proper nouns, or product names. | |
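
The parameters above can be assembled into an arguments object along these lines. This is a minimal sketch, not the tool's own client code: `build_transcribe_args` is a hypothetical helper introduced for illustration, the audio bytes are a placeholder, and how the resulting object is sent to the tool depends on the client in use.

```python
import base64
import json


def build_transcribe_args(audio_bytes, extension="mp3", language=None,
                          post_process=False, post_process_prompt=None):
    """Hypothetical helper: build an arguments dict for transcribe_base64."""
    args = {
        # The tool expects the audio as a base64-encoded string.
        "audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
        # Extension used for the temp file; the table lists mp3 as the default.
        "extension": extension,
    }
    # Leave language out so the tool auto-detects it.
    if language is not None:
        args["language"] = language
    # Only include post-processing fields when post-processing is requested.
    if post_process:
        args["post_process"] = True
        if post_process_prompt:
            args["post_process_prompt"] = post_process_prompt
    return args


# Placeholder bytes stand in for real audio; in practice, read them from a file.
example_args = build_transcribe_args(b"\x00\x01fake-audio", extension="wav", language="en")
print(json.dumps(example_args, indent=2))
```

Optional fields are omitted rather than sent as empty values, so the tool's own defaults and language auto-detection apply.
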
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |