gemini-transcribe-audio
Transcribe audio files to text using Gemini's multimodal capabilities. Supports MP3, WAV, FLAC, AAC, OGG, and WEBM input, with optional language and context hints to improve accuracy.
Instructions
Transcribe audio files to text using Gemini's multimodal capabilities (with learned user preferences)
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| context | No | Optional context for intelligent enhancement (e.g., "medical", "legal", "technical") | |
| file_path | Yes | Path to the audio file to transcribe (supports MP3, WAV, FLAC, AAC, OGG, WEBM) | |
| language | No | Optional language hint for better transcription accuracy (e.g., "en", "es", "fr") | |
| preserve_spelled_acronyms | No | Keep spelled-out letters (U-R-L) instead of converting to acronyms (URL) | |
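
As an illustration of how the parameters above fit together, here is a minimal Python sketch that assembles the arguments and wraps them in a tool-call request. It rests on several assumptions that the schema does not confirm: that the tool is exposed through a Model Context Protocol server (so the request uses the JSON-RPC `tools/call` method), that `preserve_spelled_acronyms` is a boolean, and that checking the file extension on the client side is useful. The helper names `build_transcribe_arguments` and `build_tool_call` are hypothetical and exist only for this example.

```python
import json
from pathlib import Path

# Formats listed in the file_path description of the input schema.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".flac", ".aac", ".ogg", ".webm"}


def build_transcribe_arguments(file_path, language=None, context=None,
                               preserve_spelled_acronyms=None):
    """Assemble the arguments object, omitting optional fields left unset.

    Hypothetical helper: the extension check is a client-side convenience,
    not something the tool is documented to require.
    """
    if not file_path:
        raise ValueError("file_path is required")
    extension = Path(file_path).suffix.lower()
    if extension not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"unsupported audio format: {extension or '(none)'}")

    arguments = {"file_path": file_path}
    if language is not None:
        arguments["language"] = language  # e.g. "en", "es", "fr"
    if context is not None:
        arguments["context"] = context  # e.g. "medical", "legal", "technical"
    if preserve_spelled_acronyms is not None:
        # Assumed to be a boolean; the schema does not state the type.
        arguments["preserve_spelled_acronyms"] = bool(preserve_spelled_acronyms)
    return arguments


def build_tool_call(arguments, request_id=1):
    """Wrap the arguments in a JSON-RPC request, assuming an MCP server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "gemini-transcribe-audio", "arguments": arguments},
    }


if __name__ == "__main__":
    args = build_transcribe_arguments(
        "recordings/consult.wav", language="en", context="medical"
    )
    print(json.dumps(build_tool_call(args), indent=2))
```

Leaving unset optional fields out of the payload, rather than sending them as null, lets the server apply whatever defaults it has; the table above does not list any.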