speech_to_text
Transcribe audio files to text with automatic language detection and speaker diarization. Save transcripts to files or receive text directly.
Instructions
Transcribe speech from an audio file. When save_transcript_to_file=True, the transcript is saved to a file in the output directory (default: $HOME/Desktop). When return_transcript_to_client_directly=True, the text is always returned directly, regardless of output mode.
⚠️ COST WARNING: This tool makes an API call to ElevenLabs, which may incur costs. Only use when explicitly requested by the user.
Args:
input_file_path: Path to the audio file to transcribe
language_code: ISO 639-3 language code for transcription. If not provided, the language will be detected automatically.
diarize: Whether to diarize the audio file. If True, the transcription is annotated with which speaker is speaking.
save_transcript_to_file: Whether to save the transcript to a file.
return_transcript_to_client_directly: Whether to return the transcript to the client directly.
output_directory: Directory where files should be saved (only used when saving files).
Defaults to $HOME/Desktop if not provided.
Returns:
TextContent containing the transcription or MCP resource with transcript data.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
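| input_file_path | Yes | Path to the audio file to transcribe | |
| language_code | No | ISO 639-3 language code for transcription | Detected automatically |
| diarize | No | Whether to annotate which speaker is speaking | |
| save_transcript_to_file | No | Whether to save the transcript to a file | |
| return_transcript_to_client_directly | No | Whether to return the transcript to the client directly | |
| output_directory | No | Directory where files should be saved (only used when saving files) | $HOME/Desktop |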
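
As a rough illustration of how the inputs above fit together, the sketch below builds a JSON-RPC `tools/call` request of the kind an MCP client sends to invoke a tool. The helper name `build_speech_to_text_call`, the `request_id` parameter, and the example file path are hypothetical and not part of this tool; only the argument names come from the schema. Unset options are left out so the server can apply its own defaults.

```python
import json
from typing import Any, Dict, Optional


def build_speech_to_text_call(
    input_file_path: str,
    language_code: Optional[str] = None,
    diarize: Optional[bool] = None,
    save_transcript_to_file: Optional[bool] = None,
    return_transcript_to_client_directly: Optional[bool] = None,
    output_directory: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for speech_to_text (hypothetical helper)."""
    if not input_file_path:
        raise ValueError("input_file_path is required")

    optional = {
        "language_code": language_code,
        "diarize": diarize,
        "save_transcript_to_file": save_transcript_to_file,
        "return_transcript_to_client_directly": return_transcript_to_client_directly,
        "output_directory": output_directory,
    }
    arguments: Dict[str, Any] = {"input_file_path": input_file_path}
    # Omit options the caller did not set, so server-side defaults apply.
    arguments.update({k: v for k, v in optional.items() if v is not None})

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "speech_to_text", "arguments": arguments},
    }


# Example: English audio, speaker labels on, text returned to the client.
request = build_speech_to_text_call(
    "/path/to/meeting.mp3",  # placeholder path
    language_code="eng",
    diarize=True,
    return_transcript_to_client_directly=True,
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library usually assembles this envelope for you; the point here is only which keys belong under `arguments`.
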
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | ||
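
Since the tool may return either text content or an MCP resource, a client has to handle both. The sketch below assumes the result follows the general MCP tool-result shape (a `content` list whose items have `type` `"text"` or `"resource"`); the exact payload this tool produces is not documented here, and `extract_transcript` is a hypothetical helper name.

```python
from typing import Any, Dict, List


def extract_transcript(result: Dict[str, Any]) -> str:
    """Collect transcript text from a tool result's `content` list (assumed shape)."""
    parts: List[str] = []
    for item in result.get("content", []):
        if item.get("type") == "text":
            # Plain text content carries the transcript in "text".
            parts.append(item.get("text", ""))
        elif item.get("type") == "resource":
            # Embedded resources keep their payload under "resource";
            # only text resources have a "text" field.
            text = item.get("resource", {}).get("text")
            if text is not None:
                parts.append(text)
    return "\n".join(parts)


# Illustrative result, not real tool output.
sample = {"content": [{"type": "text", "text": "Hello from the transcript."}]}
print(extract_transcript(sample))
```
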