transcribe_audio
Transcribe audio or video files locally using mlx-whisper on Apple Silicon. Automatically splits long files into 5-minute chunks, returns timestamped segments, and saves an SRT file for DaVinci Resolve import.
Instructions
Transcribe an audio or video file locally using mlx-whisper (Apple Silicon). Long files are automatically split into 5-minute chunks so that transcription does not time out.
Returns all segments with inline timestamps in a compact format, and saves an SRT file next to the source file for import into DaVinci Resolve.
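To make the chunking and SRT steps concrete, here is a minimal Python sketch. It is not the tool's actual implementation: the helper names (`plan_chunks`, `shift_segments`, `to_srt`) are hypothetical, and it assumes each chunk is transcribed separately into Whisper-style segments with `start`, `end`, and `text` fields, with timestamps relative to the start of the chunk.

```python
from typing import Dict, List, Tuple

CHUNK_SECONDS = 300  # 5-minute chunks, as described above


def plan_chunks(duration: float, chunk_seconds: float = CHUNK_SECONDS) -> List[Tuple[float, float]]:
    """Return (start, end) windows, in seconds, that cover the whole file."""
    chunks = []
    start = 0.0
    while start < duration:
        end = min(start + chunk_seconds, duration)
        chunks.append((start, end))
        start = end
    return chunks


def shift_segments(segments: List[Dict], offset: float) -> List[Dict]:
    """Move chunk-relative timestamps onto the timeline of the full file."""
    return [
        {**seg, "start": seg["start"] + offset, "end": seg["end"] + offset}
        for seg in segments
    ]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm timestamp used in SRT files."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: List[Dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        time_line = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{time_line}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # A 12-minute file yields two full chunks and one shorter final chunk.
    print(plan_chunks(720.0))
    # A segment from the second chunk, shifted by that chunk's start offset.
    second_chunk = shift_segments([{"start": 1.5, "end": 4.0, "text": " hello "}], 300.0)
    print(to_srt(second_chunk))
```

Shifting each chunk's segments by the chunk's start offset keeps the SRT cues aligned with the original timeline, which matters when the subtitle file is imported alongside the source clip.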
Parameters:
file_path: Absolute path to audio/video file (mp3, wav, m4a, mp4, mov, etc.)
model: Model size: "tiny" (fastest), "base", "small", "medium", "large" (most accurate), or "turbo" (best speed/quality trade-off, default). A full HuggingFace repo path is also accepted.
language: Language code (e.g. "en", "fr", "de", "ja"). None = auto-detect.
word_timestamps: Include word-level timestamps in output.
initial_prompt: Optional text to guide the model's vocabulary/style.
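As an illustration of how these parameters fit together, the following Python sketch assembles an argument payload for the tool. The `build_arguments` helper and its validation rules are hypothetical and not part of the tool. It assumes that a model value containing `/` is a HuggingFace repo path, and that optional parameters can be omitted so the tool applies its own defaults.

```python
import json
from pathlib import PurePosixPath
from typing import Any, Dict, Optional

# Short model names listed in the parameter description above.
SHORT_MODEL_NAMES = {"tiny", "base", "small", "medium", "large", "turbo"}


def build_arguments(
    file_path: str,
    model: str = "turbo",
    language: Optional[str] = None,
    word_timestamps: Optional[bool] = None,
    initial_prompt: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble arguments for transcribe_audio (hypothetical helper)."""
    # file_path must be absolute; macOS paths follow POSIX rules.
    if not PurePosixPath(file_path).is_absolute():
        raise ValueError(f"file_path must be absolute: {file_path!r}")
    # Assumption: anything containing "/" is treated as a HuggingFace repo path.
    if model not in SHORT_MODEL_NAMES and "/" not in model:
        raise ValueError(f"unknown model: {model!r}")

    arguments: Dict[str, Any] = {"file_path": file_path, "model": model}
    optional = {
        "language": language,
        "word_timestamps": word_timestamps,
        "initial_prompt": initial_prompt,
    }
    # Leave out unset options so the tool falls back to its own defaults
    # (for example, language auto-detection when no language is given).
    arguments.update({key: value for key, value in optional.items() if value is not None})
    return arguments


if __name__ == "__main__":
    # The path and prompt below are made-up example values.
    args = build_arguments(
        "/Users/me/footage/interview.mov",
        language="en",
        initial_prompt="DaVinci Resolve, Fusion, Fairlight",
    )
    print(json.dumps(args, indent=2))
```

Passing product names or jargon through `initial_prompt`, as in the usage example, is one way to guide the model's vocabulary toward terms it might otherwise misspell.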
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | | |
| model | No | | turbo |
| language | No | | |
| word_timestamps | No | | |
| initial_prompt | No | | |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |