generate_subtitles
Generate SRT or WebVTT subtitle files from audio or video with automatic language detection and optional English translation.
Instructions
Generate subtitle files for an audio or video file using whisper.cpp. Set language='auto' to detect the spoken language automatically. Set translate_to_english=true to also generate an English translation subtitle file; in that case two files are saved, one in the original language and one in English. Output is available in SRT or WebVTT (VTT) format. To load an SRT file in VLC, use Subtitle → Add Subtitle File. VTT works in web players and HTML5 video. Input can be any standard audio or video format, plus .3gp and .ts.
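To make the difference between the two output formats concrete, here is a minimal Python sketch that renders the same cue list as SRT and as WebVTT. It illustrates the file formats only and is not how whisper.cpp or this tool writes its output; the names `format_timestamp`, `render_cue`, and `render_file` are hypothetical helpers introduced for this example.

```python
from typing import Iterable, Tuple

Cue = Tuple[int, int, str]  # (start_ms, end_ms, text)


def format_timestamp(ms: int, fmt: str = "srt") -> str:
    """Format milliseconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (WebVTT)."""
    if fmt not in ("srt", "vtt"):
        raise ValueError(f"unknown format: {fmt!r}")
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    sep = "," if fmt == "srt" else "."  # the only difference in the timestamp
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}{sep}{millis:03d}"


def render_cue(index: int, start_ms: int, end_ms: int, text: str, fmt: str = "srt") -> str:
    """Render one cue. SRT cues start with a numeric index; it is omitted for VTT here."""
    timing = f"{format_timestamp(start_ms, fmt)} --> {format_timestamp(end_ms, fmt)}"
    if fmt == "srt":
        return f"{index}\n{timing}\n{text}\n"
    return f"{timing}\n{text}\n"


def render_file(cues: Iterable[Cue], fmt: str = "srt") -> str:
    """Join cues with blank lines; WebVTT files begin with a 'WEBVTT' header line."""
    body = "\n".join(
        render_cue(i, start, end, text, fmt)
        for i, (start, end, text) in enumerate(cues, start=1)
    )
    return "WEBVTT\n\n" + body if fmt == "vtt" else body


if __name__ == "__main__":
    sample = [(0, 1500, "Hello"), (1500, 4000, "World")]
    print(render_file(sample, "srt"))
    print(render_file(sample, "vtt"))
```

The two formats carry the same timing and text. SRT uses a comma before the milliseconds and numbers each cue, while WebVTT uses a period and requires the `WEBVTT` header.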
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Absolute Windows path to the file. | |
| language | No | Language code (e.g. ja, es, fr, de) or 'auto' to detect automatically. Defaults to en. | en |
| output_format | No | srt = SubRip subtitle (default, widest compatibility), vtt = WebVTT (web and HTML5 video). | srt |
| translate_to_english | No | Also generate an English translation subtitle file alongside the native language file. Only applies when language is not 'en'. Not available in background mode. | |
| background | No | Run as a detached background job — recommended for files over 10 minutes. Returns a job ID to use with check_progress. translate_to_english is not available in background mode. | |
| threads | No | CPU threads. Defaults to 2 of 2. | |
| temperature | No | Sampling temperature, from 0.0 to 1.0. | 0.0 |
| prompt | No | Prior context string for domain-specific vocabulary or speaker names. | |
| beam_size | No | Beam search width. Higher values are more accurate but slower. | 5 |
| best_of | No | Number of candidate sequences evaluated. | 5 |
| diarize | No | Stereo speaker diarization. Requires stereo audio. | |
| vad_model | No | Path to Silero VAD model .bin. Strips silence before transcription. | |
| gpu_device | No | GPU/Vulkan device index for multi-GPU systems. Overrides the WHISPER_GPU_DEVICE env default. Check whisper-cli's startup log for the index that lists your target card. | |
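The constraints in the table can be checked before a call is made. The sketch below builds an argument dictionary for `generate_subtitles` and enforces only the rules stated in the schema: the two output formats, the 0.0 to 1.0 temperature range, and the fact that translate_to_english is unavailable in background mode and only applies when language is not 'en'. The helper name `build_subtitle_args` and the example file path are hypothetical, and how the arguments are sent to the tool depends on your MCP client.

```python
from typing import Any, Dict

VALID_FORMATS = {"srt", "vtt"}


def build_subtitle_args(
    file_path: str,
    language: str = "en",
    output_format: str = "srt",
    translate_to_english: bool = False,
    background: bool = False,
    **extra: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble and sanity-check generate_subtitles arguments."""
    if output_format not in VALID_FORMATS:
        raise ValueError("output_format must be 'srt' or 'vtt'")
    if translate_to_english and background:
        # The schema states translation is not available in background mode.
        raise ValueError("translate_to_english is not available in background mode")
    if translate_to_english and language == "en":
        # Translation only applies when language is not 'en', so drop the flag.
        translate_to_english = False
    temperature = extra.get("temperature")
    if temperature is not None and not 0.0 <= temperature <= 1.0:
        raise ValueError("temperature must be between 0.0 and 1.0")

    args: Dict[str, Any] = {
        "file_path": file_path,
        "language": language,
        "output_format": output_format,
    }
    if translate_to_english:
        args["translate_to_english"] = True
    if background:
        args["background"] = True
    args.update(extra)  # optional tuning parameters such as beam_size or prompt
    return args


if __name__ == "__main__":
    # Hypothetical path, for illustration only.
    print(build_subtitle_args(r"C:\media\talk.mp4", language="auto",
                              translate_to_english=True, beam_size=5))
```

Omitted optional parameters are left out of the dictionary so that the tool's own defaults apply.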