# transcribe_audio
Convert audio files to text with a local speech-to-text engine. Specify the file path, and optionally set language or model.
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| file_path | Yes | Absolute path to the audio file to transcribe | |
| language | No | ISO-639-1 language code, e.g. 'en', 'fr'. Omit for auto-detect. | |
| model | No | Whisper model ID to use. | Systran/faster-whisper-large-v3 |
## Implementation Reference
- **src/index.js:25-64 (registration)** — Registers the `transcribe_audio` tool on the MCP server with its schema (`file_path`, `language`, `model`) and a handler that reads an audio file and calls an OpenAI-compatible speech-to-text API.

```js
server.tool(
  "transcribe_audio",
  {
    file_path: z.string().describe("Absolute path to the audio file to transcribe"),
    language: z
      .string()
      .optional()
      .describe("ISO-639-1 language code, e.g. 'en', 'fr'. Omit for auto-detect."),
    model: z
      .string()
      .optional()
      .describe(`Whisper model ID to use. Defaults to ${STT_MODEL}`),
  },
  async ({ file_path, language, model }) => {
    if (!fs.existsSync(file_path)) {
      return {
        content: [{ type: "text", text: `Error: file not found: ${file_path}` }],
        isError: true,
      };
    }

    let transcription;
    try {
      transcription = await client.audio.transcriptions.create({
        file: fs.createReadStream(file_path),
        model: model || STT_MODEL,
        ...(language ? { language } : {}),
      });
    } catch (err) {
      return {
        content: [{ type: "text", text: `Error: transcription failed: ${err.message}` }],
        isError: true,
      };
    }

    return {
      content: [{ type: "text", text: transcription.text }],
    };
  }
);
```

- **src/index.js:27-36 (schema)** — Zod schema defining the input parameters: required `file_path`, optional `language` (ISO-639-1), and optional `model` (Whisper model ID).

```js
{
  file_path: z.string().describe("Absolute path to the audio file to transcribe"),
  language: z
    .string()
    .optional()
    .describe("ISO-639-1 language code, e.g. 'en', 'fr'. Omit for auto-detect."),
  model: z
    .string()
    .optional()
    .describe(`Whisper model ID to use. Defaults to ${STT_MODEL}`),
}
```

- **src/index.js:38-63 (handler)** — Handler that checks file existence, streams the audio file to the OpenAI-compatible STT API via `client.audio.transcriptions.create`, and returns the transcribed text.

```js
async ({ file_path, language, model }) => {
  if (!fs.existsSync(file_path)) {
    return {
      content: [{ type: "text", text: `Error: file not found: ${file_path}` }],
      isError: true,
    };
  }

  let transcription;
  try {
    transcription = await client.audio.transcriptions.create({
      file: fs.createReadStream(file_path),
      model: model || STT_MODEL,
      ...(language ? { language } : {}),
    });
  } catch (err) {
    return {
      content: [{ type: "text", text: `Error: transcription failed: ${err.message}` }],
      isError: true,
    };
  }

  return {
    content: [{ type: "text", text: transcription.text }],
  };
}
```