mcp_openrouter_analyze_audio

Transcribes audio files from URLs or local paths to extract their text content. Supports wav and mp3 formats and uses OpenRouter models for speech-to-text conversion.

Instructions

Transcribe audio files and provide raw content. Supports wav/mp3 files from CDN URLs or local paths.

Input Schema

  • audio_url (required): Path or URL to the audio file (supports CDN URLs, local file paths, wav/mp3 formats)
  • model (optional): OpenRouter model to use (e.g., "mistralai/voxtral-small-24b-2507", "openai/gpt-4o-audio-preview"); if omitted, the server's default audio model is used
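
For reference, here is a minimal sketch of calling the tool from an MCP client using the MCP TypeScript SDK. The launch command, script path, audio URL, and model below are placeholders for illustration, not values taken from this server's documentation.

    import { Client } from '@modelcontextprotocol/sdk/client/index.js';
    import { StdioClientTransport } from '@modelcontextprotocol/sdk/client/stdio.js';

    async function main() {
      // Launch the server over stdio (command and args are placeholders).
      const transport = new StdioClientTransport({
        command: 'node',
        args: ['build/index.js'],
      });

      const client = new Client({ name: 'example-client', version: '1.0.0' }, { capabilities: {} });
      await client.connect(transport);

      // Call the tool with arguments matching the input schema above.
      const result = await client.callTool({
        name: 'mcp_openrouter_analyze_audio',
        arguments: {
          audio_url: 'https://example.com/sample.mp3', // CDN URL or local path
          model: 'mistralai/voxtral-small-24b-2507',   // optional override
        },
      });

      // The transcription is returned as a JSON string in the first content item.
      console.log(JSON.stringify(result, null, 2));
      await client.close();
    }

    main().catch(console.error);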

Implementation Reference

  • Core handler function for the mcp_openrouter_analyze_audio tool. It fetches the audio from a URL or local path, encodes it to base64, detects the format (wav/mp3), builds an OpenAI-style chat completion request with input_audio content, calls the selected OpenRouter model (falling back first to a configured backup model and then to a free audio-capable model), and returns a JSON-structured analysis result or error (a sketch of consuming that result appears after this list).
    export async function handleAnalyzeAudio(
      request: { params: { arguments: AnalyzeAudioToolRequest } },
      openai: OpenAI,
      defaultModel?: string
    ): Promise<any> {
      try {
        const args = request.params.arguments;

        // Validate input
        if (!args.audio_url) {
          throw new McpError(
            ErrorCode.InvalidParams,
            'audio_url parameter is required'
          );
        }

        // Fetch audio
        const buffer = await fetchAudio(args.audio_url);
        const format = detectAudioFormat(buffer, args.audio_url);
        const base64 = buffer.toString('base64');

        // Build content array
        const content: Array<{
          type: string;
          text?: string;
          input_audio?: {
            data: string;
            format: string;
          };
        }> = [];

        // Add fixed transcription instruction
        content.push({
          type: 'text',
          text: 'Please transcribe and provide me the raw content of this audio.'
        });

        // Add audio
        content.push({
          type: 'input_audio',
          input_audio: {
            data: base64,
            format: format
          }
        });

        // Select model
        let model = args.model || defaultModel || DEFAULT_AUDIO_MODEL;
        console.error(`[Audio Tool] Using AUDIO model: ${model}`);

        // Try primary model first
        try {
          const completion = await openai.chat.completions.create({
            model,
            messages: [{ role: 'user', content }] as any
          });
          const response = completion as any;
          return {
            content: [
              {
                type: 'text',
                text: JSON.stringify({
                  id: response.id,
                  analysis: completion.choices[0].message.content || '',
                  model: response.model,
                  usage: response.usage
                }),
              },
            ],
          };
        } catch (primaryError: any) {
          // Try backup model
          const backupModel = process.env.OPENROUTER_DEFAULT_MODEL_AUDIO_BACKUP;
          if (backupModel && backupModel !== model) {
            try {
              const completion = await openai.chat.completions.create({
                model: backupModel,
                messages: [{ role: 'user', content }] as any
              });
              const resp = completion as any;
              return {
                content: [
                  {
                    type: 'text',
                    text: JSON.stringify({
                      id: resp.id,
                      analysis: completion.choices[0].message.content || '',
                      model: resp.model,
                      usage: resp.usage
                    }),
                  },
                ],
              };
            } catch (backupError: any) {
              // Try free audio model
              const freeModel = await findSuitableFreeAudioModel(openai);
              if (freeModel && freeModel !== model && freeModel !== backupModel) {
                const completion = await openai.chat.completions.create({
                  model: freeModel,
                  messages: [{ role: 'user', content }] as any
                });
                const resp = completion as any;
                return {
                  content: [
                    {
                      type: 'text',
                      text: JSON.stringify({
                        id: resp.id,
                        analysis: completion.choices[0].message.content || '',
                        model: resp.model,
                        usage: resp.usage
                      }),
                    },
                  ],
                };
              } else {
                throw backupError;
              }
            }
          } else {
            // No backup, try free model directly
            const freeModel = await findSuitableFreeAudioModel(openai);
            if (freeModel && freeModel !== model) {
              const completion = await openai.chat.completions.create({
                model: freeModel,
                messages: [{ role: 'user', content }] as any
              });
              const resp = completion as any;
              return {
                content: [
                  {
                    type: 'text',
                    text: JSON.stringify({
                      id: resp.id,
                      analysis: completion.choices[0].message.content || '',
                      model: resp.model,
                      usage: resp.usage
                    }),
                  },
                ],
              };
            } else {
              throw primaryError;
            }
          }
        }
      } catch (error) {
        if (error instanceof McpError) {
          throw error;
        }
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify({
                error: error instanceof Error ? error.message : String(error),
                model: request.params.arguments.model || defaultModel || DEFAULT_AUDIO_MODEL,
                usage: { prompt_tokens: 0, completion_tokens: 0, total_tokens: 0 }
              }),
            },
          ],
          isError: true,
        };
      }
    }
  • Registers the mcp_openrouter_analyze_audio tool in the MCP server's listTools handler, defining its name, description, and input schema.
    {
      name: 'mcp_openrouter_analyze_audio',
      description: 'Transcribe audio files and provide raw content. Supports wav/mp3 files from CDN URLs or local paths.',
      inputSchema: {
        type: 'object',
        properties: {
          audio_url: {
            type: 'string',
            description: 'Path or URL to the audio file (supports CDN URLs, local file paths, wav/mp3 formats)',
          },
          model: {
            type: 'string',
            description: 'OpenRouter model to use (e.g., "mistralai/voxtral-small-24b-2507", "openai/gpt-4o-audio-preview")',
          },
        },
        required: ['audio_url'],
      },
    },
  • Switch case in the CallToolRequestSchema handler that routes 'mcp_openrouter_analyze_audio' calls to the handleAnalyzeAudio function.
    case 'mcp_openrouter_analyze_audio':
      return handleAnalyzeAudio(
        { params: { arguments: request.params.arguments as unknown as AnalyzeAudioToolRequest } },
        this.openai,
        this.defaultAudioModel
      );
  • TypeScript interface defining the input parameters for the analyze audio tool.
    export interface AnalyzeAudioToolRequest {
      audio_url: string;
      model?: string;
    }
  • Helper function that fetches audio data from a remote URL or a local file path and returns it as a Buffer (illustrative sketches of the isUrl, normalizePath, and detectAudioFormat helpers it and the handler rely on appear after this list).
    async function fetchAudio(audioPath: string): Promise<Buffer> {
      if (isUrl(audioPath)) {
        // Fetch from URL
        const response = await fetch(audioPath);
        if (!response.ok) {
          throw new Error(`Failed to fetch audio: ${response.statusText}`);
        }
        const arrayBuffer = await response.arrayBuffer();
        return Buffer.from(arrayBuffer);
      } else {
        // Read from local file
        const normalizedPath = normalizePath(audioPath);
        let resolvedPath = normalizedPath;
        if (!path.isAbsolute(resolvedPath)) {
          resolvedPath = path.resolve(process.cwd(), resolvedPath);
        }
        return await fs.readFile(resolvedPath);
      }
    }
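
The excerpts above also call isUrl, normalizePath, detectAudioFormat, and findSuitableFreeAudioModel, which are not reproduced on this page. Below is an illustrative sketch of what the first three could look like, assuming format detection by magic bytes with a file-extension fallback; it is not the server's actual implementation.

    import path from 'path';

    // Illustrative only: the real helpers in this server may differ.
    function isUrl(value: string): boolean {
      return /^https?:\/\//i.test(value);
    }

    function normalizePath(value: string): string {
      // Strip an optional file:// prefix and normalize separators.
      return path.normalize(value.replace(/^file:\/\//, ''));
    }

    function detectAudioFormat(buffer: Buffer, audioPath: string): string {
      // WAV files begin with "RIFF....WAVE".
      if (
        buffer.length >= 12 &&
        buffer.toString('ascii', 0, 4) === 'RIFF' &&
        buffer.toString('ascii', 8, 12) === 'WAVE'
      ) {
        return 'wav';
      }
      // MP3 files begin with an ID3 tag or an MPEG frame sync (0xFF 0xEx).
      if (
        buffer.length >= 3 &&
        (buffer.toString('ascii', 0, 3) === 'ID3' ||
          (buffer[0] === 0xff && (buffer[1] & 0xe0) === 0xe0))
      ) {
        return 'mp3';
      }
      // Fall back to the file extension.
      return path.extname(audioPath).toLowerCase() === '.wav' ? 'wav' : 'mp3';
    }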
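
As noted for the core handler, the result is serialized as a JSON string inside the first content item, carrying { id, analysis, model, usage } on success and { error, model, usage } plus isError on failure. A minimal sketch of unpacking that payload on the caller side; the helper name is illustrative:

    // Illustrative helper: extract the transcription text from a tool result
    // shaped like the one produced by handleAnalyzeAudio above.
    interface AudioToolResult {
      content: Array<{ type: string; text?: string }>;
      isError?: boolean;
    }

    function extractTranscription(result: AudioToolResult): string {
      const payload = JSON.parse(result.content[0]?.text ?? '{}');
      if (result.isError || payload.error) {
        throw new Error(`Audio analysis failed: ${payload.error ?? 'unknown error'}`);
      }
      // On success the payload carries { id, analysis, model, usage }.
      return payload.analysis as string;
    }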
