transcribe_speech

Convert speech audio to text with automatic language detection for WAV, MP3, WEBM, and OGG files. Returns the transcription and the detected language.

Instructions

Transcribe speech audio into text.

Supports multiple languages with automatic language detection. Returns the transcription text and detected language.
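The tool returns its result as a dict containing the transcription text and the detected language, but the exact JSON field names are not documented on this page. A minimal consumer sketch, assuming hypothetical keys "text" and "language":

```python
# Sketch of consuming a transcribe_speech result. The field names "text"
# and "language" are assumptions for illustration; the source states only
# that a transcription and a detected language are returned.
def summarize_result(result: dict) -> str:
    text = result.get("text", "")
    language = result.get("language", "unknown")
    return f"[{language}] {text}"
```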

Input Schema

Name          Required  Description                                                          Default
audio_base64  Yes       Base64-encoded audio to transcribe (WAV, MP3, WEBM, OGG)
language      No        Optional language hint, e.g. 'en', 'pt'. Auto-detected if omitted.   None
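A caller must base64-encode the raw audio bytes before invoking the tool. A minimal sketch of building the request input, mirroring the payload logic in the handler below (the helper name `build_payload` is illustrative, not part of the server):

```python
import base64
from typing import Optional

def build_payload(audio: bytes, language: Optional[str] = None) -> dict:
    """Build the transcribe_speech input: base64-encode the audio bytes
    and attach the language hint only when one is given."""
    payload = {"audio_base64": base64.b64encode(audio).decode("ascii")}
    if language:
        payload["language"] = language
    return payload
```

For example, `build_payload(open("clip.wav", "rb").read(), language="pt")` would produce the two-field input shown in the schema above.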

Implementation Reference

  • server.py:74-91 (handler)
    The main handler function for the transcribe_speech tool. It accepts base64-encoded audio and an optional language hint, then makes an HTTP POST request to the Brainiall API's /v1/stt/transcribe endpoint to perform speech-to-text transcription.
    @mcp.tool()
    async def transcribe_speech(
        audio_base64: Annotated[str, "Base64-encoded audio to transcribe (WAV, MP3, WEBM, OGG)"],
        language: Annotated[Optional[str], "Optional language hint, e.g. 'en', 'pt'. Auto-detected if omitted."] = None,
    ) -> dict:
        """Transcribe speech audio into text.
    
        Supports multiple languages with automatic language detection.
        Returns the transcription text and detected language.
        """
        payload: dict = {"audio_base64": audio_base64}
        if language:
            payload["language"] = language
    
        async with _client() as client:
            response = await client.post("/v1/stt/transcribe", json=payload)
            response.raise_for_status()
            return response.json()
  • Input schema definition using Python type annotations with Annotated. Defines two parameters: audio_base64 (required string with format hints) and language (optional string for language detection hint).
    async def transcribe_speech(
        audio_base64: Annotated[str, "Base64-encoded audio to transcribe (WAV, MP3, WEBM, OGG)"],
        language: Annotated[Optional[str], "Optional language hint, e.g. 'en', 'pt'. Auto-detected if omitted."] = None,
    ) -> dict:
  • server.py:74-74 (registration)
    The @mcp.tool() decorator registers the transcribe_speech function as an MCP tool with the FastMCP framework.
    @mcp.tool()
  • Helper function that creates an async HTTP client configured with the Brainiall API base URL, authorization headers, and timeout settings. Used by the transcribe_speech handler to make API requests.
    def _client() -> httpx.AsyncClient:
        return httpx.AsyncClient(
            base_url=API_BASE,
            headers=_headers,
            timeout=60.0,
        )
