ElevenLabs MCP Server

Official
by elevenlabs

text_to_sound_effects

Convert text descriptions into sound effects with customizable duration, format, and looping options, saving audio files to your desktop.

Instructions

Convert a text description of a sound effect into audio of a given duration. Saves the output file to a directory (default: $HOME/Desktop).

Duration must be between 0.5 and 5 seconds.

⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.

Args:
    text: Text description of the sound effect
    duration_seconds: Duration of the sound effect in seconds
    output_directory: Directory where files should be saved (only used when saving files).
        Defaults to $HOME/Desktop if not provided.
    loop: Whether to loop the sound effect. Defaults to False.
    output_format (str, optional): Output format of the generated audio, formatted as codec_sample_rate_bitrate. For example, an MP3 with a 22.05 kHz sample rate at 32 kbps is represented as mp3_22050_32. MP3 at 192 kbps requires a Creator tier subscription or above. PCM at 44.1 kHz requires a Pro tier subscription or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs.
        Defaults to "mp3_44100_128". Must be one of:
        mp3_22050_32
        mp3_44100_32
        mp3_44100_64
        mp3_44100_96
        mp3_44100_128
        mp3_44100_192
        pcm_8000
        pcm_16000
        pcm_22050
        pcm_24000
        pcm_44100
        ulaw_8000
        alaw_8000
        opus_48000_32
        opus_48000_64
        opus_48000_96
        opus_48000_128
        opus_48000_192
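
The constraints above (the codec_sample_rate_bitrate naming and the 0.5 to 5 second duration range) can be checked before any API call is made, which matters because each call may incur costs. The following is a minimal sketch, not the server's own code: the names ALLOWED_FORMATS, AudioFormat, parse_output_format, and validate_duration are hypothetical helpers introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# The allowed values, copied from the list above.
ALLOWED_FORMATS = frozenset({
    "mp3_22050_32", "mp3_44100_32", "mp3_44100_64", "mp3_44100_96",
    "mp3_44100_128", "mp3_44100_192",
    "pcm_8000", "pcm_16000", "pcm_22050", "pcm_24000", "pcm_44100",
    "ulaw_8000", "alaw_8000",
    "opus_48000_32", "opus_48000_64", "opus_48000_96",
    "opus_48000_128", "opus_48000_192",
})


@dataclass(frozen=True)
class AudioFormat:
    """Hypothetical container for the parts of an output_format string."""
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the name carries no bitrate (pcm, ulaw, alaw)


def parse_output_format(value: str) -> AudioFormat:
    """Split a value such as "mp3_22050_32" into codec, sample rate, and bitrate."""
    if value not in ALLOWED_FORMATS:
        raise ValueError(f"Unsupported output_format: {value!r}")
    parts = value.split("_")
    bitrate = int(parts[2]) if len(parts) == 3 else None
    return AudioFormat(codec=parts[0], sample_rate_hz=int(parts[1]), bitrate_kbps=bitrate)


def validate_duration(duration_seconds: float) -> float:
    """Return the duration unchanged if it lies within 0.5 to 5 seconds."""
    if not 0.5 <= duration_seconds <= 5:
        raise ValueError("Duration must be between 0.5 and 5 seconds")
    return duration_seconds
```

Validating locally like this is a design choice for the caller; the server performs its own duration check as well.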

Input Schema

Name              Required  Description  Default
text              Yes
duration_seconds  No
output_directory  No
output_format     No                     mp3_44100_128
loop              No
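
To illustrate how the schema maps onto a request, here is a sketch that builds an MCP tools/call message for this tool. The tools/call method and its name/arguments parameters come from the MCP specification; the helper build_tool_call and the sample text are hypothetical. Optional fields are left out when unset so that the server applies its own defaults.

```python
import json
from typing import Any, Dict, Optional


def build_tool_call(
    text: str,
    duration_seconds: Optional[float] = None,
    output_directory: Optional[str] = None,
    output_format: Optional[str] = None,
    loop: Optional[bool] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call request for text_to_sound_effects."""
    # Only "text" is required by the schema.
    arguments: Dict[str, Any] = {"text": text}
    optional = {
        "duration_seconds": duration_seconds,
        "output_directory": output_directory,
        "output_format": output_format,
        "loop": loop,
    }
    # Omit unset optional arguments instead of sending nulls.
    arguments.update({key: value for key, value in optional.items() if value is not None})
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "text_to_sound_effects", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_tool_call("thunder rolling in the distance", duration_seconds=3.0, loop=True)
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally construct and send this message; the sketch only shows the shape of the arguments.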

Implementation Reference

  • The handler function that executes the text_to_sound_effects tool. It validates the duration, converts the text description to sound-effect audio using the ElevenLabs client, and then saves or returns the audio depending on the output mode.
    def text_to_sound_effects(
        text: str,
        duration_seconds: float = 2.0,
        output_directory: str | None = None,
        output_format: str = "mp3_44100_128",
        loop: bool = False,
    ) -> Union[TextContent, EmbeddedResource]:
        # Reject durations outside the supported range (make_error is expected to raise).
        if duration_seconds < 0.5 or duration_seconds > 5:
            make_error("Duration must be between 0.5 and 5 seconds")
        # base_path, client, and output_mode are defined outside this excerpt.
        output_path = make_output_path(output_directory, base_path)
        # Note: the extension passed here is always "mp3", whatever output_format is.
        output_file_name = make_output_file("sfx", text, "mp3")

        audio_data = client.text_to_sound_effects.convert(
            text=text,
            output_format=output_format,
            duration_seconds=duration_seconds,
            loop=loop,
        )
        # convert() yields the audio in chunks; join them into one bytes object.
        audio_bytes = b"".join(audio_data)

        # Handle different output modes
        return handle_output_mode(audio_bytes, output_path, output_file_name, output_mode)
  • Registers the text_to_sound_effects tool with the FastMCP server, providing the detailed description, parameters (schema), and usage instructions.
    @mcp.tool(
        description=f"""Convert text description of a sound effect to sound effect with a given duration. {get_output_mode_description(output_mode)}.
        
        Duration must be between 0.5 and 5 seconds.
    
        ⚠️ COST WARNING: This tool makes an API call to ElevenLabs which may incur costs. Only use when explicitly requested by the user.
    
        Args:
            text: Text description of the sound effect
            duration_seconds: Duration of the sound effect in seconds
            output_directory: Directory where files should be saved (only used when saving files).
                Defaults to $HOME/Desktop if not provided.
            loop: Whether to loop the sound effect. Defaults to False.
            output_format (str, optional): Output format of the generated audio. Formatted as codec_sample_rate_bitrate. So an mp3 with 22.05kHz sample rate at 32kbps is represented as mp3_22050_32. MP3 with 192kbps bitrate requires you to be subscribed to Creator tier or above. PCM with 44.1kHz sample rate requires you to be subscribed to Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs.
                Defaults to "mp3_44100_128". Must be one of:
                mp3_22050_32
                mp3_44100_32
                mp3_44100_64
                mp3_44100_96
                mp3_44100_128
                mp3_44100_192
                pcm_8000
                pcm_16000
                pcm_22050
                pcm_24000
                pcm_44100
                ulaw_8000
                alaw_8000
                opus_48000_32
                opus_48000_64
                opus_48000_96
                opus_48000_128
                opus_48000_192
        """
    )
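
The handler's flow (validate, call the client, join the streamed chunks, write the result) can be exercised without network access or API costs by swapping in a stub. The sketch below is a simplified stand-in, not the server's code: ToolError, FakeSoundEffects, and generate_sound_effect are hypothetical names, and deriving the file extension from the codec prefix is an assumption made here that differs from the handler shown, which passes "mp3".

```python
from pathlib import Path
from typing import Iterator


class ToolError(Exception):
    """Hypothetical stand-in for the error raised on invalid input."""


class FakeSoundEffects:
    """Hypothetical stub that yields audio in chunks, like a streamed response."""

    def convert(self, text: str, output_format: str,
                duration_seconds: float, loop: bool) -> Iterator[bytes]:
        yield b"FAKE"
        yield b"-AUDIO-"
        yield text.encode("utf-8")


def generate_sound_effect(
    client: FakeSoundEffects,
    text: str,
    output_directory: Path,
    duration_seconds: float = 2.0,
    output_format: str = "mp3_44100_128",
    loop: bool = False,
) -> Path:
    """Validate the duration, collect the audio chunks, and write them to a file."""
    # Same range check as the handler shown above.
    if duration_seconds < 0.5 or duration_seconds > 5:
        raise ToolError("Duration must be between 0.5 and 5 seconds")

    chunks = client.convert(
        text=text,
        output_format=output_format,
        duration_seconds=duration_seconds,
        loop=loop,
    )
    # Join the chunk iterator into a single bytes object.
    audio_bytes = b"".join(chunks)

    # Assumption: use the codec prefix ("mp3", "pcm", ...) as the file extension.
    extension = output_format.split("_", 1)[0]
    output_directory.mkdir(parents=True, exist_ok=True)
    output_path = output_directory / f"sfx.{extension}"
    output_path.write_bytes(audio_bytes)
    return output_path
```

Passing the client in as a parameter, rather than reading a module-level one, is what makes the stub substitution possible in this sketch.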

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/elevenlabs/elevenlabs-mcp'
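
The same request can be made from Python using only the standard library. This is a minimal sketch that assumes the endpoint returns a JSON body (the response schema is not described here); build_server_request and fetch_server_info are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner: str, name: str) -> urllib.request.Request:
    """Build the GET request for one server entry, mirroring the curl command."""
    url = f"{API_BASE}/{owner}/{name}"
    return urllib.request.Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_server_info(owner: str, name: str, timeout: float = 10.0) -> Any:
    """Send the request and decode the body, assuming it is JSON."""
    request = build_server_request(owner, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    info = fetch_server_info("elevenlabs", "elevenlabs-mcp")
    print(json.dumps(info, indent=2))
```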

If you have feedback or need assistance with the MCP directory API, please join our Discord server.