Server Configuration
Describes the environment variables used to configure the server. Only those marked as required must be set.
| Name | Required | Description | Default |
|---|---|---|---|
| ELEVENLABS_API_KEY | Yes | Your ElevenLabs API key. Get one at https://elevenlabs.io/app/settings/api-keys | |
| ELEVENLABS_MCP_BASE_PATH | No | The base path the MCP server uses to look for input files and write output files that are specified with relative paths. | |
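
As a rough illustration of how these two variables might be consumed, the sketch below reads them from the environment and resolves a relative file path against the base path. The helper names `load_config` and `resolve_path` are hypothetical, and falling back to the current directory when `ELEVENLABS_MCP_BASE_PATH` is unset is an assumption, since the table lists no default.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Read the two variables from the table above (hypothetical helper)."""
    env = os.environ if env is None else env
    api_key = env.get("ELEVENLABS_API_KEY")
    if not api_key:
        raise RuntimeError("ELEVENLABS_API_KEY is required")
    # Assumption: fall back to the current directory when no base path is set.
    base_path = Path(env.get("ELEVENLABS_MCP_BASE_PATH") or ".")
    return {"api_key": api_key, "base_path": base_path}


def resolve_path(base_path, file_path):
    """Resolve relative paths against base_path; leave absolute paths alone."""
    path = Path(file_path)
    return path if path.is_absolute() else Path(base_path) / path
```

Failing fast on a missing API key surfaces configuration mistakes at startup rather than on the first tool call.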
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": false} |
| prompts | {"listChanged": false} |
| resources | {"subscribe": false, "listChanged": false} |
| experimental | {} |
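
A client can use these flags to decide whether it may rely on change notifications or has to re-fetch lists itself. The sketch below is a minimal illustration: the `capabilities` dict mirrors the table above, and `supports` is a hypothetical helper name, not part of any SDK.

```python
# Capabilities as advertised in the table above.
capabilities = {
    "tools": {"listChanged": False},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}


def supports(caps, feature, flag):
    """Return True only if the feature is present and the flag is truthy."""
    return bool(caps.get(feature, {}).get(flag, False))


# With every flag false, a client should not expect list-change
# notifications and would need to re-list tools itself to get fresh data.
needs_polling = not supports(capabilities, "tools", "listChanged")
```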
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| text_to_speech | Converts text to speech (v2/v3/flash models). Returns: audio file path. Use when: single speaker narration. Supports v3 with audio tags. |
| text_to_speech_v3 | Converts text to speech with v3 model and tags. Returns: audio file path. Use when: single speaker needs emotions, pauses, or sound effects. |
| speech_to_text | Transcribes audio to text. Returns: transcript text or file path. Use when: converting speech recordings to text. |
| text_to_sound_effects | Generates sound effects from text. Returns: audio file path. Use when: creating custom sound effects from descriptions. |
| search_voices | Searches available voices. Returns: JSON with voice details. Use when: finding voices by name, gender, or characteristics. |
| get_voice_id_by_name | Resolves voice name to ID. Returns: JSON with voice_id and confidence. Use when: need voice ID from name with fuzzy matching. |
| list_models | Lists available TTS models. Returns: model list with capabilities. Use when: choosing between v2, v3, or other models. |
| get_voice | Gets voice details. Returns: voice metadata and settings. Use when: need detailed information about a specific voice. |
| voice_clone | Creates voice clone from audio. Returns: new voice ID. Use when: creating custom voice from recordings. |
| isolate_audio | Removes background noise from audio. Returns: cleaned audio file path. Use when: extracting voice from noisy recordings. |
| check_subscription | Checks account subscription. Returns: subscription details and usage. Use when: monitoring API usage and limits. |
| create_agent | Creates conversational AI agent. Returns: agent ID and details. Use when: setting up voice-enabled chatbot or assistant. |
| add_knowledge_base_to_agent | Adds knowledge to agent. Returns: knowledge base ID. Use when: giving agent access to documents or information. |
| list_agents | Lists all agents. Returns: agent list with IDs. Use when: viewing available conversational AI agents. |
| get_agent | Gets agent details. Returns: agent configuration. Use when: viewing specific agent settings and capabilities. |
| speech_to_speech | Transforms voice in audio. Returns: audio file with new voice. Use when: changing speaker voice in existing audio. |
| text_to_voice | Creates voice from description. Returns: three voice preview files. Use when: designing custom voice from text prompt. |
| create_voice_from_preview | Saves generated voice to library. Returns: permanent voice ID. Use when: keeping voice from text_to_voice previews. |
| make_outbound_call | Initiates phone call with agent. Returns: call details. Use when: making automated calls via Twilio integration. |
| search_voice_library | Searches global voice library. Returns: shared voices list. Use when: finding voices across entire ElevenLabs platform. |
| list_phone_numbers | Lists account phone numbers. Returns: phone number list. Use when: viewing available numbers for outbound calls. |
| play_audio | Plays audio file locally. Returns: playback confirmation. Use when: previewing generated audio without downloading. |
| get_conversation | Gets conversation with transcript. Returns: conversation details and full transcript. Use when: analyzing completed agent conversations. |
| list_conversations | Lists agent conversations. Returns: conversation list with metadata. Use when: browsing conversation history. |
| get_conversation_transcript | Gets conversation transcript in chunks. Returns: transcript chunk with metadata. Use when: retrieving large conversation transcripts. |
| text_to_dialogue | Converts multi-speaker text to audio. Returns: dialogue audio file paths. Use when: creating conversations with multiple voices. |
| enhance_dialogue | Adds audio tags to dialogue. Returns: enhanced text with v3 tags. Use when: improving dialogue with emotions and effects. |
| fetch_v3_tags | Lists v3 audio tags. Returns: comprehensive tag list. Use when: preparing text for text_to_speech_v3 or text_to_dialogue with emotions and effects. |
| get_v3_audio_tags_guide | Gets v3 tag usage guide. Returns: detailed v3 documentation. Use when: learning how to use v3 audio tags effectively. |
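
To show how a client might invoke one of these tools, the sketch below builds a JSON-RPC 2.0 `tools/call` request, the message shape MCP uses for tool invocation. The argument name `text` is an assumption, since this listing does not document each tool's input schema; a real client should first read the schema returned by `tools/list`. `build_tool_call` is a hypothetical helper.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build an MCP tools/call request as a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })


# Assumption: text_to_speech accepts a "text" argument.
request = build_tool_call(1, "text_to_speech", {"text": "Hello from MCP."})
print(request)
```

In practice an MCP client library handles this framing, so the sketch is mainly useful for understanding what travels over the transport.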
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |