Schema | ElevenLabs MCP Enhanced

Server Configuration

Describes the environment variables used to configure the server. Only those marked as required must be set.

Name	Required	Description	Default
`ELEVENLABS_API_KEY`	Yes	Your ElevenLabs API key. Get one at https://elevenlabs.io/app/settings/api-keys
`ELEVENLABS_MCP_BASE_PATH`	No	The base path against which the MCP server resolves relative file paths, both when looking for input files and when writing output files.
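
As a rough illustration of how these two variables might be consumed, the sketch below reads them from an environment mapping and resolves a relative file path against the base path. The function names `load_config` and `resolve_path` are hypothetical and are not taken from the server. The fallback to the current working directory is also an assumption, since the table lists no default.

```python
import os
from pathlib import Path


def load_config(env=None):
    """Read the two documented variables from an environment mapping."""
    env = os.environ if env is None else env
    api_key = env.get("ELEVENLABS_API_KEY", "").strip()
    if not api_key:
        # The table marks this variable as required.
        raise RuntimeError("ELEVENLABS_API_KEY is required")
    # Assumption: fall back to the current working directory when unset.
    base = Path(env.get("ELEVENLABS_MCP_BASE_PATH") or Path.cwd())
    return api_key, base.expanduser()


def resolve_path(base, file_path):
    """Absolute paths pass through; relative paths are joined to base."""
    p = Path(file_path).expanduser()
    return p if p.is_absolute() else base / p
```

Passing the environment as a plain mapping keeps the sketch easy to exercise without touching the real process environment.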

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": false }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`experimental`	{}
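
In the Model Context Protocol, a `listChanged` value of `false` indicates that the server does not send list-changed notifications for that feature, and `subscribe: false` indicates that resource subscriptions are not offered. The sketch below shows one way a client could inspect the table above before deciding whether to cache the tool list. The helper name `supports` is hypothetical.

```python
import json

# The capability table above, expressed as the JSON object a client
# would receive during initialization.
CAPABILITIES = json.loads("""
{
  "tools": {"listChanged": false},
  "prompts": {"listChanged": false},
  "resources": {"subscribe": false, "listChanged": false},
  "experimental": {}
}
""")


def supports(capabilities, feature, flag):
    """Return True only if the feature is declared and the flag is true."""
    return bool(capabilities.get(feature, {}).get(flag, False))


# With listChanged false, a client can fetch the tool list once and reuse it.
can_cache_tools = not supports(CAPABILITIES, "tools", "listChanged")
```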

Tools

Functions exposed to the LLM to take actions

Name	Description
text_to_speech	Converts text to speech (v2/v3/flash models). Returns: audio file path. Use when: single speaker narration. Supports v3 with audio tags.
text_to_speech_v3	Converts text to speech with v3 model and tags. Returns: audio file path. Use when: single speaker needs emotions, pauses, or sound effects.
speech_to_text	Transcribes audio to text. Returns: transcript text or file path. Use when: converting speech recordings to text.
text_to_sound_effects	Generates sound effects from text. Returns: audio file path. Use when: creating custom sound effects from descriptions.
search_voices	Searches available voices. Returns: JSON with voice details. Use when: finding voices by name, gender, or characteristics.
get_voice_id_by_name	Resolves voice name to ID. Returns: JSON with voice_id and confidence. Use when: need voice ID from name with fuzzy matching.
list_models	Lists available TTS models. Returns: model list with capabilities. Use when: choosing between v2, v3, or other models.
get_voice	Gets voice details. Returns: voice metadata and settings. Use when: need detailed information about a specific voice.
voice_clone	Creates voice clone from audio. Returns: new voice ID. Use when: creating custom voice from recordings.
isolate_audio	Removes background noise from audio. Returns: cleaned audio file path. Use when: extracting voice from noisy recordings.
check_subscription	Checks account subscription. Returns: subscription details and usage. Use when: monitoring API usage and limits.
create_agent	Creates conversational AI agent. Returns: agent ID and details. Use when: setting up voice-enabled chatbot or assistant.
add_knowledge_base_to_agent	Adds knowledge to agent. Returns: knowledge base ID. Use when: giving agent access to documents or information.
list_agents	Lists all agents. Returns: agent list with IDs. Use when: viewing available conversational AI agents.
get_agent	Gets agent details. Returns: agent configuration. Use when: viewing specific agent settings and capabilities.
speech_to_speech	Transforms voice in audio. Returns: audio file with new voice. Use when: changing speaker voice in existing audio.
text_to_voice	Creates voice from description. Returns: three voice preview files. Use when: designing custom voice from text prompt.
create_voice_from_preview	Saves generated voice to library. Returns: permanent voice ID. Use when: keeping voice from text_to_voice previews.
make_outbound_call	Initiates phone call with agent. Returns: call details. Use when: making automated calls via Twilio integration.
search_voice_library	Searches global voice library. Returns: shared voices list. Use when: finding voices across entire ElevenLabs platform.
list_phone_numbers	Lists account phone numbers. Returns: phone number list. Use when: viewing available numbers for outbound calls.
play_audio	Plays audio file locally. Returns: playback confirmation. Use when: previewing generated audio without downloading.
get_conversation	Gets conversation with transcript. Returns: conversation details and full transcript. Use when: analyzing completed agent conversations.
list_conversations	Lists agent conversations. Returns: conversation list with metadata. Use when: browsing conversation history.
get_conversation_transcript	Gets conversation transcript in chunks. Returns: transcript chunk with metadata. Use when: retrieving large conversation transcripts.
text_to_dialogue	Converts multi-speaker text to audio. Returns: dialogue audio file paths. Use when: creating conversations with multiple voices.
enhance_dialogue	Adds audio tags to dialogue. Returns: enhanced text with v3 tags. Use when: improving dialogue with emotions and effects.
fetch_v3_tags	Lists v3 audio tags. Returns: comprehensive tag list. Use when: preparing text for text_to_speech_v3 or text_to_dialogue with emotions and effects.
get_v3_audio_tags_guide	Gets v3 tag usage guide. Returns: detailed v3 documentation. Use when: learning how to use v3 audio tags effectively.
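
MCP clients invoke tools such as these with a JSON-RPC `tools/call` request that carries the tool name and an arguments object. The following sketch builds such a request for `text_to_speech`. The argument name `text` is an assumption made for illustration; the actual parameters are defined by each tool's input schema, which a client obtains from `tools/list`. The helper `tool_call_request` is likewise hypothetical.

```python
import json
from itertools import count

_ids = count(1)  # simple source of unique request ids


def tool_call_request(name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical argument name: check the tool's input schema before use.
request = tool_call_request("text_to_speech", {"text": "Hello"})
print(json.dumps(request, indent=2))
```

In practice an MCP client library normally constructs and sends this message, so the sketch is mainly useful for understanding what travels over the wire.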

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources
