supertone-mcp
The supertone-mcp server provides a full-featured interface to the Supertone TTS API, enabling speech synthesis, voice management, and custom voice cloning from MCP-compatible clients.
Text-to-Speech Synthesis: Convert text into natural-sounding audio across 23+ languages, with support for speed (0.5x–2.0x), pitch shift (-24 to +24 semitones), emotion styles, and MP3/WAV output. Audio can be saved to disk, returned as MCP resources, or both. Long text is auto-chunked beyond the 300-character limit.
Duration & Cost Prediction: Estimate output audio duration and credit cost before synthesizing, using the same parameters as synthesis.
Voice Catalog Search: Browse and filter Supertone's preset voice catalog by language, gender, age, use case, style, model, name, or description.
Voice Details & Previews: Retrieve full metadata for a specific voice and access sample audio URLs, optionally filtered by language, style, and model.
Credit Balance Check: Monitor the remaining API credit balance for your Supertone API key.
Voice Cloning: Create a custom cloned voice from a local WAV or MP3 file (max 3MB), immediately usable for synthesis.
Custom Voice Management: List/filter cloned voices, update their name or description, or permanently delete them.
MCP server for the Supertone TTS API. Generate high-quality speech, browse the voice catalog, predict synthesis cost, and create cloned voices directly from Claude Desktop, Cursor, or any MCP-compatible client.
Features
Synthesis
text_to_speech: Convert text (≤300 chars) to audio. Output as files, MCP resources, or both.
predict_duration: Estimate audio length (and credit cost) without synthesizing.
Voice discovery (preset)
search_voice: Filter the catalog by language, gender, age, use_case, style, model, name, or description.
get_voice: Full detail for one voice.
preview_voice: Sample audio URLs for a voice (filterable by language/style/model).
get_credit_balance: Check remaining credits.
Custom voice cloning
clone_voice: Create a cloned voice from a local WAV/MP3 (≤3MB).
search_custom_voice: List/filter cloned voices.
edit_custom_voice: Update name and/or description.
delete_custom_voice: Permanently delete (irreversible).
Supports Korean, English, Japanese, and 20+ other languages, with adjustable speed (0.5x–2.0x), pitch shift (-24 to +24 semitones), and emotion styles.
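The speed and pitch ranges above can be checked on the client side before a request is sent. A minimal sketch, assuming hypothetical parameter names `speed` and `pitch_shift`, assumed defaults of 1.0 and 0, and inclusive bounds (none of these are confirmed by this README):

```python
SPEED_RANGE = (0.5, 2.0)   # speed multiplier range stated in the README
PITCH_RANGE = (-24, 24)    # pitch shift range in semitones stated in the README


def validate_prosody(speed: float = 1.0, pitch_shift: int = 0) -> None:
    """Raise ValueError if a value falls outside the documented range.

    The parameter names and defaults are illustrative assumptions.
    """
    low, high = SPEED_RANGE
    if not low <= speed <= high:
        raise ValueError(f"speed must be between {low} and {high}, got {speed}")
    low, high = PITCH_RANGE
    if not low <= pitch_shift <= high:
        raise ValueError(
            f"pitch_shift must be between {low} and {high}, got {pitch_shift}"
        )
```

Calling `validate_prosody(speed=2.5)` raises `ValueError`, while values on the boundaries (0.5, 2.0, -24, 24) pass under the inclusive-bounds assumption.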
Breaking change in v0.2:
list_voices was removed and replaced by search_voice. To reproduce the old behavior, call search_voice with no arguments.
Installation
# Using uvx (recommended)
uvx supertone-mcp
# Using pip
pip install supertone-mcp
Configuration
Claude Desktop
Add to claude_desktop_config.json:
{
  "mcpServers": {
    "supertone-tts": {
      "command": "uvx",
      "args": ["supertone-mcp"],
      "env": {
        "SUPERTONE_API_KEY": "your-api-key-here"
      }
    }
  }
}
Cursor
Add to your Cursor MCP settings (same JSON shape as above).
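When a client config already lists other MCP servers, the entry has to be merged in rather than pasted over the file. A minimal sketch of that merge on an in-memory dict; the helper name `add_supertone_server` and the `other-server` entry are hypothetical, and reading or writing the actual config file is left out because its location depends on the client:

```python
import json


def add_supertone_server(config: dict, api_key: str) -> dict:
    """Return a copy of `config` with the supertone-tts entry added."""
    merged = json.loads(json.dumps(config))  # deep copy via a JSON round-trip
    servers = merged.setdefault("mcpServers", {})
    servers["supertone-tts"] = {
        "command": "uvx",
        "args": ["supertone-mcp"],
        "env": {"SUPERTONE_API_KEY": api_key},
    }
    return merged


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "npx", "args": ["other"]}}}
print(json.dumps(add_supertone_server(existing, "your-api-key-here"), indent=2))
```

The original dict is left untouched, so the result can be inspected before anything is written back to disk.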
Environment Variables
Variable | Required | Default | Description |
SUPERTONE_API_KEY | Yes | - | Your Supertone API key |
| No | preset voice (Aiden, multilingual) | Default |
| No | | Directory where audio files are saved |
| No | | One of |
| No | | Auto-play generated audio on macOS via |
Output modes (text_to_speech)
Mode | Returns | Use when |
| Plain text with the saved file path + metadata | You want the file on disk |
| MCP | The client renders audio inline (e.g., Claude.ai chat) |
| File on disk and | You want both — preview inline, keep the file |
Usage Examples
Natural language phrasing — the MCP client routes these to the right tool automatically.
Synthesis
"Read this aloud: Hello, how are you today?"
"Read '안녕하세요' slowly in Korean"
Estimate before synthesizing
"Roughly how many seconds of audio will this paragraph produce?" → calls predict_duration
Browse / pick a voice
"Find me a female Korean voice for narration" → calls search_voice(language="ko", gender="female", use_case="narration")
"Let's hear a sample of the first one" → calls preview_voice(voice_id=...) and returns sample URLs
Check credits
"How many credits do I have left?" → calls get_credit_balance
Clone a voice from a local file
"Make a clone from this file: ~/recordings/sample.wav, and name it MyVoice" → calls clone_voice(name="MyVoice", audio_path="~/recordings/sample.wav")
Manage cloned voices
"Show me the custom voices I've created" → search_custom_voice
"Rename MyVoice to NarratorA" → edit_custom_voice
"Delete MyVoice" → delete_custom_voice (prompts for confirmation; irreversible)
Tool Parameters
text_to_speech
Parameter | Type | Required | Default | Description |
| string | Yes | - | Text to convert (≤300 chars; longer text is auto-chunked) |
| string | No | env or preset | Voice identifier (browse via |
| string | No | | Language code ( |
| string | No | | |
| string | No | | TTS model |
| float | No | | 0.5–2.0 |
| int | No | | -24 to +24 semitones |
| string | No | - | Emotion style (varies by voice) |
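How the server splits long text is not described here. A minimal sketch of one plausible approach, assuming chunks are built from whole sentences where possible and hard-cut only when a single sentence exceeds the limit; `chunk_text` is a hypothetical helper, not the server's implementation:

```python
import re

MAX_CHARS = 300  # per-request text limit stated in the README


def chunk_text(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into pieces of at most `limit` characters."""
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = re.split(r"(?<=[.!?。！？])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate  # sentence still fits in the open chunk
        else:
            chunks.append(current)  # close the open chunk, start a new one
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Preferring sentence boundaries keeps each synthesized segment prosodically complete; a hard cut mid-sentence is only a fallback and may produce audible seams.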
predict_duration
Same parameter schema as text_to_speech, but with no auto-chunking: the 300-character limit is a hard limit. Returns a message such as "Predicted duration: 2.34s (credit usage is proportional to duration)."
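Because the result comes back as a sentence and not as a number, a caller that wants to budget credits has to extract the value. A minimal sketch, assuming the response keeps the "Predicted duration: <seconds>s" wording shown above; `parse_predicted_duration` is a hypothetical helper:

```python
import re

# Matches the seconds value in "Predicted duration: 2.34s ...".
_DURATION_RE = re.compile(r"Predicted duration:\s*([0-9]+(?:\.[0-9]+)?)s")


def parse_predicted_duration(message: str) -> float:
    """Extract the predicted duration in seconds from the tool's text reply."""
    match = _DURATION_RE.search(message)
    if match is None:
        raise ValueError(f"no predicted duration found in: {message!r}")
    return float(match.group(1))


reply = "Predicted duration: 2.34s (credit usage is proportional to duration)."
print(parse_predicted_duration(reply))  # prints 2.34
```

Since predict_duration does not auto-chunk, text longer than 300 characters would need one call per chunk, with the parsed values summed by the caller.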
search_voice
All parameters optional. With no filters → full catalog. With any filter → first response line is Filters applied: ....
Parameter | Type | Description |
language | string | e.g., |
gender | string | e.g., |
age | string | e.g., |
use_case | string | e.g., |
style | string | e.g., |
model | string | e.g., |
name | string | partial match |
description | string | partial match |
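For clients that talk to the server directly instead of going through a chat interface, a filter set maps onto an MCP `tools/call` request. A minimal sketch that builds the JSON-RPC payload and drops unset filters so that calling with no filters requests the full catalog; `build_search_voice_request` is a hypothetical helper, and the filter values are the ones from the usage example in this README:

```python
import json
from typing import Optional


def build_search_voice_request(request_id: int, **filters: Optional[str]) -> dict:
    """Build a JSON-RPC `tools/call` request for search_voice.

    Filters set to None are omitted from the arguments.
    """
    arguments = {key: value for key, value in filters.items() if value is not None}
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "search_voice", "arguments": arguments},
    }


request = build_search_voice_request(
    1, language="ko", gender="female", use_case="narration", style=None
)
print(json.dumps(request, indent=2))
```

Transport and session setup (for example stdio framing and the `initialize` handshake) are outside this sketch.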
get_voice / preview_voice
Tool | Required | Optional |
get_voice | | - |
preview_voice | | |
clone_voice
Parameter | Type | Required | Description |
name | string | Yes | Display name (non-empty) |
audio_path | string | Yes | Local WAV or MP3 path (≤3MB). Supports |
| string | No | Optional note |
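The file constraints above can be verified locally before a clone request is made. A minimal sketch, assuming "3MB" means 3 × 1024 × 1024 bytes and that the format is judged by file extension; `check_clone_source` is a hypothetical helper, and the server may apply different or stricter checks:

```python
from pathlib import Path

MAX_BYTES = 3 * 1024 * 1024          # assumption: "3MB" read as 3 MiB
ALLOWED_SUFFIXES = {".wav", ".mp3"}  # formats named in the README


def check_clone_source(audio_path: str) -> Path:
    """Return the resolved path if the file looks acceptable for cloning."""
    path = Path(audio_path).expanduser()  # expands a leading "~"
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError(f"expected a .wav or .mp3 file, got {path.suffix!r}")
    if not path.is_file():
        raise FileNotFoundError(f"no such file: {path}")
    size = path.stat().st_size
    if size > MAX_BYTES:
        raise ValueError(f"file is {size} bytes, above the {MAX_BYTES}-byte limit")
    return path
```

A call such as `check_clone_source("~/recordings/sample.wav")` either returns the expanded path or raises with a message that names the failed constraint.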
Custom voice CRUD
Tool | Required | Optional |
search_custom_voice | - | |
edit_custom_voice | | |
delete_custom_voice | | - (IRREVERSIBLE) |
Development
# Clone and install
git clone https://github.com/supertone-inc/supertone-mcp.git
cd supertone-mcp
uv sync
# Run tests
uv run pytest -q
# Run with coverage
uv run pytest --cov=src --cov-report=term-missing
License
MIT