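Speech MCP Server

A Model Context Protocol server that provides text-to-speech capabilities using the Kokoro TTS model.

Configuration

The server can be configured with the following environment variable:

Variable	Description	Default	Valid range
`MCP_DEFAULT_SPEECH_SPEED`	Default speed multiplier for text-to-speech	1.1	0.5 to 2.0

In Cursor: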

{
  "mcpServers": {
    "speech": {
      "command": "npx",
      "args": [
        "-y",
        "speech-mcp-server"
      ],
      "env": {
        "MCP_DEFAULT_SPEECH_SPEED": "1.3"
      }
    }
  }
}
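
The table above gives a default of 1.1 and a valid range of 0.5 to 2.0, but it does not say what happens when the variable is missing or out of range. The following is a minimal sketch of one way to resolve the value. The helper name `resolveSpeechSpeed` and the choice to fall back to the default on invalid input are assumptions made for illustration, not the server's documented behavior.

```typescript
// Sketch only: the fallback behavior is an assumption, not the server's documented logic.
const DEFAULT_SPEED = 1.1; // default from the configuration table
const MIN_SPEED = 0.5;     // lower bound of the valid range
const MAX_SPEED = 2.0;     // upper bound of the valid range

// Environment-like object, so the helper can be tested without a real process environment.
type Env = Record<string, string | undefined>;

// Hypothetical helper: read MCP_DEFAULT_SPEECH_SPEED and validate it.
function resolveSpeechSpeed(env: Env): number {
  const raw = env["MCP_DEFAULT_SPEECH_SPEED"];
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_SPEED; // variable not set
  }
  const speed = Number(raw);
  if (!Number.isFinite(speed) || speed < MIN_SPEED || speed > MAX_SPEED) {
    return DEFAULT_SPEED; // assumed fallback for non-numeric or out-of-range values
  }
  return speed;
}
```

In a Node.js process the helper would typically be called as `resolveSpeechSpeed(process.env)`. Environment variables are always strings, which is why the configuration passes the value as `"1.3"` and the helper converts it to a number.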

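Features

🎯 High-quality text-to-speech using the Kokoro TTS model
🗣️ Multiple voice options
🎛️ Customizable speech parameters (voice, speed)
🔌 MCP-compliant interface
📦 Easy installation and setup
🚀 No API key required

Installation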

# Using npm
npm install speech-mcp-server

# Using pnpm (recommended)
pnpm add speech-mcp-server

# Using yarn
yarn add speech-mcp-server

Usage

Run the server:

# Using default configuration
npm start

# With custom speech speed
MCP_DEFAULT_SPEECH_SPEED=1.5 npm start

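The server provides the following MCP tools:

text_to_speech: basic text-to-speech conversion
text_to_speech_with_options: text-to-speech with customizable speed
list_voices: lists all available voices
get_model_status: checks the initialization status of the TTS model

Development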

# Clone the repository
git clone <your-repo-url>
cd speech-mcp-server

# Install dependencies
pnpm install

# Start development server with auto-reload
pnpm dev

# Build the project
pnpm build

# Run linting
pnpm lint

# Format code
pnpm format

# Test with MCP Inspector
pnpm inspector

Available Tools

1. Text to Speech

Converts text to speech using the default settings.

{
  "type": "request",
  "id": "1",
  "method": "call_tool",
  "params": {
    "name": "text_to_speech",
    "arguments": {
      "text": "Hello world",
      "voice": "af_bella"  // optional
    }
  }
}

2. Text to Speech with Options

Converts text to speech with customizable parameters.

{
  "type": "request",
  "id": "1",
  "method": "call_tool",
  "params": {
    "name": "text_to_speech_with_options",
    "arguments": {
      "text": "Hello world",
      "voice": "af_bella",  // optional
      "speed": 1.0          // optional (0.5 to 2.0)
    }
  }
}
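
The request shape above can also be assembled programmatically, which makes it easy to leave out the optional fields and to check `speed` against the documented 0.5 to 2.0 range before sending. This is a minimal sketch: the `buildSpeechRequest` helper and its interfaces are hypothetical names introduced here, and rejecting out-of-range speeds on the client side is an assumption rather than something the server requires.

```typescript
// Optional arguments accepted by text_to_speech_with_options, per the example above.
interface SpeechOptions {
  voice?: string; // optional, e.g. "af_bella"
  speed?: number; // optional, 0.5 to 2.0
}

// Message shape copied from the request examples in this README.
interface ToolRequest {
  type: "request";
  id: string;
  method: "call_tool";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: build a text_to_speech_with_options request.
function buildSpeechRequest(id: string, text: string, options: SpeechOptions = {}): ToolRequest {
  const { voice, speed } = options;
  // Assumed client-side check against the documented range.
  if (speed !== undefined && (!Number.isFinite(speed) || speed < 0.5 || speed > 2.0)) {
    throw new RangeError(`speed must be between 0.5 and 2.0, got ${speed}`);
  }
  const args: Record<string, unknown> = { text };
  if (voice !== undefined) args["voice"] = voice; // only include optional fields when set
  if (speed !== undefined) args["speed"] = speed;
  return {
    type: "request",
    id,
    method: "call_tool",
    params: { name: "text_to_speech_with_options", arguments: args },
  };
}
```

Calling `buildSpeechRequest("1", "Hello world", { voice: "af_bella", speed: 1.0 })` yields the same message as the example above, and `JSON.stringify` on the result produces valid JSON without the explanatory comments.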

3. List Voices

Lists all voices available for text-to-speech.

{
  "type": "request",
  "id": "1",
  "method": "list_voices",
  "params": {}
}

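4. Get Model Status

Checks the current state of TTS model initialization. This is especially useful when the server is first started, because the model has to be downloaded and initialized.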

{
  "type": "request",
  "id": "1",
  "method": "call_tool",
  "params": {
    "name": "get_model_status",
    "arguments": {}
  }
}

Example response:

{
  "content": [{
    "type": "text",
    "text": "Model status: initializing (5s elapsed)"
  }]
}

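Possible status values:

uninitialized: model initialization has not started yet
initializing: the model is being downloaded and initialized
ready: the model is ready to use
error: an error occurred during initialization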
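
A client that polls `get_model_status` needs to turn the response text into something it can branch on. The sketch below parses the text from the example response into a state and an optional elapsed time. The only format confirmed by this README is `Model status: initializing (5s elapsed)`; how the other states are rendered is an assumption, and `parseModelStatus` is a hypothetical helper name.

```typescript
// The four status values listed above.
type ModelState = "uninitialized" | "initializing" | "ready" | "error";

interface ModelStatus {
  state: ModelState;
  elapsedSeconds?: number; // present only when the text includes "(Ns elapsed ...)"
}

const MODEL_STATES: readonly ModelState[] = ["uninitialized", "initializing", "ready", "error"];

// Hypothetical helper: parse the "text" field of a get_model_status response.
// Returns undefined when the text does not match the assumed format.
function parseModelStatus(text: string): ModelStatus | undefined {
  const match = /^Model status: (\w+)(?: \((\d+)s elapsed[^)]*\))?/.exec(text);
  if (match === null) return undefined;
  const state = MODEL_STATES.find((s) => s === match[1]);
  if (state === undefined) return undefined; // unknown state word
  const status: ModelStatus = { state };
  if (match[2] !== undefined) status.elapsedSeconds = Number(match[2]);
  return status;
}
```

A polling client could call the tool until `parseModelStatus(...)` reports `ready` or `error`, and treat an `undefined` result as a sign that the text format differs from the one assumed here.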

Testing

You can test the server with the MCP Inspector or by sending raw JSON messages:

# List available tools
echo '{"type":"request","id":"1","method":"list_tools","params":{}}' | node dist/index.js

# List available voices
echo '{"type":"request","id":"2","method":"list_voices","params":{}}' | node dist/index.js

# Convert text to speech
echo '{"type":"request","id":"3","method":"call_tool","params":{"name":"text_to_speech","arguments":{"text":"Hello world","voice":"af_bella"}}}' | node dist/index.js
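
Hand-written JSON inside shell quotes is easy to get wrong, so the same three messages can be generated from objects instead. The sketch below serializes each request as one line of JSON. The names `RawRequest` and `toStdinLine` are hypothetical, and whether the server accepts several newline-delimited messages on a single stdin stream is an assumption; the commands above send one message per process.

```typescript
// Message shape copied from the echo commands above.
interface RawRequest {
  type: "request";
  id: string;
  method: string;
  params: Record<string, unknown>;
}

// Hypothetical helper: serialize one request as a single line of JSON.
function toStdinLine(request: RawRequest): string {
  return JSON.stringify(request) + "\n";
}

// The same three requests as in the shell examples.
const requests: RawRequest[] = [
  { type: "request", id: "1", method: "list_tools", params: {} },
  { type: "request", id: "2", method: "list_voices", params: {} },
  {
    type: "request",
    id: "3",
    method: "call_tool",
    params: { name: "text_to_speech", arguments: { text: "Hello world", voice: "af_bella" } },
  },
];

// One JSON message per line, ready to be written to the server's stdin.
const payload: string = requests.map(toStdinLine).join("");
```

Each line of `payload` matches the corresponding string passed to `echo` above, so a single line can also be piped to `node dist/index.js` on its own.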

Integration with Claude Desktop

To use this server with Claude Desktop, add the following to your Claude Desktop configuration file (~/Library/Application Support/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "speech": {
      "command": "npx",
      "args": ["@decodershq/speech-mcp-server"]
    }
  }
}

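Contributing

Contributions are welcome! Feel free to submit a Pull Request.

License

MIT License. See the LICENSE file for details.

Troubleshooting

Model Initialization Issues

The server automatically tries to download and initialize the TTS model on startup. If you run into initialization errors:

The server automatically retries up to 3 times, cleaning up between attempts
Use the get_model_status tool to monitor initialization progress and any errors
If initialization still fails after all retries, try removing the model files manually: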

# Remove model files (MacOS/Linux)
rm -rf ~/.npm/_npx/**/node_modules/@huggingface/transformers/.cache/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx
rm -rf ~/.cache/huggingface/transformers/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx

# Then restart the server
npm start

The get_model_status tool now includes retry information in its response:

{
  "content": [{
    "type": "text",
    "text": "Model status: initializing (5s elapsed, retry 1/3)"
  }]
}
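
The retry counter in this response can be extracted to decide when manual cleanup is likely to be needed. This is a minimal sketch that assumes the `retry N/M` suffix always has the form shown in the example; `parseRetryInfo` and `isLastRetry` are hypothetical helper names.

```typescript
interface RetryInfo {
  attempt: number;    // current retry number, e.g. 1
  maxRetries: number; // maximum number of retries, e.g. 3
}

// Hypothetical helper: extract "retry N/M" from the status text, if present.
function parseRetryInfo(text: string): RetryInfo | undefined {
  const match = /retry (\d+)\/(\d+)/.exec(text);
  if (match === null) return undefined; // no retry information in the text
  return { attempt: Number(match[1]), maxRetries: Number(match[2]) };
}

// True when the reported retry is the final one the server will make.
function isLastRetry(info: RetryInfo): boolean {
  return info.attempt >= info.maxRetries;
}
```

If the status later changes to `error` after the last retry, the manual removal of the model files described above is the next step.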
