
Speech MCP Server

by hammeiam


A Model Context Protocol (MCP) server that provides text-to-speech capabilities using the Kokoro TTS model.

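Configuration

The server can be configured using the following environment variables:

| Variable | Description | Default | Valid Range |
| --- | --- | --- | --- |
| MCP_DEFAULT_SPEECH_SPEED | Default speed multiplier for text-to-speech | 1.1 | 0.5 to 2.0 |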
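
To illustrate how the default and valid range in the table fit together, here is a minimal sketch of how a speed setting could be resolved from the environment. The helper name `resolveSpeechSpeed` is hypothetical, and falling back to the default on invalid or out-of-range input is an assumption; the README does not say how the server actually treats such values.

```typescript
// Hypothetical helper: the server's real parsing logic is not shown in this README.
const DEFAULT_SPEED = 1.1; // documented default
const MIN_SPEED = 0.5; // documented lower bound
const MAX_SPEED = 2.0; // documented upper bound

function resolveSpeechSpeed(raw: string | undefined): number {
  // Unset or empty variable: use the documented default.
  if (raw === undefined || raw.trim() === "") {
    return DEFAULT_SPEED;
  }
  const value = Number(raw);
  // Assumption: non-numeric or out-of-range values fall back to the default.
  if (!Number.isFinite(value) || value < MIN_SPEED || value > MAX_SPEED) {
    return DEFAULT_SPEED;
  }
  return value;
}
```

In a Node.js process this would typically be called as `resolveSpeechSpeed(process.env.MCP_DEFAULT_SPEECH_SPEED)`.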

In Cursor:

    {
      "mcpServers": {
        "speech": {
          "command": "npx",
          "args": ["-y", "speech-mcp-server"],
          "env": {
            "MCP_DEFAULT_SPEECH_SPEED": "1.3"
          }
        }
      }
    }

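Features

  • 🎯 High-quality text-to-speech using the Kokoro TTS model
  • 🗣️ Multiple voice options
  • 🎛️ Customizable speech parameters (voice, speed)
  • 🔌 MCP-compliant interface
  • 📦 Easy installation and setup
  • 🚀 No API key required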

Installation

    # Using npm
    npm install speech-mcp-server

    # Using pnpm (recommended)
    pnpm add speech-mcp-server

    # Using yarn
    yarn add speech-mcp-server

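Usage

Run the server:

    # Using default configuration
    npm start

    # With custom speech speed
    MCP_DEFAULT_SPEECH_SPEED=1.5 npm start

The server provides the following MCP tools:

  • text_to_speech: basic text-to-speech conversion
  • text_to_speech_with_options: text-to-speech with customizable speed
  • list_voices: lists all available voices
  • get_model_status: checks the initialization status of the TTS model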

Development

    # Clone the repository
    git clone <your-repo-url>
    cd speech-mcp-server

    # Install dependencies
    pnpm install

    # Start development server with auto-reload
    pnpm dev

    # Build the project
    pnpm build

    # Run linting
    pnpm lint

    # Format code
    pnpm format

    # Test with MCP Inspector
    pnpm inspector

Available Tools

1. Text to Speech

Converts text to speech using the default settings.

    {
      "type": "request",
      "id": "1",
      "method": "call_tool",
      "params": {
        "name": "text_to_speech",
        "arguments": {
          "text": "Hello world",
          "voice": "af_bella" // optional
        }
      }
    }

2. Text to Speech with Options

Converts text to speech with customizable parameters.

    {
      "type": "request",
      "id": "1",
      "method": "call_tool",
      "params": {
        "name": "text_to_speech_with_options",
        "arguments": {
          "text": "Hello world",
          "voice": "af_bella", // optional
          "speed": 1.0 // optional (0.5 to 2.0)
        }
      }
    }
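
A client may want to validate the optional parameters before sending a request. The sketch below builds a message in the shape shown above and rejects speeds outside the documented 0.5 to 2.0 range. The function `buildSpeechRequest` and its interfaces are hypothetical names for illustration and are not part of the server; validating on the client side is a design choice assumed here, not something this README prescribes.

```typescript
// Hypothetical client-side types mirroring the request shape shown in this README.
interface SpeechOptions {
  voice?: string;
  speed?: number;
}

interface ToolCallRequest {
  type: "request";
  id: string;
  method: "call_tool";
  params: { name: string; arguments: Record<string, unknown> };
}

function buildSpeechRequest(
  id: string,
  text: string,
  options: SpeechOptions = {},
): ToolCallRequest {
  const { voice, speed } = options;
  // Reject speeds outside the documented range before anything is sent.
  if (speed !== undefined && (!Number.isFinite(speed) || speed < 0.5 || speed > 2.0)) {
    throw new RangeError(`speed must be between 0.5 and 2.0, got ${speed}`);
  }
  // Only include optional arguments that were actually provided.
  const args: Record<string, unknown> = { text };
  if (voice !== undefined) args["voice"] = voice;
  if (speed !== undefined) args["speed"] = speed;
  return {
    type: "request",
    id,
    method: "call_tool",
    params: { name: "text_to_speech_with_options", arguments: args },
  };
}
```

Serializing the result with `JSON.stringify` yields a single-line message that can be piped to the server in the same way as the raw JSON examples in the Testing section.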

3. List Voices

Lists all voices available for text-to-speech.

    {
      "type": "request",
      "id": "1",
      "method": "list_voices",
      "params": {}
    }

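4. Get Model Status

Checks the current initialization status of the TTS model. This is especially useful when the server is first started, because the model has to be downloaded and initialized.

    {
      "type": "request",
      "id": "1",
      "method": "call_tool",
      "params": {
        "name": "get_model_status",
        "arguments": {}
      }
    }

Example response:

    {
      "content": [{
        "type": "text",
        "text": "Model status: initializing (5s elapsed)"
      }]
    }

Possible status values:

  • uninitialized: model initialization has not started yet
  • initializing: the model is being downloaded and initialized
  • ready: the model is ready to use
  • error: an error occurred during initialization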

Testing

You can test the server using the MCP Inspector or by sending raw JSON messages:

    # List available tools
    echo '{"type":"request","id":"1","method":"list_tools","params":{}}' | node dist/index.js

    # List available voices
    echo '{"type":"request","id":"2","method":"list_voices","params":{}}' | node dist/index.js

    # Convert text to speech
    echo '{"type":"request","id":"3","method":"call_tool","params":{"name":"text_to_speech","arguments":{"text":"Hello world","voice":"af_bella"}}}' | node dist/index.js

Integration with Claude Desktop

To use this server with Claude Desktop, add the following to your Claude Desktop configuration file (on macOS, ~/Library/Application Support/Claude/claude_desktop_config.json):

    {
      "mcpServers": {
        "speech": {
          "command": "npx",
          "args": ["@decodershq/speech-mcp-server"]
        }
      }
    }

Contributing

Contributions are welcome! Feel free to submit a pull request.

License

MIT License - see the LICENSE file for details.

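Troubleshooting

Model Initialization Issues

The server automatically attempts to download and initialize the TTS model when it starts. If you run into initialization errors:

  1. The server automatically retries up to 3 times, cleaning up between attempts
  2. Use the get_model_status tool to monitor initialization progress and any errors
  3. If initialization still fails after all retries, try deleting the model files manually:

    # Remove model files (macOS/Linux)
    rm -rf ~/.npm/_npx/**/node_modules/@huggingface/transformers/.cache/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx
    rm -rf ~/.cache/huggingface/transformers/onnx-community/Kokoro-82M-v1.0-ONNX/onnx/model_quantized.onnx

    # Then restart the server
    npm start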
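
The retry behavior in step 1 can be pictured with a small sketch. This is not the server's actual code: `initWithRetries`, `init`, and `cleanup` are hypothetical stand-ins for the model loading and cache cleanup steps, the functions are synchronous only to keep the example short, and reading "up to 3 retries" as one initial attempt plus three retries is an assumption.

```typescript
interface InitResult {
  ok: boolean;
  attempts: number;
  lastError?: unknown;
}

// Hypothetical sketch: `init` and `cleanup` stand in for the server's real
// model loading and cleanup steps, which this README does not show.
function initWithRetries(
  init: () => void,
  cleanup: () => void,
  maxRetries = 3,
): InitResult {
  // Assumption: one initial attempt plus up to `maxRetries` retries.
  const maxAttempts = maxRetries + 1;
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      init();
      return { ok: true, attempts: attempt };
    } catch (error) {
      lastError = error;
      // Clean up only when another attempt will follow.
      if (attempt < maxAttempts) cleanup();
    }
  }
  return { ok: false, attempts: maxAttempts, lastError };
}
```

If every attempt fails, the caller ends up in the situation that step 3 addresses: the cached model files may be corrupt and need to be removed by hand.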

The get_model_status tool now includes retry information in its response:

    {
      "content": [{
        "type": "text",
        "text": "Model status: initializing (5s elapsed, retry 1/3)"
      }]
    }
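
Because the status arrives as free text, a client that wants to act on it has to parse the string. The following sketch handles the two response texts shown in this README, with and without retry information. The parser `parseModelStatus` and its types are hypothetical, and the exact wording of responses for the other states (for example `ready` or `error`) is not documented here, so the pattern treats the parenthesized details as optional.

```typescript
type ModelStatus = "uninitialized" | "initializing" | "ready" | "error";

interface ParsedStatus {
  status: ModelStatus;
  elapsedSeconds?: number;
  retry?: { attempt: number; max: number };
}

// Matches e.g. "Model status: initializing (5s elapsed, retry 1/3)".
// Assumption: only the two formats shown in this README are known.
const STATUS_PATTERN =
  /^Model status: (uninitialized|initializing|ready|error)(?: \((\d+)s elapsed(?:, retry (\d+)\/(\d+))?\))?/;

function parseModelStatus(text: string): ParsedStatus | null {
  const match = STATUS_PATTERN.exec(text);
  if (match === null) return null; // unrecognized format
  const parsed: ParsedStatus = { status: match[1] as ModelStatus };
  if (match[2] !== undefined) {
    parsed.elapsedSeconds = Number(match[2]);
  }
  if (match[3] !== undefined && match[4] !== undefined) {
    parsed.retry = { attempt: Number(match[3]), max: Number(match[4]) };
  }
  return parsed;
}
```

Returning `null` for unrecognized text keeps the caller in control of how to treat response formats that this sketch does not anticipate.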