MS-Lucidia-Voice-Gateway-MCP

A Model Context Protocol (MCP) server that provides text-to-speech and speech-to-text capabilities using Windows' built-in speech services. This server leverages the native Windows Speech API (SAPI) through PowerShell commands, eliminating the need for external APIs or services.
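
As a rough illustration of how a Node.js process might drive SAPI through PowerShell, the sketch below builds a `powershell.exe` invocation that speaks a string with `System.Speech.Synthesis.SpeechSynthesizer`. It is not taken from this project's source: `buildSpeakInvocation`, `speak`, and the `TTS_*` environment variable names are hypothetical, and `rate` here is the synthesizer's native integer `Rate` (-10 to 10), not the 0.5 to 2.0 `speed` parameter described later. Passing the text through an environment variable is one design choice that avoids having to escape user input inside the PowerShell command.

```javascript
const { execFile } = require('node:child_process');

// PowerShell script that reads its inputs from environment variables, so user
// text is never spliced into the command string. The TTS_* names are
// hypothetical and chosen only for this sketch.
const SPEAK_SCRIPT = [
  'Add-Type -AssemblyName System.Speech',
  '$synth = New-Object System.Speech.Synthesis.SpeechSynthesizer',
  'if ($env:TTS_VOICE) { $synth.SelectVoice($env:TTS_VOICE) }',
  '$synth.Rate = [int]$env:TTS_RATE',
  '$synth.Speak($env:TTS_TEXT)',
].join('; ');

// Builds the executable name, arguments, and extra environment variables.
function buildSpeakInvocation(text, { voice = '', rate = 0 } = {}) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new TypeError('text must be a non-empty string');
  }
  if (!Number.isInteger(rate) || rate < -10 || rate > 10) {
    throw new RangeError('rate must be an integer between -10 and 10');
  }
  return {
    file: 'powershell.exe',
    args: ['-NoProfile', '-NonInteractive', '-Command', SPEAK_SCRIPT],
    env: { TTS_TEXT: text, TTS_VOICE: voice, TTS_RATE: String(rate) },
  };
}

// Runs the invocation (Windows only). The callback receives execFile's
// usual (error, stdout, stderr) arguments.
function speak(text, options, callback) {
  const { file, args, env } = buildSpeakInvocation(text, options);
  return execFile(
    file,
    args,
    { env: { ...process.env, ...env }, windowsHide: true },
    callback
  );
}
```

On a Windows machine, `speak("Hello, this is a test", { voice: "Microsoft David Desktop" }, cb)` would then play the text through the default audio device.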

Features

  • Text-to-Speech (TTS) using Windows SAPI voices
  • Speech-to-Text (STT) using Windows Speech Recognition
  • Simple web interface for testing
  • No external API dependencies
  • Uses native Windows capabilities

Prerequisites

  • Windows 10/11 with Speech Recognition enabled
  • Node.js 16+
  • PowerShell

Installation

  1. Clone the repository:
git clone https://github.com/ExpressionsBot/MS-Lucidia-Voice-Gateway-MCP.git
cd MS-Lucidia-Voice-Gateway-MCP
  2. Install dependencies:
npm install
  3. Build the project:
npm run build

Usage

Testing Interface

  1. Start the test server:
npm run test
  2. Open http://localhost:3000 in your browser
  3. Use the web interface to test TTS and STT capabilities

Available Tools

text_to_speech

Converts text to speech using Windows SAPI.

Parameters:

  • text (required): The text to convert to speech
  • voice (optional): The voice to use (e.g., "Microsoft David Desktop")
  • speed (optional): Speech rate from 0.5 to 2.0 (default: 1.0)
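
A client may want to check these parameters before sending a request. The sketch below is one way to do that; `buildTtsRequest` is a hypothetical helper, not part of this server, and it assumes that leaving `voice` out lets the server fall back to a default voice.

```javascript
// Hypothetical helper: validates the documented text_to_speech parameters
// and returns an object suitable for JSON.stringify in the request body.
function buildTtsRequest(text, { voice, speed = 1.0 } = {}) {
  if (typeof text !== 'string' || text.trim() === '') {
    throw new TypeError('text is required and must be a non-empty string');
  }
  if (typeof speed !== 'number' || Number.isNaN(speed) || speed < 0.5 || speed > 2.0) {
    throw new RangeError('speed must be a number between 0.5 and 2.0');
  }
  const body = { text, speed };
  // Assumption: omitting voice lets the server pick its default voice.
  if (voice !== undefined) {
    body.voice = voice;
  }
  return body;
}
```

The returned object can be passed straight to `JSON.stringify` when building the `fetch` body.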

Example:

fetch('http://localhost:3000/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: "Hello, this is a test",
    voice: "Microsoft David Desktop",
    speed: 1.0
  })
});

speech_to_text

Records audio and converts it to text using Windows Speech Recognition.

Parameters:

  • duration (optional): Recording duration in seconds (default: 5, max: 60)
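
The default and maximum above can also be applied on the client side. This is a minimal sketch with a hypothetical `normalizeDuration` helper; whether the server itself clamps or rejects values above 60 is not stated here, so the clamping is an assumption.

```javascript
const DEFAULT_DURATION_SECONDS = 5;
const MAX_DURATION_SECONDS = 60;

// Hypothetical helper: applies the documented default and caps the value
// at the documented maximum. Clamping (rather than rejecting) is an assumption.
function normalizeDuration(duration) {
  if (duration === undefined) {
    return DEFAULT_DURATION_SECONDS;
  }
  if (typeof duration !== 'number' || !Number.isFinite(duration) || duration <= 0) {
    throw new RangeError('duration must be a positive number of seconds');
  }
  return Math.min(duration, MAX_DURATION_SECONDS);
}
```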

Example:

fetch('http://localhost:3000/stt', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ duration: 5 })
})
  .then(response => response.json())
  .then(data => console.log(data.text));

Troubleshooting

  1. Make sure Windows Speech Recognition is enabled:
    • Open Windows Settings
    • Go to Time & Language > Speech
    • Enable Speech Recognition
  2. Check available voices:
    • Open PowerShell and run:
    Add-Type -AssemblyName System.Speech
    (New-Object System.Speech.Synthesis.SpeechSynthesizer).GetInstalledVoices().VoiceInfo.Name
  3. Test speech recognition:
    • Open Speech Recognition in Windows Settings
    • Run through the setup wizard if not already done
    • Test that Windows can recognize your voice
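
If the voice check in step 2 is run from a script, its output (one voice name per line) can be compared against the `voice` value you plan to send. The helpers below are a hypothetical sketch, not part of this project, and the case-insensitive comparison is an assumption made for convenience.

```javascript
// Splits the PowerShell output into a list of voice names,
// tolerating Windows (\r\n) line endings and blank lines.
function parseVoiceNames(stdout) {
  return stdout
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Returns true if the requested voice appears in the output.
// Case-insensitive matching is an assumption of this sketch.
function isVoiceInstalled(stdout, voice) {
  const wanted = voice.trim().toLowerCase();
  return parseVoiceNames(stdout).some((name) => name.toLowerCase() === wanted);
}
```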

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a new Pull Request

License

MIT
