Skip to main content
Glama

MS-Lucidia-Voice-Gateway-MCP

MS-Lucidia-Voice-Gateway-MCP

A Model Context Protocol (MCP) server that provides text-to-speech and speech-to-text capabilities using Windows' built-in speech services. This server leverages the native Windows Speech API (SAPI) through PowerShell commands, eliminating the need for external APIs or services.

Features

  • Text-to-Speech (TTS) using Windows SAPI voices
  • Speech-to-Text (STT) using Windows Speech Recognition
  • Simple web interface for testing
  • No external API dependencies
  • Uses native Windows capabilities

Prerequisites

  • Windows 10/11 with Speech Recognition enabled
  • Node.js 16+
  • PowerShell

Installation

  1. Clone the repository:
git clone https://github.com/ExpressionsBot/MS-Lucidia-Voice-Gateway-MCP.git cd MS-Lucidia-Voice-Gateway-MCP
  1. Install dependencies:
npm install
  1. Build the project:
npm run build

Usage

Testing Interface

  1. Start the test server:
npm run test
  1. Open http://localhost:3000 in your browser
  2. Use the web interface to test TTS and STT capabilities

Available Tools

text_to_speech

Converts text to speech using Windows SAPI.

Parameters:

  • text (required): The text to convert to speech
  • voice (optional): The voice to use (e.g., "Microsoft David Desktop")
  • speed (optional): Speech rate from 0.5 to 2.0 (default: 1.0)

Example:

fetch('http://localhost:3000/tts', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ text: "Hello, this is a test", voice: "Microsoft David Desktop", speed: 1.0 }) });
speech_to_text

Records audio and converts it to text using Windows Speech Recognition.

Parameters:

  • duration (optional): Recording duration in seconds (default: 5, max: 60)

Example:

fetch('http://localhost:3000/stt', { method: 'POST', headers: { 'Content-Type': 'application/json' }, body: JSON.stringify({ duration: 5 }) }).then(response => response.json()) .then(data => console.log(data.text));

Troubleshooting

  1. Make sure Windows Speech Recognition is enabled:
    • Open Windows Settings
    • Go to Time & Language > Speech
    • Enable Speech Recognition
  2. Check available voices:
    • Open PowerShell and run:
    Add-Type -AssemblyName System.Speech (New-Object System.Speech.Synthesis.SpeechSynthesizer).GetInstalledVoices().VoiceInfo.Name
  3. Test speech recognition:
    • Open Speech Recognition in Windows Settings
    • Run through the setup wizard if not already done
    • Test that Windows can recognize your voice

Contributing

  1. Fork the repository
  2. Create your feature branch
  3. Commit your changes
  4. Push to the branch
  5. Create a new Pull Request

License

MIT

-
security - not tested
F
license - not found
-
quality - not tested

local-only server

The server can only run on the client's local machine because it depends on local resources.

A server providing text-to-speech and speech-to-text functionalities using Windows' native speech services without external dependencies.

  1. Features
    1. Prerequisites
      1. Installation
        1. Usage
          1. Testing Interface
          2. Available Tools
        2. Troubleshooting
          1. Contributing
            1. License

              Related MCP Servers

              • -
                security
                F
                license
                -
                quality
                Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.
                Last updated -
                6
                Python
              • -
                security
                A
                license
                -
                quality
                Official Model Context Protocol server that enables interaction with powerful Speech-to-Text and Audio Intelligence APIs, allowing clients like Claude Desktop to transcribe audio, analyze speech, translate content, and more.
                Last updated -
                2
                Python
                MIT License
              • -
                security
                F
                license
                -
                quality
                A Model Context Protocol server that provides text-to-speech functionality for AI agents using Microsoft Edge's text-to-speech technology, supporting multiple voices, languages, and voice customization.
                Last updated -
                4
                Python
              • A
                security
                A
                license
                A
                quality
                A Model Context Protocol server that integrates with VOICEVOX engine to provide text-to-speech synthesis and speaker information retrieval, allowing users to generate and play voice audio from text.
                Last updated -
                2
                TypeScript
                MIT License
                • Apple

              View all related MCP servers

              MCP directory API

              We provide all the information about MCP servers via our MCP API.

              curl -X GET 'https://glama.ai/api/mcp/v1/servers/ExpressionsBot/MS-Lucidia-Voice-Gateway-MCP'

              If you have feedback or need assistance with the MCP directory API, please join our Discord server