MS-Lucidia-Voice-Gateway-MCP

A Model Context Protocol (MCP) server that provides text-to-speech and speech-to-text capabilities using Windows' built-in speech services. This server leverages the native Windows Speech API (SAPI) through PowerShell commands, eliminating the need for external APIs or services.

Features

Text-to-Speech (TTS) using Windows SAPI voices
Speech-to-Text (STT) using Windows Speech Recognition
Simple web interface for testing
No external API dependencies
Uses native Windows capabilities

Prerequisites

Windows 10/11 with Speech Recognition enabled
Node.js 16+
PowerShell

Installation

Clone the repository:

git clone https://github.com/ExpressionsBot/MS-Lucidia-Voice-Gateway-MCP.git
cd MS-Lucidia-Voice-Gateway-MCP

Install dependencies:

npm install

Build the project:

npm run build

Usage

Testing Interface

Start the test server:

npm run test

Open http://localhost:3000 in your browser
Use the web interface to test TTS and STT capabilities

Available Tools

text_to_speech

Converts text to speech using Windows SAPI.

Parameters:

text (required): The text to convert to speech
voice (optional): The voice to use (e.g., "Microsoft David Desktop")
speed (optional): Speech rate from 0.5 to 2.0 (default: 1.0)

Example:

fetch('http://localhost:3000/tts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    text: "Hello, this is a test",
    voice: "Microsoft David Desktop",
    speed: 1.0
  })
});

speech_to_text

Records audio and converts it to text using Windows Speech Recognition.

Parameters:

duration (optional): Recording duration in seconds (default: 5, max: 60)

Example:

fetch('http://localhost:3000/stt', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    duration: 5
  })
}).then(response => response.json())
  .then(data => console.log(data.text));

Troubleshooting

Make sure Windows Speech Recognition is enabled:
- Open Windows Settings
- Go to Time & Language > Speech
- Enable Speech Recognition
Check available voices:
- Open PowerShell and run:
Add-Type -AssemblyName System.Speech (New-Object System.Speech.Synthesis.SpeechSynthesizer).GetInstalledVoices().VoiceInfo.Name
Test speech recognition:
- Open Speech Recognition in Windows Settings
- Run through the setup wizard if not already done
- Test that Windows can recognize your voice

Contributing

Fork the repository
Create your feature branch
Commit your changes
Push to the branch
Create a new Pull Request

License

MIT

This server cannot be installed

security - not tested

license - not found

quality - not tested

How are these scores calculated?

local-only server

The server can only run on the client's local machine because it depends on local resources.

A server providing text-to-speech and speech-to-text functionalities using Windows' native speech services without external dependencies.

Related Resources

Reddit Discussion about this server

Related MCP Servers

Kokoro TTS MCP Server
giannisanni
-
security
F
license
-
quality
Provides text-to-speech capabilities through the Model Context Protocol, allowing applications to easily integrate speech synthesis with customizable voices, adjustable speech speed, and cross-platform audio playback support.
Last updated -
6
Python
Gladia MCPofficial
gladiaio
-
security
A
license
-
quality
Official Model Context Protocol server that enables interaction with powerful Speech-to-Text and Audio Intelligence APIs, allowing clients like Claude Desktop to transcribe audio, analyze speech, translate content, and more.
Last updated -
2
Python
MIT License
Edge-TTS MCP Server
yuiseki
-
security
F
license
-
quality
A Model Context Protocol server that provides text-to-speech functionality for AI agents using Microsoft Edge's text-to-speech technology, supporting multiple voices, languages, and voice customization.
Last updated -
4
Python
VOICEVOX MCP Server
Yuki10Kobayashi
A
security
A
license
A
quality
A Model Context Protocol server that integrates with VOICEVOX engine to provide text-to-speech synthesis and speaker information retrieval, allowing users to generate and play voice audio from text.
Last updated -
2
TypeScript
MIT License

View all related MCP servers

MS-Lucidia-Voice-Gateway-MCP