Why this server?
Integrates with the ElevenLabs text-to-speech API to convert text into audio.
Why this server?
Provides text-to-speech capabilities using the Kokoro TTS model, offering multiple voice options and customizable speech parameters.
Why this server?
A Python server providing access to Whissle API endpoints for speech-to-text, diarization, translation, and text summarization.
Why this server?
Provides voice recognition and text extraction capabilities with support for both stdio and MCP modes, processing audio files or base64 encoded data and returning structured results with language, emotion, and speaker information.
Why this server?
A Model Context Protocol server that enables LLMs to extract and use content from unstructured documents across a wide variety of file formats. It can indirectly support audio processing when audio is embedded in a document.