A server that allows Claude to control audio playback on your computer, supporting MP3, WAV, and OGG files with features like play, list, and stop commands.
A voice-to-text transcription service that converts audio files to transcripts using SiliconFlow, supporting both multipart/form-data and base64 formats.
A portable, Dockerized Python tool that implements Model Context Protocol for audio transcription using Whisper models, featuring both CLI and web UI interfaces for converting audio files to JSON transcriptions.
Enables Claude and other AI assistants to interact with your computer's audio system, allowing for recording from microphones and playing audio through speakers.
Enables voice-based interactions with Claude by converting text to speech using Kokoro TTS and transcribing user responses using NVIDIA NeMo ASR, creating interactive voice dialogues.