Integrates ElevenLabs text-to-speech with Cursor through the Model Context Protocol, letting users convert text to speech with a selectable voice from inside the editor.
Downloads videos from platforms such as YouTube and transcribes them to text using OpenAI Whisper and ffmpeg. Transcriptions can be output as TXT, JSON, SRT, or VTT.
Enables agents to convert text to speech using OpenAI's TTS models with voice selection, delivery instructions, and queue-based audio playback. Supports both blocking and non-blocking modes for flexible audio generation and playback control.
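To illustrate what queue-based playback with blocking and non-blocking modes might look like, here is a minimal Python sketch. It is not taken from any of the servers listed above: the `PlaybackQueue` class, its `speak` and `wait_until_idle` methods, and the `play` callable are all hypothetical names, and `play` stands in for whatever actually synthesizes and plays audio.

```python
import queue
import threading


class PlaybackQueue:
    """Plays queued items one at a time, in order, on a background thread."""

    def __init__(self, play):
        # `play` is a hypothetical callable that synthesizes and plays one item.
        self._play = play
        self._items = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            text, done = self._items.get()
            try:
                self._play(text)
            except Exception:
                # Sketch only: a failed item is skipped so the queue keeps moving.
                pass
            finally:
                done.set()
                self._items.task_done()

    def speak(self, text, blocking=False):
        """Enqueue text; wait for it to finish only when blocking is True."""
        done = threading.Event()
        self._items.put((text, done))
        if blocking:
            done.wait()
        return done

    def wait_until_idle(self):
        """Block until every queued item has been played."""
        self._items.join()


if __name__ == "__main__":
    played = []
    speaker = PlaybackQueue(played.append)  # list.append stands in for real audio
    speaker.speak("first")                  # non-blocking: returns immediately
    speaker.speak("second", blocking=True)  # blocking: returns after playback
    print(played)  # prints ['first', 'second']
```

Because a single worker drains the queue in FIFO order, a blocking call also guarantees that everything queued before it has finished. A non-blocking caller can keep the returned event and check `is_set()` later if it needs to know when playback completed.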