An advanced MCP server that provides stateful, voice-controlled AGI capabilities with local speech-to-text (STT), text-to-speech (TTS), and intent detection. It lets users execute tools, manage memory, and conduct research through natural multi-turn dialogue with low-latency performance tracking.
Enables users to convert text into high-quality audio through the OpenAI Text-to-Speech API. It supports customizable model and voice selection for speech synthesis over MCP.
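As a rough illustration of the model and voice options mentioned above, the following sketch builds (but does not send) a request to OpenAI's public speech endpoint. The function name `build_speech_request` and its defaults are assumptions made for this example; the listing does not describe how the server itself constructs its calls.

```python
import json
import urllib.request

# Public OpenAI speech endpoint. How the MCP server calls it internally
# is not stated in the listing, so treat this as an illustrative sketch.
OPENAI_SPEECH_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(text, api_key, model="tts-1", voice="alloy"):
    """Build an HTTP request for synthesized speech without sending it.

    `build_speech_request` is a hypothetical helper, not part of the server.
    """
    if not text:
        raise ValueError("text must be non-empty")
    # The request body carries the model, the voice, and the input text.
    payload = {"model": model, "voice": voice, "input": text}
    return urllib.request.Request(
        OPENAI_SPEECH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` with a valid key would return the audio bytes; keeping construction separate from sending makes the model and voice choices easy to inspect.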
Enables conversion of YouTube videos to MP3 format through the Youtube To Mp315 API. It supports checking conversion status, retrieving video titles, and asynchronous video-to-audio conversion with customizable quality and time-range settings.
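Asynchronous conversion combined with a status check usually implies a polling loop on the client side. The sketch below shows one way that could look; `check_status`, the `state` field, and the values `"done"` and `"failed"` are hypothetical placeholders, since the listing does not document the API's actual response format.

```python
import time


def wait_for_conversion(check_status, job_id, interval=2.0, max_attempts=30):
    """Poll a status callable until the conversion finishes.

    `check_status` is a hypothetical callable that takes a job id and
    returns a dict with a "state" key; the real API may differ.
    """
    for _ in range(max_attempts):
        status = check_status(job_id)
        state = status.get("state")
        if state == "done":
            return status
        if state == "failed":
            raise RuntimeError(f"conversion {job_id} failed")
        # Still in progress: wait before asking again.
        time.sleep(interval)
    raise TimeoutError(f"conversion {job_id} not finished after {max_attempts} checks")
```

Injecting the status function keeps the loop independent of any particular HTTP client and makes it straightforward to exercise with a stub.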