Speech Processing MCP connectors
AI phone secretary: place calls, read transcripts, list calls, agents, and stats.
Create, inspect, and manage Wubble music, speech, voice, and sound-effect requests through MCP.
One key, 100+ models: chat with any LLM and generate video, images, and speech. Free trial at 370.ai.
AI voice agents: assistants, calls, campaigns, leads, knowledge bases, WhatsApp, SMS & SIP trunks.
Human-input bridge for AI agents with voice-first answer links, MCP tools, and HTTP APIs.
OCR, transcription, file extraction, and image generation for AI agents via MCP.
AI voice agents on SMB websites — fully autonomous build in 2–3 min. 23 MCP tools. EU, GDPR.
YouTube video search with transcript extraction as first-class output.
Pronunciation scoring, speech-to-text, and text-to-speech for language learning.
Manage Speko voice-AI agents, sessions, calls, phone numbers, knowledge bases, evals, and docs.
Official MCP server for OmniDimension. Drive voice agents, dispatch calls, and run bulk campaigns.
Give AI agents real phone numbers, messages, and voice calls via MCP.
Search recordings, summarize meetings, create clips, and automate workflows from your AI assistant.
Noogat is a voice-first note-taking app for iOS and web with an MCP server for AI coding agents. Capture ideas hands-free via Siri, retrieve them inside Claude Code, Claude Desktop, or Cursor. Features: semantic search, AI auto-tagging, search by time, related notes. Pro subscription required for MCP access.
Voice notes that organize themselves: capture via Siri, auto-tag with AI, and retrieve with semantic search.
Podcast intelligence for agents: transcripts, clips, speaker diarization, mention tracking.
An MCP server that fetches video transcripts/subtitles, with pagination for large responses. Supports YouTube, Twitter/X, Instagram, TikTok, Twitch, Vimeo, Facebook, Bilibili, VK, Dailymotion, and Reddit. Falls back to Whisper transcription of the audio when subtitles are unavailable.
Voice AI assistant builder for websites. Create, deploy, and analyze AI voice bots that understand natural speech, navigate pages, fill forms, and respond in 50+ languages. Includes knowledge base training, visitor intelligence, and conversation analytics.