Skip to main content
Glama

Voice Call MCP Server

by popcornspace
README.md6.64 kB
# Voice Call MCP Server A Model Context Protocol (MCP) server that enables Claude and other AI assistants to initiate and manage voice calls using Twilio and OpenAI (GPT-4o Realtime model). Use this as a base to kick-start your AI-powered voice calling explorations, save time and develop additional functionality on top of it. ![Demo](./assets/demo.gif) ## Sequence Diagram ```mermaid sequenceDiagram participant AI as AI Assistant (e.g., Claude) participant MCP as MCP Server participant Twilio as Twilio participant Phone as Destination Phone participant OpenAI as OpenAI AI->>MCP: 1) Initiate outbound call request <br>(POST /calls) MCP->>Twilio: 2) Place outbound call via Twilio API Twilio->>Phone: 3) Ring the destination phone Twilio->>MCP: 4) Call status updates & audio callbacks (webhooks) MCP->>OpenAI: 5) Forward real-time audio to OpenaAI's realtime model OpenAI->>MCP: 6) Return voice stream MCP->>Twilio: 7) Send voice stream Twilio->>Phone: 8) Forward voice stream Note over Phone: Two-way conversation continues <br>until the call ends ``` ## Features - Make outbound phone calls via Twilio 📞 - Process call audio in real-time with GPT-4o Realtime model 🎙️ - Real-time language switching during calls 🌐 - Pre-built prompts for common calling scenarios (like restaurant reservations) 🍽️ - Automatic public URL tunneling with ngrok 🔄 - Secure handling of credentials 🔒 ## Why MCP? The Model Context Protocol (MCP) bridges the gap between AI assistants and real-world actions. By implementing MCP, this server allows AI models like Claude to: 1. Initiate actual phone calls on behalf of users 2. Process and respond to real-time audio conversations 3. Execute complex tasks requiring voice communication This open-source implementation provides transparency and customizability, allowing developers to extend functionality while maintaining control over their data and privacy. ## Requirements - Node.js >= 22 - If you need to update Node.js, we recommend using `nvm` (Node Version Manager): ```bash nvm install 22 nvm use 22 ``` - Twilio account with API credentials - OpenAI API key - Ngrok Authtoken ## Installation ### Manual Installation 1. Clone the repository ```bash git clone https://github.com/lukaskai/voice-call-mcp-server.git cd voice-call-mcp-server ``` 2. Install dependencies and build ```bash npm install npm run build ``` ## Configuration The server requires several environment variables: - `TWILIO_ACCOUNT_SID`: Your Twilio account SID - `TWILIO_AUTH_TOKEN`: Your Twilio auth token - `TWILIO_NUMBER`: Your Twilio number - `OPENAI_API_KEY`: Your OpenAI API key - `NGROK_AUTHTOKEN`: Your ngrok authtoken - `RECORD_CALLS`: Set to "true" to record calls (optional) ### Claude Desktop Configuration To use this server with Claude Desktop, add the following to your configuration file: **macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json` **Windows**: `%APPDATA%\Claude\claude_desktop_config.json` ```json { "mcpServers": { "voice-call": { "command": "node", "args": ["/path/to/your/mcp-new/dist/start-all.cjs"], "env": { "TWILIO_ACCOUNT_SID": "your_account_sid", "TWILIO_AUTH_TOKEN": "your_auth_token", "TWILIO_NUMBER": "your_e.164_format_number", "OPENAI_API_KEY": "your_openai_api_key", "NGROK_AUTHTOKEN": "your_ngrok_authtoken" } } } } ``` After that, restart Claude Desktop to reload the configuration. If connected, you should see Voice Call under the 🔨 menu. ## Example Interactions with Claude Here are some natural ways to interact with the server through Claude: 1. Simple call: ``` Can you call +1-123-456-7890 and let them know I'll be 15 minutes late for our meeting? ``` 2. Restaurant reservation: ``` Please call Delicious Restaurant at +1-123-456-7890 and make a reservation for 4 people tonight at 7:30 PM. Please speak in German. ``` 3. Appointment scheduling: ``` Please call Expert Dental NYC (+1-123-456-7899) and reschedule my Monday appointment to next Friday between 4–6pm. ``` ## Important Notes 1. **Phone Number Format**: All phone numbers must be in E.164 format (e.g., +11234567890) 2. **Rate Limits**: Be aware of your Twilio and OpenAI account's rate limits and pricing 3. **Voice Conversations**: The AI will handle natural conversations in real-time 4. **Call Duration**: Be mindful of call durations as they affect OpenAI API and Twilio costs 5. **Public Exposure**: Be aware that the ngrok tunnel exposes your server publicly for Twilio to reach it (though with a random URL and protected by a random secret) ## Troubleshooting Common error messages and solutions: 1. "Phone number must be in E.164 format" - Make sure the phone number starts with "+" and the country code 2. "Invalid credentials" - Double-check your TWILIO_ACCOUNT_SID and TWILIO_AUTH_TOKEN. You can copy them from the [Twilio Console](https://console.twilio.com) 3. "OpenAI API error" - Verify your OPENAI_API_KEY is correct and has sufficient credits 4. "Ngrok tunnel failed to start" - Ensure your NGROK_AUTHTOKEN is valid and not expired 5. "OpenAI Realtime does not detect the end of voice input, or is lagging." - Sometimes, there might be voice encoding issues between Twilio and the receiver's network operator. Try using a different receiver. ## Contributing Contributions are welcome! Here are some areas we're looking to improve: - Implement support for multiple AI models beyond the current implementation - Add database integration to store conversation history locally and make it accessible for AI context - Improve latency and response times to enhance call experiences - Enhance error handling and recovery mechanisms - Add more pre-built conversation templates for common scenarios - Implement improved call monitoring and analytics If you'd like to contribute, please open an issue to discuss your ideas before submitting a pull request. ## License This project is licensed under the MIT License - see the LICENSE file for details. ## Security Please do not include any sensitive information (like phone numbers or API credentials) in GitHub issues or pull requests. This server handles sensitive communications; deploy it responsibly and ensure all credentials are kept secure. ## Time For a New Mission? We’re hiring engineers to build at the frontier of voice AI — and bake it into a next-gen telco. Curious? Head to [careers.popcorn.space](https://careers.popcorn.space/apply) 🍿 !

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/popcornspace/voice-call-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server