# Resemble AI Voice Generation MCP Server

## Overview
This project creates an MCP server that integrates with Resemble AI's voice generation capabilities. The server enables Claude Desktop to generate and manage voice content through natural language interactions.

## Requirements
- Python 3.10 or higher
- MCP SDK v1.2.0 or higher
- Resemble AI API key

## Implementation Plan

### 1. Server Setup
- ✅ Bootstrap Python MCP server
- ✅ Install required dependencies
- ✅ Configure environment for API keys

### 2. Tool Implementation
- ✅ `list_voices`: Retrieve available voice models from Resemble AI
- ✅ `generate_tts`: Generate voice audio from text input

### 3. Security & Error Handling
- ✅ Secure storage and usage of API credentials
- ✅ Comprehensive error handling with detailed logging
- ✅ Proper response validation with Pydantic
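As an illustration of the credential handling described above, the following is a minimal sketch, not the project's actual code. It assumes the key lives in the `RESEMBLE_API_KEY` environment variable (as in the `.env` example later in this document); the names `load_api_key`, `mask_secret`, and `MissingApiKeyError` are hypothetical.

```python
import os


class MissingApiKeyError(RuntimeError):
    """Raised when the Resemble AI API key is not configured (hypothetical name)."""


def load_api_key(env=None):
    """Return the API key from the environment, failing fast if it is absent."""
    env = os.environ if env is None else env
    key = env.get("RESEMBLE_API_KEY", "").strip()
    if not key:
        raise MissingApiKeyError("RESEMBLE_API_KEY is not set; add it to your .env file")
    return key


def mask_secret(secret, visible=4):
    """Mask a secret for log output, keeping only the last few characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Logging `mask_secret(key)` instead of the raw key keeps the credential out of log files while still letting you confirm which key is in use.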

### 4. Testing
- ✅ Created test script (test_server.py)
- ✅ Updated API endpoints to current Resemble AI API (app.resemble.ai)
- ✅ Added alternative SDK-based implementation
- ✅ Created HTTP server for direct API access
- ✅ Test each tool with valid inputs
- ✅ Verify output format is correct
- ✅ Test integration with Claude Desktop

### 5. Documentation
- ✅ Code documentation with docstrings
- ✅ Usage examples in README.md
- ✅ Setup instructions in README.md
- ✅ Added troubleshooting section to README.md
- ✅ Added Cursor AI integration instructions

## API Updates (March 2025)
- Updated endpoint from `https://api.resemble.ai/v2` to `https://app.resemble.ai/api/v2`
- Updated clip creation flow to require project UUID
- TTS parameter name changed from "text" to "body"
- Audio URL parameter changed from "audio_url" to "audio_src"
- Added alternative implementation using the official Resemble SDK
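The renamed fields above could be handled roughly as follows. This is a sketch under stated assumptions: the `/projects/{project_uuid}/clips` path and the `voice_uuid` field name are assumptions about the request layout, and `build_clip_request` and `extract_audio_url` are hypothetical helpers rather than functions from this project.

```python
API_BASE = "https://app.resemble.ai/api/v2"  # updated base URL from the list above


def build_clip_request(project_uuid, voice_uuid, text):
    """Return (url, payload) for a clip-creation request."""
    if not project_uuid:
        raise ValueError("project_uuid is required by the updated clip flow")
    # Assumed path layout: clips are nested under their project.
    url = f"{API_BASE}/projects/{project_uuid}/clips"
    # The text to speak goes in "body" (formerly "text").
    payload = {"body": text, "voice_uuid": voice_uuid}
    return url, payload


def extract_audio_url(clip):
    """Read the audio location, preferring "audio_src" over the older "audio_url"."""
    return clip.get("audio_src") or clip.get("audio_url")
```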

## Implementation Options

This project now offers three implementation approaches:

1. **Direct API Implementation** (resemble_ai_server.py)
   - Uses direct HTTP requests to the Resemble AI API
   - Minimal dependencies (requests, pydantic)
   - Good for learning how the API works

2. **SDK-based Implementation** (resemble_ai_sdk_server.py)
   - Uses the official Resemble SDK
   - More idiomatic and aligned with Resemble's recommendations
   - Less code to maintain, since SDK updates can absorb many API changes

3. **HTTP Server Implementation** (resemble_http_server.py)
   - Exposes a RESTful API for tool access
   - Doesn't require the MCP framework; works with any HTTP client
   - Best option for Cursor AI integration
   - Uses FastAPI for robust server implementation

## Integration Testing with Claude Desktop

To test the integration with Claude Desktop:

1. Create a `.env` file with your Resemble AI API key:
   ```
   RESEMBLE_API_KEY=your_api_key_here
   ```

2. Test the server implementation directly first:
   ```bash
   # Test direct implementation
   python test_server.py --implementation direct
   
   # Test SDK implementation
   python test_server.py --implementation sdk
   ```

3. Start the MCP server (choose one implementation):
   ```bash
   # Direct API implementation
   python resemble_ai_server.py
   
   # SDK implementation
   python resemble_ai_sdk_server.py
   
   # HTTP server implementation
   python resemble_http_server.py
   ```

4. Configure Claude Desktop to use the MCP server by adding the configuration from `claude_desktop_config.json` to Claude's settings.

5. Test with sample prompts:
   - "List all available voice models from Resemble AI"
   - "Generate audio of the text 'Hello world' using a female English voice"
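For step 4, the contents of `claude_desktop_config.json` are not reproduced here. The sketch below generates an entry in the `mcpServers` layout that Claude Desktop reads; the server name `resemble-ai`, the script path, and the helper `build_claude_config` are illustrative assumptions, and the file shipped with the project may differ.

```python
import json


def build_claude_config(server_script, api_key, python_cmd="python"):
    """Build a Claude Desktop config entry that launches the MCP server."""
    return {
        "mcpServers": {
            "resemble-ai": {  # hypothetical server name
                "command": python_cmd,
                "args": [server_script],
                "env": {"RESEMBLE_API_KEY": api_key},
            }
        }
    }


# Print the JSON so it can be pasted into Claude Desktop's configuration.
config = build_claude_config("/absolute/path/to/resemble_ai_server.py", "your_api_key_here")
print(json.dumps(config, indent=2))
```

Using an absolute script path tends to be safer, since Claude Desktop may not start the process from the project directory.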

## Integration with Cursor AI

For Cursor AI integration, the HTTP server provides the most flexible approach:

1. Start the HTTP server:
   ```bash
   python resemble_http_server.py
   ```

2. The server exposes an endpoint at `http://localhost:8000/tools` that accepts POST requests with the following format:
   ```json
   {
     "tool": "list_voices",
     "params": {}
   }
   ```
   or
   ```json
   {
     "tool": "generate_tts",
     "params": {
       "text": "Hello world",
       "voice_id": "voice_uuid",
       "return_type": "file",
       "output_filename": "my_audio"
     }
   }
   ```

3. A demonstration script for Cursor AI integration is provided in `cursor_ai_example.py`.

4. Example prompts for Cursor AI:
   - "Can you help me list all available voice models from the Resemble AI server running at http://localhost:8000/tools?"
   - "Generate audio of the text 'Hello, Cursor AI here' using the Resemble AI server at http://localhost:8000/tools. Save it to a file called 'cursor_speech.mp3'."
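The request format above can be exercised from any HTTP client. Below is a minimal standard-library sketch; it is not the contents of `cursor_ai_example.py`. The helper names are hypothetical, and the assumption that the server replies with JSON is not confirmed by this document.

```python
import json
import urllib.request

TOOLS_URL = "http://localhost:8000/tools"


def build_tool_request(tool, params=None, url=TOOLS_URL):
    """Build a POST request carrying {"tool": ..., "params": ...} as JSON."""
    body = json.dumps({"tool": tool, "params": params or {}}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_tool(tool, params=None, timeout=30):
    """Send the request and decode the reply (assumes a JSON response body)."""
    request = build_tool_request(tool, params)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the HTTP server running, `call_tool("list_voices")` or `call_tool("generate_tts", {"text": "Hello world", "voice_id": "voice_uuid", "return_type": "file", "output_filename": "my_audio"})` would send the two payloads shown above.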

## Next Steps

1. ✅ Test both implementations using the test script
2. ✅ Select the preferred implementation based on testing results
3. ✅ Test the integration with Claude Desktop
4. ✅ Add Cursor AI integration
5. ⬜ Create a demonstration video
6. ⬜ Publish the code to GitHub with documentation

## Notes for Improvement

- Consider adding more tools such as:
  - `get_voice_details`: Get detailed information about a specific voice
  - `clip_management`: Save, retrieve, or delete generated clips
  - `custom_voice_support`: Support for any custom voices created by the user
- Add rate limiting to respect Resemble AI's API constraints
- Implement caching to avoid redundant API calls
- Add authentication to the HTTP server for production use
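The caching idea could start as small as a time-based cache around `list_voices`, whose result changes rarely. This is one possible sketch, not a planned design; `TTLCache` and its parameters are hypothetical.

```python
import time


class TTLCache:
    """Cache values for a fixed number of seconds (hypothetical helper)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._store = {}

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch() only when stale."""
        now = self._clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = fetch()
        self._store[key] = (now, value)
        return value
```

Wrapping the voice lookup as `cache.get_or_fetch("voices", fetch_voices)` would also reduce the number of requests counted against any API rate limit.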

## Connecting MCP Server to Cursor

The new MCP SDK-based implementation (`resemble_mcp_server.py`) provides a standardized way to connect to Cursor:

1. Make sure you have Python 3.10 or higher installed
2. Set up the environment using one of the provided scripts:
   ```bash
   # For conda users
   ./setup_environment.sh

   # For venv users
   ./setup_venv.sh
   ```

3. Activate the environment:
   ```bash
   # For conda
   conda activate resemble_mcp

   # For venv
   source venv/bin/activate
   ```

4. Run the MCP server:
   ```bash
   python resemble_mcp_server.py --port 8083
   ```

5. In Cursor:
   - Go to Settings
   - Under the "AI" section, locate the "Add an MCP Server" option
   - Enter the following SSE URL: `http://localhost:8083/sse`
   - Save the configuration

6. You can now use the Resemble AI tools in Cursor with commands like:
   - "List all available voice models from Resemble AI"
   - "Generate audio for the text 'Hello, this is a test'"

## Troubleshooting MCP Server Connection

If you encounter issues connecting the MCP server to Cursor:

1. Verify the server is running by checking for the following output:
   ```
   INFO: Uvicorn running on http://0.0.0.0:8083 (Press CTRL+C to quit)
   ```

2. Test the SSE endpoint using the debug script:
   ```bash
   python sse_debug.py http://localhost:8083/sse
   ```
   This should show events being received from the server.

3. Check that your `RESEMBLE_API_KEY` is properly set in the `.env` file.

4. Ensure you're using the correct SSE URL in Cursor: `http://localhost:8083/sse`

5. If you see connection errors, check for any firewall or antivirus software that might be blocking the connection.
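When reading the output of step 2, it can help to know how an SSE stream is framed: `event:` and `data:` lines accumulate until a blank line ends the event, and lines starting with `:` are comments. The parser below is an independent sketch of that framing, not the contents of `sse_debug.py`; `parse_sse` is a hypothetical name.

```python
def parse_sse(lines):
    """Yield (event, data) tuples from an iterable of decoded SSE lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line ends the current event; dispatch it if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]  # a single leading space is not part of the value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
```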

# Development Notes

## MCP Server Implementation

### Key Components

1. **Server Transport**:
   - SSE (Server-Sent Events): Network-based communication (default)
   - StdIO: Direct process communication (added in latest version)

2. **API Integration**:
   - Direct API calls to Resemble AI
   - Official Resemble SDK wrapper

3. **Configuration Methods**:
   - Environment variables
   - Command-line arguments
   - Config file

### MCP Server Versions

1. **MCP SDK Version**: Uses the official MCP SDK with SSE transport
2. **HTTP Version**: Simple HTTP implementation with SSE fallback
3. **StdIO Version**: Uses direct stdio communication for Claude/Cursor integration

### Error Handling

- Added automatic fallback to HTTP implementation if MCP SDK import fails
- Added comprehensive logging to file for StdIO implementation
- Graceful handling of API errors from Resemble
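The automatic fallback might look something like the sketch below. It only checks whether a top-level package named `mcp` can be found, which is an assumption about how the SDK is installed; `choose_implementation` and the returned labels are hypothetical.

```python
import importlib.util
import logging

logger = logging.getLogger("resemble_mcp")


def choose_implementation(module_name="mcp"):
    """Return "mcp_sdk" when the package can be found, otherwise "http"."""
    if importlib.util.find_spec(module_name) is not None:
        return "mcp_sdk"
    logger.warning(
        "%s is not importable; falling back to the HTTP implementation", module_name
    )
    return "http"
```

Note that `find_spec` only confirms the package is present. An installed but incompatible SDK version would still need a `try`/`except` around the actual import or constructor call.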

## Troubleshooting

### MCP SDK Import Issues

The current error is:
```
Server.__init__() got an unexpected keyword argument 'transport'
```

This indicates the installed MCP SDK version doesn't support the transport parameter in the constructor. We've addressed this with:

1. Updated code to handle MCP SDK import errors
2. Added a fallback to the HTTP implementation
3. Created a new StdIO implementation that doesn't require the MCP SDK
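A complementary option, sketched here with stand-in classes, is to inspect a constructor's signature before passing optional keywords, so that an older SDK build never receives an argument it does not know. `OldServer` and `NewServer` are hypothetical stand-ins and do not represent the MCP SDK's real classes.

```python
import inspect


def accepts_keyword(callable_obj, name):
    """Return True if callable_obj can be called with the keyword `name`."""
    params = inspect.signature(callable_obj).parameters
    if name in params:
        return params[name].kind in (
            inspect.Parameter.KEYWORD_ONLY,
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
        )
    # A **kwargs parameter accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def construct(cls, *args, **optional_kwargs):
    """Instantiate cls, dropping optional keywords its constructor rejects."""
    supported = {k: v for k, v in optional_kwargs.items() if accepts_keyword(cls, k)}
    return cls(*args, **supported)


class OldServer:  # stand-in for a build without the `transport` parameter
    def __init__(self, name):
        self.name = name


class NewServer:  # stand-in for a build that accepts `transport`
    def __init__(self, name, transport="sse"):
        self.name = name
        self.transport = transport
```

Silently dropping a keyword can hide a misconfiguration, so logging the dropped names is advisable in practice.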

### StdIO Implementation Benefits

- Doesn't require a separate server process
- Easier integration with Claude Desktop and Cursor
- Doesn't require network ports or connectivity
- Smaller attack surface, since communication stays within a local process pipe rather than an open network port
- Automatically managed by the LLM client
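To show what direct process communication involves, here is a heavily simplified sketch of a stdio loop that answers newline-delimited JSON-RPC requests. It is not the project's StdIO implementation: it omits the initialization handshake and tool execution, and `handle_line`, `serve`, and the `TOOLS` list are hypothetical.

```python
import json

TOOLS = [
    {"name": "list_voices", "description": "Retrieve available voice models"},
    {"name": "generate_tts", "description": "Generate voice audio from text input"},
]


def _error(request_id, code, message):
    """Build a JSON-RPC 2.0 error reply."""
    return {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}


def handle_line(line):
    """Handle one JSON-RPC request line and return the reply as a JSON string."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return json.dumps(_error(None, -32700, "Parse error"))
    if not isinstance(request, dict):
        return json.dumps(_error(None, -32600, "Invalid Request"))
    if request.get("method") == "tools/list":
        reply = {"jsonrpc": "2.0", "id": request.get("id"), "result": {"tools": TOOLS}}
    else:
        reply = _error(request.get("id"), -32601, "Method not found")
    return json.dumps(reply)


def serve(stdin, stdout):
    """Read requests from stdin and write one reply per line to stdout."""
    for line in stdin:
        if line.strip():
            stdout.write(handle_line(line) + "\n")
            stdout.flush()
```

Calling `serve(sys.stdin, sys.stdout)` would run the loop. Because stdout carries protocol messages, diagnostics belong in a log file or on stderr, which matches the file logging mentioned under Error Handling.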

### Server Requirements

- Python 3.10+ (MCP SDK requirement)
- Resemble AI API key
- uvicorn, fastapi, requests (for HTTP/SSE implementations)
- python-dotenv for loading configuration from `.env`
- pydantic for data validation

## Future Improvements

1. Add unit tests for each implementation
2. Add more granular error handling
3. Improve voice selection with metadata
4. Add audio effects support
5. Add streaming audio support
6. Add voice cloning capability 
