speech_to_speech

Convert audio files to different voices while preserving speech content. Transform MP3 or WAV files using voice IDs with adjustable similarity and optional background noise removal.

Instructions

[AllVoiceLab Tool] Convert audio to another voice while preserving speech content.

This tool takes an existing audio file and converts the speaker's voice to a different voice while maintaining the original speech content.

Args:
    audio_file_path: Path to the source audio file. Only MP3 and WAV formats are supported. Maximum file size: 50MB.
    voice_id: Voice ID to use for the conversion. Required. Must be a valid voice ID from the available voices (use get_voices tool to retrieve).
    similarity: Voice similarity factor, range [0, 1], where 0 is least similar and 1 is most similar to the original voice characteristics. Default value is 1.
    remove_background_noise: Whether to remove background noise from the source audio before conversion. Default is False.
    output_dir: Output directory for the generated audio file. Default is user's desktop.

Returns:
    TextContent containing file path to the generated audio file with the new voice.

Limitations:
    - Only MP3 and WAV formats are supported
    - Maximum file size: 50MB
    - File must exist and be accessible

Input Schema

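Name                     Required  Description                                              Default
audio_file_path          Yes       Path to the source audio file (MP3 or WAV, max 50MB)     -
voice_id                 Yes       Voice ID to use for the conversion                       -
similarity               No        Voice similarity factor in the range [0, 1]              1
remove_background_noise  No        Remove background noise before conversion                False
output_dir               No        Output directory for the generated audio file            User's desktop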
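To make the parameters concrete, here is a minimal sketch of how a client might assemble an MCP `tools/call` request for this tool. The helper name `build_tool_call`, the file path, and the voice ID are hypothetical placeholders; a real voice ID would come from the get_voices tool. The range check simply mirrors the documented [0, 1] limit on similarity.

```python
import json


def build_tool_call(audio_file_path, voice_id, similarity=1.0,
                    remove_background_noise=False, output_dir=None,
                    request_id=1):
    """Build a JSON-RPC 2.0 'tools/call' payload for speech_to_speech.

    Hypothetical helper for illustration; not part of AllVoiceLab-MCP.
    """
    # Mirror the documented constraint before sending anything.
    if not 0 <= similarity <= 1:
        raise ValueError("similarity must be between 0 and 1")

    arguments = {
        "audio_file_path": audio_file_path,
        "voice_id": voice_id,
        "similarity": similarity,
        "remove_background_noise": remove_background_noise,
    }
    # Omit output_dir so the tool falls back to its documented default.
    if output_dir is not None:
        arguments["output_dir"] = output_dir

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "speech_to_speech", "arguments": arguments},
    }


# Placeholder path and voice ID, for illustration only.
request = build_tool_call("/path/to/input.mp3", "example-voice-id", similarity=0.8)
print(json.dumps(request, indent=2))
```

Only the two required arguments have to be supplied; everything else can be left out to use the defaults listed above.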

Implementation Reference

  • The main handler function for the speech_to_speech tool. It validates inputs, calls the AllVoiceLab client to perform the voice conversion, and returns the output file path or error.
    def speech_to_speech(
        audio_file_path: str,
        voice_id: str,
        similarity: float = 1,
        remove_background_noise: bool = False,
        output_dir: str = None
    ) -> TextContent:
        """
        Convert audio to another voice while preserving speech content

        Args:
            audio_file_path: Path to the source audio file. Only MP3 and WAV formats are supported. Maximum file size: 50MB.
            voice_id: Voice ID to use for the conversion. Required. Must be a valid voice ID from the available voices (use get_voices tool to retrieve).
            similarity: Voice similarity factor, range [0, 1], where 0 is least similar and 1 is most similar to the original voice characteristics. Default value is 1.
            remove_background_noise: Whether to remove background noise from the source audio before conversion. Default is False.
            output_dir: Output directory for the generated audio file. Default is user's desktop.

        Returns:
            TextContent: Contains the file path to the generated audio file with the new voice.
        """
        all_voice_lab = get_client()
        output_dir = all_voice_lab.get_output_path(output_dir)
        logging.info(f"Tool called: speech_to_speech, voice_id: {voice_id}, similarity: {similarity}")
        logging.info(f"Audio file path: {audio_file_path}, remove background noise: {remove_background_noise}")
        logging.info(f"Output directory: {output_dir}")

        # Validate audio file
        is_valid, error_message = validate_audio_file(audio_file_path)
        if not is_valid:
            return create_error_response(error_message)

        # Validate voice_id parameter
        if not voice_id:
            logging.warning("voice_id parameter is empty")
            return create_error_response("voice_id parameter cannot be empty")

        # Validate voice_id format (basic check)
        if not isinstance(voice_id, str) or len(voice_id.strip()) == 0:
            logging.warning(f"Invalid voice_id format: {voice_id}")
            return create_error_response("Invalid voice_id format")

        # Validate similarity range
        if similarity < 0 or similarity > 1:
            logging.warning(f"Similarity parameter {similarity} is out of range [0, 1]")
            return create_error_response("similarity parameter must be between 0 and 1")

        # Validate and create output directory
        is_valid, error_message = validate_output_directory(output_dir)
        if not is_valid:
            return create_error_response(error_message)

        try:
            logging.info("Starting speech conversion processing")
            file_path = all_voice_lab.speech_to_speech(audio_file_path, voice_id, output_dir, similarity, remove_background_noise)
            logging.info(f"Speech conversion successful, file saved at: {file_path}")
            return create_success_response(f"Audio conversion completed, file saved at: {file_path}\n")
        except FileNotFoundError as e:
            logging.error(f"Audio file does not exist: {audio_file_path}, error: {str(e)}")
            return create_error_response(f"Audio file does not exist: {audio_file_path}")
        except Exception as e:
            logging.error(f"Speech conversion failed: {str(e)}")
            return create_error_response("Conversion failed, tool temporarily unavailable")
  • The MCP tool registration for speech_to_speech, including name, description with input schema, and binding to the handler function.
    mcp.tool(
        name="speech_to_speech",
        description="""[AllVoiceLab Tool] Convert audio to another voice while preserving speech content.

        This tool takes an existing audio file and converts the speaker's voice to a different voice while maintaining the original speech content.

        Args:
            audio_file_path: Path to the source audio file. Only MP3 and WAV formats are supported. Maximum file size: 50MB.
            voice_id: Voice ID to use for the conversion. Required. Must be a valid voice ID from the available voices (use get_voices tool to retrieve).
            similarity: Voice similarity factor, range [0, 1], where 0 is least similar and 1 is most similar to the original voice characteristics. Default value is 1.
            remove_background_noise: Whether to remove background noise from the source audio before conversion. Default is False.
            output_dir: Output directory for the generated audio file. Default is user's desktop.

        Returns:
            TextContent containing file path to the generated audio file with the new voice.

        Limitations:
            - Only MP3 and WAV formats are supported
            - Maximum file size: 50MB
            - File must exist and be accessible
        """
    )(speech_to_speech)
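The handler calls `validate_audio_file`, whose implementation is not shown in this reference. The sketch below is one plausible way to enforce the documented limitations (MP3 or WAV only, at most 50MB, file must exist) while returning the `(is_valid, error_message)` pair the handler unpacks. The constants, the error strings, and the reading of 50MB as 50 * 1024 * 1024 bytes are all assumptions, not the project's actual code.

```python
import os
import tempfile

# Assumption: "50MB" is interpreted as 50 * 1024 * 1024 bytes.
MAX_FILE_SIZE_BYTES = 50 * 1024 * 1024
SUPPORTED_EXTENSIONS = {".mp3", ".wav"}


def validate_audio_file(audio_file_path):
    """Return (is_valid, error_message) for a candidate source file.

    Hypothetical re-implementation for illustration only.
    """
    if not audio_file_path:
        return False, "audio_file_path parameter cannot be empty"

    # The file must exist and be a regular file.
    if not os.path.isfile(audio_file_path):
        return False, f"Audio file does not exist: {audio_file_path}"

    # Compare the extension case-insensitively.
    extension = os.path.splitext(audio_file_path)[1].lower()
    if extension not in SUPPORTED_EXTENSIONS:
        return False, "Unsupported audio format. Only MP3 and WAV formats are supported"

    if os.path.getsize(audio_file_path) > MAX_FILE_SIZE_BYTES:
        return False, "Audio file exceeds the maximum size of 50MB"

    return True, None


# Usage: an empty temporary .wav file passes the extension and size checks.
with tempfile.NamedTemporaryFile(suffix=".wav") as tmp:
    print(validate_audio_file(tmp.name))
```

Checking the extension rather than the file contents keeps the sketch short; a stricter validator might also inspect the file header.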

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/allvoicelab/AllVoiceLab-MCP'
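For readers working in Python, the same request can be sketched with the standard library. The URL is the one from the curl command above; the function names are hypothetical, and the assumption that the endpoint returns a JSON body has not been verified here.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/allvoicelab/AllVoiceLab-MCP"


def build_request(url=SERVER_URL):
    """Build the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url=SERVER_URL, timeout=10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone is enough to inspect what would be sent.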

If you have feedback or need assistance with the MCP directory API, please join our Discord server.