All Voice Lab MCP Server

Official
by allvoicelab

clone_voice

Create custom voice profiles from audio samples for text-to-speech and speech-to-speech applications. Analyze MP3 or WAV files to generate voice replicas that mimic original audio characteristics.

Instructions

[AllVoiceLab Tool] Create a custom voice profile by cloning from an audio sample.

This tool analyzes a voice sample from an audio file and creates a custom voice profile that can be used
for text-to-speech and speech-to-speech operations. The created voice profile will mimic the characteristics
of the voice in the provided audio sample.

Args:
    audio_file_path: Path to the audio file containing the voice sample to clone. Only MP3 and WAV formats are supported. Maximum file size: 10MB.
    name: Name to assign to the cloned voice profile. Required.
    description: Optional description for the cloned voice profile.
    
Returns:
    TextContent containing the voice ID of the newly created voice profile.
    
Limitations:
    - Only MP3 and WAV formats are supported
    - Maximum file size: 10MB (smaller than other audio tools)
    - File must exist and be accessible
    - Requires permission to use voice cloning feature
    - Audio sample should contain clear speech with minimal background noise for best results
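The limitations above can be checked on the client before the tool is called. The following is a minimal sketch, not the server's own validator: `precheck_audio_file` is a hypothetical helper, the format check looks only at the file extension, and 1MB is assumed to mean 1024 * 1024 bytes.

```python
import os
from typing import Tuple

ALLOWED_EXTENSIONS = {".mp3", ".wav"}  # formats listed under Limitations
MAX_SIZE_MB = 10                       # size limit listed under Limitations


def precheck_audio_file(path: str, max_size_mb: int = MAX_SIZE_MB) -> Tuple[bool, str]:
    """Hypothetical client-side check. Returns (is_valid, error_message)."""
    if not path:
        return False, "audio_file_path cannot be empty"
    if not os.path.isfile(path):
        return False, f"Audio file does not exist: {path}"

    # Assumption: the extension is a good enough proxy for the audio format.
    extension = os.path.splitext(path)[1].lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, f"Unsupported format '{extension}': only MP3 and WAV are supported"

    # Assumption: 1MB = 1024 * 1024 bytes.
    size_mb = os.path.getsize(path) / (1024 * 1024)
    if size_mb > max_size_mb:
        return False, f"File is {size_mb:.1f}MB, the limit is {max_size_mb}MB"

    return True, ""
```

A check like this cannot confirm the permission requirement or the quality of the recording; those are only known once the server has processed the request.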

Input Schema

Name             Required  Description  Default
audio_file_path  Yes
name             Yes
description      No
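For illustration, the arguments in this schema could be wrapped in an MCP `tools/call` request as sketched below. `build_clone_voice_call` is a hypothetical helper, and the file path and voice name in the usage lines are placeholders; most MCP client libraries build this envelope for you.

```python
import json
from typing import Any, Dict, Optional


def build_clone_voice_call(
    audio_file_path: str,
    name: str,
    description: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for the clone_voice tool."""
    if not audio_file_path or not name:
        raise ValueError("audio_file_path and name are required")

    arguments: Dict[str, Any] = {"audio_file_path": audio_file_path, "name": name}
    # description is optional, so it is only sent when provided.
    if description is not None:
        arguments["description"] = description

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "clone_voice", "arguments": arguments},
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    request = build_clone_voice_call("/path/to/sample.wav", "narrator")
    print(json.dumps(request, indent=2))
```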

Implementation Reference

  • The handler function that executes the clone_voice tool. It validates the input audio file and the name, calls all_voice_lab.add_voice to clone the voice, handles the expected exceptions, and returns a TextContent containing either the new voice ID or an error message.
    def clone_voice(
        audio_file_path: str,
        name: str,
        description: str = None
    ) -> TextContent:
        """
        Create a custom voice profile by cloning from an audio sample
        
        Args:
            audio_file_path: Path to the audio file containing the voice sample to clone. Only MP3 and WAV formats are supported. Maximum file size: 10MB.
            name: Name to assign to the cloned voice profile. Required.
            description: Optional description for the cloned voice profile.
            
        Returns:
            TextContent: Contains the voice ID of the newly created voice profile.
        """
        all_voice_lab = get_client()
        logging.info("Tool called: clone_voice")
        logging.info(f"Audio file path: {audio_file_path}")
        logging.info(f"Voice name: {name}")
        if description:
            logging.info(f"Voice description: {description}")
    
        # Validate audio file, using 10MB size limit
        is_valid, error_message = validate_audio_file(audio_file_path, max_size_mb=10)
        if not is_valid:
            return create_error_response(error_message)
    
        # Validate name parameter
        if not name:
            logging.warning("Name parameter is empty")
            return create_error_response("name parameter cannot be empty")
    
        try:
            logging.info("Starting voice cloning process")
            voice_id = all_voice_lab.add_voice(name, audio_file_path, description)
            logging.info(f"Voice cloning successful, voice ID: {voice_id}")
            return TextContent(
                type="text",
                text=f"Voice cloning completed. Your new voice ID is: {voice_id}\n"
            )
        except VoiceCloneNoPermissionError as e:
            logging.error(f"Voice cloning failed: {str(e)}")
            return TextContent(
                type="text",
                text="Voice cloning failed, you don't have permission to clone voice. Please contact AllVoiceLab com."
            )
        except FileNotFoundError as e:
            logging.error(f"Audio file does not exist: {audio_file_path}, error: {str(e)}")
            return TextContent(
                type="text",
                text=f"Audio file does not exist: {audio_file_path}"
            )
        except Exception as e:
            logging.error(f"Voice cloning failed: {str(e)}")
            return TextContent(
                type="text",
                text="Voice cloning failed, tool temporarily unavailable"
            )
  • MCP tool registration for clone_voice, including name, detailed description serving as input/output schema with args, returns, and limitations.
    mcp.tool(
        name="clone_voice",
        description="""[AllVoiceLab Tool] Create a custom voice profile by cloning from an audio sample.
        
        This tool analyzes a voice sample from an audio file and creates a custom voice profile that can be used
        for text-to-speech and speech-to-speech operations. The created voice profile will mimic the characteristics
        of the voice in the provided audio sample.
        
        Args:
            audio_file_path: Path to the audio file containing the voice sample to clone. Only MP3 and WAV formats are supported. Maximum file size: 10MB.
            name: Name to assign to the cloned voice profile. Required.
            description: Optional description for the cloned voice profile.
            
        Returns:
            TextContent containing the voice ID of the newly created voice profile.
            
        Limitations:
            - Only MP3 and WAV formats are supported
            - Maximum file size: 10MB (smaller than other audio tools)
            - File must exist and be accessible
            - Requires permission to use voice cloning feature
            - Audio sample should contain clear speech with minimal background noise for best results
        """
    )(clone_voice)
  • Type hints and docstring in the handler function define the input schema (parameters with types and descriptions) and output type (TextContent). Also includes validation logic within the function.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It does this well by describing what the tool does (creates voice profiles), what it returns (voice ID), and important behavioral constraints: format limitations (MP3/WAV only), size limits (10MB), accessibility requirements, permission needs, and quality recommendations for audio samples. This covers most key behavioral aspects for a creation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Args, Returns, Limitations) and front-loaded with the core purpose. Each sentence adds value, though the 'Limitations' section could be slightly more concise. Overall, it's appropriately sized for a tool with multiple parameters and behavioral constraints.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (voice cloning with 3 parameters, no annotations, no output schema), the description provides substantial context: purpose, parameters, return value, and limitations. It covers the essential aspects for a creation tool, though it could benefit from more detail on error conditions or what happens with invalid audio samples. The absence of an output schema is mitigated by describing the return value.
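Because there is no output schema, a caller has to read the voice ID out of the returned text. The sketch below relies on the success message used by the handler shown earlier ("Voice cloning completed. Your new voice ID is: ..."); `extract_voice_id` is a hypothetical helper, and it assumes the voice ID contains no whitespace. Any other text, including the handler's failure messages, yields `None`.

```python
import re
from typing import Optional

# Matches the success message emitted by the clone_voice handler.
# Assumption: the voice ID is a single whitespace-free token.
_VOICE_ID_PATTERN = re.compile(r"Your new voice ID is:\s*(\S+)")


def extract_voice_id(result_text: str) -> Optional[str]:
    """Return the voice ID from a clone_voice result, or None if cloning failed."""
    match = _VOICE_ID_PATTERN.search(result_text)
    return match.group(1) if match else None
```

Parsing free text is brittle: if the server changes its message wording, this helper returns `None` even for a successful call, so treat `None` as "unknown" and surface the raw text to the user.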

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must fully compensate. It does this excellently by providing detailed semantics for all 3 parameters: audio_file_path (path to audio file, supported formats, size limit), name (required name for profile), and description (optional description). It adds crucial information not in the schema, like format restrictions and file requirements.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Create a custom voice profile by cloning from an audio sample.' It specifies the action (create/clone), resource (voice profile), and source (audio sample). It distinguishes itself from siblings like text_to_speech or get_voices by focusing on voice creation rather than using existing voices or other audio operations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for creating voice profiles from audio samples for text-to-speech or speech-to-speech operations. It mentions limitations like supported formats and file size, which help determine appropriateness. However, it doesn't explicitly state when NOT to use it or name specific alternatives among the sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/allvoicelab/AllVoiceLab-MCP'
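The same request can be made from Python with the standard library. This is a sketch under one assumption: that the endpoint returns a JSON body, which the page does not state explicitly. `build_request` and `fetch_server_info` are hypothetical helper names.

```python
import json
import urllib.request
from typing import Any

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/allvoicelab/AllVoiceLab-MCP"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request equivalent to the curl command above."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> Any:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs a live network request and raises `urllib.error.URLError` if the host cannot be reached.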

If you have feedback or need assistance with the MCP directory API, please join our Discord server.