MiniMax MCP

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
`MINIMAX_API_KEY`	Yes	Your MiniMax API key. Get it from MiniMax Global (https://www.minimax.io/platform/user-center/basic-information/interface-key) or MiniMax Mainland (https://platform.minimaxi.com/user-center/basic-information/interface-key)
`MINIMAX_API_HOST`	Yes	API host URL. Use https://api.minimax.io for Global or https://api.minimaxi.com for Mainland China
`MINIMAX_MCP_BASE_PATH`	Yes	Local output directory path, such as /User/xxx/Desktop
`MINIMAX_API_RESOURCE_MODE`	No	Resource mode, either `url` or `local`. Determines whether generated audio, image, and video files are provided as URLs or downloaded locally.	url
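The table above can be checked before the server is launched. The following is a minimal sketch, not the server's own startup code: the helper name `load_minimax_config` and the placeholder values are hypothetical, and only the variable names, the required/optional split, and the `url` default come from the table.

```python
from typing import Mapping

# Variable names and the default are taken from the configuration table.
REQUIRED_VARS = ("MINIMAX_API_KEY", "MINIMAX_API_HOST", "MINIMAX_MCP_BASE_PATH")
RESOURCE_MODES = ("url", "local")


def load_minimax_config(env: Mapping[str, str]) -> dict:
    """Hypothetical helper: collect the documented variables from `env`."""
    # Treat unset and empty values the same way: both count as missing.
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))

    # The resource mode is optional and falls back to the documented default.
    mode = env.get("MINIMAX_API_RESOURCE_MODE", "url")
    if mode not in RESOURCE_MODES:
        raise ValueError(
            f"MINIMAX_API_RESOURCE_MODE must be one of {RESOURCE_MODES}, got {mode!r}"
        )

    config = {name: env[name] for name in REQUIRED_VARS}
    config["MINIMAX_API_RESOURCE_MODE"] = mode
    return config


# Placeholder values for illustration only.
example = load_minimax_config({
    "MINIMAX_API_KEY": "placeholder-key",
    "MINIMAX_API_HOST": "https://api.minimax.io",
    "MINIMAX_MCP_BASE_PATH": "/tmp/minimax-output",
})
print(example["MINIMAX_API_RESOURCE_MODE"])  # url
```

In practice the mapping passed in would be `os.environ`, so a misconfigured environment fails with a readable message before any API call is made.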

Capabilities

Server capabilities have not been inspected yet.

Tools

Functions exposed to the LLM to take actions

Name	Description
text_to_audio	Convert text to audio with a given voice and save the output audio file to a given directory. The directory is optional; if not provided, the output file is saved to $HOME/Desktop. The voice id is optional; if not provided, the default voice is used. COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user. Args: text (str): The text to convert to speech. voice_id (str, optional): The id of the voice to use. For example, "male-qn-qingse"/"audiobook_female_1"/"cute_boy"/"Charming_Lady"... model (str, optional): The model to use. speed (float, optional): Speed of the generated speech. Values range from 0.5 to 2.0, with 1.0 being the default speed. vol (float, optional): Volume of the generated speech. Values range from 0 to 10, with 1 being the default volume. pitch (int, optional): Pitch of the generated speech. Values range from -12 to 12, with 0 being the default pitch. emotion (str, optional): Emotion of the generated speech. Values range ["happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"], with "happy" being the default emotion. sample_rate (int, optional): Sample rate of the generated speech. Values range [8000, 16000, 22050, 24000, 32000, 44100], with 32000 being the default sample rate. bitrate (int, optional): Bitrate of the generated speech. Values range [32000, 64000, 128000, 256000], with 128000 being the default bitrate. channel (int, optional): Number of channels of the generated speech. Values range [1, 2], with 1 being the default. format (str, optional): Format of the generated speech. Values range ["pcm", "mp3", "flac"], with "mp3" being the default format. language_boost (str, optional): Language boost of the generated speech. Values range ['Chinese', 'Chinese,Yue', 'English', 'Arabic', 'Russian', 'Spanish', 'French', 'Portuguese', 'German', 'Turkish', 'Dutch', 'Ukrainian', 'Vietnamese', 'Indonesian', 'Japanese', 'Italian', 'Korean', 'Thai', 'Polish', 'Romanian', 'Greek', 'Czech', 'Finnish', 'Hindi', 'auto'], with "auto" being the default language boost. output_directory (str): The directory to save the audio to. Returns: Text content with the path to the output file and name of the voice used.
list_voices	List all voices available. Args: voice_type (str, optional): The type of voices to list. Values range ["all", "system", "voice_cloning"], with "all" being the default. Returns: Text content with the list of voices.
voice_clone	Clone a voice using provided audio files. The new voice will be charged upon first use. COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user. Args: voice_id (str): The id of the voice to use. file (str): The path to the audio file to clone or a URL to the audio file. text (str, optional): The text to use for the demo audio. is_url (bool, optional): Whether the file is a URL. Defaults to False. output_directory (str): The directory to save the demo audio to. Returns: Text content with the voice id of the cloned voice.
play_audio	Play an audio file. Supports WAV and MP3 formats. Video is not supported. Args: input_file_path (str): The path to the audio file to play. is_url (bool, optional): Whether the audio file is a URL. Returns: Text content with the path to the audio file.
generate_video	Generate a video from a prompt. COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user. Args: model (str, optional): The model to use. Values range ["T2V-01", "T2V-01-Director", "I2V-01", "I2V-01-Director", "I2V-01-live", "MiniMax-Hailuo-02"]. "Director" models support inserting instructions for camera movement control. "I2V" models generate video from an image; "T2V" models generate video from text. "MiniMax-Hailuo-02" is the latest model, with the best effect, ultra-clear quality, and precise response. prompt (str): The prompt to generate the video from. When using a Director model, the prompt supports 15 camera movement instructions (enumerated values): Truck: [Truck left], [Truck right]; Pan: [Pan left], [Pan right]; Push: [Push in], [Pull out]; Pedestal: [Pedestal up], [Pedestal down]; Tilt: [Tilt up], [Tilt down]; Zoom: [Zoom in], [Zoom out]; Shake: [Shake]; Follow: [Tracking shot]; Static: [Static shot]. first_frame_image (str): The first frame image. The model must be in the "I2V" series. duration (int, optional): The duration of the video. The model must be "MiniMax-Hailuo-02". Allowed values are 6 and 10. resolution (str, optional): The resolution of the video. The model must be "MiniMax-Hailuo-02". Values range ["768P", "1080P"]. output_directory (str): The directory to save the video to. async_mode (bool, optional): Whether to use async mode. Defaults to False. If True, the video generation task is submitted asynchronously and the response returns a task_id. Use the `query_video_generation` tool to check the status of the task and get the result. Returns: Text content with the path to the output video file.
query_video_generation	Query the status of a video generation task. Args: task_id (str): The task ID to query. This should be the task_id returned by the `generate_video` tool when `async_mode` is True. output_directory (str): The directory to save the video to. Returns: Text content with the status of the task.
text_to_image	Generate an image from a prompt. COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user. Args: model (str, optional): The model to use. Values range ["image-01"], with "image-01" being the default. prompt (str): The prompt to generate the image from. aspect_ratio (str, optional): The aspect ratio of the image. Values range ["1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"], with "1:1" being the default. n (int, optional): The number of images to generate. Values range [1, 9], with 1 being the default. prompt_optimizer (bool, optional): Whether to optimize the prompt. Values range [True, False], with True being the default. output_directory (str): The directory to save the image to. Returns: Text content with the path to the output image file.
music_generation	Create a music generation task using AI models. Generate music from a prompt and lyrics. COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user. Args: prompt (str): Music creation inspiration describing style, mood, scene, etc. Example: "Pop music, sad, suitable for rainy nights". Character range: [10, 300]. lyrics (str): Song lyrics for music generation. Use newline (\n) to separate each line of lyrics. Supports lyric structure tags [Intro][Verse][Chorus][Bridge][Outro] to enhance musicality. Character range: [10, 600] (each Chinese character, punctuation mark, and letter counts as 1 character). stream (bool, optional): Whether to enable streaming mode. Defaults to False. sample_rate (int, optional): Sample rate of generated music. Values: [16000, 24000, 32000, 44100]. bitrate (int, optional): Bitrate of generated music. Values: [32000, 64000, 128000, 256000]. format (str, optional): Format of generated music. Values: ["mp3", "wav", "pcm"]. Defaults to "mp3". output_directory (str, optional): Directory to save the generated music file. Note: Currently supports generating music up to 1 minute in length. Returns: Text content with the path to the generated music file or generation status.
voice_design	Generate a voice based on description prompts. COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user. Args: prompt (str): The prompt to generate the voice from. preview_text (str): The text to preview the voice. voice_id (str, optional): The id of the voice to use. For example, "male-qn-qingse"/"audiobook_female_1"/"cute_boy"/"Charming_Lady"... output_directory (str, optional): The directory to save the voice to. Returns: Text content with the path to the output voice file.
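Because the speech tool incurs costs, it can be worth checking arguments against the documented ranges before a call is made. The sketch below does this for the text-to-audio arguments. The helper name `check_text_to_audio_args` is hypothetical and not part of the server. The numeric ranges are assumed to be inclusive, and the defaults and allowed values are copied from the tool description in the table above.

```python
# Defaults, ranges, and allowed values as listed in the tool description.
DEFAULTS = {
    "speed": 1.0, "vol": 1.0, "pitch": 0, "emotion": "happy",
    "sample_rate": 32000, "bitrate": 128000, "channel": 1, "format": "mp3",
}
RANGES = {"speed": (0.5, 2.0), "vol": (0, 10), "pitch": (-12, 12)}
CHOICES = {
    "emotion": ("happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"),
    "sample_rate": (8000, 16000, 22050, 24000, 32000, 44100),
    "bitrate": (32000, 64000, 128000, 256000),
    "channel": (1, 2),
    "format": ("pcm", "mp3", "flac"),
}


def check_text_to_audio_args(args: dict) -> list:
    """Hypothetical helper: return a list of problems, empty if all values fit."""
    # Fill in documented defaults for anything the caller left out.
    merged = {**DEFAULTS, **args}
    problems = []
    # Numeric arguments: assumed inclusive on both ends.
    for name, (low, high) in RANGES.items():
        if not low <= merged[name] <= high:
            problems.append(f"{name}={merged[name]!r} is outside [{low}, {high}]")
    # Enumerated arguments: the value must be one of the listed options.
    for name, allowed in CHOICES.items():
        if merged[name] not in allowed:
            problems.append(f"{name}={merged[name]!r} is not one of {list(allowed)}")
    return problems


print(check_text_to_audio_args({"text": "Hello", "speed": 1.25}))  # []
# Reports two problems: pitch is out of range and "wav" is not a listed format.
print(check_text_to_audio_args({"text": "Hello", "pitch": 20, "format": "wav"}))
```

Returning a list of problems rather than raising on the first one lets a client report every out-of-range argument at once.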

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/swesmith-repos/MiniMax-AI__MiniMax-MCP.aa97ac39'
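The same request can be built in Python with the standard library. This is a sketch only: the function names `build_request` and `fetch_server_info` are hypothetical, and it assumes the endpoint returns a JSON body, which the text above does not state. The block builds the request without sending it; `fetch_server_info` performs the network call when invoked.

```python
import json
import urllib.request
from typing import Any

# URL parts taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"
SERVER_ID = "swesmith-repos/MiniMax-AI__MiniMax-MCP.aa97ac39"


def build_request(server_id: str = SERVER_ID) -> urllib.request.Request:
    """Build the GET request for one server entry without sending it."""
    return urllib.request.Request(f"{API_BASE}/{server_id}", method="GET")


def fetch_server_info(server_id: str = SERVER_ID, timeout: float = 10.0) -> Any:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(server_id), timeout=timeout) as response:
        return json.load(response)


request = build_request()
print(request.get_method(), request.full_url)
```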
