MiniMax-MCP

MiniMax-MCP is hosted online, so all tools can be tested directly either in the Inspector tab or in the Online Client.

If you are the rightful owner of MiniMax-MCP and would like to certify it and/or have it hosted online, please leave a comment on the right or send an email to henry@mcphub.com.

Official MiniMax Model Context Protocol (MCP) server for interaction with Text to Speech and video/image generation APIs.

text_to_audio

Convert text to audio with a given voice and save the output audio file to a given directory. The directory is optional; if it is not provided, the output file is saved to $HOME/Desktop. The voice id is optional; if it is not provided, the default voice is used.

COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user.

Args:
    text (str): The text to convert to speech.
    voice_id (str, optional): The id of the voice to use. For example, "male-qn-qingse"/"audiobook_female_1"/"cute_boy"/"Charming_Lady"...
    model (str, optional): The model to use.
    speed (float, optional): Speed of the generated speech. Values range from 0.5 to 2.0, with 1.0 being the default.
    vol (float, optional): Volume of the generated speech. Values range from 0 to 10, with 1 being the default.
    pitch (int, optional): Pitch of the generated speech. Values range from -12 to 12, with 0 being the default.
    emotion (str, optional): Emotion of the generated speech. One of ["happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"], with "happy" being the default.
    sample_rate (int, optional): Sample rate of the generated speech. One of [8000, 16000, 22050, 24000, 32000, 44100], with 32000 being the default.
    bitrate (int, optional): Bitrate of the generated speech. One of [32000, 64000, 128000, 256000], with 128000 being the default.
    channel (int, optional): Channel count of the generated speech. One of [1, 2], with 1 being the default.
    format (str, optional): Format of the generated speech. One of ["pcm", "mp3", "flac"], with "mp3" being the default.
    language_boost (str, optional): Language boost of the generated speech. One of ['Chinese', 'Chinese,Yue', 'English', 'Arabic', 'Russian', 'Spanish', 'French', 'Portuguese', 'German', 'Turkish', 'Dutch', 'Ukrainian', 'Vietnamese', 'Indonesian', 'Japanese', 'Italian', 'Korean', 'Thai', 'Polish', 'Romanian', 'Greek', 'Czech', 'Finnish', 'Hindi', 'auto'], with "auto" being the default.
    output_directory (str): The directory to save the audio to.

Returns:
    Text content with the path to the output file and name of the voice used.
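
The documented ranges for text_to_audio can be checked on the client side before a billable call is made. The sketch below is a minimal illustration, not part of MiniMax-MCP: `build_text_to_audio_call` is a hypothetical helper, and it assumes the tool is invoked through a standard MCP `tools/call` JSON-RPC request. Only the parameter names and allowed values come from the description above.

```python
import json

# Allowed values copied from the text_to_audio description.
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"}
SAMPLE_RATES = {8000, 16000, 22050, 24000, 32000, 44100}
BITRATES = {32000, 64000, 128000, 256000}
FORMATS = {"pcm", "mp3", "flac"}


def build_text_to_audio_call(text, request_id=1, **options):
    """Hypothetical helper: validate options and build a tools/call request."""
    if not text:
        raise ValueError("text must be a non-empty string")
    # Numeric options: enforce the documented closed ranges.
    if "speed" in options and not 0.5 <= options["speed"] <= 2.0:
        raise ValueError("speed must be between 0.5 and 2.0")
    if "vol" in options and not 0 <= options["vol"] <= 10:
        raise ValueError("vol must be between 0 and 10")
    if "pitch" in options and not -12 <= options["pitch"] <= 12:
        raise ValueError("pitch must be between -12 and 12")
    # Enumerated options: reject anything outside the documented sets.
    for name, allowed in (("emotion", EMOTIONS), ("sample_rate", SAMPLE_RATES),
                          ("bitrate", BITRATES), ("format", FORMATS)):
        if name in options and options[name] not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "text_to_audio", "arguments": {"text": text, **options}},
    }


request = build_text_to_audio_call(
    "Hello from MiniMax.", speed=1.2, emotion="neutral", format="mp3"
)
print(json.dumps(request, indent=2))
```

Options that are left out are simply not sent, so the server-side defaults listed above apply.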

list_voices

List all voices available.

Args:
    voice_type (str, optional): The type of voices to list. One of ["all", "system", "voice_cloning"], with "all" being the default.

Returns:
    Text content with the list of voices.

voice_clone

Clone a voice using provided audio files. The new voice will be charged upon first use.

COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user.

Args:
    voice_id (str): The id of the voice to use.
    file (str): The path to the audio file to clone or a URL to the audio file.
    text (str, optional): The text to use for the demo audio.
    is_url (bool, optional): Whether the file is a URL. Defaults to False.
    output_directory (str): The directory to save the demo audio to.

Returns:
    Text content with the voice id of the cloned voice.
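
Because `file` accepts either a local path or a URL, the `is_url` flag has to agree with what is passed. One way to keep the two in sync is to derive the flag from the string itself. The helper below is a hypothetical sketch (its name and the http/https rule are assumptions, not part of MiniMax-MCP); it only builds the arguments dictionary and does not call the tool.

```python
from urllib.parse import urlparse


def build_voice_clone_args(voice_id, file, text=None, output_directory=None):
    """Hypothetical helper: build the arguments dict for voice_clone."""
    # Assumption: http(s) locations are URLs; anything else is a local path.
    is_url = urlparse(file).scheme in ("http", "https")
    args = {"voice_id": voice_id, "file": file, "is_url": is_url}
    # Optional fields are only included when the caller supplied them.
    if text is not None:
        args["text"] = text
    if output_directory is not None:
        args["output_directory"] = output_directory
    return args


remote = build_voice_clone_args("my_voice_01", "https://example.com/sample.mp3")
local = build_voice_clone_args("my_voice_01", "/tmp/sample.wav", text="Hello there.")
print(remote["is_url"], local["is_url"])
```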

play_audio

Play an audio file. Supports WAV and MP3 formats. Video is not supported.

Args:
    input_file_path (str): The path to the audio file to play.
    is_url (bool, optional): Whether the audio file is a URL.

Returns:
    Text content with the path to the audio file.

generate_video

Generate a video from a prompt.

COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user.

Args:
    model (str, optional): The model to use. One of ["T2V-01", "T2V-01-Director", "I2V-01", "I2V-01-Director", "I2V-01-live", "MiniMax-Hailuo-02"]. "Director" models support inserting instructions for camera movement control. "I2V" models generate video from an image; "T2V" models generate video from text. "MiniMax-Hailuo-02" is the latest model, with the best results, ultra-clear quality and precise response.
    prompt (str): The prompt to generate the video from. When a Director model is used, the prompt supports 15 camera movement instructions (enumerated values):
        - Truck: [Truck left], [Truck right]
        - Pan: [Pan left], [Pan right]
        - Push: [Push in], [Pull out]
        - Pedestal: [Pedestal up], [Pedestal down]
        - Tilt: [Tilt up], [Tilt down]
        - Zoom: [Zoom in], [Zoom out]
        - Shake: [Shake]
        - Follow: [Tracking shot]
        - Static: [Static shot]
    first_frame_image (str): The first frame image. The model must be in the "I2V" series.
    duration (int, optional): The duration of the video. The model must be "MiniMax-Hailuo-02". Values can be 6 or 10.
    resolution (str, optional): The resolution of the video. The model must be "MiniMax-Hailuo-02". One of ["768P", "1080P"].
    output_directory (str): The directory to save the video to.
    async_mode (bool, optional): Whether to use async mode. Defaults to False. If True, the video generation task is submitted asynchronously and the response returns a task_id. Use the `query_video_generation` tool to check the status of the task and get the result.

Returns:
    Text content with the path to the output video file.
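
Several generate_video arguments are only valid with particular models, which is easy to get wrong. The sketch below encodes those constraints exactly as the description states them. `build_generate_video_args` is a hypothetical helper, and placing the bracketed camera instruction at the start of the prompt is an assumption; the description only says that the prompt supports these instructions.

```python
# The 15 camera movement instructions listed in the generate_video description.
CAMERA_MOVES = {
    "Truck left", "Truck right", "Pan left", "Pan right", "Push in", "Pull out",
    "Pedestal up", "Pedestal down", "Tilt up", "Tilt down", "Zoom in", "Zoom out",
    "Shake", "Tracking shot", "Static shot",
}
HAILUO = "MiniMax-Hailuo-02"


def build_generate_video_args(prompt, model, camera_move=None, first_frame_image=None,
                              duration=None, resolution=None, async_mode=False):
    """Hypothetical helper: check the documented model constraints."""
    if camera_move is not None:
        if "Director" not in model:
            raise ValueError("camera instructions need a Director model")
        if camera_move not in CAMERA_MOVES:
            raise ValueError(f"unknown camera move: {camera_move}")
        # Assumption: the instruction is placed in front of the prompt text.
        prompt = f"[{camera_move}] {prompt}"
    if first_frame_image is not None and not model.startswith("I2V"):
        raise ValueError("first_frame_image needs an I2V model")
    if (duration is not None or resolution is not None) and model != HAILUO:
        raise ValueError(f"duration and resolution need {HAILUO}")
    if duration is not None and duration not in (6, 10):
        raise ValueError("duration must be 6 or 10")
    if resolution is not None and resolution not in ("768P", "1080P"):
        raise ValueError("resolution must be 768P or 1080P")
    args = {"model": model, "prompt": prompt, "async_mode": async_mode}
    # Only send optional fields that were actually provided.
    for key, value in (("first_frame_image", first_frame_image),
                       ("duration", duration), ("resolution", resolution)):
        if value is not None:
            args[key] = value
    return args


args = build_generate_video_args(
    "A fox runs through snow", "T2V-01-Director", camera_move="Pan left", async_mode=True
)
print(args["prompt"])
```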

query_video_generation

Query the status of a video generation task.

Args:
    task_id (str): The task ID to query. This should be the task_id returned by the `generate_video` tool when `async_mode` is True.
    output_directory (str): The directory to save the video to.

Returns:
    Text content with the status of the task.
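
With `async_mode` enabled, the caller is responsible for checking the task until it finishes. A minimal polling sketch follows. The page does not specify the status text that the tool returns, so both the query function and the completion check are passed in by the caller; `wait_for_video` and the stand-in status strings in the usage lines are hypothetical.

```python
import time


def wait_for_video(task_id, query, is_finished, interval_s=10.0, max_attempts=60,
                   sleep=time.sleep):
    """Poll query(task_id) until is_finished(result) is true or attempts run out."""
    for attempt in range(max_attempts):
        result = query(task_id)
        if is_finished(result):
            return result
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_s)
    raise TimeoutError(f"task {task_id} did not finish after {max_attempts} polls")


# Stand-in for real query_video_generation calls; these status strings are made up.
responses = iter(["processing", "processing", "done: /tmp/video.mp4"])
result = wait_for_video(
    "task-123",
    query=lambda _task_id: next(responses),
    is_finished=lambda text: text.startswith("done"),
    sleep=lambda _seconds: None,  # skip real waiting in this demonstration
)
print(result)
```

Capping the number of attempts keeps a stuck task from blocking the caller indefinitely.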

text_to_image

Generate an image from a prompt.

COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user.

Args:
    model (str, optional): The model to use. One of ["image-01"], with "image-01" being the default.
    prompt (str): The prompt to generate the image from.
    aspect_ratio (str, optional): The aspect ratio of the image. One of ["1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"], with "1:1" being the default.
    n (int, optional): The number of images to generate. Values range from 1 to 9, with 1 being the default.
    prompt_optimizer (bool, optional): Whether to optimize the prompt. One of [True, False], with True being the default.
    output_directory (str): The directory to save the image to.

Returns:
    Text content with the path to the output image file.
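
The aspect ratio and image count are restricted to the values listed above, so a small client-side check can catch mistakes before a billable request. This is a sketch only: `build_text_to_image_args` is a hypothetical helper, and its defaults simply mirror the documented ones.

```python
# Allowed aspect ratios copied from the text_to_image description.
ASPECT_RATIOS = {"1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"}


def build_text_to_image_args(prompt, aspect_ratio="1:1", n=1, prompt_optimizer=True,
                             output_directory=None):
    """Hypothetical helper: validate and build the arguments for text_to_image."""
    if not prompt:
        raise ValueError("prompt must be a non-empty string")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ASPECT_RATIOS)}")
    if not 1 <= n <= 9:
        raise ValueError("n must be between 1 and 9")
    args = {
        "model": "image-01",  # the only model listed in the description
        "prompt": prompt,
        "aspect_ratio": aspect_ratio,
        "n": n,
        "prompt_optimizer": prompt_optimizer,
    }
    if output_directory is not None:
        args["output_directory"] = output_directory
    return args


image_args = build_text_to_image_args("A lighthouse at dawn", aspect_ratio="16:9", n=2)
print(image_args)
```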

music_generation

Create a music generation task using AI models. Generate music from a prompt and lyrics.

COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user.

Args:
    prompt (str): Music creation inspiration describing style, mood, scene, etc. Example: "Pop music, sad, suitable for rainy nights". Character range: [10, 300].
    lyrics (str): Song lyrics for music generation. Use a newline (\n) to separate each line of lyrics. Supports the lyric structure tags [Intro][Verse][Chorus][Bridge][Outro] to enhance musicality. Character range: [10, 600] (each Chinese character, punctuation mark, and letter counts as 1 character).
    stream (bool, optional): Whether to enable streaming mode. Defaults to False.
    sample_rate (int, optional): Sample rate of the generated music. One of [16000, 24000, 32000, 44100].
    bitrate (int, optional): Bitrate of the generated music. One of [32000, 64000, 128000, 256000].
    format (str, optional): Format of the generated music. One of ["mp3", "wav", "pcm"]. Defaults to "mp3".
    output_directory (str, optional): Directory to save the generated music file.

Note: Currently supports generating music up to 1 minute in length.

Returns:
    Text content with the path to the generated music file or generation status.
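
The lyrics argument is a single string with newline-separated lines and optional structure tags, and both the prompt and the lyrics have character limits. The sketch below assembles such a string and checks the limits. Both helpers are hypothetical. Two assumptions are made: each tag is placed on its own line, and length is measured with Python's `len`, which also counts newlines and tag characters; the description does not say how those are counted.

```python
# Structure tags listed in the music_generation description.
TAGS = {"Intro", "Verse", "Chorus", "Bridge", "Outro"}


def format_lyrics(sections):
    """Join (tag, lines) pairs into one newline-separated lyrics string."""
    rows = []
    for tag, lines in sections:
        if tag not in TAGS:
            raise ValueError(f"unknown structure tag: {tag}")
        rows.append(f"[{tag}]")  # assumption: the tag sits on its own line
        rows.extend(lines)
    return "\n".join(rows)


def build_music_generation_args(prompt, lyrics, **options):
    """Hypothetical helper: enforce the documented character ranges."""
    if not 10 <= len(prompt) <= 300:
        raise ValueError("prompt must be 10 to 300 characters")
    if not 10 <= len(lyrics) <= 600:
        raise ValueError("lyrics must be 10 to 600 characters")
    return {"prompt": prompt, "lyrics": lyrics, **options}


lyrics = format_lyrics([
    ("Verse", ["Rain on the window", "Lights in the street"]),
    ("Chorus", ["Stay with me tonight"]),
])
music_args = build_music_generation_args(
    "Pop music, sad, suitable for rainy nights", lyrics, format="mp3"
)
print(music_args["lyrics"])
```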

voice_design

Generate a voice based on description prompts.

COST WARNING: This tool makes an API call to MiniMax which may incur costs. Only use when explicitly requested by the user.

Args:
    prompt (str): The prompt to generate the voice from.
    preview_text (str): The text to preview the voice.
    voice_id (str, optional): The id of the voice to use. For example, "male-qn-qingse"/"audiobook_female_1"/"cute_boy"/"Charming_Lady"...
    output_directory (str, optional): The directory to save the voice to.

Returns:
    Text content with the path to the output voice file.
