
Baidu Digital Human MCP Server

Official
by baidu-xiling

generateDhVideo

Create customized digital human videos using text or audio input, selecting avatar, voice, resolution, background, and camera angles. Add subtitles and automatic animations for personalized output.

Instructions

# Tool description: generate a digital human video from the selected avatar ID and voice ID.

Example 1:

User input: Using the avatar with ID xxx and the voice with ID yyy, generate a 1080P digital human video. The spoken content is "Hello everyone, this is the digital human's broadcast"; use the landscape full-body camera position, use "https://digital-human-material.bj.bcebos.com/-%5BLjava.lang.String%3B%4046f6cc1e.png" as the video background, enable automatic motions, and enable subtitles.

Reasoning:
1. The user wants to generate a digital human video from an avatar ID, with requirements on voice, background, subtitles, and resolution. This is not a trivial video request, so the "generateDhVideo" tool is needed.
2. The tool takes the parameters figureId, driveType, text, voiceId, inputAudioUrl, resolutionWidth, resolutionHeight, cameraId, subtitleEnable, backgroundImageUrl, and autoAnimoji.
3. figureId is the avatar ID to use, so its value is xxx. The supplied broadcast content is text, so driveType is TEXT and text is "Hello everyone, this is the digital human's broadcast". The voice ID is given, so voiceId is yyy. Automatic motions are requested, so autoAnimoji is true; subtitles are requested, so subtitleEnable is true. 1080P in a landscape camera position splits into resolutionWidth = 1920 and resolutionHeight = 1080, landscape full-body maps to cameraId = 2, and backgroundImageUrl is "https://digital-human-material.bj.bcebos.com/-%5BLjava.lang.String%3B%4046f6cc1e.png".
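Rendered as a concrete argument set, the reasoning above maps onto the input schema as follows. A minimal sketch: the `build_example_args` helper and the placeholder IDs "xxx"/"yyy" are hypothetical, but the parameter names match the tool's schema.

```python
def build_example_args(figure_id: str, voice_id: str) -> dict:
    """Assemble generateDhVideo arguments for the 1080P landscape example."""
    return {
        "figureId": figure_id,      # avatar ID from the user request
        "driveType": "TEXT",        # the broadcast content is plain text
        "text": "Hello everyone, this is the digital human's broadcast",
        "voiceId": voice_id,        # voice (timbre) ID
        "cameraId": 2,              # 2 = landscape full-body
        "resolutionWidth": 1920,    # "1080P" in landscape -> 1920 x 1080
        "resolutionHeight": 1080,
        "subtitleEnable": True,     # subtitles requested
        "autoAnimoji": True,        # automatic motions requested
        "backgroundImageUrl": "https://digital-human-material.bj.bcebos.com/-%5BLjava.lang.String%3B%4046f6cc1e.png",
    }

args = build_example_args("xxx", "yyy")
```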

Input Schema

All parameters are optional.

autoAnimoji: automatically add digital human motions (default: false)
backgroundImageUrl: background image URL
backgroundTransparent: whether the background is transparent (default: false)
callbackUrl: callback URL
cameraId: camera position. 0 = landscape half-body, 1 = portrait half-body, 2 = landscape full-body, 3 = portrait full-body (default: 3)
driveType: drive type. TEXT = text-driven, VOICE = audio-driven (default: TEXT)
figureId: avatar ID
inputAudioUrl: driving audio URL
resolutionHeight: resolution height (default: 1280)
resolutionWidth: resolution width (default: 768)
subtitleEnable: enable subtitles (default: false)
text: broadcast content
voiceId: voice (timbre) ID
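Although every field in the schema is optional, the drive type implies which input must actually be present: text-driven requests need `text`, audio-driven requests need `inputAudioUrl`. A hypothetical client-side check (`validate_drive_inputs` is illustrative and not part of the server):

```python
def validate_drive_inputs(args: dict) -> list[str]:
    """Return a list of problems with a generateDhVideo argument dict."""
    errors = []
    drive = args.get("driveType", "TEXT")  # TEXT is the documented default
    if drive == "TEXT" and not args.get("text"):
        errors.append("driveType=TEXT requires `text`")
    if drive == "VOICE" and not args.get("inputAudioUrl"):
        errors.append("driveType=VOICE requires `inputAudioUrl`")
    if drive not in ("TEXT", "VOICE"):
        errors.append("driveType must be TEXT or VOICE")
    return errors
```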

Implementation Reference

  • Registration of the generateDhVideo tool using the @mcp.tool decorator, including the tool name and a detailed usage description with examples.
    @mcp.tool(
        name="generateDhVideo",
        description=(
            """
            # Tool description: generate a digital human video from the selected avatar ID and voice ID.
            # Example 1:
            User input: Using the avatar with ID xxx and the voice with ID yyy, generate a 1080P
            digital human video. The spoken content is "Hello everyone, this is the digital human's
            broadcast"; use the landscape full-body camera position, use
            "https://digital-human-material.bj.bcebos.com/-%5BLjava.lang.String%3B%4046f6cc1e.png"
            as the video background, enable automatic motions, and enable subtitles.
            Reasoning:
            1. The user wants to generate a digital human video from an avatar ID, with requirements
            on voice, background, subtitles, and resolution, so the "generateDhVideo" tool is needed.
            2. The tool takes figureId, driveType, text, voiceId, inputAudioUrl, resolutionWidth,
            resolutionHeight, cameraId, subtitleEnable, backgroundImageUrl, and autoAnimoji.
            3. figureId is the avatar ID to use, so its value is xxx. The broadcast content is text,
            so driveType is TEXT and text is "Hello everyone, this is the digital human's broadcast".
            voiceId is yyy, autoAnimoji is true, subtitleEnable is true, resolutionWidth is 1920,
            resolutionHeight is 1080, cameraId is 2, and backgroundImageUrl is
            "https://digital-human-material.bj.bcebos.com/-%5BLjava.lang.String%3B%4046f6cc1e.png".
            """
        )
    )
  • The core handler function that implements the generateDhVideo tool logic. It validates inputs via Annotated Fields, constructs a VideoGenerateRequest from the imported types, calls the DHApiClient's generate_avatar_video method, and returns an MCPVideoGenerateResponse.
    async def generateDhVideo(
        figureId: Annotated[str, Field(description="Avatar ID", default=None)],
        voiceId: Annotated[str, Field(description="Voice ID", default=None)],
        text: Annotated[str, Field(description="Broadcast content", default=None)],
        inputAudioUrl: Annotated[str, Field(description="Driving audio URL", default=None)],
        resolutionWidth: Annotated[int, Field(description="Resolution: width", default=768)],
        resolutionHeight: Annotated[int, Field(description="Resolution: height", default=1280)],
        backgroundTransparent: Annotated[bool, Field(description="Whether the background is transparent", default=False)],
        cameraId: Annotated[int, Field(description="Camera position: 0 = landscape half-body, 1 = portrait half-body, 2 = landscape full-body, 3 = portrait full-body", default=3)],
        backgroundImageUrl: Annotated[str, Field(description="Background image", default=None)],
        callbackUrl: Annotated[str, Field(description="Callback URL", default=None)],
        driveType: Annotated[Literal["TEXT", "VOICE"], Field(description="Drive type: TEXT = text-driven, VOICE = audio-driven", default="TEXT")],
        subtitleEnable: Annotated[bool, Field(description="Enable subtitles", default=False)],
        autoAnimoji: Annotated[bool, Field(description="Automatically add digital human motions", default=False)],
    ) -> MCPVideoGenerateResponse:
        """Generate a new digital human video using the DH API.

        Args:
            figureId: avatar ID
            driveType: drive type, TEXT = text-driven, VOICE = audio-driven
            text: broadcast text content
            voiceId: voice (timbre) ID
            inputAudioUrl: driving audio URL
            resolutionWidth: resolution width
            resolutionHeight: resolution height
            backgroundTransparent: transparent background
            cameraId: 0 = landscape half-body, 1 = portrait half-body, 2 = landscape full-body, 3 = portrait full-body
            subtitleEnable: subtitles
            backgroundImageUrl: background image
            autoAnimoji: automatically add digital human motions
            callbackUrl: callback URL
        Returns:
            taskId: task ID
        """
        try:
            request = VideoGenerateRequest(
                figureId=figureId,
                driveType=driveType,
                text=text,
                ttsParams=TtsParams(person=str(voiceId), speed="5", volume="5", pitch="5"),
                inputAudioUrl=inputAudioUrl,
                videoParams=VideoParams(width=resolutionWidth, height=resolutionHeight, transparent=backgroundTransparent),
                dhParams=DHParams(cameraId=cameraId),
                subtitleParams=SubtitleParams(subtitlePolicy="SRT", enabled=True) if subtitleEnable else None,
                backgroundImageUrl=backgroundImageUrl,
                callbackUrl=callbackUrl,
                autoAnimoji=autoAnimoji,
            )
            client = await getDhClient()
            ret = await client.generate_avatar_video(request)
            return ret
        except Exception as e:
            return MCPVideoGenerateResponse(error=str(e))
  • Input schema defined by Pydantic Annotated parameters in the handler function signature, including descriptions and defaults for tool parameters.
    figureId: Annotated[str, Field(description="Avatar ID", default=None)],
    voiceId: Annotated[str, Field(description="Voice ID", default=None)],
    text: Annotated[str, Field(description="Broadcast content", default=None)],
    inputAudioUrl: Annotated[str, Field(description="Driving audio URL", default=None)],
    resolutionWidth: Annotated[int, Field(description="Resolution: width", default=768)],
    resolutionHeight: Annotated[int, Field(description="Resolution: height", default=1280)],
    backgroundTransparent: Annotated[bool, Field(description="Whether the background is transparent", default=False)],
    cameraId: Annotated[int, Field(description="Camera position: 0 = landscape half-body, 1 = portrait half-body, 2 = landscape full-body, 3 = portrait full-body", default=3)],
    backgroundImageUrl: Annotated[str, Field(description="Background image", default=None)],
    callbackUrl: Annotated[str, Field(description="Callback URL", default=None)],
    driveType: Annotated[Literal["TEXT", "VOICE"], Field(description="Drive type: TEXT = text-driven, VOICE = audio-driven", default="TEXT")],
    subtitleEnable: Annotated[bool, Field(description="Enable subtitles", default=False)],
    autoAnimoji: Annotated[bool, Field(description="Automatically add digital human motions", default=False)]
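The handler's two recurring patterns, building optional sub-objects only when a flag is set, and wrapping API failures into the response instead of raising, can be sketched with stand-in types. Everything below is illustrative: the real SubtitleParams and MCPVideoGenerateResponse classes come from the server's own modules.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class SubtitleParams:
    # Stand-in for the server's SubtitleParams type.
    subtitlePolicy: str
    enabled: bool

@dataclass
class MCPVideoGenerateResponse:
    # Stand-in: either taskId or error is populated, never both.
    taskId: Optional[str] = None
    error: Optional[str] = None

def make_subtitle_params(subtitle_enable: bool) -> Optional[SubtitleParams]:
    # Mirrors the handler: SubtitleParams is only attached when subtitles
    # are requested; otherwise the field is left as None.
    if subtitle_enable:
        return SubtitleParams(subtitlePolicy="SRT", enabled=True)
    return None

def wrap_errors(fn: Callable[[], str]) -> MCPVideoGenerateResponse:
    # Mirrors the handler's try/except: an API failure becomes an error
    # payload in the response instead of an unhandled exception.
    try:
        return MCPVideoGenerateResponse(taskId=fn())
    except Exception as e:
        return MCPVideoGenerateResponse(error=str(e))
```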

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/baidu-xiling/mcp'
