Glama

add_audio_segment

Insert audio segments into video tracks with precise timing control, allowing users to specify exact start/end points, adjust playback speed, and manage volume for professional video editing workflows.

Instructions

Add an audio segment to the specified track. Pay attention to the usage rules for target_start_end and source_start_end.

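Args:
  track_id: Track ID, obtained from create_track.
  material: Path to the audio file; a local file path or a URL.
  target_start_end: Target time range of the segment on the track, in the form "0s-4.2s" (from 0s to 4.2s on the track). This parameter describes time on the track, and ranges on the same track must not overlap: "0s-4.2s" followed by "4s-5s" is invalid because the last 0.2s of the first segment overlaps the second; use "0s-4.2s" and "4.2s-5s" instead.
  source_start_end: (optional) Time range to cut from the source audio file, in the form "1s-4.2s" (from 1s to 4.2s of the source audio). This parameter describes which part of the material itself is used and defaults to the full duration. Leave it unset unless the user asks for a specific portion, for example 1s-5s of a 5s clip.
  speed: (optional) Playback speed, default 1.0. When specified together with source_start_end, it overrides the duration given in target_start_end.
  volume: Volume, default 1.0.
  change_pitch: Whether the pitch changes along with the speed, default False.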
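
The range format and the no-overlap rule described above can be illustrated with a short sketch. It assumes a range string has the form `<start>s-<end>s` with plain decimal seconds, and that two ranges on one track conflict when they share more than a boundary point. `parse_range` and `overlaps` are hypothetical helpers written for this example; they are not the server's `parse_start_end_format`, whose implementation is not shown on this page.

```python
import re
from typing import Tuple

# Hypothetical pattern: "<start>s-<end>s" with decimal seconds, e.g. "0s-4.2s".
_RANGE_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)s\s*-\s*(\d+(?:\.\d+)?)s\s*$")


def parse_range(text: str) -> Tuple[float, float]:
    """Parse "0s-4.2s" into (start, end) seconds; raise ValueError if malformed."""
    match = _RANGE_RE.match(text)
    if match is None:
        raise ValueError(f"expected '<start>s-<end>s', got {text!r}")
    start, end = float(match.group(1)), float(match.group(2))
    if end <= start:
        raise ValueError(f"end must be greater than start in {text!r}")
    return start, end


def overlaps(a: str, b: str) -> bool:
    """Return True when two ranges share more than a boundary point."""
    a_start, a_end = parse_range(a)
    b_start, b_end = parse_range(b)
    return a_start < b_end and b_start < a_end


print(overlaps("0s-4.2s", "4s-5s"))    # True: the last 0.2s collides
print(overlaps("0s-4.2s", "4.2s-5s"))  # False: the ranges only touch at 4.2s
```

Treating the ranges as half-open intervals is a design choice of this sketch: it lets one segment begin exactly where the previous one ends, which matches the "0s-4.2s" and "4.2s-5s" example.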

Input Schema

Name              Required  Description  Default
track_id          Yes
material          Yes
target_start_end  Yes
source_start_end  No
speed             No
volume            No
change_pitch      No
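
As an illustration, a set of arguments matching this schema might look like the following. All values are made up (the track ID and file path are placeholders), and `check_arguments` is a hypothetical helper that only mirrors the two numeric checks visible in the handler under Implementation Reference: speed must be positive and volume must not be negative.

```python
from typing import Any, Dict, List

# Placeholder values for illustration only; a real track_id comes from create_track.
example_arguments: Dict[str, Any] = {
    "track_id": "example-track-id",
    "material": "/path/to/narration.mp3",
    "target_start_end": "0s-4.2s",
    "source_start_end": "1s-5.2s",  # optional
    "speed": 1.0,                   # optional
    "volume": 0.8,                  # optional, defaults to 1.0
    "change_pitch": False,          # optional, defaults to False
}

REQUIRED = ("track_id", "material", "target_start_end")


def check_arguments(args: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the arguments look usable."""
    problems = [f"missing required field: {name}" for name in REQUIRED if name not in args]
    speed = args.get("speed")
    if speed is not None and speed <= 0:
        problems.append(f"speed must be greater than 0, got {speed}")
    if args.get("volume", 1.0) < 0:
        problems.append(f"volume must not be negative, got {args['volume']}")
    return problems


print(check_arguments(example_arguments))  # []
print(check_arguments({"track_id": "t", "speed": 0}))
```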

Output Schema

Name     Required  Description  Default
data     No
message  Yes
success  Yes
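
The three output fields suggest a response shape along these lines. This is a sketch only: the real `ToolResponse` class is not shown on this page, so the dataclass below is an assumption that follows the schema (`success` and `message` required, `data` optional).

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class ToolResponseSketch:
    """Hypothetical stand-in for ToolResponse, modeled on the output schema."""
    success: bool
    message: str
    data: Optional[Dict[str, Any]] = None


ok = ToolResponseSketch(
    success=True,
    message="Audio segment added",
    data={"audio_segment_id": "example-id"},  # placeholder ID
)
failed = ToolResponseSketch(success=False, message="Playback speed must be greater than 0")

print(asdict(ok))
print(asdict(failed))
```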

Implementation Reference

  • MCP tool handler for 'add_audio_segment': validates inputs, resolves draft/track via index, calls service layer, updates index mapping.
    def add_audio_segment(
            track_id: str,
            material: str,
            target_start_end: str,
            source_start_end: Optional[str] = None,
            speed: Optional[float] = None,
            volume: float = 1.0,
            change_pitch: bool = False
    ) -> ToolResponse:
        """
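        Add an audio segment to the specified track. Pay attention to the usage rules for target_start_end and source_start_end.
    
        Args:
            track_id: Track ID, obtained from create_track
            material: Path to the audio file; a local file path or a URL
            target_start_end: Target time range of the segment on the track, e.g. "0s-4.2s" (from 0s to 4.2s on the track). This parameter describes time on the track, and ranges on the same track must not overlap: "0s-4.2s" followed by "4s-5s" is invalid because the last 0.2s of the first segment overlaps the second; use "0s-4.2s" and "4.2s-5s" instead
            source_start_end: (optional) Time range to cut from the source audio file, e.g. "1s-4.2s" (from 1s to 4.2s of the source audio). This parameter describes which part of the material itself is used and defaults to the full duration; leave it unset unless the user asks for a specific portion, for example 1s-5s of a 5s clip
            speed: (optional) Playback speed, default 1.0. When specified together with source_start_end, it overrides the duration given in target_start_end
            volume: Volume, default 1.0
            change_pitch: Whether the pitch changes along with the speed, default False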
        """
        try:
            target_timerange = parse_start_end_format(target_start_end)
        except ValueError as e:
            return ToolResponse(
                success=False,
                message=f"Invalid target_start_end format: {str(e)}"
            )
    
        source_timerange = None
        if source_start_end is not None:
            try:
                source_timerange = parse_start_end_format(source_start_end)
            except ValueError as e:
                return ToolResponse(
                    success=False,
                    message=f"Invalid source_start_end format: {str(e)}"
                )
        # Validate parameters
        if speed is not None and speed <= 0:
            return ToolResponse(
                success=False,
                message=f"Playback speed must be greater than 0, got: {speed}"
            )
    
        if volume < 0:
            return ToolResponse(
                success=False,
                message=f"Volume must not be negative, got: {volume}"
            )
    
        # Look up draft_id and track_name by track_id
        draft_id = index_manager.get_draft_id_by_track_id(track_id)
        track_name = index_manager.get_track_name_by_track_id(track_id)
    
        if not draft_id:
            return ToolResponse(
                success=False,
                message=f"No draft found for track ID: {track_id}"
            )
    
        if not track_name:
            return ToolResponse(
                success=False,
                message=f"No track name found for track ID: {track_id}"
            )
    
        # Call the service layer to handle the business logic
        result = add_audio_segment_service(
            draft_id=draft_id,
            material=material,
            target_timerange=target_timerange,
            source_timerange=source_timerange,
            speed=speed,
            volume=volume,
            change_pitch=change_pitch,
            track_name=track_name
        )
    
        # If the audio segment was added successfully, record it in the index
        if result.success and result.data and "audio_segment_id" in result.data:
            audio_segment_id = result.data["audio_segment_id"]
            index_manager.add_audio_segment_mapping(audio_segment_id, track_id)
    
        return result
  • Top-level registration of audio_tools (containing add_audio_segment) via audio_tools(mcp) in the main server entrypoint.
    from jianyingdraft.tool.audio_tool import audio_tools
    from jianyingdraft.tool.utility_tool import utility_tools
    
    
    def main():
        # Register all tools
        draft_tools(mcp)
        track_tools(mcp)
        video_tools(mcp)
        text_tools(mcp)
        audio_tools(mcp)
  • Service function add_audio_segment_service: wraps AudioSegment.add_audio_segment, handles exceptions, returns ToolResponse.
    def add_audio_segment_service(
        draft_id: str,
        material: str,
        target_timerange: str,
        source_timerange: Optional[str] = None,
        speed: Optional[float] = None,
        volume: float = 1.0,
        change_pitch: bool = False,
        track_name: Optional[str] = None
    ) -> ToolResponse:
        """
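        Audio segment service: create an audio segment
        
        Args:
            draft_id: Draft ID
            material: Path to the audio file; a local path or a URL
            target_timerange: Target time range of the segment on the track, e.g. "0s-4.2s"
            source_timerange: Time range to cut from the source audio file, e.g. "1s-5.2s" (optional)
            speed: Playback speed, default 1.0 (optional)
            volume: Volume, default 1.0
            change_pitch: Whether the pitch changes along with the speed, default False
            track_name: Name of the target track (optional)
        
        Returns:
            ToolResponse: Response object containing the result of the operation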
        """
        try:
            # Create the AudioSegment instance
            audio_segment = AudioSegment(draft_id, track_name=track_name)
            
            # Call the method that adds the audio segment
            result_data = audio_segment.add_audio_segment(
                material=material,
                target_timerange=target_timerange,
                source_timerange=source_timerange,
                speed=speed,
                volume=volume,
                change_pitch=change_pitch,
                track_name=track_name
            )
            
            # Build the response data
            response_data = {
                "audio_segment_id": audio_segment.audio_segment_id,
                "draft_id": draft_id,
                "material": material,
                "target_timerange": target_timerange,
                "volume": volume,
                "change_pitch": change_pitch,
                "add_audio_segment": result_data
            }
            
            # Add optional parameters to the response data
            if source_timerange:
                response_data["source_timerange"] = source_timerange
            if speed is not None:
                response_data["speed"] = speed
            if track_name:
                response_data["track_name"] = track_name
            
            return ToolResponse(
                success=True,
                message=f"Audio segment added: {material}",
                data=response_data
            )
            
        except ValueError as e:
            # Invalid parameters
            return ToolResponse(
                success=False,
                message=f"Parameter error: {str(e)}"
            )
            
        except NameError as e:
            # The track does not exist
            return ToolResponse(
                success=False,
                message=f"Track error: {str(e)}"
            )
            
        except Exception as e:
            # Any other unexpected error
            return ToolResponse(
                success=False,
                message=f"Failed to add audio segment: {str(e)}"
            )
  • Core implementation in AudioSegment class: parses params, downloads/validates material, checks overlaps, builds operation data, persists to JSON.
    def add_audio_segment(self, material: str, target_timerange: str,
                          source_timerange: Optional[str] = None, speed: Optional[float] = None,
                          volume: float = 1.0, change_pitch: bool = False,
                          track_name: Optional[str] = None) -> Dict[str, Any]:
        """
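        Create the audio segment configuration
    
        Args:
            material: Path to the audio file; a local path or a URL
            target_timerange: Target time range of the segment on the track, e.g. "0s-4.2s", meaning it starts at 0s on the track
            source_timerange: Time range to cut from the source audio file, e.g. "1s-4.2s", meaning start at 1s of the source audio and continue for 4.2s; by default a part as long as `target_timerange` (scaled by `speed`) is cut from the beginning
            speed: (`float`, optional): Playback speed, default 1.0; when specified together with `source_timerange`, it overrides the duration in `target_timerange`
            volume: Volume, default 1.0
            change_pitch: Whether the pitch changes along with the speed, default False; usually left unchanged
            track_name: Name of the target track (optional); falls back to the instance's track_name if omitted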
        """
        # Generate the audio segment ID
        audio_segment_id = str(uuid.uuid4())
    
        # Parse target_timerange
        if not target_timerange or "-" not in target_timerange:
            raise ValueError(f"Invalid target_timerange format: {target_timerange}")
    
        start_str, duration_str = target_timerange.split("-", 1)
        target_timerange_data = {
            "start": start_str.strip(),
            "duration": duration_str.strip()
        }
    
        # Parse source_timerange, if given
        source_timerange_data = None
        if source_timerange is not None:
            # Split it into start and duration
            if "-" not in source_timerange:
                raise ValueError(f"Invalid source_timerange format: {source_timerange}")
            src_start_str, src_duration_str = source_timerange.split("-", 1)
            source_timerange_data = {
                "start": src_start_str.strip(),
                "duration": src_duration_str.strip()
            }
    
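        # Pick the track name to use (the argument takes precedence over the instance attribute)
        final_track_name = track_name or self.track_name
    
        # Download and validate the material, and get its localized path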
        local_material_path = download_and_validate_material(
            self.draft_id,
            material,
            "audio",
            target_timerange_data
        )
    
        # Build the add_audio_segment parameters (using the localized path)
        add_audio_segment_params = {
            "material": local_material_path,  # localized relative path
            "target_timerange": target_timerange_data
        }
    
        # Only include optional parameters the caller passed explicitly
        if source_timerange_data is not None:
            add_audio_segment_params["source_timerange"] = source_timerange_data
    
        if speed is not None:
            add_audio_segment_params["speed"] = speed
        if volume != 1.0:  # saved only when not the default
            add_audio_segment_params["volume"] = volume
        if change_pitch:  # saved only when True
            add_audio_segment_params["change_pitch"] = change_pitch
    
        # Validate the track
        if final_track_name:
            self._validate_track_for_audio(final_track_name)
    
        # Check for overlapping segments
        if final_track_name:
            validate_overlap(self.draft_id, "audio", final_track_name, target_timerange_data)
    
        # Build the full segment data
        segment_data = {
            "audio_segment_id": audio_segment_id,
            "operation": "add_audio_segment",
            "add_audio_segment": add_audio_segment_params,
        }
        return_data = {
            "draft_id": self.draft_id,
            "track_name": final_track_name,
            "audio_segment_id": audio_segment_id,
            "operation": "add_audio_segment",
            "add_audio_segment": add_audio_segment_params,
        }
        # Add the track_name field only when a track name was given
        if final_track_name:
            segment_data["track_name"] = final_track_name
    
        # Persist the parameters
        self.add_json_to_file(segment_data)
        self.audio_segment_id = audio_segment_id
        return return_data
  • Function signature with type annotations and comprehensive docstring defining the input schema for the tool: identical to the signature and docstring of the add_audio_segment handler in the first item of this list.
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and handles it well, explaining important behavioral constraints: the non-overlap rule for target_start_end segments ('同一轨道中不可有重复时间段', ranges on the same track must not overlap), how speed interacts with timing ('此项与source_timerange同时指定时,将覆盖target_timerange中的时长', when specified together with source_timerange it overrides the duration in target_timerange), and default behaviors. It does not mention error conditions, performance characteristics, or mutation consequences beyond the overlap constraint.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured with a purpose statement followed by detailed parameter explanations. Every sentence adds value, though the Chinese formatting with Args: header is slightly less conventional than English MCP patterns. The information is well-organized and front-loaded with the most critical constraint about target_start_end usage.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given a 7-parameter mutation tool with no annotations but with an output schema present, the description provides strong coverage of input semantics and behavioral constraints. The existence of an output schema means return values don't need explanation. The description adequately covers the tool's complexity, though it could benefit from more explicit descriptions of error cases and prerequisites.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description fully compensates by providing detailed semantic explanations for all 7 parameters: track_id origin, material formats, target_start_end format and constraints, source_start_end optionality and use cases, speed interaction rules, volume default, and change_pitch default. Each parameter gets clear operational meaning beyond the schema's type information.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('添加音频片段' - add audio segment) and target resource ('到指定轨道' - to specified track), providing a specific verb+resource combination. However, it doesn't explicitly differentiate from sibling tools like 'add_video_segment' or 'add_text_segment' beyond the 'audio' qualifier in the name.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides implied usage guidance through parameter explanations (e.g., '一般情况下不设置' for source_start_end, meaning 'generally not set'), but lacks explicit when-to-use vs. alternatives. It mentions track_id comes from 'create_track' but doesn't clarify when to use this vs. other audio manipulation tools like 'add_audio_effect' or 'add_audio_fade'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/hey-jian-wei/jianying-mcp'
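
For callers who prefer Python to curl, an equivalent GET request can be sketched with the standard library. The URL is the one shown above; the assumption that the response body is a JSON object is not confirmed by this page, and `build_request` and `fetch_server_info` are hypothetical helper names.

```python
import json
import urllib.request

API_URL = "https://glama.ai/api/mcp/v1/servers/hey-jian-wei/jianying-mcp"


def build_request(url: str = API_URL) -> urllib.request.Request:
    """Build a GET request for the directory API without sending it."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url: str = API_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming the API returns a JSON object."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone can be used to inspect the request first.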

If you have feedback or need assistance with the MCP directory API, please join our Discord server.