Short Video MCP Server

by yangbuyiya

share_text_parse_tool_wrapper

Extract video content from Douyin (the Chinese edition of TikTok) share text and transcribe its speech to text. An API key is required; the API base URL and speech recognition model are optional.

Instructions

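      Extracts video content. An API key must be provided; otherwise the video content extraction feature cannot be used!
      Parameters:
      - text: Douyin share text containing the share link
      - api_base_url: API base URL; defaults to siliconflow.cn
      - model: speech recognition model; defaults to FunAudioLLM/SenseVoiceSmall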
      

Input Schema

Name          Required  Description  Default
text          Yes       -            -
api_base_url  No        -            -
model         No        -            -
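
As a rough illustration of how the schema above maps onto a call, the sketch below builds a JSON-RPC 2.0 `tools/call` request, the message shape MCP clients use to invoke a tool. The helper name `build_tool_call` and the placeholder share text are assumptions made for this example. The schema lists no API key parameter, so the sketch leaves the key out of the arguments.

```python
import json
from typing import Any, Dict, Optional


def build_tool_call(
    text: str,
    api_base_url: Optional[str] = None,
    model: Optional[str] = None,
    request_id: int = 1,
) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for this tool (hypothetical helper)."""
    arguments: Dict[str, Any] = {"text": text}
    # Optional parameters are omitted so the server can apply its own defaults.
    if api_base_url is not None:
        arguments["api_base_url"] = api_base_url
    if model is not None:
        arguments["model"] = model
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "share_text_parse_tool_wrapper",
            "arguments": arguments,
        },
    }


if __name__ == "__main__":
    # Placeholder share text; a real one would contain a Douyin share link.
    request = build_tool_call("Check this out: https://example.com/share/abc")
    print(json.dumps(request, indent=2, ensure_ascii=False))
```

Leaving the optional keys out, rather than sending `null`, keeps the request minimal and lets the documented defaults apply.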

Output Schema

No arguments

Implementation Reference

  • Registration of the MCP tool 'share_text_parse_tool_wrapper' using the @mcp.tool decorator. The description string documents the API key requirement, the parameters, and their defaults.
    @mcp.tool(
        description="""
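              Extracts video content. An API key must be provided; otherwise the video content extraction feature cannot be used!
              Parameters:
              - text: Douyin share text containing the share link
              - api_base_url: API base URL; defaults to siliconflow.cn
              - model: speech recognition model; defaults to FunAudioLLM/SenseVoiceSmall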
              """
    )
  • The handler function that implements the logic for 'share_text_parse_tool_wrapper'. It delegates to the core 'share_text_parse_tool' utility function in utils/tools.py.
    async def share_text_parse_tool_wrapper(
        text: str,
        api_base_url: Optional[str] = None,
        model: Optional[str] = None,
        ctx: Context = None,
    ) -> Dict[str, Any]:
        """
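        Parse the Douyin share link and extract the watermark-free video URL
        Download the watermark-free video
        Extract the audio
        Transcribe the audio to text
        Clean up temporary files
    
        Parameters:
        - text: Douyin share text containing the share link
        - api_base_url: API base URL; defaults to SiliconFlow
        - model: speech recognition model; defaults to SenseVoiceSmall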
        """
        return await share_text_parse_tool(text, api_base_url, model, ctx)
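
The Input Schema table lists `text` as required and `api_base_url` and `model` as optional, which matches the defaults in the handler's signature. The sketch below shows one way to derive that split with `inspect.signature`. It is an illustration only, not the framework's actual schema generation; the helper `required_and_optional` and the choice to hide `ctx` (presumably injected by the server, since it is absent from the schema) are assumptions.

```python
import inspect
from typing import Any, Callable, Dict, List, Optional, Tuple


async def share_text_parse_tool_wrapper(
    text: str,
    api_base_url: Optional[str] = None,
    model: Optional[str] = None,
    ctx: Any = None,  # stand-in for the framework's Context type
) -> Dict[str, Any]:
    """Signature copied from the handler; the body is omitted in this sketch."""
    return {}


def required_and_optional(
    func: Callable[..., Any], hidden: Tuple[str, ...] = ("ctx",)
) -> Dict[str, List[str]]:
    """Split parameters into required and optional based on their defaults."""
    result: Dict[str, List[str]] = {"required": [], "optional": []}
    for name, param in inspect.signature(func).parameters.items():
        if name in hidden:
            continue  # assumed to be supplied by the server, not the caller
        has_default = param.default is not inspect.Parameter.empty
        result["optional" if has_default else "required"].append(name)
    return result


if __name__ == "__main__":
    print(required_and_optional(share_text_parse_tool_wrapper))
```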
  • Core helper function 'share_text_parse_tool' that contains the main implementation logic for parsing share text, downloading video, extracting audio, transcribing to text, and cleanup. Called by the wrapper handler.
    async def share_text_parse_tool(
        text: str,
        api_base_url: Optional[str] = None,
        model: Optional[str] = None,
        ctx: Context = None,
    ) -> Dict[str, Any]:
        """
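        Parse the Douyin share link and extract the watermark-free video URL
        Download the watermark-free video
        Extract the audio
        Transcribe the audio to text
        Clean up temporary files
    
        Parameters:
        - text: Douyin share text containing the share link
        - api_base_url: API base URL; defaults to SiliconFlow
        - model: speech recognition model; defaults to SenseVoiceSmall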
        """
        if not text or not isinstance(text, str):
            return create_error_response("The text parameter must not be empty")
        
        try:
            # Resolve the configuration parameters
            api_key, api_base_url, model = get_api_configuration(ctx, api_base_url, model)
            
            if not api_key:
                error_msg = "No API key was provided. Pass the apikey as a parameter or request header, or set the API_KEY environment variable; otherwise the video content extraction feature cannot be used!"
                logger.error(error_msg)
                return create_error_response(error_msg)
    
            processor = VideoProcessor(api_key, api_base_url, model)
    
            # Extract the video link from the share text
            video_share_url = extract_url_from_text(text)
            if not video_share_url:
                error_msg = f"Could not extract a video link from the text: {text}"
                logger.error(error_msg)
                return create_error_response(error_msg)
    
            video_obj = await parse_video_share_url(video_share_url)
            video_info = create_success_response("Parsed successfully", video_obj.__dict__)
    
            # Download the video and extract its text
            try:
                download_video_info = {
                    "url": video_info["data"]["video_url"],
                    "title": video_info["data"]["title"],
                    "video_id": str(int(time.time())),
                }
                
                logger.info(f"Starting video download: {download_video_info['title']}")
                video_path = await processor.download_video(download_video_info)
                # ctx defaults to None, and Context.info is a coroutine in FastMCP,
                # so each call is guarded and awaited.
                if ctx:
                    await ctx.info(f"Video downloaded to: {video_path}")
                
                logger.info("Starting audio extraction")
                audio_path = processor.extract_audio(video_path)
                if ctx:
                    await ctx.info(f"Audio extracted to: {audio_path}")
                
                logger.info("Starting audio-to-text transcription")
                text_content = processor.extract_text_from_audio(audio_path)
                if ctx:
                    await ctx.info(f"Extracted text: {text_content}")
                
                logger.info("Cleaning up temporary files")
                processor.cleanup_files(video_path, audio_path)
                if ctx:
                    await ctx.info(f"Temporary files cleaned up: {video_path}, {audio_path}")
                
                return {
                    "code": DEFAULT_RESPONSE_CODES['SUCCESS'],
                    "msg": "Parsed successfully",
                    "data": video_info["data"],
                    "text_content": text_content,
                }
            except Exception as processing_err:
                logger.error(f"Video processing failed: {processing_err}")
                # Clean up any temporary files that may exist. Cleanup only runs
                # when both paths were assigned, so a failure during audio
                # extraction leaves the downloaded video on disk.
                try:
                    if 'video_path' in locals() and 'audio_path' in locals():
                        processor.cleanup_files(video_path, audio_path)
                except Exception:
                    pass
                raise
        except ValueError as err:
            logger.error(f"Invalid parameter: {err}")
            if ctx:
                # Context.error is a coroutine in FastMCP, so it is awaited
                await ctx.error(f"Invalid parameter: {str(err)}")
            return create_error_response(str(err))
        except Exception as err:
            logger.error(f"Unknown error: {err}")
            if ctx:
                await ctx.error(f"Parsing failed: {str(err)}")
            return create_error_response(f"Parsing failed: {str(err)}")
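
The core function relies on helpers whose bodies are not shown here, including `extract_url_from_text` and `get_api_configuration`. The sketch below is a guess at how two such helpers might look and is not the project's actual code. The regular expression, the `resolve_api_key` name, the `apikey` header name, and the parameter, then header, then environment lookup order are all assumptions; only the `API_KEY` environment variable name comes from the error message above.

```python
import re
from typing import Mapping, Optional

# First http(s) URL made of ASCII URL characters. re.ASCII keeps \w from
# matching CJK text, so the match stops where the surrounding prose begins.
_URL_PATTERN = re.compile(r"https?://[\w\-./?%&=:#~+@]+", re.ASCII)


def extract_url_from_text(text: str) -> Optional[str]:
    """Return the first URL found in the share text, or None (hypothetical)."""
    match = _URL_PATTERN.search(text)
    return match.group(0) if match else None


def resolve_api_key(
    explicit: Optional[str],
    headers: Mapping[str, str],
    env: Mapping[str, str],
    header_name: str = "apikey",  # assumed header name
) -> Optional[str]:
    """Pick the first non-empty key: parameter, then header, then API_KEY."""
    for candidate in (explicit, headers.get(header_name), env.get("API_KEY")):
        if candidate:
            return candidate
    return None


if __name__ == "__main__":
    # Placeholder share text with a placeholder link.
    sample = "看看这个视频 https://example.com/abc123/ 复制此链接"
    print(extract_url_from_text(sample))
    print(resolve_api_key(None, {}, {"API_KEY": "from-env"}))
```

Passing the headers and environment in as plain mappings keeps the lookup easy to test without a running server.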
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It mentions the apikey requirement, which is a useful behavioral trait (authentication need). However, it lacks other critical details: whether this is a read-only or mutating operation, potential rate limits, error handling, or what the output contains (though an output schema exists). For a tool with no annotations, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized with three sentences: purpose statement, apikey requirement, and parameter list. It's front-loaded with the main function. However, the parameter explanations are brief and could be more integrated, and there's minor redundancy in stating defaults that are also in the schema.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations, 3 parameters with 0% schema coverage, and an output schema (which alleviates need to describe return values), the description is partially complete. It covers the apikey requirement and parameter basics, but misses behavioral context like operation type, error cases, or sibling differentiation. For a tool with authentication needs and sibling tools, this is a moderate gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate. It lists all three parameters (text, api_base_url, model) with brief explanations: text is 'Douyin share text containing the share link', api_base_url has a default, and model names a default speech recognition model. This adds meaning beyond the bare schema, but the explanations are minimal (e.g., no format details for text or valid values for model). With 0% coverage the baseline would be lower, but the description provides some compensation, warranting a 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'extract video content', which provides a basic purpose, but it's vague about what exactly is extracted (e.g., metadata, transcript, audio). It mentions 'Douyin share text' as the input, but doesn't clearly differentiate this tool from siblings like share_url_parse_tool_wrapper or video_id_parse_tool_wrapper in terms of specific functionality or resource focus.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description includes a prerequisite ('an API key must be provided; otherwise the video content extraction feature cannot be used'), which is helpful. However, it provides no guidance on when to use this tool versus the sibling tools (share_url_parse_tool_wrapper, video_id_parse_tool_wrapper), leaving the agent to guess based on names alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
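
To make the review points above concrete, here is one possible rewrite of the description, paired with a small check that every schema parameter is documented. The wording is a hypothetical illustration assembled from the behavior visible in the implementation (download, audio extraction, transcription, cleanup); it is not the author's text. The helper `undocumented_parameters` is likewise an assumption. Guidance on the sibling tools is left out because their behavior is not shown on this page.

```python
from typing import Iterable, List

# Hypothetical rewrite of the tool description; the wording is illustrative only.
IMPROVED_DESCRIPTION = """\
Transcribe the speech in a Douyin video to text.

Given share text that contains a Douyin share link, the tool downloads the
video to a temporary file, extracts the audio, sends it to a speech
recognition API, deletes the temporary files, and returns the video metadata
together with the transcript (`text_content`).

Requires an API key; without one the call returns an error.
Prefer this tool when a transcript of the spoken audio is needed.

Parameters:
- text: Douyin share text containing the share link (required)
- api_base_url: API base URL; defaults to siliconflow.cn
- model: speech recognition model; defaults to FunAudioLLM/SenseVoiceSmall
"""


def undocumented_parameters(description: str, names: Iterable[str]) -> List[str]:
    """Return the parameter names that lack a `- name:` entry in the description."""
    return [name for name in names if f"- {name}:" not in description]


if __name__ == "__main__":
    print(undocumented_parameters(IMPROVED_DESCRIPTION, ["text", "api_base_url", "model"]))
```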

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/yangbuyiya/yby6-crawling-short-video-mcp'
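
The same request can be sketched in Python with the standard library. The function names are made up for this example, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import urllib.request
from typing import Any

SERVER_URL = (
    "https://glama.ai/api/mcp/v1/servers/yangbuyiya/yby6-crawling-short-video-mcp"
)


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Build the GET request equivalent to the curl command."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> Any:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(json.dumps(fetch_server_info(), indent=2, ensure_ascii=False))
```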

If you have feedback or need assistance with the MCP directory API, please join our Discord server.