parse_generic_link
Extract video, image, and text content from any social media share link by parsing platform-independent URLs to retrieve titles, captions, and direct media URLs.
Instructions
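Parse any short-video or image-and-text post link, using the generic fallback logic directly.
Parameters:
- share_link: A share link from any platform, or text that contains a link (Douyin and Xiaohongshu links are also accepted)
Returns:
- A JSON string containing the resource link and related metadata
- Output fields match the other tools: platform/title/caption/url
- After the call completes, format the result as the following plain text and return it to the user (do not use Markdown):
Title (leave blank if none):
Caption:
Video/image link:
- Preserve the full title and caption; do not omit or truncate anything
- If parsing fails, an error message is returned (possible causes: the page has no direct media link, login is required, etc.)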
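The formatting step described above can be sketched as a small helper. This is a minimal illustration, not part of the server: the function name `format_for_user` and the English labels are assumptions made for this example, and the field names (`status`, `error`, `title`, `caption`, `url`) are taken from the JSON the tool returns.

```python
import json

# Hypothetical labels for this sketch; the tool's own instructions define the wording.
LABELS = ("Title", "Caption", "Video/image link")


def format_for_user(result_json: str, labels=LABELS) -> str:
    """Render the tool's JSON result as three plain-text lines (no Markdown)."""
    data = json.loads(result_json)

    # Error payloads carry a message instead of media fields.
    if data.get("status") == "error":
        return f"Parse failed: {data.get('error', 'unknown error')}"

    # Title is left empty when missing; nothing is truncated.
    values = (
        data.get("title") or "",
        data.get("caption") or "",
        data.get("url") or "",
    )
    return "\n".join(f"{label}: {value}" for label, value in zip(labels, values))
```

Passing the raw string returned by the tool is enough; the helper decodes it and handles both the success and the error shape.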
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| share_link | Yes | A share link from any platform, or text that contains a link | |
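Because `share_link` may be a bare link or a longer piece of text with a link inside it, the first step is to pull the URL out of the input. The sketch below shows one way to do that with a regular expression. The helper name `first_url` and the pattern are assumptions for illustration; the server's own `_extract_first_url` is not shown in this reference and may behave differently.

```python
import re

# ASCII-only URL characters, so text written right after the link
# (for example Chinese share captions) is not swallowed into the URL.
URL_PATTERN = re.compile(r"https?://[\w\-./?%&=#~:+@]+", re.ASCII)


def first_url(share_text: str) -> str:
    """Return the first http(s) URL found in share_text, or raise ValueError."""
    match = URL_PATTERN.search(share_text)
    if not match:
        raise ValueError("no URL found in share text")
    # Trim punctuation that often trails a pasted link.
    return match.group(0).rstrip(".,:")
```

A call such as `first_url("watch this https://example.com/abc123/ copy the link")` yields only the URL portion, which can then be fetched.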
Implementation Reference
- douyin_mcp_server/server.py:224-250 (handler) The main handler function for the 'parse_generic_link' tool, registered via @mcp.tool(). It invokes the generic media extractor and returns formatted JSON or an error.

      @mcp.tool()
      def parse_generic_link(share_link: str) -> str:
          """解析任意短视频/图文链接,直接启用 generic 兜底逻辑。

          参数:
          - share_link: 任意平台的分享链接或包含链接的文本(抖音/小红书亦可传入)

          返回:
          - 包含资源链接和信息的JSON字符串
          - 输出字段与其它工具一致:platform/title/caption/url
          - 调用完成后,请将结果整理为以下纯文本格式并反馈给用户(禁止使用Markdown):
            标题(如无则留空):
            文案:
            视频/图片链接:
          - 请完整保留标题与文案的全部内容,不要省略或截断
          - 若未能解析,将返回错误说明(可能原因:页面无直链、需要登录等)
          """
          try:
              result = extract_generic_media(share_link)
              result.setdefault("fallback_reason", "generic_tool_invocation")
              return json.dumps(result, ensure_ascii=False, indent=2)
          except Exception as e:
              return json.dumps({
                  "status": "error",
                  "error": f"通用解析失败: {e}"
              }, ensure_ascii=False, indent=2)
- Core helper function that performs the actual generic media extraction: it fetches the page, searches the HTML with regex patterns for a video URL (og:video, etc.), and extracts the title and caption. It raises ValueError when no direct media URL is found.

      def extract_generic_media(share_text: str) -> Dict[str, str]:
          """尝试从任意链接中提取无水印视频信息。

          返回:
              dict: 包含 platform/type/url/title/caption 的基础信息。
              失败时抛出 ValueError,便于调用方根据需要返回原始错误。
          """
          share_url = _extract_first_url(share_text)
          logger.debug("[GenericExtractor] 开始解析链接: %s", share_url)

          response = requests.get(share_url, headers=GENERIC_HEADERS, timeout=10, allow_redirects=True)
          response.raise_for_status()
          final_url = response.url
          html_text = response.text
          logger.debug("[GenericExtractor] 最终地址: %s", final_url)

          media_url = _find_media_url(html_text)
          if not media_url:
              raise ValueError("未从页面中发现可用的视频直链")

          title = (
              _extract_meta(html_text, "og:title")
              or _extract_meta(html_text, "twitter:title")
              or final_url
          )
          caption = _extract_meta(html_text, "og:description") or _extract_meta(html_text, "description")

          return {
              "status": "success",
              "type": "video",
              "platform": "generic",
              "title": title.strip() if title else None,
              "caption": caption.strip() if caption else None,
              "url": media_url,
              "source_url": final_url,
          }
- douyin_mcp_server/server.py:224-224 (registration) The @mcp.tool() decorator registers the parse_generic_link function as an MCP tool.

      @mcp.tool()
- douyin_mcp_server/server.py:225-225 (schema) The function signature defines the input schema (share_link: str) and the output (a JSON string).

      def parse_generic_link(share_link: str) -> str:
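The helper functions that the extractor calls to read meta tags and locate a media URL are not reproduced in this reference. The sketch below illustrates how regex-based extraction of this kind can work. Everything in it is an assumption for illustration: the names `meta_content` and `find_media_url`, the list of meta keys tried, and the fallback to a `<video>`/`<source>` tag are not taken from the server's code.

```python
import html
import re
from typing import Optional

# Meta keys tried in order; an assumed list, not the server's actual one.
MEDIA_META_KEYS = ("og:video", "og:video:url", "og:video:secure_url", "twitter:player:stream")


def meta_content(html_text: str, key: str) -> Optional[str]:
    """Return the content of the first <meta> tag whose property or name equals key."""
    for tag in re.findall(r"<meta\b[^>]*>", html_text, flags=re.IGNORECASE):
        # Collect quoted attributes of this tag as a dict, e.g. {"property": "og:title", ...}.
        attrs = {
            name.lower(): value
            for name, _quote, value in re.findall(r'([\w:-]+)\s*=\s*(["\'])(.*?)\2', tag)
        }
        if key in (attrs.get("property"), attrs.get("name")) and attrs.get("content"):
            return html.unescape(attrs["content"])
    return None


def find_media_url(html_text: str) -> Optional[str]:
    """Look for a direct media URL in meta tags, then in <video>/<source> tags."""
    for key in MEDIA_META_KEYS:
        url = meta_content(html_text, key)
        if url:
            return url
    match = re.search(
        r'<(?:video|source)\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1',
        html_text,
        flags=re.IGNORECASE,
    )
    return html.unescape(match.group(2)) if match else None


SAMPLE = """<html><head>
<meta property="og:title" content="Demo &amp; clip">
<meta name="description" content="A short caption">
<meta property="og:video" content="https://cdn.example.com/v.mp4">
</head></html>"""

print(meta_content(SAMPLE, "og:title"))
print(find_media_url(SAMPLE))
```

Regex parsing is fragile on malformed or script-rendered pages, which matches the failure causes listed in the instructions (no direct link on the page, login required). A page that builds its player in JavaScript will usually return nothing here.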