parse_douyin_video_info
Extract video details from Douyin share links using the MCP server. Input a share link to receive structured video information in JSON format for analysis or integration.
Instructions
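Parse a Douyin share link and return basic video information.
Parameters:
- share_link: a Douyin share link, or text containing one
Returns:
- Video information (JSON-formatted string)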
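
Because the tool returns a JSON string rather than a structured object, a caller has to decode it and check the `status` field itself. The sketch below shows one way to do that, assuming the field names used by the handler listed under Implementation Reference (`video_id`, `title`, `download_url`, `status`, `error`). The helper name `unwrap_tool_result` and all sample values are hypothetical.

```python
import json


def unwrap_tool_result(raw: str) -> dict:
    """Decode the tool's JSON string and raise if it reports an error."""
    payload = json.loads(raw)
    if payload.get("status") != "success":
        # On failure the handler returns {"status": "error", "error": "..."}.
        raise RuntimeError(payload.get("error", "unknown error"))
    return payload


# Hypothetical sample response; every value here is made up for illustration.
sample = json.dumps({
    "video_id": "7000000000000000000",
    "title": "example title",
    "download_url": "https://example.com/play/video.mp4",
    "status": "success",
})

info = unwrap_tool_result(sample)
print(info["video_id"], info["title"])
```

Raising on `"status": "error"` keeps the failure path out of the calling code, which can then treat the returned dictionary as a successful result.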
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| share_link | Yes | Douyin share link, or text containing one | |
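
As a rough illustration of how the single `share_link` argument is passed, the sketch below builds an MCP `tools/call` request body as a plain dictionary. It assumes the standard JSON-RPC 2.0 envelope used by MCP. The function name `build_call_request` and the share text are hypothetical, and most clients would send this through an MCP client library rather than by hand.

```python
import json


def build_call_request(share_link: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request that invokes the tool with one argument."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "parse_douyin_video_info",
            "arguments": {"share_link": share_link},
        },
    }


# Hypothetical share text; the link is a placeholder, not a real video.
request = build_call_request("Check this out https://v.douyin.com/AbCdEf/ copy to open")
print(json.dumps(request, ensure_ascii=False, indent=2))
```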
Implementation Reference
- douyin_mcp_server/server.py:271-298 (handler) The primary handler function for the 'parse_douyin_video_info' tool, registered with the @mcp.tool() decorator. It instantiates DouyinProcessor, calls parse_share_url to extract video information from the share link, and returns the result as a JSON string. Any exception is caught and reported as a JSON error object instead of being raised.

  ```python
  @mcp.tool()
  def parse_douyin_video_info(share_link: str) -> str:
      """
      Parse a Douyin share link and return basic video information.

      Parameters:
      - share_link: a Douyin share link, or text containing one

      Returns:
      - Video information (JSON-formatted string)
      """
      try:
          processor = DouyinProcessor("")  # no API key is needed to parse a link
          video_info = processor.parse_share_url(share_link)

          return json.dumps({
              "video_id": video_info["video_id"],
              "title": video_info["title"],
              "download_url": video_info["url"],
              "status": "success"
          }, ensure_ascii=False, indent=2)

      except Exception as e:
          return json.dumps({
              "status": "error",
              "error": str(e)
          }, ensure_ascii=False, indent=2)
  ```
- douyin_mcp_server/server.py:59-110 (helper) Helper method of the DouyinProcessor class that does the core parsing. It finds the first URL in the share text, follows it to obtain the video ID, fetches the share page, and reads the JSON embedded in the page to extract the watermark-free video URL, the title, and the video ID.

  ```python
  def parse_share_url(self, share_text: str) -> dict:
      """Extract the watermark-free video link from the share text."""
      # Extract the share link
      urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', share_text)
      if not urls:
          raise ValueError("未找到有效的分享链接")  # "No valid share link found"

      share_url = urls[0]
      share_response = requests.get(share_url, headers=HEADERS)
      # The last path segment of the redirected URL is the video ID
      video_id = share_response.url.split("?")[0].strip("/").split("/")[-1]
      share_url = f'https://www.iesdouyin.com/share/video/{video_id}'

      # Fetch the video page content
      response = requests.get(share_url, headers=HEADERS)
      response.raise_for_status()

      pattern = re.compile(
          pattern=r"window\._ROUTER_DATA\s*=\s*(.*?)</script>",
          flags=re.DOTALL,
      )
      find_res = pattern.search(response.text)

      if not find_res or not find_res.group(1):
          raise ValueError("从HTML中解析视频信息失败")  # "Failed to parse video info from the HTML"

      # Parse the JSON data
      json_data = json.loads(find_res.group(1).strip())
      VIDEO_ID_PAGE_KEY = "video_(id)/page"
      NOTE_ID_PAGE_KEY = "note_(id)/page"

      if VIDEO_ID_PAGE_KEY in json_data["loaderData"]:
          original_video_info = json_data["loaderData"][VIDEO_ID_PAGE_KEY]["videoInfoRes"]
      elif NOTE_ID_PAGE_KEY in json_data["loaderData"]:
          original_video_info = json_data["loaderData"][NOTE_ID_PAGE_KEY]["videoInfoRes"]
      else:
          raise Exception("无法从JSON中解析视频或图集信息")  # "Could not parse video or image-set info from the JSON"

      data = original_video_info["item_list"][0]

      # Get the video information
      video_url = data["video"]["play_addr"]["url_list"][0].replace("playwm", "play")
      desc = data.get("desc", "").strip() or f"douyin_{video_id}"

      # Replace characters that are illegal in file names
      desc = re.sub(r'[\\/:*?"<>|]', '_', desc)

      return {
          "url": video_url,
          "title": desc,
          "video_id": video_id
      }
  ```
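
The parsing steps in the helper can be tried without any network access. The following is a minimal offline sketch that reuses the same regular expressions and dictionary keys on made-up input. The function names (`first_url`, `video_id_from_url`, `item_from_html`), the sample share text, and the sample page are all hypothetical, and the real page structure may differ or change over time.

```python
import json
import re

# Same patterns as the helper, compiled once.
URL_RE = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)
ROUTER_DATA_RE = re.compile(r"window\._ROUTER_DATA\s*=\s*(.*?)</script>", re.DOTALL)


def first_url(share_text: str) -> str:
    """Return the first URL found in free-form share text."""
    urls = URL_RE.findall(share_text)
    if not urls:
        raise ValueError("no share link found")
    return urls[0]


def video_id_from_url(final_url: str) -> str:
    """Take the last path segment of the URL, ignoring the query string."""
    return final_url.split("?")[0].strip("/").split("/")[-1]


def item_from_html(html: str) -> dict:
    """Pull the first item out of the JSON assigned to window._ROUTER_DATA."""
    match = ROUTER_DATA_RE.search(html)
    if not match:
        raise ValueError("router data not found in HTML")
    loader = json.loads(match.group(1).strip())["loaderData"]
    for key in ("video_(id)/page", "note_(id)/page"):
        if key in loader:
            return loader[key]["videoInfoRes"]["item_list"][0]
    raise ValueError("no video or note entry in router data")


# Made-up inputs that only mimic the shapes the helper expects.
share_text = "7.43 copy and open Douyin https://v.douyin.com/AbCdEf/ to watch"
final_url = "https://www.iesdouyin.com/share/video/7000000000000000000/?region=CN"
sample_state = {"loaderData": {"video_(id)/page": {"videoInfoRes": {"item_list": [{
    "desc": " demo: clip? ",
    "video": {"play_addr": {"url_list": ["https://example.com/playwm/?video_id=abc"]}},
}]}}}}
sample_html = f"<script>window._ROUTER_DATA = {json.dumps(sample_state)}</script>"

link = first_url(share_text)
video_id = video_id_from_url(final_url)
item = item_from_html(sample_html)
video_url = item["video"]["play_addr"]["url_list"][0].replace("playwm", "play")
title = re.sub(r'[\\/:*?"<>|]', "_", item.get("desc", "").strip() or f"douyin_{video_id}")
print(link, video_id, video_url, title)
```

Splitting the steps into small pure functions makes each one testable on its own; only the two HTTP requests in the original helper need a live connection.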