find_similar_news
Identifies news articles similar to a given headline using similarity scoring, helping users discover related coverage and track news topics across sources.
Instructions
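Find other news items whose titles are similar to a given news headline.
Args:
- reference_title: the news headline (full or partial)
- threshold: similarity threshold between 0 and 1, default 0.6. Note: a higher threshold matches more strictly and returns fewer results.
- limit: maximum number of items to return, default 50, maximum 100. Note: the actual number returned depends on the similarity matches and may be lower than requested.
- include_url: whether to include URL links, default False (saves tokens)
Returns: a JSON-formatted list of similar news items, including similarity scores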
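The interaction between threshold and limit can be illustrated with a small sketch. The scores below are made-up values and `select_matches` is a hypothetical helper, not part of the tool; it only mirrors the documented behavior that a stricter threshold yields fewer results and that limit caps whatever remains.

```python
def select_matches(scores, threshold=0.6, limit=50):
    """Keep titles scoring at or above threshold, best first, capped at limit."""
    kept = [(title, score) for title, score in scores.items() if score >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:limit]


# Hypothetical similarity scores for four candidate titles.
scores = {"A": 0.91, "B": 0.72, "C": 0.61, "D": 0.40}

loose = select_matches(scores, threshold=0.6)            # three titles pass
strict = select_matches(scores, threshold=0.8)           # only one title passes
capped = select_matches(scores, threshold=0.6, limit=2)  # limit trims the list
```

With these values, raising the threshold from 0.6 to 0.8 shrinks the result from three items to one, which is why a request for 50 items can return far fewer.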
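Important: data display strategy
This tool returns the complete list of similar news items.
Default display: show all returned news items (including similarity scores).
Filter the list only when the user explicitly asks to "summarize" or "pick the highlights".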
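As a rough illustration of the default display strategy, the sketch below renders every returned item together with its similarity score. `format_similar_news` is a hypothetical helper; the field names (`similar_news`, `title`, `platform_name`, `similarity`, `url`) follow the handler's return value, while the sample data is invented.

```python
def format_similar_news(result):
    """Render every returned item, including its similarity score."""
    lines = []
    for index, item in enumerate(result.get("similar_news", []), start=1):
        line = f"{index}. [{item['similarity']:.3f}] {item['title']} ({item['platform_name']})"
        # The url field is only present when include_url was set to True.
        if "url" in item:
            line += f" {item['url']}"
        lines.append(line)
    return "\n".join(lines)


# Invented sample payload shaped like the handler's output.
sample = {
    "similar_news": [
        {"title": "Tesla cuts prices again", "platform_name": "Example News",
         "similarity": 0.812},
        {"title": "Tesla price cut announced", "platform_name": "Sample Daily",
         "similarity": 0.7, "url": "https://example.com/a"},
    ]
}
text = format_similar_news(sample)
```

No item is dropped or reordered here; any filtering would be a separate step, applied only on explicit request.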
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
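| reference_title | Yes | News headline to match against (full or partial) | |
| threshold | No | Similarity threshold between 0 and 1; higher values match more strictly and return fewer results | 0.6 |
| limit | No | Maximum number of items to return (at most 100); the actual count may be lower | 50 |
| include_url | No | Whether to include URL links | False |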
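A client might check its arguments against this schema before calling the tool. The following is a minimal sketch of such a check; `build_arguments` is a hypothetical client-side helper, and rejecting an out-of-range limit is an assumption made here (how the server's own `validate_limit` treats such values is not documented in this section).

```python
def build_arguments(reference_title, threshold=0.6, limit=50, include_url=False):
    """Return an arguments dict for find_similar_news, or raise ValueError."""
    title = reference_title.strip()
    if not title:
        raise ValueError("reference_title must not be empty")
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    # Assumption: reject rather than clamp values outside 1..100.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    return {
        "reference_title": title,
        "threshold": threshold,
        "limit": limit,
        "include_url": include_url,
    }


arguments = build_arguments("  Tesla price cut ")
```

Only `reference_title` is required; the other three keys fall back to the documented defaults.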
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | JSON-formatted list of similar news items, including similarity scores | |
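A caller typically needs to distinguish a successful payload from an error payload. The sketch below assumes the payload has the shape the handler returns (a `success` flag plus either `similar_news` or an `error` object); `unwrap_similar_news` is a hypothetical helper, and the `code` and `message` keys are read defensively because the exact error layout may vary.

```python
def unwrap_similar_news(result):
    """Return the list of similar items, or raise if the tool reported an error."""
    if result.get("success"):
        return result.get("similar_news", [])
    error = result.get("error", {})
    code = error.get("code", "UNKNOWN")
    message = error.get("message", "no message")
    raise RuntimeError(f"{code}: {message}")


# Invented payloads shaped like the two branches of the handler.
ok_payload = {"success": True, "similar_news": [{"title": "t", "similarity": 0.7}]}
failed_payload = {"success": False,
                  "error": {"code": "INTERNAL_ERROR", "message": "boom"}}

items = unwrap_similar_news(ok_payload)
```

Note that an empty match set is reported through the error branch, since the handler raises when nothing reaches the threshold.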
Implementation Reference
- mcp_server/tools/analytics.py:910-1029 (handler): The actual handler implementation for finding similar news items based on title similarity.
    def find_similar_news(
        self,
        reference_title: str,
        threshold: float = 0.6,
        limit: int = 50,
        include_url: bool = False
    ) -> Dict:
        """
        Similar news lookup: find related news based on title similarity.

        Args:
            reference_title: Reference title
            threshold: Similarity threshold (between 0 and 1)
            limit: Maximum number of items to return, default 50
            include_url: Whether to include URL links, default False (saves tokens)

        Returns:
            List of similar news items

        Examples:
            Example user queries:
            - "Find news similar to 'Tesla price cut'"
            - "Look for similar reports about the iPhone launch"
            - "See whether there are reports similar to this news item"

            Example code call:
            >>> tools = AnalyticsTools()
            >>> result = tools.find_similar_news(
            ...     reference_title="Tesla announces price cut",
            ...     threshold=0.6,
            ...     limit=10
            ... )
            >>> print(result['similar_news'])
        """
        try:
            # Validate parameters
            reference_title = validate_keyword(reference_title)

            if not 0 <= threshold <= 1:
                raise InvalidParameterError(
                    "threshold must be between 0 and 1",
                    suggestion="Recommended value: 0.5-0.8"
                )

            limit = validate_limit(limit, default=50)

            # Read data
            all_titles, id_to_name, _ = self.data_service.parser.read_all_titles_for_date()

            # Compute similarity against every title on every platform
            similar_items = []

            for platform_id, titles in all_titles.items():
                platform_name = id_to_name.get(platform_id, platform_id)

                for title, info in titles.items():
                    # Skip the reference title itself
                    if title == reference_title:
                        continue

                    # Compute similarity
                    similarity = self._calculate_similarity(reference_title, title)

                    if similarity >= threshold:
                        news_item = {
                            "title": title,
                            "platform": platform_id,
                            "platform_name": platform_name,
                            "similarity": round(similarity, 3),
                            "rank": info["ranks"][0] if info["ranks"] else 0
                        }

                        # Conditionally add the URL field
                        if include_url:
                            news_item["url"] = info.get("url", "")

                        similar_items.append(news_item)

            # Sort by similarity, highest first
            similar_items.sort(key=lambda x: x["similarity"], reverse=True)

            # Limit the number of results
            result_items = similar_items[:limit]

            if not result_items:
                raise DataNotFoundError(
                    f"No news found with similarity above {threshold}",
                    suggestion="Lower the similarity threshold or try a different title"
                )

            result = {
                "success": True,
                "summary": {
                    "total_found": len(similar_items),
                    "returned_count": len(result_items),
                    "requested_limit": limit,
                    "threshold": threshold,
                    "reference_title": reference_title
                },
                "similar_news": result_items
            }

            if len(similar_items) < limit:
                result["note"] = f"Only {len(similar_items)} similar news items found at similarity threshold {threshold}"

            return result

        except MCPError as e:
            return {
                "success": False,
                "error": e.to_dict()
            }
        except Exception as e:
            return {
                "success": False,
                "error": {
                    "code": "INTERNAL_ERROR",
                    "message": str(e)
                }
            }
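The handler relies on `self._calculate_similarity`, whose body is not shown in this reference. As a rough illustration of how title similarity could be scored, the sketch below uses `difflib.SequenceMatcher` from the Python standard library; this is an assumption for demonstration and may differ from the project's actual metric. The function names and sample titles are hypothetical.

```python
from difflib import SequenceMatcher


def calculate_similarity(text_a: str, text_b: str) -> float:
    """Return a score in [0, 1]; 1.0 means the two strings are identical."""
    # Assumption: character-level matching; the real helper is not shown.
    return SequenceMatcher(None, text_a, text_b).ratio()


def rank_titles(reference_title, titles, threshold=0.6):
    """Score each title against the reference and keep those at or above threshold."""
    scored = [
        (title, round(calculate_similarity(reference_title, title), 3))
        for title in titles
        if title != reference_title  # skip the reference itself, as the handler does
    ]
    kept = [pair for pair in scored if pair[1] >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept


ranked = rank_titles(
    "Tesla cuts prices",
    ["Tesla cuts price", "Tesla cuts prices", "zzz qqq"],
)
```

A character-level ratio like this works on partial titles and needs no tokenizer, which suits mixed-language headlines, though it cannot recognize paraphrases that share few characters.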