TrendRadar

by funinii

find_similar_news

Identifies news articles similar to a given headline using similarity scoring, helping users discover related coverage and track news topics across sources.

Instructions

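Find other news items similar to a specified news headline.

Args:

  • reference_title: The news headline (full or partial).

  • threshold: Similarity threshold between 0 and 1; default 0.6. A higher threshold makes matching stricter and returns fewer results.

  • limit: Maximum number of results to return; default 50, maximum 100. The actual count depends on the similarity matches and may be lower than requested.

  • include_url: Whether to include URL links; default False (saves tokens).

Returns: A JSON-formatted list of similar news items, including similarity scores.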

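Important: data display strategy

  • This tool returns the complete list of similar news items.

  • Default presentation: show all returned news items (including similarity scores).

  • Filter the results only when the user explicitly asks for a "summary" or "highlights".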
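
As a rough illustration of the display strategy above, the sketch below renders every returned item together with its similarity score. The response keys (`success`, `similar_news`, `error`) follow the handler's return value shown under Implementation Reference; the `render_similar_news` helper, the platform names, and the sample data are hypothetical and exist only for this example.

```python
from typing import Any, Dict, List


def render_similar_news(result: Dict[str, Any]) -> List[str]:
    """Render every returned item (hypothetical helper, not part of TrendRadar)."""
    if not result.get("success"):
        error = result.get("error", {})
        return [f"Error: {error.get('message', 'unknown error')}"]

    lines = []
    for item in result.get("similar_news", []):
        # Show the score first so that no item is silently filtered out.
        line = f"{item['similarity']:.3f}  [{item['platform_name']}] {item['title']}"
        if "url" in item:  # only present when include_url=True
            line += f"  {item['url']}"
        lines.append(line)
    return lines


# Made-up sample response, for illustration only.
sample = {
    "success": True,
    "similar_news": [
        {"title": "Tesla cuts prices again", "platform": "weibo",
         "platform_name": "Weibo", "similarity": 0.82, "rank": 3},
        {"title": "Tesla announces new pricing", "platform": "zhihu",
         "platform_name": "Zhihu", "similarity": 0.64, "rank": 7,
         "url": "https://example.com/a"},
    ],
}

for line in render_similar_news(sample):
    print(line)
```

Filtering or summarizing would be a separate step, applied only on explicit user request.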

Input Schema

Name             Required  Description  Default
reference_title  Yes
threshold        No
limit            No
include_url      No

Output Schema

Name    Required  Description  Default
result  Yes

Implementation Reference

  • The actual handler implementation for finding similar news items based on title similarity.
    def find_similar_news(
        self,
        reference_title: str,
        threshold: float = 0.6,
        limit: int = 50,
        include_url: bool = False
    ) -> Dict:
        """
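        Similar-news lookup: find related news based on title similarity.
    
        Args:
            reference_title: Reference headline
            threshold: Similarity threshold (between 0 and 1)
            limit: Maximum number of results to return, default 50
            include_url: Whether to include URL links, default False (saves tokens)
    
        Returns:
            A list of similar news items
    
        Examples:
            Example user requests:
            - "Find news similar to 'Tesla price cut'"
            - "Look for similar coverage of the iPhone launch"
            - "See whether there are reports similar to this news item"
    
            Example call:
            >>> tools = AnalyticsTools()
            >>> result = tools.find_similar_news(
            ...     reference_title="特斯拉宣布降价",  # "Tesla announces price cut"
            ...     threshold=0.6,
            ...     limit=10
            ... )
            >>> print(result['similar_news'])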
        """
        try:
            # Validate parameters
            reference_title = validate_keyword(reference_title)
    
            if not 0 <= threshold <= 1:
                raise InvalidParameterError(
                    "threshold must be between 0 and 1",
                    suggestion="Recommended range: 0.5-0.8"
                )
    
            limit = validate_limit(limit, default=50)
    
            # Load data
            all_titles, id_to_name, _ = self.data_service.parser.read_all_titles_for_date()
    
            # Compute similarity
            similar_items = []
    
            for platform_id, titles in all_titles.items():
                platform_name = id_to_name.get(platform_id, platform_id)
    
                for title, info in titles.items():
                    if title == reference_title:
                        continue
    
                    # Score this title against the reference title
                    similarity = self._calculate_similarity(reference_title, title)
    
                    if similarity >= threshold:
                        news_item = {
                            "title": title,
                            "platform": platform_id,
                            "platform_name": platform_name,
                            "similarity": round(similarity, 3),
                            "rank": info["ranks"][0] if info["ranks"] else 0
                        }
    
                        # Add the URL field only when requested
                        if include_url:
                            news_item["url"] = info.get("url", "")
    
                        similar_items.append(news_item)
    
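            # Sort by similarity, highest first
            similar_items.sort(key=lambda x: x["similarity"], reverse=True)
    
            # Apply the result limit
            result_items = similar_items[:limit]
    
            if not result_items:
                raise DataNotFoundError(
                    f"No news found with similarity of at least {threshold}",
                    suggestion="Lower the similarity threshold or try a different title"
                )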
    
            result = {
                "success": True,
                "summary": {
                    "total_found": len(similar_items),
                    "returned_count": len(result_items),
                    "requested_limit": limit,
                    "threshold": threshold,
                    "reference_title": reference_title
                },
                "similar_news": result_items
            }
    
            if len(similar_items) < limit:
                result["note"] = f"Only {len(similar_items)} similar news items found at similarity threshold {threshold}"
    
            return result
    
        except MCPError as e:
            return {
                "success": False,
                "error": e.to_dict()
            }
        except Exception as e:
            return {
                "success": False,
                "error": {
                    "code": "INTERNAL_ERROR",
                    "message": str(e)
                }
            }
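
The handler delegates scoring to `self._calculate_similarity`, whose body is not shown on this page. The following is a minimal sketch of one way such a helper could work, assuming a character-level ratio from Python's standard `difflib.SequenceMatcher`. That choice is an assumption made for illustration; the actual TrendRadar implementation may use a different measure.

```python
from difflib import SequenceMatcher


def calculate_similarity(text1: str, text2: str) -> float:
    """Return a similarity score in [0, 1].

    Assumed approach for illustration; not confirmed by the source.
    """
    if not text1 or not text2:
        return 0.0
    # ratio() is 2 * M / T, where M is the number of matching characters
    # and T is the combined length of both strings.
    return SequenceMatcher(None, text1, text2).ratio()


if __name__ == "__main__":
    score = calculate_similarity("Tesla announces price cut", "Tesla cuts prices")
    print(round(score, 3))
```

A score produced this way falls in the 0 to 1 range the handler validates for `threshold`, so the comparison `similarity >= threshold` applies directly.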
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses key behavioral traits: the tool returns a JSON list with similarity scores, explains the threshold parameter's effect on result strictness, notes that actual returned count may be less than the limit, mentions token-saving with include_url default, and details output display strategies (full list by default, filtered only on user request). This covers operational behavior well beyond basic function.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and well-structured with clear sections (purpose, Args, Returns, important notes). Every sentence adds value: the opening states purpose, parameter explanations are necessary given 0% schema coverage, return format is specified, and the display strategy section provides crucial usage context. Minor verbosity in the display strategy could be tightened, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (4 parameters, similarity matching logic), the absence of annotations, and the presence of an output schema (implied by 'Returns: JSON格式的相似新闻列表', that is, a JSON-formatted list of similar news), the description is highly complete. It covers purpose, all parameter semantics, behavioral traits, return format, and display strategies. The output schema handles return values, so the description appropriately focuses on usage context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 0%, so the description must fully compensate. It provides detailed semantic explanations for all 4 parameters: reference_title as the news title (full or partial), threshold as similarity score range 0-1 with default and effect explanation, limit with default, maximum, and actual result caveat, and include_url with default and token-saving rationale. This adds substantial meaning beyond the bare schema.
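
The documented constraints can be sketched as a small client-side check run before calling the tool. This is a hedged illustration only: `build_arguments` is a hypothetical helper, and clamping `limit` to the documented maximum of 100 is an assumption, since the page does not say whether the server clamps or rejects larger values.

```python
from typing import Any, Dict


def build_arguments(
    reference_title: str,
    threshold: float = 0.6,
    limit: int = 50,
    include_url: bool = False,
) -> Dict[str, Any]:
    """Build an argument dict for find_similar_news (hypothetical helper)."""
    title = reference_title.strip()
    if not title:
        raise ValueError("reference_title must not be empty")
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    # Assumption: clamp to the documented range (default 50, maximum 100).
    limit = max(1, min(limit, 100))
    return {
        "reference_title": title,
        "threshold": threshold,
        "limit": limit,
        "include_url": include_url,
    }


if __name__ == "__main__":
    print(build_arguments("Tesla announces price cut", threshold=0.7, limit=500))
```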

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '查找与指定新闻标题相似的其他新闻' (find news similar to a specified news title). It specifies the verb '查找' (find) and resource '其他新闻' (other news), and distinguishes from siblings like 'search_news' or 'search_related_news_history' by focusing on similarity matching rather than general search or historical analysis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: to find similar news based on title similarity. It doesn't explicitly state when not to use it or name alternatives, but the sibling tools list includes 'search_news' and 'search_related_news_history', which are implied alternatives for different search needs. The '重要:数据展示策略' (Important: data display strategy) section adds usage guidance for output handling.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/funinii/TrendRadar'
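
The same request can be prepared from Python with the standard library. This is a minimal sketch: the response format is not described on this page, so the code assumes a JSON body, and `build_request` and `fetch_server_info` are hypothetical helper names.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/funinii/TrendRadar"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Prepare a GET request equivalent to the curl command."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    print(fetch_server_info())
```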

If you have feedback or need assistance with the MCP directory API, please join our Discord server.