
export_claim_matrix_grouped_v1

Export grouped claim matrices from academic literature to organize research findings by categories for systematic analysis.

Instructions

导出按分组聚合的结论矩阵。 (Export the claim matrix aggregated by group.)

Input Schema

Name     Required  Description  Default
comm_id  No
pack_id  No

Output Schema

No arguments

Implementation Reference

  • The @mcp.tool()-decorated handler function that executes the core logic: constructs a SQL query to fetch grouped claims filtered by comm_id or pack_id, aggregates top claims per group, and returns structured output.
    @mcp.tool()
    def export_claim_matrix_grouped_v1(
        comm_id: int | None = None,
        pack_id: int | None = None,
    ) -> dict[str, Any]:
        """Export the claim matrix aggregated by group."""
        try:
            where_clauses = []
            params = []
            
            if comm_id is not None:  # explicit check: a comm_id of 0 would otherwise be ignored
                where_clauses.append("""
                    EXISTS (
                        SELECT 1 FROM claim_group_members cgm
                        JOIN claims c ON c.claim_id = cgm.claim_id
                        JOIN mentions m ON m.doc_id = c.doc_id
                        JOIN community_members cm ON cm.entity_id = m.entity_id
                        WHERE cm.comm_id = %s AND cgm.group_id = g.group_id
                    )
                """)
                params.append(comm_id)
            elif pack_id is not None:
                where_clauses.append("""
                    EXISTS (
                        SELECT 1 FROM claim_group_members cgm
                        JOIN claims c ON c.claim_id = cgm.claim_id
                        JOIN evidence_pack_items epi ON epi.chunk_id = c.chunk_id
                        WHERE epi.pack_id = %s AND cgm.group_id = g.group_id
                    )
                """)
                params.append(pack_id)
            
            where_sql = " WHERE " + " AND ".join(where_clauses) if where_clauses else ""
            
            sql = f"""
                SELECT 
                    g.group_id, g.group_key, g.sign, g.id_family, g.setting,
                    e.canonical_name as topic_name,
                    (SELECT COUNT(*) FROM claim_group_members cgm WHERE cgm.group_id = g.group_id) as member_count,
                    (
                        -- LIMIT must sit inside the derived table: applied after
                        -- json_agg it would cap result rows (always 1), not the array
                        SELECT json_agg(t)
                        FROM (
                            SELECT c.claim_id, c.doc_id, c.claim_text, c.confidence
                            FROM claim_group_members cgm
                            JOIN claims c ON c.claim_id = cgm.claim_id
                            WHERE cgm.group_id = g.group_id
                            ORDER BY c.confidence DESC  -- "top" = highest confidence
                            LIMIT 5
                        ) t
                    ) as top_claims
                FROM claim_groups g
                LEFT JOIN entities e ON e.entity_id = g.topic_entity_id
                {where_sql}
                ORDER BY member_count DESC
            """
            rows = query_all(sql, tuple(params))
            
            return ExportClaimMatrixGroupedOut(groups=rows).model_dump()
            
        except Exception as e:
            return ExportClaimMatrixGroupedOut(
                error=MCPErrorModel(code="SYSTEM_ERROR", message=str(e))
            ).model_dump()
  • Pydantic models defining input (comm_id or pack_id optional) and output (list of group dicts with details like group_id, topic_name, member_count, top_claims; optional error) schemas for the tool.
    class ExportClaimMatrixGroupedIn(BaseModel):
        """Input for export_claim_matrix_grouped_v1."""
        comm_id: Optional[int] = None
        pack_id: Optional[int] = None
    
    
    class ExportClaimMatrixGroupedOut(BaseModel):
        """Output for export_claim_matrix_grouped_v1."""
        groups: list[dict[str, Any]] = Field(default_factory=list)
        error: Optional[MCPErrorModel] = None
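The models above define the error-envelope convention both tools follow: failures populate the error field of a normal payload instead of raising to the caller. A plain-dict sketch of the same pattern (wrapper name and dict shape are illustrative, not part of the server):

```python
from typing import Any, Callable

def run_with_envelope(fn: Callable[[], list[dict[str, Any]]]) -> dict[str, Any]:
    """Return {"groups": ..., "error": None} on success, or a populated error on failure."""
    try:
        return {"groups": fn(), "error": None}
    except Exception as e:  # mirrors the broad except in the handlers
        return {"groups": [], "error": {"code": "SYSTEM_ERROR", "message": str(e)}}
```

This keeps the tool's return shape stable for MCP clients: agents can always inspect the error key rather than handling transport-level exceptions.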
  • Top-level registration call in the main MCP server that invokes the module's register function, thereby registering export_claim_matrix_grouped_v1 via its @mcp.tool() decorator.
    register_graph_claim_grouping_tools(mcp)
  • Module-level registration function that defines and registers both build_claim_groups_v1 and export_claim_matrix_grouped_v1 tools using @mcp.tool() decorators.
    def register_graph_claim_grouping_tools(mcp: FastMCP) -> None:
        """Register the claim-grouping tools."""
    
        @mcp.tool()
        def build_claim_groups_v1(
            scope: str = "all",
            max_claims_per_doc: int = 100,  # v1.1: raised from 20 to improve coverage
            dry_run: bool = False,
        ) -> dict[str, Any]:
            """Group/cluster claims.

            Args:
                scope: processing scope: "all", "comm_id:...", or "doc_ids:id1,id2"
                max_claims_per_doc: maximum number of claims to process per document
                dry_run: preview only; when True nothing is written to the database
            """
            try:
                # 1. Determine the scope
                params = []
                where_clauses = []
                
                if scope.startswith("comm_id:"):
                    comm_id = int(scope.split(":", 1)[1])
                    where_clauses.append("""
                        EXISTS (
                            SELECT 1 FROM community_members cm
                            JOIN mentions m ON m.entity_id = cm.entity_id
                            WHERE cm.comm_id = %s AND m.doc_id = c.doc_id
                        )
                    """)
                    params.append(comm_id)
                elif scope.startswith("doc_ids:"):
                    doc_ids = scope.split(":", 1)[1].split(",")
                    where_clauses.append("c.doc_id = ANY(%s)")
                    params.append(doc_ids)
    
                where_sql = " WHERE " + " AND ".join(where_clauses) if where_clauses else ""
                
                # 2. Fetch claims and their associated topic / identification info
                sql = f"""
                    SELECT 
                        c.claim_id, c.doc_id, c.claim_text, c.sign, c.conditions,
                        (SELECT e.canonical_key FROM relations r 
                         JOIN entities e ON e.entity_id = r.obj_entity_id
                         WHERE r.subj_entity_id = (SELECT entity_id FROM entities WHERE type='Paper' AND canonical_key = c.doc_id)
                         AND r.predicate = 'PAPER_HAS_TOPIC' LIMIT 1) as topic_key,
                        (SELECT e.canonical_key FROM relations r 
                         JOIN entities e ON e.entity_id = r.obj_entity_id
                         WHERE r.subj_entity_id = (SELECT entity_id FROM entities WHERE type='Paper' AND canonical_key = c.doc_id)
                         AND r.predicate = 'PAPER_IDENTIFIES_WITH' LIMIT 1) as id_family
                    FROM claims c
                    {where_sql}
                    ORDER BY c.doc_id, c.confidence DESC
                """
                claims = query_all(sql, tuple(params))
                
                if not claims:
                    return BuildClaimGroupsOut(new_groups=0, total_members=0).model_dump()
    
                # 3. Grouping logic
                groups: dict[str, list[int]] = defaultdict(list)
                group_details: dict[str, dict] = {}
                
                doc_counts: dict[str, int] = defaultdict(int)
                
                for c in claims:
                    if doc_counts[c["doc_id"]] >= max_claims_per_doc:
                        continue
                    doc_counts[c["doc_id"]] += 1
                    
                    # Build the group_key
                    topic_key = c["topic_key"] or "unknown_topic"
                    sign = c["sign"] or "null"
                    id_family = c["id_family"] or "general"
                    
                    # Try to extract outcome/treatment from conditions or claim_text
                    conditions = c["conditions"] or {}
                    outcome_family = conditions.get("outcome_family") or extract_family(c["claim_text"], "outcome_gen")
                    treatment_family = conditions.get("treatment_family") or extract_family(c["claim_text"], "treatment_gen")
                    setting = conditions.get("setting") or "general"
                    
                    group_key = f"{topic_key}|{outcome_family}|{treatment_family}|{sign}|{id_family}|{setting}"
                    groups[group_key].append(c["claim_id"])
                    
                    if group_key not in group_details:
                        # Look up topic_entity_id
                        topic_ent = query_one("SELECT entity_id FROM entities WHERE canonical_key = %s", (topic_key,))
                        group_details[group_key] = {
                            "topic_entity_id": topic_ent["entity_id"] if topic_ent else None,
                            "sign": sign,
                            "setting": setting,
                            "id_family": id_family
                        }
    
                if dry_run:
                    return BuildClaimGroupsOut(
                        new_groups=len(groups),
                        total_members=sum(len(v) for v in groups.values())
                    ).model_dump()
    
                # 4. Write to the database
                total_members = 0
                with get_db() as conn:
                    for key, claim_ids in groups.items():
                        details = group_details[key]
                        try:
                            with conn.cursor() as cur:
                                with conn.transaction():
                                    cur.execute("""
                                        INSERT INTO claim_groups (group_key, topic_entity_id, sign, setting, id_family)
                                        VALUES (%s, %s, %s, %s, %s)
                                        ON CONFLICT (group_key) DO UPDATE SET updated_at = now()
                                        RETURNING group_id
                                    """, (key, details["topic_entity_id"], details["sign"], details["setting"], details["id_family"]))
                                    group_id = cur.fetchone()["group_id"]
                                    
                                    for cid in claim_ids:
                                        cur.execute("""
                                            INSERT INTO claim_group_members (group_id, claim_id)
                                            VALUES (%s, %s)
                                            ON CONFLICT (group_id, claim_id) DO NOTHING
                                        """, (group_id, cid))
                                        total_members += cur.rowcount
                        except Exception as e:
                            print(f"Error processing group {key}: {e}")
    
                    # Compute the overall total (queried before commit)
                    with conn.cursor() as cur:
                        cur.execute("SELECT COUNT(*) as count FROM claim_groups")
                        new_groups_total = cur.fetchone()["count"]
    
                return BuildClaimGroupsOut(
                    new_groups=new_groups_total,
                    total_members=total_members
                ).model_dump()
    
            except Exception as e:
                return BuildClaimGroupsOut(
                    new_groups=0,
                    total_members=0,
                    error=MCPErrorModel(code="SYSTEM_ERROR", message=str(e))
                ).model_dump()
    
        @mcp.tool()
        def export_claim_matrix_grouped_v1(
            comm_id: int | None = None,
            pack_id: int | None = None,
        ) -> dict[str, Any]:
            """Export the claim matrix aggregated by group."""
            ...  # body identical to the handler shown in full above
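The grouping step above keys each claim on a pipe-delimited tuple of topic, outcome family, treatment family, sign, identification family, and setting. A standalone sketch of that key construction (extract_family here is a trivial stand-in for the server's helper):

```python
def make_group_key(claim: dict, extract_family=lambda text, kind: "general") -> str:
    """Build the pipe-delimited group key used by build_claim_groups_v1."""
    conditions = claim.get("conditions") or {}
    topic_key = claim.get("topic_key") or "unknown_topic"
    sign = claim.get("sign") or "null"
    id_family = claim.get("id_family") or "general"
    outcome = conditions.get("outcome_family") or extract_family(claim["claim_text"], "outcome_gen")
    treatment = conditions.get("treatment_family") or extract_family(claim["claim_text"], "treatment_gen")
    setting = conditions.get("setting") or "general"
    return f"{topic_key}|{outcome}|{treatment}|{sign}|{id_family}|{setting}"
```

Claims from different documents that share all six components land in the same group, which is the unit export_claim_matrix_grouped_v1 later aggregates.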
Behavior: 1/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure but does not meet it. It doesn't indicate whether this is a read-only export, whether it triggers computation, what format the output takes (though an output schema exists), or whether there are side effects such as file generation. For a tool with 'export' in its name and no annotations, this lack of transparency is critical.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence in Chinese that directly states the tool's function without fluff. It's appropriately sized for a basic export tool, though it could be more front-loaded with key details. The brevity is a strength, but it borders on under-specification rather than optimal conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (implied by 'grouped, aggregated'), its two parameters with 0% schema description coverage, and its lack of annotations, the description is incomplete despite the presence of an output schema. It doesn't clarify the grouping mechanism, the aggregation logic, or how the parameters influence the output. The output schema mitigates some gaps, but the description lacks essential context for effective use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate but adds no parameter information. It doesn't explain what 'comm_id' or 'pack_id' mean, their roles in grouping/aggregation, or how they affect the export. With 2 parameters entirely undocumented in both schema and description, the description fails to provide necessary semantic context, scoring below the baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 2/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description '导出按分组聚合的结论矩阵' (Export grouped aggregated claim matrix) restates the tool name 'export_claim_matrix_grouped_v1' with minimal elaboration. It specifies the action (export) and resource (claim matrix) but lacks detail on what 'grouped aggregated' means operationally. Compared to sibling 'export_claim_matrix_grouped_v1_2', it doesn't differentiate itself, making it a tautological restatement of the name rather than a clear purpose statement.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 1/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives like 'export_claim_matrix_grouped_v1_2' or other export tools (e.g., 'export_evidence_matrix_v1'). The description offers no context, prerequisites, or exclusions, leaving the agent with no usage direction. This is a significant gap given the presence of similar sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/h-lu/paperlib-mcp'
