build_community_evidence_pack

Collects evidence from academic literature by sampling chunks from the mentions of a community's top entities, building structured evidence packs for research analysis.

Instructions

Build an evidence pack for a community

Samples chunks from the mentions of the community's top entities and writes them into an evidence pack.

Args: comm_id: community ID; max_chunks: maximum number of chunks, default 100; per_doc_limit: maximum number of chunks per document, default 4

Returns: evidence pack info, including pack_id, document count, and chunk count
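The sampling is a greedy cap: candidate chunks are walked in descending confidence order, and a chunk is kept only while its document is under per_doc_limit and the running total is under max_chunks. A minimal sketch of that selection logic (function and data names here are illustrative, not part of the server's API):

```python
from collections import defaultdict

def select_chunks(mentions, max_chunks=100, per_doc_limit=4):
    """Greedily pick (doc_id, chunk_id) pairs, capping picks per document.

    `mentions` is assumed to be pre-sorted by descending confidence,
    mirroring the ORDER BY conf DESC in the tool's SQL query.
    """
    doc_counts = defaultdict(int)
    selected = []
    for m in mentions:
        if doc_counts[m["doc_id"]] < per_doc_limit:
            selected.append((m["doc_id"], m["chunk_id"]))
            doc_counts[m["doc_id"]] += 1
            if len(selected) >= max_chunks:
                break
    return selected

mentions = [
    {"doc_id": "a", "chunk_id": 1},
    {"doc_id": "a", "chunk_id": 2},
    {"doc_id": "a", "chunk_id": 3},
    {"doc_id": "b", "chunk_id": 7},
]
print(select_chunks(mentions, max_chunks=3, per_doc_limit=2))
# → [('a', 1), ('a', 2), ('b', 7)]
```

Because the candidates arrive confidence-sorted, the per-document cap spreads the pack across documents without sacrificing the strongest evidence from each one.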

Input Schema

Name           Required  Description                            Default
comm_id        Yes       Community ID                           -
max_chunks     No        Maximum number of chunks in the pack   100
per_doc_limit  No        Maximum chunks per document            4

Implementation Reference

  • The core execution logic for the 'build_community_evidence_pack' tool. Validates community, retrieves top-weighted entity members, samples high-confidence mentions/chunks with limits, creates an evidence pack in the database, and returns pack metadata.
    # Standard-library imports used by this excerpt (project helpers such as
    # query_one, query_all, get_db, and the Pydantic models are defined in the
    # server's own modules)
    import json
    from collections import defaultdict
    from typing import Any
    @mcp.tool()
    def build_community_evidence_pack(
        comm_id: int,
        max_chunks: int = 100,
        per_doc_limit: int = 4,
    ) -> dict[str, Any]:
        """为社区构建证据包
        
        从社区 top entities 的 mentions 中采样 chunks,写入证据包。
        
        Args:
            comm_id: 社区 ID
            max_chunks: 最大 chunk 数量,默认 100
            per_doc_limit: 每篇文档最多 chunk 数,默认 4
            
        Returns:
            证据包信息,包含 pack_id、文档数和 chunk 数
        """
        try:
            # Verify the community exists
            community = query_one(
                "SELECT comm_id, level FROM communities WHERE comm_id = %s",
                (comm_id,)
            )
            
            if not community:
                return BuildCommunityEvidencePackOut(
                    pack_id=0,
                    docs=0,
                    chunks=0,
                    error=MCPErrorModel(code="NOT_FOUND", message=f"Community {comm_id} not found"),
                ).model_dump()
            
            # Fetch community members (ordered by weight)
            members = query_all(
                """
                SELECT entity_id, weight
                FROM community_members
                WHERE comm_id = %s
                ORDER BY weight DESC
                """,
                (comm_id,)
            )
            
            if not members:
                return BuildCommunityEvidencePackOut(
                    pack_id=0,
                    docs=0,
                    chunks=0,
                    error=MCPErrorModel(code="NOT_FOUND", message="No members in community"),
                ).model_dump()
            
            entity_ids = [m["entity_id"] for m in members]
            
            # Map these entities' mentions to chunks
            mentions = query_all(
                """
                SELECT m.doc_id, m.chunk_id, MAX(m.confidence) AS conf
                FROM mentions m
                WHERE m.entity_id = ANY(%s)
                GROUP BY m.doc_id, m.chunk_id
                ORDER BY conf DESC
                LIMIT 5000
                """,
                (entity_ids,)
            )
            
            if not mentions:
                return BuildCommunityEvidencePackOut(
                    pack_id=0,
                    docs=0,
                    chunks=0,
                    error=MCPErrorModel(code="NOT_FOUND", message="No mentions found for community entities"),
                ).model_dump()
            
            # Apply per_doc_limit
            doc_counts: dict[str, int] = defaultdict(int)
            selected_chunks: list[tuple[str, int]] = []
            
            for m in mentions:
                if doc_counts[m["doc_id"]] < per_doc_limit:
                    selected_chunks.append((m["doc_id"], m["chunk_id"]))
                    doc_counts[m["doc_id"]] += 1
                    
                    if len(selected_chunks) >= max_chunks:
                        break
            
            if not selected_chunks:
                return BuildCommunityEvidencePackOut(
                    pack_id=0,
                    docs=0,
                    chunks=0,
                    error=MCPErrorModel(code="NOT_FOUND", message="No chunks selected"),
                ).model_dump()
            
            # Create the evidence pack
            with get_db() as conn:
                with conn.cursor() as cur:
                    # Create the pack
                    cur.execute(
                        """
                        INSERT INTO evidence_packs(query, params_json)
                        VALUES (%s, %s::jsonb)
                        RETURNING pack_id
                        """,
                        (
                            f"Community {comm_id} evidence",
                            json.dumps({
                                "comm_id": comm_id,
                                "max_chunks": max_chunks,
                                "per_doc_limit": per_doc_limit,
                            })
                        )
                    )
                    result = cur.fetchone()
                    pack_id = result["pack_id"]
                    
                    # Write the pack items
                    for rank, (doc_id, chunk_id) in enumerate(selected_chunks):
                        cur.execute(
                            """
                            INSERT INTO evidence_pack_items(pack_id, doc_id, chunk_id, rank)
                            VALUES (%s, %s, %s, %s)
                            ON CONFLICT DO NOTHING
                            """,
                            (pack_id, doc_id, chunk_id, rank)
                        )
            
            return BuildCommunityEvidencePackOut(
                pack_id=pack_id,
                docs=len(doc_counts),
                chunks=len(selected_chunks),
            ).model_dump()
            
        except Exception as e:
            return BuildCommunityEvidencePackOut(
                pack_id=0,
                docs=0,
                chunks=0,
                error=MCPErrorModel(code="DB_CONN_ERROR", message=str(e)),
            ).model_dump()
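The GROUP BY / MAX(confidence) query above collapses raw mention rows to one entry per (doc_id, chunk_id), keeping the highest confidence and sorting by it. An equivalent pure-Python sketch of that deduplication step (illustrative only, not part of the server):

```python
def dedupe_mentions(rows):
    """Collapse mention rows to one entry per (doc_id, chunk_id), keeping
    the maximum confidence, then sort by confidence descending — mirroring
    the tool's GROUP BY ... MAX(confidence) ... ORDER BY conf DESC query."""
    best = {}
    for r in rows:
        key = (r["doc_id"], r["chunk_id"])
        if key not in best or r["confidence"] > best[key]:
            best[key] = r["confidence"]
    return sorted(
        ({"doc_id": d, "chunk_id": c, "conf": conf}
         for (d, c), conf in best.items()),
        key=lambda r: r["conf"],
        reverse=True,
    )

rows = [
    {"doc_id": "a", "chunk_id": 1, "confidence": 0.6},
    {"doc_id": "a", "chunk_id": 1, "confidence": 0.9},
    {"doc_id": "b", "chunk_id": 2, "confidence": 0.7},
]
print(dedupe_mentions(rows))
# → [{'doc_id': 'a', 'chunk_id': 1, 'conf': 0.9},
#    {'doc_id': 'b', 'chunk_id': 2, 'conf': 0.7}]
```

Doing this in SQL keeps the candidate set small (capped at 5000 rows) before the Python-side per-document filtering runs.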
  • Pydantic models defining the input parameters (comm_id, max_chunks, per_doc_limit) and output structure (pack_id, docs, chunks, optional error) for the tool.
    class BuildCommunityEvidencePackIn(BaseModel):
        """build_community_evidence_pack 输入"""
        comm_id: int
        max_chunks: int = 100
        per_doc_limit: int = 4
    
    
    class BuildCommunityEvidencePackOut(BaseModel):
        """build_community_evidence_pack 输出"""
        pack_id: int
        docs: int
        chunks: int
        error: Optional[MCPErrorModel] = None
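The output model doubles as an envelope: on success `error` stays None, and on failure the counts are zeroed while `error` carries a code and message. As a rough illustration of that contract (a sketch using stdlib dataclasses, not the server's actual Pydantic models):

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class ErrorModel:  # stands in for MCPErrorModel
    code: str
    message: str

@dataclass
class EvidencePackOut:  # mirrors BuildCommunityEvidencePackOut
    pack_id: int
    docs: int
    chunks: int
    error: Optional[ErrorModel] = None

# Success envelope: error stays None and is still serialized.
ok = asdict(EvidencePackOut(pack_id=7, docs=3, chunks=12))
print(ok)  # {'pack_id': 7, 'docs': 3, 'chunks': 12, 'error': None}

# Failure envelope: counts are zeroed and error is populated.
err = asdict(EvidencePackOut(0, 0, 0, ErrorModel("NOT_FOUND", "Community 99 not found")))
print(err["error"]["code"])  # NOT_FOUND
```

Returning a uniform shape for both outcomes lets MCP clients branch on the presence of `error` rather than on exceptions.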
  • Invocation of register_graph_community_tools on the MCP instance, which defines and registers the build_community_evidence_pack tool using the @mcp.tool() decorator.
    register_graph_community_tools(mcp)
