build_community_evidence_pack
Builds a structured evidence pack for research analysis by sampling chunks of academic literature from the mentions of a community's top-weighted entities.
Instructions
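Build an evidence pack for a community.
Sample chunks from the mentions of the community's top entities and write them to an evidence pack.
Args:
- comm_id: community ID
- max_chunks: maximum number of chunks, default 100
- per_doc_limit: maximum number of chunks per document, default 4
Returns: evidence pack information, including pack_id, the number of documents, and the number of chunks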
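Because the tool returns a plain dictionary, a caller can check the `error` field before trusting the counts. The sketch below assumes the output shape described in the Implementation Reference (`pack_id`, `docs`, `chunks`, and an optional `error` carrying `code` and `message`). `describe_pack_result` is a hypothetical helper for illustration, not part of the server.

```python
from typing import Any, Optional


def describe_pack_result(result: dict[str, Any]) -> str:
    """Turn the tool's returned dictionary into a one-line summary.

    Hypothetical helper: assumes the keys pack_id, docs, chunks and an
    optional error object carrying code and message.
    """
    error: Optional[dict[str, Any]] = result.get("error")
    if error:
        # Failure results carry pack_id=0 together with an error object.
        return f"failed [{error.get('code')}]: {error.get('message')}"
    return (
        f"pack {result['pack_id']}: "
        f"{result['chunks']} chunks from {result['docs']} documents"
    )


# Illustrative values only; they are not real tool output.
ok = {"pack_id": 7, "docs": 3, "chunks": 10, "error": None}
missing = {
    "pack_id": 0,
    "docs": 0,
    "chunks": 0,
    "error": {"code": "NOT_FOUND", "message": "Community 99 not found"},
}
print(describe_pack_result(ok))
print(describe_pack_result(missing))
```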
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| comm_id | Yes | Community ID | |
| max_chunks | No | Maximum number of chunks | 100 |
| per_doc_limit | No | Maximum number of chunks per document | 4 |
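As a rough illustration of the schema, a client could assemble the arguments as follows. `build_arguments` is a hypothetical client-side helper. The defaults mirror the tool signature, while rejecting limits below 1 is an assumption added here, since the handler shown on this page does not validate them.

```python
from typing import Any, Optional

# Defaults taken from the tool signature: max_chunks=100, per_doc_limit=4.
DEFAULT_MAX_CHUNKS = 100
DEFAULT_PER_DOC_LIMIT = 4


def build_arguments(
    comm_id: int,
    max_chunks: Optional[int] = None,
    per_doc_limit: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble the arguments dictionary for build_community_evidence_pack.

    Hypothetical helper. Rejecting non-positive limits is an assumption
    made for this sketch, not behavior of the server.
    """
    args = {
        "comm_id": comm_id,
        "max_chunks": DEFAULT_MAX_CHUNKS if max_chunks is None else max_chunks,
        "per_doc_limit": (
            DEFAULT_PER_DOC_LIMIT if per_doc_limit is None else per_doc_limit
        ),
    }
    for name in ("max_chunks", "per_doc_limit"):
        if args[name] < 1:
            raise ValueError(f"{name} must be at least 1, got {args[name]}")
    return args


print(build_arguments(12))
print(build_arguments(12, max_chunks=20, per_doc_limit=2))
```

With a limit of 0 the handler would select nothing and return a `NOT_FOUND` error, so failing early on the client side is one possible design choice.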
Implementation Reference
- The core execution logic for the 'build_community_evidence_pack' tool. Validates the community, retrieves its entity members ordered by weight, samples high-confidence mentions/chunks within the limits, creates an evidence pack in the database, and returns the pack metadata.

  ```python
  import json
  from collections import defaultdict
  from typing import Any

  # query_one, query_all, get_db, MCPErrorModel and BuildCommunityEvidencePackOut
  # are project helpers defined elsewhere in the repository.


  @mcp.tool()
  def build_community_evidence_pack(
      comm_id: int,
      max_chunks: int = 100,
      per_doc_limit: int = 4,
  ) -> dict[str, Any]:
      """Build an evidence pack for a community.

      Sample chunks from the mentions of the community's top entities and
      write them to an evidence pack.

      Args:
          comm_id: Community ID
          max_chunks: Maximum number of chunks, default 100
          per_doc_limit: Maximum number of chunks per document, default 4

      Returns:
          Evidence pack information, including pack_id, document count,
          and chunk count
      """
      try:
          # Verify that the community exists
          community = query_one(
              "SELECT comm_id, level FROM communities WHERE comm_id = %s",
              (comm_id,)
          )

          if not community:
              return BuildCommunityEvidencePackOut(
                  pack_id=0,
                  docs=0,
                  chunks=0,
                  error=MCPErrorModel(code="NOT_FOUND", message=f"Community {comm_id} not found"),
              ).model_dump()

          # Fetch the community members, ordered by weight
          members = query_all(
              """
              SELECT entity_id, weight
              FROM community_members
              WHERE comm_id = %s
              ORDER BY weight DESC
              """,
              (comm_id,)
          )

          if not members:
              return BuildCommunityEvidencePackOut(
                  pack_id=0,
                  docs=0,
                  chunks=0,
                  error=MCPErrorModel(code="NOT_FOUND", message="No members in community"),
              ).model_dump()

          entity_ids = [m["entity_id"] for m in members]

          # Fetch the mentions of these entities, one row per (doc, chunk)
          mentions = query_all(
              """
              SELECT m.doc_id, m.chunk_id, MAX(m.confidence) AS conf
              FROM mentions m
              WHERE m.entity_id = ANY(%s)
              GROUP BY m.doc_id, m.chunk_id
              ORDER BY conf DESC
              LIMIT 5000
              """,
              (entity_ids,)
          )

          if not mentions:
              return BuildCommunityEvidencePackOut(
                  pack_id=0,
                  docs=0,
                  chunks=0,
                  error=MCPErrorModel(code="NOT_FOUND", message="No mentions found for community entities"),
              ).model_dump()

          # Apply per_doc_limit and max_chunks
          doc_counts: dict[str, int] = defaultdict(int)
          selected_chunks: list[tuple[str, int]] = []

          for m in mentions:
              if doc_counts[m["doc_id"]] < per_doc_limit:
                  selected_chunks.append((m["doc_id"], m["chunk_id"]))
                  doc_counts[m["doc_id"]] += 1

              if len(selected_chunks) >= max_chunks:
                  break

          if not selected_chunks:
              return BuildCommunityEvidencePackOut(
                  pack_id=0,
                  docs=0,
                  chunks=0,
                  error=MCPErrorModel(code="NOT_FOUND", message="No chunks selected"),
              ).model_dump()

          # Create the evidence pack
          with get_db() as conn:
              with conn.cursor() as cur:
                  # Create the pack row
                  cur.execute(
                      """
                      INSERT INTO evidence_packs(query, params_json)
                      VALUES (%s, %s::jsonb)
                      RETURNING pack_id
                      """,
                      (
                          f"Community {comm_id} evidence",
                          json.dumps({
                              "comm_id": comm_id,
                              "max_chunks": max_chunks,
                              "per_doc_limit": per_doc_limit,
                          })
                      )
                  )
                  result = cur.fetchone()
                  pack_id = result["pack_id"]

                  # Write the pack items
                  for rank, (doc_id, chunk_id) in enumerate(selected_chunks):
                      cur.execute(
                          """
                          INSERT INTO evidence_pack_items(pack_id, doc_id, chunk_id, rank)
                          VALUES (%s, %s, %s, %s)
                          ON CONFLICT DO NOTHING
                          """,
                          (pack_id, doc_id, chunk_id, rank)
                      )

          return BuildCommunityEvidencePackOut(
              pack_id=pack_id,
              docs=len(doc_counts),
              chunks=len(selected_chunks),
          ).model_dump()

      except Exception as e:
          return BuildCommunityEvidencePackOut(
              pack_id=0,
              docs=0,
              chunks=0,
              error=MCPErrorModel(code="DB_CONN_ERROR", message=str(e)),
          ).model_dump()
  ```
- Pydantic models defining the input parameters (comm_id, max_chunks, per_doc_limit) and output structure (pack_id, docs, chunks, optional error) for the tool.

  ```python
  from typing import Optional

  from pydantic import BaseModel

  # MCPErrorModel is defined elsewhere in the project.


  class BuildCommunityEvidencePackIn(BaseModel):
      """Input for build_community_evidence_pack."""

      comm_id: int
      max_chunks: int = 100
      per_doc_limit: int = 4


  class BuildCommunityEvidencePackOut(BaseModel):
      """Output of build_community_evidence_pack."""

      pack_id: int
      docs: int
      chunks: int
      error: Optional[MCPErrorModel] = None
  ```
- src/paperlib_mcp/server.py:42-42 (registration): Invocation of register_graph_community_tools on the MCP instance, which defines and registers the build_community_evidence_pack tool using the @mcp.tool() decorator.

  ```python
  register_graph_community_tools(mcp)
  ```
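The selection step in the handler can be tried without a database. The following is a minimal sketch under stated assumptions: `rank_mentions` is a hypothetical in-memory stand-in for the `GROUP BY` / `MAX(confidence)` / `ORDER BY conf DESC` query (the `LIMIT 5000` is omitted), and `select_chunks` mirrors the handler's loop over the ranked rows. Neither function exists in the project, and the sample mentions are made up.

```python
from collections import defaultdict
from typing import Iterable

Mention = tuple[str, int, float]  # (doc_id, chunk_id, confidence)


def rank_mentions(mentions: Iterable[Mention]) -> list[Mention]:
    """Keep the best confidence per (doc_id, chunk_id), highest first."""
    best: dict[tuple[str, int], float] = {}
    for doc_id, chunk_id, conf in mentions:
        key = (doc_id, chunk_id)
        if key not in best or conf > best[key]:
            best[key] = conf
    # sorted() is stable, so equal confidences keep first-seen order.
    ordered = sorted(best.items(), key=lambda item: item[1], reverse=True)
    return [(doc_id, chunk_id, conf) for (doc_id, chunk_id), conf in ordered]


def select_chunks(
    ranked: Iterable[Mention],
    max_chunks: int = 100,
    per_doc_limit: int = 4,
) -> list[tuple[str, int]]:
    """Walk ranked chunks, capping chunks per document and in total."""
    doc_counts: dict[str, int] = defaultdict(int)
    selected: list[tuple[str, int]] = []
    for doc_id, chunk_id, _conf in ranked:
        if doc_counts[doc_id] < per_doc_limit:
            selected.append((doc_id, chunk_id))
            doc_counts[doc_id] += 1
        if len(selected) >= max_chunks:
            break
    return selected


# Made-up mentions: document "a" dominates the high-confidence rows.
sample = [
    ("a", 1, 0.90),
    ("a", 2, 0.80),
    ("a", 1, 0.95),  # same chunk mentioned again with higher confidence
    ("a", 3, 0.70),
    ("b", 1, 0.60),
    ("c", 5, 0.50),
]
picked = select_chunks(rank_mentions(sample), max_chunks=3, per_doc_limit=2)
print(picked)  # [('a', 1), ('a', 2), ('b', 1)]
```

The per-document cap is what keeps one heavily mentioned paper from filling the whole pack: the third chunk of document "a" is skipped even though it outranks the chunk from "b".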