# compute_topic_df_cache
Calculates document frequency cache for Topic entities to optimize search performance in academic literature management systems.
## Instructions

Computes the document frequency cache for Topic entities.
## Input Schema

| Name | Required | Description | Default |
|---|---|---|---|
| _No arguments_ | | | |
## Implementation Reference

The handler function for the `compute_topic_df_cache` tool. It is decorated with `@mcp.tool()`, which registers it with the MCP server. The function computes the document frequency (DF) cache for Topic entities by inserting or updating rows in the `entity_stats` table, counting the distinct Papers linked to each Topic through `PAPER_HAS_TOPIC` relations.

```python
@mcp.tool()
def compute_topic_df_cache() -> dict[str, Any]:
    """Compute the document frequency cache for Topic entities."""
    try:
        with get_db() as conn:
            with conn.cursor() as cur:
                # Count how many distinct Papers each Topic appears in
                cur.execute("""
                    INSERT INTO entity_stats (entity_id, doc_frequency)
                    SELECT x.entity_id, COUNT(DISTINCT p.entity_id)
                    FROM relations r
                    JOIN entities p ON p.entity_id = r.subj_entity_id AND p.type = 'Paper'
                    JOIN entities x ON x.entity_id = r.obj_entity_id AND x.type = 'Topic'
                    WHERE r.predicate = 'PAPER_HAS_TOPIC'
                    GROUP BY x.entity_id
                    ON CONFLICT (entity_id) DO UPDATE
                        SET doc_frequency = EXCLUDED.doc_frequency,
                            updated_at = now()
                """)
                cur.execute("SELECT COUNT(*) AS n FROM entity_stats WHERE doc_frequency > 0")
                count = cur.fetchone()["n"]
                return {"topics_cached": count}
    except Exception as e:
        return {"error": str(e)}
```
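The SQL above counts, for each Topic, the number of distinct Papers related to it via `PAPER_HAS_TOPIC`. As a minimal, database-free sketch of that aggregation (the relation tuples and IDs here are hypothetical, and the helper `compute_topic_df` is not part of the tool):

```python
from collections import defaultdict

def compute_topic_df(relations):
    """Count distinct papers per topic from (subj, predicate, obj) tuples."""
    papers_by_topic = defaultdict(set)
    for paper_id, predicate, topic_id in relations:
        if predicate == "PAPER_HAS_TOPIC":
            # A set of paper IDs mirrors COUNT(DISTINCT p.entity_id)
            papers_by_topic[topic_id].add(paper_id)
    return {topic: len(papers) for topic, papers in papers_by_topic.items()}

relations = [
    ("paper-1", "PAPER_HAS_TOPIC", "topic-nlp"),
    ("paper-2", "PAPER_HAS_TOPIC", "topic-nlp"),
    ("paper-2", "PAPER_HAS_TOPIC", "topic-ir"),
    ("paper-1", "PAPER_CITES", "paper-2"),  # ignored: different predicate
]
print(compute_topic_df(relations))  # → {'topic-nlp': 2, 'topic-ir': 1}
```

The `ON CONFLICT ... DO UPDATE` clause in the real handler makes repeated runs idempotent: recomputing simply overwrites each topic's cached count.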