
MCP-based Knowledge Graph Construction System

by turambar928

build_knowledge_graph

Automatically processes text data to assess quality, enrich information, and generate structured knowledge graphs with interactive visualizations.

Instructions

Fully automatic knowledge graph construction: automatically assesses data quality, completes knowledge, builds the graph, and generates a visualization.

Input Schema

Name         Required  Description                                Default
text         Yes       Text data to process
output_file  No        Visualization output filename (optional)   knowledge_graph.html
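Before invoking the tool, a caller can check its arguments against this schema. The sketch below is a hypothetical client-side helper (`validate_arguments` is not part of the server) that enforces the one required parameter and applies the documented default:

```python
def validate_arguments(arguments: dict) -> dict:
    """Return normalized arguments, or raise ValueError if 'text' is missing or blank."""
    text = arguments.get("text", "")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("'text' is required and must be a non-empty string")
    return {
        "text": text,
        # Fall back to the default documented in the schema above.
        "output_file": arguments.get("output_file", "knowledge_graph.html"),
    }

args = validate_arguments({"text": "Alice works at Acme Corp."})
print(args["output_file"])  # → knowledge_graph.html
```

This mirrors the server's own handling, which reads `output_file` with the same default.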

Implementation Reference

  • The 'build_knowledge_graph_tool' function implementation in kg_server_enhanced.py, which serves as the handler for the 'build_knowledge_graph' MCP tool. It extracts entities and triples from the input text and saves a visualization.
    # Requires (from the enclosing module): import json, os, time;
    # from typing import Any; from mcp.types import TextContent
    async def build_knowledge_graph_tool(arguments: dict[str, Any]) -> list[TextContent]:
        """
        Build the knowledge graph (no quality assessment, knowledge completion,
        or other content enhancement).
        """
        try:
            text = arguments.get("text", "")
            output_file = arguments.get("output_file", "knowledge_graph.html")
    
            if not text.strip():
                return [TextContent(
                    type="text",
                    text=json.dumps({
                        "success": False,
                        "error": "Input text must not be empty"
                    }, ensure_ascii=False, indent=2)
                )]
    
            start_time = time.time()
    
            # Build the knowledge graph directly
            kg_result = await kg_builder.build_graph(text, use_llm=True)
    
            # Check whether any entities or triples were extracted
            if not kg_result["entities"] and not kg_result["triples"]:
                return [TextContent(
                    type="text",
                    text=json.dumps({
                        "success": False,
                        "error": "No valid entities or relations could be extracted from the input text",
                        "suggestion": "Try text that contains explicit entities and relations"
                    }, ensure_ascii=False, indent=2)
                )]
    
            # Generate the visualization
            visualization_file = kg_visualizer.save_simple_visualization(
                kg_result["triples"],
                kg_result["entities"],
                kg_result["relations"],
                output_file
            )
    
            abs_path = os.path.abspath(visualization_file)
            visualization_url = f"file:///{abs_path.replace(os.sep, '/')}"
            http_url = f"http://localhost:8000/{visualization_file}"
            server_info = (
                "To serve it over HTTP, run 'python -m http.server 8000' "
                f"in the project directory, then open {http_url}"
            )
    
            processing_time = time.time() - start_time
    
            # Assemble the result
            result = {
                "success": True,
                "input_text": text,
                "processing_time": round(processing_time, 3),
                "knowledge_graph": {
                    "entities_count": len(kg_result["entities"]),
                    "relations_count": len(kg_result["relations"]),
                    "triples_count": len(kg_result["triples"]),
                    "entities": kg_result["entities"],
                    "relations": kg_result["relations"]
                },
                "visualization": {
                    "file_path": visualization_file,
                    "file_url": visualization_url,
                    "http_url": http_url,
                    "server_info": server_info
                }
            }
    
            return [TextContent(
                type="text",
                text=json.dumps(result, ensure_ascii=False, indent=2)
            )]
        except Exception as e:
            # The original excerpt is truncated here; the try block needs a
            # matching except that surfaces the error to the caller.
            return [TextContent(
                type="text",
                text=json.dumps({
                    "success": False,
                    "error": str(e)
                }, ensure_ascii=False, indent=2)
            )]
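Because the handler serializes its result as a JSON string inside a `TextContent`, a caller must decode that string before using the counts or file paths. A minimal sketch, using a simulated payload that mirrors the result structure above (the concrete values are illustrative, not real tool output):

```python
import json

# Simulated handler output mirroring the documented result structure.
payload = json.dumps({
    "success": True,
    "knowledge_graph": {"entities_count": 3, "relations_count": 2, "triples_count": 2},
    "visualization": {"file_path": "knowledge_graph.html"},
})

def summarize(text: str) -> str:
    """Turn the tool's JSON payload into a one-line summary, or report failure."""
    data = json.loads(text)
    if not data.get("success"):
        return f"failed: {data.get('error', 'unknown error')}"
    kg = data["knowledge_graph"]
    return (f"{kg['entities_count']} entities, {kg['triples_count']} triples "
            f"-> {data['visualization']['file_path']}")

print(summarize(payload))  # → 3 entities, 2 triples -> knowledge_graph.html
```

Checking the `success` flag first matters, since the error paths above return the same `TextContent` shape with `"success": False`.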
  • Registration logic for 'build_knowledge_graph' within the server's tool handler dispatch logic in kg_server_enhanced.py. The excerpt was truncated mid-schema; the inputSchema body below is reconstructed from the parameter table above.
    tools = [
        Tool(
            name="build_knowledge_graph",
            description="Build a knowledge graph: extract entities and relations directly from text and generate a visualization (no content enhancement)",
            inputSchema={
                "type": "object",
                "properties": {
                    "text": {
                        "type": "string",
                        "description": "Text data to process"
                    },
                    "output_file": {
                        "type": "string",
                        "description": "Visualization output filename (optional)",
                        "default": "knowledge_graph.html"
                    }
                },
                "required": ["text"]
            }
        )
    ]
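Registration pairs each declared tool name with its handler coroutine. A minimal sketch of that name-to-handler dispatch pattern, using a plain dict instead of the actual mcp SDK (the stub handler body is hypothetical):

```python
import asyncio

# Plain-dict stand-in for the server's tool routing; the real server
# dispatches through the mcp SDK, this only illustrates the pattern.
async def build_knowledge_graph_tool(arguments: dict) -> str:
    return f"building graph from {len(arguments.get('text', ''))} chars of text"

HANDLERS = {"build_knowledge_graph": build_knowledge_graph_tool}

async def dispatch(name: str, arguments: dict) -> str:
    handler = HANDLERS.get(name)
    if handler is None:
        raise ValueError(f"unknown tool: {name}")
    return await handler(arguments)

print(asyncio.run(dispatch("build_knowledge_graph", {"text": "Alice works at Acme."})))
# → building graph from 20 chars of text
```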
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the tool's automated processes but fails to disclose critical traits like required permissions, rate limits, whether it's read-only or destructive, or what happens on failure. For a complex tool with no annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and front-loaded, using a single sentence that efficiently outlines the tool's multi-step process. Every phrase ('自动评估数据质量、补全知识、构建图谱并生成可视化', i.e. assess data quality, complete knowledge, build the graph, and generate a visualization) earns its place by specifying key actions without redundancy or waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (automated knowledge graph construction) and lack of annotations or output schema, the description is incomplete. It doesn't cover behavioral aspects, error handling, or output details, leaving gaps that could hinder an AI agent's effective use. The description should provide more context to compensate for missing structured data.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters ('text' and 'output_file'). The description adds no additional meaning beyond the schema—it doesn't explain parameter interactions, formats, or constraints. Baseline 3 is appropriate when the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: '全自动构建知识图谱' (fully automatic knowledge graph construction) with specific verbs like '评估数据质量' (assess data quality), '补全知识' (complete knowledge), '构建图谱' (build graph), and '生成可视化' (generate visualization). It distinguishes the tool's comprehensive automated workflow. However, without sibling tools, we cannot assess differentiation from alternatives, preventing a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives, prerequisites, or exclusions. It simply lists what the tool does without context for application. This lack of usage instructions limits its utility for an AI agent in decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/turambar928/MCP_based_KG_construction'
