Utilizes OpenAI-compatible APIs to perform intelligent semantic understanding, relationship inference, and knowledge enhancement for automated knowledge graph construction.
Generates interactive HTML-based network visualizations of knowledge graphs, featuring clickable nodes, entity clustering, and confidence-coded visualizations.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@MCP-based Knowledge Graph Construction SystemCreate a knowledge graph and visualization from this text about the history of AI."
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
MCP-based Knowledge Graph Construction System
A fully automated knowledge graph construction system built on the Model Context Protocol (MCP), implementing a sophisticated 3-stage data processing pipeline for intelligent knowledge extraction and graph generation.
Overview
This project implements an advanced knowledge graph construction system that automatically processes raw text data through three intelligent stages:
Data Quality Assessment - Evaluates completeness, consistency, and relevance
Knowledge Completion - Enhances low-quality data using LLM and external knowledge bases
Knowledge Graph Construction - Builds structured knowledge graphs with confidence scoring
The system is built on the MCP (Model Context Protocol) architecture, providing a clean client-server interface for seamless integration and scalability.
Key Features
Fully Automated Processing
Zero Manual Intervention: Automatically detects data quality and processing needs
Intelligent Pipeline: Adapts processing strategy based on input data characteristics
Real-time Processing: Immediate knowledge graph generation from raw text
3-Stage Processing Pipeline
Stage 1: Data Quality Assessment
Completeness Analysis: Evaluates entity and relationship coverage
Consistency Checking: Detects semantic conflicts and contradictions
Relevance Scoring: Assesses information relevance and meaningfulness
Quality Threshold: Automatically determines if data needs enhancement
Stage 2: Knowledge Completion (for low-quality data)
Entity Enhancement: Completes missing entity information
Relationship Inference: Adds missing relationships between entities
Conflict Resolution: Corrects semantic inconsistencies
Format Normalization: Standardizes data format and structure
Implicit Knowledge Inference: Extracts hidden knowledge patterns
Stage 3: Knowledge Graph Construction
Rule-based Extraction: Fast, deterministic triple generation
LLM-enhanced Processing: Advanced semantic understanding and relationship inference
Confidence Scoring: Assigns reliability scores to extracted knowledge
Interactive Visualization: Generates beautiful HTML visualizations
MCP Architecture
Client-Server Design: Clean separation of concerns
Standardized Protocol: Built on MCP for interoperability
Tool-based Interface: Modular, extensible tool system
Async Processing: High-performance asynchronous operations
Requirements
Python: 3.11 or higher
UV Package Manager: For dependency management
OpenAI-compatible API: For LLM integration (DeepSeek, OpenAI, etc.)
Quick Start
1. Clone and Setup
git clone https://github.com/turambar928/MCP_based_KG_construction.git
cd MCP_based_KG_construction
# Install dependencies
uv sync2. Environment Configuration
Create a .env file with your API configuration:
OPENAI_API_KEY=your_api_key_here
OPENAI_BASE_URL=https://api.siliconflow.cn/v1 # or your preferred endpoint
OPENAI_MODEL=Qwen/QwQ-32B # or your preferred modelSupported API Providers:
OpenAI:
https://api.openai.com/v1DeepSeek:
https://api.deepseek.comSiliconFlow:
https://api.siliconflow.cn/v1Any OpenAI-compatible endpoint
3. Start the MCP Server
uv run kg_server.pyThe server will start and listen for MCP client connections.
4. Running Tests
There are three ways to test the system:
a. Using MCP Inspector
npx -y @modelcontextprotocol/inspector uv run kg_server.pyAfter running this command, click the link that appears after "MCP Inspector is up and running at" to open the MCP Inspector in your browser. Once opened:
Click "Connect"
Select "Tools" from the top menu
Choose "build_knowledge_graph" from the list tools
Enter your text in the left panel to generate the knowledge graph

b. Using Client Code
uv run kg_client.pyAfter the connection is successful, enter your text to view the results.

c. Using Mainstream MCP Tools (Cursor, Cherry Studio, etc.)
Example: Running in Cherry Studio
In settings, select MCP servers, click "Add Server" (import from JSON). Here's the configuration JSON (make sure to modify the local path):
{
"mcpServers": {
"kg_server": {
"command": "uv",
"args": [
"--directory",
"D:/mcp_getting_started",
"run",
"kg_server.py"
],
"env": {},
"disabled": false,
"autoApprove": []
}
}
}After enabling this MCP server, you can use it in Cherry Studio.

🛠️ Usage Guide
Interactive Client Commands
Once the client is running, you can use these commands:
# Build knowledge graph from text
build <your_text_here>
# Example usage
build 北京大学是中国著名的高等教育机构,位于北京市海淀区
# Run demonstration examples
demo
# Exit the client
quitProgrammatic Usage
from kg_client import KnowledgeGraphClient
async def main():
client = KnowledgeGraphClient()
await client.connect_to_server()
# Build knowledge graph
result = await client.build_knowledge_graph(
"苹果公司的CEO是蒂姆·库克",
output_file="my_graph.html"
)
print(f"Generated graph: {result}")
await client.cleanup()Example Outputs
High-Quality Input
Input: "北京大学是中国著名的高等教育机构,位于北京市海淀区。"
Processing: Direct Stage 3 (high quality detected)
Output:
- Entities: [北京大学, 中国, 高等教育机构, 北京市, 海淀区]
- Triples: [(北京大学, 是, 高等教育机构), (北京大学, 位于, 海淀区), ...]
- Visualization: Interactive HTML graphLow-Quality Input (Incomplete)
Input: "李华去巴黎"
Processing:
- Stage 1: Detects incomplete information
- Stage 2: Enhances with "巴黎位于法国", "李华是人"
- Stage 3: Builds enhanced knowledge graph
Output: Enriched knowledge graph with inferred relationshipsLow-Quality Input (Conflicting)
Input: "巴黎市是德国城市。"
Processing:
- Stage 1: Detects semantic conflict
- Stage 2: Corrects to "巴黎是法国城市"
- Stage 3: Builds corrected knowledge graph
Output: Corrected and enhanced knowledge graphMCP Tools API
The system exposes the following MCP tools for integration:
build_knowledge_graph
Description: Complete pipeline for knowledge graph construction with automatic quality assessment and enhancement.
Parameters:
text(string): Input text to processoutput_file(string, optional): HTML visualization output filename (default: "knowledge_graph.html")
Returns: JSON object containing:
success(boolean): Processing success statusentities(array): Extracted entitiestriples(array): Generated knowledge triplesconfidence_scores(array): Confidence scores for each triplevisualization_file(string): Path to generated HTML visualizationprocessing_stages(object): Details of each processing stage
Example:
{
"success": true,
"entities": ["北京大学", "中国", "高等教育机构"],
"triples": [
{
"subject": "北京大学",
"predicate": "是",
"object": "高等教育机构",
"confidence": 0.95
}
],
"visualization_file": "knowledge_graph.html"
}Project Structure
├── kg_server.py # Main MCP server implementation
├── kg_client.py # Interactive client for testing
├── kg_utils.py # Core knowledge graph construction utilities
├── kg_visualizer.py # HTML visualization generator
├── data_quality.py # Stage 1: Data quality assessment
├── knowledge_completion.py # Stage 2: Knowledge completion and enhancement
├── pyproject.toml # Project dependencies and configuration
├── .env # Environment variables (API keys)
└── README.md # This fileCore Components
kg_server.py: MCP server that orchestrates the 3-stage pipelinekg_client.py: Command-line client for interactive testing and batch processingkg_utils.py: Knowledge graph construction engine with rule-based and LLM-enhanced extractionkg_visualizer.py: Generates interactive HTML visualizations using Plotlydata_quality.py: Implements quality assessment algorithms for completeness, consistency, and relevanceknowledge_completion.py: Handles knowledge enhancement and conflict resolution
Advanced Features
Quality Assessment Metrics
Completeness Score: Based on entity coverage and relationship density
Consistency Score: Detects semantic conflicts and contradictions
Relevance Score: Evaluates information meaningfulness
Composite Quality Score: Weighted combination of all metrics
Knowledge Enhancement Strategies
Entity Completion: Adds missing entity attributes and types
Relationship Inference: Discovers implicit relationships
Conflict Resolution: Corrects factual inconsistencies
Format Normalization: Standardizes entity and relationship representations
Visualization Features
Interactive Network Graph: Clickable nodes and edges
Entity Clustering: Groups related entities by type
Confidence Visualization: Color-coded confidence levels
Export Options: HTML, PNG, SVG formats
Technical Details
Processing Pipeline
Input Validation: Checks text format and encoding
Quality Assessment: Multi-dimensional quality scoring
Conditional Enhancement: Applies enhancement only when needed
Graph Construction: Rule-based + LLM hybrid approach
Confidence Calculation: Bayesian confidence scoring
Visualization Generation: Interactive HTML output
Performance Characteristics
Processing Speed: ~1-3 seconds per text input
Memory Usage: ~50-100MB for typical workloads
Scalability: Async architecture supports concurrent processing
Accuracy: 85-95% entity extraction, 80-90% relationship accuracy
Development
Running Tests
Refer to the "Running Tests" section above for three different testing methods:
MCP Inspector (recommended for visual testing)
Client code (for programmatic testing)
Mainstream MCP tools (for integration testing)
# Quick test with demonstration examples
uv run kg_client.py
# Then type: demo
# Test with custom input
uv run kg_client.py "Your test text here"Adding New Features
Custom Quality Metrics: Extend
data_quality.pyNew Enhancement Strategies: Modify
knowledge_completion.pyAdditional Visualization: Enhance
kg_visualizer.pyNew MCP Tools: Add tools to
kg_server.py
Configuration Options
Environment variables in .env:
# Required
OPENAI_API_KEY=your_api_key
OPENAI_BASE_URL=your_api_endpoint
OPENAI_MODEL=your_model_name
# Optional
QUALITY_THRESHOLD=0.5 # Quality threshold for enhancement
MAX_ENTITIES=50 # Maximum entities per graph
VISUALIZATION_WIDTH=1200 # HTML visualization width
VISUALIZATION_HEIGHT=800 # HTML visualization heightContributing
Fork the repository
Create a feature branch:
git checkout -b feature-nameMake your changes and test thoroughly
Submit a pull request with detailed description
Troubleshooting
Common Issues
Port Occupation Error
# Find process using the port netstat -ano | findstr :6277 # Kill the process taskkill /PID <process_id> /FAPI Balance Insufficient
Check API configuration in
.envfileEnsure API account has sufficient balance
Dependency Installation Issues
uv sync --reinstall
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built on the Model Context Protocol (MCP)
Visualization powered by Plotly
Graph algorithms using NetworkX
LLM integration via OpenAI API
Support
For questions, issues, or contributions:
📧 Email: tzf9282003@163.com
🐛 Issues: GitHub Issues
📖 Documentation: See
KNOWLEDGE_GRAPH_README.mdfor detailed technical documentation