OpenShift OVN-Kubernetes Benchmark MCP Server
A comprehensive benchmarking and performance monitoring solution for OpenShift clusters using OVN-Kubernetes networking, built with FastMCP and AI-powered analysis.
Architecture Overview
Architecture Topology
Features
🔧 Core Capabilities
- Automated Authentication: Discovers and authenticates with OpenShift/Kubernetes clusters
- Multi-Source Monitoring: Collects metrics from Prometheus, Kubernetes API, and cluster resources
- AI-Powered Analysis: Uses LangGraph and OpenAI for intelligent insights and recommendations
- Comprehensive Reporting: Generates Excel and PDF reports with visualizations
- Historical Tracking: Stores performance data in DuckDB for trend analysis
📊 Monitored Components
- Kubernetes API Server: Request latency, throughput, and error rates
- Multus CNI: Resource usage and pod networking performance
- OVN-Kubernetes Pods: Control plane and node performance
- OVN Containers: Database sizes, memory usage, and sync performance
- OVS Components: CPU and memory usage of OVS processes
- General Cluster Info: NetworkPolicies, AdminNetworkPolicies, EgressFirewalls
🤖 AI Features
- Automated performance trend analysis
- Intelligent alert correlation
- Proactive recommendations
- Risk assessment and health scoring
- Natural language insights
Quick Start
Prerequisites
- Python 3.9+
- OpenShift/Kubernetes cluster access
- KUBECONFIG file
- OpenAI API key (for AI features)
Installation
- Clone and Setup
- Test Configuration
Usage
Start MCP Server
Collect Performance Data
Generate Reports
Full Workflow
Project Structure
Configuration
Environment Variables
Variable | Description | Default |
---|---|---|
KUBECONFIG | Path to kubeconfig file | ~/.kube/config |
OPENAI_API_KEY | OpenAI API key for AI features | Required for reports |
MCP_SERVER_URL | MCP server URL | http://localhost:8000 |
COLLECTION_DURATION | Metrics collection duration | 5m |
REPORT_PERIOD_DAYS | Report period in days | 7 |
DATABASE_PATH | DuckDB database path | storage/ovnk_benchmark.db |
REPORT_OUTPUT_DIR | Report output directory | exports |
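A minimal sketch of how these variables could be read with their documented defaults. The `Settings` class and `load_settings` function are hypothetical names used for illustration; the project's actual configuration code may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    kubeconfig: str
    openai_api_key: Optional[str]
    mcp_server_url: str
    collection_duration: str
    report_period_days: int
    database_path: str
    report_output_dir: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read the documented variables, falling back to the documented defaults."""
    return Settings(
        kubeconfig=os.path.expanduser(env.get("KUBECONFIG", "~/.kube/config")),
        # No default: the table lists the key as required for reports.
        openai_api_key=env.get("OPENAI_API_KEY"),
        mcp_server_url=env.get("MCP_SERVER_URL", "http://localhost:8000"),
        collection_duration=env.get("COLLECTION_DURATION", "5m"),
        report_period_days=int(env.get("REPORT_PERIOD_DAYS", "7")),
        database_path=env.get("DATABASE_PATH", "storage/ovnk_benchmark.db"),
        report_output_dir=env.get("REPORT_OUTPUT_DIR", "exports"),
    )
```

Passing the environment as a mapping keeps the loader easy to test: `load_settings({"REPORT_PERIOD_DAYS": "30"})` overrides one value and leaves the rest at their defaults.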
Metrics Configuration
The config/metrics.yml file defines all Prometheus queries organized by category:
- General Information: Pod and namespace status
- API Server: Request latency and error rates
- Multus: CNI resource usage
- OVN Control Plane/Node: CPU and memory metrics
- OVN Containers: Database and controller metrics
- OVS Containers: OVS daemon metrics
- OVN Sync: Synchronization duration metrics
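To illustrate the idea of queries grouped by category, the sketch below uses an in-memory dictionary in place of config/metrics.yml and turns one category into Prometheus `query_range` URLs. The category keys, metric names, and PromQL expressions are assumptions chosen for illustration, not the contents of the real file.

```python
from urllib.parse import urlencode

# Hypothetical stand-in for config/metrics.yml; the real keys and queries may differ.
METRICS = {
    "api_server": [
        {
            "name": "request_latency_p99",
            "query": (
                "histogram_quantile(0.99, sum(rate("
                "apiserver_request_duration_seconds_bucket[5m])) by (le))"
            ),
        },
    ],
    "multus": [
        {
            "name": "cpu_usage",
            "query": (
                "sum(rate(container_cpu_usage_seconds_total"
                '{namespace="openshift-multus"}[5m]))'
            ),
        },
    ],
}


def build_range_urls(base_url, category, start, end, step="30s"):
    """Return {metric name: query_range URL} for every query in one category."""
    urls = {}
    for metric in METRICS[category]:
        params = urlencode(
            {"query": metric["query"], "start": start, "end": end, "step": step}
        )
        urls[metric["name"]] = f"{base_url}/api/v1/query_range?{params}"
    return urls
```

The `start` and `end` values can be Unix timestamps or RFC 3339 strings, both of which the Prometheus HTTP API accepts.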
API Reference
MCP Tools
The server exposes the following MCP tools:
get_openshift_general_info
Get general cluster information including NetworkPolicy, AdminNetworkPolicy, and EgressFirewall counts.
Parameters:
- namespace (optional): Specific namespace to query
Response:
query_kube_api_metrics
Query Kubernetes API server performance metrics.
Parameters:
- duration (optional): Query duration (default: "5m")
- start_time (optional): Start time in ISO format
- end_time (optional): End time in ISO format
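One plausible way to turn these three parameters into a concrete query window is sketched below. The precedence rules (explicit start and end win; otherwise the window is anchored on whichever bound is given, or on the current time) are an assumption, as are the function names; the server may resolve them differently.

```python
import re
from datetime import datetime, timedelta, timezone

_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}


def parse_duration(text):
    """Parse a simple duration such as "5m" or "2h" into a timedelta."""
    match = re.fullmatch(r"(\d+)([smhd])", text.strip())
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value, unit = match.groups()
    return timedelta(**{_UNITS[unit]: int(value)})


def resolve_window(duration="5m", start_time=None, end_time=None, now=None):
    """Return (start, end) datetimes for a query.

    Assumed precedence: both bounds given -> use them as-is;
    only start_time -> start + duration; otherwise end - duration,
    where end defaults to the current UTC time.
    """
    if start_time and end_time:
        return datetime.fromisoformat(start_time), datetime.fromisoformat(end_time)
    if start_time:
        start = datetime.fromisoformat(start_time)
        return start, start + parse_duration(duration)
    end = datetime.fromisoformat(end_time) if end_time else (now or datetime.now(timezone.utc))
    return end - parse_duration(duration), end
```

Note that `datetime.fromisoformat` on Python 3.9 and 3.10 does not accept a trailing `Z`; use an explicit offset such as `+00:00` there.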
query_multus_metrics
Query Multus CNI performance metrics.
query_ovnk_pods_metrics
Query OVN-Kubernetes pod performance metrics.
query_ovnk_containers_metrics
Query OVN-Kubernetes container metrics.
query_ovnk_sync_metrics
Query OVN-Kubernetes synchronization metrics.
store_performance_data
Store performance data in DuckDB.
get_performance_history
Retrieve historical performance data.
AI Agents
Performance Data Collection Agent
Uses LangGraph to orchestrate data collection:
- Initialize: Setup collection parameters
- Collect General Info: Gather cluster information
- Collect Metrics: Query each component category
- Store Data: Save to DuckDB storage
- Finalize: Generate collection summary
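The five steps above can be sketched as a linear pipeline over a shared state dictionary. This is plain Python rather than LangGraph, and every name here (`run_collection`, the `fetch` and `store` callables, the category keys) is hypothetical; it only illustrates the order of the steps and one possible way to tolerate a failing category.

```python
from datetime import datetime, timezone

# Hypothetical category keys, loosely following the MCP tool names.
CATEGORIES = ["api_server", "multus", "ovnk_pods", "ovnk_containers", "ovnk_sync"]


def initialize(state):
    state.setdefault("duration", "5m")
    state["started_at"] = datetime.now(timezone.utc).isoformat()
    state["errors"] = []
    return state


def collect_general_info(state):
    state["general_info"] = state["fetch"]("general_info", state["duration"])
    return state


def collect_metrics(state):
    state["metrics"] = {}
    for category in CATEGORIES:
        try:
            state["metrics"][category] = state["fetch"](category, state["duration"])
        except Exception as exc:
            # Sketch-level choice: record the failure and keep collecting.
            state["errors"].append(f"{category}: {exc}")
    return state


def store_data(state):
    state["store"]({"general_info": state["general_info"], "metrics": state["metrics"]})
    return state


def finalize(state):
    state["summary"] = {
        "collected": sorted(state["metrics"]),
        "failed": len(state["errors"]),
    }
    return state


STEPS = [initialize, collect_general_info, collect_metrics, store_data, finalize]


def run_collection(fetch, store, duration="5m"):
    """Run each step in order, threading the state dictionary through."""
    state = {"fetch": fetch, "store": store, "duration": duration}
    for step in STEPS:
        state = step(state)
    return state
```

In a real deployment `fetch` would call the MCP tools and `store` would write to DuckDB; injecting them as callables keeps the flow testable without a cluster.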
Report Generation Agent
Uses LangGraph with AI analysis:
- Fetch Historical Data: Retrieve performance history
- Analyze Performance: Calculate trends and statistics
- Generate Insights: Use AI for recommendations
- Create Reports: Generate Excel and PDF reports
- Finalize: Output summary and files
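As an illustration of the "Analyze Performance" step, the sketch below computes basic statistics and a least-squares slope for one metric's history. It assumes evenly spaced samples, and the function names and the 1% tolerance are illustrative choices, not the agent's actual logic.

```python
from statistics import mean


def summarize(values):
    """Return min, max, mean, and least-squares slope per sample interval."""
    n = len(values)
    if n == 0:
        raise ValueError("no samples")
    avg = mean(values)
    x_avg = (n - 1) / 2  # mean of the sample indices 0..n-1
    denom = sum((x - x_avg) ** 2 for x in range(n))
    if denom:
        slope = sum((x - x_avg) * (y - avg) for x, y in enumerate(values)) / denom
    else:
        slope = 0.0  # a single sample has no trend
    return {"min": min(values), "max": max(values), "mean": avg, "slope": slope}


def classify_trend(slope, mean_value, tolerance=0.01):
    """Label a series using the slope relative to its mean."""
    if mean_value == 0 or abs(slope) / abs(mean_value) < tolerance:
        return "stable"
    return "rising" if slope > 0 else "falling"
```

For example, `summarize([1, 2, 3, 4])` yields a slope of 1.0 per sample, which `classify_trend` labels as rising relative to a mean of 2.5.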
Storage Schema
DuckDB Tables
- metrics: Individual metric data points
- metric_summaries: Category performance summaries
- performance_snapshots: Complete performance snapshots
- benchmark_runs: Benchmark execution records
- alerts_history: Historical alert data
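The column layout of these tables is not shown here, so the following is only a rough sketch of what storing and retrieving rows in a `metrics` table might look like. It uses Python's built-in `sqlite3` as a stand-in for DuckDB so that it runs without extra dependencies, and the columns (`ts`, `category`, `metric_name`, `value`) are assumptions.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# Hypothetical columns; the project's real DuckDB schema may differ.
SCHEMA = """
CREATE TABLE IF NOT EXISTS metrics (
    ts          TEXT NOT NULL,
    category    TEXT NOT NULL,
    metric_name TEXT NOT NULL,
    value       REAL NOT NULL
)
"""


def open_store(path=":memory:"):
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def store_metric(conn, category, metric_name, value, ts=None):
    """Insert one data point, timestamped in UTC at second resolution."""
    ts = ts or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO metrics VALUES (?, ?, ?, ?)",
        (ts.isoformat(timespec="seconds"), category, metric_name, value),
    )
    conn.commit()


def get_history(conn, category, days=7, now=None):
    """Return (ts, metric_name, value) rows for a category within the last N days."""
    now = now or datetime.now(timezone.utc)
    cutoff = (now - timedelta(days=days)).isoformat(timespec="seconds")
    rows = conn.execute(
        "SELECT ts, metric_name, value FROM metrics "
        "WHERE category = ? AND ts >= ? ORDER BY ts",
        (category, cutoff),
    )
    return rows.fetchall()
```

Timestamps are stored as ISO 8601 strings in a single timezone so that string comparison matches chronological order; a DuckDB implementation would more likely use a native TIMESTAMP column.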
Report Types
Excel Reports
- Executive Summary: Key performance indicators
- Historical Trends: Time-series performance data
- Category Analysis: Component-specific metrics
- Recommendations: AI-generated insights
- Raw Data: Complete dataset
PDF Reports
- Executive Summary: High-level performance overview
- Key Metrics: Performance indicator tables
- Category Analysis: Component performance breakdown
- Recommendations: Prioritized action items
Troubleshooting
Common Issues
Authentication Problems
Prometheus Discovery
MCP Server Issues
Debug Mode
Enable debug logging:
Contributing
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
Development Setup
License
MIT License - see LICENSE file for details.
Support
For issues and questions:
- Check the troubleshooting section
- Review logs in the logs/ directory
- Open an issue with detailed logs and configuration
Roadmap
- Kubernetes native deployment (Helm charts)
- Grafana dashboard integration
- Custom alert rule definitions
- Multi-cluster support
- Real-time streaming metrics
- Advanced ML-based anomaly detection
- Integration with CI/CD pipelines
Note: This tool is designed for OpenShift clusters with OVN-Kubernetes networking. Some features may not be available on other Kubernetes distributions.