LUMINO MCP Server
An open source MCP (Model Context Protocol) server empowering SREs with intelligent observability, predictive analytics, and AI-driven automation across Kubernetes, OpenShift, and Tekton environments.
Overview
LUMINO MCP Server transforms how Site Reliability Engineers (SREs) and DevOps teams interact with Kubernetes clusters. By exposing 37 specialized tools through the Model Context Protocol, it enables AI assistants to:
Monitor cluster health, resources, and pipeline status in real-time
Analyze logs, events, and anomalies using statistical and ML techniques
Troubleshoot failed pipelines with automated root cause analysis
Predict resource bottlenecks and potential issues before they occur
Simulate configuration changes to assess impact before deployment
Features
Kubernetes & OpenShift Operations
Namespace and pod management
Resource querying with flexible output formats
Label-based resource search across clusters
OpenShift operator and MachineConfigPool status
etcd log analysis
Tekton Pipeline Intelligence
Pipeline and task run monitoring across namespaces
Detailed log retrieval with optional cleaning
Failed pipeline root cause analysis
Cross-cluster pipeline tracing
CI/CD performance baselining
Advanced Log Analysis
Smart log summarization with configurable detail levels
Streaming analysis for large log volumes
Hybrid analysis combining multiple strategies
Semantic search using NLP techniques
Anomaly detection with severity classification
Predictive & Proactive Monitoring
Statistical anomaly detection using z-score analysis
Predictive log analysis for early warning
Resource bottleneck forecasting
Certificate health monitoring with expiry alerts
TLS certificate issue investigation
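The z-score approach mentioned above can be sketched in a few lines (an illustration of the statistical idea, not LUMINO's implementation):

```python
from statistics import mean, stdev

def zscore_anomalies(values, threshold=3.0):
    """Flag points whose z-score exceeds the threshold.

    Returns a list of (index, value, z) tuples for anomalous points.
    """
    if len(values) < 2:
        return []
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # constant series: nothing can be anomalous
    return [(i, v, (v - mu) / sigma)
            for i, v in enumerate(values)
            if abs(v - mu) / sigma > threshold]

# Example: a steady error rate with one spike
rates = [2, 3, 2, 2, 3, 2, 3, 2, 50, 2]
print(zscore_anomalies(rates, threshold=2.5))  # flags the spike at index 8
```

Tools like the anomaly detector apply the same principle to metric and log-volume time series, with severity derived from how far a point sits from the baseline.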
Event Intelligence
Smart event retrieval with multiple strategies
Progressive event analysis (overview to deep-dive)
Advanced analytics with ML pattern detection
Log-event correlation
Simulation & What-If Analysis
Monte Carlo simulation for configuration changes
Impact analysis before deployment
Risk assessment with configurable tolerance
Affected component identification
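As a toy illustration of the Monte Carlo idea (not LUMINO's actual model), a simulation can sample latency under a proposed change many times and report how often an SLO would be breached; all numbers and distributions below are hypothetical:

```python
import random

def simulate_change(baseline_ms, added_ms_range, slo_ms, runs=10_000, seed=42):
    """Estimate the risk that a config change pushes latency past the SLO.

    added_ms_range: (low, high) uniform latency added by the change.
    Returns the fraction of simulated requests exceeding slo_ms.
    """
    rng = random.Random(seed)  # fixed seed keeps the estimate reproducible
    low, high = added_ms_range
    breaches = sum(
        1
        for _ in range(runs)
        if rng.gauss(baseline_ms, baseline_ms * 0.1) + rng.uniform(low, high) > slo_ms
    )
    return breaches / runs

# Risk that adding 20-80 ms to a ~200 ms baseline breaches a 300 ms SLO
print(f"breach probability: {simulate_change(200, (20, 80), 300):.1%}")
```

The returned fraction maps naturally onto a configurable risk tolerance: a change is acceptable only if the breach probability stays under the tolerance you set.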
Quick Start
Get started with LUMINO in under 2 minutes:
For Claude Code CLI Users (Easiest)
Simply ask Claude Code to provision the Lumino MCP server for you by pasting this prompt:
For Other MCP Clients
Choose your preferred installation method:
MCPM (Recommended):
```shell
mcpm install @spre-sre/lumino-mcp-server
```

Manual Setup: See the detailed MCP Client Integration instructions.
Verify Installation
Once installed, test with a simple query, such as asking your assistant to list the namespaces in your cluster.
Prerequisites
Required
Python 3.10 or higher - Core runtime
MCP Client - one of Claude Desktop, Claude Code CLI, Gemini CLI, or Cursor
For Kubernetes Features
Kubernetes/OpenShift Access - Valid kubeconfig with read permissions
RBAC Permissions - Ability to list pods, namespaces, and other resources
Optional (Recommended)
uv - Faster dependency management than pip
MCPM - Easiest installation experience
Prometheus - For advanced metrics and forecasting features
Installation
Using uv (recommended)
Using pip
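The commands below are a sketch: the clone URL is inferred from the `@spre-sre/lumino-mcp-server` package name used by MCPM, and `uv sync` / `pip install -e .` assume a standard `pyproject.toml` layout.

```shell
# Clone the repository (URL assumed from the MCPM package name above)
git clone https://github.com/spre-sre/lumino-mcp-server.git
cd lumino-mcp-server

# Using uv (recommended): creates a virtual environment and installs dependencies
uv sync

# Using pip: create and activate a virtual environment first
python3 -m venv .venv
source .venv/bin/activate
pip install -e .
```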
Usage
Local Mode (stdio transport)
By default, the server runs in local mode using stdio transport, suitable for direct integration with MCP clients:
Kubernetes Mode (HTTP streaming transport)
When running inside Kubernetes, set the namespace environment variable to enable HTTP streaming:
The server automatically detects the environment and switches transport modes.
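A minimal sketch of that detection, assuming the `KUBERNETES_NAMESPACE` variable from the Configuration section (LUMINO's actual logic may differ):

```python
import os

def select_transport(env=None):
    """Pick the MCP transport: HTTP streaming inside Kubernetes, stdio otherwise."""
    env = os.environ if env is None else env
    if env.get("KUBERNETES_NAMESPACE"):
        return "streamable-http"
    return "stdio"

print(select_transport({}))                               # stdio
print(select_transport({"KUBERNETES_NAMESPACE": "sre"}))  # streamable-http
```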
Usage Examples
Intelligent Root Cause Analysis
Investigate and diagnose complex failures with automated analysis.
Predictive Intelligence & Forecasting
Anticipate problems before they impact your systems.
Simulation & What-If Analysis
Test changes safely before applying them to production.
Topology & Dependency Mapping
Understand system architecture and component relationships.
Advanced Investigation & Forensics
Deep-dive into complex issues with multi-faceted analysis.
CI/CD Pipeline Intelligence
Optimize and troubleshoot your continuous delivery pipelines.
Progressive Event Analysis
Multi-level event investigation from overview to deep-dive.
Real-Time Monitoring & Alerts
Stay informed about cluster health and pipeline status.
Security & Compliance
Ensure cluster security and certificate management.
Advanced Analytics & ML Insights
Leverage machine learning for pattern detection.
Configuration
Kubernetes Authentication
The server automatically detects Kubernetes configuration:
In-cluster config - When running inside a Kubernetes pod
Local kubeconfig - When running locally (uses ~/.kube/config)
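The order of detection can be sketched using standard Kubernetes conventions (the service-account token path and `KUBECONFIG` fallback are platform conventions, not LUMINO specifics):

```python
import os

SA_TOKEN = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def detect_kube_config(env=None, token_path=SA_TOKEN):
    """Return which config source would be used: 'in-cluster', 'kubeconfig', or 'none'."""
    env = os.environ if env is None else env
    if os.path.exists(token_path):
        return "in-cluster"  # a mounted service-account token means we are in a pod
    kubeconfig = env.get("KUBECONFIG", os.path.expanduser("~/.kube/config"))
    return "kubeconfig" if os.path.exists(kubeconfig) else "none"
```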
Environment Variables
| Variable | Description | Default | When to Use |
|---|---|---|---|
| `KUBERNETES_NAMESPACE` | Namespace for K8s mode | - | When running the server inside a Kubernetes pod |
| `POD_NAMESPACE` | Alternative namespace variable | - | Alternative to `KUBERNETES_NAMESPACE` |
| `PROMETHEUS_URL` | Prometheus server URL for metrics | Auto-detected | Custom Prometheus endpoint or non-standard port |
| `KUBECONFIG` | Path to kubeconfig file | `~/.kube/config` | Multiple clusters or custom kubeconfig location |
| `LOG_LEVEL` | Logging verbosity (DEBUG, INFO, WARNING, ERROR) | `INFO` | Debugging issues or reducing log noise |
| `FASTMCP_LOG_LEVEL` | MCP framework log level | `INFO` | Troubleshooting MCP protocol issues |
| `PYTHONUNBUFFERED` | Disable Python output buffering | - | Recommended for MCP clients to see real-time logs |
Available Tools
Kubernetes Core (4 tools)
List all namespaces in the cluster
List pods with status and placement info
Get any Kubernetes resource with flexible output
Search resources across namespaces by labels
Tekton Pipelines (6 tools)
List PipelineRuns with status and timing
List TaskRuns, optionally filtered by pipeline
Retrieve pipeline logs with optional cleaning
Recent pipelines across all namespaces
Find pipelines by pattern matching
Cluster-wide pipeline status summary
Log Analysis (6 tools)
Extract error patterns from log text
Intelligent log summarization
Streaming analysis for large logs
Combined analysis strategies
Anomaly detection with severity levels
NLP-based semantic log search
Event Analysis (3 tools)
Smart event retrieval with strategies
Multi-level event analysis
ML-powered event pattern detection
Failure Analysis & RCA (2 tools)
Root cause analysis for failed pipelines
Automated incident reports
Resource Monitoring (4 tools)
Detect resource issues in a namespace
Statistical anomaly detection
Execute PromQL queries
Predict resource exhaustion
Namespace Investigation (2 tools)
Focused namespace health check
Dynamic investigation based on query
Certificate & Security (2 tools)
Find TLS-related problems
Certificate expiry monitoring
OpenShift Specific (3 tools)
MachineConfigPool status and updates
Cluster operator health
etcd log retrieval and analysis
CI/CD Performance (2 tools)
Pipeline performance baselines
Trace pipelines by commit, PR, or image
Topology & Prediction (2 tools)
Real-time system topology mapping
Predict issues from log patterns
Simulation (1 tool)
Simulate configuration changes
Architecture
How It Works
LUMINO acts as a bridge between AI assistants and your Kubernetes infrastructure through the Model Context Protocol:
Workflow
User Query → AI assistant receives natural language request
MCP Translation → Assistant converts query to appropriate tool calls
LUMINO Processing → Server executes Kubernetes/Prometheus operations
Data Analysis → ML/statistical algorithms process raw data
AI Synthesis → Assistant formats results into human-readable insights
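The tool-call step can be pictured with a stripped-down registry (illustrative only — the real server registers its 37 tools through the MCP SDK, and `list_namespaces` here is a hypothetical name):

```python
# Minimal tool registry mimicking the MCP call flow (illustrative only).
TOOLS = {}

def tool(name):
    """Register a function under a tool name."""
    def register(fn):
        TOOLS[name] = fn
        return fn
    return register

@tool("list_namespaces")  # hypothetical tool name
def list_namespaces():
    # A real implementation would query the Kubernetes API here.
    return ["default", "kube-system"]

def handle_tool_call(name, arguments=None):
    """Step 3 of the workflow: execute the tool the assistant selected."""
    return TOOLS[name](**(arguments or {}))

print(handle_tool_call("list_namespaces"))  # ['default', 'kube-system']
```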
Key Features
Stateless Design - No data persistence, queries cluster in real-time
Automatic Transport Detection - Switches between stdio (local) and HTTP (K8s) modes
Token Budget Management - Adaptive strategies to handle large log volumes
Intelligent Caching - Smart caching for frequently accessed data
Security First - Uses existing kubeconfig RBAC permissions, no separate auth
MCP Client Integration
Method 1: Using MCPM (Recommended for Claude Code CLI / Gemini CLI)
The easiest way to install LUMINO MCP Server for Claude Code CLI or Gemini CLI is using MCPM - an MCP server package manager.
Install MCPM
Requirements: Go 1.23+, Git, Python 3.10+, uv (or pip)
Install LUMINO MCP Server
Short syntax explained:
`@owner/repo` - installs from GitHub (default: `https://github.com/owner/repo.git`)
`gl:@owner/repo` - installs from GitLab (`https://gitlab.com/owner/repo.git`)
Full URL - works with any Git repository
This will:
Clone the repository to `~/.mcp/servers/lumino-mcp-server/`
Auto-detect the Python project and install dependencies using `uv` (or pip)
Register with the Claude Code CLI or Gemini CLI configuration automatically
Manage LUMINO
Method 2: Manual Configuration
If you prefer manual setup or need to configure Claude Desktop / Cursor, follow these client-specific guides:
Claude Desktop
Find your config file location:
macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
Windows: `%APPDATA%\Claude\claude_desktop_config.json`
Linux: `~/.config/Claude/claude_desktop_config.json`
Add LUMINO configuration:
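A typical entry has the following shape (a sketch — the `server.py` entry point and `uv` launch arguments are assumptions; adjust the command and path to your clone):

```json
{
  "mcpServers": {
    "lumino": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/lumino-mcp-server",
        "run",
        "server.py"
      ]
    }
  }
}
```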
Restart Claude Desktop
Verify: Look for the hammer icon (🔨) in Claude Desktop to see available tools
Claude Code CLI
Option A: Using MCPM (see Method 1 above)
Option B: Automatic Provisioning via Claude Code (Recommended and easiest way)
Copy and paste the provisioning prompt from the Quick Start section above into Claude Code. Claude will clone the repository, install dependencies, and configure the MCP server for your project.
Option C: Manual Configuration
Clone and install:
Create a project-local MCP config in your project root, or update `~/.claude.json` for a global config:
Important: Replace /absolute/path/to/lumino-mcp-server with the actual absolute path where you cloned the repository (e.g., /Users/username/projects/lumino-mcp-server).
Verify installation:
Gemini CLI
Option A: Using MCPM (Recommended - see Method 1 above)
Option B: Manual Configuration
Find your config file location:
macOS/Linux: `~/.config/gemini/mcp_servers.json`
Windows: `%APPDATA%\gemini\mcp_servers.json`
Add LUMINO configuration:
Verify installation:
Cursor IDE
Open Cursor Settings:
Press `Cmd+,` (macOS) or `Ctrl+,` (Windows/Linux)
Search for "MCP" or "Model Context Protocol"
Add MCP Server Configuration:
In Cursor's MCP settings, add:
Alternative - Using Cursor's settings.json:
Open the Command Palette (`Cmd+Shift+P` or `Ctrl+Shift+P`)
Type "Preferences: Open User Settings (JSON)"
Add the MCP configuration:
Restart Cursor IDE
Verify: Open Cursor's AI chat and check if LUMINO tools are available
Configuration Notes
Replace `/absolute/path/to/lumino-mcp-server` with the actual path where you cloned the repository.
Environment Variables (optional):
Add these to the env section if needed:
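For example (a sketch using the standard `KUBECONFIG` and `PYTHONUNBUFFERED` variables; values are placeholders):

```json
"env": {
  "KUBECONFIG": "/absolute/path/to/kubeconfig",
  "PYTHONUNBUFFERED": "1"
}
```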
Using Alternative Python Package Managers
With pip instead of uv
Note: Ensure you've activated the virtual environment first (e.g. `source .venv/bin/activate`).
With poetry
Testing Your Configuration
After configuring any client, test the connection:
Check if tools are loaded:
Claude Desktop: Look for the 🔨 hammer icon
Claude Code CLI: `claude mcp list`
Gemini CLI: `gemini mcp list`
Cursor: Check the AI chat for available tools
Test a simple query:
Check the server logs if you run into issues.
Advanced Configuration
Multiple Clusters
Configure multiple LUMINO instances for different clusters:
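One way to do this is to register two server entries that point at different kubeconfig files (a sketch — the `server.py` entry point is an assumption, and paths are placeholders):

```json
{
  "mcpServers": {
    "lumino-prod": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/lumino-mcp-server", "run", "server.py"],
      "env": { "KUBECONFIG": "/absolute/path/to/prod-kubeconfig" }
    },
    "lumino-staging": {
      "command": "uv",
      "args": ["--directory", "/absolute/path/to/lumino-mcp-server", "run", "server.py"],
      "env": { "KUBECONFIG": "/absolute/path/to/staging-kubeconfig" }
    }
  }
}
```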
Custom Log Level
Supported Transports
The server automatically detects the appropriate transport:
stdio - For local desktop integrations (Claude Desktop, Claude Code CLI, Gemini CLI, Cursor)
streamable-http - For Kubernetes deployments (when `KUBERNETES_NAMESPACE` is set)
Performance Considerations
Optimizing for Large Clusters
LUMINO is designed to handle clusters of any size efficiently:
| Cluster Size | Recommendation | Tool Strategy |
|---|---|---|
| Small (< 50 pods) | Use default settings | All tools work optimally |
| Medium (50-500 pods) | Use namespace filtering | Leverage adaptive tools with auto-sampling |
| Large (500+ pods) | Specify time windows and namespaces | Use conservative and streaming tools |
| Very Large (1000+ pods) | Combine filters and pagination | Progressive analysis with targeted queries |
Token Budget Management
LUMINO automatically manages AI context limits:
Adaptive Sampling - Smart tools auto-sample data when volumes are high
Progressive Loading - Stream analysis processes data in chunks
Token Budgets - Configurable limits prevent context overflow
Hybrid Strategies - Automatically selects best analysis approach
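Adaptive sampling can be pictured as trimming a log to a fixed token budget while keeping evenly spaced context (an illustrative sketch with a crude 4-characters-per-token estimate, not LUMINO's algorithm):

```python
def sample_to_budget(lines, token_budget, chars_per_token=4):
    """Keep evenly spaced lines so the estimated token count fits the budget."""
    est_tokens = sum(len(l) for l in lines) / chars_per_token
    if est_tokens <= token_budget:
        return lines  # everything fits, no sampling needed
    keep = max(1, int(len(lines) * token_budget / est_tokens))
    step = len(lines) / keep
    return [lines[int(i * step)] for i in range(keep)]

logs = [f"2024-01-01T00:00:{i:02d} request handled in {i} ms" for i in range(60)]
sampled = sample_to_budget(logs, token_budget=100)
print(len(logs), "->", len(sampled))
```

Real strategies can weight the sample toward errors and recent lines rather than sampling uniformly, but the budget arithmetic is the same.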
Query Optimization Tips
Use Namespace Filtering
Specify Time Windows
Leverage Smart Tools
Use Progressive Analysis
Performance Metrics
| Operation | Typical Response Time | Scalability |
|---|---|---|
| List namespaces | < 1s | O(1) |
| Get pod logs (1 pod) | 1-3s | O(log size) |
| Analyze pipeline run | 2-5s | O(task count) |
| Cluster-wide search | 5-15s | O(namespace count) |
| ML anomaly detection | 3-10s | O(data points) |
| Topology mapping | 5-20s | O(resource count) |
Caching Strategy
LUMINO uses intelligent caching for frequently accessed data:
15-minute cache - For web-fetched content
Session cache - For hybrid log analysis
No persistence - All data queries cluster in real-time
Concurrent Requests
The server handles multiple concurrent requests efficiently:
Thread-safe operations - Safe parallel tool execution
Connection pooling - Reuses Kubernetes API connections
Async HTTP - Non-blocking Prometheus queries
Resource Usage
Server Resource Requirements
| Deployment | CPU | Memory | Disk |
|---|---|---|---|
| Local (stdio) | 100-500m | 256-512Mi | Minimal |
| Kubernetes | 200m-1 | 512Mi-1Gi | Minimal |
| High-load | 1-2 | 1-2Gi | Minimal |
Note: LUMINO is stateless and requires minimal resources. Most processing happens in the AI assistant.
Troubleshooting
Common Issues
No Kubernetes cluster found
Ensure you have a valid kubeconfig at ~/.kube/config or are running inside a cluster.
Permission denied for resources
Check your RBAC permissions. The server needs read access to the resources you want to query.
Tool timeout
For large clusters, some tools may time out. Use filtering options (namespace, labels) to reduce the scope.
Dependencies
`mcp[cli]>=1.10.1` - Model Context Protocol SDK
`kubernetes>=32.0.1` - Kubernetes Python client
`pandas>=2.0.0` - Data analysis
`scikit-learn>=1.6.1` - ML algorithms
`prometheus-client>=0.22.0` - Prometheus integration
`aiohttp>=3.12.2` - Async HTTP client
Contributing
Contributions are welcome! Please read our Contributing Guide before submitting pull requests.
Security
For security vulnerabilities, please see our Security Policy.
License
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Acknowledgments
Built with FastMCP framework
Inspired by the needs of SRE teams managing complex Kubernetes environments