# Schema Registry Custom Metrics Summary
## š **Custom `mcp_schema_registry_*` Metrics Added**
The remote MCP server now exposes **14 custom Schema Registry metrics** with the prefix `mcp_schema_registry_*` that provide deep insights into Schema Registry operations and health.
---
## š¢ **Counter Metrics (Operational Tracking)**
### **Operation Tracking**
```prometheus
# Operations by type across all registries
mcp_schema_registry_operations_total{operation="register_schema"} 25
mcp_schema_registry_operations_total{operation="get_schema"} 156
mcp_schema_registry_operations_total{operation="check_compatibility"} 42
mcp_schema_registry_operations_total{operation="list_subjects"} 89
mcp_schema_registry_operations_total{operation="export_schema"} 12
# Total operations per registry
mcp_schema_registry_operations_by_registry_total{registry="production"} 198
mcp_schema_registry_operations_by_registry_total{registry="staging"} 114
```
### **Error Tracking**
```prometheus
# Failed operations per registry
mcp_schema_registry_errors_total{registry="production"} 3
mcp_schema_registry_errors_total{registry="staging"} 1
```
### **Specific Operation Counters**
```prometheus
# Schema registrations
mcp_schema_registry_registrations_total{registry="production"} 25
mcp_schema_registry_registrations_total{registry="staging"} 18
# Compatibility checks
mcp_schema_registry_compatibility_checks_total{registry="production"} 42
mcp_schema_registry_compatibility_checks_total{registry="staging"} 28
# Schema exports
mcp_schema_registry_exports_total{registry="production"} 12
mcp_schema_registry_exports_total{registry="staging"} 8
# Context-specific operations
mcp_schema_registry_context_operations_total{registry="production",context="user-events"} 45
mcp_schema_registry_context_operations_total{registry="production",context="order-events"} 32
```
---
## š **Gauge Metrics (Current State)**
### **Registry Inventory**
```prometheus
# Current subjects count (real-time from Schema Registry)
mcp_schema_registry_subjects{registry="production"} 45
mcp_schema_registry_subjects{registry="staging"} 23
# Current schemas count (real-time from Schema Registry)
mcp_schema_registry_schemas{registry="production"} 127
mcp_schema_registry_schemas{registry="staging"} 67
# Available contexts count
mcp_schema_registry_contexts{registry="production"} 3
mcp_schema_registry_contexts{registry="staging"} 2
```
### **Health Status**
```prometheus
# Registry health (1=healthy, 0=unhealthy)
mcp_schema_registry_status{registry="production"} 1
mcp_schema_registry_status{registry="staging"} 1
```
### **Performance Metrics**
```prometheus
# Average response times
mcp_schema_registry_response_time_seconds_avg{registry="production"} 0.089000
mcp_schema_registry_response_time_seconds_avg{registry="staging"} 0.045000
# Maximum response times
mcp_schema_registry_response_time_seconds_max{registry="production"} 0.450000
mcp_schema_registry_response_time_seconds_max{registry="staging"} 0.200000
# Minimum response times
mcp_schema_registry_response_time_seconds_min{registry="production"} 0.023000
mcp_schema_registry_response_time_seconds_min{registry="staging"} 0.015000
```
---
## š **Key Features**
### **š Easy to Obtain Stats**
- ā
**Real-time inventory**: Current subjects/schemas count
- ā
**Performance tracking**: Response time statistics
- ā
**Error monitoring**: Failed operations per registry
- ā
**Operation insights**: Most-used MCP tools
- ā
**Context tracking**: Per-context operation counts
### **š Auto-Collection**
- ā
**Cached registry stats**: 5-minute TTL for performance
- ā
**Automatic instrumentation**: Metrics recorded on every operation
- ā
**Multi-registry aware**: Separate metrics per registry
- ā
**Context-aware**: Operations tracked by Schema Registry context
### **šÆ Production-Ready**
- ā
**Prometheus compatible**: Standard metric format
- ā
**Grafana dashboards**: Ready for visualization
- ā
**Alerting ready**: Critical thresholds for monitoring
- ā
**Performance optimized**: Minimal overhead on operations
---
## š **Monitoring Use Cases**
### **š Operations Monitoring**
```promql
# Most frequently used operations
rate(mcp_schema_registry_operations_total[5m])
# Error rate per registry
rate(mcp_schema_registry_errors_total[5m]) / rate(mcp_schema_registry_operations_by_registry_total[5m])
# Schema registration rate
rate(mcp_schema_registry_registrations_total[5m])
```
### **š Capacity Planning**
```promql
# Subject growth rate
deriv(mcp_schema_registry_subjects[1h])
# Schema proliferation
mcp_schema_registry_schemas / mcp_schema_registry_subjects
# Registry utilization
mcp_schema_registry_operations_by_registry_total
```
### **ā” Performance Analysis**
```promql
# Average response time trends
mcp_schema_registry_response_time_seconds_avg
# Slowest registries
topk(5, mcp_schema_registry_response_time_seconds_max)
# Performance comparison
mcp_schema_registry_response_time_seconds_avg{registry="production"} / mcp_schema_registry_response_time_seconds_avg{registry="staging"}
```
---
## šØ **Recommended Alerts**
### **Critical Alerts**
```yaml
# Registry down
- alert: SchemaRegistryDown
expr: mcp_schema_registry_status == 0
for: 1m
labels:
severity: critical
# High error rate
- alert: SchemaRegistryHighErrorRate
expr: rate(mcp_schema_registry_errors_total[5m]) / rate(mcp_schema_registry_operations_by_registry_total[5m]) > 0.1
for: 2m
labels:
severity: warning
# Slow response times
- alert: SchemaRegistrySlowResponse
expr: mcp_schema_registry_response_time_seconds_avg > 1.0
for: 5m
labels:
severity: warning
```
### **Capacity Alerts**
```yaml
# Subject count growing rapidly
- alert: SchemaRegistrySubjectGrowth
expr: increase(mcp_schema_registry_subjects[1h]) > 100
for: 0m
labels:
severity: info
# Schema proliferation
- alert: SchemaRegistrySchemaProliferation
expr: mcp_schema_registry_schemas / mcp_schema_registry_subjects > 10
for: 0m
labels:
severity: warning
```
---
## š§ **Technical Implementation**
### **Metrics Collection**
- **Real-time operations**: Recorded via `record_schema_operation()` method
- **Registry stats**: Cached via `get_registry_stats()` with 5-minute TTL
- **Performance data**: Response times tracked per operation
- **Error tracking**: Failed operations automatically recorded
### **Performance Optimizations**
- **Caching**: Registry stats cached to avoid excessive API calls
- **Sampling**: Limited to first 50 subjects for schema counting
- **Async-safe**: Thread-safe metrics collection
- **Memory efficient**: Bounded metric storage with automatic cleanup
### **Integration Points**
- **Health checks**: Registry connectivity tested and exposed
- **MCP tools**: All 48 tools automatically instrumented
- **Context awareness**: Operations tracked per Schema Registry context
- **Multi-registry**: Separate metrics for each configured registry
---
## ā
**Summary**
**Added 14 custom Schema Registry metrics** that provide:
šÆ **Operational Insights**: Track register, get, compatibility, export operations
š **Real-time Inventory**: Current subjects, schemas, contexts count
ā” **Performance Monitoring**: Response times (avg/min/max) per registry
šØ **Error Tracking**: Failed operations with registry-level granularity
š„ **Health Status**: Registry connectivity and status monitoring
š **Context Awareness**: Per-context operation tracking
These metrics are **automatically collected**, **easy to obtain**, and **production-ready** for monitoring Schema Registry operations at scale! š