Skip to main content
Glama
sandraschi

Observability MCP Server

Observability MCP Server

FastMCP 2.14.1-powered observability server for monitoring MCP ecosystems

FastMCP OpenTelemetry Prometheus GitHub

A comprehensive observability server built on FastMCP 2.14.1 that leverages OpenTelemetry integration, persistent storage, and advanced monitoring capabilities to provide production-grade observability for MCP server ecosystems.


πŸš€ Features

FastMCP 2.14.1 Integration

  • βœ… OpenTelemetry Integration - Distributed tracing and metrics collection

  • βœ… Enhanced Storage Backend - Persistent metrics and historical data

  • βœ… Production-Ready - Built for high-performance monitoring

Comprehensive Monitoring

  • πŸ” Real-time Health Checks - Monitor MCP server availability and response times

  • πŸ“Š Performance Metrics - CPU, memory, disk, and network monitoring

  • πŸ”— Distributed Tracing - Track interactions across MCP server ecosystems

  • 🚨 Intelligent Alerting - Anomaly detection and automated alerts

  • πŸ“ˆ Performance Reports - Automated analysis and optimization recommendations

Advanced Analytics

  • πŸ”¬ Usage Pattern Analysis - Understand how MCP servers are being used

  • πŸ“‰ Trend Detection - Identify performance trends and bottlenecks

  • 🎯 Optimization Insights - Data-driven recommendations for improvement

  • πŸ“€ Multi-Format Export - Prometheus, OpenTelemetry, and JSON export


πŸ› οΈ Installation

Prerequisites

  • Python 3.11+

  • FastMCP 2.14.1+ (automatically installed)

Install from Source

git clone https://github.com/sandraschi/observability-mcp cd observability-mcp pip install -e .

Docker Installation

docker build -t observability-mcp . docker run -p 9090:9090 observability-mcp

πŸš€ Quick Start

1. Start the Server

# Using the CLI observability-mcp run # Or directly with Python python -m observability_mcp.server

2. Verify Installation

# Check server health observability-mcp health # View available metrics observability-mcp metrics

3. Configure Claude Desktop

Add to your claude_desktop_config.json:

{ "mcpServers": { "observability": { "command": "observability-mcp", "args": ["run"] } } }

πŸ“Š Available Tools

πŸ” Health Monitoring

  • monitor_server_health - Real-time health checks with OpenTelemetry metrics

  • monitor_system_resources - Comprehensive system resource monitoring

πŸ“ˆ Performance Analysis

  • collect_performance_metrics - CPU, memory, disk, and network metrics

  • generate_performance_reports - Automated performance analysis and recommendations

  • analyze_mcp_interactions - Usage pattern analysis and optimization insights

🚨 Alerting & Anomaly Detection

  • alert_on_anomalies - Intelligent anomaly detection and alerting

  • trace_mcp_calls - Distributed tracing for MCP server interactions

πŸ“€ Data Export

  • export_metrics - Export metrics in Prometheus, OpenTelemetry, or JSON formats


πŸ”§ Configuration

Environment Variables

# Prometheus metrics server port PROMETHEUS_PORT=9090 # OpenTelemetry service name OTEL_SERVICE_NAME=observability-mcp # OTLP exporter endpoint (optional) OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317 # Metrics retention period (days) METRICS_RETENTION_DAYS=30

Alert Configuration

The server comes with pre-configured alerts for common issues:

  • CPU Usage > 90% (Warning)

  • Memory Usage > 1GB (Error)

  • Error Rate > 5% (Error)

Alerts are stored persistently and can be customized through the MCP tools.


πŸ“ˆ Monitoring Dashboard

Prometheus Metrics

Access metrics at: http://localhost:9090/metrics

Available metrics:

# Health checks mcp_health_checks_total{status="healthy|degraded|unhealthy", service="..."} 1 # Performance metrics mcp_performance_metrics_collected{service="..."} 1 # System resources mcp_cpu_usage_percent{} 45.2 mcp_memory_usage_mb{} 1024.5 # Traces and alerts mcp_traces_created{service="...", operation="..."} 1 mcp_alerts_triggered{type="active|anomaly"} 1

Integration with Grafana

  1. Add Prometheus as a data source in Grafana

  2. Import the provided dashboard JSON

  3. Visualize your MCP ecosystem's health and performance


πŸ—οΈ Architecture

FastMCP 2.14.1 Features Leveraged

OpenTelemetry Integration

  • Distributed Tracing: Track requests across multiple MCP servers

  • Metrics Collection: Structured performance data collection

  • Context Propagation: Maintain context across service boundaries

Enhanced Persistent Storage

  • Historical Data: Store metrics and traces for trend analysis

  • Cross-Session Persistence: Data survives server restarts

  • Efficient Storage: Optimized for time-series data

Production Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ MCP Servers │───▢│ Observability │───▢│ Prometheus β”‚ β”‚ (Monitored) β”‚ β”‚ MCP Server β”‚ β”‚ Metrics β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ β”‚ β–Ό β–Ό β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚ Persistent β”‚ β”‚ Grafana β”‚ β”‚ Storage β”‚ β”‚ Dashboard β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

πŸ“š Usage Examples

Health Monitoring

# Check MCP server health result = await monitor_server_health( service_url="http://localhost:8000/health", timeout_seconds=5.0 ) print(f"Status: {result['health_check']['status']}")

Performance Analysis

# Collect system metrics metrics = await collect_performance_metrics(service_name="my-mcp-server") print(f"CPU: {metrics['metrics']['cpu_percent']}%") print(f"Memory: {metrics['metrics']['memory_mb']} MB")

Distributed Tracing

# Record a trace trace = await trace_mcp_calls( operation_name="process_document", service_name="ocr-mcp", duration_ms=150.5, attributes={"file_size": "2.3MB", "format": "PDF"} )

Generate Reports

# Create performance report report = await generate_performance_reports( service_name="web-mcp", days=7 ) print("Performance Summary:", report['summary']) print("Recommendations:", report['recommendations'])

πŸ”§ Development

Running Tests

# Install development dependencies pip install -e ".[dev]" # Run tests pytest # Run with coverage pytest --cov=observability_mcp --cov-report=html

Code Quality

# Format code black src/ # Lint code ruff check src/ # Type checking mypy src/

Docker Development

# Build development image docker build -t observability-mcp:dev -f Dockerfile.dev . # Run with hot reload docker run -p 9090:9090 -v $(pwd):/app observability-mcp:dev

πŸ“Š Performance Benchmarks

FastMCP 2.14.1 Benefits

  • OpenTelemetry Overhead: <1ms per trace

  • Storage Performance: 1000+ metrics/second

  • Memory Usage: 50MB baseline + 10MB per monitored service

  • Concurrent Monitoring: 100+ services simultaneously

  • CPU: 2+ cores for metrics processing

  • RAM: 2GB minimum, 4GB recommended

  • Storage: 10GB for metrics history (30 days retention)


🚨 Troubleshooting

Common Issues

Server Won't Start

# Check Python version python --version # Should be 3.11+ # Check FastMCP installation pip show fastmcp # Should be 2.14.1+ # Check dependencies pip check

Metrics Not Appearing

# Check Prometheus endpoint curl http://localhost:9090/metrics # Verify OpenTelemetry configuration observability-mcp metrics

High Memory Usage

  • Reduce METRICS_RETENTION_DAYS

  • Implement metric aggregation

  • Monitor with monitor_system_resources

Storage Issues

  • Check available disk space

  • Clean old metrics: rm -rf ~/.observability-mcp/metrics/*

  • Restart server to recreate storage


🀝 Contributing

Development Setup

  1. Fork the repository

  2. Create a feature branch

  3. Make your changes

  4. Add tests for new functionality

  5. Submit a pull request

Code Standards

  • FastMCP 2.14.1+: Use latest features and patterns

  • OpenTelemetry: Follow OTEL best practices

  • Async First: All operations should be async

  • Type Hints: Full type coverage required

  • Documentation: Comprehensive docstrings

Testing Strategy

  • Unit Tests: Core functionality

  • Integration Tests: MCP server interactions

  • Performance Tests: Benchmarking and load testing

  • Chaos Tests: Failure scenario testing


πŸ“„ License

MIT License - see LICENSE file for details.


πŸ™ Acknowledgments

  • FastMCP Team - For the amazing 2.14.1 framework with OpenTelemetry integration

  • OpenTelemetry Community - For the observability standards and tools

  • Prometheus Team - For the metrics collection and alerting system



Built with ❀️ using FastMCP 2.14.1 and OpenTelemetry

-
security - not tested
F
license - not found
-
quality - not tested

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/sandraschi/observability-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server