Foundry MCP

observability.md•6.83 KiB

# Observability Guide This guide covers foundry-mcp's observability stack, including OpenTelemetry tracing and Prometheus metrics. ## Design Philosophy foundry-mcp implements **graceful degradation** for all observability features: - When optional dependencies are not installed, all observability features become no-ops - Zero performance overhead when observability is disabled - No import errors or runtime failures from missing dependencies - Features can be enabled incrementally as needed This means you can: 1. Run foundry-mcp without any observability dependencies 2. Add Prometheus metrics without OpenTelemetry 3. Add OpenTelemetry tracing without Prometheus 4. Enable full observability with both ## Quick Start ### Installing Dependencies ```bash # Prometheus metrics only pip install foundry-mcp[metrics] # OpenTelemetry tracing only pip install foundry-mcp[tracing] # Full observability stack pip install foundry-mcp[observability] ``` ### Enabling Prometheus Metrics Add to your `foundry-mcp.toml`: ```toml [observability] enabled = true prometheus_enabled = true prometheus_port = 9090 # Expose /metrics on this port ``` Verify metrics are available: ```bash curl http://localhost:9090/metrics ``` ### Enabling OpenTelemetry Tracing Add to your `foundry-mcp.toml`: ```toml [observability] enabled = true otel_enabled = true otel_endpoint = "localhost:4317" # OTLP gRPC endpoint otel_service_name = "foundry-mcp" otel_sample_rate = 1.0 # Sample all traces (reduce in production) ``` ### Full Stack Configuration ```toml [observability] enabled = true # OpenTelemetry otel_enabled = true otel_endpoint = "otel-collector:4317" otel_service_name = "foundry-mcp" otel_sample_rate = 0.1 # Sample 10% in production # Prometheus prometheus_enabled = true prometheus_port = 9090 prometheus_host = "0.0.0.0" prometheus_namespace = "foundry_mcp" ``` ## Configuration Reference ### TOML Configuration | Key | Type | Default | Description | |-----|------|---------|-------------| | `observability.enabled` | bool | `false` | Master switch for all observability | | `observability.otel_enabled` | bool | `false` | Enable OpenTelemetry tracing | | `observability.otel_endpoint` | string | `localhost:4317` | OTLP gRPC endpoint | | `observability.otel_service_name` | string | `foundry-mcp` | Service name in traces | | `observability.otel_sample_rate` | float | `1.0` | Trace sampling rate (0.0-1.0) | | `observability.prometheus_enabled` | bool | `false` | Enable Prometheus metrics | | `observability.prometheus_port` | int | `0` | HTTP port for /metrics (0=disabled) | | `observability.prometheus_host` | string | `0.0.0.0` | HTTP server bind address | | `observability.prometheus_namespace` | string | `foundry_mcp` | Metric name prefix | ### Environment Variables All TOML settings can be overridden via environment variables: | Environment Variable | TOML Equivalent | |---------------------|-----------------| | `FOUNDRY_MCP_OBSERVABILITY_ENABLED` | `observability.enabled` | | `FOUNDRY_MCP_OTEL_ENABLED` | `observability.otel_enabled` | | `FOUNDRY_MCP_OTEL_ENDPOINT` | `observability.otel_endpoint` | | `FOUNDRY_MCP_OTEL_SERVICE_NAME` | `observability.otel_service_name` | | `FOUNDRY_MCP_OTEL_SAMPLE_RATE` | `observability.otel_sample_rate` | | `FOUNDRY_MCP_PROMETHEUS_ENABLED` | `observability.prometheus_enabled` | | `FOUNDRY_MCP_PROMETHEUS_PORT` | `observability.prometheus_port` | | `FOUNDRY_MCP_PROMETHEUS_HOST` | `observability.prometheus_host` | | `FOUNDRY_MCP_PROMETHEUS_NAMESPACE` | `observability.prometheus_namespace` | ## Metrics Catalog foundry-mcp emits the following Prometheus metrics: ### Tool Metrics | Metric | Type | Labels | Description | |--------|------|--------|-------------| | `foundry_mcp_tool_invocations_total` | Counter | tool, status | Total tool invocations | | `foundry_mcp_tool_duration_seconds` | Histogram | tool | Tool execution duration | | `foundry_mcp_tool_errors_total` | Counter | tool, error_type | Total tool errors | | `foundry_mcp_active_operations` | Gauge | operation_type | Currently active operations | ### Resource Metrics | Metric | Type | Labels | Description | |--------|------|--------|-------------| | `foundry_mcp_resource_access_total` | Counter | resource_type, action | Total resource accesses | For the full catalog with examples, see `src/foundry_mcp/core/metrics_registry.py`. ## Deep Research Audit Artifacts Deep research workflows emit a per-run JSONL audit artifact when enabled. Each line is a structured event covering phase starts/completions, provider results, LLM prompts/responses, and summary stats. **Location (default)**: - `specs/.research/deep_research/<research_id>.audit.jsonl` **Config toggle**: - `research.deep_research_audit_artifacts = true` **Example event**: ```json {"timestamp":"2026-01-01T12:00:00Z","event_id":"...","event_type":"analysis_result","level":"info","research_id":"deepres-abc123","phase":"analysis","iteration":1,"data":{"provider_id":"gemini","tokens_used":1234,"parse_success":true,"finding_count":4}} ``` ## Troubleshooting ### Observability Not Working 1. **Check dependencies are installed:** ```bash pip show prometheus-client opentelemetry-sdk ``` 2. **Verify configuration:** ```python from foundry_mcp.core.observability import get_observability_status print(get_observability_status()) ``` 3. **Check logs for initialization errors:** Look for "Failed to initialize OpenTelemetry" or "Failed to initialize Prometheus" messages. ### Metrics Not Appearing 1. **Verify Prometheus port is exposed:** ```bash curl http://localhost:9090/metrics ``` 2. **Check `prometheus_port` is non-zero** in configuration. 3. **Verify `observability.enabled = true`** - this is the master switch. ### Traces Not Appearing in Jaeger/Zipkin 1. **Verify OTLP endpoint is reachable:** ```bash nc -zv localhost 4317 ``` 2. **Check `otel_sample_rate`** - set to `1.0` for debugging. 3. **Ensure traces are flushed** - the server calls `shutdown()` on exit to flush pending spans. ### High Memory Usage If observability is causing memory issues: 1. Reduce `otel_sample_rate` to sample fewer traces 2. Ensure the OTLP endpoint is healthy (spans queue when endpoint is down) 3. Set `prometheus_port = 0` if not using Prometheus HTTP server ## Programmatic Access For advanced use cases, access the observability manager directly: ```python from foundry_mcp.core.observability import get_observability_manager manager = get_observability_manager() # Check what's enabled print(f"Tracing: {manager.is_tracing_enabled()}") print(f"Metrics: {manager.is_metrics_enabled()}") # Get a tracer for custom spans tracer = manager.get_tracer("my-module") with tracer.start_as_current_span("my-operation"): # ... do work pass # Get Prometheus exporter for custom metrics exporter = manager.get_prometheus_exporter() exporter.record_tool_invocation("custom-tool", success=True, duration_ms=42.5) ```

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/tylerburleigh/foundry-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

observability.md•6.83 KiB