Observability & Governance: Using OTEL, Guardrails & Metrics with MCP Workflows


AWS
Strands Agents SDK
observability
otel
guardrails
mcp
Agentic AI

  1. Implementing Observability & Governance in MCP and Strands
     a. Adding OpenTelemetry to MCP & Strands Agents
     b. Defining Guardrails on MCP Tools
     c. Capturing Custom Metrics
  2. What Happens Behind the Scenes
  3. Visualizing & Analyzing Observability Data
  4. Conclusion
  5. References

In previous articles, we explored how to build, integrate, and deploy the Strands Agents SDK with the Model Context Protocol (MCP) for dynamic tool usage and reasoning [1][2][3]. While these agents are functional and scalable, deploying them in production environments demands more than reliability; it requires visibility and control.

                Without observability, it is difficult to trace how an agent made a decision, which tools were invoked, or why certain failures occurred. Similarly, without governance mechanisms like guardrails, agents might make unsafe or inefficient calls, leading to unpredictable outcomes.

                In this article, we’ll demonstrate:

                • How to add OpenTelemetry (OTEL) for tracing and metrics.
                • How to implement governance guardrails within MCP tools.
                • How these capabilities work together to monitor, audit, and optimize agentic workflows.

                Implementing Observability & Governance in MCP and Strands

                a. Adding OpenTelemetry to MCP & Strands Agents

To monitor both MCP tool execution and agent decisions, we can integrate OTEL tracing [4][5].

                Installation:

```bash
pip install opentelemetry-api opentelemetry-sdk opentelemetry-instrumentation
pip install opentelemetry-exporter-otlp
```

                OTEL Setup in MCP Server:

```python
from mcp.server.fastmcp import FastMCP
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
# The OTLP exporter ships in the opentelemetry-exporter-otlp package,
# not in the core SDK.
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

trace.set_tracer_provider(TracerProvider())
tracer = trace.get_tracer(__name__)

otlp_exporter = OTLPSpanExporter(endpoint="localhost:4317", insecure=True)
trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(otlp_exporter))

mcp = FastMCP("otel-enabled-mcp", stateless_http=True, port=8002)

@mcp.tool()
def get_greeting(name: str) -> str:
    with tracer.start_as_current_span("get_greeting_tool"):
        return f"Hello, {name}!"

if __name__ == "__main__":
    mcp.run(transport="streamable-http")
```

Agent-Side Instrumentation: Similarly, you can wrap the agent's reasoning process in spans to trace which tools the LLM decides to invoke [6].

                b. Defining Guardrails on MCP Tools

                Guardrails add runtime checks and constraints to ensure tools are used appropriately.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("guardrails-mcp", stateless_http=True, port=8002)

@mcp.tool()
def limited_sum(a: int, b: int) -> int:
    if a < 0 or b < 0:
        raise ValueError("Inputs must be non-negative.")
    if a + b > 100:
        raise ValueError("Sum cannot exceed 100.")
    return a + b

if __name__ == "__main__":
    mcp.run(transport="streamable-http")
```

                This ensures any misuse of the tool by the agent is caught early, keeping operations within safe parameters.
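When the same policy applies to several tools, the inline checks can be factored into a reusable decorator. The sketch below is plain Python (the `bounded_ints` name and its parameters are illustrative, not part of the MCP SDK); because `functools.wraps` preserves the wrapped function's metadata, it should compose with `@mcp.tool()`.

```python
import functools

def bounded_ints(max_total: int):
    """Reject negative arguments and sums above max_total before the tool runs."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args: int) -> int:
            if any(a < 0 for a in args):
                raise ValueError("Inputs must be non-negative.")
            if sum(args) > max_total:
                raise ValueError(f"Sum cannot exceed {max_total}.")
            return fn(*args)
        return wrapper
    return decorator

@bounded_ints(max_total=100)
def limited_sum(a: int, b: int) -> int:
    return a + b

print(limited_sum(2, 3))  # prints 5; limited_sum(60, 60) raises ValueError
```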

                c. Capturing Custom Metrics

                You can also capture quantitative metrics, such as invocation counts or latency.

```python
from mcp.server.fastmcp import FastMCP
from opentelemetry.metrics import get_meter_provider, set_meter_provider
from opentelemetry.sdk.metrics import MeterProvider
from opentelemetry.sdk.metrics.export import PeriodicExportingMetricReader
from opentelemetry.exporter.otlp.proto.grpc.metric_exporter import OTLPMetricExporter

# The reader is attached at construction time; the SDK has no
# start_pipeline() method.
exporter = OTLPMetricExporter(endpoint="localhost:4317", insecure=True)
reader = PeriodicExportingMetricReader(exporter)
set_meter_provider(MeterProvider(metric_readers=[reader]))
meter = get_meter_provider().get_meter("mcp-metrics")

tool_invocation_counter = meter.create_counter(
    "tool_invocations",
    description="Counts tool invocations",
)

mcp = FastMCP("metrics-mcp", stateless_http=True, port=8002)

@mcp.tool()
def monitored_greeting(name: str) -> str:
    tool_invocation_counter.add(1)
    return f"Hello, {name}!"
```

                What Happens Behind the Scenes

                Once integrated:

• The Strands Agent, when reasoning over a user prompt, emits OTEL spans capturing which tools it intends to invoke [4][5].
                • Upon calling an MCP tool, the MCP server logs a trace span for the tool execution.
                • If a tool includes guardrails, violations are raised and logged as part of the trace error details.
                • Simultaneously, metrics like tool invocation count, success/failure rates, and execution durations are collected.
• All telemetry is exported to an observability backend such as Grafana, AWS X-Ray, or Langfuse for visualization and audit [7].


                This structured flow ensures you can:

                • Debug why an agent took a specific action.
• Monitor operational health in real time.
                • Validate that governance rules are effectively enforced.

                Visualizing & Analyzing Observability Data

                Once telemetry is wired, you can visualize traces and metrics through various platforms:

                • Grafana OTEL dashboards: Visualize tool latency, agent reasoning steps, and invocation frequency.
                • AWS X-Ray: Trace the end-to-end request from the agent through MCP tools.
• Langfuse: Integrates with the Strands SDK for agent-specific logs and evaluations [8].

                Example Grafana panel queries:

                • Number of tool calls per hour
                • Average latency per tool
                • Error rates per MCP endpoint

                Conclusion

                Bringing observability and governance into your MCP + Strands setup isn’t just about monitoring—it’s about making these systems understandable and accountable. With tracing, metrics, and guardrails in place, teams can confidently manage agents in real-world environments. Looking ahead, these practices will be essential as agent workflows become more complex and need stronger assurances around safety, reliability, and auditability.

                References

                Footnotes

                1. Understanding AWS Strands Agents, an Open Source AI Agents SDK

                2. Hands‑On: Implementing a Basic Strands Agent with MCP

                3. Serverless Scaling: Deploying Strands + MCP on AWS

4. Observability - Strands Agents SDK

5. Traces - Strands Agents SDK

                6. Amazon Strands Agents SDK: A Technical Deep Dive into Agent Frameworks - LinkedIn

                7. Example - Tracing and Evaluation for the OpenAI-Agents SDK (Langfuse)

                8. Integrate Langfuse with the Strands Agents SDK

                Written by Om-Shree-0709 (@Om-Shree-0709)