Glama
Use-Tusk
by Use-Tusk

aggregate_spans

Analyze performance metrics from application spans to calculate latency percentiles, error rates, request counts, and compare environments for debugging and optimization.

Instructions

Calculate aggregated metrics and statistics across spans.

Use this tool to:

  • Get latency percentiles for endpoints (p50, p95, p99)

  • Calculate error rates by endpoint

  • Get request counts over time

  • Compare performance across environments

Examples:

  • Endpoint latency: groupBy = ["name"], metrics = ["count", "avgDuration", "p95Duration"]

  • Error rates: groupBy = ["name"], metrics = ["count", "errorCount", "errorRate"]

  • Hourly trends: timeBucket = "hour", metrics = ["count", "errorRate"]
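
Putting these pieces together, a complete set of call arguments might look like the following sketch. The field values are illustrative only, and `observableServiceId` is a placeholder, not a real service ID:

```typescript
// Hypothetical aggregate_spans arguments: p95 latency per endpoint for one
// service, slowest endpoints first, capped at 10 rows.
const args = {
  observableServiceId: "svc_example", // placeholder ID
  groupBy: ["name"],
  metrics: ["count", "avgDuration", "p95Duration"],
  orderBy: { metric: "p95Duration", direction: "DESC" },
  limit: 10,
};
```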

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| observableServiceId | No | Service ID to query (required if multiple services available) | |
| where | No | Filter conditions | |
| groupBy | No | Fields to group by | |
| metrics | Yes | Metrics to calculate | |
| timeBucket | No | Time bucket for time-series data | |
| orderBy | No | Order by metric | |
| limit | No | Max results | 20 |

Implementation Reference

  • The handler function that executes the aggregate_spans tool logic: it validates input with the Zod schema, calls the API client to aggregate spans, and formats the results into human-readable text.
    export async function handleAggregateSpans(
      client: TuskDriftApiClient,
      args: Record<string, unknown>
    ): Promise<{ content: Array<{ type: "text"; text: string }> }> {
      const input = aggregateSpansInputSchema.parse(args) as AggregateSpansInput;
      const result = await client.aggregateSpans(input);
    
      const header = `Aggregation Results (${result.results.length} rows):\n`;
    
      const rows = result.results
        .map((row, i) => {
          const groupStr = Object.entries(row.groupValues)
            .map(([k, v]) => `${k}=${v}`)
            .join(", ");
    
          const metrics: string[] = [];
          if (row.count !== undefined) metrics.push(`count: ${row.count}`);
          if (row.errorCount !== undefined) metrics.push(`errors: ${row.errorCount}`);
          if (row.errorRate !== undefined) metrics.push(`error rate: ${(row.errorRate * 100).toFixed(2)}%`);
          if (row.avgDuration !== undefined) metrics.push(`avg: ${row.avgDuration.toFixed(2)}ms`);
          if (row.minDuration !== undefined) metrics.push(`min: ${row.minDuration.toFixed(2)}ms`);
          if (row.maxDuration !== undefined) metrics.push(`max: ${row.maxDuration.toFixed(2)}ms`);
          if (row.p50Duration !== undefined) metrics.push(`p50: ${row.p50Duration.toFixed(2)}ms`);
          if (row.p95Duration !== undefined) metrics.push(`p95: ${row.p95Duration.toFixed(2)}ms`);
          if (row.p99Duration !== undefined) metrics.push(`p99: ${row.p99Duration.toFixed(2)}ms`);
    
          const timeBucketStr = row.timeBucket ? ` [${row.timeBucket}]` : "";
    
          return `${i + 1}. ${groupStr || "(all)"}${timeBucketStr}\n   ${metrics.join(" | ")}`;
        })
        .join("\n\n");
    
      return {
        content: [
          {
            type: "text",
            text: header + rows,
          },
        ],
      };
    }
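
To see what the handler's text output looks like, here is a dependency-free sketch of the same row-formatting logic applied to a made-up result row (the `Row` interface is trimmed to a few fields, and the row values are invented for illustration):

```typescript
// Standalone sketch of the handler's per-row formatting, using a fabricated row.
interface Row {
  groupValues: Record<string, string>;
  count?: number;
  p95Duration?: number;
  timeBucket?: string;
}

function formatRow(row: Row, i: number): string {
  // "name=GET /users" style group label, or "(all)" when ungrouped.
  const groupStr = Object.entries(row.groupValues)
    .map(([k, v]) => `${k}=${v}`)
    .join(", ");

  const metrics: string[] = [];
  if (row.count !== undefined) metrics.push(`count: ${row.count}`);
  if (row.p95Duration !== undefined) metrics.push(`p95: ${row.p95Duration.toFixed(2)}ms`);

  const timeBucketStr = row.timeBucket ? ` [${row.timeBucket}]` : "";
  return `${i + 1}. ${groupStr || "(all)"}${timeBucketStr}\n   ${metrics.join(" | ")}`;
}

const line = formatRow({ groupValues: { name: "GET /users" }, count: 120, p95Duration: 83.5 }, 0);
// line === "1. name=GET /users\n   count: 120 | p95: 83.50ms"
```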
  • Zod schema defining the input structure and validation for the aggregate_spans tool, including filters, grouping, metrics, and limits.
    export const aggregateSpansInputSchema = z.object({
      observableServiceId: z.string().optional().describe("Service ID to query (required if multiple services available)"),
      where: spanWhereClauseSchema.optional().describe("Filter conditions"),
      groupBy: z
        .array(z.enum(["name", "packageName", "instrumentationName", "environment", "statusCode"]))
        .optional()
        .describe("Fields to group by"),
      metrics: z
        .array(
          z.enum([
            "count",
            "errorCount",
            "errorRate",
            "avgDuration",
            "minDuration",
            "maxDuration",
            "p50Duration",
            "p95Duration",
            "p99Duration",
          ])
        )
        .min(1)
        .describe("Metrics to calculate"),
      timeBucket: z.enum(["hour", "day", "week"]).optional().describe("Time bucket for time-series data"),
      orderBy: z
        .object({
          metric: z.string(),
          direction: z.enum(["ASC", "DESC"]),
        })
        .optional()
        .describe("Order by metric"),
      limit: z.number().min(1).max(100).default(20).describe("Max results"),
    });
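
Without pulling in zod, the key constraints the schema enforces can be sketched as a plain validation function. The metric names mirror the schema above, but `checkInput` itself is a hypothetical stand-in, not part of the real code:

```typescript
const ALLOWED_METRICS = new Set([
  "count", "errorCount", "errorRate", "avgDuration", "minDuration",
  "maxDuration", "p50Duration", "p95Duration", "p99Duration",
]);

// Mirrors the schema's rules: metrics must be a non-empty list of known
// names; limit defaults to 20 and must fall in 1..100.
function checkInput(input: { metrics: string[]; limit?: number }): { metrics: string[]; limit: number } {
  if (input.metrics.length === 0) throw new Error("metrics must contain at least one entry");
  for (const m of input.metrics) {
    if (!ALLOWED_METRICS.has(m)) throw new Error(`unknown metric: ${m}`);
  }
  const limit = input.limit ?? 20;
  if (limit < 1 || limit > 100) throw new Error("limit must be between 1 and 100");
  return { metrics: input.metrics, limit };
}
```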
  • MCP Tool object definition for 'aggregate_spans', including name, detailed description, and JSON input schema compatible with MCP.
    export const aggregateSpansTool: Tool = {
      name: "aggregate_spans",
      description: `Calculate aggregated metrics and statistics across spans.
    
    Use this tool to:
    - Get latency percentiles for endpoints (p50, p95, p99)
    - Calculate error rates by endpoint
    - Get request counts over time
    - Compare performance across environments
    
    Examples:
    - Endpoint latency: groupBy = ["name"], metrics = ["count", "avgDuration", "p95Duration"]
    - Error rates: groupBy = ["name"], metrics = ["count", "errorCount", "errorRate"]
    - Hourly trends: timeBucket = "hour", metrics = ["count", "errorRate"]`,
      inputSchema: {
        type: "object",
        properties: {
          observableServiceId: {
            type: "string",
            description: "Service ID to query. Required if multiple services are available.",
          },
          where: {
            type: "object",
            description: "Filter conditions (same as query_spans)",
          },
          groupBy: {
            type: "array",
            description: "Fields to group by",
            items: {
              type: "string",
              enum: ["name", "packageName", "instrumentationName", "environment", "statusCode"],
            },
          },
          metrics: {
            type: "array",
            description: "Metrics to calculate",
            items: {
              type: "string",
              enum: [
                "count",
                "errorCount",
                "errorRate",
                "avgDuration",
                "minDuration",
                "maxDuration",
                "p50Duration",
                "p95Duration",
                "p99Duration",
              ],
            },
          },
          timeBucket: {
            type: "string",
            description: "Time bucket for time-series data",
            enum: ["hour", "day", "week"],
          },
          orderBy: {
            type: "object",
            description: "Order results by a metric",
            properties: {
              metric: { type: "string" },
              direction: { type: "string", enum: ["ASC", "DESC"] },
            },
          },
          limit: {
            type: "number",
            description: "Maximum results to return",
            default: 20,
          },
        },
        required: ["metrics"],
      },
    };
  • Central registration mapping the tool name 'aggregate_spans' to its handler function handleAggregateSpans.
    export const toolHandlers: Record<string, ToolHandler> = {
      query_spans: handleQuerySpans,
      get_schema: handleGetSchema,
      list_distinct_values: handleListDistinctValues,
      aggregate_spans: handleAggregateSpans,
      get_trace: handleGetTrace,
      get_spans_by_ids: handleGetSpansByIds,
    };
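
The lookup pattern above, a plain record from tool name to handler, can be exercised with stubs. Everything here is a simplified, synchronous stand-in for the real handlers; only the lookup-and-dispatch shape matches the registry:

```typescript
type ToolHandler = (args: Record<string, unknown>) => string;

// Stub registry: two fake handlers standing in for the real ones.
const handlers: Record<string, ToolHandler> = {
  aggregate_spans: () => "aggregate result",
  query_spans: () => "query result",
};

// Unknown tool names are a normal error path rather than a crash.
function dispatch(name: string, args: Record<string, unknown>): string {
  const handler = handlers[name];
  if (!handler) throw new Error(`Unknown tool: ${name}`);
  return handler(args);
}
```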
  • Export of all tools array including the aggregateSpansTool for MCP server registration.
    export const tools: Tool[] = [
      querySpansTool,
      getSchemaTool,
      listDistinctValuesTool,
      aggregateSpansTool,
      getTraceTool,
      getSpansByIdsTool,
    ];
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively communicates the tool's analytical nature through examples showing different aggregation scenarios. However, it doesn't mention performance characteristics, rate limits, or authentication requirements that would be helpful for a data aggregation tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections: purpose statement, usage scenarios, and concrete examples. Every sentence adds value, and the information is front-loaded with the most important guidance appearing first. The examples are specific and illustrative without being verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex aggregation tool with 7 parameters, rich schema (100% coverage), and no output schema, the description provides good context through usage scenarios and examples. However, without annotations or output schema, it could benefit from mentioning expected return format or result structure to fully prepare the agent for interpreting results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds value through examples showing practical combinations of parameters (e.g., 'groupBy = ["name"], metrics = ["count", "avgDuration", "p95Duration"]'), but doesn't provide additional semantic context beyond what the schema already specifies.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('calculate aggregated metrics and statistics') and resources ('across spans'). It distinguishes itself from sibling tools like 'get_spans_by_ids' or 'query_spans' by focusing on aggregation rather than retrieval of individual spans or traces.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidelines with a 'Use this tool to:' section listing four specific scenarios (latency percentiles, error rates, request counts, performance comparison). It differentiates from alternatives by focusing on aggregated metrics rather than raw span data retrieval.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
