
Log Trace

log_trace

Persist agent execution traces with input, output, spans, tool calls, cost, and latency for later evaluation and analysis.

Instructions

Persist a single agent execution trace (input, output, spans, tool calls, cost, latency, token usage).

Sibling tools — evaluate_output runs heuristic scoring on the trace; evaluate_with_llm_judge runs semantic LLM-based scoring; verify_citations checks citation grounding; get_traces queries stored traces; delete_trace removes a single trace; list_rules / deploy_rule / delete_rule manage custom evaluation rules. log_trace is the WRITE path that records executions; everything else reads, scores, or manages around it.

Behavior. Writes one row to Iris storage (SQLite by default; Postgres in Cloud tier). When IRIS_OTEL_ENDPOINT is set, ALSO fires a best-effort async export to the configured OTLP/HTTP collector (Jaeger, Tempo, Datadog OTLP, OTEL Collector). The OTel export is fire-and-forget — its success does not affect the tool response; failures are logged but the trace is still stored locally. No authentication in stdio mode; HTTP mode requires Bearer token. Rate-limited to 20 req/min on HTTP MCP, unlimited on stdio. Not idempotent: each call mints a fresh trace_id, so resubmitting the same payload creates a duplicate trace.

Output shape. Returns a JSON string: { "trace_id": "<32-hex>", "status": "stored" }. The trace_id is the key you pass to evaluate_output or get_traces afterwards.
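For example, a client consuming the response can parse it directly. The response string below is illustrative only; the trace_id value is a made-up 32-hex placeholder, not a real stored trace:

```typescript
// Hypothetical log_trace response string (shape per the docs above; hex value is a placeholder).
const response = '{"trace_id":"0f3c9a1b2d4e5f60718293a4b5c6d7e8","status":"stored"}';

// Parse the JSON string and pull out the key for follow-up calls.
const parsed = JSON.parse(response) as { trace_id: string; status: string };

// trace_id is 32 lowercase hex chars; hand it to evaluate_output or get_traces next.
console.log(/^[0-9a-f]{32}$/.test(parsed.trace_id)); // true
console.log(parsed.status); // stored
```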

Use when you want to record an agent execution for later evaluation, analysis, or audit. Call it AFTER the agent has produced output; call evaluate_output afterwards to score it; call get_traces to query historical traces. Store rich context: spans (span tree), tool_calls (which tools were invoked with latency/errors), token_usage, cost_usd, metadata (arbitrary key-value). All optional except agent_name.

Don't use when you only need a transient log (use console logging). Don't use to update an existing trace — there is no update path in v0.4 (traces are immutable once stored).

Parameters. agent_name is required; everything else is optional. token_usage and cost_usd are summary fields — if you ALSO pass spans with per-tool-call costs, the summary fields are treated as authoritative (no auto-aggregation). spans without an explicit start_time fall back to the trace timestamp; spans with an end_time get a duration_ms derived. metadata is opaque key-value (queryable in the dashboard, not via get_traces filters). tool_calls record per-tool latency + errors; missing latency_ms means "not reported," not "zero." Defaults: span.kind="INTERNAL", span.status_code="UNSET", timestamp=now() if omitted.
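As a sketch, a fully populated argument object might look like the following. The top-level field names come from the input schema; the sub-field names inside token_usage and tool_calls are assumptions inferred from the descriptions above, not confirmed sub-schemas:

```typescript
// Hypothetical log_trace payload. Top-level keys match the documented schema;
// token_usage and tool_calls sub-keys are assumed for illustration.
const args = {
  agent_name: 'customer-support-bot', // the only required field
  framework: 'langchain',
  input: 'Where is my order?',
  output: 'Your order shipped yesterday.',
  latency_ms: 1830,
  token_usage: { prompt: 412, completion: 96, total: 508 }, // assumed key names
  cost_usd: 0.0031, // authoritative summary; not re-aggregated from spans
  metadata: { requestId: 'req-42', env: 'prod' }, // dashboard-queryable only
  tool_calls: [
    // omitting latency_ms would mean "not reported," not zero
    { name: 'order_lookup', latency_ms: 640 }, // assumed key names
  ],
  // timestamp omitted: the server defaults it to now()
};
```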

Error modes. Throws on missing agent_name. Throws on malformed span or tool_call objects (Zod rejects). Returns 500 on storage failure (disk full, DB locked). Never blocks on the agent — returns within ~50ms for typical payloads.

Input Schema

• agent_name (required): Agent name; used for filtering in get_traces (e.g., "customer-support-bot")
• framework (optional): Agent framework identifier (e.g., langchain, autogen, custom)
• input (optional): Agent input text; the user prompt or upstream input that produced this output
• output (optional): Agent output text; what the agent produced (pass to evaluate_output for scoring)
• tool_calls (optional): Tool calls made during execution (per-call latency, errors, input/output)
• latency_ms (optional): Total execution time in milliseconds (end-to-end agent latency)
• token_usage (optional): Token usage breakdown (prompt/completion/total; used for cost analysis)
• cost_usd (optional): Total cost in USD; overrides per-span aggregation when provided (treated as authoritative)
• metadata (optional): Opaque key-value tags (e.g. {requestId, userId, env}); queryable in dashboard, not via get_traces filters
• spans (optional): Detailed execution spans (hierarchical span tree with timings, attributes, events)
• timestamp (optional): Trace timestamp (ISO 8601); defaults to now() when omitted

Implementation Reference

  • Tool handler that generates a trace ID, constructs a Trace object from validated inputs, inserts it into storage via IStorageAdapter.insertTrace, fires a best-effort async OTel export, and returns { trace_id, status: 'stored' } as JSON text.
    export function registerLogTraceTool(server: McpServer, storage: IStorageAdapter): void {
      server.registerTool(
        'log_trace',
        {
          title: 'Log Trace',
          description: [
            'Persist a single agent execution trace (input, output, spans, tool calls, cost, latency, token usage).',
            '',
            'Sibling tools — evaluate_output runs heuristic scoring on the trace; evaluate_with_llm_judge runs semantic LLM-based scoring; verify_citations checks citation grounding; get_traces queries stored traces; delete_trace removes a single trace; list_rules / deploy_rule / delete_rule manage custom evaluation rules. log_trace is the WRITE path that records executions; everything else reads, scores, or manages around it.',
            '',
            'Behavior. Writes one row to Iris storage (SQLite by default; Postgres in Cloud tier). When IRIS_OTEL_ENDPOINT is set, ALSO fires a best-effort async export to the configured OTLP/HTTP collector (Jaeger, Tempo, Datadog OTLP, OTEL Collector). The OTel export is fire-and-forget — its success does not affect the tool response; failures are logged but the trace is still stored locally. No authentication in stdio mode; HTTP mode requires Bearer token. Rate-limited to 20 req/min on HTTP MCP, unlimited on stdio. Not idempotent: each call mints a fresh trace_id, so resubmitting the same payload creates a duplicate trace.',
            '',
            'Output shape. Returns a JSON string: `{ "trace_id": "<32-hex>", "status": "stored" }`. The trace_id is the key you pass to evaluate_output or get_traces afterwards.',
            '',
            'Use when you want to record an agent execution for later evaluation, analysis, or audit. Call it AFTER the agent has produced output; call evaluate_output afterwards to score it; call get_traces to query historical traces. Store rich context: spans (span tree), tool_calls (which tools were invoked with latency/errors), token_usage, cost_usd, metadata (arbitrary key-value). All optional except agent_name.',
            '',
            'Don\'t use when you only need a transient log (use console logging). Don\'t use to update an existing trace — there is no update path in v0.4 (traces are immutable once stored).',
            '',
            'Parameters. agent_name is required; everything else is optional. token_usage and cost_usd are summary fields — if you ALSO pass spans with per-tool-call costs, the summary fields are treated as authoritative (no auto-aggregation). spans without an explicit start_time fall back to the trace timestamp; spans with an end_time get a duration_ms derived. metadata is opaque key-value (queryable in the dashboard, not via get_traces filters). tool_calls record per-tool latency + errors; missing latency_ms means "not reported," not "zero." Defaults: span.kind="INTERNAL", span.status_code="UNSET", timestamp=now() if omitted.',
            '',
            'Error modes. Throws on missing agent_name. Throws on malformed span or tool_call objects (Zod rejects). Returns 500 on storage failure (disk full, DB locked). Never blocks on the agent — returns within ~50ms for typical payloads.',
          ].join('\n'),
          inputSchema,
          annotations: {
            readOnlyHint: false,     // Writes a row to storage
            destructiveHint: false,  // Creates new data; doesn't overwrite or delete
            idempotentHint: false,   // Each call mints a fresh trace_id; duplicate payloads produce distinct traces
            openWorldHint: false,    // Local storage first. When IRIS_OTEL_ENDPOINT is set a best-effort async OTel export runs but is non-blocking (tool succeeds even if export fails).
          },
        },
        async (args) => {
          const traceId = generateTraceId();
          const timestamp = args.timestamp ?? new Date().toISOString();
    
          const trace = {
            trace_id: traceId,
            agent_name: args.agent_name,
            framework: args.framework,
            input: args.input,
            output: args.output,
            tool_calls: args.tool_calls,
            latency_ms: args.latency_ms,
            token_usage: args.token_usage,
            cost_usd: args.cost_usd,
            metadata: args.metadata as Record<string, unknown> | undefined,
            timestamp,
            spans: args.spans?.map((s) => ({
              ...s,
              span_id: s.span_id ?? generateSpanId(),
              trace_id: traceId,
            })),
          };
    
          await storage.insertTrace(LOCAL_TENANT, trace);
    
          // Best-effort async OTel export (fire-and-forget). No-op when
          // IRIS_OTEL_ENDPOINT isn't configured. Errors are logged via the
          // server logger but never affect the tool response — if the OTel
          // collector is down we still want to store traces locally.
          bestEffortExport(trace, (err) => {
            // eslint-disable-next-line no-console
            console.warn(`[iris.otel] ${err.message}`);
          });
    
          return {
            content: [
              {
                type: 'text' as const,
                text: JSON.stringify({ trace_id: traceId, status: 'stored' }),
              },
            ],
          };
        },
      );
    }
  • Zod input schema defining the 11 parameters accepted by log_trace. Only agent_name is required; all others optional. Includes nested schemas for ToolCallSchema, SpanSchema, and TokenUsageSchema.
    const inputSchema = {
      agent_name: z.string().describe('Agent name — used for filtering in get_traces (e.g., "customer-support-bot")'),
      framework: z.string().optional().describe('Agent framework identifier (e.g., langchain, autogen, custom)'),
      input: z.string().optional().describe('Agent input text — the user prompt or upstream input that produced this output'),
      output: z.string().optional().describe('Agent output text — what the agent produced (pass to evaluate_output for scoring)'),
      tool_calls: z.array(ToolCallSchema).optional().describe('Tool calls made during execution (per-call latency, errors, input/output)'),
      latency_ms: z.number().optional().describe('Total execution time in milliseconds (end-to-end agent latency)'),
      token_usage: TokenUsageSchema.optional().describe('Token usage breakdown (prompt/completion/total — used for cost analysis)'),
      cost_usd: z.number().optional().describe('Total cost in USD — overrides per-span aggregation when provided (treated as authoritative)'),
      metadata: z.record(z.unknown()).optional().describe('Opaque key-value tags (e.g. {requestId, userId, env}) — queryable in dashboard, not via get_traces filters'),
      spans: z.array(SpanSchema).optional().describe('Detailed execution spans (hierarchical span tree with timings, attributes, events)'),
      timestamp: z.string().optional().describe('Trace timestamp (ISO 8601); defaults to now() when omitted'),
    };
  • Central registration point that calls registerLogTraceTool(server, storage) among all other tools.
    export function registerAllTools(
      server: McpServer,
      storage: IStorageAdapter,
      evalEngine: EvalEngine,
      customRuleStore: CustomRuleStore,
    ): void {
      registerLogTraceTool(server, storage);
      registerEvaluateOutputTool(server, storage, evalEngine);
      registerGetTracesTool(server, storage);
      registerListRulesTool(server, customRuleStore);
      registerDeployRuleTool(server, customRuleStore);
      registerDeleteRuleTool(server, customRuleStore);
      registerDeleteTraceTool(server, storage);
      registerEvaluateWithLLMJudgeTool(server, storage);
      registerVerifyCitationsTool(server, storage);
    }
  • Helper functions generateTraceId (16 random bytes → 32-char hex) and generateSpanId (8 random bytes → 16-char hex) used in the log_trace handler.
    export function generateTraceId(): string {
      return randomBytes(16).toString('hex');
    }
    
    export function generateSpanId(): string {
      return randomBytes(8).toString('hex');
    }
    
    export function generateEvalId(): string {
      return randomUUID();
    }
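Run standalone, the same node:crypto calls show the ID widths these helpers produce:

```typescript
import { randomBytes } from 'node:crypto';

// Same construction as generateTraceId / generateSpanId above:
// 16 random bytes -> 32 hex chars, 8 random bytes -> 16 hex chars.
const traceId = randomBytes(16).toString('hex');
const spanId = randomBytes(8).toString('hex');

console.log(traceId.length); // 32
console.log(spanId.length); // 16
```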
  • Fire-and-forget helper that lazily initializes an OTel exporter from env vars and exports the trace asynchronously; errors are swallowed (logged via callback) and never affect the tool response.
    export function bestEffortExport(
      trace: Trace,
      onError?: (err: Error) => void,
    ): void {
      const exporter = getLazyExporter();
      if (!exporter) return;
    
      exporter
        .exportTraces([trace])
        .then((result) => {
          if (!result.ok) {
            const msg = `OTel export failed: status=${result.status} ${result.error ?? ''}`.trim();
            onError?.(new Error(msg));
          }
        })
        .catch((err: Error) => onError?.(err));
    }
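The env gate implied by getLazyExporter can be sketched as follows. getLazyExporter's real internals are not shown above, so this is only an assumption about its IRIS_OTEL_ENDPOINT check; the function name and return shape here are hypothetical:

```typescript
// Hypothetical sketch of the env gate inside getLazyExporter: when
// IRIS_OTEL_ENDPOINT is unset, bestEffortExport becomes a no-op.
function getLazyExporterSketch(): { endpoint: string } | null {
  const endpoint = process.env.IRIS_OTEL_ENDPOINT;
  if (!endpoint) return null; // no collector configured -> skip export entirely
  return { endpoint }; // real code would build an OTLP/HTTP exporter here
}
```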
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses storage backend, optional OTel export, rate limits, idempotency (not idempotent), typical latency, and authentication modes. These behavioral details go beyond the structured annotations, which only indicate the write path and non-destructive nature.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but well-structured with clear sections (purpose, siblings, behavior, output, usage, parameters, errors). Every sentence adds value, though some reorganization could reduce length slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all essential aspects: behavior, output shape, error modes, inter-tool relationships. No output schema exists, but the description fully explains the return values. Complex parameter interactions are documented thoroughly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Adds significant meaning: explains the defaults for spans (kind, status, timestamp), the authoritative behavior of token_usage/cost_usd versus spans, the interpretation of a missing latency_ms, and metadata queryability limits. Schema coverage is 100%, and the description enriches it with context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Persist a single agent execution trace' and explicitly contrasts with siblings: 'log_trace is the WRITE path that records executions; everything else reads, scores, or manages around it.' The purpose is specific and distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage context: 'Use when you want to record an agent execution for later evaluation' and complementary tools to call afterward. Also states clear exclusions: 'Don't use when you only need a transient log (use console logging). Don't use to update an existing trace.'

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
