Log Trace
log_trace
Persist agent execution traces with input, output, spans, tool calls, cost, and latency for later evaluation and analysis.
Instructions
Persist a single agent execution trace (input, output, spans, tool calls, cost, latency, token usage).
Sibling tools — evaluate_output runs heuristic scoring on the trace; evaluate_with_llm_judge runs semantic LLM-based scoring; verify_citations checks citation grounding; get_traces queries stored traces; delete_trace removes a single trace; list_rules / deploy_rule / delete_rule manage custom evaluation rules. log_trace is the WRITE path that records executions; everything else reads, scores, or manages around it.
Behavior. Writes one row to Iris storage (SQLite by default; Postgres in Cloud tier). When IRIS_OTEL_ENDPOINT is set, ALSO fires a best-effort async export to the configured OTLP/HTTP collector (Jaeger, Tempo, Datadog OTLP, OTEL Collector). The OTel export is fire-and-forget — its success does not affect the tool response; failures are logged but the trace is still stored locally. No authentication in stdio mode; HTTP mode requires Bearer token. Rate-limited to 20 req/min on HTTP MCP, unlimited on stdio. Not idempotent: each call mints a fresh trace_id, so resubmitting the same payload creates a duplicate trace.
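Callers on the HTTP transport may want to stay under the 20 req/min limit on their own side rather than discover it through rejected requests. The following is a minimal sketch of a client-side sliding-window guard. `SlidingWindowLimiter` and `tryAcquire` are hypothetical names, not part of Iris, and the server's own windowing algorithm is not documented here, so treat the sliding window as an assumption.

```typescript
// Hypothetical client-side guard; not part of Iris itself.
class SlidingWindowLimiter {
  private readonly stamps: number[] = [];
  private readonly maxCalls: number;
  private readonly windowMs: number;

  constructor(maxCalls: number, windowMs: number) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  // Records the call and returns true if it fits in the window; otherwise returns false.
  tryAcquire(nowMs: number): boolean {
    // Drop timestamps that have aged out of the window.
    for (;;) {
      const oldest = this.stamps[0];
      if (oldest === undefined || nowMs - oldest < this.windowMs) break;
      this.stamps.shift();
    }
    if (this.stamps.length >= this.maxCalls) return false;
    this.stamps.push(nowMs);
    return true;
  }
}

// 20 calls per 60 seconds, matching the documented HTTP limit.
const httpLimiter = new SlidingWindowLimiter(20, 60_000);
const allowed = httpLimiter.tryAcquire(Date.now());
console.log("may send log_trace now:", allowed);
```

Because log_trace is not idempotent, a call refused by the guard is better delayed than sent and retried, since a retry of a request that actually succeeded would store a duplicate trace.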
Output shape. Returns a JSON string: { "trace_id": "<32-hex>", "status": "stored" }. The trace_id is the key you pass to evaluate_output or get_traces afterwards.
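Since the result arrives as a JSON string rather than a structured object, a caller has to parse it before using the trace_id. Here is a small sketch of that step, assuming the shape documented above. `parseLogTraceResult` and `LogTraceResult` are illustrative names, and the lowercase-hex check is an assumption based on the ID helper shown in the implementation reference.

```typescript
interface LogTraceResult {
  trace_id: string;
  status: "stored";
}

// Parses the JSON text returned by log_trace and checks the documented shape.
function parseLogTraceResult(text: string): LogTraceResult {
  const parsed: unknown = JSON.parse(text);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("log_trace result is not a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  const traceId = record["trace_id"];
  if (typeof traceId !== "string" || !/^[0-9a-f]{32}$/.test(traceId)) {
    throw new Error("trace_id is not a 32-character hex string");
  }
  if (record["status"] !== "stored") {
    throw new Error(`unexpected status: ${String(record["status"])}`);
  }
  return { trace_id: traceId, status: "stored" };
}

const sample = '{"trace_id":"0123456789abcdef0123456789abcdef","status":"stored"}';
console.log(parseLogTraceResult(sample).trace_id);
```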
Use when you want to record an agent execution for later evaluation, analysis, or audit. Call it AFTER the agent has produced output; call evaluate_output afterwards to score it; call get_traces to query historical traces. Store rich context: spans (span tree), tool_calls (which tools were invoked with latency/errors), token_usage, cost_usd, metadata (arbitrary key-value). All optional except agent_name.
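To make the "call it after the agent has produced output" guidance concrete, the sketch below assembles a log_trace arguments object once a run has finished. It is limited to top-level scalar fields and metadata. The nested shapes of tool_calls, spans, and token_usage are not spelled out in this document, so they are left out rather than guessed. `LogTraceArgs` and `buildLogTraceArgs` are hypothetical names.

```typescript
// Top-level log_trace arguments, restricted to fields whose types this document states.
interface LogTraceArgs {
  agent_name: string;
  framework?: string;
  input?: string;
  output?: string;
  latency_ms?: number;
  cost_usd?: number;
  metadata?: Record<string, unknown>;
}

// Builds the arguments after the agent has finished, deriving end-to-end latency
// from caller-supplied start and finish times (milliseconds since the epoch).
function buildLogTraceArgs(
  agentName: string,
  input: string,
  output: string,
  startedAtMs: number,
  finishedAtMs: number,
  metadata?: Record<string, unknown>,
): LogTraceArgs {
  const args: LogTraceArgs = {
    agent_name: agentName,
    input,
    output,
    latency_ms: finishedAtMs - startedAtMs,
  };
  // Only attach metadata when the caller supplied some.
  if (metadata !== undefined) args.metadata = metadata;
  return args;
}

const exampleArgs = buildLogTraceArgs(
  "customer-support-bot",
  "Where is my order?",
  "Your order shipped yesterday.",
  1_700_000_000_000,
  1_700_000_000_420,
  { env: "staging" },
);
console.log(JSON.stringify(exampleArgs, null, 2));
```

The timestamp field is deliberately omitted so that the documented default of now() applies.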
Don't use when you only need a transient log (use console logging). Don't use to update an existing trace — there is no update path in v0.4 (traces are immutable once stored).
Parameters. agent_name is required; everything else is optional. token_usage and cost_usd are summary fields: if you also pass spans with per-tool-call costs, the summary fields are treated as authoritative (no auto-aggregation). Spans without an explicit start_time fall back to the trace timestamp; spans with an end_time get a derived duration_ms. metadata is opaque key-value data (queryable in the dashboard, not via get_traces filters). tool_calls record per-tool latency and errors; a missing latency_ms means "not reported," not "zero." Defaults: span.kind="INTERNAL", span.status_code="UNSET", timestamp=now() if omitted.
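The span fallback and default rules above can be sketched as a small normalization function. This is an illustration of the stated rules only: the handler excerpt in this document does not show where Iris applies them, and the `name` field, the `SpanInput` and `NormalizedSpan` types, and `normalizeSpan` itself are assumptions introduced for the example.

```typescript
// Assumed minimal span shape; only the fields named in the rules above plus a label.
interface SpanInput {
  name: string;
  start_time?: string; // ISO 8601
  end_time?: string; // ISO 8601
  kind?: string;
  status_code?: string;
}

interface NormalizedSpan {
  name: string;
  start_time: string;
  end_time?: string;
  kind: string;
  status_code: string;
  duration_ms?: number;
}

// Applies the documented defaults: start_time falls back to the trace timestamp,
// kind defaults to "INTERNAL", status_code to "UNSET", and duration_ms is derived
// only when an end_time is present.
function normalizeSpan(span: SpanInput, traceTimestamp: string): NormalizedSpan {
  const startTime = span.start_time ?? traceTimestamp;
  const normalized: NormalizedSpan = {
    name: span.name,
    start_time: startTime,
    kind: span.kind ?? "INTERNAL",
    status_code: span.status_code ?? "UNSET",
  };
  if (span.end_time !== undefined) {
    normalized.end_time = span.end_time;
    normalized.duration_ms = Date.parse(span.end_time) - Date.parse(startTime);
  }
  return normalized;
}

const normalizedExample = normalizeSpan(
  { name: "llm-call", end_time: "2024-01-01T00:00:01.500Z" },
  "2024-01-01T00:00:00.000Z",
);
console.log(normalizedExample);
```

A span with no end_time keeps duration_ms undefined, which mirrors the "not reported, not zero" convention used for tool-call latency.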
Error modes. Throws on missing agent_name. Throws on malformed span or tool_call objects (Zod rejects). Returns 500 on storage failure (disk full, DB locked). Never blocks on the agent — returns within ~50ms for typical payloads.
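Because a rejected call throws rather than returning a partial result, it can help to catch obvious shape problems before sending. The sketch below mirrors only the top-level types from the input schema (strings, numbers, arrays) in plain TypeScript so it runs without dependencies. `checkLogTraceArgs` is a hypothetical helper, it is looser than the server's Zod validation, and it does not inspect the contents of spans or tool_calls.

```typescript
// Returns human-readable problems; an empty list means the top-level fields look valid.
function checkLogTraceArgs(args: Record<string, unknown>): string[] {
  const problems: string[] = [];

  // agent_name is the only required field.
  if (typeof args["agent_name"] !== "string") {
    problems.push("agent_name is required and must be a string");
  }
  for (const key of ["framework", "input", "output", "timestamp"]) {
    const value = args[key];
    if (value !== undefined && typeof value !== "string") {
      problems.push(`${key} must be a string when provided`);
    }
  }
  for (const key of ["latency_ms", "cost_usd"]) {
    const value = args[key];
    if (value !== undefined && typeof value !== "number") {
      problems.push(`${key} must be a number when provided`);
    }
  }
  for (const key of ["tool_calls", "spans"]) {
    const value = args[key];
    if (value !== undefined && !Array.isArray(value)) {
      problems.push(`${key} must be an array when provided`);
    }
  }
  return problems;
}

console.log(checkLogTraceArgs({ framework: "langchain", latency_ms: "120" }));
```

A pre-flight check like this only reduces avoidable validation errors. Storage failures still have to be handled at the call site, and since log_trace is not idempotent, an automatic retry after an ambiguous failure risks a duplicate trace.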
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| agent_name | Yes | Agent name — used for filtering in get_traces (e.g., "customer-support-bot") | |
| framework | No | Agent framework identifier (e.g., langchain, autogen, custom) | |
| input | No | Agent input text — the user prompt or upstream input that produced this output | |
| output | No | Agent output text — what the agent produced (pass to evaluate_output for scoring) | |
| tool_calls | No | Tool calls made during execution (per-call latency, errors, input/output) | |
| latency_ms | No | Total execution time in milliseconds (end-to-end agent latency) | |
| token_usage | No | Token usage breakdown (prompt/completion/total — used for cost analysis) | |
| cost_usd | No | Total cost in USD — overrides per-span aggregation when provided (treated as authoritative) | |
| metadata | No | Opaque key-value tags (e.g. {requestId, userId, env}) — queryable in dashboard, not via get_traces filters | |
| spans | No | Detailed execution spans (hierarchical span tree with timings, attributes, events) | |
| timestamp | No | Trace timestamp (ISO 8601); defaults to now() when omitted | |
Implementation Reference
- src/tools/log-trace.ts:53-127 (handler) Tool handler that generates a trace ID, constructs a Trace object from validated inputs, inserts it into storage via IStorageAdapter.insertTrace, fires a best-effort async OTel export, and returns { trace_id, status: 'stored' } as JSON text.

```typescript
export function registerLogTraceTool(server: McpServer, storage: IStorageAdapter): void {
  server.registerTool(
    'log_trace',
    {
      title: 'Log Trace',
      description: [
        'Persist a single agent execution trace (input, output, spans, tool calls, cost, latency, token usage).',
        '',
        'Sibling tools — evaluate_output runs heuristic scoring on the trace; evaluate_with_llm_judge runs semantic LLM-based scoring; verify_citations checks citation grounding; get_traces queries stored traces; delete_trace removes a single trace; list_rules / deploy_rule / delete_rule manage custom evaluation rules. log_trace is the WRITE path that records executions; everything else reads, scores, or manages around it.',
        '',
        'Behavior. Writes one row to Iris storage (SQLite by default; Postgres in Cloud tier). When IRIS_OTEL_ENDPOINT is set, ALSO fires a best-effort async export to the configured OTLP/HTTP collector (Jaeger, Tempo, Datadog OTLP, OTEL Collector). The OTel export is fire-and-forget — its success does not affect the tool response; failures are logged but the trace is still stored locally. No authentication in stdio mode; HTTP mode requires Bearer token. Rate-limited to 20 req/min on HTTP MCP, unlimited on stdio. Not idempotent: each call mints a fresh trace_id, so resubmitting the same payload creates a duplicate trace.',
        '',
        'Output shape. Returns a JSON string: `{ "trace_id": "<32-hex>", "status": "stored" }`. The trace_id is the key you pass to evaluate_output or get_traces afterwards.',
        '',
        'Use when you want to record an agent execution for later evaluation, analysis, or audit. Call it AFTER the agent has produced output; call evaluate_output afterwards to score it; call get_traces to query historical traces. Store rich context: spans (span tree), tool_calls (which tools were invoked with latency/errors), token_usage, cost_usd, metadata (arbitrary key-value). All optional except agent_name.',
        '',
        'Don\'t use when you only need a transient log (use console logging). Don\'t use to update an existing trace — there is no update path in v0.4 (traces are immutable once stored).',
        '',
        'Parameters. agent_name is required; everything else is optional. token_usage and cost_usd are summary fields — if you ALSO pass spans with per-tool-call costs, the summary fields are treated as authoritative (no auto-aggregation). spans without an explicit start_time fall back to the trace timestamp; spans with an end_time get a duration_ms derived. metadata is opaque key-value (queryable in the dashboard, not via get_traces filters). tool_calls record per-tool latency + errors; missing latency_ms means "not reported," not "zero." Defaults: span.kind="INTERNAL", span.status_code="UNSET", timestamp=now() if omitted.',
        '',
        'Error modes. Throws on missing agent_name. Throws on malformed span or tool_call objects (Zod rejects). Returns 500 on storage failure (disk full, DB locked). Never blocks on the agent — returns within ~50ms for typical payloads.',
      ].join('\n'),
      inputSchema,
      annotations: {
        readOnlyHint: false, // Writes a row to storage
        destructiveHint: false, // Creates new data; doesn't overwrite or delete
        idempotentHint: false, // Each call mints a fresh trace_id; duplicate payloads produce distinct traces
        openWorldHint: false, // Local storage first. When IRIS_OTEL_ENDPOINT is set a best-effort async OTel export runs but is non-blocking (tool succeeds even if export fails).
      },
    },
    async (args) => {
      const traceId = generateTraceId();
      const timestamp = args.timestamp ?? new Date().toISOString();
      const trace = {
        trace_id: traceId,
        agent_name: args.agent_name,
        framework: args.framework,
        input: args.input,
        output: args.output,
        tool_calls: args.tool_calls,
        latency_ms: args.latency_ms,
        token_usage: args.token_usage,
        cost_usd: args.cost_usd,
        metadata: args.metadata as Record<string, unknown> | undefined,
        timestamp,
        spans: args.spans?.map((s) => ({
          ...s,
          span_id: s.span_id ?? generateSpanId(),
          trace_id: traceId,
        })),
      };

      await storage.insertTrace(LOCAL_TENANT, trace);

      // Best-effort async OTel export (fire-and-forget). No-op when
      // IRIS_OTEL_ENDPOINT isn't configured. Errors are logged via the
      // server logger but never affect the tool response — if the OTel
      // collector is down we still want to store traces locally.
      bestEffortExport(trace, (err) => {
        // eslint-disable-next-line no-console
        console.warn(`[iris.otel] ${err.message}`);
      });

      return {
        content: [
          {
            type: 'text' as const,
            text: JSON.stringify({ trace_id: traceId, status: 'stored' }),
          },
        ],
      };
    },
  );
}
```

- src/tools/log-trace.ts:39-51 (schema) Zod input schema defining the 11 parameters accepted by log_trace. Only agent_name is required; all others are optional. Includes nested schemas for ToolCallSchema, SpanSchema, and TokenUsageSchema.
```typescript
const inputSchema = {
  agent_name: z.string().describe('Agent name — used for filtering in get_traces (e.g., "customer-support-bot")'),
  framework: z.string().optional().describe('Agent framework identifier (e.g., langchain, autogen, custom)'),
  input: z.string().optional().describe('Agent input text — the user prompt or upstream input that produced this output'),
  output: z.string().optional().describe('Agent output text — what the agent produced (pass to evaluate_output for scoring)'),
  tool_calls: z.array(ToolCallSchema).optional().describe('Tool calls made during execution (per-call latency, errors, input/output)'),
  latency_ms: z.number().optional().describe('Total execution time in milliseconds (end-to-end agent latency)'),
  token_usage: TokenUsageSchema.optional().describe('Token usage breakdown (prompt/completion/total — used for cost analysis)'),
  cost_usd: z.number().optional().describe('Total cost in USD — overrides per-span aggregation when provided (treated as authoritative)'),
  metadata: z.record(z.unknown()).optional().describe('Opaque key-value tags (e.g. {requestId, userId, env}) — queryable in dashboard, not via get_traces filters'),
  spans: z.array(SpanSchema).optional().describe('Detailed execution spans (hierarchical span tree with timings, attributes, events)'),
  timestamp: z.string().optional().describe('Trace timestamp (ISO 8601); defaults to now() when omitted'),
};
```

- src/tools/index.ts:15-30 (registration) Central registration point that calls registerLogTraceTool(server, storage) among all other tools.
```typescript
export function registerAllTools(
  server: McpServer,
  storage: IStorageAdapter,
  evalEngine: EvalEngine,
  customRuleStore: CustomRuleStore,
): void {
  registerLogTraceTool(server, storage);
  registerEvaluateOutputTool(server, storage, evalEngine);
  registerGetTracesTool(server, storage);
  registerListRulesTool(server, customRuleStore);
  registerDeployRuleTool(server, customRuleStore);
  registerDeleteRuleTool(server, customRuleStore);
  registerDeleteTraceTool(server, storage);
  registerEvaluateWithLLMJudgeTool(server, storage);
  registerVerifyCitationsTool(server, storage);
}
```

- src/utils/ids.ts:3-13 (helper) Helper functions generateTraceId (16 random bytes → 32-char hex) and generateSpanId (8 random bytes → 16-char hex) used in the log_trace handler.
```typescript
export function generateTraceId(): string {
  return randomBytes(16).toString('hex');
}

export function generateSpanId(): string {
  return randomBytes(8).toString('hex');
}

export function generateEvalId(): string {
  return randomUUID();
}
```

- src/otel/lazy.ts:24-40 (helper) Fire-and-forget helper that lazily initializes an OTel exporter from env vars and exports the trace asynchronously; errors are swallowed (logged via callback) and never affect the tool response.
```typescript
export function bestEffortExport(
  trace: Trace,
  onError?: (err: Error) => void,
): void {
  const exporter = getLazyExporter();
  if (!exporter) return;
  exporter
    .exportTraces([trace])
    .then((result) => {
      if (!result.ok) {
        const msg = `OTel export failed: status=${result.status} ${result.error ?? ''}`.trim();
        onError?.(new Error(msg));
      }
    })
    .catch((err: Error) => onError?.(err));
}
```