Glama

Log Trace

log_trace

Persist agent execution traces (input, output, spans, tool calls, cost, latency, token usage) for later evaluation and analysis. Call after agent output to enable scoring and auditing.

Instructions

Persist a single agent execution trace (input, output, spans, tool calls, cost, latency, token usage).

Sibling tools — evaluate_output runs heuristic scoring on the trace; evaluate_with_llm_judge runs semantic LLM-based scoring; verify_citations checks citation grounding; get_traces queries stored traces; delete_trace removes a single trace; list_rules / deploy_rule / delete_rule manage custom evaluation rules. log_trace is the WRITE path that records executions; everything else reads, scores, or manages around it.

Behavior. Writes one row to Iris storage (SQLite by default; Postgres in Cloud tier). When IRIS_OTEL_ENDPOINT is set, ALSO fires a best-effort async export to the configured OTLP/HTTP collector (Jaeger, Tempo, Datadog OTLP, OTEL Collector). The OTel export is fire-and-forget — its success does not affect the tool response; failures are logged but the trace is still stored locally. No authentication in stdio mode; HTTP mode requires Bearer token. Rate-limited to 20 req/min on HTTP MCP, unlimited on stdio. Not idempotent: each call mints a fresh trace_id, so resubmitting the same payload creates a duplicate trace.
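Because each call mints a fresh trace_id, a blind retry of the same payload stores a second trace. The sketch below is one possible client-side guard against that, not part of the tool itself. The names `payload_fingerprint` and `DuplicateGuard` are hypothetical, and the guard only remembers payloads within a single process.

```python
import hashlib
import json


def payload_fingerprint(payload: dict) -> str:
    """Stable SHA-256 fingerprint of a trace payload (key order does not matter)."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DuplicateGuard:
    """Hypothetical helper: remembers payloads that were already stored."""

    def __init__(self) -> None:
        self._sent: set[str] = set()

    def already_sent(self, payload: dict) -> bool:
        """True if an identical payload was previously marked as stored."""
        return payload_fingerprint(payload) in self._sent

    def mark_sent(self, payload: dict) -> None:
        """Record a payload once log_trace has confirmed it was stored."""
        self._sent.add(payload_fingerprint(payload))


guard = DuplicateGuard()
trace = {"agent_name": "customer-support-bot", "output": "Refund issued."}

before = guard.already_sent(trace)       # nothing recorded yet
guard.mark_sent(trace)                   # do this only after a "stored" response
after = guard.already_sent(dict(trace))  # an equal copy is recognised as a repeat
```

Marking a payload only after a confirmed response keeps a failed call retryable, at the cost of a possible duplicate if the response is lost after the row was written.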

Output shape. Returns a JSON string: { "trace_id": "<32-hex>", "status": "stored" }. The trace_id is the key you pass to evaluate_output or get_traces afterwards.
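Since the result arrives as a JSON string rather than a structured object, a caller has to parse it before passing the trace_id on. A minimal sketch, assuming the shape quoted above; the function name `parse_log_trace_response` is hypothetical, and accepting both upper- and lower-case hex is an assumption because the description only says "32-hex".

```python
import json
import re

# 32 hexadecimal characters; letter case is not specified in the description.
TRACE_ID_PATTERN = re.compile(r"[0-9a-fA-F]{32}")


def parse_log_trace_response(raw: str) -> str:
    """Return the trace_id from a log_trace response, or raise ValueError."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if data.get("status") != "stored":
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    trace_id = data.get("trace_id")
    if not isinstance(trace_id, str) or not TRACE_ID_PATTERN.fullmatch(trace_id):
        raise ValueError("trace_id is not a 32-character hex string")
    return trace_id


sample = '{"trace_id": "0123456789abcdef0123456789abcdef", "status": "stored"}'
trace_id = parse_log_trace_response(sample)
```

The returned string is what would then be passed to evaluate_output or get_traces.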

Use when you want to record an agent execution for later evaluation, analysis, or audit. Call it AFTER the agent has produced output, then call evaluate_output to score it or get_traces to query historical traces. Store rich context where available: spans (span tree), tool_calls (which tools were invoked, with latency and errors), token_usage, cost_usd, and metadata (arbitrary key-value pairs). Every field is optional except agent_name.

Don't use when you only need a transient log (use console logging). Don't use to update an existing trace — there is no update path in v0.4 (traces are immutable once stored).

Parameters. agent_name is required; everything else is optional. token_usage and cost_usd are summary fields: if you ALSO pass spans with per-tool-call costs, the summary fields are treated as authoritative (no auto-aggregation). Spans without an explicit start_time fall back to the trace timestamp; spans with an end_time have duration_ms derived automatically. metadata is an opaque key-value map (queryable in the dashboard, not via get_traces filters). tool_calls record per-tool latency and errors; a missing latency_ms means "not reported," not "zero." Defaults: span.kind="INTERNAL", span.status_code="UNSET", timestamp=now() if omitted.
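The span defaults described above can be illustrated with a small client-side model. This is a sketch of the documented rules, not the server's implementation: `normalize_span` is a hypothetical name, the span "name" field is assumed, and computing duration_ms as end_time minus start_time is an assumption about how the value is derived.

```python
from datetime import datetime


def normalize_span(span: dict, trace_timestamp: str) -> dict:
    """Apply the documented span defaults to a copy of `span`."""
    out = dict(span)
    out.setdefault("kind", "INTERNAL")
    out.setdefault("status_code", "UNSET")
    # A span without start_time falls back to the trace timestamp.
    out.setdefault("start_time", trace_timestamp)
    if "end_time" in out:
        # Both values must be ISO 8601 strings with the same offset style,
        # otherwise the subtraction below raises TypeError.
        start = datetime.fromisoformat(out["start_time"])
        end = datetime.fromisoformat(out["end_time"])
        out["duration_ms"] = round((end - start).total_seconds() * 1000)
    return out


trace_ts = "2024-05-01T12:00:00+00:00"
span = {"name": "retrieve", "end_time": "2024-05-01T12:00:01.250000+00:00"}
normalized = normalize_span(span, trace_ts)
# normalized["duration_ms"] is 1250; kind and status_code carry the defaults.
```

A span with no end_time keeps no duration_ms at all, which matches the "not reported" rather than "zero" convention used for tool_calls latency.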

Error modes. Throws on missing agent_name. Throws on malformed span or tool_call objects (Zod rejects). Returns 500 on storage failure (disk full, DB locked). Never blocks on the agent — returns within ~50ms for typical payloads.
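A caller can catch the two validation failures before making the call. The check below is a rough client-side approximation and does not reproduce the server's Zod schema: `preflight_check` is a hypothetical name, and treating spans and tool_calls as lists of objects, and a blank agent_name as missing, are assumptions.

```python
def preflight_check(payload: dict) -> list[str]:
    """Return a list of problems likely to make log_trace throw (empty if none)."""
    problems: list[str] = []

    name = payload.get("agent_name")
    if not isinstance(name, str) or not name.strip():
        problems.append("agent_name is required")

    # Assumed shape: each of these optional fields is a list of objects.
    for field in ("spans", "tool_calls"):
        value = payload.get(field)
        if value is None:
            continue
        if not isinstance(value, list) or not all(isinstance(v, dict) for v in value):
            problems.append(f"{field} must be a list of objects")

    return problems


ok = preflight_check({"agent_name": "customer-support-bot", "spans": []})
bad = preflight_check({"spans": "not-a-list"})
```

Storage failures (the 500 case) cannot be predicted client-side and still need handling around the call itself.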

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| agent_name | Yes | Agent name; used for filtering in get_traces (e.g., "customer-support-bot") | |
| framework | No | Agent framework identifier (e.g., langchain, autogen, custom) | |
| input | No | Agent input text: the user prompt or upstream input that produced this output | |
| output | No | Agent output text: what the agent produced (pass to evaluate_output for scoring) | |
| tool_calls | No | Tool calls made during execution (per-call latency, errors, input/output) | |
| latency_ms | No | Total execution time in milliseconds (end-to-end agent latency) | |
| token_usage | No | Token usage breakdown (prompt/completion/total; used for cost analysis) | |
| cost_usd | No | Total cost in USD; overrides per-span aggregation when provided (treated as authoritative) | |
| metadata | No | Opaque key-value tags (e.g. {requestId, userId, env}); queryable in dashboard, not via get_traces filters | |
| spans | No | Detailed execution spans (hierarchical span tree with timings, attributes, events) | |
| timestamp | No | Trace timestamp (ISO 8601); defaults to now() when omitted | |
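For orientation, here is what a fully populated set of arguments might look like. The eleven top-level names come from the schema above; everything nested inside tool_calls, token_usage, and spans (for example "tool_name" or "prompt_tokens") is a guess at plausible key names, and all values are made up for illustration.

```python
import json

payload = {
    "agent_name": "customer-support-bot",  # the only required field
    "framework": "langchain",
    "input": "Where is my order?",
    "output": "Your order shipped yesterday.",
    # Nested keys below are assumptions, except latency_ms on tool calls.
    "tool_calls": [{"tool_name": "order_lookup", "latency_ms": 120}],
    "latency_ms": 850,
    "token_usage": {"prompt_tokens": 210, "completion_tokens": 35, "total_tokens": 245},
    "cost_usd": 0.0004,  # summary value, treated as authoritative
    "metadata": {"requestId": "req-123", "env": "staging"},
    "spans": [
        {
            "name": "order_lookup",
            "start_time": "2024-05-01T12:00:00+00:00",
            "end_time": "2024-05-01T12:00:00.120000+00:00",
        }
    ],
    "timestamp": "2024-05-01T12:00:00+00:00",
}

# Serialise the arguments for whichever MCP client sends the tool call.
arguments_json = json.dumps(payload, indent=2)
print(arguments_json)
```

A minimal call needs only agent_name; the remaining fields can be added as the agent starts reporting them.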
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations only indicate non-readOnly, non-destructive, non-idempotent. The description adds rich behavioral context: storage details (SQLite/Postgres), OTel export behavior, authentication, rate limits, idempotency (not idempotent, creates duplicate), and error modes. No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but well-structured, starting with purpose, then sibling context, behavior, output, usage, parameter details, and errors. Every sentence adds value, though it could be more concise. Given tool complexity, the length is acceptable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Although there is no output schema, the description provides the output shape and error modes. It covers all important aspects: storage, OTel export, defaults, rate limits, authentication, and sequencing. The description is complete for an 11-parameter tool with nested objects.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 100% coverage, but the description adds crucial semantics beyond it: the summary fields token_usage and cost_usd are authoritative over spans, span kind and status have defaults, metadata is not filterable in get_traces, and a tool_calls entry with no latency_ms means 'not reported'. This significantly enhances parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Persist a single agent execution trace' and distinguishes it from siblings by noting it is the WRITE path while others read, score, or manage. This provides a specific verb+resource and differentiates from related tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use ('record an agent execution for later evaluation') and when not to use ('transient log, update existing trace'). It also provides ordering guidance (call after agent output, then evaluate_output). This offers clear context and alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/iris-eval/mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.