Glama

Log Trace

log_trace

Record an agent execution trace with spans, tool calls, token usage, and cost for later evaluation and analysis.

Instructions

Persist a single agent execution trace (input, output, spans, tool calls, cost, latency, token usage).

Sibling tools — evaluate_output runs heuristic scoring on the trace; evaluate_with_llm_judge runs semantic LLM-based scoring; verify_citations checks citation grounding; get_traces queries stored traces; delete_trace removes a single trace; list_rules / deploy_rule / delete_rule manage custom evaluation rules. log_trace is the WRITE path that records executions; everything else reads, scores, or manages around it.

Behavior. Writes one row to Iris storage (SQLite by default; Postgres in Cloud tier). When IRIS_OTEL_ENDPOINT is set, ALSO fires a best-effort async export to the configured OTLP/HTTP collector (Jaeger, Tempo, Datadog OTLP, OTEL Collector). The OTel export is fire-and-forget — its success does not affect the tool response; failures are logged but the trace is still stored locally. No authentication in stdio mode; HTTP mode requires Bearer token. Rate-limited to 20 req/min on HTTP MCP, unlimited on stdio. Not idempotent: each call mints a fresh trace_id, so resubmitting the same payload creates a duplicate trace.
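Because each call mints a fresh trace_id, a client that retries after a timeout can end up with duplicate traces. The sketch below shows one possible client-side guard: it fingerprints the payload and skips a resend when the same payload has already been logged. The `DedupGuard` class and the `send` callable (any function that invokes log_trace and returns the trace_id) are hypothetical helpers for illustration, not part of Iris.

```python
import hashlib
import json


class DedupGuard:
    """Remembers payload fingerprints so a retry does not log the same trace twice.

    Hypothetical client-side helper; the server itself does no deduplication.
    """

    def __init__(self):
        # fingerprint -> trace_id returned by the first successful call
        self._seen = {}

    @staticmethod
    def fingerprint(payload: dict) -> str:
        # Canonical JSON (sorted keys, fixed separators) so key order
        # does not change the hash.
        canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

    def log_once(self, payload: dict, send) -> str:
        """Call send(payload) only if this exact payload was not sent before."""
        key = self.fingerprint(payload)
        if key not in self._seen:
            self._seen[key] = send(payload)
        return self._seen[key]
```

Note that two genuinely separate executions with byte-identical payloads would also be collapsed by this guard; including an explicit timestamp or a request identifier in metadata keeps them distinct.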

Output shape. Returns a JSON string: { "trace_id": "<32-hex>", "status": "stored" }. The trace_id is the key you pass to evaluate_output or get_traces afterwards.
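Since the result arrives as a JSON string rather than a structured object, the caller has to decode it before using the trace_id. A minimal sketch of that step follows; the function name `parse_log_trace_result` is hypothetical, and accepting both lowercase and uppercase hex digits is an assumption, since the description only says "32-hex".

```python
import json
import re

# Assumption: 32 hexadecimal characters, either case.
_TRACE_ID = re.compile(r"[0-9a-fA-F]{32}")


def parse_log_trace_result(raw: str) -> str:
    """Decode the log_trace response string and return its trace_id."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if data.get("status") != "stored":
        raise ValueError(f"unexpected status: {data.get('status')!r}")
    trace_id = data.get("trace_id")
    if not isinstance(trace_id, str) or not _TRACE_ID.fullmatch(trace_id):
        raise ValueError("trace_id is not a 32-character hex string")
    return trace_id
```

The returned value is what would then be passed to evaluate_output or get_traces.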

Use when you want to record an agent execution for later evaluation, analysis, or audit. Call it AFTER the agent has produced output; call evaluate_output afterwards to score it; call get_traces to query historical traces. Store rich context: spans (span tree), tool_calls (which tools were invoked with latency/errors), token_usage, cost_usd, metadata (arbitrary key-value). All optional except agent_name.
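To make the "all optional except agent_name" rule concrete, here is a small sketch of a payload builder that enforces the one required field and drops unset optional fields. The top-level field names come from the input schema on this page; the helper `build_trace_payload` itself is hypothetical, and rejecting unknown keys is an assumption about what a careful client might do, not documented server behavior.

```python
# Top-level optional fields listed in the input schema.
_OPTIONAL_FIELDS = {
    "framework", "input", "output", "tool_calls", "latency_ms",
    "token_usage", "cost_usd", "metadata", "spans", "timestamp",
}


def build_trace_payload(agent_name, **optional):
    """Build log_trace arguments: agent_name is required, the rest optional."""
    if not isinstance(agent_name, str) or not agent_name.strip():
        raise ValueError("agent_name is required")
    unknown = set(optional) - _OPTIONAL_FIELDS
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    payload = {"agent_name": agent_name}
    # Omit fields left as None instead of sending explicit nulls.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

A call such as `build_trace_payload("customer-support-bot", input="hi", output="hello")` yields a dict ready to pass as the tool arguments.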

Don't use when you only need a transient log (use console logging). Don't use to update an existing trace — there is no update path in v0.4 (traces are immutable once stored).

Parameters. agent_name is required; everything else is optional. token_usage and cost_usd are summary fields: if you ALSO pass spans with per-tool-call costs, the summary fields are treated as authoritative (no auto-aggregation). Spans without an explicit start_time fall back to the trace timestamp; for spans that have an end_time, duration_ms is derived. metadata is opaque key-value data (queryable in the dashboard, not via get_traces filters). tool_calls record per-tool latency and errors; a missing latency_ms means "not reported," not "zero." Defaults: span.kind="INTERNAL", span.status_code="UNSET", timestamp=now() if omitted.
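The span defaults described above can be illustrated locally. The sketch below applies the stated defaults to one span and derives duration_ms from start_time and end_time. It is only an illustration of the documented rules, not the server's implementation; `normalize_span` is a hypothetical name, and treating span times as ISO 8601 strings is an assumption (the page states that format only for the trace timestamp).

```python
from datetime import datetime


def _parse_iso(ts: str) -> datetime:
    # Older Python versions reject a trailing "Z", so map it to an offset.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def normalize_span(span: dict, trace_timestamp: str) -> dict:
    """Apply the documented span defaults to a copy of the span."""
    out = dict(span)
    out.setdefault("kind", "INTERNAL")
    out.setdefault("status_code", "UNSET")
    # A span without start_time falls back to the trace timestamp.
    out.setdefault("start_time", trace_timestamp)
    if "end_time" in out:
        delta = _parse_iso(out["end_time"]) - _parse_iso(out["start_time"])
        out["duration_ms"] = delta.total_seconds() * 1000.0
    return out
```

Spans with no end_time are returned without a duration_ms key, mirroring the "not reported" rather than "zero" convention used for latency_ms.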

Error modes. Throws on missing agent_name. Throws on malformed span or tool_call objects (Zod rejects). Returns 500 on storage failure (disk full, DB locked). Never blocks on the agent — returns within ~50ms for typical payloads.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| agent_name | Yes | Agent name, used for filtering in get_traces (e.g., "customer-support-bot") | |
| framework | No | Agent framework identifier (e.g., langchain, autogen, custom) | |
| input | No | Agent input text: the user prompt or upstream input that produced this output | |
| output | No | Agent output text: what the agent produced (pass to evaluate_output for scoring) | |
| tool_calls | No | Tool calls made during execution (per-call latency, errors, input/output) | |
| latency_ms | No | Total execution time in milliseconds (end-to-end agent latency) | |
| token_usage | No | Token usage breakdown (prompt/completion/total; used for cost analysis) | |
| cost_usd | No | Total cost in USD; overrides per-span aggregation when provided (treated as authoritative) | |
| metadata | No | Opaque key-value tags (e.g. {requestId, userId, env}); queryable in dashboard, not via get_traces filters | |
| spans | No | Detailed execution spans (hierarchical span tree with timings, attributes, events) | |
| timestamp | No | Trace timestamp (ISO 8601); defaults to now() when omitted | |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds numerous details beyond annotations: writes to Iris storage (SQLite/Postgres), optional OTel export (fire-and-forget), not idempotent (each call creates new trace_id), authentication modes (stdio vs HTTP), rate limits (20 req/min HTTP, unlimited stdio), and immutability. Annotations only provide hints; the description fully discloses behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured: purpose first, sibling context, behavior details, output shape, usage guidance, parameter details, error modes. Front-loaded with key info. Every sentence adds value despite length; no redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 11 parameters (1 required), nested objects (spans, tool_calls, token_usage), and no output schema, the description covers storage, OTel export, idempotency, auth, rate limits, output shape, error modes, and parameter nuances. No gaps remain.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (all 11 parameters described in schema). The description adds critical semantics: defaults for spans (kind, status_code, timestamp), cost_usd as authoritative over per-span aggregation, metadata not queryable via get_traces, tool_calls missing latency_ms means not reported. Error modes are also detailed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Persist a single agent execution trace (input, output, spans, tool calls, cost, latency, token usage).' It distinguishes log_trace from its sibling tools by positioning it as the WRITE path, while siblings such as evaluate_output and get_traces read or score traces. This provides strong differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Clear guidance: 'Use when you want to record an agent execution for later evaluation.' It instructs to call AFTER agent output, then use evaluate_output and get_traces. It also tells when NOT to use ('transient log' or 'update existing trace') and explains immutability. Alternative tools are explicitly named.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/iris-eval/mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.