
confidence_history

Shows the trend of confidence score and execution depth across recent runs, revealing whether verification confidence is improving, decaying, or oscillating.

Instructions

Returns the confidence-score and execution-depth trend across the last N Veris runs persisted in local state (.veris/state.db). Confidence math uses a 14-day half-life decay over real execution results — so this surface shows whether the project's verification confidence is improving, decaying, or oscillating over time. Useful for dashboards, weekly health checks, or detecting regressions in test coverage discipline. Defaults to last 20 runs; pass limit to widen.
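The description mentions a 14-day half-life decay but does not show the formula. As a rough illustration only, the sketch below shows how a half-life weighting is commonly applied to timestamped pass/fail results. The names `decayWeight`, `decayedConfidence`, and `AgedResult` are hypothetical and do not come from Veris; its actual confidence math may differ.

```typescript
// Hypothetical sketch: NOT taken from the Veris source.
// Assumes confidence is a decay-weighted pass rate with a 14-day half-life.
const HALF_LIFE_DAYS = 14;

/** Weight of a result by age: 1 at age 0, 0.5 at 14 days, 0.25 at 28 days. */
function decayWeight(ageDays: number, halfLifeDays: number = HALF_LIFE_DAYS): number {
    if (ageDays < 0) throw new RangeError("ageDays must be non-negative");
    return Math.pow(0.5, ageDays / halfLifeDays);
}

/** Illustrative shape for one execution result (an assumption). */
interface AgedResult {
    passed: boolean;
    ageDays: number;
}

/** Decay-weighted pass rate in [0, 1]; returns 0 when there are no results. */
function decayedConfidence(results: AgedResult[]): number {
    let weightedPasses = 0;
    let totalWeight = 0;
    for (const r of results) {
        const w = decayWeight(r.ageDays);
        totalWeight += w;
        if (r.passed) weightedPasses += w;
    }
    return totalWeight === 0 ? 0 : weightedPasses / totalWeight;
}
```

Under this assumption, a pass from today outweighs a failure from two weeks ago by a factor of two, so confidence drifts downward when recent runs stop producing fresh passing results.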

Input Schema

Name | Required | Description | Default
limit | No | Maximum number of runs to return. Defaults to 20. |
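The schema only declares `limit` as an optional number, so a caller or handler may want to guard against zero, negative, fractional, or non-numeric values before using it as a row count. The following is a minimal sketch of such a guard; `normalizeLimit` and `DEFAULT_LIMIT` are hypothetical names, not part of Veris, and the fallback of 20 simply mirrors the documented default.

```typescript
// Hypothetical helper: not part of the Veris source.
// Fallback mirrors the default of 20 stated in the schema description.
const DEFAULT_LIMIT = 20;

/** Returns a positive integer row count, or the fallback for unusable input. */
function normalizeLimit(raw: unknown, fallback: number = DEFAULT_LIMIT): number {
    // Reject non-numbers, NaN, and Infinity.
    if (typeof raw !== "number" || !Number.isFinite(raw)) return fallback;
    // Drop any fractional part; a row count must be a whole number.
    const n = Math.floor(raw);
    // Zero and negative limits fall back to the default.
    return n >= 1 ? n : fallback;
}

// Example inputs as they might arrive in tool arguments.
const widened = normalizeLimit(50);       // 50
const omitted = normalizeLimit(undefined); // 20
```

An explicit check like this behaves differently from a bare `||` fallback only in edge cases (for example, negative or fractional numbers), which is where silent surprises tend to come from.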

Implementation Reference

  • Handler function for the confidence_history tool. Delegates to VerisState.confidenceTrend(), defaulting to 30 results if no limit is provided (the tool description and input schema state a default of 20).
    private handleConfidenceHistory(args: any) {
        return this.text(this.state.confidenceTrend(args.limit || 30));
    }
  • Tool registration entry with description and input schema. Accepts optional 'limit' parameter (number).
    { name: "confidence_history",
      description: "Returns the confidence-score and execution-depth trend across the last N Veris runs persisted in local state (.veris/state.db). Confidence math uses a 14-day half-life decay over real execution results — so this surface shows whether the project's verification confidence is improving, decaying, or oscillating over time. Useful for dashboards, weekly health checks, or detecting regressions in test coverage discipline. Defaults to last 20 runs; pass `limit` to widen.",
      inputSchema: { type: "object", properties: { limit: { type: "number", description: "Maximum number of runs to return. Defaults to 20." } }, required: [] } },
  • Switch-case routing that dispatches 'confidence_history' tool requests to the handleConfidenceHistory method.
    case "confidence_history": return this.handleConfidenceHistory(args);
  • Core SQL query backing the confidence_history tool. Fetches the most recent N run records (ordered by timestamp descending), returning runId, timestamp, overallConfidence, and executionDepth.
    public confidenceTrend(limit = 30): ConfidenceTrendRow[] {
        if (!this.db) return [];
        return this.db.prepare(`
            SELECT run_id as runId, ts, overall_confidence as overallConfidence, execution_depth as executionDepth
            FROM runs ORDER BY ts DESC LIMIT ?
        `).all(limit) as ConfidenceTrendRow[];
    }
  • Type definition for each row returned by confidence_history: runId, timestamp, overall confidence score, and execution depth.
    export interface ConfidenceTrendRow {
        runId: string;
        ts: string;
        overallConfidence: number;
        executionDepth: number;
    }
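The rows above come back newest first, so a consumer that wants to label the trend as improving, decaying, or oscillating has to reverse them into chronological order before comparing consecutive scores. The sketch below is one deliberately simple heuristic for doing that; `classifyTrend`, the `Trend` type, and the `epsilon` threshold are assumptions for illustration and are not part of Veris. The interface is repeated so the block stands alone.

```typescript
// Repeated from the type definition above so this sketch is self-contained.
interface ConfidenceTrendRow {
    runId: string;
    ts: string;
    overallConfidence: number;
    executionDepth: number;
}

// Hypothetical labels and heuristic: not part of the Veris source.
type Trend = "improving" | "decaying" | "oscillating" | "flat";

/**
 * Labels the confidence trend of rows ordered newest first.
 * Changes smaller than `epsilon` between consecutive runs are ignored as noise.
 */
function classifyTrend(rowsNewestFirst: ConfidenceTrendRow[], epsilon: number = 0.01): Trend {
    // Reverse a copy so index 0 is the oldest run.
    const scores = [...rowsNewestFirst].reverse().map((r) => r.overallConfidence);
    let ups = 0;
    let downs = 0;
    for (let i = 1; i < scores.length; i++) {
        const delta = scores[i] - scores[i - 1];
        if (delta > epsilon) ups++;
        else if (delta < -epsilon) downs++;
    }
    if (ups === 0 && downs === 0) return "flat";
    // Meaningful moves in both directions are treated as oscillation.
    if (ups > 0 && downs > 0) return "oscillating";
    return ups > 0 ? "improving" : "decaying";
}
```

This heuristic labels a single dip in an otherwise rising series as oscillating; a dashboard that needs more nuance might compare the counts of rises and falls or fit a slope instead.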
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully explains the read-only operation, the confidence math with 14-day half-life, and reliance on local state. No side effects are implied, and the behavior is transparent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: four sentences that front-load the main action and add important context efficiently, with no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple nature of the tool and no output schema, the description sufficiently covers purpose, data source, math, and usage contexts. It could be slightly more explicit about output format but is otherwise complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema already describes the 'limit' parameter fully. The description adds little beyond it (only the word 'widen') and provides no new semantic depth, so the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool returns confidence-score and execution-depth trends from Veris runs, specifying the data source (local state) and distinguishing it from sibling tools like analyze_repository or detect_drift.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides use cases (dashboards, health checks, regression detection) and default behavior (last 20 runs) but does not explicitly state when not to use it or mention alternative tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/vighriday/Veris'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.