fleet_evaluate
Evaluate prediction stability across multiple model nodes. Detects warnings, faults, and ghost confirmations in each node's score stream and returns per-node episodes plus a fleet summary for monitoring and alerting.
Instructions
Evaluate a fleet of model nodes for prediction stability. Each node provides a score stream. Returns per-node episodes and aggregate fleet stats. Accepts floats (0.0–1.0) or Q0.16 integers (0–65535) — auto-converts per node. Response: { nodes: [{ node_id, episodes: [{ ci_out, ci_ema_out, al_out, warn, fault, ghost_confirmed, ... }] }], fleet_summary, credits_used, credits_remaining }. Chain per-node episodes → visualize, alert_check, or compare_windows. For persistent multi-round fleet monitoring, use fleet_session_create instead.
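The response shape described above can be sketched as TypeScript types, with a small helper that counts flagged episodes per node before handing them to a downstream tool. This is only a sketch: the field names come from the description, but the field types (numeric outputs, boolean flags), the `NodeFlagCounts` type, and the `countFlags` helper are assumptions for illustration and are not part of the server. The description also elides further episode fields with `...`, and those are left out here.

```typescript
// Assumed shapes. Field names are taken from the tool description;
// the types (number vs boolean) are guesses and may differ from the real API.
interface Episode {
  ci_out: number;
  ci_ema_out: number;
  al_out: number;
  warn: boolean;
  fault: boolean;
  ghost_confirmed: boolean;
}

interface NodeResult {
  node_id: number | string; // assumption: the ID type is not documented
  episodes: Episode[];
}

interface FleetEvaluateResponse {
  nodes: NodeResult[];
  fleet_summary: unknown; // structure not documented in the description
  credits_used: number;
  credits_remaining: number;
}

// Hypothetical helper type: per-node counts of flagged episodes.
interface NodeFlagCounts {
  node_id: number | string;
  warns: number;
  faults: number;
  ghosts: number;
}

// Hypothetical helper: count warn / fault / ghost_confirmed episodes per node.
function countFlags(response: FleetEvaluateResponse): NodeFlagCounts[] {
  return response.nodes.map((node) => ({
    node_id: node.node_id,
    warns: node.episodes.filter((e) => e.warn).length,
    faults: node.episodes.filter((e) => e.fault).length,
    ghosts: node.episodes.filter((e) => e.ghost_confirmed).length,
  }));
}
```

A summary like this could be used to decide which nodes' episodes are worth passing on to alert_check or visualize.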
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| nodes | Yes | Array of node score arrays — each inner array is one node's scores (floats or Q0.16). Max 16 nodes, 10,000 scores per node. | |
| n | No | Episode length, an integer from 2 to 8. | 3 |
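The limits in the table can be checked client-side before spending a call. The sketch below mirrors the documented bounds (1 to 16 nodes, 1 to 10,000 scores per node, values from 0 to 65535, and `n` an integer from 2 to 8, as in the server's zod schema) in plain TypeScript. `FleetEvaluateInput` and `validateFleetInput` are hypothetical names for illustration, not part of the server.

```typescript
// Hypothetical input type matching the documented parameters.
interface FleetEvaluateInput {
  nodes: number[][];
  n?: number;
}

// Hypothetical pre-flight check; returns a list of problems (empty if valid).
function validateFleetInput(input: FleetEvaluateInput): string[] {
  const errors: string[] = [];
  const { nodes, n } = input;

  if (nodes.length < 1 || nodes.length > 16) {
    errors.push(`expected 1-16 nodes, got ${nodes.length}`);
  }

  nodes.forEach((scores, i) => {
    if (scores.length < 1 || scores.length > 10000) {
      errors.push(`node ${i}: expected 1-10000 scores, got ${scores.length}`);
    }
    // Every score must be a finite number within 0..65535.
    if (scores.some((s) => !Number.isFinite(s) || s < 0 || s > 65535)) {
      errors.push(`node ${i}: scores must be within 0-65535`);
    }
  });

  if (n !== undefined && (!Number.isInteger(n) || n < 2 || n > 8)) {
    errors.push(`n must be an integer from 2 to 8, got ${n}`);
  }

  return errors;
}
```

Collecting every problem in one pass, rather than throwing on the first, makes it easier to report all issues with a large fleet payload at once.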
Implementation Reference
- src/index.ts:484-504 (registration) The tool 'fleet_evaluate' is registered via server.tool() with its schema (zod validation for nodes and n) and handler function.

      server.tool(
        "fleet_evaluate",
        "Evaluate a fleet of model nodes for prediction stability. Each node provides a score stream. Returns per-node episodes and aggregate fleet stats. Accepts floats (0.0–1.0) or Q0.16 integers (0–65535) — auto-converts per node. Response: { nodes: [{ node_id, episodes: [{ ci_out, ci_ema_out, al_out, warn, fault, ghost_confirmed, ... }] }], fleet_summary, credits_used, credits_remaining }. Chain per-node episodes → visualize, alert_check, or compare_windows. For persistent multi-round fleet monitoring, use fleet_session_create instead.",
        {
          nodes: z
            .array(z.array(z.number().min(0).max(65535)).min(1).max(10000))
            .min(1)
            .max(16)
            .describe("Array of node score arrays — each inner array is one node's scores (floats or Q0.16). Max 16 nodes, 10,000 scores per node."),
          n: z.number().int().min(2).max(8).optional().describe("Episode length (default: 3)"),
        },
        async ({ nodes, n }) => {
          const guard = requireApiKey();
          if (guard) return guard;
          const q16Nodes = nodes.map((nodeScores) => toQ16(nodeScores));
          const body: Record<string, unknown> = { nodes: q16Nodes };
          if (n !== undefined) body.config = { n };
          const result = await apiFetch("/api/fleet-evaluate", {
            method: "POST",
            headers: apiKeyHeaders(),
            body,
          });
          return formatResult(result);
        }
      );

- src/index.ts:491-503 (handler) The handler function for fleet_evaluate: checks API key auth, converts scores to Q0.16 via toQ16, POSTs to /api/fleet-evaluate, and returns the formatted API result.

      async ({ nodes, n }) => {
        const guard = requireApiKey();
        if (guard) return guard;
        const q16Nodes = nodes.map((nodeScores) => toQ16(nodeScores));
        const body: Record<string, unknown> = { nodes: q16Nodes };
        if (n !== undefined) body.config = { n };
        const result = await apiFetch("/api/fleet-evaluate", {
          method: "POST",
          headers: apiKeyHeaders(),
          body,
        });
        return formatResult(result);
      }

- src/index.ts:487-490 (schema) Input schema: nodes (array of number arrays, min 1 node, max 16 nodes, each 1-10000 scores), optional n (episode length, 2-8).

      {
        nodes: z
          .array(z.array(z.number().min(0).max(65535)).min(1).max(10000))
          .min(1)
          .max(16)
          .describe("Array of node score arrays — each inner array is one node's scores (floats or Q0.16). Max 16 nodes, 10,000 scores per node."),
        n: z.number().int().min(2).max(8).optional().describe("Episode length (default: 3)"),
      },

- src/index.ts:296-304 (helper) toQ16 helper: auto-converts float scores (0-1) to Q0.16 integers (0-65535), used by fleet_evaluate to normalize node scores before sending to API.

      /** Auto-convert: if all values are 0–1 floats with decimals, scale to Q0.16. Otherwise clamp to 0–65535. */
      function toQ16(scores: number[]): number[] {
        const hasDecimals = scores.some((s) => s % 1 !== 0);
        const allInUnit = scores.every((s) => s >= 0 && s <= 1);
        const isFloat = hasDecimals && allInUnit;
        return isFloat
          ? scores.map((s) => Math.round(Math.max(0, Math.min(1, s)) * Q16))
          : scores.map((s) => Math.round(Math.max(0, Math.min(Q16, s))));
      }
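The auto-detection rule in toQ16 has a few edge cases that are easier to see with concrete inputs. The sketch below repeats the helper so it runs on its own. The `Q16` constant is not shown in the excerpt, so its value of 65535 is an assumption based on the documented 0 to 65535 range. The sample arrays are made up for illustration.

```typescript
// Assumption: Q16 is not shown in the excerpt; 65535 matches the documented range.
const Q16 = 65535;

// Copy of the helper quoted above, repeated only to make this sketch runnable.
function toQ16(scores: number[]): number[] {
  const hasDecimals = scores.some((s) => s % 1 !== 0);
  const allInUnit = scores.every((s) => s >= 0 && s <= 1);
  const isFloat = hasDecimals && allInUnit;
  return isFloat
    ? scores.map((s) => Math.round(Math.max(0, Math.min(1, s)) * Q16))
    : scores.map((s) => Math.round(Math.max(0, Math.min(Q16, s))));
}

// All values in 0..1 and at least one fractional: scaled by Q16.
const scaled = toQ16([0.5, 0.25, 1]);

// Only the integers 0 and 1: no fractional part, so the values are treated
// as Q0.16 integers already and pass through unscaled.
const unscaled = toQ16([0, 1, 1]);

// A fractional value mixed with a value above 1: not treated as floats,
// so each value is clamped to 0..Q16 and rounded instead of scaled.
const mixed = toQ16([0.5, 2]);

// Integers above the range are clamped to Q16.
const clamped = toQ16([70000, 100]);

console.log(scaled, unscaled, mixed, clamped);
```

Because detection runs per node, a float stream consisting only of exact 0 and 1 values would be sent as the integers 0 and 1 instead of 0 and 65535. Callers with such streams may prefer to pre-scale to Q0.16 themselves.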