Glama
Ownership verified

Server Details

AI agent infrastructure: dedup, cost prediction, validation, governance, failure intelligence.

Status: Healthy
Last Tested
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.3/5 across 15 of 15 tools scored.

Server Coherence: A
Disambiguation: 5/5

Each tool has a clearly distinct purpose, grouped by functionality (cost, dedup, guardrails, pitfalls, quality, rigor). Descriptions clearly differentiate within groups, so an agent can reliably select the correct tool.

Naming Consistency: 5/5

All tool names follow a consistent lowercase-with-underscores pattern: a category prefix (burnrate_, dedupq_, guardrail_, etc.) followed by a descriptive verb or noun. No naming mix-ups.

Tool Count: 5/5

15 tools cover multiple distinct capabilities (cost management, deduplication, guardrails, pitfalls, quality, workflows) without being excessive. Each tool has a clear role, and the count is appropriate for the server's scope.

Completeness: 4/5

The tool surface covers core workflows well, but there are minor gaps: no tool to delete/update guardrail policies, no explicit qualitygate rule management, and no workflow cancellation tool. Also, 'qualitygate_trends' is referenced but not provided as a tool. Overall, agents can work around these gaps.

Available Tools

15 tools
burnrate_budget: A
Read-only, Idempotent

Get today's tracked LLM spend, per-model breakdown, projection, and budget alerts. Free — no credits charged.

Parameters
Name | Required | Description | Default
daily_limit | No | Optional. Daily budget in USD (e.g., 10.0 for a $10/day cap). Enables budget alerts and remaining-balance calculation.
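
As a rough illustration, a call to this tool over MCP could be framed as a JSON-RPC 2.0 `tools/call` request. The sketch below only builds the payload. The helper name `build_tool_call` and the request id are illustrative assumptions, and transport and authentication are left out.

```python
import json


def build_tool_call(name, arguments, request_id=1):
    """Build an MCP `tools/call` JSON-RPC 2.0 request as a plain dict.

    `build_tool_call` is a hypothetical helper, not part of any SDK.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# daily_limit is optional; 10.0 means a $10/day cap, per the parameter text.
request = build_tool_call("burnrate_budget", {"daily_limit": 10.0})
print(json.dumps(request, indent=2))
```

Omitting `daily_limit` (passing an empty arguments dict) should still be a valid call, since the parameter is marked optional.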

Output Schema

Name | Required | Description
date | No
spend | No
alerts | No
projection | No
request_id | No
credits_used | No
credits_remaining | No
fallback_behavior | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive. Description adds that the tool is free and returns specific outputs (spend, breakdown, projection, alerts), providing useful behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no fluff. Front-loaded with core functionality, followed by a free-to-use note. Every word is necessary.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With a rich annotation set and an output schema (assumed), the description adequately covers inputs and outputs. Could mention limitations or pagination but not needed for this simple read tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and describes the daily_limit parameter well. The description adds context (budget alerts) but does not elaborate on parameter specifics, so value added is marginal.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves today's tracked LLM spend, per-model breakdown, projection, and budget alerts. It distinguishes from siblings like burnrate_estimate (future estimates) and burnrate_track (tracking over time) by focusing on current data and budget features.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool vs. siblings. The purpose is clear but the description does not state alternatives or exclusions, leaving inference to the agent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

burnrate_estimate: A
Idempotent

Before executing a multi-step agent plan, estimate the total LLM cost. Returns per-step breakdown and optimization suggestions. If the estimate exceeds your budget, pipe the same plan into burnrate_optimize. Costs 1 credit.

Parameters
Name | Required | Description | Default
plan | Yes | Array of plan steps with provider, model, and token estimates.
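
To make the `plan` shape concrete, here is a minimal local sketch of a per-step cost estimate. The step field names follow what the burnrate_optimize schema lists (step, provider, model, estimated_input_tokens, estimated_output_tokens). The provider and model names and the prices are placeholders, not real pricing, and this is not necessarily how the server computes its estimate.

```python
# Hypothetical USD prices per 1M tokens: (input, output). Placeholder values only.
PRICES = {
    ("example-provider", "large-model"): (3.00, 15.00),
    ("example-provider", "small-model"): (0.25, 1.25),
}


def estimate_plan(plan):
    """Return (per-step costs, total cost) for a list of plan steps."""
    per_step = []
    for step in plan:
        in_price, out_price = PRICES[(step["provider"], step["model"])]
        cost = (step["estimated_input_tokens"] * in_price
                + step["estimated_output_tokens"] * out_price) / 1_000_000
        per_step.append({"step": step["step"], "cost_usd": round(cost, 6)})
    total = round(sum(s["cost_usd"] for s in per_step), 6)
    return per_step, total


plan = [
    {"step": "draft", "provider": "example-provider", "model": "large-model",
     "estimated_input_tokens": 2000, "estimated_output_tokens": 1000},
    {"step": "summarize", "provider": "example-provider", "model": "small-model",
     "estimated_input_tokens": 4000, "estimated_output_tokens": 400},
]
per_step, total = estimate_plan(plan)
print(per_step, total)
```

The same `plan` list is what would be passed as the tool's `plan` argument.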

Output Schema

Name | Required | Description
estimate | No
request_id | No
credits_used | No
credits_remaining | No
fallback_behavior | No
optimization_suggestions | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description reveals that calling the tool 'Costs 1 credit,' which is behavioral information not present in the annotations. It also states the return value: 'per-step breakdown and optimization suggestions.' The annotations already mark it as idempotent and non-destructive, and the description aligns with that. No contradiction. A minor gap: it doesn't specify if the tool has side effects beyond credit deduction, but overall transparency is good.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is four sentences long, each serving a distinct purpose: when to use the tool, what it returns, what to do when the estimate exceeds the budget, and the credit cost. It is front-loaded with the primary action and contains no redundant or extraneous information.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has one required parameter and an output schema (not shown but indicated in context). The description explains the return values and cost, which is sufficient for an agent to use it correctly. It does not cover edge cases like invalid input or rate limits, but given the presence of an output schema, those details are likely handled elsewhere. Slightly incomplete, but adequate.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for all parameters. The description does not add new semantic meaning beyond what the schema already provides. Baseline is 3, and the description's mention of returns ('per-step breakdown and optimization suggestions') weakly connects to the input but does not deepen parameter understanding. Hence, score remains at 3.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'estimate the total LLM cost' for a 'multi-step agent plan.' It distinguishes itself from the sibling tool 'burnrate_optimize' by explicitly offering optimization as an alternative when the estimate exceeds budget. The verb 'estimate' and resource 'total LLM cost' are specific and unambiguous.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use the tool: 'Before executing a multi-step agent plan.' It also provides guidance for the next step: 'If the estimate exceeds your budget, pipe the same plan into burnrate_optimize.' However, it does not explicitly state when not to use this tool or mention any prerequisites, which prevents a perfect score.

burnrate_optimize: A
Idempotent

Get a cheaper equivalent plan by substituting models with lower-cost alternatives. Call after burnrate_estimate if the estimated cost exceeds your budget. Returns the optimized plan with substituted models, new per-step costs, total savings, and whether the target_budget is met. Optionally set target_budget to constrain the optimization. Costs 1 credit.

Parameters
Name | Required | Description | Default
plan | Yes | Array of plan steps. Same schema as burnrate_estimate: each step needs step, provider, model, estimated_input_tokens, estimated_output_tokens.
target_budget | No | Optional. Target total cost in USD.
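
The substitution idea can be sketched locally as follows. This is only an illustration: the price table, the `CHEAPER` mapping of "equivalent" models, and the `savings` and `meets_target` keys are assumptions made for the example, and the server's actual substitution logic and response layout are not documented here.

```python
# Hypothetical prices (USD per 1M tokens: input, output) and substitutions.
PRICES = {"large-model": (3.00, 15.00), "small-model": (0.25, 1.25)}
CHEAPER = {"large-model": "small-model"}  # assumed lower-cost alternative


def step_cost(model, step):
    """Cost in USD of one plan step if run on `model`."""
    in_price, out_price = PRICES[model]
    return (step["estimated_input_tokens"] * in_price
            + step["estimated_output_tokens"] * out_price) / 1_000_000


def optimize(plan, target_budget=None):
    """Swap each step's model for a cheaper one when a mapping exists."""
    original = sum(step_cost(s["model"], s) for s in plan)
    # dict(s, model=...) copies each step, so the input plan is not mutated.
    steps = [dict(s, model=CHEAPER.get(s["model"], s["model"])) for s in plan]
    optimized = sum(step_cost(s["model"], s) for s in steps)
    return {
        "steps": steps,
        "original": round(original, 6),
        "optimized": round(optimized, 6),
        "savings": round(original - optimized, 6),
        "meets_target": None if target_budget is None else optimized <= target_budget,
    }


plan = [{"step": "draft", "provider": "example-provider", "model": "large-model",
         "estimated_input_tokens": 2000, "estimated_output_tokens": 1000}]
result = optimize(plan, target_budget=0.005)
print(result)
```

When `target_budget` is omitted the sketch reports `meets_target` as `None`, mirroring the parameter being optional.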

Output Schema

Name | Required | Description
steps | No
original | No
optimized | No
request_id | No
suggestions | No
credits_used | No
credits_remaining | No
fallback_behavior | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true, readOnlyHint=false, destructiveHint=false. The description adds that the tool costs 1 credit and returns specific outputs. It does not contradict annotations and provides useful behavioral context beyond them.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five sentences with no fluff. The first states the main purpose, the second gives usage timing, the third details the return value, and the last two cover the optional target_budget and the credit cost. Well-structured and efficient.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With a clear output schema implied (returning optimized plan, costs, savings, budget status), the description covers all needed information: input plan, optional budget, and what is returned. It also mentions credit cost. No gaps for a tool of this complexity.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already explains both parameters. The description adds context that the plan schema matches burnrate_estimate and that target_budget is optional to constrain optimization. This adds minimal value beyond the schema, so a baseline score of 3 is appropriate.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get a cheaper equivalent plan by substituting models with lower-cost alternatives.' It specifies the verb (substitute), resource (plan), and distinguishes from siblings like burnrate_estimate and burnrate_budget. It also details what the tool returns.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises when to use the tool: 'Call after burnrate_estimate if the estimated cost exceeds your budget.' It mentions optional target_budget parameter, but doesn't explicitly state when not to use. Nonetheless, the guidance is clear and contextual.

burnrate_track: A
Idempotent

Log the actual cost of an LLM call after execution. Call this after every LLM request to build calibration data that improves burnrate_estimate accuracy over time. Free — no credits charged. Returns the recorded cost entry with computed margin versus the prior estimate when one exists for this model and token range.

Parameters
Name | Required | Description | Default
model | Yes | Model identifier as returned by the provider. Examples: claude-sonnet-4-6, gpt-4o, gemini-2.0-flash, mistral-large-latest. Unknown models are accepted but cost may show as $0.
task_id | No | Optional task ID for cross-referencing spend with DedupQ deduplication results. Use the same task_id passed to dedupq_check to link cost tracking with deduplication.
provider | Yes | LLM provider identifier. Supported: anthropic, openai, google, mistral, cohere, deepseek, together, fireworks, groq. Must match the provider of the model used.
input_tokens | Yes | Actual prompt tokens used. Must be >= 0.
output_tokens | Yes | Actual completion tokens used. Must be >= 0.
cache_read_tokens | No | Optional. Cache-read tokens.
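
The constraints in the table above can be checked on the client before the call is made. The sketch below is one possible pre-flight validator; the function name `track_arguments` is hypothetical, and the server may well apply additional checks that are not described here.

```python
# Provider list copied from the `provider` parameter description.
SUPPORTED_PROVIDERS = {
    "anthropic", "openai", "google", "mistral", "cohere",
    "deepseek", "together", "fireworks", "groq",
}


def track_arguments(provider, model, input_tokens, output_tokens,
                    task_id=None, cache_read_tokens=None):
    """Validate and assemble the arguments for a burnrate_track call."""
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"unsupported provider: {provider}")
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be >= 0")
    args = {"provider": provider, "model": model,
            "input_tokens": input_tokens, "output_tokens": output_tokens}
    # Optional fields are included only when the caller supplies them.
    if task_id is not None:
        args["task_id"] = task_id
    if cache_read_tokens is not None:
        args["cache_read_tokens"] = cache_read_tokens
    return args


args = track_arguments("openai", "gpt-4o", 1200, 350, task_id="task-42")
print(args)
```

Reusing the `task_id` that was passed to dedupq_check is what links the cost record to the deduplication result.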

Output Schema

Name | Required | Description
model | No
tracked | No
provider | No
record_id | No
request_id | No
credits_used | No
input_tokens | No
output_tokens | No
pricing_found | No
actual_cost_usd | No
credits_remaining | No
fallback_behavior | No
actual_cost_usd_formatted | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate a non-destructive, idempotent write operation. The description adds valuable behavioral context: the tool is free and reports a computed margin versus the prior estimate when one exists. The note that unknown models may show $0 cost comes from the model parameter's schema description. No contradictions with annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences, front-loaded with purpose. Each provides essential information: purpose, usage guidance, cost, and return value. No extraneous words.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simple logging purpose, the description is complete. It states what the tool does, when to use it, what it returns (including margin computation), and notes the cost behavior. Output schema exists, so return format details are covered elsewhere.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the schema already documents all parameters, including the notes that unknown models may show $0 cost and that task_id cross-references dedupq_check. The description does not significantly enhance the schema descriptions.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool logs actual LLM call costs after execution, with a specific verb (Log) and resource (actual cost of an LLM call). It distinguishes itself from sibling burnrate tools by explaining it builds calibration data to improve burnrate_estimate accuracy.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Call this after every LLM request' and notes it's free with no credits charged. It provides clear context for when to use the tool, though it doesn't explicitly state when not to use it or name alternatives for exclusion.

dedupq_check: A
Idempotent

Before executing any LLM task, check if an identical or semantically similar task has already been completed. Returns cached result on hit, saving one LLM call. On a miss, execute your task and call dedupq_complete to cache the result for future hits. Costs 1 credit.

Parameters
Name | Required | Description | Default
content | Yes | The task content to check for duplicates. This is hashed and embedded for matching.
task_id | No | Optional caller task ID for tracing and cross-referencing with BurnRate.
hash_only | No | If true, skip vector similarity search and use exact hash matching only. Default: false.
similarity_threshold | No | Cosine similarity threshold for semantic matching, 0.0 to 1.0. Default: 0.80.
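
How `hash_only` and `similarity_threshold` interact can be sketched with a small in-memory cache: try an exact hash match first, then fall back to cosine similarity unless `hash_only` is set. The use of SHA-256 and the toy bag-of-words `embed` function are assumptions for illustration; the source only says the content is "hashed and embedded".

```python
import hashlib
import math
from collections import Counter


def content_hash(text):
    # SHA-256 is an assumption; the source only says content is hashed.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def embed(text):
    # Toy bag-of-words vector standing in for a real embedding model.
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def check(cache, content, hash_only=False, similarity_threshold=0.80):
    """Return ("hit", result) or ("miss", None) against an in-memory cache."""
    h = content_hash(content)
    if h in cache:
        return "hit", cache[h]["result"]
    if not hash_only:
        query = embed(content)
        for entry in cache.values():
            if cosine(query, entry["embedding"]) >= similarity_threshold:
                return "hit", entry["result"]
    return "miss", None


stored = "summarize the quarterly sales report"
cache = {content_hash(stored): {"embedding": embed(stored), "result": "cached summary"}}
print(check(cache, stored))
print(check(cache, "Summarize the quarterly sales report please"))
print(check(cache, "Summarize the quarterly sales report please", hash_only=True))
```

In this sketch, raising `similarity_threshold` trades cache hits for precision, while `hash_only=True` restricts matching to byte-identical content.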

Output Schema

Name | Required | Description
match | No
status | No | hit, miss, or in_progress
cache_hit | No
request_id | No
content_hash | No
credits_used | No
cache_age_seconds | No
credits_remaining | No
fallback_behavior | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotency and non-destructiveness. The description adds behavioral details: costs 1 credit, returns cached result on hit, and requires dedupq_complete on miss. No contradictions.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: three sentences and a credit-cost note. All information is essential and front-loaded.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of annotations, an output schema, and 100% parameter coverage, the description provides sufficient context. It handles the workflow, credit cost, and usage pattern. A brief note on what the output looks like could be added, but it's not critical.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter descriptions. The tool description adds minimal additional meaning beyond the schema, such as 'hashed and embedded for matching' for content, but this is largely redundant.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: to check for duplicate tasks before execution and return cached results, saving an LLM call. It distinguishes itself from the sibling dedupq_complete by specifying the workflow on miss.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage instructions: 'Before executing any LLM task, check... On a miss, execute your task and call dedupq_complete to cache the result.' This tells the agent exactly when and how to use the tool and what to do next.

dedupq_complete: A
Idempotent

After executing a task, store the result so future identical or similar tasks return a cache hit via dedupq_check. Costs 2 credits.

Parameters
Name | Required | Description | Default
result | Yes | The task result to cache. Can be any JSON value.
content | Yes | Original task content. Used to compute hash and embedding for future matching.
task_id | No | Optional task ID. Used as the database row ID if provided.
hash_only | No | If true, skip embedding generation. Default: false.
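
The check-then-complete cycle might be wrapped as below. This is a sketch under assumptions: `call_tool(name, arguments)` is a hypothetical stand-in for an MCP client call that returns the tool's structured output as a dict, and the cached result is assumed to be carried in the `match` field. An `in_progress` status is treated like a miss here for simplicity.

```python
def run_with_dedup(call_tool, content, execute, task_id=None):
    """Check the cache, run `execute` only on a miss, then store the result."""
    args = {"content": content}
    if task_id is not None:
        args["task_id"] = task_id
    checked = call_tool("dedupq_check", args)
    if checked.get("status") == "hit":
        return checked.get("match")  # assumed to hold the cached result
    result = execute(content)
    call_tool("dedupq_complete", dict(args, result=result))
    return result


# In-memory fake used only to exercise the flow locally.
_store = {}
calls = []


def fake_call_tool(name, arguments):
    calls.append(name)
    if name == "dedupq_check":
        if arguments["content"] in _store:
            return {"status": "hit", "match": _store[arguments["content"]]}
        return {"status": "miss"}
    _store[arguments["content"]] = arguments["result"]
    return {"stored": True}


first = run_with_dedup(fake_call_tool, "2+2?", lambda c: "4")
second = run_with_dedup(fake_call_tool, "2+2?", lambda c: "should not run")
print(first, second, calls)
```

On the second call the executor is never invoked, which is the saving the two tools are meant to provide.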

Output Schema

Name | Required | Description
stored | No
task_id | No
request_id | No
content_hash | No
credits_used | No
has_embedding | No
credits_remaining | No
fallback_behavior | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds important behavioral details beyond annotations: the cost of 2 credits and the caching mechanism. Annotations already mark it as idempotent and non-destructive, so no contradictions. The description enhances transparency with cost and usage flow.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence plus a short cost note; together they convey purpose, timing, effect, and cost without extraneous words. It is front-loaded with the key action and context.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the essential context: when to use (after task execution), what it does (store result for caching), cost, and relationship to dedupq_check. Given the presence of a detailed schema and output schema, no additional information is necessary.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

All four parameters are described in the schema (100% coverage), so the description does not need to add much. The schema already explains each parameter's role. The description adds no extra parameter-specific semantics beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the purpose: storing task results for caching via dedupq_check. It includes specific action ('store the result') and distinguishes from sibling dedupq_check (retrieval). The verb and resource are unambiguous.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies when to use the tool ('After executing a task') and implies the complementary tool dedupq_check for checking. It does not explicitly mention when not to use it or list alternatives, but the context is clear enough.

guardrail_check: A
Idempotent

Evaluate a proposed agent action against your governance policies. Returns allow or deny with the matched policy reason. Requires at least one active policy created via guardrail_create_policy. Deterministic rule evaluation — no LLM. Costs 1 credit.

Parameters
Name | Required | Description | Default
agent_id | Yes | Agent identifier.
proposed_action | Yes | Action to evaluate. Must contain a 'type' field. Example: {"type": "http_request", "url": "https://external.example.com"} or {"type": "file_write", "path": "/etc/config"}.
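
A caller would typically gate the real action on the returned decision. The sketch below assumes the `decision` field holds the literal strings "allow" or "deny" and that `call_tool` is a hypothetical MCP client callable returning a dict; neither detail is confirmed by the listing. It fails closed, treating anything other than "allow" as a denial.

```python
def guarded_run(call_tool, agent_id, proposed_action, run):
    """Run `run()` only if guardrail_check allows the proposed action."""
    if "type" not in proposed_action:
        raise ValueError("proposed_action must contain a 'type' field")
    verdict = call_tool("guardrail_check",
                        {"agent_id": agent_id, "proposed_action": proposed_action})
    if verdict.get("decision") != "allow":  # fail closed on anything else
        return {"executed": False, "reason": verdict.get("reason")}
    return {"executed": True, "result": run()}


# Fake checker that denies writes under /etc, for local demonstration only.
def fake_call_tool(name, arguments):
    action = arguments["proposed_action"]
    if action["type"] == "file_write" and action.get("path", "").startswith("/etc"):
        return {"decision": "deny", "reason": "no writes under /etc"}
    return {"decision": "allow", "reason": None}


blocked = guarded_run(fake_call_tool, "agent-1",
                      {"type": "file_write", "path": "/etc/config"}, lambda: "wrote")
allowed = guarded_run(fake_call_tool, "agent-1",
                      {"type": "http_request", "url": "https://external.example.com"},
                      lambda: "fetched")
print(blocked, allowed)
```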

Output Schema

Name | Required | Description
reason | No
audit_id | No
decision | No
policy_id | No
request_id | No
credits_used | No
credits_remaining | No
fallback_behavior | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotentHint=true, so deterministic behavior is implied. The description adds value by explicitly stating 'Deterministic rule evaluation — no LLM' and 'Costs 1 credit', which are beyond annotations. No contradiction detected.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise: four short sentences plus a cost note. Each conveys unique, important information without redundancy. Front-loaded with the core action.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema and high schema coverage, the description covers purpose, prerequisites, evaluation method, and cost. It is complete and leaves no critical gaps for an agent to use the tool correctly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents all parameters thoroughly. The description adds no extra parameter-level information beyond what the schema provides, meeting the baseline for high coverage.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool evaluates a proposed agent action against governance policies and returns allow/deny with policy reason. It clearly distinguishes from sibling guardrail_create_policy by being the evaluation action.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies a prerequisite: requires at least one active policy created via guardrail_create_policy. It also clarifies that evaluation is deterministic and costs 1 credit. Does not explicitly state when not to use or list alternatives, but context provides enough guidance.

guardrail_create_policy: A

Create a persistent governance policy that guardrail_check evaluates on every subsequent call. Define rules using and/or/not operators over action types, resource patterns, and budget thresholds. Call this before using guardrail_check — checks require at least one active policy. Policies persist until explicitly deleted. Duplicate policy names return an error. Returns the created policy with its ID and active status.

Parameters
Name | Required | Description | Default
name | Yes | Unique policy name per org. Examples: 'no-delete-in-prod', 'budget-cap-50', 'pii-block'.
rules | Yes | Array of rule objects evaluated against the proposed_action in guardrail_check. Leaf operators: eq, starts_with, contains, gt, lt (compare field to value). Compound operators: and, or, not (nest sub-rules in a rules array). Example: [{operator:'eq', field:'type', value:'file_write'}] blocks all file writes. Nested example: [{operator:'and', rules:[{operator:'eq',field:'type',value:'api_call'},{operator:'contains',field:'url',value:'prod'}]}] blocks prod API calls.
priority | No | Optional. Evaluation order. Default: 0.
description | No | Optional human-readable summary of what this policy enforces. Returned in guardrail_check responses and guardrail_list_policies output for auditability.
action_types | No | Optional. Restrict this policy to only evaluate when proposed_action.type matches one of these values. Examples: ['file_write', 'api_call', 'db_delete']. Omit to apply the policy to all action types regardless of type field.
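
The rule grammar described above (leaf operators eq, starts_with, contains, gt, lt and compound operators and, or, not) can be illustrated with a small recursive evaluator. This is a sketch, not the server's implementation: in particular, treating `not` as the negation of all its sub-rules, treating a missing field as a non-match, and blocking when any top-level rule matches are assumptions.

```python
def matches(rule, action):
    """Return True if `rule` matches the proposed `action` dict."""
    op = rule["operator"]
    if op == "and":
        return all(matches(r, action) for r in rule["rules"])
    if op == "or":
        return any(matches(r, action) for r in rule["rules"])
    if op == "not":
        # Assumption: `not` negates the conjunction of its sub-rules.
        return not all(matches(r, action) for r in rule["rules"])
    actual = action.get(rule["field"])
    if actual is None:
        return False  # assumption: a missing field never matches a leaf rule
    value = rule["value"]
    if op == "eq":
        return actual == value
    if op == "starts_with":
        return str(actual).startswith(str(value))
    if op == "contains":
        return str(value) in str(actual)
    if op == "gt":
        return actual > value
    if op == "lt":
        return actual < value
    raise ValueError(f"unknown operator: {op}")


def is_blocked(rules, action):
    # Assumption: any matching top-level rule blocks the action.
    return any(matches(r, action) for r in rules)


# The nested example from the `rules` description, written as Python dicts.
policy_rules = [{"operator": "and", "rules": [
    {"operator": "eq", "field": "type", "value": "api_call"},
    {"operator": "contains", "field": "url", "value": "prod"},
]}]

print(is_blocked(policy_rules, {"type": "api_call", "url": "https://prod.example.com"}))
print(is_blocked(policy_rules, {"type": "api_call", "url": "https://staging.example.com"}))
```

Because evaluation is a pure function of the rules and the action, the same input always yields the same verdict, which is consistent with guardrail_check being described as deterministic.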

Output Schema

Name | Required | Description
policy | No
request_id | No
credits_used | No
credits_remaining | No
fallback_behavior | No

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that policies persist, returns created policy with ID and status, and errors on duplicates. This adds meaningful behavioral context beyond annotations (which only indicate non-read-only and non-destructive). No contradiction with annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Six short sentences covering purpose, rule syntax, prerequisite usage, and key behaviors. Front-loaded and efficient with no superfluous content.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers creation, persistence, error handling, and prerequisite relationship. With output schema present, return values need no further explanation. Could optionally mention rule structure complexity, but schema compensates.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions for all 5 parameters, including examples for rules. The description doesn't add new parameter info beyond referencing the schema, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a persistent governance policy evaluated by guardrail_check, with specific verb and resource. It distinguishes from siblings like guardrail_check (which evaluates) and other tools (dedup, burnrate, etc.) by focusing on policy creation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly instructs to call this before using guardrail_check, as checks require at least one active policy. Also notes policies persist until deleted and duplicate names cause errors, providing clear context for proper use.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pitfalldb_query: A
Idempotent

Check for known failure patterns before executing a task type. Returns pitfalls with severity, fix suggestions, and confidence scores. After your agent runs, submit failures via pitfalldb_report so others benefit. Costs 2 credits.

Parameters
Name | Required | Description | Default
filters | No | Optional filters. |
task_type | Yes | Task category: code_generation, web_search, data_analysis, etc. |
task_description | No | Optional. Natural-language task description for semantic search. |
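
As an illustration of the parameters above, the following sketch assembles an arguments object for pitfalldb_query. The helper name build_pitfall_query is hypothetical, and the shape of filters is not documented here, so it is passed through untouched. Each call is described as costing 2 credits.

```python
from typing import Optional


def build_pitfall_query(task_type: str,
                        task_description: Optional[str] = None,
                        filters: Optional[dict] = None) -> dict:
    """Build pitfalldb_query arguments, leaving out optional fields that were not given."""
    if not task_type:
        raise ValueError("task_type is required")
    args = {"task_type": task_type}
    if task_description is not None:
        # Used for semantic search, per the parameter description.
        args["task_description"] = task_description
    if filters is not None:
        # Filter keys are not documented in this listing; forwarded as-is.
        args["filters"] = filters
    return args


print(build_pitfall_query("code_generation", "Generate a REST client from an OpenAPI spec"))
```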

Output Schema

Parameters
Name | Required | Description
pitfalls | No |
request_id | No |
credits_used | No |
total_matching | No |
credits_remaining | No |
fallback_behavior | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses cost of 2 credits and that it returns specific fields (severity, fix suggestions, confidence scores). Adds value beyond annotations (idempotent, openWorld, non-destructive) without contradicting them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three efficient sentences, front-loaded with purpose, output, and usage guidance. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Complete for a query tool with output schema: describes purpose, usage timing, return fields, cost, and sibling relationship. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. Description adds minimal extra beyond schema; mentions task_type implicitly and filters briefly. Task_description is already described in schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the verb 'Check' and the resource 'known failure patterns'. It distinguishes from the sibling tool 'pitfalldb_report' by instructing to submit failures there after running.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says to use before executing a task type and to report failures via pitfalldb_report afterward, providing clear context. Lacks explicit 'when not to use' but the contrast with the sibling implies proper usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pitfalldb_report: A
Idempotent

Report an agent failure. PII-scrubbed before storage. Linked to existing pitfalls if similar. Free — no credits charged.

Parameters
Name | Required | Description | Default
failure | Yes | Failure details. |
task_type | Yes | Task category. |
task_description | Yes | Description of the failed task. |
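
Since all three parameters are required, a client might check them before calling pitfalldb_report. The sketch below assumes the arguments are held in a plain dictionary; the helper name missing_report_fields and the error_message key inside failure are hypothetical, because the structure of the failure object is not shown in this listing.

```python
REQUIRED_REPORT_FIELDS = ("failure", "task_type", "task_description")


def missing_report_fields(args: dict) -> list:
    """Return the required pitfalldb_report fields that are absent or empty."""
    return [name for name in REQUIRED_REPORT_FIELDS if not args.get(name)]


report = {
    # Hypothetical key; the real failure schema is not documented here.
    "failure": {"error_message": "Timed out waiting for search results"},
    "task_type": "web_search",
    "task_description": "Look up current pricing for three vendors",
}
print(missing_report_fields(report))
```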

Output Schema

Parameters
Name | Required | Description
message | No |
verified | No |
report_id | No |
request_id | No |
credits_used | No |
credits_remaining | No |
fallback_behavior | No |
linked_pitfall_id | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds behavioral context beyond annotations, including PII scrubbing, linking to existing pitfalls, and free usage. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: three short sentences that front-load the core action and each add unique value. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and full schema coverage, the description covers the essential human-understandable aspects. Could possibly mention input structure, but schema already does that.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so parameters are fully described in the schema. The description adds no additional parameter-level detail, earning the baseline score of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Report' and resource 'agent failure', and adds key details like PII scrubbing, linking, and free nature. It distinguishes from siblings like pitfalldb_query which is for querying.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives clear context (reporting vs querying) and notes it's free and PII-scrubbed, but does not explicitly provide when-not-to-use or alternatives. However, sibling tools make differentiation straightforward.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

qualitygate_validate: A
Idempotent

After your agent generates output, validate it against your rules before shipping. Runs deterministic checks (regex, JSON schema, syntax) plus optional LLM-powered tone and factual analysis. Returns a structured verdict (pass, warn, or fail) with a 0-100 score and per-check issue details. Use qualitygate_trends to spot recurring failure patterns over time. Variable cost: 1 credit per deterministic check, 8 credits per LLM check.

Parameters
Name | Required | Description | Default
output | Yes | The agent output text to validate. |
schema | No | JSON Schema to validate output against. |
language | No | Code language for syntax check: json, python, javascript, typescript. |
override | No | Force pass. Requires override_reason. |
directives | No | Directive objects. Types: must_include, must_not_include, must_match, must_not_match, must_contain, must_not_contain, min_length, max_length. |
check_types | No | Checks to run. Auto-inferred if omitted. |
override_reason | No | Required when override is true. |
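
The variable cost and the override rule can be sketched as follows. The credit values (1 per deterministic check, 8 per LLM check) come from the tool description; the check identifiers "tone" and "factual" are assumptions, since the actual check_types values are not listed here, and both function names are hypothetical.

```python
DETERMINISTIC_CREDITS = 1
LLM_CREDITS = 8
# Assumed identifiers for the LLM-powered checks; the real values may differ.
ASSUMED_LLM_CHECKS = frozenset({"tone", "factual"})


def estimate_credits(check_types) -> int:
    """Estimate the credit cost of a qualitygate_validate call from its check list."""
    return sum(
        LLM_CREDITS if check in ASSUMED_LLM_CHECKS else DETERMINISTIC_CREDITS
        for check in check_types
    )


def check_override(args: dict) -> None:
    """Raise if override is set without the override_reason it requires."""
    if args.get("override") and not args.get("override_reason"):
        raise ValueError("override_reason is required when override is true")


# Two deterministic checks plus one LLM check: 1 + 1 + 8 credits.
print(estimate_credits(["regex", "json_schema", "tone"]))
```

When check_types is omitted the server infers the checks, so the cost cannot be estimated this way in advance.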

Output Schema

Parameters
Name | Required | Description
issues | No |
summary | No |
verdict | No |
checks_run | No |
request_id | No |
credits_used | No |
credits_remaining | No |
fallback_behavior | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description reveals behavioral traits beyond annotations: it explains the cost model (1 credit per deterministic check, 8 per LLM check), the return structure (verdict with pass/warn/fail, 0-100 score, per-check details), and the two types of checks. There is no contradiction with annotations, which indicate readOnlyHint=false, destructiveHint=false, openWorldHint=true, idempotentHint=true.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured: it opens with the primary use case, outlines capabilities, mentions return format, suggests an alternative, and notes cost. Every sentence adds value with no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, nested objects, output schema exists), the description covers all essential aspects: what it does, when to use, return format, cost, and alternative. The output schema is not detailed in the description but is available separately, so this is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, so the schema already documents parameter meanings. The description adds context about check types (regex, JSON schema, syntax) and cost but does not significantly enrich parameter semantics beyond what the schema provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: validate agent-generated output against user-defined rules using deterministic and optional LLM checks. It distinguishes itself from sibling tools like qualitygate_trends by specifying a different use case.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use the tool: 'After your agent generates output, validate it against your rules before shipping.' It also recommends an alternative tool (qualitygate_trends) for spotting failure patterns. It lacks explicit when-not-to-use scenarios but gives sufficient context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rigor_execute: A

Execute a structured workflow end-to-end. Call rigor_plan first (free) to preview the step sequence and cost estimate before committing credits. Classifies the task, selects the optimal tool sequence, and executes each step with the right LLM model. Returns a complete deliverable — solution designs, competitive analyses, governance documents, and more. Supports SSE streaming for real-time progress, webhook callback, or polling.

Parameters
Name | Required | Description | Default
context | No | Additional context for the workflow. |
delivery | No | Delivery method. Default: polling (MCP clients typically can't handle SSE). |
task_type | No | Optional hint to bypass automatic classification. Values: solution_design, requirements_analysis, code_implementation, code_review, bug_fix, root_cause_analysis, incident_response, deployment_execution, competitive_scan, financial_analysis, research_task, documentation, governance_change, compliance_audit, data_security_assessment, performance_optimization, user_story_definition, implementation_prompt_generation. |
preferences | No | Optional workflow preferences. |
task_description | Yes | Natural language description of the task. Be specific — include what you want produced, constraints, and context. Example: 'Design a caching layer for our API gateway with Redis integration.' |
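
The plan-before-execute guidance could look like the sketch below. It assumes a caller-supplied call_tool(name, arguments) function that invokes an MCP tool and returns its result as a dictionary; call_tool, plan_then_execute, and the approve callback are all hypothetical names. The delivery parameter is left out so the documented default (polling) applies.

```python
from typing import Callable, Optional


def plan_then_execute(call_tool: Callable[[str, dict], dict],
                      task_description: str,
                      approve: Callable[[object], bool]) -> Optional[dict]:
    """Preview the workflow with rigor_plan (free), then run rigor_execute if approved."""
    args = {"task_description": task_description}
    plan_result = call_tool("rigor_plan", args)
    # Stop before spending credits if planning failed or the plan is rejected.
    if not plan_result.get("ok") or not approve(plan_result.get("plan")):
        return None
    return call_tool("rigor_execute", args)
```

The approve callback is where a caller would inspect the per-step cost estimates in the returned plan before committing credits.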

Output Schema

Parameters
Name | Required | Description
ok | No |
status | No |
poll_url | No |
task_type | No |
value_class | No |
workflow_id | No |
delivery_mode | No |
available_modes | No |
estimated_credits | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate mutation (readOnlyHint=false) and side effects (openWorldHint=true). Description adds transparency about streaming, callback, and polling support, and mentions it returns a complete deliverable, beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four concise sentences front-load the main purpose, include key usage guidance (rigor_plan prerequisite), and list deliverables without redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 params, 1 required, nested objects) and existence of an output schema, the description covers prerequisites, delivery options, and return value nature completely.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with clear parameter descriptions. The description adds an example for task_description but does not significantly augment schema semantics, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool executes a structured workflow end-to-end, specifies deliverables like solution designs and competitive analyses, and differentiates from siblings like rigor_plan (preview) and rigor_status (polling).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises calling rigor_plan first for preview and cost estimate, and describes delivery methods (SSE, webhook, polling) with defaults, providing clear when-to-use guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rigor_plan: A
Read-only, Idempotent

Before executing a complex task, get a structured workflow plan with per-step cost estimates. Classifies your task, selects the optimal framework sequence, and returns the full plan without executing anything. Free — no credits charged.

Parameters
Name | Required | Description | Default
task_type | No | Optional hint to bypass automatic classification. Values: solution_design, requirements_analysis, code_implementation, code_review, bug_fix, root_cause_analysis, incident_response, deployment_execution, competitive_scan, financial_analysis, research_task, documentation, governance_change, compliance_audit, data_security_assessment, performance_optimization, user_story_definition, implementation_prompt_generation. |
preferences | No | Optional workflow preferences. |
task_description | Yes | Natural language description of the task. Be specific — include what you want produced, constraints, and context. Example: 'Design a caching layer for our API gateway with Redis integration.' |

Output Schema

Parameters
Name | Required | Description
ok | No |
plan | No |
generated_title | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, destructiveHint false. Description adds that it classifies task and selects framework sequence, no contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences, no fluff. First sentence delivers core purpose, second adds key details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given full schema coverage and output schema existence, description is adequate. Could hint at preferences complexity but not required.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers 100% of parameters with descriptions. Description adds minimal extra meaning (e.g., not repeating param details). Baseline 3 appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool produces a structured workflow plan with cost estimates, and distinguishes it from execution tools like rigor_execute.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Before executing a complex task', indicating when to use. Does not explicitly mention alternatives, but context makes it clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rigor_status: A
Read-only, Idempotent

Check the status of a running or completed Rigor workflow. Returns progress, step results, and the full deliverable when complete. Use after rigor_execute with polling delivery to retrieve results.

Parameters
Name | Required | Description | Default
workflow_id | Yes | The workflow ID returned by rigor_execute (format: wr_xxx). |
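
A polling loop around rigor_status might be sketched as below. It assumes a caller-supplied call_tool(name, arguments) function returning a dictionary, and that the returned workflow object carries a status field using the status values listed for rigor_workflows; both are assumptions, as are the names poll_workflow and STOP_STATUSES. Treating halted and pending_approval as stop conditions is a design choice of this sketch, since they likely need intervention rather than more waiting.

```python
import time
from typing import Callable

# Statuses at which polling stops (assumed from the rigor_workflows status filter).
STOP_STATUSES = frozenset(
    {"completed", "failed", "halted", "pending_approval", "cancelled"}
)


def poll_workflow(call_tool: Callable[[str, dict], dict],
                  workflow_id: str,
                  interval_s: float = 2.0,
                  max_polls: int = 30,
                  sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call rigor_status until the workflow reaches a stop status or polls run out."""
    for _ in range(max_polls):
        result = call_tool("rigor_status", {"workflow_id": workflow_id})
        workflow = result.get("workflow") or {}
        if workflow.get("status") in STOP_STATUSES:
            return workflow
        sleep(interval_s)
    raise TimeoutError(f"workflow {workflow_id} did not finish after {max_polls} polls")
```

The sleep parameter is injectable so the loop can be exercised without real delays.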

Output Schema

Parameters
Name | Required | Description
ok | No |
workflow | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint, idempotentHint, and non-destructive. Description adds valuable context about returns (progress, step results, deliverable) and the polling pattern, complementing annotations without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: first states purpose and outputs, second gives usage guidance. No redundant words, front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given existence of output schema (not shown but present), description adequately covers return concepts. With low parameter count and high schema coverage, content is sufficient for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with parameter workflow_id already described as 'The workflow ID returned by rigor_execute (format: wr_xxx)'. Description does not add further parameter details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Check the status of a running or completed Rigor workflow', with specific verb and resource. It also lists return values (progress, step results, deliverable) and distinguishes from sibling tools like rigor_execute.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use after rigor_execute with polling delivery to retrieve results', providing clear context. Does not explicitly mention when not to use, but implies it should only be used after execution.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

rigor_workflows: A
Read-only, Idempotent

List all Rigor workflows for your organization with filtering and pagination. Returns status, progress, capacity usage, and available actions per workflow. Use to monitor workflow state, understand concurrent limit usage, and identify stuck or completed workflows.

Parameters
Name | Required | Description | Default
limit | No | Page size (default 20, max 100) |
cursor | No | Pagination cursor (created_at timestamp from previous page) |
status | No | Filter by status (comma-separated). Valid values: executing, step_executing, completed, failed, halted, pending_approval, cancelled. E.g. "halted,failed,pending_approval" |
task_type | No | Filter by classified task type |
counts_toward_limit | No | Filter to workflows counting toward the concurrent limit |
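
The filtering and pagination parameters can be assembled as in this sketch, which joins status values into the comma-separated form and enforces the documented page-size maximum of 100. The helper name build_workflows_args is hypothetical, and the lower bound of 1 on limit is an assumption.

```python
from typing import Iterable, Optional

VALID_STATUSES = frozenset({
    "executing", "step_executing", "completed", "failed",
    "halted", "pending_approval", "cancelled",
})


def build_workflows_args(statuses: Iterable[str] = (),
                         limit: int = 20,
                         cursor: Optional[str] = None) -> dict:
    """Build rigor_workflows arguments with a validated status filter and page size."""
    statuses = list(statuses)
    unknown = [s for s in statuses if s not in VALID_STATUSES]
    if unknown:
        raise ValueError(f"unknown status values: {unknown}")
    # Max 100 is documented; the minimum of 1 is assumed.
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    args = {"limit": limit}
    if statuses:
        args["status"] = ",".join(statuses)
    if cursor is not None:
        # created_at timestamp taken from the previous page.
        args["cursor"] = cursor
    return args


print(build_workflows_args(["halted", "failed", "pending_approval"], limit=50))
```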

Output Schema

Parameters
Name | Required | Description
ok | No |
workflows | No |
pagination | No |
credits_remaining | No |
concurrent_summary | No |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. Description adds value by specifying returned data (status, progress, capacity usage, available actions) and actionable insights, complementing annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no redundancy. First sentence defines core purpose, second adds context and use cases. Efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With output schema present and high schema coverage, description is sufficient. Covers filtering, pagination, returned fields, and practical use cases. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so descriptions adequately document parameters. The tool description does not add extra parameter meaning beyond the schema, meeting baseline expectations.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'List all Rigor workflows' with specific resource and verb. It mentions filtering, pagination, and what is returned (status, progress, etc.). Differentiates from sibling tools like rigor_execute or rigor_status.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states use cases: 'monitor workflow state, understand concurrent limit usage, identify stuck or completed workflows.' Provides clear context but does not explicitly exclude scenarios or compare to siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
