Skip to main content
Glama

AXIS Toolbox — Agentic Commerce Codebase Intelligence

Server Details

Generate AGENTS.md, AP2 compliance docs, checkout rules, debug playbook & MCP configs from any repo.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL
Repository
lastmanupinc-hub/AXIS-iliad
GitHub Stars
0
Server Listing
AXIS iliad

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool DescriptionsA

Average 4.6/5 across 29 of 29 tools scored. Lowest: 3.9/5.

Server CoherenceB
Disambiguation3/5

Tools have distinct purposes but some overlap exists, especially among research/analysis and commerce tools. Descriptions are detailed and help differentiate, but the sheer number of tools creates potential confusion.

Naming Consistency4/5

Most tools follow a consistent verb_noun and snake_case pattern (e.g., analyze_files, get_artifact). The iliad_ prefix for infrastructure tools adds coherence. Minor deviations are absent.

Tool Count2/5

29 tools is high for a single server, covering many domains like code analysis, deployment, commerce, and infrastructure. It feels overloaded and could be split into multiple specialized servers.

Completeness3/5

Core operations like create/read are covered for most domains, but update/delete operations are missing for snapshots and other resources. The commerce tools have a good set but gaps in other areas.

Available Tools

29 tools
analyze_filesA
Idempotent
Inspect

Analyze source files directly and generate the full 140-artifact AXIS bundle without using GitHub. Returns snapshot_id plus artifact listing; use this for local, generated, or unsaved code. Requires Authorization: Bearer . Use analyze_repo for GitHub URLs or improve_my_agent_with_axis for recommendation-first agent hardening.

ParametersJSON Schema
NameRequiredDescriptionDefault
filesYesSource files to analyze
goalsYesAnalysis goals
frameworksYesDetected or known frameworks
project_nameYesName of the project
project_typeYesProject type (web_application, api_service, cli_tool, library, monorepo)

Output Schema

ParametersJSON Schema
NameRequiredDescription
statusYes
artifactsYes
project_idYes
snapshot_idYes
artifact_countYes
programs_executedNo
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds context beyond annotations: requires Authorization header, returns snapshot_id plus artifact listing, and generates a bundle. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with main purpose, then usage guidelines and alternatives. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, usage context, authorization, return type. With 5 required params and output schema, description is sufficiently complete for a well-documented tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so description does not need to add parameter details. Description provides overall context but does not enhance parameter understanding beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'analyze source files directly and generate the full 140-artifact AXIS bundle without using GitHub', with specific verb and resource, and distinguishes from sibling tools analyze_repo and improve_my_agent_with_axis.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'use this for local, generated, or unsaved code' and provides alternatives: 'Use analyze_repo for GitHub URLs or improve_my_agent_with_axis for recommendation-first agent hardening.'

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

analyze_repoA
Idempotent
Inspect

Analyze a GitHub repository and generate 140 structured AXIS artifacts across 20 programs. Returns snapshot_id plus an artifacts listing; use get_artifact to read files and get_snapshot to re-enumerate outputs without re-running analysis. Requires Authorization: Bearer . Use this when the source of truth is a GitHub repo URL. Pricing: $0.50 standard, $0.15 lite budget mode, $25 engineer per repo. Engineer mode (X-Agent-Mode: engineer — Living Architecture) adds a verified LLM specificity pass: a living-architecture.md whose every architectural claim is grounded in the repo's extracted facts or dropped. This is the paid path for full repo analysis and can return authentication, quota, payment-required, invalid-URL, or GitHub-fetch errors. private repos require a stored GitHub token. Use analyze_files instead for inline file payloads or list_programs/search_and_discover_tools when you are still selecting a workflow.

ParametersJSON Schema
NameRequiredDescriptionDefault
github_urlYesGitHub repository URL (https://github.com/owner/repo)

Output Schema

ParametersJSON Schema
NameRequiredDescription
statusYes
artifactsYes
project_idYes
snapshot_idYes
artifact_countYes
programs_executedNo
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotentHint=true and destructiveHint=false. The description adds critical behavioral context: authentication requirements, pricing, engineer mode, error types, and private repo token needs. No contradiction with annotations; description significantly enriches transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is lengthy but well-structured, front-loading the primary purpose. Every sentence adds necessary detail (pricing, modes, errors, alternatives), making it informative without being overly verbose. Minor redundancy but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (paid, multiple modes, error conditions, many sibling tools), the description is comprehensive. It covers output behavior (snapshot_id and artifacts listing), how to use related tools, auth, pricing, errors, and prerequisites. The presence of an output schema means return values need not be detailed here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with one parameter (github_url) described as 'GitHub repository URL (https://github.com/owner/repo)'. The description mostly restates this, adding minimal context about URL format. Baseline 3 is appropriate as schema already handles the parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes a GitHub repository and generates 140 structured AXIS artifacts across 20 programs, using specific verbs and resources. It distinguishes itself from siblings by explicitly mentioning alternatives like get_artifact, get_snapshot, and analyze_files.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit guidance on when to use this tool ('Use this when the source of truth is a GitHub repo URL') and when to use alternatives ('Use analyze_files instead for inline file payloads or list_programs/search_and_discover_tools when you are still selecting a workflow'). Also recommends subsequent steps with get_artifact and get_snapshot.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

closerAInspect

Take a 70-80% complete project directory and generate complete professional packaging + marketplace certification artifacts so it is ready to ship and sell.

ParametersJSON Schema
NameRequiredDescriptionDefault
taglineNoOptional branding tagline
snapshot_idNoExisting AXIS snapshot_id to package into a distributable product
product_nameNoOptional branding override for product name
project_rootNoOptional local project root path hint (metadata only in remote MCP mode)
target_marketplacesNoOptional marketplaces list (e.g. npm, unreal, vscode, dockerhub, github-marketplace)

Output Schema

ParametersJSON Schema
NameRequiredDescription
programYes
artifactsYes
project_idYes
snapshot_idYes
artifact_countYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate non-destructive and non-read-only behavior. The description adds context about the input state (70-80% complete) and output (packaging artifacts), but does not fully detail side effects like file creation or modification. Still more informative than relying solely on annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single sentence that packs the core purpose and outcome without unnecessary words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multi-marketplace packaging), the description mentions key aspects like input state and output artifacts. The presence of output schema accounts for return values. Lacks explicit prerequisites or examples, but is largely complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds no extra meaning beyond what the schema already provides for the five optional parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool takes a 70-80% complete project directory and generates packaging and certification artifacts, distinguishing it from sibling tools which focus on analysis and discovery.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage when a project is nearly complete and needs to be shipped, but lacks explicit guidance on when not to use it or alternatives. No direct siblings exist, so it is adequate but minimal.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

deployAInspect

Generate a zero-pipeline-minutes deploy bundle: stack-aware Dockerfile, .dockerignore, dev compose, render.yaml (Render existing-image), wrangler.pages.toml + wrangler.containers.toml + worker.ts (Cloudflare), bash/PowerShell push scripts, and a qualification report. The project builds locally in VSCode, pushes images to GHCR or via wrangler, and Render/Cloudflare just pulls — no GitHub Actions minutes, no Render build pipeline minutes, no CF build minutes.

ParametersJSON Schema
NameRequiredDescriptionDefault
snapshot_idYesExisting AXIS snapshot_id to package into deploy artifacts

Output Schema

ParametersJSON Schema
NameRequiredDescription
programYes
artifactsYes
project_idYes
snapshot_idYes
artifact_countYes
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide readOnlyHint=false, idempotentHint=false, destructiveHint=false. The description adds that it 'generates' files but does not discuss side effects, idempotency, or what happens to existing files. It does not contradict annotations, but adds limited behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences: the first sentence lists the generated artifacts (somewhat long but efficient), and the second explains the value proposition. It is front-loaded with 'Generate a zero-pipeline-minutes deploy bundle' and every sentence adds value. Slightly dense but not verbose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given an output schema exists (not shown but indicated), the description need not explain return values. It sufficiently describes what is generated and the overall approach. However, it lacks details on prerequisites beyond snapshot_id and does not mention the output schema, leaving minor gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There is one required parameter (snapshot_id) with schema description 'Existing AXIS snapshot_id to package into deploy artifacts'. The tool description mentions 'snapshot_id' in context but adds no semantic detail beyond the schema. Since schema coverage is 100%, a baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates a 'zero-pipeline-minutes deploy bundle' and lists specific artifacts (Dockerfile, .dockerignore, dev compose, render.yaml, etc.). The verb 'generate' and the resource 'deploy bundle' are specific, and the tool is distinct from all siblings, which are unrelated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the context: it builds locally in VSCode and avoids CI minutes, implying use when avoiding pipeline costs is desired. However, it does not explicitly state when not to use this tool or mention alternatives, which prevents a higher score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_agentic_purchasing_needsA
Read-onlyIdempotent
Inspect

Discover the best AXIS workflow for a purchasing or compliance task. Free, no auth, and logs lightweight task metadata for intent analytics. Example: task_description='prepare for autonomous Visa checkout'. Use this when you need commerce-specific triage and next-step guidance. Use search_and_discover_tools instead for non-commerce keyword routing across all programs.

ParametersJSON Schema
NameRequiredDescriptionDefault
focus_areasNoOptional: specific areas to focus on
task_descriptionNoWhat the agent is trying to accomplish
current_readinessNoOptional: current Purchasing Readiness Score (0-100) if known

Output Schema

ParametersJSON Schema
NameRequiredDescription
readinessYes
task_descriptionYes
matched_capabilitiesYes
recommended_next_stepYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint, idempotentHint, and destructiveHint. The description adds that it is free, requires no auth, and logs lightweight task metadata for intent analytics, which are valuable behavioral disclosures beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with four sentences covering purpose, cost/auth, logging, example usage, and sibling differentiation. Every sentence adds value, and key information is front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has a simple input schema with all optional parameters, an output schema (so return values need not be described), and safety annotations, the description sufficiently covers purpose, usage, behavioral context, and differentiation from siblings, making it complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description does not add extra meaning to parameters beyond what is in the schema, which is acceptable given the schema's completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool discovers the best AXIS workflow for purchasing or compliance tasks, and distinguishes it from search_and_discover_tools by specifying commerce-specific triage. This provides a specific verb and resource with sibling differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use this when you need commerce-specific triage and next-step guidance' and 'Use search_and_discover_tools instead for non-commerce keyword routing across all programs,' giving clear when-to-use and alternative guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

discover_commerce_toolsA
Read-onlyIdempotent
Inspect

Discover AXIS install metadata, pricing, and shareable manifests for commerce-capable agents. Free, no auth, and no mutation beyond read access. Example: call before wiring AXIS into Claude Desktop, Cursor, or VS Code. Use this when you need onboarding and ecosystem setup details. Use search_and_discover_tools instead for keyword routing or discover_agentic_purchasing_needs for purchasing-task triage.

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Output Schema

ParametersJSON Schema
NameRequiredDescription
toolsYes
installYes
axis_iliadYes
free_toolsYes
shareable_manifestYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond annotations: it explicitly states 'Free, no auth, and no mutation beyond read access', which clarifies cost, authentication, and safety aspects. While annotations already indicate readOnlyHint=true and destructiveHint=false, the description reinforces this with plain language and adds practical constraints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured: first sentence states purpose, second adds behavioral traits, third gives concrete example, fourth provides usage guidelines, and fifth distinguishes from alternatives. Every sentence earns its place with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has no parameters, comprehensive annotations, and an output schema exists, the description provides complete context. It covers purpose, behavioral traits, usage scenarios, and sibling differentiation, making it fully adequate for agent understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0 parameters and 100% schema description coverage, the baseline would be 4. The description doesn't need to explain parameters, but it provides context about what the tool discovers (metadata, pricing, manifests) which helps understand the output semantics.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('discover AXIS install metadata, pricing, and shareable manifests') and resources ('commerce-capable agents'). It explicitly distinguishes from siblings by naming alternatives ('search_and_discover_tools' and 'discover_agentic_purchasing_needs') for different use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool ('call before wiring AXIS into Claude Desktop, Cursor, or VS Code', 'when you need onboarding and ecosystem setup details') and when to use alternatives ('Use search_and_discover_tools instead for keyword routing or discover_agentic_purchasing_needs for purchasing-task triage').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_artifactA
Read-onlyIdempotent
Inspect

Read one generated artifact by snapshot_id and path. Requires access to the snapshot and may return snapshot-not-found, invalid-path, or artifact-not-found errors. Example: snapshot_id=abc-123, path=AGENTS.md. Use this when you need the full text of one artifact. Use get_snapshot instead when you first need the artifact list.

ParametersJSON Schema
NameRequiredDescriptionDefault
pathYesArtifact file path as returned in the artifacts list
snapshot_idYesSnapshot ID

Output Schema

ParametersJSON Schema
NameRequiredDescription
contentYesUTF-8 artifact content
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description adds valuable behavioral context by listing possible errors (snapshot-not-found, invalid-path, artifact-not-found) and stating access requirements ('Requires access to the snapshot'). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences with no wasted words. It front-loads the main purpose, then covers errors, an example, and usage guidance in a logical order.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has an output schema (not shown but noted in context), so return value explanation is not needed. The description covers purpose, usage, errors, and an example, making it complete for a read-only tool with clear annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Both parameters have descriptions in the input schema (100% coverage). The description adds an example ('snapshot_id=abc-123, path=AGENTS.md'), which clarifies the format and combination of parameters beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Read one generated artifact by snapshot_id and path,' providing a specific verb and resource. It distinguishes from the sibling tool 'get_snapshot' by noting that this tool retrieves a single artifact while the sibling gets the artifact list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description clearly specifies when to use this tool: 'Use this when you need the full text of one artifact. Use get_snapshot instead when you first need the artifact list.' This provides explicit guidance and an alternative.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_referral_codeA
Idempotent
Inspect

Get or create the caller's AXIS referral token. Requires Authorization: Bearer , has no usage charge, and may persist a new referral code if one does not exist yet. Example: call before sharing AXIS with another agent or workspace. Use this when you need the shareable token itself. Use get_referral_credits instead when you need balances, milestones, and discount status.

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Output Schema

ParametersJSON Schema
NameRequiredDescription
costYes
next_milestoneYes
referral_tokenYes
current_earningsYes
share_instructionYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide idempotentHint=true and destructiveHint=false. Description adds context: 'has no usage charge' and 'may persist a new referral code if one does not exist yet', explaining the possible side effect. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five concise sentences covering purpose, requirements, example, and usage guidance. Every sentence provides value; no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given low complexity (0 params, simple get/create operation) and presence of output schema, the description fully covers purpose, behavior, and usage context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters with 100% schema coverage, so the baseline is 4. Description adds no parameter info, but none is needed.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Get or create the caller's AXIS referral token' – a specific verb and resource. It distinguishes from sibling tool get_referral_credits by specifying using this for the shareable token vs balances/milestones.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use this tool ('when you need the shareable token itself') and the alternative ('Use get_referral_credits instead when you need balances...'). Also mentions required authentication: 'Requires Authorization: Bearer <api_key>'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_referral_creditsA
Read-onlyIdempotent
Inspect

Get the caller's referral earnings, milestones, and free-call status. Requires Authorization: Bearer , has no usage charge, and returns the current discount ledger without creating a new analysis. Example: call after a referral campaign to inspect earned credits. Use this when you need balances and milestones. Use get_referral_code instead when you only need the shareable token.

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Output Schema

ParametersJSON Schema
NameRequiredDescription
costYes
tierYes
next_milestoneYes
referral_tokenYes
discount_activeYes
earned_discountYes
paid_call_countYes
lifetime_referralsYes
free_calls_remainingYes
earned_credits_millicentsYes
persistence_credits_remainingYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond annotations: it specifies authorization requirements ('Requires Authorization: Bearer <api_key>'), indicates no usage charge, clarifies that it returns a 'current discount ledger without creating a new analysis', and provides an example use case. While annotations cover read-only and idempotent aspects, the description enhances understanding with practical details, though it doesn't mention rate limits or error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with key information: purpose, authorization, cost, and return value. Each sentence adds value, such as the example and sibling differentiation, with no wasted words. It efficiently communicates necessary details in a compact form.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 0 parameters, rich annotations (readOnlyHint, idempotentHint), and an output schema, the description is complete. It covers purpose, usage guidelines, authorization, cost, behavioral traits, and sibling differentiation, providing all needed context without needing to explain return values due to the output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0 parameters and 100% schema description coverage, the baseline is 4. The description appropriately does not discuss parameters, as none exist, and instead focuses on the tool's purpose and usage, which is efficient and avoids redundancy.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get') and the resource ('caller's referral earnings, milestones, and free-call status'), distinguishing it from the sibling tool get_referral_code which is for shareable tokens. It provides a concrete example of when to use it, making the purpose unambiguous and well-differentiated.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('Use this when you need balances and milestones') and when to use an alternative ('Use get_referral_code instead when you only need the shareable token'). It also provides a contextual example ('call after a referral campaign to inspect earned credits'), offering clear guidance on usage scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_snapshotA
Read-onlyIdempotent
Inspect

Retrieve status and the full artifact listing for a prior analysis by snapshot_id. Use this to re-enumerate artifact paths without re-running analysis. Snapshots persist and can be shared between agents to avoid duplicate analysis costs.

ParametersJSON Schema
NameRequiredDescriptionDefault
snapshot_idYesSnapshot ID returned by analyze_repo or analyze_files

Output Schema

ParametersJSON Schema
NameRequiredDescription
statusYes
artifactsYes
project_idYes
snapshot_idYes
artifact_countYes
programs_executedNo
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and destructiveHint=false, so the description does not need to cover safety. It adds value by explaining the tool returns both status and artifact listing, that snapshots persist, and that they can be shared. This provides behavioral context beyond what annotations alone offer, without any contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each carrying essential information: purpose, usage guidance, and persistence benefit. There is no extraneous text, and the information is front-loaded, making it highly concise and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has a single parameter, clear annotations, and an output schema present, the description provides all necessary context: what it does, when to use it, and why snapshots are valuable. The output schema handles return format details, so the description is complete without needing to elaborate further.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage with a clear description for snapshot_id. The description adds only that snapshot_id comes from analyze_repo or analyze_files, which reinforces but does not significantly extend beyond the schema's explanation. With high schema coverage, a baseline of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves 'status and the full artifact listing for a prior analysis by snapshot_id'. It uses a specific verb (Retrieve) and resource (snapshot), which distinguishes it from sibling analysis tools like analyze_repo and analyze_files, as well as get_artifact which likely returns a single artifact.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly tells when to use this tool: 'to re-enumerate artifact paths without re-running analysis' and mentions sharing to avoid duplicate costs. While it does not include explicit when-not-to-use statements, the context and sibling tools make the alternatives clear, and the guidance is actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_analyticsAInspect

AXIS-owned product analytics. Two operations: capture (insert events) and query (aggregations). Capture accepts a single event or a batch via events[] (max 500). Query kinds: count (total events), count_by_event (top events by frequency), distinct_users (unique user_id count), count_by_bucket (time-series with minute/hour/day buckets). All queries support optional event, from_ts, to_ts, and property_filter filters. Namespaces are account-scoped server-side (acct:<account_id>:<namespace>). Persistent across restarts via SQLite. Requires Authorization: Bearer . Best for funnels, cohorts, and retention on workloads up to ~1M events per account.

ParametersJSON Schema
NameRequiredDescriptionDefault
eventNoSingle event payload {event, user_id?, properties?, timestamp?} — used in capture mode.
queryNo{kind, event?, from_ts?, to_ts?, property_filter?, bucket?, limit?} — used in query mode.
eventsNoBatch of event payloads (max 500). Transactional — partial inserts never persist.
namespaceNoLogical isolation key. Defaults to 'default'. Account id is always prepended server-side.
operationYescapture or query.

Output Schema

ParametersJSON Schema
NameRequiredDescription
resultNoAggregation result shape depending on query.kind (query mode only).
capturedNoEvents written (capture mode only).
namespaceNoScoped namespace the call wrote to or queried.
operationNoEcho of the operation that ran.
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations provide no hints (readOnlyHint=false, etc.). The description adds important behavioral details: batch capture is transactional, data persists via SQLite, and auth requirements. This compensates for the lack of annotation hints.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured: starts with overall purpose, then details operations, query types, filters, namespace, persistence, auth, and use case. Each sentence adds value, though a bit long with 6 sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers operations, query kinds, filters, auth, and use cases. Output schema exists (not shown) but is sufficient. Missing error handling or examples, but overall complete for the complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so parameters are already described. The description adds context beyond schema, e.g., transactional behavior of batch capture and explanation of query kinds and filters, enhancing meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's for 'AXIS-owned product analytics' with two operations (capture and query). It distinguishes from sibling tools by specifying analytics capabilities like funnels, cohorts, and retention, which are not offered by other iliad tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises it's 'Best for funnels, cohorts, and retention on workloads up to ~1M events per account,' giving clear context. However, it does not explicitly state when not to use it or suggest alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_code_sandboxAInspect

AXIS-owned secure code execution. Each call spawns a fresh ephemeral Docker container with hardened isolation: no network, read-only root filesystem, all Linux capabilities dropped, no-new-privileges, PID/memory/CPU limits, tmpfs /tmp only, runs as nobody:nobody. Container is force-removed after each call. Supports python | node | bash via the multi-runtime image nikolaik/python-nodejs:python3.12-nodejs22-slim (operator can override via AXIS_CODE_SANDBOX_IMAGE). Returns stdout/stderr/exit_code/timed_out/duration_ms/image. Wall-clock timeout enforced via SIGKILL + force-remove. Source is fed via stdin (no fs write to the read-only root). Code body capped at 256 KiB; stdin at 1 MiB; timeout 1-600 seconds (default 30); stdout/stderr each capped at 1 MiB output. When no Docker daemon is reachable (Render standard services don't expose /var/run/docker.sock), returns a structured _not_configured: true envelope with remediation. Engineer mode (X-Agent-Mode: engineer — Verified Exec, $0.25): the result includes an Ed25519-signed attestation binding code-hash → output-hash + a per-account hash-chain entry, so another agent that pins AXIS's published key can verify the run without re-executing it. Requires Authorization: Bearer .

ParametersJSON Schema
NameRequiredDescriptionDefault
codeYesSource code to execute. Fed via stdin to the interpreter. Max 256 KiB.
stdinNoOptional additional stdin appended after the code body. Max 1 MiB.
languageYesRuntime language.
timeout_secondsNoWall-clock limit. Defaults 30, max 600. SIGKILL on overrun.

Output Schema

ParametersJSON Schema
NameRequiredDescription
imageNoContainer image actually used.
reasonNodocker_daemon_unreachable | dockerode_import_failed (only when _not_configured=true).
stderrNoCaptured stderr (UTF-8, capped at 1 MiB).
stdoutNoCaptured stdout (UTF-8, capped at 1 MiB with truncation marker).
exit_codeNoProcess exit code (137 on SIGKILL).
timed_outNoTrue if the wall-clock timeout fired.
duration_msNoEnd-to-end wall time including container spawn + teardown.
remediationNoHow the operator should fix the unreachable-daemon condition.
_not_configuredNoTrue when no Docker daemon is reachable.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses many behavioral traits beyond annotations: container lifecycle (spawn and force-remove), security restrictions (no network, read-only root, dropped capabilities), timeouts and SIGKILL, error conditions (_not_configured), and the optional engineer attestation mode. Annotations only hint at mutability and destructiveness; the description fleshes out the exact behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose and then details security, constraints, error modes, and authentication. Every sentence adds necessary information for a complex tool. While it is somewhat lengthy (multiple sentences), it is well-structured and not wasteful; a slight reduction in verbosity could improve conciseness.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, multiple runtimes, security context, error conditions, attestation mode), the description covers all critical aspects: purpose, parameter constraints, behavior (Docker isolation), error handling (no Docker daemon), authentication requirements, and output summary (stdout/stderr/exit_code etc.). It is complete enough for an AI agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for all four parameters. The description adds additional constraints such as size limits (code 256 KiB, stdin 1 MiB, output 1 MiB), default and maximum timeout, and the multi-runtime image name. This provides richer semantics than the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with a clear verb-resource pair: 'AXIS-owned secure code execution.' It then specifies the runtimes (python, node, bash) and the ephemeral Docker container model. Among siblings, no other tool provides code execution, so it is well-distinguished.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides extensive context on when to use the tool: for secure, isolated code execution with no network and read-only filesystem. It explicitly describes the behavior when Docker is unavailable and the engineer mode for verification. However, it does not compare against alternative tools or state when not to use it, which would improve this dimension.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_document_parsingA
Read-onlyIdempotent
Inspect

AXIS-owned document → Markdown extractor. Accepts either document_url (https fetch + 50 MiB cap + 60s timeout) or document_base64 (inline bytes, 50 MiB decoded cap) — exactly one. Optional mime_type hint (application/pdf, application/vnd.openxmlformats-officedocument.wordprocessingml.document, text/html, text/markdown, text/plain); we sniff from magic bytes + URL extension when omitted. Format dispatch: PDF → pdfjs-dist text extraction (one block per page with --- page N --- separators); DOCX → mammoth → markdown (tables preserved); HTML → tag-strip with heading + list + entity handling (NOT a full HTML→MD converter — bring turndown if you need fancier); plain text + markdown → passthrough. Returns {markdown, format_detected, byte_size, page_count, table_count, truncated}. Output capped at 1 MiB markdown with a truncation marker. Engineer mode (X-Agent-Mode: engineer — Document Intelligence, $0.10): adds an engineer block with retrieval chunks (heading-aware, overlapping) + extract-to-caller-schema (pass json_schema → a grammar-constrained, validated typed object) + image OCR (image/* via document_base64) — typed data, not just markdown. Requires Authorization: Bearer .

ParametersJSON Schema
NameRequiredDescriptionDefault
mime_typeNoOptional MIME-type hint. When omitted we sniff from magic bytes + URL extension. Engineer mode: an image/* mime triggers OCR.
json_schemaNoEngineer mode: a JSON Schema. The document is extracted into a validated object matching it (returned in engineer.extracted).
document_urlNohttps URL to a document. Use this OR document_base64, not both.
document_base64NoBase64-encoded document bytes. Use this OR document_url, not both.

Output Schema

ParametersJSON Schema
NameRequiredDescription
reasonNodocument_download_failed | document_decode_failed | unsupported_format | parse_failed | pdf_runtime_missing | docx_runtime_missing (only when _not_configured=true).
engineerNoEngineer mode only: { chunk_count, chunks, extracted? } — retrieval chunks + optional schema-validated extraction.
markdownNoExtracted text, formatted as Markdown when the source had structure.
byte_sizeNoRaw byte size of the source document.
truncatedNoTrue when the markdown output was capped at the 1 MiB ceiling.
page_countNoPage count for PDFs; null otherwise.
remediationNoOperator-actionable fix.
table_countNoNumber of tables detected in the rendered markdown (DOCX only; 0 elsewhere).
_not_configuredNoTrue when a prerequisite is missing or the document was unsupported.
format_detectedNopdf | docx | html | markdown | text | unknown.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses critical behaviors: 50 MiB caps, 60s timeout, format dispatch mechanisms, output truncation at 1 MiB, and engineer mode specifics. Annotations already indicate read-only and idempotent, so no contradiction; description adds valuable context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but lengthy, with detailed processing steps for each format. While well-structured, it could be more concise by reducing technical detail that may be better suited for documentation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all aspects: input modes, format handling, output structure, limits, and special mode. With an output schema present, the summary of return fields is sufficient.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, and the description adds extra context for each parameter, such as engineer-mode triggers for mime_type and json_schema, and constraints for document_url vs document_base64.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'AXIS-owned document → Markdown extractor', specifying the verb (extract) and resource (document). It uniquely distinguishes from sibling tools by detailing supported formats and processing logic.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states that exactly one of document_url or document_base64 must be used, and that mime_type is optional. Provides context for engineer mode. However, it doesn't explicitly state when not to use this tool or mention alternatives among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_embeddingsA
Idempotent
Inspect

Convert text into dense vectors. Accepts a single string or a batch (max 2048). Returns one vector per input plus token usage. Currently proxies OpenAI /v1/embeddings (model: text-embedding-3-small by default, overridable via OPENAI_EMBEDDING_MODEL). Requires Authorization: Bearer to call. When OPENAI_API_KEY is not provisioned, returns a structured _not_configured: true envelope. Pairs natively with iliad_vector_database — feed vectors from this tool's output into vector of the vector_database upsert/query calls. Engineer mode (X-Agent-Mode: engineer — Domain Embeddings, $0.08): pass dimensions (Matryoshka truncation → smaller vectors) and/or corpus_adapter: true (mean-center the batch to sharpen retrieval on your data); returns an engineer block with the fitted adapter_mean for query alignment.

ParametersJSON Schema
NameRequiredDescriptionDefault
inputYesA single string or an array of strings to embed. Empty strings and entries > 32k chars are rejected (chunk before calling).
dimensionsNoEngineer mode: truncate each vector to this many leading dims (Matryoshka) + renormalize. Smaller, cheaper vectors.
corpus_adapterNoEngineer mode: mean-center the batch (all-but-the-mean) to sharpen retrieval; returns the fitted adapter_mean.

Output Schema

ParametersJSON Schema
NameRequiredDescription
usageNo{prompt_tokens, total_tokens} when reported by the provider.
vectorsYesArray of dense vectors. vectors[i] corresponds to input[i] (order preserved).
engineerNoEngineer mode only: { dimensions, truncated, adapter_applied, adapter_mean? } — the post-processing applied + the fitted corpus mean.
model_usedYesConcrete embedding model name returned by the provider.
input_countYesNumber of inputs submitted (matches vectors.length).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotent and non-destructive; description adds backend details (proxies OpenAI), auth requirement, failure mode (_not_configured), and engineer mode behaviors (Matryoshka truncation, mean-centering). No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is detailed but well-organized with logical sections (main purpose, limits, return, backend, engineer mode). Slightly dense but every sentence contributes value; could be tightened slightly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all relevant aspects: input handling, batch limits, return (vectors + token usage), auth requirements, failure modes, engineering options, and integration with sibling tool. Output schema exists, so return structure explanation is unnecessary.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds critical details beyond schema: rejects empty strings and >32k chars with chunking advice, explains dimensions as Matryoshka truncation with renormalization, and corpus_adapter mean-centering returns adapter_mean. Greatly enhances understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Convert text into dense vectors' with specific verb and resource. Distinguishes from siblings by mentioning pairing with iliad_vector_database and differentiating embedder from other tools like iliad_llm_inference.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Specifies input types (single string or batch max 2048), return values, and engineer mode conditions. Advises chunking for long texts and mentions failure envelope when API not configured. Lacks explicit when-not-to-use, but context is adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_hygieneA
Read-onlyIdempotent
Inspect

AXIS-owned workspace hygiene grader. Analyzes an inline file set [{path,content}] and returns a letter grade (A-F) across a closed set of dimensions plus structured findings. Two modes: mode='scan' (DEFAULT, FREE) returns grade + findings (committed-secret scan, .env/secret-file detection, .gitignore gaps for build/scratch artifacts, oversized blobs, stub/placeholder markers, byte-identical duplicate files, source test-peer coverage, TODO/FIXME debt); mode='fix' (METERED, paid) adds a prioritized remediation plan with ready-to-apply .gitignore additions and per-finding actions. Deterministic, dependency-free, never mutates your repo (fix returns a PLAN). Rules needing a live git checkout/toolchain (worktree pruning, build/vet, route-registration dup-handler analysis) are reported as repo_only_rules, not run. Engineer mode (X-Agent-Mode: engineer — Security Engineer, $5): the fix arrives as a git-applyable unified-diff patch + a SARIF 2.1.0 log for CI code-scanning. Requires Authorization: Bearer .

ParametersJSON Schema
NameRequiredDescriptionDefault
modeNoscan (free grade+findings, default) | fix (metered, adds remediation plan).
filesYesInline files [{path, content}] to scan (non-empty; each content <= 5 MB).
configNoOptional threshold overrides: maxFileBytes, coverageA, coverageB, coverageC, todoDebtThreshold.

Output Schema

ParametersJSON Schema
NameRequiredDescription
modeNoEcho of the mode that ran.
gradeNoOverall hygiene grade A-F (minimum across dimensions).
countsNo{high, medium, low, deferredByPolicy} open-finding counts.
reasonsNoDimensions that capped the grade below A.
scannedNo{files, bytes} actually analyzed.
findingsNoAll findings [{id, ruleId, severity, path, message, policy, recommendedAction}].
dimensionsNoPer-dimension grade [{id, grade, detail}].
paid_fix_hintNoscan mode only: how to obtain the metered remediation plan.
repo_only_rulesNoRules that need a live repo and were not run.
remediation_planNofix mode only: {ordered_steps, gitignore_additions, summary}.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description aligns with annotations (readOnlyHint, idempotentHint, destructiveHint) and adds critical behavioral details: 'Deterministic, dependency-free, never mutates your repo', explains that fix returns a plan, and notes unexecuted rules (repo_only_rules). This fully discloses behavior beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is thorough and well-organized: starts with a one-sentence summary, then details modes, then additional features (engineer mode, rules). Every sentence adds relevant information; no filler. Slightly long but necessary for the tool's complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (3 parameters, nested objects, multiple modes), the description covers all necessary aspects: mode behavior, engineer mode, authentication, dependency, irreversibility, and unexecuted rules. No gaps remain for an agent to understand correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by listing the dimensions scanned and explaining mode options in detail, plus file constraints (content <= 5 MB, non-empty). While the schema already describes parameters, the description provides richer context for the overall operation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'AXIS-owned workspace hygiene grader' and specifies the verb 'analyzes' with a defined resource (inline file set). It differentiates from sibling tools by focusing on hygiene grading with letter grades and findings. The two modes are explicitly named and explained.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use 'scan' vs 'fix' modes, including default behavior and metering. It also mentions engineer mode headers. However, it does not explicitly state when not to use the tool or alternatives, though the context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_llm_inferenceAInspect

AXIS-hosted LLM chat-completion via node-llama-cpp + a small GGUF model loaded in-process. Two input shapes accepted: prompt (single string) or messages (chat-style array of {role, content}). Sampling controls: max_tokens (≤2048), temperature (0-2), top_k, top_p, seed (for reproducibility), stop (string[]). Inference is fully in-process — no upstream provider, no per-call API fee. Operator sets AXIS_LLM_MODEL_PATH to point at a Phi-3-mini / TinyLlama / Llama-3.2-1B GGUF; if missing, the tool returns a _not_configured: true envelope. Engineer mode (X-Agent-Mode: engineer — Constrained Inference, $0.10): pass a json_schema and decoding is grammar-constrained to it AND the output is validated against it (returns a structured block with valid + parsed + schema_errors) — guaranteed-valid structured output. Requires Authorization: Bearer .

ParametersJSON Schema
NameRequiredDescriptionDefault
seedNoOptional seed for reproducible output.
stopNoStop sequences. Generation halts when any string in the array is produced.
top_kNoTop-k sampling (positive integer). Defaults 40.
top_pNoTop-p nucleus sampling in (0, 1]. Defaults 0.95.
promptNoSingle-prompt completion input. Use either this OR messages, not both.
systemNoOptional system prompt (prompt mode only). For messages mode, use role=system entries.
messagesNoChat-style input. Array of {role: system|user|assistant, content: string}.
max_tokensNoMax tokens to generate. Defaults 512, hard cap 2048.
json_schemaNoEngineer mode (required): a JSON Schema. Decoding is grammar-constrained to it and the output is validated against it; returns a `structured` block.
temperatureNoSampling temperature in [0, 2]. Defaults 0.7.

Output Schema

ParametersJSON Schema
NameRequiredDescription
textNoGenerated completion text.
reasonNoWhy the tool returned _not_configured (only present when true).
model_pathNoPath checked for the GGUF file (only present when _not_configured=true).
model_usedNoBasename of the GGUF model file used.
structuredNoEngineer mode only: { schema_constrained, valid, parsed, schema_errors } — the guaranteed-valid structured-output verdict.
remediationNoHow the operator should fix the missing-model condition.
prompt_tokensNoToken count of the input prompt (best-effort).
_not_configuredNoTrue when no GGUF model is present at AXIS_LLM_MODEL_PATH.
completion_tokensNoToken count of the generated text (best-effort).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate non-read-only, non-idempotent, non-destructive. The description adds valuable behavioral details: fully in-process, no upstream provider, no per-call fee, configuration via environment variable, and error handling with a `_not_configured` envelope. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is somewhat long but every sentence adds value. It is well-structured with clear sections, though could be slightly more concise by grouping related details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (10 parameters, nested objects, output schema), the description comprehensively covers input formats, all parameters, configuration, engineer mode, and error handling. It fully compensates for the absence of title and other structured fields.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds significant meaning beyond property descriptions: explains two input shapes (prompt vs messages), sampling control behaviors, engineer mode with grammar-constrained decoding and validation, and constraints like max_tokens ≤2048.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: AXIS-hosted LLM chat-completion using a local GGUF model via node-llama-cpp. It distinguishes from sibling tools by noting it's in-process with no upstream provider or API fee.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool (for LLM chat) and details engineer mode for structured output with json_schema. It does not explicitly state when not to use, but the context is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_object_storageAInspect

AXIS-owned signed-URL minter backed by Cloudflare R2. Returns a pre-signed PUT or GET URL scoped to the calling account (keys are prefixed with accounts/<account_id>/ server-side, so accounts can't reach each other's objects). Requires Authorization: Bearer . Returns the URL plus expires_at (ISO 8601), bucket, and scoped_key. Returns {_not_configured: true, ...} when the operator has not provisioned R2_* env vars (no crash, no leaked secrets). TTL is capped at 86400 seconds (24h). Engineer mode (X-Agent-Mode: engineer — Managed Bucket, $0.05): adds delete + list + copy (server-side, no bytes through the agent) operations, content-addressed dedup keys (content_sha256), and mint-time PUT policy (pin content_type / exact content_length as signed headers R2 enforces).

ParametersJSON Schema
NameRequiredDescriptionDefault
extNoEngineer mode: optional extension appended to the content-addressed key (e.g. 'png').
keyYesObject key (max 1024 chars), or the prefix for operation=list. Path traversal and leading-/ are rejected.
operationYesput / get (standard). delete / list / copy and content-addressed put require X-Agent-Mode: engineer (Managed Bucket).
source_keyNoEngineer mode (operation=copy): source object key to copy from, scoped to your account; `key` is the destination. Echo the returned required_headers on the PUT.
ttl_secondsNoSigned-URL lifetime, 1..86400. Defaults to 3600.
content_typeNoEngineer mode (put): pin the Content-Type the upload must send (signed; R2 rejects a mismatch). Printable ASCII type/subtype, ≤255 chars. Echo via required_headers.
content_lengthNoEngineer mode (put): pin the EXACT byte size the upload must be (signed; ≤5 GiB). Pairs with content_sha256 for verified content-addressed writes.
content_sha256NoEngineer mode: 64-char hex sha256 of the bytes you'll PUT. When set, the object lands under accounts/<id>/cas/<sha256> so identical content dedupes.

Output Schema

ParametersJSON Schema
NameRequiredDescription
urlYesPre-signed URL valid for ttl_seconds.
bucketYesResolved R2 bucket name.
operationNoPUT or GET — what the URL was signed for.
expires_atYesISO-8601 expiry timestamp.
scoped_keyYesServer-side key after account scoping (the user-supplied key prefixed with accounts/<account_id>/).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description discloses significant behavioral details: account isolation, TTL cap, engineer mode additional operations, content-addressed dedup, signed header enforcement, and error behavior (_not_configured). This well exceeds what annotations (only destructiveHint=false) provide, adding valuable context for agent decision-making.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single dense paragraph that front-loads the main purpose. Every sentence adds value, though it could be improved with structured formatting (e.g., bullet points). No redundancy or unnecessary details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (8 parameters, engineer mode, error cases), the description covers all important aspects: standard and advanced operations, authentication, scoping, TTL, content-addressing, and provisioning errors. The output schema existence is acknowledged by summarizing return fields. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already documents all parameters. However, the description adds context like 'content-addressed dedup keys' and 'pin content_type/exact content_length as signed headers', which clarifies parameter usage beyond the schema descriptions, justifying a score above baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly identifies the tool as an 'AXIS-owned signed-URL minter' for object storage, specifying the exact resource (Cloudflare R2) and operations (PUT/GET). It distinguishes from sibling tools (none of which are storage-related) and provides a clear verb-resource pair.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explains standard (put/get) and engineer modes (delete/list/copy), required authentication, and account scoping. While it does not explicitly state when not to use the tool, the context signals (sibling list) imply no alternatives for object storage, and the guidance is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_speech_to_textA
Idempotent
Inspect

AXIS-owned audio transcription via whisper.cpp + ffmpeg-static. Accepts either audio_url (https URL we fetch, max 100 MiB, 60s download timeout) or audio_base64 (inline bytes, max 100 MiB decoded) — exactly one. Accepts any audio format ffmpeg can decode (mp3, wav, m4a, opus, ogg, flac); we resample to 16 kHz mono WAV internally. Optional language (ISO-639-1 like "en" / "fr" / "ja", or "auto" — default). Optional initial_prompt (≤512 chars; biases spelling of rare names). Optional word_timestamps boolean. Returns {text, segments: [{start, end, text}], language_detected, duration_seconds, model_used}. When operator hasn't installed whisper-cli or placed the GGML model file at AXIS_WHISPER_MODEL_PATH (default models/ggml-base.en.bin), returns {_not_configured: true, reason, detail, remediation}. Engineer mode (X-Agent-Mode: engineer — Diarization, $0.10): the response adds diarization — speaker turns grouped from the segments by inter-segment pause gaps (tune with diarization_gap_seconds / max_speakers; this is pause-based turn segmentation, not acoustic speaker ID). Requires Authorization: Bearer .

ParametersJSON Schema
NameRequiredDescriptionDefault
languageNoISO-639-1 language code (en, fr, ja, ...) or 'auto' to autodetect. Defaults 'auto'.
audio_urlNohttps URL to an audio file. Use this OR audio_base64, not both.
audio_base64NoBase64-encoded audio bytes. Use this OR audio_url, not both.
max_speakersNoEngineer mode: max alternating speaker labels. Defaults 2.
initial_promptNoOptional bias prompt (≤512 chars) — useful for spelling of rare names.
word_timestampsNoEmit word-level timestamps within segments. Defaults false.
diarization_gap_secondsNoEngineer mode: pause (seconds) between segments that starts a new speaker turn. Defaults 0.75.

Output Schema

ParametersJSON Schema
NameRequiredDescription
textNoFull transcript text, joined from segments.
reasonNomodel_file_not_found | whisper_cli_not_found | ffmpeg_static_missing | audio_download_failed | audio_decode_failed (only when _not_configured=true).
segmentsNo[{start: seconds, end: seconds, text}] timestamped segments.
model_usedNoBasename of the GGML model file used.
remediationNoOperator-actionable fix for the unconfigured prerequisite.
_not_configuredNoTrue when a prerequisite is missing.
duration_secondsNoAudio duration as inferred from the last segment end timestamp.
language_detectedNoLanguage code whisper detected (or echoed from input language).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotent and non-destructive; description adds context: resampling, configuration checks, error remediation, and engineer mode behavior. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is comprehensive but somewhat lengthy; however, it is well-structured with clear sections (input methods, parameters, output, error case, engineer mode). Could be slightly more terse, but the detail is justified.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, output schema, annotations), the description covers all critical aspects: input constraints, optional features, return format, error handling, and advanced mode. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions; the description adds significant value: clarifies mutual exclusivity of audio_url/audio_base64, max sizes, timeout, default language, character limit for initial_prompt, and details on engineer-mode parameters (diarization_gap_seconds, max_speakers).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it is an 'AXIS-owned audio transcription' tool, specifying verb ('transcribes') and resource ('audio to text'). It distinguishes itself from sibling tools (e.g., iliad_text_to_speech, iliad_analytics) by its specific function.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides detailed usage instructions: how to provide audio (url or base64), max sizes, optional parameters, and engineer mode. Lacks explicit when-not-to-use guidance, but given sibling set, it is clear when this tool is appropriate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_text_to_speechA
Idempotent
Inspect

AXIS-owned voice synthesis via Piper (rhasspy/piper) + ffmpeg-static. Accepts text (1-5000 chars), optional voice slug (filename without extension; defaults to AXIS_PIPER_DEFAULT_VOICE or the first available voice), optional format (wav | mp3 | opus; defaults wav), optional sentence_silence (0-5 seconds, default 0.2). Returns {audio_base64, format, voice_used, sample_rate, duration_seconds, byte_size}. Inference is fully in-process — no upstream provider, no per-character fee. When operator hasn't installed piper or placed voice .onnx + .onnx.json files in AXIS_PIPER_VOICE_DIR (default models/piper/), returns {_not_configured: true, reason, detail, remediation}. format=mp3/opus additionally requires ffmpeg-static. Engineer mode (X-Agent-Mode: engineer — Brand Voice, $0.10): pass brand_text (a brand / voice-and-tone artifact) and AXIS auto-derives the voice persona (Piper voice slug + sentence pacing) and synthesizes in it; the persona is echoed in the response. Requires Authorization: Bearer .

ParametersJSON Schema
NameRequiredDescriptionDefault
textYesText to speak. 1-5000 chars after trim.
voiceNoVoice slug (filename without extension, e.g. 'en_US-amy-medium'). Defaults to first available voice or AXIS_PIPER_DEFAULT_VOICE.
formatNoAudio codec.
genderNoEngineer mode: persona gender override.
localeNoEngineer mode: persona locale override.
brand_textNoEngineer mode: brand / voice-and-tone artifact. AXIS derives a voice persona from it and synthesizes in that voice (overrides voice/sentence_silence).
sentence_silenceNoPer-sentence silence in seconds (0-5). Defaults 0.2.

Output Schema

ParametersJSON Schema
NameRequiredDescription
formatNoEcho of the requested format.
reasonNopiper_cli_not_found | voice_dir_missing | no_voices_available | voice_model_not_found | voice_config_not_found | ffmpeg_static_missing | synthesis_failed (only when _not_configured=true).
byte_sizeNoByte length of the encoded audio (post-transcode for mp3/opus).
voice_usedNoVoice slug that was used (resolved if caller omitted `voice`).
remediationNoOperator-actionable fix for the unconfigured prerequisite.
sample_rateNoWAV sample rate parsed from the RIFF header (typically 22050 for Piper).
audio_base64NoBase64-encoded audio bytes in the requested format.
_not_configuredNoTrue when a prerequisite is missing.
duration_secondsNoAudio duration in seconds, computed from the WAV header.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate idempotent and non-destructive. The description adds significant context: fully in-process, no upstream provider, no per-character fee, error response for missing configuration, engineer mode persona derivation. These go well beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is thorough but somewhat verbose, especially with detailed explanations of engineer mode and error responses. It could be more concise while retaining essential information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, engineer mode, multiple formats, error states), the description covers all essential aspects: return structure, error conditions, default behaviors, and special mode. The presence of an output schema in the description (though not provided) further supports completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; all parameters have descriptions. The description adds extra value by explaining default voice selection, ffmpeg requirement for certain formats, and engineer mode overrides. This justifies a score above baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it's a voice synthesis tool using Piper, accepts text and optional parameters, and produces audio output. The verb 'synthesize' is implied, and it distinguishes from sibling Iliad tools like speech-to-text.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context on when to use, including engineer mode and default behaviors. Does not explicitly contrast with sibling tools, but the detailed usage scenarios suffice. Missing explicit 'when not to use' guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_transactional_emailAInspect

Send a single transactional email. Requires Authorization: Bearer . Provide either body_html, body_text, or both (Resend will pick the best variant per recipient). All emails ship from RESEND_FROM_ADDRESS — operator must verify that domain in Resend before sending. Returns the provider-assigned message_id plus the accepted recipient list. Returns a structured _not_configured envelope when RESEND_API_KEY or RESEND_FROM_ADDRESS is missing. Recipients capped at 50 per call; subject capped at 998 chars; bodies capped at 1 MB. Engineer mode (X-Agent-Mode: engineer — Deliverability, $0.50): instead of sending, pass a domain and get a full SPF/DKIM/DMARC setup (fresh DKIM keypair) + sender warmup schedule + verification checklist — no email sent, no ESP key needed.

ParametersJSON Schema
NameRequiredDescriptionDefault
toNoRecipient address or array of addresses (max 50). Required for a send (standard mode).
domainNoEngineer mode (Deliverability): domain to generate SPF/DKIM/DMARC setup for. Replaces the send.
subjectNoEmail subject (max 998 chars, RFC 5322).
providerNoEngineer mode: ESP for the SPF include (resend/sendgrid/mailgun/postmark/ses/google). Defaults resend.
reply_toNoOptional Reply-To address.
body_htmlNoHTML body. At least one of body_html / body_text required.
body_textNoPlaintext body. At least one of body_html / body_text required.
dmarc_policyNoEngineer mode: DMARC policy none|quarantine|reject. Defaults none (monitoring).
dkim_selectorNoEngineer mode: DKIM selector (alphanumeric/hyphen, 1-32). Defaults 'axis'.

Output Schema

ParametersJSON Schema
NameRequiredDescription
fromYesRESEND_FROM_ADDRESS used as the From: header.
subjectYesSubject sent (echo).
message_idYesProvider-assigned message ID.
delivered_toYesRecipients the provider accepted.
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (readOnlyHint=false, destructiveHint=false), the description discloses important traits: authentication requirements, limits (recipients, subject, body), error handling for missing configuration, and the two distinct modes (send vs engineer). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear separation of modes, but it is somewhat verbose. It is front-loaded with the main purpose, but some redundancy exists (e.g., repeating limits already in the schema). Still, every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (9 parameters, two modes, output schema exists), the description covers all necessary aspects: send and engineer modes, limits, authentication, error handling, and return values. It is fully self-contained and leaves no significant gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds significant value by explaining the relationship between parameters and modes, the source email address, the need for domain verification, and the meaning of engineer mode parameters. This goes beyond the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Send a single transactional email' and also describes an engineer mode for deliverability setup. It distinguishes itself from sibling tools (like iliad_analytics, iliad_web_search) by specifying it handles email sending and DNS configuration.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use send mode vs engineer mode, including prerequisites (domain verification, API key) and behavior in each mode. However, it does not explicitly compare to sibling tools or state when not to use this tool in favor of others.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_vector_databaseAInspect

AXIS-owned vector store. Two operations: upsert (insert or replace vectors) and query (cosine top-k nearest neighbors). Namespaces are account-scoped server-side (acct:<account_id>:<namespace>), so tenants cannot read each other's vectors. Persistent across restarts via Postgres. Requires Authorization: Bearer . Best for RAG retrievers, deduplication, and similarity search. Engineer mode (X-Agent-Mode: engineer — Managed Memory, $0.05): query runs a pgvector/HNSW ANN candidate pool with optional recency-decay reranking (recency_half_life_days — managed forgetting), RRF hybrid fusion (sparse_ids), and metadata filter; upsert applies intra-batch semantic-dedup (dedup_threshold).

ParametersJSON Schema
NameRequiredDescriptionDefault
queryNo{vector: number[], top_k?: number, filter?: object}. Engineer mode also reads recency_half_life_days (number — exponential recency decay) and sparse_ids (string[] — RRF hybrid fusion). Required for query.
vectorsNoArray of {id, vector, metadata?} — required for upsert.
namespaceNoLogical isolation key. Defaults to 'default'. Account ID is always prepended server-side.
operationYesupsert (insert/replace) or query (top-k cosine).
semantic_dedupNoEngineer upsert: set false to disable dedup (default on).
dedup_thresholdNoEngineer upsert: cosine threshold for intra-batch semantic-dedup (default 0.97).

Output Schema

ParametersJSON Schema
NameRequiredDescription
backendNoEngineer query: 'pgvector' or 'js' — which ANN path served the query.
matchesNoNearest neighbors sorted by score desc (query mode only).
engineerNoEngineer flags { ann, recency_decay, hybrid_fusion } (query), or { dropped: [...] } (upsert semantic-dedup).
upsertedNoVectors written (upsert mode only).
namespaceNoScoped namespace the call wrote to or queried.
operationNoEcho of the operation that ran.
total_in_namespaceNoTotal vectors in this namespace after the call (upsert mode only).
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral context beyond minimal annotations: it explains namespace scoping (account-level isolation), persistence via Postgres, authentication requirements, and engineer mode features (HNSW ANN, recency decay, RRF fusion, dedup). No contradictions with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single cohesive paragraph that packs detail without redundancy. It front-loads the core purpose and operations. While dense, it could benefit from slight structuring (e.g., separating standard vs. engineer mode), but remains efficient and informative.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (2 operations, 6 params, engineer mode, nested objects) and presence of an output schema, the description covers all essential aspects: operations, scoping, persistence, auth, use cases, and both standard and advanced features. It is fully adequate for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, baseline 3. The description enriches parameter understanding by explaining engineer-mode specifics (recency_half_life_days, sparse_ids, dedup_threshold) and providing rationale for defaults. This adds value beyond the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is an 'AXIS-owned vector store' with two specific operations: upsert and query. It immediately distinguishes itself from sibling tools (e.g., iliad_embeddings, iliad_analytics) by focusing on vector storage and retrieval, making the purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly recommends the tool for 'RAG retrievers, deduplication, and similarity search,' providing clear when-to-use guidance. It does not mention when not to use or list alternatives, but the context of sibling tools implies differentiation. Slight improvement possible, but sufficient.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_web_researchA
Idempotent
Inspect

Scrape a single URL using Firecrawl and return markdown-formatted content. Returns markdown body, extracted metadata, and title. Best for research, documentation reading, or SEO analysis. Requires Authorization: Bearer . Pricing: $0.10 standard, $0.05 lite per page. Use iliad_web_research_crawl for crawling multiple pages or link following.

ParametersJSON Schema
NameRequiredDescriptionDefault
urlYesThe URL to scrape (http or https)
only_main_contentNoExtract only the main content (default: true)

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataNo
errorNoError message if request failed
successYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true and destructiveHint=false, and the description confirms it is a read operation ('scrape'). It adds transparency about authorization ('Requires Authorization: Bearer <api_key>') and pricing. However, it does not detail error handling or rate limits, missing some behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, each adding distinct value: purpose, use cases, and alternatives with authorization/pricing. It is front-loaded and concise with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (2 params, output schema exists, good annotations), the description covers purpose, usage, auth, pricing, and alternatives. It is almost complete but could mention potential errors or rate limits for full contextual completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage for parameters, so baseline is 3. The description mentions the return format ('markdown body, extracted metadata, and title') but does not explicitly clarify the 'only_main_content' parameter's effect—it implies full extraction while the default is to extract only main content, causing a slight inconsistency.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it scrapes a single URL and returns markdown-formatted content, metadata, and title. It explicitly distinguishes itself from the sibling tool 'iliad_web_research_crawl' by noting that the sibling is for crawling multiple pages or following links.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage scenarios: 'Best for research, documentation reading, or SEO analysis.' It also names the alternative tool for different use cases ('Use iliad_web_research_crawl for crawling multiple pages or link following'), making it clear when to choose this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

iliad_web_research_crawlA
Idempotent
Inspect

Crawl a domain and scrape multiple pages using Firecrawl. Returns array of scraped pages with markdown content. Best for site mapping, content audits, or bulk research. Requires Authorization: Bearer . Pricing: $0.25 standard, $0.12 lite per page crawled (up to 100 pages per request). Use iliad_web_research for single-page scrapes.

ParametersJSON Schema
NameRequiredDescriptionDefault
urlYesThe domain/URL to crawl (http or https)
limitNoMaximum pages to crawl (1-100, default: 10)

Output Schema

ParametersJSON Schema
NameRequiredDescription
dataNo
errorNoError message if request failed
successYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint false, idempotentHint true, destructiveHint false. Description adds details about output format, pricing, and limits (100 pages), but does not cover error handling or failure behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five concise sentences: action, output, use cases, auth, pricing/alternative. Every sentence serves a purpose; no redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With two parameters and output schema present, description covers output format, auth, pricing, use cases, and alternative. No gaps for the tool's purpose.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema already describes both parameters fully (100% coverage). Description adds value by mentioning pricing per page, implying cost implications for the limit parameter, beyond what schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states 'crawl a domain and scrape multiple pages using Firecrawl' and explicitly contrasts with sibling tool iliad_web_research for single-page scrapes, providing strong differentiation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Lists ideal use cases (site mapping, content audits, bulk research) and explicitly directs to alternative for single-page needs, along with pricing and authorization requirements.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

improve_my_agent_with_axisAInspect

Analyze an agent codebase and return a prioritized AXIS hardening plan. Requires Authorization: Bearer ; this creates a snapshot and may return auth, quota, file-limit, or validation errors. Example: pass your agent source files to see missing AGENTS.md, CLAUDE.md, and MCP config gaps. Use this when you want recommendations and missing-context detection. Use analyze_files instead when you want the full artifact bundle directly.

ParametersJSON Schema
NameRequiredDescriptionDefault
filesYesSource files of the agent to analyze
project_nameYesName of the agent/project to improve

Output Schema

ParametersJSON Schema
NameRequiredDescription
analysisYes
call_againYes
mcp_configYes
snapshot_idYes
project_nameYes
improvement_planYes
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description goes beyond annotations by stating that the tool 'creates a snapshot' (indicating a side effect) and may return specific errors (auth, quota, file-limit, validation). Annotations already mark readOnlyHint=false, so this adds useful behavioral context without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficient: two sentences plus an example and a comparison. Every part is valuable, front-loaded with purpose and key constraints, and no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description does not need to explain return values. It covers purpose, usage guidance, authorization, errors, and alternative tool, making it fully complete for the tool's complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description provides an example ('pass your agent source files to see missing AGENTS.md, CLAUDE.md, and MCP config gaps') that adds contextual meaning to the parameters, earning an extra point.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool analyzes an agent codebase and returns a prioritized AXIS hardening plan. It distinguishes itself from the sibling analyze_files by specifying that this tool provides recommendations and missing-context detection, while analyze_files returns the full artifact bundle directly.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Use this when you want recommendations and missing-context detection. Use analyze_files instead when you want the full artifact bundle directly.' It also mentions required authorization and potential errors, providing clear guidance on when to use this tool versus its sibling.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_programsA
Read-onlyIdempotent
Inspect

Inventory mode. List all 20 AXIS programs, their generators, pricing tier, and artifact paths. Free, no auth, and no side effects. Use search_and_discover_tools instead when you only have a keyword, or discover_commerce_tools when you need install and onboarding metadata.

ParametersJSON Schema
NameRequiredDescriptionDefault

No parameters

Output Schema

ParametersJSON Schema
NameRequiredDescription
programsYes
pro_programsYes
free_programsYes
total_programsYes
total_generatorsYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint. The description adds value beyond these by stating 'Free, no auth, and no side effects,' providing context about cost and authentication requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, front-loaded with the core purpose. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no parameters and the presence of an output schema, the description adequately explains what is returned (list of programs with specific fields) and the scope (all 20). It is complete for the tool's simplicity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has no parameters and schema description coverage is 100%. The description does not need to add parameter information, so baseline 4 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'List' and identifies the resource '20 AXIS programs', along with the fields returned (generators, pricing tier, artifact paths). It also distinguishes this tool from siblings by naming alternatives for different use cases.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool and when to use alternatives: 'Use search_and_discover_tools instead when you only have a keyword, or discover_commerce_tools when you need install and onboarding metadata.'

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prepare_agentic_purchasingAInspect

Prepare a codebase for agentic purchasing and return a readiness score plus commerce artifacts. Requires Authorization: Bearer ; paid analysis records a new snapshot and may return auth, quota, payment, file-limit, or validation errors. Example: submit checkout files with focus_areas=["sca","dispute"]. Use this when you need AP2/UCP/Visa, CE 3.0 dispute evidence, checkout, dispute, and negotiation hardening. Engineer mode (X-Agent-Mode: engineer — Commerce Integration, $250): also emits a deployable x402/AP2/PAI'D endpoint + a runnable sandbox test + a schema-validatable CE 3.0 pack + a transparent dispute-readiness score (a working integration, not just a score). Use discover_agentic_purchasing_needs instead when you only need workflow triage.

ParametersJSON Schema
NameRequiredDescriptionDefault
filesYesArray of {path, content} objects representing source files
focusNoAnalysis focus (default: purchasing)
goalsYesProject goals
agent_typeNoConsuming agent type hint
frameworksYesDetected or known frameworks
focus_areasNoCompliance focus areas
project_nameYesName of the project
project_typeYesProject type (web_application, api_service, cli_tool, library, monorepo)
referral_tokenNoOptional referral token from another agent
spending_windowNoAgent spending window
budget_per_run_centsNoAgent budget for this call in cents

Output Schema

ParametersJSON Schema
NameRequiredDescription
statusYes
summaryYes
project_idYes
snapshot_idYes
artifact_countYes
programs_executedYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate it is not read-only, not idempotent, and not destructive. The description adds meaningful context: requires authorization, records a new snapshot, may return various errors, and details engineer mode outputs. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, then provides authentication, example, usage guidance, mode details, and alternative. It is relatively long but well-organized, with each sentence serving a clear purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (11 parameters, output schema present), the description covers purpose, usage conditions, error types, and mode details. It is sufficient for an agent to understand when and how to invoke it, though it does not explain the output beyond mentioning readiness score and artifacts (output schema covers that).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage, so the schema already documents all 11 parameters. The description provides one example use of 'focus_areas' but does not add significant new semantic meaning beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool prepares a codebase for agentic purchasing and returns a readiness score and commerce artifacts. It distinguishes from the sibling tool 'discover_agentic_purchasing_needs' by specifying this is for full hardening while that is for workflow triage.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('when you need AP2/UCP/Visa...') and points to 'discover_agentic_purchasing_needs' as an alternative for simpler needs. This provides clear usage boundaries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prepare_agentic_purchasing_previewA
Read-onlyIdempotent
Inspect

Compute a free Purchasing Readiness Score (0-100) and gap list for a codebase without generating artifacts. No auth, no charge, no snapshot persisted. Hard caps: 25 files / 50KB per file / 1MB total. Returns score, risk_level, top gaps, frameworks detected, and which AXIS programs would close which gaps. Use this to triage 'should I pay for the full hardening bundle?' before calling prepare_agentic_purchasing. The paid version generates the full artifact bundle including CE 3.0 dispute evidence, SCA exemption matrix, and TAP interop.

ParametersJSON Schema
NameRequiredDescriptionDefault
filesYesSource files to triage (max 25 files, 50KB each, 1MB total)
frameworksNoOptional framework hints
project_nameYesName of the project being previewed
project_typeNoOptional project type hint (web_application, api_service, cli_tool, library, monorepo)

Output Schema

ParametersJSON Schema
NameRequiredDescription
costNo
gapsNo
scoreNoCurrent Purchasing Readiness Score (0-100) for the codebase as submitted
strengthsNo
conversionNo
risk_levelNo
top_3_gapsNo
interpretationNo
frameworks_detectedNo
what_axis_would_addNo
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond annotations: 'No auth, no charge, no snapshot persisted' confirms read-only, idempotent, non-destructive behavior. It also specifies hard caps (25 files, 50KB per file, 1MB total), which are not in annotations. No contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences efficiently convey purpose, constraints, usage, and sibling distinction. Every sentence adds value; no redundancy. Front-loaded with key information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's medium complexity (4 parameters, output schema exists), the description covers purpose, constraints, usage context, and return values (score, risk_level, gaps, programs). It is self-contained and leaves no major gaps for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%. The description reiterates optionality of frameworks and project_type but does not add new meaning beyond the schema. The hard caps are already in the schema's files description, so the description adds minimal value for parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool computes a free Purchasing Readiness Score (0-100) and gap list without generating artifacts, and explicitly distinguishes it from the sibling tool prepare_agentic_purchasing by describing the triage use case.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance: use this tool to triage whether the paid version is needed before calling prepare_agentic_purchasing. It also notes 'No auth, no charge' indicating low risk. However, it does not mention alternative analysis tools, but the sibling differentiation is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

search_and_discover_toolsA
Read-onlyIdempotent
Inspect

Search AXIS programs by keyword and return ranked matches with artifact paths. Free, no auth, and no stateful side effects. Example: q=checkout returns commerce-relevant programs first. Use this when you know the outcome you want but not the right program. Use list_programs instead for the full catalog, discover_commerce_tools for install metadata, or discover_agentic_purchasing_needs for purchasing-specific triage.

ParametersJSON Schema
NameRequiredDescriptionDefault
qNoSearch query — keyword or phrase
programNoOptional: filter results to a specific program name

Output Schema

ParametersJSON Schema
NameRequiredDescription
queryYes
resultsYes
total_matchesYes
program_filterYes
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. Description adds that it's free, no auth, no stateful side effects, and gives an example. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is two sentences plus example. Front-loaded with purpose. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Output schema exists, so return values need not be explained. Description covers purpose, when to use, alternatives, parameters, and behavior. Complete given complexity.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, baseline 3. Description adds example usage (q=checkout) and explains filtering behavior, adding value beyond schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states it searches AXIS programs by keyword and returns ranked matches with artifact paths. It distinguishes from siblings like list_programs, discover_commerce_tools, etc.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use (know outcome but not program) and when not to use (list_programs for full catalog, discover_commerce_tools for install metadata, discover_agentic_purchasing_needs for purchasing-specific triage).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Discussions

No comments yet. Be the first to start the discussion!

Try in Browser

Your Connectors

Sign in to create a connector for this server.