
Server Details

Build, validate, and deploy multi-agent AI solutions from any AI environment.

Status: Healthy
Last Tested: (not shown)
Transport: Streamable HTTP
URL: (not shown)
Repository: ariekogan/ateam-mcp
GitHub Stars: 0
Server Listing: ateam-mcp

Tool Descriptions (grade A)

Average 4.3/5 across 40 of 40 tools scored. Lowest: 3.2/5.

Server Coherence (grade B)
Disambiguation: 2/5

Many tools have overlapping purposes, e.g., ateam_github_patch vs ateam_github_write both write to GitHub, and ateam_build_and_run vs ateam_redeploy vs ateam_patch all involve deployment. The high number of tools (40) increases confusion despite detailed descriptions.

Naming Consistency: 5/5

All tools follow a consistent 'ateam_verb_noun' pattern, with clear verbs like auth, bootstrap, delete, get, test, etc. No mixing of conventions.

Tool Count: 1/5

40 tools is far above the typical well-scoped range of 3-15. While the domain is complex, this many tools likely overwhelms agents and introduces unnecessary complexity.

Completeness: 5/5

The tool set covers the full lifecycle: auth, onboarding, scaffolding, GitHub operations, deployment, testing, debugging, and cleanup. No obvious gaps for the platform's purpose.

Available Tools

42 tools
ateam_auth (grade A)

Authenticate with A-Team. Required before any tenant-aware operation (reading solutions, deploying, testing, etc.). The user can get their API key at https://mcp.ateam-ai.com/get-api-key. Only global endpoints (spec, examples, validate) work without auth. IMPORTANT: Even if environment variables (ADAS_API_KEY) are configured, you MUST call ateam_auth explicitly — env vars alone are not sufficient. For cross-tenant admin operations, use master_key instead of api_key.

Parameters
| Name | Required | Description | Default |
| url | No | Optional API URL override (e.g., https://dev-api.ateam-ai.com). Use this to target a different environment without restarting the MCP server. | |
| tenant | No | Tenant name (e.g., dev, main). Optional with api_key if format is adas_<tenant>_<hex>. REQUIRED with master_key. | |
| api_key | No | Your A-Team API key (e.g., adas_xxxxx) | |
| master_key | No | Master key for cross-tenant operations. Authenticates across ALL tenants without per-tenant API keys. Requires tenant parameter. | |
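
The parameter rules above can be sketched as a small argument builder. This is a hedged illustration, not the server's own validation: the helper names are hypothetical, the key pattern is inferred from the adas_<tenant>_<hex> note, and treating api_key and master_key as mutually exclusive is an assumption. Only the outer envelope follows the standard MCP tools/call request shape.

```python
import json
import re

# Assumed key shape, inferred from the tenant parameter note: adas_<tenant>_<hex>.
_KEY_WITH_TENANT = re.compile(r"^adas_(?P<tenant>.+)_(?P<hex>[0-9a-fA-F]+)$")


def tenant_from_api_key(api_key):
    """Return the tenant embedded in the key, or None if no tenant part is present."""
    match = _KEY_WITH_TENANT.match(api_key)
    return match.group("tenant") if match else None


def auth_arguments(api_key=None, master_key=None, tenant=None, url=None):
    """Build the arguments for ateam_auth (hypothetical helper, not part of the server)."""
    # Assumption: api_key and master_key are alternatives, so exactly one is expected.
    if bool(api_key) == bool(master_key):
        raise ValueError("pass exactly one of api_key or master_key")
    # The listing states that tenant is REQUIRED with master_key.
    if master_key and not tenant:
        raise ValueError("tenant is required with master_key")
    args = {"master_key": master_key} if master_key else {"api_key": api_key}
    if tenant:
        args["tenant"] = tenant
    if url:
        args["url"] = url
    return args


def tools_call_request(name, arguments, request_id=1):
    """Wrap a tool call in the JSON-RPC envelope used by MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


if __name__ == "__main__":
    # The key below is a made-up placeholder, not a real credential.
    request = tools_call_request("ateam_auth", auth_arguments(api_key="adas_dev_0123abcd"))
    print(json.dumps(request, indent=2))
```

With a key that embeds the tenant, the tenant argument can be left out; with a master key it cannot.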
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses that the tool establishes authentication state for subsequent calls, explains the relationship between api_key (tenant-specific) and master_key (cross-tenant), and clarifies that environment variables alone are insufficient. Minor gap: doesn't describe token expiration or explicit side effects beyond "authentication".

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Six information-dense sentences with zero waste: purpose (1), requirement scope (2), credential source (3), exceptions (4), environment variable constraint (5), and cross-tenant alternative (6). Front-loaded with the core action, followed by operational context and specific constraints. Appropriate length for a critical prerequisite tool.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 4-parameter authentication tool with 100% schema coverage but no output schema, the description is substantially complete. It covers credential acquisition (URL provided), operational prerequisites relative to siblings, environment variable interactions, and tenant scoping rules. Only minor gap is absence of return value description, though this is less critical for an auth initialization tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, establishing a baseline of 3. The description adds valuable semantic context beyond the schema: api_key format examples ("adas_xxxxx"), tenant format pattern ("adas_<tenant>_<hex>"), and the mutual exclusivity/relationship logic between api_key, master_key, and tenant parameters that isn't captured in isolated schema field descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with "Authenticate with A-Team" (specific verb+resource) and immediately distinguishes this from siblings by stating it's "Required before any tenant-aware operation (reading solutions, deploying, testing, etc.)"—explicitly listing the operational sibling tools that depend on this auth step.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use ("Required before any tenant-aware operation"), when-not-needed ("Only global endpoints (spec, examples, validate) work without auth"), and alternatives ("For cross-tenant admin operations, use master_key instead of api_key"). Also includes critical exclusion: "Even if environment variables (ADAS_API_KEY) are configured, you MUST call ateam_auth explicitly".

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_bootstrap (grade A)

REQUIRED onboarding entrypoint for A-Team MCP. MUST be called when user greets, says hi, asks what this is, asks for help, explores capabilities, or when MCP is first connected. Returns platform explanation, example solutions, and assistant behavior instructions. Do NOT improvise an introduction — call this tool instead.

Parameters
| Name | Required | Description | Default |

No parameters

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses what the tool returns (platform explanation, instructions), but omits side effects, idempotency, prerequisites, or whether calling it multiple times is safe. Adequate but not rich for a 'bootstrap' entrypoint.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Highly structured with four short, dense sentences. Front-loaded with 'REQUIRED' to signal importance. The first two sentences cover purpose and triggers; the last two cover returns and prohibitions. Zero waste.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, but description compensates by detailing return contents. Given zero parameters and no annotations, description provides sufficient context for an onboarding tool, though could note error conditions or retry behavior.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has zero parameters. Description correctly implies no user input is needed by focusing entirely on trigger conditions rather than parameters. Baseline score applies.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description explicitly defines this as the 'REQUIRED onboarding entrypoint' and distinguishes it from operational siblings (auth, build, delete, etc.) by specifying it returns 'platform explanation, example solutions, and assistant behavior instructions.' Clear verb+resource+scope.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit trigger conditions ('when user greets, says hi, asks what this is...') and explicit prohibition ('Do NOT improvise an introduction — call this tool instead'). Covers both when-to-use and when-not-to-use with high specificity.

ateam_build_and_run (grade A)

DEPLOY THE CURRENT MAIN BRANCH TO A-TEAM CORE. ⚠️ HEAVIEST OPERATION (60-180s): validates solution+skills → deploys all connectors+skills to Core (regenerates MCP servers) → health-checks → optionally runs a warm test → auto-pushes to GitHub.

🌳 DEV/PROD WORKFLOW:

  1. Edit files → ateam_github_patch (writes to dev branch by default)

  2. (Optional) Preview what's about to ship → ateam_github_diff

  3. Ship dev → main → ateam_github_promote (merges + auto-tags prod-YYYY-MM-DD-NNN)

  4. Deploy main to Core → ateam_build_and_run

This tool ALWAYS deploys the main branch — there is no ref parameter. To deploy in-progress dev work, first promote it.

AUTO-DETECTS GitHub repo: if you omit mcp_store and a repo exists, connector code is pulled from main automatically. First deploy requires mcp_store. After that, edit via ateam_github_patch + promote, then build_and_run. For small changes prefer ateam_patch (faster, incremental). Requires authentication.
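
The four-step dev/prod workflow described above could be expressed as an ordered plan of tool names. This is only a sketch: release_plan is a hypothetical helper, and it deliberately lists tool names without arguments because the argument shapes of the GitHub tools are not given in this section.

```python
def release_plan(preview_diff=True):
    """Return the ordered tool names for shipping dev work to A-Team Core."""
    steps = ["ateam_github_patch"]         # 1. edit files (dev branch by default)
    if preview_diff:
        steps.append("ateam_github_diff")  # 2. optional preview of what will ship
    steps.append("ateam_github_promote")   # 3. merge dev into main and tag the release
    steps.append("ateam_build_and_run")    # 4. deploy main to Core
    return steps


if __name__ == "__main__":
    for number, tool in enumerate(release_plan(), start=1):
        print(number, tool)
```

Because ateam_build_and_run always deploys main, the promote step has to come before it whenever the change was made on the dev branch.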

Parameters
| Name | Required | Description | Default |
| github | No | Optional: if true, pull connector source code from main. AUTO-DETECTED: if you omit both mcp_store and github, the system checks if a repo exists and pulls from main automatically. | |
| skills | No | Optional after first deploy: skill definitions. If omitted, auto-pulled from main (skills/{id}/skill.json). | |
| solution | No | Full solution definition. Required on first deploy. After first deploy, just pass solution_id instead — everything is auto-pulled from GitHub main. | |
| mcp_store | No | Optional: connector source code files. Key = connector id, value = array of {path, content}. | |
| connectors | No | Optional: connector metadata (id, name, transport). Entry points auto-detected from mcp_store. | |
| solution_id | No | The solution ID. Use this INSTEAD of passing the full solution object — the solution definition is auto-pulled from main. Required if solution object is omitted. | |
| test_message | No | Optional: send a test message after deployment to verify the skill works. Returns the full execution result. | |
| test_skill_id | No | Optional: which skill to test (defaults to the first skill). | |
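
A minimal sketch of how the first-deploy and later-deploy argument sets differ, based on the parameter notes above. The helper name and the first_deploy flag are hypothetical, and the solution object in the example is a placeholder since its real shape is not documented in this section.

```python
def build_and_run_arguments(*, first_deploy, solution=None, solution_id=None,
                            mcp_store=None, test_message=None, test_skill_id=None):
    """Build arguments for ateam_build_and_run (hypothetical helper)."""
    if first_deploy:
        # The listing says the first deploy needs the full solution and mcp_store.
        if solution is None or mcp_store is None:
            raise ValueError("first deploy needs the full solution and mcp_store")
    elif solution is None and solution_id is None:
        # Later deploys can pass solution_id alone; the rest is pulled from main.
        raise ValueError("solution_id is required if the solution object is omitted")
    candidates = {
        "solution": solution,
        "solution_id": solution_id,
        "mcp_store": mcp_store,
        "test_message": test_message,
        "test_skill_id": test_skill_id,
    }
    # Omit unset keys so the server-side defaults and auto-detection apply.
    return {key: value for key, value in candidates.items() if value is not None}


if __name__ == "__main__":
    # mcp_store maps a connector id to an array of {path, content} entries.
    store = {"device-mock-mcp": [{"path": "server.js", "content": "// connector code"}]}
    first = build_and_run_arguments(
        first_deploy=True,
        solution={"id": "smart-home-assistant"},  # placeholder shape, assumed
        mcp_store=store,
    )
    later = build_and_run_arguments(first_deploy=False, solution_id="smart-home-assistant")
    print(sorted(first), sorted(later))
```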
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description fully discloses behavior: heaviest operation (60-180s), validation, deployment of all connectors/skills, health checks, optional warm test, auto-push to GitHub, and authentication requirement. It also explains auto-detection of GitHub repo.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with an emoji warning, step-by-step workflow, and key notes. It is front-loaded with purpose. However, some repetition (e.g., always deploys main) could be trimmed; slightly too long for a concise definition.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description covers workflow integration, prerequisites, alternative tools, and auto-detection. It provides a complete mental model for the agent to decide usage and parameter configuration effectively.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions cover 100% of parameters, but the description adds critical context: e.g., 'github' is auto-detected, 'solution' required only on first deploy, 'solution_id' as alternative. This enhances parameter understanding beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool deploys the current main branch to A-Team Core, with a specific verb and resource. It distinguishes from siblings like ateam_patch (small changes) and ateam_github_promote (branch merge), providing a clear workflow step.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit when-to-use (deploy main branch) and when-not-to-use (prefer ateam_patch for small changes). It outlines prerequisites (first deploy needs mcp_store) and a 4-step workflow, offering clear guidance on tool selection.

ateam_conversation (grade A)

Send a message to a deployed solution and get the result. No skill_id needed — the system auto-routes to the right skill. Supports multi-turn conversations: pass the actor_id from a previous response to continue the thread (e.g., reply to a confirmation prompt). Each call creates a new job but the same actor_id maintains conversation context.

Parameters
| Name | Required | Description | Default |
| wait | No | If true (default), wait for completion. If false, return job_id immediately for polling. | |
| message | Yes | The message to send (e.g., 'send email to X' or 'I confirm') | |
| actor_id | No | Optional: actor ID from a previous response to continue the conversation. Omit for a new conversation. | |
| timeout_ms | No | Optional: max wait time in ms (default: 60000, max: 300000). | |
| solution_id | Yes | The solution ID | |
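
The multi-turn pattern described above might look like the following sketch. The call_tool function is supplied by the caller and stands in for whatever MCP client is in use; the assumption that a response carries a top-level "actor_id" field is ours, since the listing only says the actor_id comes from a previous response.

```python
DEFAULT_TIMEOUT_MS = 60_000   # default stated in the parameter table
MAX_TIMEOUT_MS = 300_000      # maximum stated in the parameter table


def conversation_arguments(solution_id, message, actor_id=None, wait=True,
                           timeout_ms=DEFAULT_TIMEOUT_MS):
    """Build arguments for ateam_conversation (hypothetical helper)."""
    args = {
        "solution_id": solution_id,
        "message": message,
        "wait": wait,
        "timeout_ms": min(timeout_ms, MAX_TIMEOUT_MS),  # clamp to the documented max
    }
    if actor_id is not None:
        args["actor_id"] = actor_id  # omit entirely to start a new conversation
    return args


def run_turns(call_tool, solution_id, messages):
    """Send messages in order, threading actor_id so context is kept across jobs."""
    actor_id = None
    results = []
    for message in messages:
        result = call_tool(
            "ateam_conversation",
            conversation_arguments(solution_id, message, actor_id),
        )
        # Assumed response field name; keep the previous id if none is returned.
        actor_id = result.get("actor_id", actor_id)
        results.append(result)
    return results


if __name__ == "__main__":
    def fake_call_tool(name, arguments):
        # Stand-in for a real MCP client call; always returns the same actor.
        return {"actor_id": "actor-1", "echo": arguments["message"]}

    for reply in run_turns(fake_call_tool, "demo-solution", ["send email to X", "I confirm"]):
        print(reply)
```

Each call still creates a new job; only the actor_id ties the second message to the first.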
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full disclosure burden and succeeds well: it explains the auto-routing behavior, that each call creates a new job (side effect), and that actor_id maintains conversation state across jobs. It clarifies the synchronous nature implied by 'get the result' (aligned with the wait parameter).

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with zero redundancy. Front-loaded with core purpose, followed by routing logic, multi-turn mechanics, and job/context separation. Every sentence adds non-obvious information about behavior or usage.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Comprehensive for a conversational tool with no output schema. It adequately covers the essential mechanics: routing, job creation, and state persistence. Minor gap in not describing error conditions or the specific structure of 'the result', but given the schema coverage and complexity level, it provides sufficient context for correct invocation.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While the schema has 100% coverage (baseline 3), the description adds significant semantic value: it provides concrete message examples ('send email to X', 'I confirm') and explains the actor_id parameter's purpose in the conversation lifecycle ('continue the thread', 'reply to a confirmation prompt'), helping the agent understand the stateful interaction pattern.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with a precise action ('Send a message to a deployed solution') and outcome ('get the result'). It clearly distinguishes this from sibling tools by emphasizing the conversation/messaging pattern versus build, delete, or test operations evident in sibling names like ateam_build_and_run or ateam_delete_solution.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear context for when to use (messaging deployed solutions, multi-turn interactions) and explains the conversation continuation pattern via actor_id. The note about auto-routing ('No skill_id needed') implicitly guides against manual skill selection. Lacks explicit 'when not to use' or named alternative tools.

ateam_create_connector (grade A)

Scaffold a new MCP connector with server.js + package.json + README. Eliminates ~50% of identical boilerplate (MCP server setup, tool registration, stdio transport). You then fill in the tool implementations. Set ui_capable=true to include ui.listPlugins / ui.getPlugin stubs (plugin source files added separately via ateam_create_plugin). After scaffolding, the files are uploaded to Core via the same path as ateam_upload_connector.

Parameters
| Name | Required | Description | Default |
| name | No | Human-readable name for the connector (e.g. 'Hue Lights'). Defaults to connector_id. | |
| ui_capable | No | If true, include ui.listPlugins/ui.getPlugin handler stubs. Default: false. | |
| solution_id | Yes | The solution ID | |
| connector_id | Yes | Connector ID (lowercase-with-dashes, no spaces). Becomes the directory name. | |
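
Since connector_id becomes a directory name, a client may want to check it before calling the tool. The sketch below is an assumption-laden illustration: the helper is hypothetical, and the regular expression is our reading of "lowercase-with-dashes, no spaces" (whether digits are accepted by the server is not stated).

```python
import re

# Assumed reading of "lowercase-with-dashes, no spaces"; digits are allowed here.
_ID_PATTERN = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def create_connector_arguments(solution_id, connector_id, name=None, ui_capable=False):
    """Build arguments for ateam_create_connector (hypothetical helper)."""
    if not _ID_PATTERN.match(connector_id):
        raise ValueError(f"connector_id must be lowercase-with-dashes: {connector_id!r}")
    args = {
        "solution_id": solution_id,
        "connector_id": connector_id,
        "ui_capable": ui_capable,
    }
    if name is not None:
        # When omitted, the listing says the name defaults to connector_id.
        args["name"] = name
    return args


if __name__ == "__main__":
    print(create_connector_arguments("smart-home-assistant", "hue-lights", name="Hue Lights"))
```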
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the burden of behavioral disclosure. It mentions that scaffolding eliminates ~50% of boilerplate, creates specific files, and uploads via the same path as ateam_upload_connector. It does not disclose potential side effects like overwriting existing files or auth requirements, but the core behavior is clearly described.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively concise and front-loaded with the main action. It contains necessary details without unnecessary verbosity. Minor improvement could be separating the upload note more clearly.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the parameter count of 4 and no output schema, the description adequately covers the tool's purpose, input parameters, and workflow. It references sibling tools for next steps. It could mention idempotency or error cases, but overall it is sufficient for an agent to use the tool correctly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage with descriptions. The description adds additional context: name defaults to connector_id, connector_id becomes directory name, and ui_capable conditionally includes stubs. This adds meaningful guidance beyond the schema.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'scaffold a new MCP connector with server.js + package.json + README', providing a specific verb and resource. It distinguishes itself from sibling tools like ateam_upload_connector by noting the upload path is the same, implying this tool creates the scaffolding before uploading.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives clear context: 'You then fill in the tool implementations' and explains when to set ui_capable=true. It also references ateam_create_plugin for plugin source files. However, it does not explicitly state when not to use this tool or compare with alternatives like ateam_upload_connector in terms of workflow sequencing.

ateam_create_plugin (grade A)

Scaffold a UI plugin (iframe HTML, React Native TSX, or both) inside an existing connector. Eliminates ~50% of identical plugin boilerplate (imports, theme/bridge hooks, postMessage protocol, default export shape). You then fill in the component body. Use kind='iframe' for web-only, 'rn' for mobile-only, 'adaptive' for both. Auto-discovery (Phase 5 of the strip) picks up the new plugin at next deploy without a manifest declaration.

Parameters
| Name | Required | Description | Default |
| kind | No | Render mode. 'adaptive' (default) produces both iframe + RN scaffolds. | |
| plugin_name | Yes | Plugin name (lowercase-with-dashes). E.g. 'memories-panel'. Becomes the dir name. | |
| solution_id | Yes | The solution ID | |
| connector_id | Yes | Existing connector to add the plugin into (e.g. 'personal-assistant-ui-mcp') | |
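
The relationship between kind and the scaffolds it produces can be summarized in a small lookup. This is a sketch under stated assumptions: the mapping restates the description ('iframe' for web-only, 'rn' for mobile-only, 'adaptive' for both), while the helper names and the rejection of unknown kinds are hypothetical.

```python
# Which scaffolds each render mode produces, as described in the listing.
SCAFFOLDS_BY_KIND = {
    "iframe": ("iframe",),          # web-only
    "rn": ("rn",),                  # mobile-only (React Native)
    "adaptive": ("iframe", "rn"),   # both; this is the documented default
}


def scaffolds_for(kind="adaptive"):
    """Return the scaffold types a given kind would generate."""
    if kind not in SCAFFOLDS_BY_KIND:
        raise ValueError(f"unknown kind: {kind!r}")
    return SCAFFOLDS_BY_KIND[kind]


def create_plugin_arguments(solution_id, connector_id, plugin_name, kind="adaptive"):
    """Build arguments for ateam_create_plugin (hypothetical helper)."""
    scaffolds_for(kind)  # reject unknown kinds before building the call
    return {
        "solution_id": solution_id,
        "connector_id": connector_id,
        "plugin_name": plugin_name,
        "kind": kind,
    }


if __name__ == "__main__":
    print(scaffolds_for("adaptive"))
    print(create_plugin_arguments("personal-adas", "personal-assistant-ui-mcp", "memories-panel"))
```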
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description carries the full burden. It discloses the scaffolding process and auto-discovery, but does not mention potential errors (e.g., if plugin already exists or connector missing), required permissions, or the exact output. Behavior is partially transparent.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at roughly 65 words across 5 sentences. The purpose is stated first, followed by the benefit, what remains for the author to fill in, parameter guidance, and a deployment detail. No wasted words; each sentence serves a distinct purpose.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters and no output schema or annotations, the description covers the main aspects: purpose, parameter semantics, and process. It does not explain the return value or error conditions, but overall it provides sufficient context for an agent to use the tool.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the 'kind' enum options and clarifying that 'plugin_name' should be lowercase-with-dashes and becomes the directory name. This goes beyond the schema descriptions.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool scaffolds a UI plugin inside an existing connector. It specifies the resource (plugin), verb (scaffold), and scope (inside connector). It distinguishes from siblings by focusing on plugin creation, while sibling tools cover auth, bootstrap, connectors, etc.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use each value of the 'kind' parameter ('iframe', 'rn', 'adaptive') and mentions auto-discovery at deploy. However, it does not explicitly state when not to use the tool or mention alternatives (e.g., if the connector does not exist). Some context is missing, but the provided guidance is clear.

ateam_delete_connector (grade A)

Remove a connector from a deployed solution. Stops and deletes it from A-Team Core, removes references from the solution definition (grants, platform_connectors) and skill definitions (connectors array), and cleans up mcp-store files.

Parameters
| Name | Required | Description | Default |
| solution_id | Yes | The solution ID (e.g. 'smart-home-assistant') | |
| connector_id | Yes | The connector ID to remove (e.g. 'device-mock-mcp') | |
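
To make the cascade concrete, here is a purely local sketch of what removing a connector's references might look like on in-memory definitions. It does not call the server and is not the platform's implementation. The data shapes are assumptions: platform_connectors and each skill's connectors array are modeled as lists of ID strings, and grants are left out because their structure is not described in this section.

```python
import copy


def without_connector(solution, skills, connector_id):
    """Return copies of the definitions with references to connector_id removed.

    Assumed shapes (not confirmed by the listing):
      solution["platform_connectors"] is a list of connector ID strings
      each skill["connectors"] is a list of connector ID strings
    """
    solution = copy.deepcopy(solution)
    skills = copy.deepcopy(skills)
    solution["platform_connectors"] = [
        cid for cid in solution.get("platform_connectors", []) if cid != connector_id
    ]
    for skill in skills:
        skill["connectors"] = [
            cid for cid in skill.get("connectors", []) if cid != connector_id
        ]
    return solution, skills


if __name__ == "__main__":
    solution = {"id": "smart-home-assistant",
                "platform_connectors": ["device-mock-mcp", "hue-lights"]}
    skills = [{"id": "lights", "connectors": ["hue-lights", "device-mock-mcp"]}]
    print(without_connector(solution, skills, "device-mock-mcp"))
```

Running a check like this before the real call is one way to preview which skills would lose a connector.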
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and excels by disclosing cascade effects: it stops/deletes from Core, removes references from solution definitions and skill definitions, and cleans up mcp-store files. This comprehensively explains the destructive footprint and side effects beyond simple deletion.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences: a one-line purpose statement, then a dense sentence that enumerates four distinct cleanup actions (Core deletion, solution def updates, skill def updates, file cleanup). Every clause conveys necessary behavioral information with no redundancy or waste.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Comprehensive for the destructive operation itself, detailing all affected resources. Minor gap: given no output schema and no annotations, it could mention return behavior (void/success message) or error conditions (e.g., if connector doesn't exist), but the core behavioral disclosure is complete.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage with clear examples ('smart-home-assistant', 'device-mock-mcp'). The description does not add parameter semantics beyond the schema, but this meets the baseline expectation when schema coverage is complete.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description opens with 'Remove a connector from a deployed solution,' providing a specific verb (Remove) and resource (connector). It clearly distinguishes from sibling ateam_delete_solution by specifying it operates on a connector 'from' a solution rather than deleting the solution itself.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description details the scope of changes (removing references from grants, platform_connectors, skill definitions) which implicitly guides usage, but lacks explicit when-to-use guidance versus alternatives like ateam_test_connector or ateam_delete_solution, and omits prerequisites or warnings.

ateam_delete_skill (grade A)

Delete a single skill from a deployed solution. Removes the skill from A-Team Core (kills the running MCP process, unregisters from skill registry, deletes from Mongo), removes the skill from solution.skills[] and solution.linked_skills, and deletes the skill's files from Builder FS. Use this to drop a skill without tearing down the whole solution.

Parameters
| Name | Required | Description | Default |
| skill_id | Yes | The skill ID to remove (e.g. 'linkedin-agent') | |
| solution_id | Yes | The solution ID (e.g. 'personal-adas') | |
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description fully details the effects: kills the running MCP process, unregisters from skill registry, deletes from Mongo, removes from solution arrays, and deletes files. This is highly transparent about the destructive nature and all side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences, front-loaded with the main action, and every sentence adds value. There is no unnecessary information, making it highly efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the lack of annotations and output schema, the description covers the tool's purpose, inputs, and extensive behavioral details. It is complete for practical use, though a note about potential errors or return values could improve it slightly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% and the schema already provides clear descriptions for both parameters (skill_id and solution_id). The description adds examples ('linkedin-agent', 'personal-adas') but does not add significant meaning beyond what the schema offers. Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Delete a single skill') and the resource ('from a deployed solution'). It provides a specific verb-resource mapping and distinguishes itself from the sibling tool 'ateam_delete_solution' by noting that this removes only the skill without tearing down the whole solution.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises when to use this tool: 'Use this to drop a skill without tearing down the whole solution.' This implies when not to use it (when you want to delete the entire solution). However, it does not explicitly name the alternative tool (ateam_delete_solution), though the sibling list makes it obvious.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_delete_solution (A)

Delete a deployed solution and all its skills from A-Team. Use with caution — this removes the solution from both the Skill Builder and A-Team Core. Useful for cleaning up test solutions or starting fresh.

Parameters (JSON Schema)
Name | Required | Description | Default
solution_id | Yes | The solution ID to delete | -
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Explicitly discloses scope of destruction ('removes the solution from both the Skill Builder and A-Team Core'), cascading effects ('all its skills'), and danger level ('Use with caution'). Could be enhanced by stating irreversibility explicitly.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences front-loaded with the action. Each sentence earns its place: purpose, scope/warning, and use case guidance. No redundancy or waste despite covering behavioral details.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given a single-parameter destructive operation with no output schema, the description adequately covers what gets destroyed and where. Slightly missing explicit confirmation about irreversibility and prerequisite checks, but sufficiently complete for safe invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage with 'solution_id' fully documented. Description mentions 'deployed solution' which maps to the parameter, but does not add format details, validation rules, or constraints beyond what the schema already provides. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb 'Delete' with clear resource 'deployed solution and all its skills'. Explicitly distinguishes from sibling 'ateam_delete_connector' by targeting 'solution' rather than connector, and from other operations like redeploy or patch.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear positive guidance ('Useful for cleaning up test solutions or starting fresh') and negative caution ('Use with caution'). Missing explicit alternative suggestions (e.g., when to prefer redeploy over delete), but use cases are well-defined.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_get_chain (A)

Inspect the full chain tree for any job — rooted at the given job_id, walking down through every handoff and askAnySkill subcall.

Use when a chain has already run and you want to analyze the structure: which skill called which, how deep the call tree went, which tool inside which job invoked which sub-tool. The two main shapes:
• response.chain.chainJobs[]: one entry per job in the chain. Fields: jobId, skill, status, iteration, depth (0 = root, +1 per askAnySkill subcall hop), relation ('root' | 'subcall' | 'handoff'), parentJobId, parentSkill, goal.
• response.chain.executionSteps[]: every tool call across all chain jobs, tagged with _skill, _jobId, _depth (= job depth), _relation, _parentSkill, _parentJobId, _toolDepth (tool-in-tool nesting via opId/parentOpId).
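The two response shapes can be turned into a readable call tree with a few lines of Python. The following is a minimal sketch, not the server's own tooling: the field names come from the description above, while the sample values (job IDs, skill names) and the helper names `children_by_parent`, `steps_per_job`, and `render_tree` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample shaped like the documented fields; all values are made up.
response = {
    "chain": {
        "chainJobs": [
            {"jobId": "j1", "skill": "router", "depth": 0,
             "relation": "root", "parentJobId": None},
            {"jobId": "j2", "skill": "billing", "depth": 1,
             "relation": "subcall", "parentJobId": "j1"},
            {"jobId": "j3", "skill": "email", "depth": 2,
             "relation": "subcall", "parentJobId": "j2"},
        ],
        "executionSteps": [
            {"_jobId": "j1", "_skill": "router", "_toolDepth": 0},
            {"_jobId": "j2", "_skill": "billing", "_toolDepth": 0},
            {"_jobId": "j2", "_skill": "billing", "_toolDepth": 1},
        ],
    }
}


def children_by_parent(chain_jobs):
    """Map each parentJobId to the list of child jobIds."""
    tree = defaultdict(list)
    for job in chain_jobs:
        tree[job.get("parentJobId")].append(job["jobId"])
    return dict(tree)


def steps_per_job(execution_steps):
    """Count tool calls per job using the _jobId tag."""
    counts = defaultdict(int)
    for step in execution_steps:
        counts[step["_jobId"]] += 1
    return dict(counts)


def render_tree(chain_jobs):
    """Return an indented text rendering of the chain, one line per job."""
    jobs = {job["jobId"]: job for job in chain_jobs}
    kids = children_by_parent(chain_jobs)
    lines = []

    def walk(job_id, indent):
        job = jobs[job_id]
        lines.append("  " * indent + f"{job['skill']} [{job['relation']}] ({job_id})")
        for child in kids.get(job_id, []):
            walk(child, indent + 1)

    for job in chain_jobs:
        if job["relation"] == "root":
            walk(job["jobId"], 0)
    return lines


chain = response["chain"]
print("\n".join(render_tree(chain["chainJobs"])))
print("max depth:", max(job["depth"] for job in chain["chainJobs"]))
print("steps per job:", steps_per_job(chain["executionSteps"]))
```

Grouping by parentJobId rather than by depth keeps the sketch independent of how depth is counted for handoffs, which the description does not spell out.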

Differs from ateam_test_status by purpose: status is for live polling of a job you just kicked off; get_chain is for post-hoc tree analysis (debugging multi-skill flows, regression testing, comparing two runs).

Auth: forwards your authed api_key. Tenant scoped by the key itself. Actor scoping: you can only inspect chains rooted at jobs your actor has access to.

Parameters (JSON Schema)
Name | Required | Description | Default
job_id | Yes | The root job ID of the chain to inspect (or any job inside the chain; Core walks up to the root). | -
skill_slug | No | Optional. The skill slug for the job; speeds up the lookup when the job isn't in memory and must be loaded from storage. Omit if you don't have it; lookup still works but does an extra round-trip. | -
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It details output structure (chainJobs[], executionSteps[]) and behavior: walks down chain, can start from any job, walks up to root. Discloses auth forwarding and tenant/actor scoping. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is relatively long but well-structured: purpose first, then usage, then output details, then differentiation. Every sentence adds value. Could be slightly more concise but efficient for the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, description thoroughly explains both parts of the response structure. Covers purpose, behavior, auth, scoping, and parameter nuances. Complete for an agent to understand and invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has 100% description coverage with both params described adequately. Description adds minor context (walking up to root, extra round-trip without slug) but not significantly beyond schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states 'Inspect the full chain tree' with specific verb 'inspect' and resource 'chain tree'. It distinguishes from sibling ateam_test_status by purpose (post-hoc analysis vs live polling).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use when a chain has already run' and contrasts with ateam_test_status. Provides use cases: debugging, regression testing, comparing runs. Also notes auth requirements and scoping constraints.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_get_connector_source (A)

Read the source code files of a deployed MCP connector. Returns all files (server.js, package.json, etc.) stored in the mcp_store for this connector. Use this BEFORE patching or rewriting a connector — always read the current code first so you can make surgical fixes instead of blind full rewrites.

Parameters (JSON Schema)
Name | Required | Description | Default
solution_id | Yes | The solution ID (e.g. 'smart-home-assistant') | -
connector_id | Yes | The connector ID to read (e.g. 'home-assistant-mcp') | -
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It discloses the storage location ('mcp_store'), provides concrete file examples ('server.js, package.json'), and explains the critical workflow implication (prevents blind rewrites). Minor gap: doesn't specify return format (content vs. metadata) or error conditions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, zero waste. Front-loaded with purpose ('Read...'), followed by scope ('Returns...'), then critical workflow guidance ('Use this BEFORE...'). Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema exists, but description partially compensates by stating what is returned ('all files... server.js, package.json'). Covers the essential workflow context for a 2-parameter read tool. Could strengthen by clarifying if file contents or just paths are returned.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear descriptions and examples for both solution_id and connector_id. Description adds minimal semantic value beyond the schema (baseline 3), though it does reference 'mcp_store' as the storage context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb 'Read' + resource 'source code files of a deployed MCP connector' with clear scope (all files in mcp_store). Explicitly distinguishes from siblings like ateam_get_solution or ateam_github_read by specifying 'mcp_store' and 'connector source'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicit workflow instruction: 'Use this BEFORE patching or rewriting a connector.' Names the alternatives implicitly (patching/rewriting) and explains the value proposition ('surgical fixes instead of blind full rewrites').

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_get_examples (A)

Get complete working examples that pass validation. Study these before building your own.

Parameters (JSON Schema)
Name | Required | Description | Default
type | Yes | Example type: 'skill' = Order Support Agent, 'connector' = stdio MCP connector, 'connector-ui' = UI-capable connector, 'solution' = full 3-skill e-commerce solution, 'script-cache-skill' = fat-tool skill with script_cache opt-in (reference implementation of script-level JIT shortcuts; study this before building any browser-automation skill), 'index' = list all available examples | -
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It correctly indicates a read-only retrieval operation ('get'), but provides no details about side effects, authentication needs, or return behavior. This is adequate for a simple fetch but lacks extra context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two short sentences, front-loaded with the action. The second sentence adds usage hint but is not strictly necessary. Still efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The tool has no output schema, so the description should disclose the return format (e.g., file paths, content). It does not, leaving uncertainty for the agent about how to use retrieved examples.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with enum descriptions for each type, so the baseline is 3. The description does not add any additional meaning about the parameter beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states 'Get complete working examples that pass validation' with a clear verb and resource. Combined with the enum parameter specifying example types, the tool's purpose is unambiguous and distinct from all sibling tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The phrase 'Study these before building your own' implies a learning usage, but there is no explicit guidance on when to use this tool versus alternatives or when not to use it. No exclusions or alternatives are mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_get_solution (B)

Read solution state — definition, skills, health, status, or export. Use this to inspect deployed solutions.

Parameters (JSON Schema)
Name | Required | Description | Default
view | Yes | What to read: 'definition' = full solution def, 'skills' = list skills, 'health' = live health check, 'status' = deploy status, 'export' = exportable bundle, 'validate' = re-validate from stored state, 'connectors_health' = connector status | -
skill_id | No | Optional: read a specific skill by ID (original or internal) | -
solution_id | Yes | The solution ID | -
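For orientation, this is roughly what a call to this tool looks like on the wire. The sketch below builds a standard MCP `tools/call` JSON-RPC request; the envelope follows the MCP specification, while the helper name `tools_call` and the chosen argument values ('personal-adas', 'health') are illustrative. Transport setup, session handling, and the prior ateam_auth call are not shown.

```python
import json
from itertools import count

_request_ids = count(1)  # simple monotonically increasing JSON-RPC ids


def tools_call(name, arguments):
    """Build an MCP tools/call request body (JSON-RPC 2.0 envelope)."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Illustrative arguments: solution_id and view are the two required parameters.
request = tools_call(
    "ateam_get_solution",
    {"solution_id": "personal-adas", "view": "health"},
)
print(json.dumps(request, indent=2))
```

The optional skill_id is simply left out of `arguments` when it is not needed.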
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. It enumerates the view modes which hints at return variations, but omits critical behavioral details: does not confirm read-only safety, does not describe error responses (e.g., if solution_id not found), nor authentication requirements. Just above minimum as it does map view options.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Appropriately concise two-sentence structure. Action and resource are front-loaded in first sentence; usage intent follows in second. No filler text, though em-dash list is slightly compressed.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a 3-parameter read operation with complete schema documentation. Missing output schema description (would benefit from knowing return structure) and safety guarantees given lack of readOnlyHint annotation. Acceptable but not exemplary.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% description coverage, establishing baseline 3. Description echoes the view enum values ('definition, skills, health, status, or export') without adding syntax constraints or validation rules beyond the schema. No additional context for optional 'skill_id' parameter (e.g., when to use it) beyond schema's 'Optional' label.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear verb 'Read' and resource 'solution state' with specific facets (definition, skills, health, status, export). Distinguishes from sibling 'ateam_list_solutions' by implying singular instance inspection vs. listing, and from mutation tools like 'ateam_delete_solution' or 'ateam_patch' via 'inspect'. Lacks explicit contrast with list operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides basic usage context with 'Use this to inspect deployed solutions', indicating when to use (examining existing deployments). However, lacks explicit when-NOT-to-use guidance and does not mention alternatives like 'ateam_list_solutions' for discovery without a specific ID.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_get_spec (A)

Get the A-Team specification — schemas, validation rules, system tools, agent guides, and templates. Start here after bootstrap to understand how to build skills and solutions. Use 'section' to get just one part of the skill spec (much smaller than the full spec). Use 'search' to find specific fields or concepts across the spec.

When designing a persona that orchestrates logic via run_python_script (the Python-as-orchestrator pattern), also fetch topic='python_helpers' — that returns the adas.* helper namespace reference. Skills designed without knowing about adas.* produce 5-10x larger / brittler scripts.

Parameters (JSON Schema)
Name | Required | Description | Default
topic | Yes | What to fetch: 'overview' = API overview + endpoints, 'skill' = full skill spec, 'solution' = full solution spec, 'enums' = all enum values, 'connector-multi-user' = multi-user connector guide, 'python_helpers' = adas.* helper namespace for run_python_script orchestration (read this when designing personas that read state → call tools → checkpoint → status; without it, scripts hand-roll JSON parsing and tool delegation = 5-10x larger and brittler). | -
search | No | Optional: filter the spec to only sections containing this search term. Works with any topic. Example: search='bootstrap' returns only fields/sections mentioning 'bootstrap'. | -
section | No | Optional: get just one section of the skill spec (only works with topic='skill'). Sections: 'engine' = model/reasoning/planner optimization/bootstrap tools, 'tools' = tool definitions/meta tools, 'intents' = intents/problem/scenarios, 'policy' = access control/grants/workflows, 'triggers' = automation triggers, 'connectors' = connector linking/channels, 'role' = persona/goals, 'template' = minimal quick start, 'guide' = build steps/common mistakes | -
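The one parameter interaction worth encoding is that 'section' only works with topic='skill', while 'search' works with any topic. A minimal client-side sketch of that rule follows; the helper `build_get_spec_args` is hypothetical, and the section names are copied from the table above. Whether the server itself rejects or ignores a mismatched 'section' is not stated, so this check is an assumption about sensible client behavior.

```python
# Section names as listed in the parameter description.
SKILL_SECTIONS = {
    "engine", "tools", "intents", "policy", "triggers",
    "connectors", "role", "template", "guide",
}


def build_get_spec_args(topic, section=None, search=None):
    """Assemble ateam_get_spec arguments, rejecting invalid combinations early."""
    if section is not None:
        if topic != "skill":
            raise ValueError("'section' only works with topic='skill'")
        if section not in SKILL_SECTIONS:
            raise ValueError(f"unknown section: {section!r}")
    args = {"topic": topic}
    if section is not None:
        args["section"] = section
    if search:
        args["search"] = search
    return args


# A small, targeted fetch instead of the full skill spec.
print(build_get_spec_args("skill", section="tools"))
# 'search' may be combined with any topic.
print(build_get_spec_args("solution", search="bootstrap"))
```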
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses that using 'section' returns a much smaller response, and warns that not knowing about adas.* leads to 5-10x larger/brittler scripts. This adds valuable behavioral context beyond what the schema provides.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and concise. It front-loads the main purpose, then provides usage guidance, and ends with a specific scenario. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the three parameters and no output schema, the description covers retrieval options, filtering, and a special use case. It lacks explicit mention of return format, but overall provides sufficient context for correct invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaningful usage context for parameters (e.g., 'much smaller than the full spec' for section, 'find specific fields' for search). This goes beyond the schema descriptions, justifying a score above the baseline of 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get the A-Team specification — schemas, validation rules, system tools, agent guides, and templates.' It uses a specific verb and resource, and distinguishes itself from sibling tools by positioning itself as the entry point after bootstrap.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use the tool ('Start here after bootstrap') and gives specific guidance on using 'section' and 'search' parameters. It also includes a scenario for 'python_helpers'. However, it does not explicitly compare to sibling tools or state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_get_workflows (A)

Get the builder workflows — step-by-step state machines for building skills and solutions. Use this to guide users through the entire build process conversationally. Returns phases, what to ask, what to build, exit criteria, and tips for each stage.

Parameters (JSON Schema)
Name | Required | Description | Default

No parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. Compensates well by detailing return structure (phases, what to ask, exit criteria, tips) since no output schema exists. Does not explicitly declare read-only nature or performance characteristics, but 'Get' prefix and detailed return description provide sufficient behavioral context.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, zero waste. The first defines the resource and its nature; the second gives usage context; the third details the return value. Front-loaded with the essential operation and resource type.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Fully complete for a zero-parameter tool lacking annotations and output schema. Description adequately explains both the invocation trigger and the return payload structure (compensating for missing output_schema), requiring no additional detail.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Zero parameters present. Per rubric, zero-parameter tools baseline at 4. No additional semantic elaboration needed or possible, and none is attempted unnecessarily.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear specific verb 'Get' + resource 'builder workflows' + domain context 'for building skills and solutions'. The phrase 'step-by-step state machines' distinguishes this from sibling tools like ateam_get_solution or ateam_get_spec which retrieve specific artifacts rather than process workflows.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit positive guidance 'Use this to guide users through the entire build process conversationally', clearly indicating when to invoke the tool. Lacks explicit 'when-not' guidance or named alternative tools for specific sub-tasks, but the conversational guidance use case is unambiguous.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_diff (A)

PRE-FLIGHT BEFORE PROMOTE. Compares dev (head) vs main (base) by default — shows exactly which commits and files are about to ship if you call ateam_github_promote() next.

Use this when you want to:
• Review changes before promoting to prod
• See if dev is ahead of main at all (returns ahead_by: 0 if nothing to promote)
• Inspect arbitrary branch/tag/commit comparisons (override base/head)

Parameters (JSON Schema)
Name | Required | Description | Default
base | No | Base branch/tag/sha (the target: what you're comparing TO). Default: 'main'. | main
head | No | Head branch/tag/sha (the source: what you're comparing FROM). Default: 'dev'. | dev
solution_id | Yes | The solution ID | -
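The pre-flight pattern amounts to: diff with the defaults, then promote only if dev is actually ahead. The sketch below shows that decision in Python. It assumes the tool result has already been parsed into a dict with a top-level `ahead_by` field (the description names the field but not the full response layout), and both helper names are hypothetical.

```python
def build_diff_args(solution_id, base="main", head="dev"):
    """Arguments for ateam_github_diff; base/head mirror the documented defaults."""
    return {"solution_id": solution_id, "base": base, "head": head}


def has_changes_to_promote(diff_response):
    """True when head is ahead of base, i.e. there is something to ship.

    Assumes a parsed response dict with a top-level 'ahead_by' integer.
    """
    return int(diff_response["ahead_by"]) > 0


# Default comparison: dev (head) against main (base).
print(build_diff_args("personal-adas"))

# Hypothetical parsed responses.
print(has_changes_to_promote({"ahead_by": 0}))
print(has_changes_to_promote({"ahead_by": 3}))
```

A missing `ahead_by` raises a KeyError instead of being treated as zero, so an unexpected response shape is not mistaken for "nothing to promote".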
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist. Description fully discloses behavior: compares branches, shows commits/files, returns ahead_by. No destructive implications.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise 4 sentences with bullet points. Front-loaded purpose, then usage. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequately describes input, behavior, and output (ahead_by). No output schema but description covers key return info.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage 100% but description adds value by explaining defaults and ability to override for arbitrary comparisons.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description states specific verb 'compares dev vs main' and links to promote tool. Clearly distinguishes from sibling tools like ateam_github_promote and ateam_github_log.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit bullet points on when to use (review before promote, check if dev is ahead, arbitrary comparisons). Lacks explicit when-not but context is clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_list_versions (grade: A)

List all available checkpoints (safe-* tags) for a solution. Shows tag name, date, counter, and commit SHA. Use before rollback to see available safe points.
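A minimal sketch of how a client might pick the most recent checkpoint from the tag names this tool lists. It assumes the `safe-YYYY-MM-DD-NNN` pattern described for ateam_github_promote, with a three-digit counter; the helper names and the idea that tag names arrive as a plain list of strings are assumptions for illustration, not the server's documented response shape.

```python
import re
from datetime import date
from typing import Iterable, NamedTuple, Optional

# Assumed tag shape: safe-YYYY-MM-DD-NNN (three-digit counter).
SAFE_TAG = re.compile(r"^safe-(\d{4})-(\d{2})-(\d{2})-(\d{3})$")


class SafeTag(NamedTuple):
    day: date      # date encoded in the tag name
    counter: int   # NNN suffix
    name: str      # original tag string


def parse_safe_tag(name: str) -> Optional[SafeTag]:
    """Return a SafeTag for a well-formed checkpoint name, else None."""
    match = SAFE_TAG.match(name)
    if match is None:
        return None
    year, month, day, counter = (int(group) for group in match.groups())
    try:
        return SafeTag(date(year, month, day), counter, name)
    except ValueError:
        # Digits matched the pattern but do not form a real calendar date.
        return None


def latest_safe_tag(names: Iterable[str]) -> Optional[str]:
    """Pick the newest checkpoint by (date, counter); ignore other tags."""
    parsed = [tag for tag in map(parse_safe_tag, names) if tag is not None]
    return max(parsed).name if parsed else None
```

The result could then be passed as the `target` of ateam_github_rollback.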

Parameters
Name | Required | Description | Default
solution_id | Yes | The solution ID |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses output fields returned (tag name, date, counter, commit SHA) which compensates somewhat for missing output schema. However, lacks explicit safety profile (read-only vs destructive) and error behavior (e.g., invalid solution_id) that annotations would normally cover.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences: action+resource, output fields, usage guidance. Front-loaded with core purpose, zero redundancy, each sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool without annotations or output schema, description adequately covers purpose, return data structure, and usage context. Could strengthen with error case mention, but sufficient for this complexity level.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with 'The solution ID' description. Description contextualizes as 'for a solution' but adds no additional format constraints, validation rules, or examples beyond schema baseline.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear specific verb 'List' with resource 'checkpoints (safe-* tags)' and scope 'for a solution'. Distinguishes from sibling git tools by specifying 'safe-* tags' pattern and checkpoint-specific semantics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states 'Use before rollback to see available safe points' - provides clear temporal guidance (when) and purpose (why), implicitly referencing the ateam_github_rollback sibling workflow.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_log (grade: A)

View commit history for a solution's GitHub repo. Shows recent commits with messages, SHAs, timestamps, and links. Default reads from main (prod). Pass ref: 'dev' to see in-progress work.

Parameters
Name | Required | Description | Default
ref | No | Branch to read commits from. Default: 'main'. | main
limit | No | Max commits to return (default: 10) |
solution_id | Yes | The solution ID |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description carries the full burden. It discloses that the tool is read-only (view), shows recent commits, and defaults to main. This is sufficient for a simple read operation; no contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with no wasted words. First sentence states purpose and output; second gives immediate usage tip. Highly efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite no output schema, the description lists what the response includes (messages, SHAs, timestamps, links). It covers the key behavioral aspects for a straightforward read-only tool. Complete enough for agent invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, giving a baseline of 3. The description adds value by explaining the real-world meaning of the 'ref' parameter (main=prod, dev=in-progress) beyond the schema's default, making it more actionable.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb (View), resource (commit history for a solution's GitHub repo), and specifies what is shown (messages, SHAs, timestamps, links). It distinguishes from sibling tools like ateam_github_status or ateam_github_diff by focusing on log/history.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on default branch (main/prod) and how to see in-progress work (ref: 'dev'). While it doesn't list when not to use or compare extensively to siblings, the context is clear and helpful.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_patch (grade: A)

Edit a file in the solution's GitHub repo and commit. Two modes:

  1. FULL FILE: provide content — replaces entire file (good for new files or small files)

  2. SEARCH/REPLACE: provide search + replace for a surgical edit without sending the full file (preferred for large files like server.js).

Always use search/replace for large files (>5KB). Always read the file first with ateam_github_read to get the exact text to search for.

DEFAULTS TO dev BRANCH — writes don't touch prod. Use ateam_github_promote to ship dev→main when ready. Pass ref:'main' only for emergency hotfixes.
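The two modes can be sketched as a small client-side argument builder. This is an illustration only: the parameter names come from the schema listed for this tool, but the helper functions are hypothetical, and both the rule that the modes are mutually exclusive and the single-match check in the preview are conservative assumptions rather than documented server behavior.

```python
from typing import Optional


def build_patch_args(
    solution_id: str,
    path: str,
    *,
    content: Optional[str] = None,
    search: Optional[str] = None,
    replace: Optional[str] = None,
    ref: str = "dev",
    message: Optional[str] = None,
) -> dict:
    """Build arguments for ateam_github_patch in exactly one of its two modes."""
    full_file = content is not None
    search_replace = search is not None or replace is not None
    if full_file == search_replace:
        # Either both modes were requested or neither was.
        raise ValueError("provide either content, or search + replace")
    if search_replace and (search is None or replace is None):
        raise ValueError("search and replace must be given together")

    args = {"solution_id": solution_id, "path": path, "ref": ref}
    if full_file:
        args["content"] = content
    else:
        args["search"] = search
        args["replace"] = replace
    if message is not None:
        args["message"] = message
    return args


def preview_search_replace(text: str, search: str, replace: str) -> str:
    """Apply the edit locally to text fetched with ateam_github_read.

    Requiring exactly one match is a local safety check (assumption);
    the server's handling of zero or multiple matches is not documented here.
    """
    count = text.count(search)
    if count != 1:
        raise ValueError(f"expected exactly one match, found {count}")
    return text.replace(search, replace, 1)
```

Running the preview against the file content first makes whitespace mismatches in `search` visible before any commit is attempted.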

Parameters
Name | Required | Description | Default
ref | No | Target branch. Default: 'dev' (safe: won't touch prod). Use 'main' only for emergency hotfixes. | dev
path | Yes | File path to create/update (e.g. 'connectors/home-assistant-mcp/server.js') |
search | No | Exact text to find in the file (mode 2: search/replace). Must match exactly including whitespace. |
content | No | The full file content to write (mode 1: full file replacement) |
message | No | Optional commit message (default: 'Update <path>') |
replace | No | Text to replace the search string with (mode 2: required with search) |
solution_id | Yes | The solution ID |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description carries the full burden. It explains the two modes, default branch, and the need for exact text in search. It could mention required permissions or what a commit returns, but overall it is transparent about the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise yet comprehensive, using bullet points for clarity. Every sentence adds value: modes, best practices, branch safety, and alternatives. No redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers modes, branch safety, prerequisites, and sibling differentiation. Missing output details (e.g., commit info), but there is no output schema. For a mutation tool, a note on what is returned would be helpful, but the description is still quite complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already describes each parameter (100% coverage), so baseline is 3. The description adds value by explaining the two modes and when to use content vs search+replace, providing context beyond the schema's individual descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool edits a file in a GitHub repo and commits, with two modes (full file and search/replace). It distinguishes itself from sibling tools like ateam_github_read (which reads) and ateam_github_promote (which promotes branches).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance: use search/replace for large files >5KB, always read the file first with ateam_github_read, defaults to dev branch, use ateam_github_promote to ship to main, and pass ref='main' only for emergency hotfixes. This helps the agent choose the correct mode and avoid mistakes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_promote (grade: A)

SHIP DEV TO PROD. Merges the dev branch into main and auto-tags the new main HEAD as safe-YYYY-MM-DD-NNN. Use after testing your dev work, when you're ready to deploy changes to production.

Workflow: 1) ateam_github_patch (writes to dev) → 2) ateam_github_promote (merges dev→main) → 3) ateam_build_and_run (deploys main).

Pass dry_run:true to see what's about to ship without merging. On merge conflict the call returns 409 — resolve manually on GitHub (open a PR or use the web UI), then retry.
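The preview-then-merge pattern described above might look like the following sketch. It only builds the ordered tool calls and classifies the documented 409 conflict; the helper names are hypothetical, and how the calls are actually sent (and what the responses contain) depends on the MCP client in use, which is not shown here.

```python
from typing import List, Optional, Tuple

ToolCall = Tuple[str, dict]


def promote_plan(
    solution_id: str,
    *,
    label: Optional[str] = None,
    skip_tag: bool = False,
    preview_first: bool = True,
) -> List[ToolCall]:
    """Return the ateam_github_promote calls to make, in order."""
    steps: List[ToolCall] = []
    if preview_first:
        # dry_run shows the commits and files about to ship without merging.
        steps.append(("ateam_github_promote", {"solution_id": solution_id, "dry_run": True}))

    merge_args: dict = {"solution_id": solution_id}
    if label is not None:
        merge_args["label"] = label
    if skip_tag:
        merge_args["skip_tag"] = True
    steps.append(("ateam_github_promote", merge_args))
    return steps


def needs_manual_resolution(status_code: int) -> bool:
    """A 409 means a merge conflict: resolve on GitHub first, then retry."""
    return status_code == 409
```

Per the workflow above, a successful merge would be followed by ateam_build_and_run, whose arguments are outside the scope of this sketch.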

Parameters
Name | Required | Description | Default
label | No | Optional: human-readable label for the auto-tag (e.g., 'v2 stable', 'before refactor') |
dry_run | No | If true: show the diff (commits + files about to ship) without merging. Default: false. |
skip_tag | No | If true: merge without creating an auto-tag. Default: false (auto-tag enabled). |
solution_id | Yes | The solution ID |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries full burden. It discloses the key behavioral traits: merges dev→main, auto-tags HEAD, returns 409 on conflict, supports dry_run and skip_tag. It could mention idempotency or success response format, but the existing detail is strong.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is tightly written: two sentences convey purpose and workflow, then lists steps and options. No extraneous words; every sentence adds value. Front-loaded with key action.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description covers error handling (409 on conflict), optional behaviors (dry_run, skip_tag), and the workflow sequence. It lacks an explicit statement of what a successful invocation returns (e.g., confirmation or tag name), but the context is largely complete for an experienced developer.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds significant meaning beyond the schema: labels are 'human-readable for the auto-tag', dry_run 'show the diff without merging', skip_tag 'merge without creating an auto-tag'. It effectively explains the why and how of each parameter.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states explicitly: 'SHIP DEV TO PROD. Merges the dev branch into main and auto-tags...' This clearly identifies the action (merge) and resource (dev to main), and distinguishes it from sibling tools like ateam_github_patch and ateam_github_rollback via the workflow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Guidelines are explicit: 'Use after testing your dev work, when you're ready to deploy changes to production.' It also provides when not to use: on merge conflict, resolve manually. Additionally, it offers the alternative dry_run mode and positions the tool in a workflow sequence (patch → promote → build_and_run).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_pull (grade: A)

Deploy a solution FROM its GitHub repo. Reads .ateam/export.json + connector source from the repo and feeds it into the deploy pipeline. Use this to restore a previous version or deploy from GitHub as the source of truth.

Parameters
Name | Required | Description | Default
solution_id | Yes | The solution ID to pull and deploy from GitHub |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses the mechanism (reads .ateam/export.json + connector source, feeds deploy pipeline) and implies destructive overwrite via 'restore', but lacks explicit safety warnings about overwriting current state, required permissions, or error conditions (e.g., if repo is missing export.json).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero waste: first defines action and mechanism, second defines use cases. Information is front-loaded with the core verb 'Deploy' and directional qualifier 'FROM' appearing immediately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of GitHub-integrated deployment and lack of output schema, the description adequately covers the essential behavioral contract (input source, processing pipeline, intent). Minor gap: does not describe return value or deployment confirmation behavior, which would help given no output schema exists.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with a clear description ('The solution ID to pull and deploy from GitHub'), establishing baseline 3. The main description adds implicit context that this ID references the GitHub repository state rather than local state, but does not add syntax details or validation rules beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent specificity: 'Deploy a solution FROM its GitHub repo' uses a concrete verb (deploy), identifies the resource (solution), and the directional preposition 'FROM' clearly distinguishes this from sibling 'ateam_github_push' and local deployment tools like 'ateam_redeploy'. The mechanism details (.ateam/export.json) further clarify scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit when-to-use scenarios ('restore a previous version', 'deploy from GitHub as the source of truth') that clearly indicate this is for GitHub-centric workflows. However, it does not explicitly name alternatives like ateam_redeploy for local changes or ateam_github_rollback for specific version targeting.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_push (grade: A)

Push the current deployed solution to GitHub. Auto-creates the repo on first use. Commits the full bundle (solution + skills + connector source) atomically. Use after ateam_build_and_run to version your solution, or anytime you want to snapshot the current state.

Parameters
Name | Required | Description | Default
message | No | Optional commit message (default: 'Deploy <solution_id>') |
solution_id | Yes | The solution ID (e.g. 'smart-home-assistant') |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses important behavioral traits: auto-creates repo on first use (side effect), commits atomically (transactional safety), and scope of commit (full bundle vs. partial). Deducted one point for not mentioning failure modes, idempotency beyond repo creation, or auth requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with zero waste. Front-loaded with core action ('Push...'), followed by side effects, behavioral details, and usage guidelines. Every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no annotations and no output schema, description adequately covers tool behavior, side effects, and workflow context. Sufficient for correct agent invocation, though could be strengthened by describing return values or error conditions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing baseline of 3. Description does not explicitly discuss parameter syntax or formats beyond what schema provides, though it contextualizes solution_id by explaining what the 'full bundle' includes.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb 'Push' + resource 'deployed solution to GitHub' clearly stated. Explicitly distinguishes from siblings like ateam_github_pull or ateam_github_write by specifying it commits the 'full bundle (solution + skills + connector source)' atomically, not just individual files.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'Use after ateam_build_and_run to version your solution' and provides alternative usage scenario 'or anytime you want to snapshot the current state.' Clear workflow integration guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_read (grade: A)

Read any file from a solution's GitHub repo. Returns the file content. Use this to read connector source code, skill definitions, or any versioned file. Default reads from main (deployed/prod state). Pass ref: 'dev' to read in-progress work.

Parameters
Name | Required | Description | Default
ref | No | Branch, tag, or commit SHA to read from. Default: 'main' (prod). Use 'dev' to read in-progress work. | main
path | Yes | File path in the repo (e.g. 'connectors/home-assistant-mcp/server.js', 'solution.json', 'skills/order-support/skill.json') |
solution_id | Yes | The solution ID |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It discloses that the tool reads from a GitHub repo, returns file content, defaults to 'main' branch, and supports a 'ref' parameter for development branches. It does not describe non-obvious behaviors like permissions or rate limits, but for a read operation the transparency is adequate.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, no wasted words. Front-loads the core purpose, then provides usage details and examples. Every sentence serves a purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read tool with three well-documented parameters and no output schema, the description is complete. It covers what the tool reads, how to specify the branch, and file path examples. No gaps identified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents parameters well. The description adds value by explaining the default and usage of 'ref' and providing concrete path examples, enhancing understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states explicitly that it reads any file from a solution's GitHub repo and returns content, with concrete examples (connector source code, skill definitions). This clearly distinguishes it from sibling tools like ateam_github_write or ateam_github_diff.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives clear context on when to use default branch (main/prod) vs the 'dev' ref for in-progress work. While it doesn't explicitly exclude alternatives, the guidance on ref usage is helpful. Missing comparison to other read tools like ateam_github_log but still informative.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_rollback (grade: A)

Roll prod (main branch) back to a previous state.

ADDITIVE — does NOT destroy history. Creates a new commit on top of main whose tree matches the target's tree. The history of everything between target and current main is preserved (you can roll back the rollback).

Workflow: 1) ateam_github_list_versions (find a safe-* tag) → 2) ateam_github_rollback(target: 'safe-...') → 3) ateam_build_and_run (deploys the reverted state).
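The additive behavior can be illustrated with a toy commit model. This is a conceptual sketch, not the server's implementation: the `Commit` class, the SHA strings, and the tree labels are all made up for the example. The point is that a rollback adds a new commit whose tree equals the target's tree while keeping the current head as its parent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Commit:
    sha: str                    # illustrative identifier, not a real hash
    tree: str                   # stands in for the full file snapshot
    parent: Optional["Commit"]  # previous commit on main, if any


def rollback(head: Commit, target: Commit, new_sha: str) -> Commit:
    """Create a new commit on top of head that restores target's tree."""
    return Commit(sha=new_sha, tree=target.tree, parent=head)


def history(head: Optional[Commit]) -> List[str]:
    """Walk parents from head back to the root, newest first."""
    shas: List[str] = []
    while head is not None:
        shas.append(head.sha)
        head = head.parent
    return shas


# main: c1 -> c2 -> c3, then roll back to c1's state.
c1 = Commit("c1", "tree-1", None)
c2 = Commit("c2", "tree-2", c1)
c3 = Commit("c3", "tree-3", c2)
c4 = rollback(c3, c1, "c4")

# c4 has c1's content, yet c2 and c3 remain in history,
# so rolling back the rollback is just another additive commit.
c5 = rollback(c4, c3, "c5")
```

Because nothing is rewritten, existing safe-* tags would keep pointing at commits that are still reachable from main.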

Parameters
Name | Required | Description | Default
target | Yes | Tag (e.g., 'safe-2026-05-19-001') or commit SHA to revert main to. Use ateam_github_list_versions to find safe-* tags. |
solution_id | Yes | The solution ID |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description thoroughly explains the behavior: it is additive, does not destroy history, creates a new commit with the target's tree, and preserves history for easy reversal. This exceeds expectations given no annotations are provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured. The first sentence states the purpose, the second explains the additive nature, and the third provides the workflow. No extraneous information is included.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description lacks information about error handling, expected output, or how to confirm success. Given no output schema, the agent might need more guidance on what to expect after invoking the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds minimal value beyond the schema: it provides an example tag format and references to list_versions. The solution_id parameter is only restated.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool rolls prod (main branch) back to a previous state. It uses specific verbs and resources. However, it does not explicitly differentiate from sibling tools like ateam_github_promote or ateam_github_patch, though it mentions using ateam_github_list_versions as a prerequisite.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides an explicit workflow: 1) ateam_github_list_versions to find a safe-* tag, 2) ateam_github_rollback with the target, 3) ateam_build_and_run to deploy. It also explains when to use the tool and notes that it is additive and not destructive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_status (grade: A)

Check if a solution has a GitHub repo, its URL, and the latest commit. Use this to verify GitHub integration is working for a solution.

Parameters
Name | Required | Description | Default
solution_id | Yes | The solution ID |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. It effectively lists returned data points (existence, URL, commit) but omits safety properties (read-only nature), error behavior when repo doesn't exist, or idempotency. 'Check' implies read-only, but explicit safety disclosure is missing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences with zero waste: first defines functionality, second specifies use case. Front-loaded with essential information and no filler text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given simple single-parameter input and lack of output schema, description adequately compensates by listing specific return values (URL, latest commit, existence check). Sufficient for low-complexity tool, though response structure/format remains unspecified.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% (solution_id described as 'The solution ID'), establishing baseline 3. Description mentions 'solution' in context but does not add semantic depth, validation rules, or format details beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb 'Check' with clear resources 'GitHub repo, its URL, and the latest commit'. Effectively distinguishes from sibling GitHub tools (read, write, push, pull) by focusing on existence verification and metadata retrieval rather than content manipulation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit use case 'Use this to verify GitHub integration is working for a solution', giving clear context for when to invoke. Lacks explicit exclusions or named alternatives (e.g., 'use ateam_github_read for file contents'), but the verification context is specific enough to guide selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_github_write (Grade: A)

Write a file to the solution's GitHub repo. Use this to create new connector files or replace existing ones — one file per call. This is the PRIMARY way to write connector code after first deploy. Write each file individually (server.js, package.json, UI assets), then call ateam_github_promote() to ship to prod (dev→main), then ateam_build_and_run() to deploy.

DEFAULTS TO dev BRANCH.

Parameters
Name | Required | Description | Default
ref | No | Target branch. Default: 'dev'. | dev
path | Yes | File path to write (e.g. 'connectors/my-mcp/server.js', 'connectors/my-mcp/package.json') | -
content | Yes | The full file content | -
message | No | Optional commit message (default: 'Write <path>') | -
solution_id | Yes | The solution ID | -
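
The write, promote, deploy sequence in the description can be sketched as a list of MCP `tools/call` requests. This is only an illustration under stated assumptions: the helper names `tool_call` and `write_then_ship` are hypothetical, the `ateam_github_write` arguments follow the parameter table above, and the idea that `ateam_github_promote` and `ateam_build_and_run` accept just `solution_id` is an assumption this page does not confirm.

```python
from __future__ import annotations


def tool_call(request_id: int, name: str, arguments: dict) -> dict:
    """Build one MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def write_then_ship(solution_id: str, files: dict[str, str]) -> list[dict]:
    """One ateam_github_write per file, then promote (dev -> main), then deploy."""
    steps = [
        (
            "ateam_github_write",
            {
                "solution_id": solution_id,
                "path": path,
                "content": content,
                "ref": "dev",  # explicit, although 'dev' is already the default
                "message": f"Write {path}",
            },
        )
        for path, content in files.items()
    ]
    # ASSUMPTION: these two tools take only solution_id; check their own schemas.
    steps.append(("ateam_github_promote", {"solution_id": solution_id}))
    steps.append(("ateam_build_and_run", {"solution_id": solution_id}))
    return [tool_call(i, name, args) for i, (name, args) in enumerate(steps, start=1)]
```

Keeping one write request per file mirrors the one-file-per-call constraint, so a failed write can be retried without resending the other files.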
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description discloses key behavioral traits: it defaults to the 'dev' branch, writes one file per call, and can replace existing files. It does not mention authentication requirements or error handling, but the core behaviors are clearly stated.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and well-structured: it starts with the primary action, gives usage guidelines, sequences the workflow with sibling tools, and ends with a crucial default behavior. Every sentence adds value with no repetition or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of output schema and annotations, the description covers the tool's purpose, usage, behavior (default branch, one-file constraint), and integration with other tools. It lacks details on error conditions and prerequisites, but for a file write tool with 5 parameters, it provides sufficient context for correct usage.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so parameters are well-documented in the schema. The description adds minor additional context by mentioning expected file types (e.g., 'server.js', 'package.json') and the one-file-per-call constraint. However, this does not significantly surpass the schema's own descriptions, so a baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool writes a file to the solution's GitHub repo for creating or replacing connector files. It distinguishes itself from sibling tools like ateam_github_patch or ateam_github_read by specifying it's the primary way to write connector code after first deploy, and it references the subsequent workflow steps.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context: use this tool to create new files or replace existing ones, one file per call, and it's the primary method after first deploy. It outlines a usage sequence (write, promote, build and run), but does not explicitly state when not to use it or list alternatives for cases where other tools might be better suited.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_list_solutions (Grade: B)

List all solutions deployed in the Skill Builder.

Parameters

No parameters

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. While it specifies the scope ('deployed in the Skill Builder'), it lacks critical behavioral details such as pagination behavior, output format, or performance characteristics of listing all solutions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence of 8 words. It is front-loaded with the action and contains no redundant or wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter tool, the description adequately covers the basic purpose, but is incomplete regarding what data is returned (no output schema exists to compensate). It meets minimum viability but leaves gaps in operational context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0 parameters, which per the rubric establishes a baseline score of 4. There are no parameters to document.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verb 'List' with resource 'solutions' and scope 'deployed in the Skill Builder', clearly indicating what the tool does. However, it does not explicitly differentiate from sibling tool 'ateam_get_solution' (singular vs plural).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'ateam_get_solution', nor does it mention prerequisites or filtering capabilities.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_patch (Grade: A)

Surgically update ANY field in a skill or solution definition, redeploy, and optionally re-test — all in one step.

SUPPORTED OPERATIONS:

  1. Scalar (dot notation): { "problem.statement": "new value", "role.persona": "You are..." }

  2. Deep nested: { "intents.thresholds.accept": 0.9, "policy.escalation.enabled": true }

  3. Array push: { "tools_push": [{ name: "new_tool", description: "..." }] }

  4. Array delete: { "tools_delete": ["tool_name"] }

  5. Array update: { "tools_update": [{ name: "existing_tool", description: "updated" }] }

  6. Replace whole section: { "role": { persona: "...", goals: [...] } }

EXAMPLES:

  • Change persona (full replace): updates: { "role.persona": "You are a friendly assistant" }

  • Append to persona (don't replace): updates: { "persona_append": "\n\nALWAYS respond in 2 sentences." }

  • Add a guardrail: updates: { "policy.guardrails.never_push": ["Never share passwords"] }

  • Update problem: updates: { "problem.statement": "...", "problem.goals": ["goal1"] }

  • Add a tool: updates: { "tools_push": [{ name: "conn.tool", description: "...", inputs: [...], output: {...} }] }

  • Change intent: updates: { "intents.supported_update": [{ id: "i1", description: "new desc" }] }

  • Force redeploy: updates: { "_force_redeploy": true }

  • CREATE a new skill: target='skill', skill_id='my-new-skill', updates: { "problem.statement": "...", "role.persona": "..." } If the skill doesn't exist yet, a default scaffold is created and the updates are applied on top. The skill is automatically added to the solution topology.

Use target='skill' + skill_id for skill fields. Use target='solution' for solution-level fields (linked_skills, platform_connectors, ui_plugins).

Parameters
Name | Required | Description | Default
target | Yes | What to update: 'solution' for solution definition, 'skill' for skill definition fields (problem, role, intents, tools, policy, engine, scenarios, etc.) | -
updates | Yes | The update payload. Use dot notation for nested scalars (e.g. 'problem.statement': 'new value'). For arrays, use _push/_delete/_update suffixes (e.g. 'tools_push', 'tools_delete'). You can update ANY field in the skill definition: problem, role, intents, tools, policy, engine, scenarios, glossary, etc. | -
skill_id | No | Required when target is 'skill'. The skill ID to patch. | -
solution_id | Yes | The solution ID | -
test_message | No | Optional: re-test the skill after patching. Requires skill_id. | -
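
To make the `updates` payload easier to reason about, the following is a local sketch of how the dot-notation and `_push`/`_delete`/`_update` keys could be interpreted against a plain dictionary. It is an illustration, not the server's implementation: the function `apply_updates` is hypothetical, matching array elements by their `name` or `id` field is an assumption inferred from the examples, and keys beginning with an underscore (such as `_force_redeploy`) are treated as control flags and ignored.

```python
from copy import deepcopy

SUFFIXES = ("_push", "_delete", "_update")


def _parent_and_key(doc, dotted):
    """Walk a dotted path, creating intermediate dicts; return (parent, last_key)."""
    parts = dotted.split(".")
    node = doc
    for part in parts[:-1]:
        node = node.setdefault(part, {})
    return node, parts[-1]


def _ident(item):
    """ASSUMPTION: array elements are matched by 'name', falling back to 'id'."""
    if isinstance(item, dict):
        return item.get("name", item.get("id"))
    return item


def apply_updates(definition, updates):
    """Return a patched copy of `definition`; the input is left untouched."""
    doc = deepcopy(definition)
    for key, value in updates.items():
        if key.startswith("_"):  # control flags such as _force_redeploy
            continue
        if key == "persona_append":  # append to role.persona instead of replacing
            role = doc.setdefault("role", {})
            role["persona"] = role.get("persona", "") + value
            continue
        suffix = next((s for s in SUFFIXES if key.endswith(s)), None)
        if suffix is None:  # scalar, deep nested, or whole-section replace
            parent, last = _parent_and_key(doc, key)
            parent[last] = value
            continue
        parent, last = _parent_and_key(doc, key[: -len(suffix)])
        items = parent.setdefault(last, [])
        if suffix == "_push":
            items.extend(value)
        elif suffix == "_delete":
            parent[last] = [i for i in items if _ident(i) not in value]
        else:  # "_update": merge new fields into the matching elements
            by_id = {_ident(v): v for v in value}
            parent[last] = [
                {**i, **by_id[_ident(i)]} if _ident(i) in by_id else i
                for i in items
            ]
    return doc
```

Under these assumptions, `"policy.guardrails.never_push"` resolves to the array at `policy.guardrails.never`, and `"intents.supported_update"` merges into the element of `intents.supported` with the same `id`.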
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses behavioral traits: redeploying after update, optional re-test, array operations (_push/_delete/_update), and ability to create a new skill if it doesn't exist. It also mentions the _force_redeploy flag. However, it does not discuss permissions, side effects on existing data, or rate limits, which would be beneficial but are not critical for a patch tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with a concise opening sentence, followed by bulleted lists of operations and examples. It front-loads the core purpose. However, it is somewhat long due to the many examples, which could be trimmed without losing essential information. Still, every part contributes to clarity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, nested objects, array operations) and no output schema, the description is reasonably complete. It explains return values implicitly (redeploy, re-test), but does not describe the response format or error conditions. The examples cover most use cases, making it sufficient for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (all parameters described), but the description adds significant value beyond the schema. It provides detailed examples for dot notation, array operations, and optional fields like test_message and _force_redeploy. It clarifies how to use target and skill_id together, and the structure of the updates parameter with multiple usage patterns. This fully compensates for any schema limitations.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool 'surgically update ANY field in a skill or solution definition, redeploy, and optionally re-test — all in one step.' It distinguishes from siblings like ateam_redeploy by including the patch and optional test in a single operation. The extensive list of supported operations and examples leaves no ambiguity about the tool's purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Description explains when to use target='skill' vs 'solution', and mentions that if a skill doesn't exist, a default scaffold is created. However, it does not explicitly state when NOT to use this tool or suggest alternatives for non-patch operations, though the sibling list implies distinction. The guidance is clear but lacks exclusionary criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_redeploy (Grade: A)

Re-deploy skills WITHOUT changing any definitions. ⚠️ HEAVY OPERATION: regenerates MCP servers (Python code) for every skill, pushes each to A-Team Core, restarts connectors, and verifies tool discovery. Takes 30-120s depending on skill count. Use after connector restarts, Core hiccups, or stale state. For incremental changes, prefer ateam_patch (which updates + redeploys in one step).

Parameters
Name | Required | Description | Default
skill_id | No | Optional: redeploy a single skill only. Omit to redeploy ALL skills in the solution. | -
solution_id | Yes | The solution ID to redeploy | -
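
A caller might encode the guidance above (omit `skill_id` to redeploy everything, allow for the 30-120s duration, and prefer `ateam_patch` when definitions changed) roughly as follows. The function `plan_redeploy`, the `definitions_changed` flag, and the 30-second safety margin are all hypothetical choices for illustration, not part of the tool.

```python
from typing import Optional

# Upper end of the 30-120 s range quoted in the tool description.
REDEPLOY_UPPER_BOUND_S = 120


def plan_redeploy(
    solution_id: str,
    skill_id: Optional[str] = None,
    definitions_changed: bool = False,
    margin_s: int = 30,  # ASSUMPTION: arbitrary safety margin for the client timeout
) -> dict:
    """Describe an ateam_redeploy call, or refuse when ateam_patch fits better."""
    if definitions_changed:
        raise ValueError(
            "definitions changed: prefer ateam_patch, which updates and redeploys in one step"
        )
    arguments = {"solution_id": solution_id}
    if skill_id is not None:  # omitting skill_id redeploys ALL skills
        arguments["skill_id"] = skill_id
    return {
        "tool": "ateam_redeploy",
        "arguments": arguments,
        "timeout_s": REDEPLOY_UPPER_BOUND_S + margin_s,
    }
```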
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Outstanding given no annotations: Discloses multi-step internals ('regenerates MCP servers... pushes... restarts... verifies'), performance characteristics ('30-120s'), and operational weight ('HEAVY OPERATION'). Agent understands exact cost and side effects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Perfectly structured: opening constraint (no definition changes), warning symbol with heavy operation details, timing estimate, usage conditions, and alternative—all in compact form. Every sentence conveys unique operational intelligence.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Complete despite no output schema: Description adequately covers the 'heavy operation' nature, prerequisite conditions (stale state), and sibling relationships. With rich schema coverage and behavioral disclosure, nothing essential is missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (both params fully documented), establishing baseline 3. Description adds value by contextualizing 'skill count' (relating to optional skill_id parameter) and implying the scope difference between single-skill and full-solution deployment in the timing warning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Excellent: 'Re-deploy skills WITHOUT changing any definitions' provides specific verb (re-deploy), resource (skills), and scope constraint. Explicitly distinguishes from sibling 'ateam_patch' by contrasting with 'incremental changes' use case.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Exemplary: Explicitly states when to use ('after connector restarts, Core hiccups, or stale state') and explicitly names alternative ('prefer ateam_patch'). Clear guidance on heavy vs. light operational paths.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_show_skill_minimal (Grade: A)

Show the minimal authoring view of a skill — persona + connectors + handoff_when + style + policy guardrails only. ~10× smaller than ateam_get_solution(view:'skills') for the same skill. Use this when you only need the irreducible author content (Phase 9 of the strip).

Parameters
Name | Required | Description | Default
skill_id | Yes | The skill ID | -
solution_id | Yes | The solution ID | -
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided; description carries full burden. States it is a read operation showing minimal view and is smaller than the full view, but does not detail output format, potential errors, or required permissions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with purpose and contents. No wasted words; every part adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple read tool without output schema. Covers purpose, contents, and size comparison. Lacks details on prerequisites or error handling, but sufficient for typical use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage with basic descriptions for both parameters (skill_id, solution_id). Description adds no additional meaning beyond what the schema provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it shows the minimal authoring view of a skill, lists specific components included, and explicitly distinguishes from a sibling tool (ateam_get_solution) by noting it is ~10× smaller.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides a clear 'Use this when' condition for irreducible author content (Phase 9). Does not explicitly state when not to use or list alternative tools beyond the single comparison, but the guidance is actionable.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_show_solution_minimal (Grade: A)

Show the minimal authoring view of a solution — name + description + style + routing_mode + identity_mode + skill ids + connector ids only. Skips deployed metadata, handoffs (auto-generated), grants, ui_plugins, validation results. Use this for fast inspection without the verbose fields (Phase 9 of the strip).

Parameters
Name | Required | Description | Default
solution_id | Yes | The solution ID | -
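
The kept and skipped fields listed in the description amount to a projection over the full solution document. The sketch below shows that idea on a plain dictionary. The function `minimal_solution_view`, the output keys `skill_ids` and `connector_ids`, and the assumption that the full document stores skills and connectors as lists of objects with an `id` field are all hypothetical, since this page does not show the actual response shape.

```python
# Scalar fields the description says are kept in the minimal view.
MINIMAL_FIELDS = ("name", "description", "style", "routing_mode", "identity_mode")


def minimal_solution_view(solution: dict) -> dict:
    """Project a full solution document down to the minimal authoring fields."""
    view = {key: solution[key] for key in MINIMAL_FIELDS if key in solution}
    # ASSUMPTION: skills and connectors are lists of dicts that carry an 'id'.
    view["skill_ids"] = [skill["id"] for skill in solution.get("skills", [])]
    view["connector_ids"] = [conn["id"] for conn in solution.get("connectors", [])]
    # Everything else (deployed metadata, handoffs, grants, ui_plugins,
    # validation results) is simply not copied.
    return view
```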
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Without annotations, the description transparently lists what is included and skipped (deployed metadata, handoffs, etc.), giving a clear behavioral picture. It doesn't confirm read-only nature, but 'show' implies safe operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two succinct sentences with no unnecessary words. It front-loads the purpose and lists exclusions efficiently.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple one-parameter tool with no output schema, the description adequately explains what the response includes. It could mention error conditions or authentication, but the core functionality is well covered.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% for 'solution_id', and the description adds no additional meaning beyond the schema's 'The solution ID.' The baseline for full schema coverage is 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool 'shows the minimal authoring view of a solution' and lists specific fields included and excluded. It distinguishes from sibling tools by emphasizing the minimal scope and skipping verbose fields.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly recommends using this tool for 'fast inspection without the verbose fields,' providing a clear use case. While it doesn't name alternatives directly, the contrast with verbose fields implies when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_status_all (Grade: A)

Show GitHub sync status for ALL tenants and solutions in one call. Requires master key authentication. Returns a summary table of every tenant's solutions with their GitHub sync state.

Parameters

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full disclosure burden and successfully communicates authentication requirements ('master key') and return format ('summary table'). It implies read-only safety through the verb 'Show,' though explicit confirmation of idempotency would strengthen it further.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste: scope and action (sentence 1), authentication constraint (sentence 2), and return value (sentence 3). Information is front-loaded with the core purpose stated immediately.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the absence of annotations and output schema, the description adequately compensates by disclosing the authentication barrier and return format. It sufficiently covers a simple read-only status tool, though enumerating possible sync state values would achieve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema contains zero parameters, establishing a baseline score of 4. The description appropriately requires no additional parameter clarification since the tool operates as a parameterless bulk status request.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Show GitHub sync status'), the resource (tenants and solutions), and distinguishes itself from siblings like 'ateam_github_status' by emphasizing the 'ALL' scope and 'in one call' bulk operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides critical usage context through 'Requires master key authentication,' establishing a clear prerequisite. It implies bulk usage via 'ALL tenants,' though it could be strengthened by explicitly naming the alternative tool for single-tenant queries.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_sync_all (Grade: A)

Sync ALL tenants: push Builder FS → GitHub, then pull GitHub → Core MongoDB. Requires master key authentication. Returns a summary table with results for each tenant/solution.

Parameters
Name | Required | Description | Default
pull_only | No | Only pull from GitHub to Core (skip push). Default: false (full sync). | -
push_only | No | Only push to GitHub (skip pull to Core). Default: false (full sync). | -
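
How the two flags select stages of the sync can be written down as a small truth table in code. This is a sketch of the documented semantics only: the function `sync_stages` is hypothetical, and rejecting the case where both flags are true is an assumption, because the page does not say how the server treats that combination.

```python
def sync_stages(push_only: bool = False, pull_only: bool = False) -> list:
    """Return the sync stages that would run for a given flag combination."""
    if push_only and pull_only:
        # ASSUMPTION: the real behavior for this combination is undocumented.
        raise ValueError("push_only and pull_only are mutually exclusive in this sketch")
    stages = []
    if not pull_only:
        stages.append("push: Builder FS -> GitHub")
    if not push_only:
        stages.append("pull: GitHub -> Core MongoDB")
    return stages
```

With both flags left at their default of false, both stages run in order, which corresponds to the full sync.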
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full disclosure burden. Successfully documents authentication requirements and return format ('summary table with results'). Omits error handling behavior (e.g., partial tenant failures) and idempotency characteristics that would be critical for a bulk operation.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three compact sentences with zero redundancy. The main action and scope are front-loaded ('Sync ALL tenants:'), followed by authentication constraints and return value. Every sentence provides distinct operational information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequately covers multi-system scope, authentication, and return values despite lacking annotations or output schema. Given the high complexity (affecting all tenants across three systems) and lack of structured safety hints, the description provides sufficient but not exhaustive operational context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

While schema has 100% description coverage, the description adds essential semantic context by mapping parameters to the conceptual sync flow—explaining that pull_only skips 'push Builder FS → GitHub' and push_only skips 'pull GitHub → Core MongoDB.' This clarifies the parameter purposes beyond the boolean definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specifies exact verb ('Sync'), target ('ALL tenants'), and the two-stage data flow (Builder FS → GitHub, then GitHub → Core MongoDB). Clearly distinguishes from single-direction siblings like ateam_github_push by describing the complete pipeline.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states the prerequisite 'Requires master key authentication,' providing clear access control context. However, lacks explicit guidance on when to use this combined tool versus individual ateam_github_push or ateam_github_pull operations for single-direction syncs.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_test_abort (Grade: A)

Abort a running skill test. Stops the job execution at the next iteration boundary. (Advanced.)

Parameters
Name | Required | Description | Default
job_id | Yes | The job ID to abort | -
skill_id | Yes | The skill ID | -
solution_id | Yes | The solution ID | -
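
Stopping "at the next iteration boundary" usually means a cooperative abort: the running job checks a flag between iterations and never interrupts a step that is already in progress. The following generic sketch shows that pattern with a `threading.Event`. The function `run_job` and its return shape are hypothetical, and nothing here is taken from the A-Team job runner itself.

```python
import threading


def run_job(steps, abort: threading.Event):
    """Run callables in order, checking the abort flag only between iterations."""
    completed = []
    for step in steps:
        if abort.is_set():  # iteration boundary: stop before starting the next step
            return "aborted", completed
        completed.append(step())  # a step that has started always runs to the end
    return "finished", completed
```

One consequence of this design is that an abort request is not immediate: results from the step that was in flight when the request arrived are still produced.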
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It successfully discloses the graceful stop mechanism ('next iteration boundary'), but omits safety-critical details like whether the abort is reversible, side effects on the solution/skill, permission requirements, or what happens to partial results.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The three-part structure is front-loaded with the primary action, followed by behavioral mechanics and complexity tagging. It is appropriately brief with minimal waste, though the parenthetical '(Advanced.)' could be better integrated into a complete sentence for slightly better flow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 3-parameter destructive operation with no output schema, the description provides the essential functional context but leaves gaps regarding prerequisites (how to discover the job_id), success/failure indicators, and consequences of aborting. It meets minimum viability but lacks richness for safe operation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing a baseline of 3. The description mentions 'job execution' which loosely maps to the job_id parameter, but adds no specific semantic guidance about where to obtain these IDs, their format, or relationships between solution_id, skill_id, and job_id beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly specifies the action (Abort) and the specific resource type (running skill test), effectively distinguishing it from sibling tools like ateam_test_pipeline, ateam_test_connector, and ateam_test_voice. The scope is precisely defined.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The '(Advanced.)' tag provides minimal guidance about expertise requirements, and the 'next iteration boundary' detail implies graceful termination behavior. However, there are no explicit when-to-use/when-not-to-use rules, prerequisites (e.g., checking status first), or alternatives mentioned.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_test_connector (grade A)

Call a tool on a running connector and get the result. Use this to test individual connector tools (e.g., triggers.list, entities.list, google.command) without deploying to a client. The connector must be connected and running.

Parameters

Name | Required | Description | Default
args | No | Optional: arguments to pass to the tool |
tool | Yes | The tool name to call (e.g., 'triggers.list', 'entities.list', 'google.devices') |
solution_id | Yes | The solution ID |
connector_id | Yes | The connector ID (e.g., 'home-assistant-mcp', 'google-home-mcp') |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden. It discloses the prerequisite that the connector must be running and implies synchronous execution ('get the result'), but omits safety profile (read-only vs. destructive), error handling, or side effects of calling arbitrary sub-tools.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three tightly constructed sentences: first defines core action, second provides use case with examples, third states prerequisites. No redundancy, well front-loaded with the essential verb-object pattern.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given moderate complexity (4 params, nested args object) and lack of output schema, the description adequately covers intent, prerequisites, and examples. Could be improved by noting error states (e.g., connector not running) or return structure, but sufficient for tool selection.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, establishing baseline 3. The description repeats examples already present in schema property descriptions (e.g., 'triggers.list') without adding new semantic details about parameter formats, validation rules, or nested args structure beyond what the schema already documents.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Call a tool on a running connector'), the resource type (individual connector tools), and distinguishes from siblings by emphasizing 'without deploying to a client'—contrasting with ateam_build_and_run or deployment tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides clear prerequisites ('The connector must be connected and running') and context ('Use this to test... without deploying to a client'). Lacks an explicit named alternative, though 'without deploying' implies when to prefer this over deployment tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_test_notification (grade A)

Fire a REAL notification at an existing actor in a deployed solution — for end-to-end testing of the system-initiated notification path (telegram/push/app channels).

Unlike ateam_test_skill (synthetic test actor with no channels) and ateam_conversation (user-initiated thread), this calls the /api/internal/notify-user path that PCM and other sibling services use — so the actor's real enabled channels actually receive the message.

Use for:
• Channel fan-out smoke (does telegram/push/app actually receive it?)
• Delivery-result verification (per-channel ok/failed in the response).

Auth: forwards your authed api_key to Core (no master-secret involvement). Tenant is pinned by the key itself — cross-tenant targeting is structurally impossible.

⚠️ SAFETY:
• The text is prefixed with [TEST] in the actual notification (visible to the user, anti-phishing).
• Rate-limited: 10 calls/min per session.
• Every call is audited (caller, tenant, actor, content hash) regardless of outcome.
• actor_id is scoped to your tenant; cross-tenant targeting is rejected by Core's per-tenant Mongo isolation.
• reply_handler is NOT supported via api-key auth (Core ignores it). Routing the user's next reply to an arbitrary skill is a privilege-escalation surface. For routing/engagement tests, use ateam_test_skill.
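
The 10 calls/min limit above is enforced by the server, so a caller that fires test notifications in a loop may want to throttle itself before hitting it. The following is a minimal sketch of a client-side limiter, assuming the limit behaves like a rolling 60-second window (the description does not say whether the window is rolling or fixed). The class and method names are hypothetical and are not part of ateam-mcp.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls inside any rolling `window_s`-second window.

    Assumption: the server-side limit is a rolling window. This class only
    guards the client; it does not talk to A-Team.
    """

    def __init__(self, limit=10, window_s=60.0):
        self.limit = limit
        self.window_s = window_s
        self._stamps = deque()  # timestamps of calls still inside the window

    def try_acquire(self, now):
        """Return True and record the call if a slot is free at time `now`."""
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False
        self._stamps.append(now)
        return True
```

In real use `now` would come from `time.monotonic()`; it is passed in explicitly here so the behavior is easy to check with fixed timestamps.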

Parameters

Name | Required | Description | Default
source | No | Audit label for message.source. Default 'ateam-test'. |
content | Yes | Notification text. Will be sent to all of the actor's enabled channels, prefixed with [TEST] for the recipient. |
urgency | No | Notification urgency. Default 'normal'. |
actor_id | Yes | Target actor ID in your tenant (e.g. 'usr_arie_admin_0001'). Must exist; Core rejects if not found in your tenant. |
metadata | No | Optional metadata merged into message.metadata. Useful for correlation IDs. |
solution_id | Yes | The solution ID (required for tenant scoping + audit context). |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description details behavioral traits: authentication (forwards api_key), tenant scoping (pinned by key), safety (prefix [TEST], rate-limited, audited), cross-tenant rejection, and that reply_handler is ignored. This fully discloses the tool's behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with sections and bullet points, front-loading the purpose. While slightly long, every sentence adds necessary value. Minor redundancy could be tightened, but overall it's efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (real notification, auth, safety, no output schema), the description covers all critical aspects: behavior, use cases, auth, safety measures, rate limits, audit, tenant isolation, and limitations on reply_handler. It leaves no essential context missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% (all 6 parameters described in schema). The description adds some context (e.g., default values for source and urgency, that actor_id must exist), but does not significantly augment the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with 'Fire a REAL notification at an existing actor in a deployed solution — for end-to-end testing of the system-initiated notification path', clearly stating the specific verb and resource. It distinguishes from sibling tools by explicitly contrasting with ateam_test_skill (synthetic, no channels) and ateam_conversation (user-initiated thread).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly lists use cases ('Channel fan-out smoke', 'Delivery-result verification') and implies when not to use it (e.g., for routing/engagement tests, 'use ateam_test_skill'). It also provides context on authentication, safety, and rate limits.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_test_pipeline (grade A)

Test the decision pipeline (intent detection → planning) for a skill WITHOUT executing tools. Returns intent classification, first planned action, and timing. Use this to debug why a skill classifies intent incorrectly or plans the wrong action.

Parameters

Name | Required | Description | Default
message | Yes | The test message to classify and plan for |
skill_id | Yes | The skill ID to test |
solution_id | Yes | The solution ID |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully communicates that no tools are executed (safety), what data is returned ('intent classification, first planned action, and timing'), and the debugging purpose. It could be improved by noting any side effects like logging or state persistence.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of two highly efficient sentences. The first sentence front-loads the core functionality and key constraint (non-execution), while the second sentence immediately follows with the specific use case. There is zero redundant or wasted language.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema), the description adequately covers the essential information by specifying what the tool returns even without a formal output schema. It is complete enough for an agent to select and invoke correctly, though explicit output format details would achieve a 5.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage across all three parameters, the schema already fully documents the inputs. The description references 'skill' and 'message' conceptually but does not add syntax details, constraints, or examples beyond what the schema provides, making 3 the appropriate baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states exactly what the tool does using specific verbs ('Test') and resources ('decision pipeline'), including the specific subprocesses covered ('intent detection → planning'). It effectively distinguishes itself from execution-focused siblings like `ateam_test_skill` by explicitly stating it runs 'WITHOUT executing tools'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear guidance on when to use the tool ('Use this to debug why a skill classifies intent incorrectly or plans the wrong action'). This implies the appropriate troubleshooting context, though it could be strengthened by explicitly contrasting with `ateam_test_skill` for cases requiring full execution.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_test_skill (grade A)

Send a test message to a deployed skill and get the execution result.

Wait modes (wait_for):
• 'root' (default, back-compat): wait until the message's root job completes, then return a single-job result. Fast; ignores any sub-skills the root delegated to via askAnySkill.
• 'chain': wait until EVERY job in the chain (root + handoffs + askAnySkill subcalls, recursively) reaches a terminal state, then return the full chain tree. Use when testing multi-skill flows (orchestrator → workers, builders → sub-builders, etc.). The response.chain field carries chainJobs[] with parentJobId/relation/depth and executionSteps[] with tool-nesting (opId/parentOpId/_toolDepth).

Legacy: wait:false is equivalent to wait_for:'never' — returns job_id immediately for polling via ateam_test_status. wait:true is the same as the default wait_for:'root'.
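
The mapping between the legacy `wait` flag and `wait_for`, together with the documented clamp on `chain_timeout_ms`, can be sketched as two small helpers. This is an illustration of the rules stated in the description and the parameter table, not the server's code. The function names are hypothetical, and the choice to let `wait_for` win when both options are passed is an assumption, because the source does not say which takes precedence.

```python
VALID_WAIT_MODES = ("root", "chain", "never")


def resolve_wait_for(wait=None, wait_for=None):
    """Collapse the legacy `wait` flag and `wait_for` into one mode."""
    if wait_for is not None:
        # Assumption: an explicit wait_for overrides the legacy flag.
        if wait_for not in VALID_WAIT_MODES:
            raise ValueError(f"unknown wait_for mode: {wait_for!r}")
        return wait_for
    if wait is False:
        return "never"  # legacy wait:false means return job_id and poll
    return "root"       # wait:true or omitted keeps the default


def clamp_chain_timeout(ms=None):
    """Apply the documented default (300000) and clamp to [10000, 900000]."""
    if ms is None:
        return 300_000
    return max(10_000, min(900_000, ms))
```

For example, `resolve_wait_for(wait=False)` yields `'never'`, and `clamp_chain_timeout(2_000_000)` is cut down to the 15-minute ceiling.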

Parameters

Name | Required | Description | Default
wait | No | Legacy: if false, return job_id immediately for polling. If true or omitted, behaves like wait_for:'root'. Prefer wait_for going forward. |
message | Yes | The test message to send to the skill |
actor_id | No | Optional actor ID for conversation continuity. Pass the actor_id from a previous test response to continue the conversation. Omit to auto-generate a test actor (test_<timestamp>_<random>, auto-expires in 24h). |
skill_id | Yes | The skill ID to test (original or internal ID) |
wait_for | No | What to wait for before returning. 'root' (default) = root job done; 'chain' = every chain job terminal (use for multi-skill flows); 'never' = return job_id immediately (poll via ateam_test_status). When 'chain', the response includes the chain tree under response.chain. |
solution_id | Yes | The solution ID |
chain_timeout_ms | No | Optional. Max total ms to wait when wait_for:'chain'. Default 300000 (5 min). Long-running chains (skill-factory, large bundle builds) may need higher. Clamped to [10000, 900000]. |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so the description fully discloses behavior. It explains the return structure for each wait mode, legacy behavior, actor_id auto-generation and expiration (24h), and chain_timeout_ms clamping. The description could be slightly more detailed about error scenarios, but it's already rich.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with bullet points for wait modes. It front-loads the purpose. Some slight redundancy in explaining legacy wait behavior, but overall concise and easy to scan.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 7 parameters, 3 required, and no output schema, the description covers all parameters comprehensively. It explains return values for each mode, making it complete for an agent to select and invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so baseline is 3. The description adds significant value beyond the schema: details on wait modes, legacy behavior, actor_id auto-generation, and timeout clamping. It enhances understanding without being redundant.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Send a test message to a deployed skill and get the execution result.' It uses a specific verb ('Send') and resource ('a deployed skill'), distinguishing it from sibling tools like ateam_test_status which is for polling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly explains when to use different wait_for modes: 'root' for simple tests, 'chain' for multi-skill flows, 'never' for polling. It also mentions legacy wait parameter and prefers wait_for. This provides clear decision guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_test_status (grade A)

Poll the progress of an async skill test. Returns iteration count, tool call steps, status (running/completed/failed), and result when done.

Set include_chain:true to ALSO include the full chain tree (every job in the chain, rooted at this job_id, with parent/child linkage). Use when this job dispatched askAnySkill subcalls and you want a single snapshot of the whole multi-skill state instead of polling each child job_id separately.
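
A typical client pairs `wait_for:'never'` on ateam_test_skill with repeated calls to this tool. The loop below is a minimal sketch of that pattern. It assumes the response exposes the state under a `status` key (the description lists the values running/completed/failed but not the field name), and `fetch_status` is a hypothetical callable standing in for however your MCP client invokes ateam_test_status.

```python
import time

TERMINAL_STATES = {"completed", "failed"}


def poll_until_done(fetch_status, job_id, interval_s=2.0, max_polls=150,
                    sleep=time.sleep):
    """Call fetch_status(job_id) until the job reaches a terminal state.

    fetch_status is a caller-supplied function returning a dict; the
    "status" key is an assumed field name, not confirmed by the source.
    """
    for _ in range(max_polls):
        snapshot = fetch_status(job_id)
        if snapshot.get("status") in TERMINAL_STATES:
            return snapshot
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} still running after {max_polls} polls")
```

Passing `sleep` as a parameter keeps the loop testable without real delays. If the job will not finish, ateam_test_abort is the documented way to stop it.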

Parameters

Name | Required | Description | Default
job_id | Yes | The job ID returned by ateam_test_skill |
skill_id | Yes | The skill ID |
solution_id | Yes | The solution ID |
include_chain | No | If true, includes response.chain: the full chain tree rooted at this job_id (chainJobs[] with parentJobId/relation/depth, executionSteps[] with tool-nesting). Costs one extra Core call. Default false (back-compat). |

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description carries the full burden. It discloses that this is a poll operation, lists return fields, and notes that include_chain costs an extra Core call. However, it omits error handling (e.g., invalid job_id), rate limiting, and whether the tool blocks or returns immediately. The term 'poll' implies repeated calls, but more detail on expected behavior would improve transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise and well-structured. The first paragraph front-loads the core purpose and return values. The second paragraph efficiently explains the optional parameter with a use-case rationale. No redundant sentences or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description lists return fields (iteration count, tool call steps, status, result) and explains the optional include_chain output. It provides enough context for an agent to use the tool effectively. Minor gaps: no details on field types or structure, but overall sufficient for a polling tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%. The description adds value beyond the schema by specifying the origin of job_id ('returned by ateam_test_skill') and providing detailed semantics for include_chain, including cost and use case. The other parameters are not elaborated but are sufficiently clear. This goes beyond the baseline 3.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb 'Poll' and identifies the resource as 'progress of an async skill test'. It clearly distinguishes this tool from siblings like ateam_test_skill (initiation) and ateam_test_abort (cancellation), as this is the only polling tool among them.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides guidance for the include_chain parameter ('Use when this job dispatched askAnySkill subcalls') and contrasts with polling child jobs individually. However, it does not explicitly state prerequisites (e.g., that a test must be started first) or when not to use the tool. The implied usage is clear but not exhaustive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_test_voice (grade A)

Simulate a voice conversation with a deployed solution. Runs the full voice pipeline (session → caller verification → prompt → skill dispatch → response) using text instead of audio. Returns each turn with bot response, verification status, tool calls, and entities. Use this to test voice-enabled solutions end-to-end without making a phone call.

Parameters

Name | Required | Description | Default
messages | Yes | Array of user messages to send sequentially (simulates a multi-turn phone conversation) |
skill_slug | No | Optional: target a specific skill by slug instead of using voice routing. |
timeout_ms | No | Optional: max wait time per skill execution in milliseconds (default: 60000). |
solution_id | Yes | The solution ID |
phone_number | No | Optional: simulated caller phone number (e.g., '+14155551234'). If the number is in the solution's known phones list, the caller is auto-verified. |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. Discloses detailed behavioral flow (session → caller verification → prompt → skill dispatch → response) and return structure (bot response, verification status, tool calls, entities). Lacks explicit safety disclosure (state mutation, production impact) but 'simulate' implies read-only testing.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences with zero waste: (1) purpose declaration, (2) pipeline mechanics, (3) return value disclosure, (4) usage guidance. Well front-loaded with critical information first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking output schema, description comprehensively explains return values (turns, verification status, entities). Addresses complexity of 5-parameter voice pipeline tool. Minor gap regarding whether simulation affects production state or requires specific auth.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing baseline 3. Description adds conceptual context linking parameters to voice simulation (text vs audio, multi-turn conversation) but does not add parameter-specific semantics beyond what schema already documents.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb 'Simulate' with resource 'voice conversation' and explicitly states it runs the 'full voice pipeline' using 'text instead of audio.' Effectively distinguishes from sibling tools like ateam_test_skill by emphasizing voice-specific stages (caller verification) and end-to-end scope.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use: 'test voice-enabled solutions end-to-end without making a phone call.' Provides clear context for simulation vs. real calls. Could improve by explicitly contrasting with ateam_test_skill or ateam_test_pipeline for routing decisions.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_upload_connector (grade A)

Upload connector code to Core and restart — WITHOUT redeploying skills.

MERGES with the GitHub state at ref by default (default ref: 'dev'). Sending a partial file set ONLY overlays those files — the rest of the connector is preserved from GitHub. To fully replace the connector dir (historical behavior), pass replace:true.

Modes:
• github:true (no files) - deploy the GitHub state at ref as-is.
• github:true + files:[] - GitHub state at ref as BASE, your files overlay on top (incoming wins).
• files:[] (no github) - default MERGE with GitHub state at ref. Refuses if no GitHub base exists (no silent nuke).
• files:[] + replace:true - full replace. Wipes connector dir + writes only the provided files. Use deliberately.

Common traps this design prevents:
• Pre-fix bug (2026-06-06): sending just ui-dist HTML wiped server.js + node_modules, and the connector broke until a full re-upload. Now: those files merge with the GitHub base.
• Pre-fix bug: github:true silently read from main even when patches were on dev. Now: defaults to dev; pass ref:'main' to opt into the legacy path.
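
The four modes reduce to one rule: start from the GitHub state at ref and overlay the incoming files, unless replace:true asks for a wipe. The sketch below models that rule on plain dictionaries mapping path to content. It illustrates the documented semantics only and is not the server's implementation. `plan_upload` is a hypothetical name, and the errors raised for "replace with no files" and "neither files nor github" are assumptions, since the description does not cover those cases.

```python
def plan_upload(base_files, files=None, github=False, replace=False):
    """Return the file set the connector dir would end up with.

    base_files: dict path -> content for the GitHub state at ref,
                or None when no GitHub base exists.
    files:      dict path -> content sent by the caller, or None.
    """
    if replace:
        if not files:
            # Assumption: a full replace with nothing to write is an error.
            raise ValueError("replace:true needs an explicit file set")
        return dict(files)  # wipe, then write only the provided files
    if not files and not github:
        # Assumption: the call must name at least one source of files.
        raise ValueError("nothing to upload: pass files, github:true, or both")
    if base_files is None:
        # Documented: merge mode refuses when there is no GitHub base.
        raise ValueError("no GitHub base at ref; refusing to merge")
    merged = dict(base_files)
    merged.update(files or {})  # incoming files win on conflicts
    return merged
```

With a base of `server.js` plus `ui-dist/index.html`, sending only a new `ui-dist/index.html` keeps `server.js` in place, which is the partial-upload case the first "common trap" describes.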

Parameters

Name | Required | Description | Default
ref | No | GitHub branch to read from for the BASE state. Default: 'dev' (matches ateam_github_patch). Pass 'main' to read from production. Pre-2026-06-05 callers that relied on the silent-main default must pass ref:'main' explicitly. |
files | No | Files to upload. By default merges with the GitHub state at `ref`. Set replace:true to wipe the connector dir and write only these files. |
github | No | If true, pull connector files from GitHub repo at `ref`. Default: false. Combine with files:[] to use GitHub as the base and overlay your files. |
replace | No | Opt into FULL REPLACE: wipe the connector dir and write only the provided `files`. Default: false (= merge with GitHub state at `ref`). Use with intent: sending an incomplete file set with replace:true will break the connector. |
solution_id | Yes | The solution ID |
connector_id | Yes | The connector ID to upload (e.g. 'personal-assistant-ui-mcp') |

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Despite no annotations, the description fully discloses behavioral traits: it restarts the connector, merges with GitHub state by default, and warns that replace:true is destructive. It also explains past bug fixes to clarify current behavior. This level of transparency is exceptional.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-organized with a clear main purpose, a section on merging behavior, a list of modes, and a note on common traps. Each sentence adds value, and the structure is front-loaded with the key action. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers the tool's purpose, modes, and parameter interactions thoroughly. However, it lacks explicit mention of the return value or output format (e.g., whether it returns a status, error, or nothing). Given the complexity and no output schema, a brief note on expected output would improve completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema has 100% description coverage, but the tool description adds significant value beyond the schema by explaining parameter interactions and usage contexts (e.g., 'files:[] + replace:true — full replace'). This extra semantic guidance helps the agent correctly combine parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's primary action: 'Upload connector code to Core and restart — WITHOUT redeploying skills.' It identifies the specific resource (connector code to Core) and distinguishes it from a related operation (redeploying skills). This is specific and actionable.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides detailed usage guidance, including four distinct modes with parameter combinations (github, files, replace) and explains when to use each. It also warns about common traps and historical bugs, helping the agent avoid pitfalls. This explicit guidance is excellent.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ateam_verify_consistency (grade A)

Check that the Builder filesystem state and GitHub state are in sync for a solution. Read-only probe — does NOT trigger a deploy.

Returns:
• ok: true + drifts: [] if everything matches
• ok: false + drifts: [{path, kind}] listing files that differ (kinds: fs_missing, gh_missing, content_differs)
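A minimal sketch of how a caller might model and summarize this return shape. The field names ok, drifts, path and kind come from the description above; the type names, the summarize helper, and the example paths are hypothetical and not part of the server.

```python
from collections import Counter
from typing import TypedDict


class Drift(TypedDict):
    path: str
    kind: str  # one of: fs_missing, gh_missing, content_differs


class ConsistencyResult(TypedDict):
    ok: bool
    drifts: list[Drift]


def summarize(result: ConsistencyResult) -> str:
    """Turn a consistency result into a one-line summary grouped by drift kind."""
    if result["ok"]:
        return "in sync"
    counts = Counter(d["kind"] for d in result["drifts"])
    parts = [f"{kind}={n}" for kind, n in sorted(counts.items())]
    return "drift: " + ", ".join(parts)


# Hypothetical payload; the paths are invented for illustration only.
example: ConsistencyResult = {
    "ok": False,
    "drifts": [
        {"path": "skills/a.yaml", "kind": "content_differs"},
        {"path": "connectors/b.js", "kind": "fs_missing"},
    ],
}
print(summarize(example))
```

Grouping by kind keeps the summary short even when many files have drifted, while the full path list remains available in the raw result.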

Drift can creep in when GitHub writes happen but Builder FS doesn't get the mirror update (network blip, container restart mid-write). Boot sync heals most of it on next backend restart; this tool surfaces drift earlier.
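To make the three drift kinds concrete, here is an illustrative comparison of two path-to-content maps. This is not the server's implementation. It assumes fs_missing means a file exists on GitHub but not on the Builder filesystem, and gh_missing means the reverse; the find_drifts name and the dictionary inputs are hypothetical.

```python
def find_drifts(fs_files: dict[str, str], gh_files: dict[str, str]) -> list[dict[str, str]]:
    """Compare two {path: content} maps and report where they disagree."""
    drifts = []
    for path in sorted(set(fs_files) | set(gh_files)):
        if path not in fs_files:
            # Assumed meaning: present on GitHub, absent on the Builder FS.
            drifts.append({"path": path, "kind": "fs_missing"})
        elif path not in gh_files:
            # Assumed meaning: present on the Builder FS, absent on GitHub.
            drifts.append({"path": path, "kind": "gh_missing"})
        elif fs_files[path] != gh_files[path]:
            drifts.append({"path": path, "kind": "content_differs"})
    return drifts


# Invented file names, used only to exercise each branch.
fs = {"a.txt": "1", "b.txt": "2", "d.txt": "same"}
gh = {"a.txt": "1-changed", "c.txt": "3", "d.txt": "same"}
drifts = find_drifts(fs, gh)
```

An empty list from this comparison would correspond to ok: true in the tool's return value.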

Run after a series of ateam_github_patch calls to confirm the Builder backend is consistent with GitHub before you ateam_build_and_run.
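The recommended ordering could be enforced with a small gate like the sketch below. The call_tool callable is a hypothetical stand-in for whatever MCP client invokes tools, and passing solution_id to ateam_build_and_run is an assumption, since that tool's parameters are not shown here.

```python
from typing import Any, Callable

# Hypothetical signature: (tool_name, arguments) -> parsed tool result.
ToolCaller = Callable[[str, dict[str, Any]], dict[str, Any]]


def verify_then_build(call_tool: ToolCaller, solution_id: str) -> dict[str, Any]:
    """Run the read-only consistency probe and build only if no drift is reported."""
    check = call_tool("ateam_verify_consistency", {"solution_id": solution_id})
    if not check.get("ok"):
        paths = [d["path"] for d in check.get("drifts", [])]
        raise RuntimeError(f"drift detected, not building: {paths}")
    # Assumption: ateam_build_and_run accepts a solution_id argument.
    return call_tool("ateam_build_and_run", {"solution_id": solution_id})
```

Raising on drift, rather than building anyway, leaves the decision about how to heal the drift to the caller.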

Parameters
Name | Required | Description | Default
solution_id | Yes | The solution ID to verify | -
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description documents the return values (ok plus drifts), confirms that the tool is read-only and does not deploy, and explains what causes drift. With no annotations, the description carries the full disclosure burden, and it does not mention authentication or permission requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Concise yet complete: purpose first, then return format, drift explanation, and usage guidance. No wasted sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple 1-parameter tool with no output schema, description fully explains return format, drift semantics, and usage context. Nothing missing for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool takes a single parameter, solution_id, whose schema description is 'The solution ID to verify'. The tool description adds no meaning beyond the schema, and because schema coverage is 100%, the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool checks sync between Builder FS and GitHub, explicitly says it's a read-only probe not a deploy, and distinguishes from sibling tools like ateam_github_diff and ateam_sync_all.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance: 'Run after a series of ateam_github_patch calls to confirm consistency before ateam_build_and_run.' Also explains drift causes and that boot sync heals most. Missing explicit when-not-to-use scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
