Glama
Ownership verified

Server Details

Connect engineering metrics, DORA performance, and deploy risk scoring to any AI assistant. Score PRs for deployment risk using a 36-signal model, query team health, incidents, coverage, and more.

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:
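Since the transport is Streamable HTTP, a client talks to the server by POSTing JSON-RPC messages. A minimal sketch, assuming the standard MCP Streamable HTTP headers; the endpoint URL is not shown on this page, so the one below is a hypothetical placeholder:

```python
import json
import urllib.request

# The page omits the server URL; this placeholder is purely hypothetical.
ENDPOINT = "https://example.invalid/mcp"

def build_streamable_http_request(payload: dict) -> urllib.request.Request:
    """Build (without sending) a Streamable HTTP POST carrying one JSON-RPC message."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )

req = build_streamable_http_request({"jsonrpc": "2.0", "id": 1, "method": "ping"})
```

The request is only constructed here, not sent; a real client would also handle the session header and SSE responses the transport allows.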
Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

(Connection flow: MCP client → Glama MCP Gateway → MCP server)

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions (Grade: A)

Average 4.2/5 across 20 of 20 tools scored.

Server Coherence (Grade: A)
Disambiguation: 4/5

Most tools have distinct purposes targeting specific metrics or entities (e.g., ai.summary vs. coverage.summary vs. pr.summary), but some overlap exists. For example, pr.open and pr.at_risk both identify problematic pull requests, though their descriptions clarify different focuses (open vs. at-risk). Overall, descriptions help differentiate tools, but minor ambiguity remains in a few cases.

Naming Consistency: 5/5

Tool names follow a highly consistent dot-separated pattern (e.g., ai.summary, developers.get, pr.open) with clear entity.action or entity.subcategory conventions. This predictability makes it easy for agents to understand the structure and locate tools, with no deviations or mixed naming styles observed across the set.
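The entity.action convention described above is regular enough that an agent or client can parse it mechanically, e.g. to group tools by entity. A trivial sketch:

```python
def split_tool_name(name: str) -> tuple[str, str]:
    """Split a dot-separated MCP tool name into (entity, action)."""
    entity, _, action = name.partition(".")
    return entity, action

# Names taken from this server's tool list.
pairs = [split_tool_name(n) for n in ("ai.summary", "developers.get", "pr.at_risk")]
```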

Tool Count: 4/5

With 20 tools, the count is slightly high but reasonable for the broad scope of engineering metrics and analytics. Each tool appears to serve a specific purpose without obvious redundancy, though some consolidation might be possible (e.g., pr.open and pr.at_risk). The number aligns well with covering multiple domains like AI adoption, DORA metrics, pull requests, and team health.

Completeness: 5/5

The tool set provides comprehensive coverage for engineering analytics, spanning AI adoption, test coverage, developer metrics, DORA metrics, incidents, organizational health, pull requests, repositories, risk assessment, search, and team management. There are no apparent gaps; agents can access high-level summaries, detailed entity metrics, trends, and search capabilities, enabling full workflow support without dead ends.

Available Tools (20 tools)
ai.summary: AI Coding Tool Adoption Summary (Grade: A)
Read-only, Idempotent

Get AI coding tool adoption metrics including GitHub Copilot acceptance rate, Cursor active users, AI-generated code percentage, and suggestions per developer. Use this to understand how the team is using AI coding assistants and measure their impact on productivity. Read-only.

Parameters (JSON Schema)
- teamId (optional): Team ID to filter AI adoption metrics to a specific team. Omit for org-wide summary.
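A call to this tool follows the standard MCP tools/call shape; because teamId is optional, an org-wide query simply omits it. A sketch (the teamId value below is a made-up example, not a real ID):

```python
def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Assemble an MCP tools/call JSON-RPC message for one of this server's tools."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Org-wide summary: omit teamId entirely rather than passing null.
org_wide = build_tool_call("ai.summary", {})
# Team-scoped summary; "team-platform" is a hypothetical ID.
scoped = build_tool_call("ai.summary", {"teamId": "team-platform"}, request_id=2)
```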
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable context beyond annotations by specifying the metrics included (e.g., GitHub Copilot acceptance rate, AI-generated code percentage) and the tool's purpose for measuring productivity impact. While annotations cover safety (readOnlyHint=true, destructiveHint=false) and idempotency, the description enriches understanding of what data is retrieved without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by usage context, and ends with a clear behavioral note ('Read-only'). Every sentence adds value without redundancy, making it efficient and well-structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 optional parameter), rich annotations (covering safety and idempotency), and no output schema, the description is mostly complete. It specifies the metrics and purpose, but could slightly improve by hinting at the return format (e.g., structured data vs. text) to fully compensate for the lack of output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already fully documents the single parameter (teamId). The description doesn't add any parameter-specific details beyond what's in the schema, such as format examples or default behavior when omitted. Baseline 3 is appropriate as the schema handles parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get AI coding tool adoption metrics') and lists the exact metrics included (GitHub Copilot acceptance rate, Cursor active users, etc.). It distinguishes this tool from siblings by focusing specifically on AI coding tool metrics, unlike broader tools like 'org.health' or 'dora.summary'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand how the team is using AI coding assistants and measure their impact on productivity'), but it doesn't explicitly state when not to use it or name specific alternatives among the sibling tools. The guidance is helpful but lacks explicit exclusions or comparisons.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai.trend: AI Tool Adoption Trend (Grade: A)
Read-only, Idempotent

Get the trend of AI tool adoption over time showing weekly active users, acceptance rates, and code attribution percentages. Use this to track whether AI tool adoption is growing or declining across the team. Read-only.

Parameters (JSON Schema)
- days (optional): Number of days to look back for trend data (default: 90)
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only, open-world, idempotent, and non-destructive properties. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and specifying the time-based nature of trend data, which provides useful context beyond what annotations convey about safety and behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: the first states what the tool does, the second provides usage context, and the third reinforces behavioral traits. Each sentence adds value without redundancy, making it front-loaded and appropriately sized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (single parameter, no output schema), the description is mostly complete. It covers purpose, metrics, usage context, and behavioral traits. However, it lacks details on output format or data interpretation, which could be helpful since there's no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'days' fully documented in the schema. The description does not add any parameter-specific information beyond what the schema provides, so it meets the baseline for high schema coverage without compensating with extra details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get the trend') and resources ('AI tool adoption'), and distinguishes it from siblings by focusing on adoption metrics over time rather than summaries or other data types. It explicitly lists the metrics returned: weekly active users, acceptance rates, and code attribution percentages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('to track whether AI tool adoption is growing or declining across the team'), but does not explicitly state when to use this tool versus alternatives like ai.summary or dora.trend. It provides a general purpose but lacks specific guidance on tool selection among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

coverage.summary: Test Coverage Summary (Grade: A)
Read-only, Idempotent

Get test coverage summary per repository showing overall coverage percentage, lines covered, and coverage trend. Use this to understand testing health and identify repositories with low coverage that may warrant additional review attention. Read-only.

Parameters (JSON Schema)
- repositoryId (optional): Filter to a specific repository by ID. Omit to see all repositories.
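The tool publishes no output schema, so any client-side processing has to assume a response shape. A sketch that flags low-coverage repositories; the 'name' and 'coveragePercent' field names are assumptions, not documented fields:

```python
def low_coverage_repos(repos: list[dict], threshold: float = 70.0) -> list[str]:
    """Return names of repositories whose coverage falls below the threshold.

    The field names here are assumed; the tool defines no output schema."""
    return sorted(r["name"] for r in repos if r["coveragePercent"] < threshold)

# Hypothetical sample data in the assumed shape.
sample = [
    {"name": "api", "coveragePercent": 82.5},
    {"name": "worker", "coveragePercent": 54.1},
]
flagged = low_coverage_repos(sample)
```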
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already cover read-only, non-destructive, idempotent, and open-world behavior. The description adds useful context by specifying the scope ('per repository') and the types of data returned (coverage percentage, lines, trend), which helps the agent understand what to expect beyond the basic safety profile.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by usage context and a behavioral note. Every sentence adds value without redundancy, and it efficiently conveys necessary information in three concise sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 optional parameter), rich annotations, and no output schema, the description is mostly complete. It covers purpose, usage, and key return data, though it could optionally mention response format or pagination for completeness, but this is not critical here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents the single optional parameter. The description does not add any parameter-specific information beyond what the schema provides, but it implies the tool can return data for all repositories if no filter is applied, which aligns with the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get test coverage summary'), resource ('per repository'), and key metrics ('overall coverage percentage, lines covered, and coverage trend'). It distinguishes this tool from siblings by focusing on test coverage rather than AI, DORA, PR, or other repository/team metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use it ('to understand testing health and identify repositories with low coverage that may warrant additional review attention'), but does not explicitly mention when not to use it or name alternative tools for similar purposes among the siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

developers.get: Get Developer Profile (Grade: A)
Read-only, Idempotent

Get metrics for an individual developer including their PR throughput, review activity, average cycle time, and code contributions. Use this when asked about a specific engineer's activity or productivity. Read-only.

Parameters (JSON Schema)
- login (required): GitHub username / login of the developer (e.g. "jsmith")
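Since 'login' is the tool's only parameter and is required, a client can fail fast on blank input before issuing the call. A minimal sketch:

```python
def developers_get_arguments(login: str) -> dict:
    """'login' is required by the schema; reject blank values client-side."""
    cleaned = login.strip()
    if not cleaned:
        raise ValueError('login is required, e.g. "jsmith"')
    return {"login": cleaned}
```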
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds 'Read-only' (redundant with annotations) and implies it returns metrics, but doesn't disclose behavioral traits like rate limits, authentication needs, or data freshness. With annotations, the bar is lower, and it adds minimal context beyond them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the purpose, followed by usage guidance and a safety note. It uses two concise sentences with zero waste—every part adds value (metrics list, usage context, read-only confirmation). Efficiently structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 parameter, no output schema), rich annotations (covering safety and idempotency), and high schema coverage, the description is mostly complete. It specifies metrics and usage context, but lacks details on output format or error handling, which could be helpful despite annotations. However, for a simple read operation, it's adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the 'login' parameter fully documented in the schema. The description doesn't add any meaning beyond the schema (e.g., it doesn't clarify format constraints or examples not in the schema). Baseline is 3 when schema coverage is high, as the schema carries the parameter documentation burden.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Get') and resource ('metrics for an individual developer'), specifying the exact metrics included (PR throughput, review activity, average cycle time, code contributions). It distinguishes from siblings like 'developers.top' (which likely aggregates multiple developers) by focusing on individual profiles.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'when asked about a specific engineer's activity or productivity.' This provides clear context for usage; although it doesn't explicitly name alternatives (e.g., 'developers.top' for top performers), the guidance is sufficient for an agent to apply it correctly in relevant scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

developers.top: Top Contributors (Grade: A)
Read-only, Idempotent

List the most active contributors ranked by commits and PRs merged over a time window. Use this to identify key contributors, bus-factor risks, or to recognize top performers. Read-only.

Parameters (JSON Schema)
- days (optional): Time window in days to measure activity (default: 30)
- limit (optional): Number of contributors to return (default: 10, max: 50)
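The schema documents a default of 10 and a maximum of 50 for 'limit'; a client can clamp values before calling rather than risk a server-side rejection. A sketch:

```python
def developers_top_arguments(days: int = 30, limit: int = 10) -> dict:
    """Clamp 'limit' into the documented 1..50 range before calling developers.top."""
    return {"days": days, "limit": max(1, min(limit, 50))}
```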
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only, open-world, idempotent, and non-destructive properties. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and specifying that it lists 'ranked' contributors, which provides behavioral context beyond annotations. It does not mention rate limits or authentication needs, but annotations cover key safety aspects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by usage context and a behavioral note. Every sentence earns its place by adding value without redundancy, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (ranking contributors), rich annotations (covering safety and idempotency), and no output schema, the description is mostly complete. It explains the purpose, usage, and behavioral traits, but could benefit from mentioning output format (e.g., list structure) or ranking methodology to fully compensate for the lack of output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for both parameters ('days' and 'limit'). The description adds no additional parameter semantics beyond what the schema provides, such as explaining how ranking is calculated or what 'activity' entails. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('List the most active contributors'), identifies the resource ('contributors'), and specifies the ranking criteria ('by commits and PRs merged over a time window'). It distinguishes this tool from siblings like 'developers.get' by focusing on ranking rather than retrieving individual developer data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to identify key contributors, bus-factor risks, or to recognize top performers'), giving practical applications. However, it does not explicitly mention when NOT to use it or name alternative tools for similar purposes, such as 'ai.summary' or 'pr.summary'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dora.summary: DORA Metrics Summary (Grade: A)
Read-only, Idempotent

Get DORA metrics (deploy frequency, lead time for changes, change failure rate, MTTR) for the organization or a specific team. Use this to understand overall engineering delivery performance and reliability. Read-only — returns aggregated metrics for the selected time window.

Parameters (JSON Schema)
- from (optional): Start date as ISO string (e.g. 2024-01-01). Defaults to 30 days ago.
- to (optional): End date as ISO string (e.g. 2024-01-31). Defaults to today.
- teamId (optional): Team ID to filter metrics for a specific team
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, openWorldHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds useful context by specifying that it returns aggregated metrics and is read-only, but does not disclose additional behavioral traits like rate limits, authentication needs, or data freshness. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by usage context and behavioral note. It consists of two efficient sentences with no wasted words, making it easy to scan and understand quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity, rich annotations, and 100% schema coverage, the description is mostly complete. It covers purpose, usage, and key behavioral aspects. However, without an output schema, it could benefit from more detail on the structure of returned metrics, though annotations help mitigate this gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear documentation for all three parameters (to, from, teamId). The description adds marginal value by mentioning the time window and team filtering, but does not provide additional semantics beyond what the schema already covers, such as format details or default behaviors.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get DORA metrics'), identifies the resources (deploy frequency, lead time for changes, change failure rate, MTTR), and distinguishes the scope (organization or specific team). It explicitly differentiates from siblings by focusing on DORA metrics, unlike tools like 'ai.summary' or 'pr.summary'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand overall engineering delivery performance and reliability') and implies usage for aggregated metrics over a time window. However, it does not explicitly state when not to use it or name specific alternatives among siblings, such as 'dora.trend' for trend analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dora.trend: DORA Metric Trend (Grade: A)
Read-only, Idempotent

Get DORA metric trend over time to see how deployment frequency, lead time, change failure rate, or MTTR has changed week over week. Useful for spotting regressions or improvements across releases. Read-only.

Parameters (JSON Schema)
- metric (required): Which DORA metric to trend: deploy_frequency, lead_time, change_failure_rate, or mttr
- days (optional): Number of days to look back (default: 90). Max recommended: 365.
- teamId (optional): Team ID to filter metrics for a specific team
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and providing context on the tool's purpose (trend analysis for spotting changes), though it lacks details on rate limits, authentication needs, or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by utility context and a safety note. All three sentences earn their place: defining the tool, its use case, and behavioral trait. No wasted words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema), annotations cover safety and idempotency well, and the description clarifies the trend analysis purpose. However, without an output schema, the description could better explain the return format (e.g., time-series data) to enhance completeness for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for all parameters (days, metric, teamId). The description mentions the four metric options but does not add meaning beyond what the schema provides (e.g., explaining metric definitions or teamId usage). Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'DORA metric trend over time', specifying the four specific metrics (deployment frequency, lead time, change failure rate, MTTR) and the temporal scope 'week over week'. It distinguishes from sibling tools like 'dora.summary' by focusing on trend analysis rather than summary statistics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'to see how metrics have changed week over week' and 'useful for spotting regressions or improvements across releases'. However, it does not explicitly state when not to use it or name specific alternatives among siblings (e.g., 'dora.summary' for aggregated metrics).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

incidents.list: Recent Production Incidents (Grade: A)
Read-only, Idempotent

List recent production incidents from PagerDuty or OpsGenie with their severity, MTTR (mean time to recovery), and affected services. Use this to understand reliability posture or investigate a recent outage in context with deployment activity. Read-only.

Parameters (JSON Schema)
- days (optional): Number of days to look back for incidents (default: 30)
- limit (optional): Maximum number of incidents to return (default: 20)
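With no output schema published, aggregating over the response requires assuming field names. A sketch that averages MTTR across returned incidents; the 'mttrMinutes' field name (and the sample records) are hypothetical:

```python
def mean_mttr_minutes(incidents: list[dict]) -> float:
    """Average MTTR across incident records; returns 0.0 for an empty list.

    'mttrMinutes' is an assumed field name, since no output schema is published."""
    if not incidents:
        return 0.0
    return sum(i["mttrMinutes"] for i in incidents) / len(incidents)

# Hypothetical sample records in the assumed shape.
sample_incidents = [
    {"id": "INC-1", "severity": "sev2", "mttrMinutes": 42},
    {"id": "INC-2", "severity": "sev1", "mttrMinutes": 118},
]
```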
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond what annotations provide. While annotations already declare read-only, open-world, idempotent, and non-destructive properties, the description explicitly states 'Read-only' and mentions the tool's focus on 'recent production incidents' with specific data fields. This helps the agent understand the scope and output format, though it doesn't mention rate limits or authentication requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise and well-structured in just two sentences. The first sentence establishes the core functionality, the second provides usage context, and the final 'Read-only' statement reinforces behavioral transparency. Every element serves a clear purpose with zero wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema), the description provides good contextual completeness. It explains what the tool does, when to use it, and key behavioral aspects. However, without an output schema, the description could benefit from more detail about the structure of returned incident data beyond the mentioned fields (severity, MTTR, affected services).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already fully documents both parameters (days and limit) with their types and default values. The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation without providing additional semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('List recent production incidents') and resources ('from PagerDuty or OpsGenie'), including key data fields like severity, MTTR, and affected services. It distinguishes itself from sibling tools by focusing on incident data rather than developer metrics, DORA metrics, or repository information.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage context ('to understand reliability posture or investigate a recent outage in context with deployment activity'), giving the agent concrete scenarios for when to use this tool. However, it doesn't explicitly mention when NOT to use it or name specific alternatives among the sibling tools for similar purposes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

org.health: Organization Health Snapshot (A)
Read-only, Idempotent

Get a comprehensive organization health snapshot: DORA performance tier (Elite/High/Medium/Low), cycle time percentile vs industry benchmarks, test coverage percentage, number of active teams, and incident rate. Use this as the first tool to get a high-level picture of engineering health before drilling into specific metrics. Read-only.

Parameters (JSON Schema)

| Name   | Required | Description                                                                     |
|--------|----------|---------------------------------------------------------------------------------|
| teamId | No       | Team ID to scope health metrics to a specific team. Omit for org-wide snapshot. |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond what annotations provide. While annotations already indicate read-only, open-world, idempotent, and non-destructive characteristics, the description explicitly states 'Read-only' for emphasis and clarifies this is a 'snapshot' tool for getting a 'high-level picture.' However, it doesn't mention potential rate limits, authentication requirements, or data freshness considerations that would be useful for an agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly structured and concise. The first sentence clearly states the purpose and lists specific metrics. The second sentence provides explicit usage guidance. The final word 'Read-only' adds important behavioral emphasis. Every sentence earns its place with zero wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (providing multiple health metrics) and the absence of an output schema, the description does well by listing the specific metrics returned. However, it doesn't explain the format or structure of the return data (e.g., whether it's a single object with all metrics, what units are used, or how benchmarks are calculated). With rich annotations but no output schema, the description could be more complete about the response format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage for the single parameter (teamId), the schema already documents its purpose. The description doesn't add specific parameter semantics beyond what's in the schema, but it does provide important context about scoping by mentioning 'org-wide snapshot' when the parameter is omitted, which helps the agent understand the default behavior. Since there's only one parameter with good schema coverage, this is sufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get a comprehensive organization health snapshot') and lists the exact metrics returned (DORA performance tier, cycle time percentile, test coverage percentage, number of active teams, incident rate). It distinguishes this tool from siblings by emphasizing it provides a 'high-level picture' before drilling into specific metrics, unlike more focused tools like dora.summary or coverage.summary.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('Use this as the first tool to get a high-level picture of engineering health before drilling into specific metrics') and provides clear context about its purpose as an initial assessment tool. It distinguishes from alternatives by positioning it as a starting point rather than a detailed analysis tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

org.wellbeing: Team Well-Being & Burnout Risk (A)
Read-only, Idempotent

Get team well-being scores across pillars: focus time (uninterrupted deep work hours), meeting load (percentage of time in meetings), context switching (task interruptions per day), and burnout risk indicators. Use this to understand developer experience and identify teams under stress before it affects delivery. Read-only.

Parameters (JSON Schema)

| Name   | Required | Description                                                   |
|--------|----------|---------------------------------------------------------------|
| teamId | No       | Team ID to filter well-being data. Omit for org-wide summary. |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond what annotations provide: it specifies the four specific well-being pillars measured (focus time, meeting load, context switching, burnout risk) and the tool's purpose for proactive stress identification. While annotations already declare it as read-only, non-destructive, idempotent, and open-world, the description usefully elaborates on what data is actually retrieved.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first explains what the tool does and what metrics it returns, the second explains its purpose and includes the read-only declaration. Every element serves a clear purpose with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with comprehensive annotations and a simple parameter schema, the description provides good contextual completeness by explaining the specific well-being pillars and the tool's purpose. The main gap is the lack of output schema, but the description compensates somewhat by listing the types of metrics returned.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already fully documents the single optional parameter (teamId). The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation without providing extra semantic value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get team well-being scores') and resources ('across pillars: focus time, meeting load, context switching, and burnout risk indicators'), distinguishing it from sibling tools like 'org.health' or 'risk.score' by focusing specifically on developer well-being metrics rather than general organizational health or risk assessments.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand developer experience and identify teams under stress before it affects delivery'), but doesn't explicitly state when not to use it or name specific alternatives among the sibling tools, such as 'org.health' for broader organizational metrics.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pr.at_risk: At-Risk Pull Requests (A)
Read-only, Idempotent

Get pull requests at risk of becoming long-running or blocked. These are PRs that have been open for more than 3 days, have no reviews, or are very large (>500 lines). Use this to prompt engineering leads to take action on blocked work before it impacts the team. Read-only.

Parameters (JSON Schema)

| Name         | Required | Description                                                                         |
|--------------|----------|-------------------------------------------------------------------------------------|
| repositoryId | No       | Filter to a specific repository by ID. Omit to return at-risk PRs across all repos. |
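The at-risk criteria quoted in the description can be restated as a simple predicate: open more than 3 days (72h), no reviews, or more than 500 lines changed, combined with OR. The sketch below mirrors those documented thresholds; the field names are assumptions, not the server's actual response schema.

```python
def is_at_risk(open_hours: float, review_count: int, lines_changed: int) -> bool:
    """Mirror pr.at_risk's documented thresholds (field names are illustrative)."""
    return open_hours > 72 or review_count == 0 or lines_changed > 500
```

Because the criteria are OR-ed, a small, freshly opened PR with zero reviews is still flagged.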
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds valuable context by specifying the risk criteria (3 days, no reviews, >500 lines) and the intended use case for prompting leads, which goes beyond the annotations. No contradiction exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by risk criteria, usage context, and a safety note ('Read-only'). Every sentence adds value without redundancy, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (filtering at-risk PRs), rich annotations (covering safety and behavior), and no output schema, the description is largely complete. It explains the risk logic and use case, though it could briefly mention output format (e.g., list of PRs with details) for full completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter (repositoryId), with the schema fully documenting its purpose and optionality. The description does not add any parameter-specific details beyond what the schema provides, meeting the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific verb ('Get') and resource ('pull requests at risk'), with explicit criteria defining 'at-risk' (open >3 days, no reviews, >500 lines). It distinguishes from sibling tools like 'pr.open' or 'pr.summary' by focusing on problematic PRs needing intervention.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'to prompt engineering leads to take action on blocked work before it impacts the team.' It also spells out the consequence of inaction, giving clear context for proactive management.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pr.open: List Open Pull Requests (A)
Read-only, Idempotent

List currently open pull requests with their age in hours, size (additions + deletions), and reviewer assignments. Use this to identify stale or large PRs that may be blocking the team. Optionally filter to high-risk PRs only (large, old, or no reviewers). Read-only.

Parameters (JSON Schema)

| Name         | Required | Description                                                                                 |
|--------------|----------|---------------------------------------------------------------------------------------------|
| highRiskOnly | No       | If true, only return PRs that are large (>500 lines), old (>72h open), or have no reviewers |
| repositoryId | No       | Filter to a specific repository by ID                                                       |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds valuable context beyond annotations by specifying what data is returned (age in hours, size, reviewer assignments) and the purpose (identifying stale/large/blocking PRs). It doesn't contradict annotations and provides useful behavioral context about the tool's output and filtering logic.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with three sentences that each serve a distinct purpose: stating what the tool does, explaining when to use it, and describing the optional filter. There's no wasted language, and the most important information (listing open PRs) comes first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only listing tool with comprehensive annotations and full parameter documentation, the description provides excellent context about what data is returned and why to use it. The only minor gap is the lack of output schema, but the description compensates by specifying the return fields (age, size, reviewer assignments). It could be slightly more complete by mentioning pagination or result limits.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents both parameters. The description adds minimal semantic context by mentioning 'Optionally filter to high-risk PRs only' which aligns with the highRiskOnly parameter, but doesn't provide additional meaning beyond what's in the schema descriptions. This meets the baseline expectation for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the verb ('List') and resource ('currently open pull requests') with specific attributes (age, size, reviewer assignments). It clearly distinguishes from siblings like 'pr.at_risk' by focusing on open PRs rather than at-risk metrics, and from 'pr.summary' by providing detailed listing rather than aggregated summary.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage context: 'Use this to identify stale or large PRs that may be blocking the team.' It also specifies when to use the optional filter: 'Optionally filter to high-risk PRs only (large, old, or no reviewers).' This gives clear guidance on when this tool is appropriate versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pr.summary: Pull Request Metrics Summary (A)
Read-only, Idempotent

Get pull request metrics including cycle time (time from first commit to merge), throughput (PRs merged per week), review health (time to first review, reviewer distribution), and PR size trends. Use this to assess code review efficiency and identify bottlenecks in your delivery pipeline. Read-only.

Parameters (JSON Schema)

| Name   | Required | Description                          | Default |
|--------|----------|--------------------------------------|---------|
| days   | No       | Number of days to look back          | 30      |
| teamId | No       | Team ID to filter to a specific team |         |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds useful context by specifying the metrics included (cycle time, throughput, etc.) and stating 'Read-only', but the annotations already cover readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, so the description does not disclose significant additional behavioral traits beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with the first sentence detailing the tool's purpose and metrics, the second providing usage context, and the third stating behavioral traits, all without unnecessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity, rich annotations, and lack of an output schema, the description is mostly complete but could benefit from more detail on output format or data structure, as it only lists metrics without specifying how they are returned.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not add meaning beyond the input schema, which has 100% coverage with clear descriptions for both parameters. With high schema coverage, the baseline score is 3, as the schema adequately documents the parameters without needing extra explanation in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get pull request metrics') and resources ('cycle time, throughput, review health, PR size trends'), and distinguishes it from siblings by focusing on PR metrics rather than AI, DORA, or other domain summaries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to assess code review efficiency and identify bottlenecks in your delivery pipeline'), but does not explicitly state when not to use it or name specific alternatives among the sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repos.get: Get Repository Details (A)
Read-only, Idempotent

Get detailed metrics for a specific repository including deployment frequency, PR cycle time, contributor count, and code health indicators. Use this when asked about a specific codebase. Read-only.

Parameters (JSON Schema)

| Name | Required | Description                                                                         |
|------|----------|-------------------------------------------------------------------------------------|
| name | Yes      | Repository name (e.g. "my-service") or full name with owner (e.g. "org/my-service") |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds 'Read-only' which reinforces the annotations, and specifies the scope ('detailed metrics for a specific repository') and types of metrics returned, providing useful behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences that are front-loaded with purpose and usage, with zero wasted words. Every element (what it does, when to use it, behavioral note) serves a clear function without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (single parameter, read-only operation) and rich annotations covering safety and behavior, the description provides sufficient context about what metrics are returned and when to use it. The lack of an output schema is mitigated by the description listing specific metric types, though it doesn't fully detail return format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'name' fully documented in the schema. The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline of 3 where the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and resource 'detailed metrics for a specific repository', listing specific metrics like deployment frequency, PR cycle time, contributor count, and code health indicators. It distinguishes from sibling 'repos.list' by focusing on a single repository's details rather than listing repositories.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage context: 'Use this when asked about a specific codebase.' This gives clear guidance on when to invoke the tool. However, it doesn't specify when NOT to use it or mention alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repos.list: List Connected Repositories (A)
Read-only, Idempotent

List all repositories connected to Koalr with their health scores, activity levels, and basic stats. Use this to get an overview of the codebase landscape or find repository IDs for filtering other tools. Read-only.

Parameters (JSON Schema)

| Name  | Required | Description                                         | Default |
|-------|----------|-----------------------------------------------------|---------|
| limit | No       | Maximum number of repositories to return (max: 200) | 50      |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds some behavioral context beyond annotations: it mentions the tool is 'read-only' (though annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true) and implies it returns a list with specific data fields (health scores, activity levels, basic stats). However, it lacks details on pagination, rate limits, or authentication needs, which would be valuable given the openWorldHint=true annotation. No contradiction with annotations exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by usage guidance and a behavioral note. It is concise with three sentences, each adding value without redundancy, making it easy for an AI agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 optional parameter, no output schema), the description is mostly complete: it covers purpose, usage, and basic behavioral traits. However, it could improve by addressing pagination or response format details, especially since openWorldHint=true suggests potential large datasets. The annotations provide safety and idempotency context, but the description could better complement them.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not mention the 'limit' parameter at all, but the input schema has 100% description coverage, providing default and max values. With high schema coverage, the baseline score is 3, as the description adds no additional parameter semantics beyond what the schema already documents.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('List all repositories'), identifies the resource ('connected to Koalr'), and specifies the scope ('with their health scores, activity levels, and basic stats'). It distinguishes itself from sibling tools like 'repos.get' by emphasizing a comprehensive overview rather than detailed retrieval of a single repository.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to get an overview of the codebase landscape or find repository IDs for filtering other tools'), which helps differentiate it from siblings like 'repos.get' (likely for single repository details) or 'search' (likely for specific queries). However, it does not explicitly state when not to use it or name specific alternatives, preventing a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

risk.score: Score PR Deployment Risk (A)
Read-only, Idempotent

Score a specific pull request for deployment risk using Koalr's 36-signal model. Returns a 0–100 risk score with a detailed factor breakdown covering change entropy, DDL migration detection, author file expertise, PR size, CODEOWNERS violations, blast radius, coverage delta, and more. Use this to answer "How risky is this PR?" or "Should we merge this before the release?". Read-only — scoring does not modify the PR.

Parameters (JSON Schema)

| Name         | Required | Description                                                                                                                           |
|--------------|----------|---------------------------------------------------------------------------------------------------------------------------------------|
| sha          | Yes      | Head commit SHA.                                                                                                                      |
| body         | No       | PR description body (used to detect risky phrases).                                                                                   |
| repo         | Yes      | Repository name. Example: "api-service".                                                                                              |
| files        | No       | List of changed file paths. Used for DDL detection, CODEOWNERS analysis, and entropy calculation. More accurate results when provided. |
| owner        | Yes      | GitHub repository owner (org or user). Example: "acme".                                                                               |
| title        | Yes      | PR title.                                                                                                                             |
| prNumber     | Yes      | Pull request number.                                                                                                                  |
| additions    | Yes      | Lines added.                                                                                                                          |
| deletions    | Yes      | Lines deleted.                                                                                                                        |
| hasReview    | No       | Whether the PR has at least one review (any type).                                                                                    |
| authorLogin  | No       | GitHub login of the PR author.                                                                                                        |
| hasApproval  | No       | Whether the PR has at least one approving review.                                                                                     |
| changedFiles | Yes      | Number of files changed.                                                                                                              |
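With 13 parameters, eight of them required, risk.score is the easiest tool in this set to call incorrectly. The sketch below validates an arguments payload against the required/optional split documented above; the helper and the sample PR values are illustrative, not real data.

```python
# Required/optional split taken from the risk.score parameter list.
REQUIRED = {"sha", "repo", "owner", "title", "prNumber",
            "additions", "deletions", "changedFiles"}
OPTIONAL = {"body", "files", "hasReview", "authorLogin", "hasApproval"}

def validate_risk_score_args(args: dict) -> dict:
    """Check a risk.score payload against the documented parameter names."""
    missing = REQUIRED - args.keys()
    if missing:
        raise ValueError(f"missing required parameters: {sorted(missing)}")
    unknown = args.keys() - REQUIRED - OPTIONAL
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return args

# Hypothetical PR, shown with the optional `files` list included since the
# schema notes it improves DDL/CODEOWNERS/entropy accuracy.
payload = validate_risk_score_args({
    "owner": "acme", "repo": "api-service", "prNumber": 1423,
    "sha": "0f3c9d2", "title": "Add retry logic to payment worker",
    "additions": 310, "deletions": 42, "changedFiles": 6,
    "files": ["workers/payments.py", "migrations/0042_add_index.sql"],
})
```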
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true, covering safety and idempotency. The description adds valuable context by stating 'scoring does not modify the PR' (reinforcing read-only nature) and listing specific risk factors analyzed (change entropy, DDL migration detection, etc.), which goes beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: first states purpose and output, second provides usage examples, third clarifies behavioral aspect. Every sentence adds value with zero redundant information, making it front-loaded and appropriately sized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (13 parameters, risk scoring logic) and rich annotations, the description provides strong purpose clarity and usage guidance. However, without an output schema, it could benefit from more detail about the return format beyond '0–100 risk score with detailed factor breakdown' to fully prepare the agent for response handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already documents all 13 parameters thoroughly. The description doesn't add any parameter-specific details beyond what's in the schema, so it meets the baseline expectation without providing extra semantic value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Score a specific pull request for deployment risk') using Koalr's 36-signal model, and distinguishes it from siblings by focusing on risk assessment rather than summary, trend, or list operations. It explicitly mentions the 0–100 risk score output with detailed factor breakdown.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance with concrete examples: 'Use this to answer "How risky is this PR?" or "Should we merge this before the release?"' This gives clear context for when to invoke this tool versus alternatives like pr.summary or pr.open.

teams.get: Get Team Details (grade A)
Read-only · Idempotent

Get details and metrics for a specific team including DORA performance, cycle time, and member count. Use this when asked about a specific team's engineering health. Combines DORA and flow metrics in a single response. Read-only.

Parameters (JSON Schema)

| Name | Required | Description |
| --- | --- | --- |
| teamId | Yes | The team ID (use teams.list to find team IDs). |
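The teamId prerequisite reads naturally as a two-step flow: discover an ID with teams.list, then fetch details. A hedged sketch under stated assumptions: `call_tool` is a stand-in for a real MCP client, and the canned responses (including field names like `doraLevel` and `cycleTimeHours`) are invented for illustration — only the tool names and the teamId argument come from this page.

```python
# Hypothetical two-step flow: teams.list to find an ID, then teams.get.
def call_tool(name, arguments=None):
    # Canned responses for illustration only; a real client would send a
    # JSON-RPC tools/call request to the server instead.
    if name == "teams.list":
        return [{"id": "team_42", "name": "Platform", "slug": "platform",
                 "memberCount": 8}]
    if name == "teams.get":
        # Combined DORA + flow metrics in one response, per the description.
        # These metric field names are assumptions, not documented output.
        return {"id": arguments["teamId"], "doraLevel": "elite",
                "cycleTimeHours": 26.5, "memberCount": 8}
    raise ValueError(f"unknown tool: {name}")

team = next(t for t in call_tool("teams.list") if t["slug"] == "platform")
details = call_tool("teams.get", {"teamId": team["id"]})
assert details["id"] == "team_42"
```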
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and specifying the combined metrics scope (DORA and flow metrics), which helps the agent understand what data to expect. However, it doesn't mention rate limits, authentication needs, or pagination behavior.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by usage guidance and key behavioral notes. All three sentences are necessary and efficient, with no redundant information. It effectively communicates essential information in minimal space.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (single required parameter, no output schema), the description provides good context: purpose, usage guidelines, and behavioral notes. It covers what the tool does and when to use it, though it could mention response format or error handling. With annotations covering safety and idempotency, the description is mostly complete.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the teamId parameter clearly documented. The description doesn't add any parameter-specific information beyond what's in the schema, but it does imply the tool returns metrics related to the teamId. With high schema coverage, the baseline score of 3 is appropriate.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get details and metrics') and resource ('for a specific team'), listing specific metrics included (DORA performance, cycle time, member count). It distinguishes from siblings like 'teams.list' (which lists teams) and 'dora.summary' (which focuses only on DORA metrics).

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('Use this when asked about a specific team's engineering health') and distinguishes it from alternatives by noting it 'Combines DORA and flow metrics in a single response' (unlike separate DORA or flow tools). The parameter description also references 'teams.list' as a prerequisite for finding team IDs.

teams.list: List Engineering Teams (grade A)
Read-only · Idempotent

List all engineering teams in the organization with their member counts and slugs. Use this to discover team IDs needed for filtering other metrics tools. Returns an array of team objects with id, name, slug, and memberCount. Read-only.

Parameters (JSON Schema)

| Name | Required | Description |
| --- | --- | --- |
| search | No | Filter teams by name substring (case-insensitive). Omit to list all teams. |
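The documented search semantics — a case-insensitive name-substring filter, with omission meaning "all teams" — can be sketched locally. This reimplementation and the sample teams are illustrative assumptions; the real filtering happens server-side.

```python
# Sketch of teams.list's documented filter: case-insensitive substring match
# on the team name, returning everything when no search term is given.
def filter_teams(teams, search=None):
    if search is None:
        return teams
    needle = search.lower()
    return [t for t in teams if needle in t["name"].lower()]

# Sample data shaped like the documented return (id, name, slug, memberCount).
teams = [
    {"id": "t1", "name": "Platform", "slug": "platform", "memberCount": 8},
    {"id": "t2", "name": "Payments", "slug": "payments", "memberCount": 5},
]
assert [t["id"] for t in filter_teams(teams, "PAY")] == ["t2"]
assert filter_teams(teams) == teams
```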
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already cover read-only, open-world, idempotent, and non-destructive behavior. The description adds useful context about the return format ('array of team objects with id, name, slug, and memberCount') and the tool's role in filtering other metrics, which isn't captured by annotations alone.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste: first states purpose and return attributes, second provides usage context, third clarifies behavioral trait. Each sentence adds distinct value, and the description is appropriately front-loaded with core functionality.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only listing tool with comprehensive annotations (covering safety and behavior) and no output schema, the description provides complete context: purpose, usage guidance, return format, and behavioral confirmation. No gaps exist given the tool's low complexity.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents the single optional 'search' parameter. The description doesn't add any parameter-specific details beyond what the schema provides, meeting the baseline for high schema coverage.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List'), resource ('engineering teams'), and scope ('in the organization') with specific attributes returned ('member counts and slugs'). It distinguishes from sibling tools like 'teams.get' by emphasizing discovery of team IDs for filtering other metrics tools.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use this tool ('to discover team IDs needed for filtering other metrics tools'), providing clear context for its purpose. It differentiates from potential alternatives by focusing on team listing rather than detailed retrieval or other metrics.

teams.members: List Team Members (grade A)
Read-only · Idempotent

List all members of a specific team with their GitHub logins and roles. Use this to understand team composition or find developer logins for the developers.get tool. Read-only.

Parameters (JSON Schema)

| Name | Required | Description |
| --- | --- | --- |
| teamId | Yes | The team ID (use teams.list to find team IDs). |
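The usage note above suggests a chain: list a team's members, then feed each GitHub login into developers.get. A hedged sketch — the `call_tool` stand-in, the sample members, the `login` argument name for developers.get, and the `prsMerged` field are all assumptions; only the tool names and the login/role fields come from this page.

```python
# Hypothetical chain: teams.members -> developers.get for each login.
def call_tool(name, arguments):
    # Canned responses for illustration; a real MCP client calls the server.
    if name == "teams.members":
        return [{"login": "alice", "role": "lead"},
                {"login": "bob", "role": "member"}]
    if name == "developers.get":
        # Argument name and returned metric are assumptions, not documented.
        return {"login": arguments["login"], "prsMerged": 12}
    raise ValueError(f"unknown tool: {name}")

members = call_tool("teams.members", {"teamId": "team_42"})
logins = [m["login"] for m in members]
stats = [call_tool("developers.get", {"login": l}) for l in logins]
assert logins == ["alice", "bob"]
```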
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, so the agent knows this is a safe, idempotent read operation. The description adds the 'Read-only' confirmation (redundant with annotations) and mentions the specific data returned (logins and roles), but doesn't provide additional behavioral context like pagination, rate limits, or authentication requirements.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with two sentences that each earn their place: the first explains what the tool does and what it returns, the second provides usage context and safety information. No wasted words, front-loaded with core functionality.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only tool with comprehensive annotations and a well-documented single parameter, the description is reasonably complete. It explains the purpose, usage context, and what data is returned. The main gap is the lack of output schema, but the description partially compensates by specifying the return format ('GitHub logins and roles').

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with the single parameter 'teamId' well-documented in the schema. The description doesn't add any parameter-specific information beyond what's already in the schema, so the baseline score of 3 is appropriate when the schema does the heavy lifting.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List') and resource ('all members of a specific team'), specifies what information is returned ('GitHub logins and roles'), and distinguishes this tool from its sibling 'teams.list' by focusing on members rather than teams themselves.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand team composition or find developer logins for the developers.get tool'), but doesn't explicitly state when NOT to use it or mention alternative tools for similar purposes beyond the indirect reference to developers.get.
