Glama
Ownership verified

Server Details

Connect engineering metrics, DORA performance, and deploy risk scoring to any AI assistant. Score PRs for deployment risk using a 36-signal model, query team health, incidents, coverage, and more.

Status: Healthy
Last Tested:
Transport: Streamable HTTP
URL:
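Since the transport is Streamable HTTP, a client talks to the server by POSTing JSON-RPC messages. A minimal sketch, assuming the standard MCP Streamable HTTP headers; the endpoint URL is not shown on this page, so the one below is a hypothetical placeholder:

```python
import json
import urllib.request

# The page omits the server URL; this placeholder is purely hypothetical.
ENDPOINT = "https://example.invalid/mcp"

def build_streamable_http_request(payload: dict) -> urllib.request.Request:
    """Build (without sending) a Streamable HTTP POST carrying one JSON-RPC message."""
    return urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )

req = build_streamable_http_request({"jsonrpc": "2.0", "id": 1, "method": "ping"})
```

The request is only constructed here, not sent; a real client would also handle the session header and SSE responses the transport allows.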
Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

(Connection flow: MCP client → Glama MCP Gateway → MCP server)

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions (Grade: A)

Average 4.2/5 across 20 of 20 tools scored.

Server Coherence (Grade: A)
Disambiguation: 4/5

Most tools have distinct purposes targeting specific metrics or entities (e.g., ai.summary vs. coverage.summary vs. pr.summary), but some overlap exists. For example, pr.open and pr.at_risk both identify problematic pull requests, though their descriptions clarify different focuses (open vs. at-risk). Overall, descriptions help differentiate tools, but minor ambiguity remains in a few cases.

Naming Consistency: 5/5

Tool names follow a highly consistent dot-separated pattern (e.g., ai.summary, developers.get, pr.open) with clear entity.action or entity.subcategory conventions. This predictability makes it easy for agents to understand the structure and locate tools, with no deviations or mixed naming styles observed across the set.
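The entity.action convention described above is regular enough that an agent or client can parse it mechanically, e.g. to group tools by entity. A trivial sketch:

```python
def split_tool_name(name: str) -> tuple[str, str]:
    """Split a dot-separated MCP tool name into (entity, action)."""
    entity, _, action = name.partition(".")
    return entity, action

# Names taken from this server's tool list.
pairs = [split_tool_name(n) for n in ("ai.summary", "developers.get", "pr.at_risk")]
```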

Tool Count: 4/5

With 20 tools, the count is slightly high but reasonable for the broad scope of engineering metrics and analytics. Each tool appears to serve a specific purpose without obvious redundancy, though some consolidation might be possible (e.g., pr.open and pr.at_risk). The number aligns well with covering multiple domains like AI adoption, DORA metrics, pull requests, and team health.

Completeness: 5/5

The tool set provides comprehensive coverage for engineering analytics, spanning AI adoption, test coverage, developer metrics, DORA metrics, incidents, organizational health, pull requests, repositories, risk assessment, search, and team management. There are no apparent gaps; agents can access high-level summaries, detailed entity metrics, trends, and search capabilities, enabling full workflow support without dead ends.

Available Tools (20 tools)
ai.summary: AI Coding Tool Adoption Summary (Grade: A)
Read-only, Idempotent

Get AI coding tool adoption metrics including GitHub Copilot acceptance rate, Cursor active users, AI-generated code percentage, and suggestions per developer. Use this to understand how the team is using AI coding assistants and measure their impact on productivity. Read-only.

Parameters (JSON Schema)
- teamId (optional): Team ID to filter AI adoption metrics to a specific team. Omit for org-wide summary.
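A call to this tool follows the standard MCP tools/call shape; because teamId is optional, an org-wide query simply omits it. A sketch (the teamId value below is a made-up example, not a real ID):

```python
def build_tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Assemble an MCP tools/call JSON-RPC message for one of this server's tools."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

# Org-wide summary: omit teamId entirely rather than passing null.
org_wide = build_tool_call("ai.summary", {})
# Team-scoped summary; "team-platform" is a hypothetical ID.
scoped = build_tool_call("ai.summary", {"teamId": "team-platform"}, request_id=2)
```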
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable context beyond annotations by specifying the metrics included (e.g., GitHub Copilot acceptance rate, AI-generated code percentage) and the tool's purpose for measuring productivity impact. While annotations cover safety (readOnlyHint=true, destructiveHint=false) and idempotency, the description enriches understanding of what data is retrieved without contradicting annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by usage context, and ends with a clear behavioral note ('Read-only'). Every sentence adds value without redundancy, making it efficient and well-structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 optional parameter), rich annotations (covering safety and idempotency), and no output schema, the description is mostly complete. It specifies the metrics and purpose, but could slightly improve by hinting at the return format (e.g., structured data vs. text) to fully compensate for the lack of output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already fully documents the single parameter (teamId). The description doesn't add any parameter-specific details beyond what's in the schema, such as format examples or default behavior when omitted. Baseline 3 is appropriate as the schema handles parameter documentation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get AI coding tool adoption metrics') and lists the exact metrics included (GitHub Copilot acceptance rate, Cursor active users, etc.). It distinguishes this tool from siblings by focusing specifically on AI coding tool metrics, unlike broader tools like 'org.health' or 'dora.summary'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand how the team is using AI coding assistants and measure their impact on productivity'), but it doesn't explicitly state when not to use it or name specific alternatives among the sibling tools. The guidance is helpful but lacks explicit exclusions or comparisons.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

ai.trend: AI Tool Adoption Trend (Grade: A)
Read-only, Idempotent

Get the trend of AI tool adoption over time showing weekly active users, acceptance rates, and code attribution percentages. Use this to track whether AI tool adoption is growing or declining across the team. Read-only.

Parameters (JSON Schema)
- days (optional): Number of days to look back for trend data (default: 90)
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only, open-world, idempotent, and non-destructive properties. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and specifying the time-based nature of trend data, which provides useful context beyond what annotations convey about safety and behavior.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: the first states what the tool does, the second provides usage context, and the third reinforces behavioral traits. Each sentence adds value without redundancy, making it front-loaded and appropriately sized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (single parameter, no output schema), the description is mostly complete. It covers purpose, metrics, usage context, and behavioral traits. However, it lacks details on output format or data interpretation, which could be helpful since there's no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'days' fully documented in the schema. The description does not add any parameter-specific information beyond what the schema provides, so it meets the baseline for high schema coverage without compensating with extra details.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get the trend') and resources ('AI tool adoption'), and distinguishes it from siblings by focusing on adoption metrics over time rather than summaries or other data types. It explicitly lists the metrics returned: weekly active users, acceptance rates, and code attribution percentages.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context ('to track whether AI tool adoption is growing or declining across the team'), but does not explicitly state when to use this tool versus alternatives like ai.summary or dora.trend. It provides a general purpose but lacks specific guidance on tool selection among siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

coverage.summary: Test Coverage Summary (Grade: A)
Read-only, Idempotent

Get test coverage summary per repository showing overall coverage percentage, lines covered, and coverage trend. Use this to understand testing health and identify repositories with low coverage that may warrant additional review attention. Read-only.

Parameters (JSON Schema)
- repositoryId (optional): Filter to a specific repository by ID. Omit to see all repositories.
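The tool publishes no output schema, so any client-side processing has to assume a response shape. A sketch that flags low-coverage repositories; the 'name' and 'coveragePercent' field names are assumptions, not documented fields:

```python
def low_coverage_repos(repos: list[dict], threshold: float = 70.0) -> list[str]:
    """Return names of repositories whose coverage falls below the threshold.

    The field names here are assumed; the tool defines no output schema."""
    return sorted(r["name"] for r in repos if r["coveragePercent"] < threshold)

# Hypothetical sample data in the assumed shape.
sample = [
    {"name": "api", "coveragePercent": 82.5},
    {"name": "worker", "coveragePercent": 54.1},
]
flagged = low_coverage_repos(sample)
```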
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already cover read-only, non-destructive, idempotent, and open-world behavior. The description adds useful context by specifying the scope ('per repository') and the types of data returned (coverage percentage, lines, trend), which helps the agent understand what to expect beyond the basic safety profile.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by usage context and a behavioral note. Every sentence adds value without redundancy, and it efficiently conveys necessary information in three concise sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 optional parameter), rich annotations, and no output schema, the description is mostly complete. It covers purpose, usage, and key return data, though it could optionally mention response format or pagination for completeness, but this is not critical here.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents the single optional parameter. The description does not add any parameter-specific information beyond what the schema provides, but it implies the tool can return data for all repositories if no filter is applied, which aligns with the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get test coverage summary'), resource ('per repository'), and key metrics ('overall coverage percentage, lines covered, and coverage trend'). It distinguishes this tool from siblings by focusing on test coverage rather than AI, DORA, PR, or other repository/team metrics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use it ('to understand testing health and identify repositories with low coverage that may warrant additional review attention'), but does not explicitly mention when not to use it or name alternative tools for similar purposes among the siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

developers.get: Get Developer Profile (Grade: A)
Read-only, Idempotent

Get metrics for an individual developer including their PR throughput, review activity, average cycle time, and code contributions. Use this when asked about a specific engineer's activity or productivity. Read-only.

Parameters (JSON Schema)
- login (required): GitHub username / login of the developer (e.g. "jsmith")
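Since 'login' is the tool's only parameter and is required, a client can fail fast on blank input before issuing the call. A minimal sketch:

```python
def developers_get_arguments(login: str) -> dict:
    """'login' is required by the schema; reject blank values client-side."""
    cleaned = login.strip()
    if not cleaned:
        raise ValueError('login is required, e.g. "jsmith"')
    return {"login": cleaned}
```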
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds 'Read-only' (redundant with annotations) and implies it returns metrics, but doesn't disclose behavioral traits like rate limits, authentication needs, or data freshness. With annotations, the bar is lower, and it adds minimal context beyond them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the purpose, followed by usage guidance and a safety note. It uses two concise sentences with zero waste—every part adds value (metrics list, usage context, read-only confirmation). Efficiently structured for quick comprehension.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 parameter, no output schema), rich annotations (covering safety and idempotency), and high schema coverage, the description is mostly complete. It specifies metrics and usage context, but lacks details on output format or error handling, which could be helpful despite annotations. However, for a simple read operation, it's adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the 'login' parameter fully documented in the schema. The description doesn't add any meaning beyond the schema (e.g., it doesn't clarify format constraints or examples not in the schema). Baseline is 3 when schema coverage is high, as the schema carries the parameter documentation burden.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('Get') and resource ('metrics for an individual developer'), specifying the exact metrics included (PR throughput, review activity, average cycle time, code contributions). It distinguishes from siblings like 'developers.top' (which likely aggregates multiple developers) by focusing on individual profiles.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'when asked about a specific engineer's activity or productivity.' This provides clear context for usage; although it doesn't explicitly name alternatives (e.g., 'developers.top' for top performers), the guidance is sufficient for an agent to apply it correctly in relevant scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

developers.top: Top Contributors (Grade: A)
Read-only, Idempotent

List the most active contributors ranked by commits and PRs merged over a time window. Use this to identify key contributors, bus-factor risks, or to recognize top performers. Read-only.

Parameters (JSON Schema)
- days (optional): Time window in days to measure activity (default: 30)
- limit (optional): Number of contributors to return (default: 10, max: 50)
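The schema documents a default of 10 and a maximum of 50 for 'limit'; a client can clamp values before calling rather than risk a server-side rejection. A sketch:

```python
def developers_top_arguments(days: int = 30, limit: int = 10) -> dict:
    """Clamp 'limit' into the documented 1..50 range before calling developers.top."""
    return {"days": days, "limit": max(1, min(limit, 50))}
```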
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only, open-world, idempotent, and non-destructive properties. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and specifying that it lists 'ranked' contributors, which provides behavioral context beyond annotations. It does not mention rate limits or authentication needs, but annotations cover key safety aspects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by usage context and a behavioral note. Every sentence earns its place by adding value without redundancy, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (ranking contributors), rich annotations (covering safety and idempotency), and no output schema, the description is mostly complete. It explains the purpose, usage, and behavioral traits, but could benefit from mentioning output format (e.g., list structure) or ranking methodology to fully compensate for the lack of output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for both parameters ('days' and 'limit'). The description adds no additional parameter semantics beyond what the schema provides, such as explaining how ranking is calculated or what 'activity' entails. Baseline 3 is appropriate given high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('List the most active contributors'), identifies the resource ('contributors'), and specifies the ranking criteria ('by commits and PRs merged over a time window'). It distinguishes this tool from siblings like 'developers.get' by focusing on ranking rather than retrieving individual developer data.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to identify key contributors, bus-factor risks, or to recognize top performers'), giving practical applications. However, it does not explicitly mention when NOT to use it or name alternative tools for similar purposes, such as 'ai.summary' or 'pr.summary'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dora.summary: DORA Metrics Summary (Grade: A)
Read-only, Idempotent

Get DORA metrics (deploy frequency, lead time for changes, change failure rate, MTTR) for the organization or a specific team. Use this to understand overall engineering delivery performance and reliability. Read-only — returns aggregated metrics for the selected time window.

Parameters (JSON Schema)
- from (optional): Start date as ISO string (e.g. 2024-01-01). Defaults to 30 days ago.
- to (optional): End date as ISO string (e.g. 2024-01-31). Defaults to today.
- teamId (optional): Team ID to filter metrics for a specific team
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, openWorldHint=true, idempotentHint=true, and destructiveHint=false, covering safety and idempotency. The description adds useful context by specifying that it returns aggregated metrics and is read-only, but does not disclose additional behavioral traits like rate limits, authentication needs, or data freshness. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by usage context and behavioral note. It consists of two efficient sentences with no wasted words, making it easy to scan and understand quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity, rich annotations, and 100% schema coverage, the description is mostly complete. It covers purpose, usage, and key behavioral aspects. However, without an output schema, it could benefit from more detail on the structure of returned metrics, though annotations help mitigate this gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear documentation for all three parameters (to, from, teamId). The description adds marginal value by mentioning the time window and team filtering, but does not provide additional semantics beyond what the schema already covers, such as format details or default behaviors.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get DORA metrics'), identifies the resources (deploy frequency, lead time for changes, change failure rate, MTTR), and distinguishes the scope (organization or specific team). It explicitly differentiates from siblings by focusing on DORA metrics, unlike tools like 'ai.summary' or 'pr.summary'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand overall engineering delivery performance and reliability') and implies usage for aggregated metrics over a time window. However, it does not explicitly state when not to use it or name specific alternatives among siblings, such as 'dora.trend' for trend analysis.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

dora.trend: DORA Metric Trend (Grade: A)
Read-only, Idempotent

Get DORA metric trend over time to see how deployment frequency, lead time, change failure rate, or MTTR has changed week over week. Useful for spotting regressions or improvements across releases. Read-only.

Parameters (JSON Schema)
- metric (required): Which DORA metric to trend: deploy_frequency, lead_time, change_failure_rate, or mttr
- days (optional): Number of days to look back (default: 90). Max recommended: 365.
- teamId (optional): Team ID to filter metrics for a specific team
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and providing context on the tool's purpose (trend analysis for spotting changes), though it lacks details on rate limits, authentication needs, or data freshness.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by utility context and a safety note. All three sentences earn their place: defining the tool, its use case, and behavioral trait. No wasted words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema), annotations cover safety and idempotency well, and the description clarifies the trend analysis purpose. However, without an output schema, the description could better explain the return format (e.g., time-series data) to enhance completeness for the agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with clear descriptions for all parameters (days, metric, teamId). The description mentions the four metric options but does not add meaning beyond what the schema provides (e.g., explaining metric definitions or teamId usage). Baseline 3 is appropriate as the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and the resource 'DORA metric trend over time', specifying the four specific metrics (deployment frequency, lead time, change failure rate, MTTR) and the temporal scope 'week over week'. It distinguishes from sibling tools like 'dora.summary' by focusing on trend analysis rather than summary statistics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: 'to see how metrics have changed week over week' and 'useful for spotting regressions or improvements across releases'. However, it does not explicitly state when not to use it or name specific alternatives among siblings (e.g., 'dora.summary' for aggregated metrics).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

incidents.list: Recent Production Incidents (Grade: A)
Read-only, Idempotent

List recent production incidents from PagerDuty or OpsGenie with their severity, MTTR (mean time to recovery), and affected services. Use this to understand reliability posture or investigate a recent outage in context with deployment activity. Read-only.

Parameters (JSON Schema)
- days (optional): Number of days to look back for incidents (default: 30)
- limit (optional): Maximum number of incidents to return (default: 20)
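With no output schema published, aggregating over the response requires assuming field names. A sketch that averages MTTR across returned incidents; the 'mttrMinutes' field name (and the sample records) are hypothetical:

```python
def mean_mttr_minutes(incidents: list[dict]) -> float:
    """Average MTTR across incident records; returns 0.0 for an empty list.

    'mttrMinutes' is an assumed field name, since no output schema is published."""
    if not incidents:
        return 0.0
    return sum(i["mttrMinutes"] for i in incidents) / len(incidents)

# Hypothetical sample records in the assumed shape.
sample_incidents = [
    {"id": "INC-1", "severity": "sev2", "mttrMinutes": 42},
    {"id": "INC-2", "severity": "sev1", "mttrMinutes": 118},
]
```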
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond what annotations provide. While annotations already declare read-only, open-world, idempotent, and non-destructive properties, the description explicitly states 'Read-only' and mentions the tool's focus on 'recent production incidents' with specific data fields. This helps the agent understand the scope and output format, though it doesn't mention rate limits or authentication requirements.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise and well-structured in just two sentences. The first sentence establishes the core functionality, the second provides usage context, and the final 'Read-only' statement reinforces behavioral transparency. Every element serves a clear purpose with zero wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (2 parameters, no output schema), the description provides good contextual completeness. It explains what the tool does, when to use it, and key behavioral aspects. However, without an output schema, the description could benefit from more detail about the structure of returned incident data beyond the mentioned fields (severity, MTTR, affected services).

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already fully documents both parameters (days and limit) with their types and default values. The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation without providing additional semantic context.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('List recent production incidents') and resources ('from PagerDuty or OpsGenie'), including key data fields like severity, MTTR, and affected services. It distinguishes itself from sibling tools by focusing on incident data rather than developer metrics, DORA metrics, or repository information.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear usage context ('to understand reliability posture or investigate a recent outage in context with deployment activity'), giving the agent concrete scenarios for when to use this tool. However, it doesn't explicitly mention when NOT to use it or name specific alternatives among the sibling tools for similar purposes.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

org.health: Organization Health Snapshot (A)
Read-only, Idempotent

Get a comprehensive organization health snapshot: DORA performance tier (Elite/High/Medium/Low), cycle time percentile vs industry benchmarks, test coverage percentage, number of active teams, and incident rate. Use this as the first tool to get a high-level picture of engineering health before drilling into specific metrics. Read-only.

Parameters (JSON Schema)

| Name   | Required | Description                                                                     |
|--------|----------|---------------------------------------------------------------------------------|
| teamId | No       | Team ID to scope health metrics to a specific team. Omit for org-wide snapshot. |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond what annotations provide. While annotations already indicate read-only, open-world, idempotent, and non-destructive characteristics, the description explicitly states 'Read-only' for emphasis and clarifies this is a 'snapshot' tool for getting a 'high-level picture.' However, it doesn't mention potential rate limits, authentication requirements, or data freshness considerations that would be useful for an agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly structured and concise. The first sentence clearly states the purpose and lists specific metrics. The second sentence provides explicit usage guidance. The final word 'Read-only' adds important behavioral emphasis. Every sentence earns its place with zero wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (providing multiple health metrics) and the absence of an output schema, the description does well by listing the specific metrics returned. However, it doesn't explain the format or structure of the return data (e.g., whether it's a single object with all metrics, what units are used, or how benchmarks are calculated). With rich annotations but no output schema, the description could be more complete about the response format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage for the single parameter (teamId), the schema already documents its purpose. The description doesn't add specific parameter semantics beyond what's in the schema, but it does provide important context about scoping by mentioning 'org-wide snapshot' when the parameter is omitted, which helps the agent understand the default behavior. Since there's only one parameter with good schema coverage, this is sufficient.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get a comprehensive organization health snapshot') and lists the exact metrics returned (DORA performance tier, cycle time percentile, test coverage percentage, number of active teams, incident rate). It distinguishes this tool from siblings by emphasizing it provides a 'high-level picture' before drilling into specific metrics, unlike more focused tools like dora.summary or coverage.summary.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('Use this as the first tool to get a high-level picture of engineering health before drilling into specific metrics') and provides clear context about its purpose as an initial assessment tool. It distinguishes from alternatives by positioning it as a starting point rather than a detailed analysis tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

org.wellbeing: Team Well-Being & Burnout Risk (A)
Read-only, Idempotent

Get team well-being scores across pillars: focus time (uninterrupted deep work hours), meeting load (percentage of time in meetings), context switching (task interruptions per day), and burnout risk indicators. Use this to understand developer experience and identify teams under stress before it affects delivery. Read-only.

Parameters (JSON Schema)

| Name   | Required | Description                                                   |
|--------|----------|---------------------------------------------------------------|
| teamId | No       | Team ID to filter well-being data. Omit for org-wide summary. |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds valuable behavioral context beyond what annotations provide: it specifies the four specific well-being pillars measured (focus time, meeting load, context switching, burnout risk) and the tool's purpose for proactive stress identification. While annotations already declare it as read-only, non-destructive, idempotent, and open-world, the description usefully elaborates on what data is actually retrieved.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in two sentences: the first explains what the tool does and what metrics it returns, the second explains its purpose and includes the read-only declaration. Every element serves a clear purpose with no wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only tool with comprehensive annotations and a simple parameter schema, the description provides good contextual completeness by explaining the specific well-being pillars and the tool's purpose. The main gap is the lack of output schema, but the description compensates somewhat by listing the types of metrics returned.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already fully documents the single optional parameter (teamId). The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline expectation without providing extra semantic value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get team well-being scores') and resources ('across pillars: focus time, meeting load, context switching, and burnout risk indicators'), distinguishing it from sibling tools like 'org.health' or 'risk.score' by focusing specifically on developer well-being metrics rather than general organizational health or risk assessments.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand developer experience and identify teams under stress before it affects delivery'), but doesn't explicitly state when not to use it or name specific alternatives among the sibling tools, such as 'org.health' for broader organizational metrics.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pr.at_risk: At-Risk Pull Requests (A)
Read-only, Idempotent

Get pull requests at risk of becoming long-running or blocked. These are PRs that have been open for more than 3 days, have no reviews, or are very large (>500 lines). Use this to prompt engineering leads to take action on blocked work before it impacts the team. Read-only.

Parameters (JSON Schema)

| Name         | Required | Description                                                                         |
|--------------|----------|-------------------------------------------------------------------------------------|
| repositoryId | No       | Filter to a specific repository by ID. Omit to return at-risk PRs across all repos. |
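The at-risk criteria quoted in the description can be restated as a simple predicate: open more than 3 days (72h), no reviews, or more than 500 lines changed, combined with OR. The sketch below mirrors those documented thresholds; the field names are assumptions, not the server's actual response schema.

```python
def is_at_risk(open_hours: float, review_count: int, lines_changed: int) -> bool:
    """Mirror pr.at_risk's documented thresholds (field names are illustrative)."""
    return open_hours > 72 or review_count == 0 or lines_changed > 500
```

Because the criteria are OR-ed, a small, freshly opened PR with zero reviews is still flagged.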
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds valuable context by specifying the risk criteria (3 days, no reviews, >500 lines) and the intended use case for prompting leads, which goes beyond the annotations. No contradiction exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by risk criteria, usage context, and a safety note ('Read-only'). Every sentence adds value without redundancy, making it efficient and well-structured.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (filtering at-risk PRs), rich annotations (covering safety and behavior), and no output schema, the description is largely complete. It explains the risk logic and use case, though it could briefly mention output format (e.g., list of PRs with details) for full completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% for the single parameter (repositoryId), with the schema fully documenting its purpose and optionality. The description does not add any parameter-specific details beyond what the schema provides, meeting the baseline for high coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific verb ('Get') and resource ('pull requests at risk'), with explicit criteria defining 'at-risk' (open >3 days, no reviews, >500 lines). It distinguishes from sibling tools like 'pr.open' or 'pr.summary' by focusing on problematic PRs needing intervention.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool: 'to prompt engineering leads to take action on blocked work before it impacts the team.' It also spells out the consequence of inaction, giving clear context for proactive management.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pr.open: List Open Pull Requests (A)
Read-only, Idempotent

List currently open pull requests with their age in hours, size (additions + deletions), and reviewer assignments. Use this to identify stale or large PRs that may be blocking the team. Optionally filter to high-risk PRs only (large, old, or no reviewers). Read-only.

Parameters (JSON Schema)

| Name         | Required | Description                                                                                 |
|--------------|----------|---------------------------------------------------------------------------------------------|
| highRiskOnly | No       | If true, only return PRs that are large (>500 lines), old (>72h open), or have no reviewers |
| repositoryId | No       | Filter to a specific repository by ID                                                       |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds valuable context beyond annotations by specifying what data is returned (age in hours, size, reviewer assignments) and the purpose (identifying stale/large/blocking PRs). It doesn't contradict annotations and provides useful behavioral context about the tool's output and filtering logic.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with three sentences that each serve a distinct purpose: stating what the tool does, explaining when to use it, and describing the optional filter. There's no wasted language, and the most important information (listing open PRs) comes first.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a read-only listing tool with comprehensive annotations and full parameter documentation, the description provides excellent context about what data is returned and why to use it. The only minor gap is the lack of output schema, but the description compensates by specifying the return fields (age, size, reviewer assignments). It could be slightly more complete by mentioning pagination or result limits.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents both parameters. The description adds minimal semantic context by mentioning 'Optionally filter to high-risk PRs only' which aligns with the highRiskOnly parameter, but doesn't provide additional meaning beyond what's in the schema descriptions. This meets the baseline expectation for high schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the verb ('List') and resource ('currently open pull requests') with specific attributes (age, size, reviewer assignments). It clearly distinguishes from siblings like 'pr.at_risk' by focusing on open PRs rather than at-risk metrics, and from 'pr.summary' by providing detailed listing rather than aggregated summary.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage context: 'Use this to identify stale or large PRs that may be blocking the team.' It also specifies when to use the optional filter: 'Optionally filter to high-risk PRs only (large, old, or no reviewers).' This gives clear guidance on when this tool is appropriate versus alternatives.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

pr.summary: Pull Request Metrics Summary (A)
Read-only, Idempotent

Get pull request metrics including cycle time (time from first commit to merge), throughput (PRs merged per week), review health (time to first review, reviewer distribution), and PR size trends. Use this to assess code review efficiency and identify bottlenecks in your delivery pipeline. Read-only.

Parameters (JSON Schema)

| Name   | Required | Description                          | Default |
|--------|----------|--------------------------------------|---------|
| days   | No       | Number of days to look back          | 30      |
| teamId | No       | Team ID to filter to a specific team |         |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds useful context by specifying the metrics included (cycle time, throughput, etc.) and stating 'Read-only', but the annotations already cover readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, so the description does not disclose significant additional behavioral traits beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, with the first sentence detailing the tool's purpose and metrics, the second providing usage context, and the third stating behavioral traits, all without unnecessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity, rich annotations, and lack of an output schema, the description is mostly complete but could benefit from more detail on output format or data structure, as it only lists metrics without specifying how they are returned.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not add meaning beyond the input schema, which has 100% coverage with clear descriptions for both parameters. With high schema coverage, the baseline score is 3, as the schema adequately documents the parameters without needing extra explanation in the description.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('Get pull request metrics') and resources ('cycle time, throughput, review health, PR size trends'), and distinguishes it from siblings by focusing on PR metrics rather than AI, DORA, or other domain summaries.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to assess code review efficiency and identify bottlenecks in your delivery pipeline'), but does not explicitly state when not to use it or name specific alternatives among the sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repos.get: Get Repository Details (A)
Read-only, Idempotent

Get detailed metrics for a specific repository including deployment frequency, PR cycle time, contributor count, and code health indicators. Use this when asked about a specific codebase. Read-only.

Parameters (JSON Schema)

| Name | Required | Description                                                                         |
|------|----------|-------------------------------------------------------------------------------------|
| name | Yes      | Repository name (e.g. "my-service") or full name with owner (e.g. "org/my-service") |
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, covering safety and idempotency. The description adds 'Read-only' which reinforces the annotations, and specifies the scope ('detailed metrics for a specific repository') and types of metrics returned, providing useful behavioral context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences that are front-loaded with purpose and usage, with zero wasted words. Every element (what it does, when to use it, behavioral note) serves a clear function without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (single parameter, read-only operation) and rich annotations covering safety and behavior, the description provides sufficient context about what metrics are returned and when to use it. The lack of an output schema is mitigated by the description listing specific metric types, though it doesn't fully detail return format.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, with the single parameter 'name' fully documented in the schema. The description doesn't add any parameter-specific information beyond what's in the schema, so it meets the baseline of 3 where the schema handles parameter documentation adequately.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Get' and resource 'detailed metrics for a specific repository', listing specific metrics like deployment frequency, PR cycle time, contributor count, and code health indicators. It distinguishes from sibling 'repos.list' by focusing on a single repository's details rather than listing repositories.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage context: 'Use this when asked about a specific codebase.' This gives clear guidance on when to invoke the tool. However, it doesn't specify when NOT to use it or mention alternatives among the many sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

repos.list: List Connected Repositories (A)
Read-only, Idempotent

List all repositories connected to Koalr with their health scores, activity levels, and basic stats. Use this to get an overview of the codebase landscape or find repository IDs for filtering other tools. Read-only.

Parameters (JSON Schema)

| Name  | Required | Description                                         | Default |
|-------|----------|-----------------------------------------------------|---------|
| limit | No       | Maximum number of repositories to return (max: 200) | 50      |
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds some behavioral context beyond annotations: it mentions the tool is 'read-only' (though annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true) and implies it returns a list with specific data fields (health scores, activity levels, basic stats). However, it lacks details on pagination, rate limits, or authentication needs, which would be valuable given the openWorldHint=true annotation. No contradiction with annotations exists.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose in the first sentence, followed by usage guidance and a behavioral note. It is concise with three sentences, each adding value without redundancy, making it easy for an AI agent to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's low complexity (1 optional parameter, no output schema), the description is mostly complete: it covers purpose, usage, and basic behavioral traits. However, it could improve by addressing pagination or response format details, especially since openWorldHint=true suggests potential large datasets. The annotations provide safety and idempotency context, but the description could better complement them.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description does not mention the 'limit' parameter at all, but the input schema has 100% description coverage, providing default and max values. With high schema coverage, the baseline score is 3, as the description adds no additional parameter semantics beyond what the schema already documents.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('List all repositories'), identifies the resource ('connected to Koalr'), and specifies the scope ('with their health scores, activity levels, and basic stats'). It distinguishes itself from sibling tools like 'repos.get' by emphasizing a comprehensive overview rather than detailed retrieval of a single repository.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to get an overview of the codebase landscape or find repository IDs for filtering other tools'), which helps differentiate it from siblings like 'repos.get' (likely for single repository details) or 'search' (likely for specific queries). However, it does not explicitly state when not to use it or name specific alternatives, preventing a perfect score.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

risk.score: Score PR Deployment Risk (A)
Read-only, Idempotent

Score a specific pull request for deployment risk using Koalr's 36-signal model. Returns a 0–100 risk score with a detailed factor breakdown covering change entropy, DDL migration detection, author file expertise, PR size, CODEOWNERS violations, blast radius, coverage delta, and more. Use this to answer "How risky is this PR?" or "Should we merge this before the release?". Read-only — scoring does not modify the PR.

Parameters (JSON Schema)

| Name         | Required | Description                                                                                                                           |
|--------------|----------|---------------------------------------------------------------------------------------------------------------------------------------|
| sha          | Yes      | Head commit SHA.                                                                                                                      |
| body         | No       | PR description body (used to detect risky phrases).                                                                                   |
| repo         | Yes      | Repository name. Example: "api-service".                                                                                              |
| files        | No       | List of changed file paths. Used for DDL detection, CODEOWNERS analysis, and entropy calculation. More accurate results when provided. |
| owner        | Yes      | GitHub repository owner (org or user). Example: "acme".                                                                               |
| title        | Yes      | PR title.                                                                                                                             |
| prNumber     | Yes      | Pull request number.                                                                                                                  |
| additions    | Yes      | Lines added.                                                                                                                          |
| deletions    | Yes      | Lines deleted.                                                                                                                        |
| hasReview    | No       | Whether the PR has at least one review (any type).                                                                                    |
| authorLogin  | No       | GitHub login of the PR author.                                                                                                        |
| hasApproval  | No       | Whether the PR has at least one approving review.                                                                                     |
| changedFiles | Yes      | Number of files changed.                                                                                                              |
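With 13 parameters, eight of them required, risk.score is the easiest tool in this set to call incorrectly. The sketch below validates an arguments payload against the required/optional split documented above; the helper and the sample PR values are illustrative, not real data.

```python
# Required/optional split taken from the risk.score parameter list.
REQUIRED = {"sha", "repo", "owner", "title", "prNumber",
            "additions", "deletions", "changedFiles"}
OPTIONAL = {"body", "files", "hasReview", "authorLogin", "hasApproval"}

def validate_risk_score_args(args: dict) -> dict:
    """Check a risk.score payload against the documented parameter names."""
    missing = REQUIRED - args.keys()
    if missing:
        raise ValueError(f"missing required parameters: {sorted(missing)}")
    unknown = args.keys() - REQUIRED - OPTIONAL
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    return args

# Hypothetical PR, shown with the optional `files` list included since the
# schema notes it improves DDL/CODEOWNERS/entropy accuracy.
payload = validate_risk_score_args({
    "owner": "acme", "repo": "api-service", "prNumber": 1423,
    "sha": "0f3c9d2", "title": "Add retry logic to payment worker",
    "additions": 310, "deletions": 42, "changedFiles": 6,
    "files": ["workers/payments.py", "migrations/0042_add_index.sql"],
})
```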
Behavior 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, and idempotentHint=true, covering safety and idempotency. The description adds valuable context by stating 'scoring does not modify the PR' (reinforcing read-only nature) and listing specific risk factors analyzed (change entropy, DDL migration detection, etc.), which goes beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is efficiently structured in three sentences: first states purpose and output, second provides usage examples, third clarifies behavioral aspect. Every sentence adds value with zero redundant information, making it front-loaded and appropriately sized.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (13 parameters, risk scoring logic) and rich annotations, the description provides strong purpose clarity and usage guidance. However, without an output schema, it could benefit from more detail about the return format beyond '0–100 risk score with detailed factor breakdown' to fully prepare the agent for response handling.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the input schema already documents all 13 parameters thoroughly. The description doesn't add any parameter-specific details beyond what's in the schema, so it meets the baseline expectation without providing extra semantic value.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Score a specific pull request for deployment risk') using Koalr's 36-signal model, and distinguishes it from siblings by focusing on risk assessment rather than summary, trend, or list operations. It explicitly mentions the 0–100 risk score output with detailed factor breakdown.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidance with concrete examples: 'Use this to answer "How risky is this PR?" or "Should we merge this before the release?"' This gives clear context for when to invoke this tool versus alternatives like pr.summary or pr.open.

teams.get: Get Team Details (grade A)
Read-only · Idempotent

Get details and metrics for a specific team including DORA performance, cycle time, and member count. Use this when asked about a specific team's engineering health. Combines DORA and flow metrics in a single response. Read-only.

Parameters (JSON Schema)

| Name | Required | Description |
| --- | --- | --- |
| teamId | Yes | The team ID (use teams.list to find team IDs). |
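The teamId prerequisite reads naturally as a two-step flow: discover an ID with teams.list, then fetch details. A hedged sketch under stated assumptions: `call_tool` is a stand-in for a real MCP client, and the canned responses (including field names like `doraLevel` and `cycleTimeHours`) are invented for illustration — only the tool names and the teamId argument come from this page.

```python
# Hypothetical two-step flow: teams.list to find an ID, then teams.get.
def call_tool(name, arguments=None):
    # Canned responses for illustration only; a real client would send a
    # JSON-RPC tools/call request to the server instead.
    if name == "teams.list":
        return [{"id": "team_42", "name": "Platform", "slug": "platform",
                 "memberCount": 8}]
    if name == "teams.get":
        # Combined DORA + flow metrics in one response, per the description.
        # These metric field names are assumptions, not documented output.
        return {"id": arguments["teamId"], "doraLevel": "elite",
                "cycleTimeHours": 26.5, "memberCount": 8}
    raise ValueError(f"unknown tool: {name}")

team = next(t for t in call_tool("teams.list") if t["slug"] == "platform")
details = call_tool("teams.get", {"teamId": team["id"]})
assert details["id"] == "team_42"
```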
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already provide readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true. The description adds value by explicitly stating 'Read-only' (reinforcing annotations) and specifying the combined metrics scope (DORA and flow metrics), which helps the agent understand what data to expect. However, it doesn't mention rate limits, authentication needs, or pagination behavior.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by usage guidance and key behavioral notes. All three sentences are necessary and efficient, with no redundant information. It effectively communicates essential information in minimal space.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (single required parameter, no output schema), the description provides good context: purpose, usage guidelines, and behavioral notes. It covers what the tool does and when to use it, though it could mention response format or error handling. With annotations covering safety and idempotency, the description is mostly complete.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% description coverage, with the teamId parameter clearly documented. The description doesn't add any parameter-specific information beyond what's in the schema, but it does imply the tool returns metrics related to the teamId. With high schema coverage, the baseline score of 3 is appropriate.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Get details and metrics') and resource ('for a specific team'), listing specific metrics included (DORA performance, cycle time, member count). It distinguishes from siblings like 'teams.list' (which lists teams) and 'dora.summary' (which focuses only on DORA metrics).

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool ('Use this when asked about a specific team's engineering health') and distinguishes it from alternatives by noting it 'Combines DORA and flow metrics in a single response' (unlike separate DORA or flow tools). The parameter description also references 'teams.list' as a prerequisite for finding team IDs.

teams.list: List Engineering Teams (grade A)
Read-only · Idempotent

List all engineering teams in the organization with their member counts and slugs. Use this to discover team IDs needed for filtering other metrics tools. Returns an array of team objects with id, name, slug, and memberCount. Read-only.

Parameters (JSON Schema)

| Name | Required | Description |
| --- | --- | --- |
| search | No | Filter teams by name substring (case-insensitive). Omit to list all teams. |
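The documented search semantics — a case-insensitive name-substring filter, with omission meaning "all teams" — can be sketched locally. This reimplementation and the sample teams are illustrative assumptions; the real filtering happens server-side.

```python
# Sketch of teams.list's documented filter: case-insensitive substring match
# on the team name, returning everything when no search term is given.
def filter_teams(teams, search=None):
    if search is None:
        return teams
    needle = search.lower()
    return [t for t in teams if needle in t["name"].lower()]

# Sample data shaped like the documented return (id, name, slug, memberCount).
teams = [
    {"id": "t1", "name": "Platform", "slug": "platform", "memberCount": 8},
    {"id": "t2", "name": "Payments", "slug": "payments", "memberCount": 5},
]
assert [t["id"] for t in filter_teams(teams, "PAY")] == ["t2"]
assert filter_teams(teams) == teams
```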
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already cover read-only, open-world, idempotent, and non-destructive behavior. The description adds useful context about the return format ('array of team objects with id, name, slug, and memberCount') and the tool's role in filtering other metrics, which isn't captured by annotations alone.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste: first states purpose and return attributes, second provides usage context, third clarifies behavioral trait. Each sentence adds distinct value, and the description is appropriately front-loaded with core functionality.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only listing tool with comprehensive annotations (covering safety and behavior) and no output schema, the description provides complete context: purpose, usage guidance, return format, and behavioral confirmation. No gaps exist given the tool's low complexity.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents the single optional 'search' parameter. The description doesn't add any parameter-specific details beyond what the schema provides, meeting the baseline for high schema coverage.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List'), resource ('engineering teams'), and scope ('in the organization') with specific attributes returned ('member counts and slugs'). It distinguishes from sibling tools like 'teams.get' by emphasizing discovery of team IDs for filtering other metrics tools.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use this tool ('to discover team IDs needed for filtering other metrics tools'), providing clear context for its purpose. It differentiates from potential alternatives by focusing on team listing rather than detailed retrieval or other metrics.

teams.members: List Team Members (grade A)
Read-only · Idempotent

List all members of a specific team with their GitHub logins and roles. Use this to understand team composition or find developer logins for the developers.get tool. Read-only.

Parameters (JSON Schema)

| Name | Required | Description |
| --- | --- | --- |
| teamId | Yes | The team ID (use teams.list to find team IDs). |
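The usage note above suggests a chain: list a team's members, then feed each GitHub login into developers.get. A hedged sketch — the `call_tool` stand-in, the sample members, the `login` argument name for developers.get, and the `prsMerged` field are all assumptions; only the tool names and the login/role fields come from this page.

```python
# Hypothetical chain: teams.members -> developers.get for each login.
def call_tool(name, arguments):
    # Canned responses for illustration; a real MCP client calls the server.
    if name == "teams.members":
        return [{"login": "alice", "role": "lead"},
                {"login": "bob", "role": "member"}]
    if name == "developers.get":
        # Argument name and returned metric are assumptions, not documented.
        return {"login": arguments["login"], "prsMerged": 12}
    raise ValueError(f"unknown tool: {name}")

members = call_tool("teams.members", {"teamId": "team_42"})
logins = [m["login"] for m in members]
stats = [call_tool("developers.get", {"login": l}) for l in logins]
assert logins == ["alice", "bob"]
```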
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, destructiveHint=false, openWorldHint=true, and idempotentHint=true, so the agent knows this is a safe, idempotent read operation. The description adds the 'Read-only' confirmation (redundant with annotations) and mentions the specific data returned (logins and roles), but doesn't provide additional behavioral context like pagination, rate limits, or authentication requirements.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise with two sentences that each earn their place: the first explains what the tool does and what it returns, the second provides usage context and safety information. No wasted words, front-loaded with core functionality.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple read-only tool with comprehensive annotations and a well-documented single parameter, the description is reasonably complete. It explains the purpose, usage context, and what data is returned. The main gap is the lack of output schema, but the description partially compensates by specifying the return format ('GitHub logins and roles').

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100% with the single parameter 'teamId' well-documented in the schema. The description doesn't add any parameter-specific information beyond what's already in the schema, so the baseline score of 3 is appropriate when the schema does the heavy lifting.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb ('List') and resource ('all members of a specific team'), specifies what information is returned ('GitHub logins and roles'), and distinguishes this tool from its sibling 'teams.list' by focusing on members rather than teams themselves.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool ('to understand team composition or find developer logins for the developers.get tool'), but doesn't explicitly state when NOT to use it or mention alternative tools for similar purposes beyond the indirect reference to developers.get.
