
Server Details

Multi-model AI debates: GPT-4o, Claude, Gemini & 200+ models discuss, then synthesize insight.

Status: Healthy
Transport: Streamable HTTP
Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client → Glama → MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: B

Average 3.8/5 across 13 of 13 tools scored. Lowest: 2.9/5.

Server Coherence: A
Disambiguation: 4/5

Most tools have distinct purposes, such as consult-council for AI discussions and debug-issue for bug diagnosis, but some overlap exists between design-architecture and plan-implementation, which both involve planning and could be confused for similar tasks. Descriptions help clarify their focuses, but the boundaries aren't entirely clear-cut.

Naming Consistency: 4/5

Tool names follow a consistent verb-noun pattern with hyphens (e.g., consult-council, debug-issue, get-logs), except for assess-tradeoffs which uses a verb-verb format, slightly deviating from the pattern. Overall, the naming is predictable and readable, with only minor inconsistencies.

Tool Count: 5/5

With 13 tools, the count is well-scoped for the server's purpose of providing AI-driven council discussions and session management. Each tool serves a specific role, such as consulting, debugging, or logging, without feeling excessive or insufficient for the domain.

Completeness: 5/5

The tool surface offers comprehensive coverage for AI council interactions and session management, including creation (e.g., consult-council), retrieval (e.g., get-session), listing (e.g., list-sessions), and configuration (e.g., set-thread-visibility). No obvious gaps exist, supporting full lifecycle operations in this domain.

Available Tools

13 tools
assess-tradeoffs (Assess Tradeoffs): Grade A

Tradeoff assessment council. Pragmatist, Skeptic, and Futurist evaluate options from different angles — short-term vs long-term, risk vs reward, simplicity vs flexibility. Output as pros-cons.

Parameters (JSON Schema)
context (optional): Background context — codebase, team, timeline, constraints
options (optional): Specific options to compare
decision (required): The decision or question to evaluate
priorities (optional): What matters most (e.g., ["performance", "dx", "cost"])
webhook_url (optional): Webhook URL to POST results to after completion
thinking_level (optional, default: medium): Analysis depth
session_context (optional): Session IDs to use as context (max 3). Prior moderator summaries will be prepended.
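Assuming a standard MCP JSON-RPC transport, a call to this tool could be shaped as follows. The argument names mirror the schema above; the values and the request id are purely illustrative:

```python
import json

# Hypothetical MCP tools/call request for assess-tradeoffs.
# "decision" is the only required argument; the rest are optional.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "assess-tradeoffs",
        "arguments": {
            "decision": "Adopt a monorepo or keep separate repositories?",
            "options": ["monorepo", "polyrepo"],
            "priorities": ["dx", "cost"],   # what matters most
            "thinking_level": "medium",     # schema default
        },
    },
}

payload = json.dumps(request)
```

Since only decision is required, a minimal call can omit everything else.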
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses unique behavioral methodology (three specific personas evaluating from different angles) and output format ('pros-cons') beyond annotations. Annotations cover safety profile (destructive: false, readOnly: false) and external access (openWorld: true), so description appropriately focuses on functional behavior rather than repeating safety flags.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely efficient two-sentence structure. First sentence establishes the council mechanism and personas; second sentence defines evaluation scope and output format. Zero redundancy—every phrase adds distinct information (personas, angles, output type).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequately covers core functionality and compensates for missing output schema by specifying 'pros-cons' return format. Could be strengthened by acknowledging asynchronous webhook behavior or session persistence implications given the presence of webhook_url and session_context parameters.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, establishing solid baseline. Description aligns conceptually with 'options' and 'decision' parameters but adds no syntax guidance, examples, or explanations for complex params like 'session_context' or 'webhook_url' beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb ('assess') and resource ('tradeoffs') with unique mechanism (Pragmatist/Skeptic/Futurist personas). Clearly distinguishes from generic 'consult-council' sibling by specifying evaluation dimensions (short-term/long-term, risk/reward) and output format (pros-cons).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage scenarios through listed evaluation dimensions (risk vs reward, simplicity vs flexibility), suggesting use when comparing options across these axes. However, lacks explicit when-not guidance or naming of sibling alternatives like 'consult-council' or 'design-architecture' for different decision types.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check-usage (Check Usage): Grade A
Read-only, Idempotent

Check your remaining credits, usage limits, and plan info

Parameters (JSON Schema)
verbose (optional): Include detailed per-window rate limit breakdown

Output Schema

Fields (JSON Schema)
plan (required): Current plan tier
status (required): Overall usage status
credits (required): Remaining credit balance
rateLimits (required): Rate limit status across all windows
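Because all four output fields are marked required, a client can rely on their presence. A sketch of a call and a response check, with an entirely made-up response body:

```python
import json

# Hypothetical check-usage call and a response matching the output schema.
call = {
    "method": "tools/call",
    "params": {"name": "check-usage", "arguments": {"verbose": True}},
}

# Illustrative response; field names follow the output schema above.
response = json.loads('{"plan": "free", "status": "ok", "credits": 120, "rateLimits": {}}')

# All four fields are required, so their absence would signal a broken response.
required = {"plan", "status", "credits", "rateLimits"}
missing = required - response.keys()
```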
Behavior: 3/5

Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, covering the safety profile. The description adds specificity about what data is retrieved (credits, usage limits, plan info), providing content context beyond annotations. However, it does not elaborate on caching, rate limits, or data freshness.

Conciseness: 5/5

The description is a single, efficient sentence with no wasted words. It leads with the action verb and immediately specifies the returned data categories, respecting the reader's attention.

Completeness: 5/5

Given the tool's simplicity (one optional boolean parameter), complete annotations, and presence of an output schema, the description provides sufficient context. It appropriately focuses on the domain purpose rather than duplicating schema or annotation details.

Parameters: 3/5

With 100% schema description coverage for the single 'verbose' parameter, the schema fully documents the input. The description does not mention the parameter, but since the schema is complete, it meets the baseline expectation without requiring additional compensation from the description.

Purpose: 5/5

The description uses a specific verb 'Check' and clearly identifies the resources accessed: 'remaining credits, usage limits, and plan info'. This distinguishes the tool from coding-focused siblings like debug-issue or review-code by specifying the billing/accounting domain.

Usage Guidelines: 3/5

While the description implies usage context through the domain-specific resources mentioned (credits, limits), it lacks explicit guidance on when to use this versus alternatives, or any prerequisites. The usage is clear from context but not explicitly stated as when/when-not.

consult-council (Consult Council): Grade A

Consult the AI coding council — multiple models discuss your engineering question sequentially (each sees prior responses), then a moderator synthesizes. Auto-mode by default — AI picks optimal models, roles, and conversation mode from your prompt. Provide explicit models to override (manual mode). Fully configurable: mode, format, roles, models, thinking level.

Parameters (JSON Schema)
mode (optional, default: debating): Conversation mode: analyzing (research), brainstorming (ideas), debating (tradeoffs), solving (action plans)
roles (optional): Inline role names for participants (e.g., ["Security Architect", "Backend Engineer"])
format (optional, default: discussion): Moderator output format: discussion (narrative), adr (architecture decision), comparison (table), pros-cons
models (optional): Override specific model IDs. Min 3 models. Use list-models to see available options
prompt (required): The question, topic, or problem to debate
context (optional): Additional background context for the debate (code, docs, requirements)
knowledge (optional): Reference knowledge to inject as context
auto_route (optional): Auto-select optimal models based on prompt analysis and historical performance
webhook_url (optional): Webhook URL to POST results to after completion
thinking_level (optional, default: medium): Controls model quality and cost: low (fast/cheap), medium (balanced), high (maximum reasoning)
session_context (optional): Session IDs to use as context (max 3). Prior moderator summaries will be prepended.
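The schema implies two client-side constraints worth checking before a call: at least 3 models in manual mode and at most 3 session IDs. A hedged sketch of that validation (the model IDs are invented for illustration):

```python
# Client-side validation sketch for consult-council arguments.
# The "min 3 models" and "max 3 session IDs" limits come from the
# parameter descriptions above; this is not an official client.
def validate_arguments(args: dict) -> list[str]:
    errors = []
    if "prompt" not in args:
        errors.append("prompt is required")
    models = args.get("models")
    if models is not None and len(models) < 3:
        errors.append("models requires at least 3 entries (manual mode)")
    if len(args.get("session_context", [])) > 3:
        errors.append("session_context accepts at most 3 session IDs")
    return errors

ok = validate_arguments({
    "prompt": "Should we move from REST to gRPC for internal services?",
    "models": ["model-a", "model-b", "model-c"],  # hypothetical IDs
})
bad = validate_arguments({"prompt": "x", "models": ["model-a"]})
```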
Behavior: 3/5

While the description adds valuable procedural context not in annotations (sequential discussion flow, visibility of prior responses, moderator synthesis), it fails to disclose what the write operation (readOnlyHint: false) actually creates or persists. Given siblings like get-session and get-logs, the description should explicitly state that this initiates a tracked session/thread.

Conciseness: 4/5

The description is efficiently structured in three sentences covering mechanism, defaults, and configurability. However, the claim 'Auto-mode by default' potentially conflicts with the schema's auto_route default of false, creating slight ambiguity that detracts from otherwise tight prose.

Completeness: 3/5

For a complex 11-parameter tool with session-related siblings, the description lacks critical lifecycle context: it doesn't mention that the operation creates a persistent session (implied by get-session/get-logs siblings), nor does it describe the output structure despite the absence of an output schema. The webhook_url parameter hints at async behavior not explained in the description.

Parameters: 3/5

With 100% schema description coverage, the baseline is 3. The description lists configurable parameters ('mode, format, roles, models, thinking level') but adds minimal semantic meaning beyond what the schema already provides. It mentions 'Auto-mode' which maps to auto_route, though the schema indicates auto_route defaults to false, creating potential confusion.

Purpose: 5/5

The description clearly defines the tool as a multi-model deliberation system ('multiple models discuss...sequentially') with synthesis ('moderator synthesizes'), distinguishing it from single-model siblings like review-code or debug-issue. The verb-resource combination ('consult the AI coding council') is specific and actionable.

Usage Guidelines: 3/5

The description provides implicit guidance on auto vs manual configuration ('Auto-mode by default...Provide explicit models to override'), but fails to explicitly state when to use this tool versus similar siblings like assess-tradeoffs or design-architecture. No explicit 'when-not-to-use' or alternative recommendations are provided.

debug-issue (Debug Issue): Grade C

Debugging council. Root Cause Analyst, Systems Engineer, and Edge Case Investigator collaboratively diagnose bugs, analyze errors, and propose fixes.

Parameters (JSON Schema)
code (optional): The relevant code where the bug occurs
error (optional): Error message, stack trace, or unexpected output
problem (required): Describe the bug, failure, or unexpected behavior
knowledge (optional): Reference knowledge to inject as context
webhook_url (optional): Webhook URL to POST results to after completion
thinking_level (optional, default: medium): Analysis depth
session_context (optional): Session IDs to use as context (max 3). Prior moderator summaries will be prepended.
expected_behavior (optional): What should happen vs what actually happens
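Only problem is required, so the smallest useful call passes a failure description and, ideally, the error text. A hedged sketch of such a request (error message and id are invented):

```python
import json

# Minimal debug-issue invocation: only "problem" is required per the schema;
# "error" and "expected_behavior" sharpen the diagnosis when available.
arguments = {
    "problem": "Login intermittently returns 500 after deploy",
    "error": "TypeError: Cannot read properties of undefined (reading 'id')",
    "expected_behavior": "Login should return 200 with a session token",
}

request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "debug-issue", "arguments": arguments},
}
payload = json.dumps(request)
```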
Behavior: 2/5

While annotations declare openWorldHint=true and readOnlyHint=false, the description does not clarify critical behavioral questions: whether 'propose fixes' means suggesting code or applying changes, nor does it explain the webhook_url parameter's async POST behavior or how session_context maintains conversation continuity.

Conciseness: 4/5

The description is compact at two sentences and front-loads the core function. However, the 'Debugging council' fragment adds slight syntactic awkwardness without adding meaning, and the specific role names (Root Cause Analyst, etc.) consume space that could clarify behavior.

Completeness: 2/5

Given the tool's complexity—including webhook callbacks, session continuity references, knowledge injection, and non-idempotent behavior—the description is incomplete. It omits what the 'council' actually returns, how the webhook delivers results, and the implications of the 'openWorld' annotation for external data fetching.

Parameters: 3/5

With 100% schema description coverage, the structured data already fully documents all 8 parameters including the enum values for thinking_level and the webhook format. The description adds no additional parameter semantics, meeting the baseline expectation for well-documented schemas.

Purpose: 4/5

The description clearly states the tool diagnoses bugs, analyzes errors, and proposes fixes, with specific 'persona' roles that distinguish it from the generic 'consult-council' sibling. However, the 'Debugging council' framing introduces slight abstraction without clarifying if this is a single invocation or multi-agent workflow.

Usage Guidelines: 2/5

The description provides no guidance on when to use this tool versus siblings like 'consult-council' (general advice), 'review-code' (static analysis), or 'get-logs' (raw data retrieval). It fails to specify prerequisites or when debugging is appropriate versus other actions.

design-architecture (Design Architecture): Grade B

Architecture design council. Systems Architect, Infrastructure Engineer, and DX Advocate evaluate your system design. Always uses high thinking for maximum depth. Output as ADR.

Parameters (JSON Schema)
scale (optional, default: startup): Target scale: startup (small team), growth (scaling), enterprise (large org)
tech_stack (optional): Preferred technologies
description (required): What the system should do
focus_areas (optional): Priority areas (e.g., ["security", "performance"])
webhook_url (optional): Webhook URL to POST results to after completion
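The scale parameter reads as a three-value enum, so a client can reject invalid values before spending credits. A sketch under that assumption (the helper and example system are hypothetical):

```python
# Guard against invalid "scale" values before calling design-architecture.
# The allowed set is taken from the parameter description above.
SCALE_VALUES = {"startup", "growth", "enterprise"}

def build_arguments(description: str, scale: str = "startup") -> dict:
    if scale not in SCALE_VALUES:
        raise ValueError(f"invalid scale: {scale!r}")
    return {"description": description, "scale": scale}

args = build_arguments("Event ingestion pipeline for IoT sensors", scale="growth")
```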
Behavior: 3/5

The description adds valuable context beyond annotations by specifying 'high thinking for maximum depth' (computational cost) and 'Output as ADR' (return format). However, it fails to explain implications of idempotentHint:false (that repeated calls may yield different results) or elaborate on openWorldHint:true interactions beyond the webhook parameter.

Conciseness: 4/5

The description is efficiently structured in four sentences with minimal redundancy. While the opening 'Architecture design council' slightly echoes the title, each subsequent sentence adds distinct value regarding roles, processing depth, and output format. No extraneous information is present.

Completeness: 3/5

Given the lack of output schema, the description appropriately specifies ADR output format. With complete schema coverage and annotations handling safety hints, the description meets basic requirements. However, it could improve by clarifying asynchronous webhook behavior or detailing what the ADR contains.

Parameters: 3/5

With 100% schema description coverage, the input parameters are fully documented in the schema itself. The description does not add parameter-specific semantics, but this is acceptable given the comprehensive schema documentation, warranting the baseline score of 3.

Purpose: 4/5

The description clearly states the tool evaluates system design using specific personas (Systems Architect, Infrastructure Engineer, DX Advocate) and outputs an ADR. It distinguishes from generic sibling 'consult-council' by specifying the council composition and output format, though it doesn't explicitly differentiate from 'assess-tradeoffs'.

Usage Guidelines: 3/5

The phrase 'Always uses high thinking for maximum depth' implies this tool is for complex architectural work requiring deep analysis, providing implied usage context. However, it lacks explicit when-not-to-use guidance or direct comparisons to alternatives like 'consult-council' or 'assess-tradeoffs'.

get-logs (Get Logs): Grade A
Read-only, Idempotent

Query structured logs from your MCP tool executions. Filter by session, severity level, event type, and time range. Useful for debugging and monitoring tool usage.

Parameters (JSON Schema)
event (optional): Filter by event name (e.g., "debate_completed")
level (optional): Filter by log level
limit (optional): Max results
offset (optional): Pagination offset
end_time (optional): End timestamp (ms) for time range filter
session_id (optional): Filter logs for a specific session
start_time (optional): Start timestamp (ms) for time range filter

Output Schema

Fields (JSON Schema)
logs (required): Matching log entries
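Both time filters take millisecond timestamps, and a common pitfall is passing seconds instead. A sketch of a last-hour error filter under that reading of the schema (filter values are illustrative):

```python
import time

# get-logs expects millisecond timestamps per the schema above.
# This builds a "last hour of errors" filter; values are illustrative.
now_ms = int(time.time() * 1000)

arguments = {
    "level": "error",
    "start_time": now_ms - 60 * 60 * 1000,  # one hour ago, in ms
    "end_time": now_ms,
    "limit": 50,
    "offset": 0,
}
```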
Behavior: 3/5

Annotations already declare readOnlyHint, idempotentHint, and destructiveHint, covering the safety profile. The description adds value by specifying the logs are 'structured' and listing the available filter dimensions (session, severity, event type, time range), but does not disclose pagination behavior, rate limits, or result set expectations.

Conciseness: 5/5

The description consists of three tightly constructed sentences: purpose declaration, capability summary, and usage context. Every sentence earns its place with zero redundancy, and the information is properly front-loaded with the primary action.

Completeness: 4/5

Given the presence of an output schema (handling return value documentation), comprehensive input schema coverage, and detailed annotations, the description provides sufficient contextual framing. It appropriately focuses on scope and purpose rather than replicating structured data, though it could briefly note that all filters are optional.

Parameters: 3/5

With 100% schema description coverage, the schema carries the full burden of parameter documentation. The description provides a high-level grouping of filter categories ('session, severity level, event type, and time range') but does not add semantic meaning, syntax details, or constraints beyond what the schema already provides, warranting the baseline score.

Purpose: 4/5

The description clearly states the specific action (Query) and resource (structured logs from MCP tool executions), providing strong specificity. However, it does not explicitly differentiate from siblings like 'debug-issue' or 'get-session', which also relate to troubleshooting and session data.

Usage Guidelines: 4/5

The description provides clear context for when to use the tool ('Useful for debugging and monitoring tool usage'), establishing the operational context. However, it lacks explicit exclusions or guidance on when to prefer alternatives like 'debug-issue' instead.

get-session (Get Session): Grade A
Read-only, Idempotent

Get full details of a previous MCP session by ID. Returns the complete result including participant responses and moderator synthesis. Use list-sessions first to find session IDs.

Parameters (JSON Schema)
session_id (required): The session ID to retrieve

Output Schema

Fields (JSON Schema)
id (required): Session identifier
prompt (required): Original prompt text
toolName (required): Tool that was invoked
createdAt (required): Session creation timestamp
durationMs (required): Duration in milliseconds
qualityScore (required): AI-rated quality score
totalCredits (required): Total credits consumed
Behavior: 3/5

Annotations cover safety profile (readOnly, idempotent, non-destructive). Description adds value by disclosing return content details ('participant responses and moderator synthesis') not evident in annotations, but does not mention auth needs, rate limits, or error states.

Conciseness: 5/5

Three sentences with zero waste: purpose front-loaded, return value described, prerequisite stated. Every clause earns its place without redundancy.

Completeness: 5/5

Given the simple single-parameter input, presence of output schema, and comprehensive annotations, the description provides sufficient context for invocation. It covers domain (MCP sessions), content type (responses/synthesis), and workflow prerequisite.

Parameters: 3/5

Schema description coverage is 100%, establishing baseline 3. Description reinforces the 'by ID' requirement and provides workflow context for sourcing IDs via list-sessions, but does not add syntax/format details beyond the schema.

Purpose: 5/5

Specific verb 'Get' + resource 'full details of a previous MCP session' + scope 'including participant responses and moderator synthesis'. Clearly distinguishes from list-sessions sibling by emphasizing 'by ID' vs listing.

Usage Guidelines: 4/5

Explicit prerequisite instruction 'Use list-sessions first to find session IDs' establishes the workflow sequence. Could be strengthened by explicitly stating when NOT to use (e.g., without an ID), but the dependency guidance is clear.

list-models (List Models): Grade A
Read-only, Idempotent

List available AI models grouped by thinking level (low/medium/high). Shows default models, credit costs, capabilities for each tier. Use this before consult to understand model options.

Parameters (JSON Schema)
thinking_level (optional): Filter to a specific thinking level

Output Schema

Fields (JSON Schema)
levels (required): Available thinking level tiers
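In the intended workflow, list-models feeds consult-council's manual mode: fetch the tiers, then pick at least three model IDs. The sketch below mocks that; the tier structure, model IDs, and credit figures are entirely made up:

```python
# Mocked list-models result: tiers keyed by thinking level.
# Shapes and values are illustrative, not the real API response.
levels = {
    "low":  {"models": ["fast-a", "fast-b", "fast-c"], "credits_per_call": 1},
    "high": {"models": ["deep-a", "deep-b", "deep-c"], "credits_per_call": 5},
}

def models_for(level: str) -> list[str]:
    # Returns the model IDs for a tier, or [] for an unknown tier.
    return levels.get(level, {}).get("models", [])

chosen = models_for("high")  # candidates for consult-council manual mode
```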
Behavior: 4/5

Annotations declare read-only/idempotent safety profile. Description adds valuable behavioral context beyond annotations: it discloses that results include credit costs, capabilities, and default models—critical business logic for model selection that raw annotations don't capture.

Conciseness: 5/5

Three tight sentences with zero waste: purpose declaration, output details, and usage timing. Front-loaded with the core action and resource. No redundant or filler text.

Completeness: 5/5

Given the presence of an output schema, simple parameter structure (1 optional enum), and comprehensive annotations, the description appropriately focuses on business logic (credit costs, capabilities) rather than return value documentation. Complete for this tool complexity.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the baseline is satisfied. Description adds semantic reinforcement by connecting the 'thinking_level' parameter to the grouping logic ('low/medium/high') mentioned in the first sentence, clarifying the parameter's relationship to the output structure.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description provides specific verb ('List'), clear resource ('AI models'), and distinct grouping logic ('grouped by thinking level'). It also distinguishes from sibling 'consult-council' by stating this helps understand 'model options' before consulting.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('before consult') and implies the workflow relationship with 'consult-council'. Lacks explicit 'when not to use' exclusions, but provides clear temporal context for invocation.

list-sessions (List Sessions) · Grade: A
Read-only · Idempotent

List your previous MCP tool sessions. Returns session metadata including prompt, tool used, quality score, and credits consumed. Useful for reviewing past council discussions.

Parameters (JSON Schema)

| Name      | Required | Description                                        | Default |
|-----------|----------|----------------------------------------------------|---------|
| limit     | No       | Max results to return                              |         |
| offset    | No       | Pagination offset                                  |         |
| tool_name | No       | Filter by tool name (e.g., "consult", "architect") |         |

Output Schema (JSON Schema)

| Name     | Required | Description                        |
|----------|----------|------------------------------------|
| hasMore  | Yes      | Whether more results are available |
| sessions | Yes      | List of sessions                   |
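The `limit`/`offset` parameters together with the `hasMore` output flag imply a conventional pagination loop. A hedged sketch of that loop; the `call_tool` callable stands in for whatever MCP client transport is in use, and the stubbed pages are purely illustrative:

```python
def fetch_all_sessions(call_tool, limit=20, tool_name=None):
    """Page through list-sessions until hasMore is false (illustrative)."""
    sessions, offset = [], 0
    while True:
        arguments = {"limit": limit, "offset": offset}
        if tool_name is not None:
            arguments["tool_name"] = tool_name  # e.g., "consult" or "architect"
        result = call_tool("list-sessions", arguments)
        sessions.extend(result["sessions"])
        if not result["hasMore"]:
            return sessions
        offset += limit  # advance by page size, mirroring the offset parameter

# Stub transport returning two pages, to show the loop shape:
pages = [
    {"sessions": [{"id": "s1"}, {"id": "s2"}], "hasMore": True},
    {"sessions": [{"id": "s3"}], "hasMore": False},
]
fake_call = lambda name, args: pages[args["offset"] // args["limit"]]
print(fetch_all_sessions(fake_call, limit=2))
```

This is exactly the pagination pattern the review notes the description could have connected explicitly.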
Behavior: 4/5

Supplements annotations effectively by disclosing specific metadata fields returned (prompt, tool used, quality score, credits consumed). Does not contradict readOnlyHint/destructiveHint annotations. Could mention pagination behavior explicitly.

Conciseness: 5/5

Three well-structured sentences: action definition, return value specification, and usage context. Front-loaded with the core purpose. No redundant or wasted language.

Completeness: 4/5

Appropriate for complexity level. With output schema present and rich annotations covering safety/idempotency, the description adequately previews return content. Minor gap: could explicitly mention pagination pattern to connect limit/offset parameters conceptually.

Parameters: 3/5

Schema coverage is 100% with clear parameter descriptions (limit, offset, tool_name). Description does not add parameter-specific semantics beyond the schema, warranting the baseline score of 3 for high-coverage schemas.

Purpose: 5/5

States specific verb 'List' and resource 'previous MCP tool sessions' clearly. Distinct from sibling 'get-session' (singular retrieval) by implication of listing multiple sessions, though explicit differentiation would strengthen it further.

Usage Guidelines: 3/5

Provides implied usage context ('Useful for reviewing past council discussions') linking to the 'consult-council' sibling. However, it lacks explicit guidance on when to use this versus 'get-session' for single-session details, and offers no 'when-not-to-use' guidance.

plan-implementation (Plan Implementation) · Grade: A

Implementation planning council. Tech Lead, Senior Engineer, and QA Strategist break down a feature into actionable steps, identify risks, and define acceptance criteria. Output is delivered as an ADR (Architecture Decision Record).

Parameters (JSON Schema)

| Name             | Required | Description                                                                         | Default |
|------------------|----------|-------------------------------------------------------------------------------------|---------|
| feature          | Yes      | The feature or change to plan                                                       |         |
| knowledge        | No       | Reference knowledge to inject as context                                            |         |
| tech_stack       | No       | Current tech stack                                                                  |         |
| constraints      | No       | Constraints (e.g., ["no breaking changes", "must support offline"])                 |         |
| webhook_url      | No       | Webhook URL to POST results to after completion                                     |         |
| thinking_level   | No       | Planning depth                                                                      | medium  |
| session_context  | No       | Session IDs to use as context (max 3). Prior moderator summaries will be prepended. |         |
| codebase_context | No       | Relevant existing code, file structure, or architecture notes                       |         |
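The schema's "max 3" cap on `session_context` is the kind of constraint an agent can enforce before calling the tool. A hedged sketch of argument assembly; the helper name and validation are illustrative, not part of the server's documented API:

```python
def build_plan_arguments(feature, constraints=None, session_context=None,
                         thinking_level="medium"):
    """Assemble plan-implementation arguments (hypothetical helper).

    Enforces the documented cap of 3 session IDs in session_context
    before the request ever leaves the client.
    """
    if session_context and len(session_context) > 3:
        raise ValueError("session_context accepts at most 3 session IDs")
    arguments = {"feature": feature, "thinking_level": thinking_level}
    if constraints:
        arguments["constraints"] = constraints
    if session_context:
        arguments["session_context"] = session_context
    return arguments

args = build_plan_arguments(
    "Add offline sync for drafts",  # hypothetical feature
    constraints=["no breaking changes", "must support offline"],
    session_context=["sess_abc"],   # hypothetical session ID
)
print(args)
```

Client-side validation like this turns a likely server-side rejection into an immediate, descriptive error.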
Behavior: 4/5

The description adds valuable context about the collaborative agent behavior ('council') and deliverable format ('ADR') beyond the annotations. It aligns with readOnlyHint:false by implying document creation. It could be improved by mentioning the webhook delivery mechanism or session continuity behavior, but effectively explains the core multi-agent interaction model.

Conciseness: 5/5

Single sentence delivers maximum information density: establishes the agent pattern (council), specific roles, three distinct activities, and output format. No redundancy or filler; every clause serves disambiguation.

Completeness: 3/5

While the conceptual model is clear, the description omits mention of significant capabilities evident in the schema: webhook delivery, session context continuity, and knowledge injection. For a tool with 8 parameters including async callbacks and reference knowledge, these behavioral gaps are notable despite the rich schema.

Parameters: 3/5

With 100% schema description coverage, the parameters are fully documented in the schema itself. The description focuses on the conceptual model rather than parameter specifics, which is acceptable given the comprehensive schema. It earns the baseline score for high-coverage schemas without adding supplementary parameter guidance.

Purpose: 5/5

The description specifies a concrete multi-agent workflow ('Tech Lead, Senior Engineer, and QA Strategist'), distinct actions ('break down... identify risks... define acceptance criteria'), and output format ('ADR'). This clearly distinguishes it from sibling tools like consult-council or design-architecture by specifying the exact council composition and deliverable type.

Usage Guidelines: 3/5

Usage is implied by the specific domain ('implementation planning') and output format ('ADR'), suggesting use when ready to plan execution details rather than high-level architecture. However, it lacks explicit when-to-use guidance, prerequisites, or differentiation from consult-council when both involve councils.

review-code (Review Code) · Grade: B

Code review council. Senior Engineer, Security Reviewer, and Performance Analyst analyze your code and a moderator synthesizes their findings.

Parameters (JSON Schema)

| Name           | Required | Description                                                        | Default |
|----------------|----------|--------------------------------------------------------------------|---------|
| code           | Yes      | The code to review                                                 |         |
| focus          | No       | Review focus areas (e.g., ["security", "performance"])             |         |
| language       | No       | Programming language (auto-detected if not specified)              |         |
| webhook_url    | No       | Webhook URL to POST results to after completion                    |         |
| thinking_level | No       | Review depth: low (quick scan), medium (balanced), high (thorough) | medium  |
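Only `code` is required; the remaining parameters narrow or deepen the review. A hedged sketch of building the arguments object, where the helper name is illustrative and omitting `language` relies on the documented auto-detection:

```python
def build_review_arguments(code, focus=None, language=None, thinking_level="medium"):
    """Assemble review-code arguments (hypothetical helper)."""
    arguments = {"code": code, "thinking_level": thinking_level}
    if focus:
        arguments["focus"] = focus        # e.g., ["security", "performance"]
    if language:
        arguments["language"] = language  # omit to let the server auto-detect
    return arguments

print(build_review_arguments("def f(x):\n    return x * 2", focus=["performance"]))
```

Passing a `focus` list steers the relevant personas (Security Reviewer, Performance Analyst) toward those areas, per the tool description.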
Behavior: 3/5

Adds valuable internal process context (multi-persona analysis with moderator synthesis) not present in annotations. However, fails to address implications of openWorldHint=true (external service calls, latency, potential failures) or describe the return format/structure given no output schema exists.

Conciseness: 4/5

Two tight sentences with zero redundancy and strong front-loading. The metaphorical 'council' framing efficiently conveys the multi-dimensional analysis. Could benefit from slight formatting (e.g., listing the roles) to improve scannability but remains appropriately concise.

Completeness: 3/5

Adequate for parameter understanding but incomplete regarding the output (what does the synthesized finding look like? JSON? Markdown?). Given 5 parameters and external calls implied by openWorldHint, the description should characterize the return value or response structure.

Parameters: 4/5

While schema has 100% coverage, the description enhances the 'focus' parameter significantly by mapping array items to specific council personas (Security Reviewer → 'security', Performance Analyst → 'performance'), helping users understand the expertise applied to each focus area.

Purpose: 4/5

The description clearly identifies the action (review) and resource (code) via the 'Code review council' framing. It distinguishes from sibling 'consult-council' by specifying the exact council composition (Senior Engineer, Security Reviewer, Performance Analyst, moderator), though it could explicitly state 'submits code for analysis' to be perfect.

Usage Guidelines: 2/5

No explicit guidance on when to use this versus 'consult-council' (which also appears to use a council pattern) or 'debug-issue'. Missing prerequisites like code size limits, supported languages, or when synchronous vs webhook usage is preferred.

set-thread-visibility (Set Thread Visibility) · Grade: B
Idempotent

Set a thread as public or private. Public threads can be shared via URL. Use session_id from a previous debate session.

Parameters (JSON Schema)

| Name       | Required | Description                                                                   | Default |
|------------|----------|-------------------------------------------------------------------------------|---------|
| is_public  | Yes      | Set to true to make the thread publicly accessible, false to make it private |         |
| session_id | Yes      | The session ID from a previous debate                                        |         |

Output Schema (JSON Schema)

| Name      | Required | Description                          |
|-----------|----------|--------------------------------------|
| isPublic  | Yes      | Current visibility state             |
| publicUrl | Yes      | Public share URL if thread is public |
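The output schema pairs a visibility flag with a share URL, so a client can flip a thread public and hand back the link in one step. A hedged sketch; `call_tool` stands in for the MCP transport, and the stub's URL shape is invented for illustration only:

```python
def share_thread(call_tool, session_id):
    """Make a thread public and return its share URL (illustrative)."""
    result = call_tool("set-thread-visibility",
                       {"session_id": session_id, "is_public": True})
    # publicUrl is meaningful only when the thread is actually public.
    return result["publicUrl"] if result["isPublic"] else None

# Stub transport echoing the documented output shape (isPublic, publicUrl):
fake = lambda name, args: {"isPublic": args["is_public"],
                           "publicUrl": "https://example.com/t/" + args["session_id"]}
print(share_thread(fake, "sess_123"))
```

Because the annotations declare the tool idempotent, repeating this call with the same arguments should be safe.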
Behavior: 3/5

The annotations already declare idempotentHint=true, readOnlyHint=false, and destructiveHint=false, covering the operation's safety profile. The description adds behavioral context by explaining that public threads are shareable via URL, which helps the agent understand the functional impact beyond the boolean flag. It misses the opportunity to explicitly reassure about idempotency or mention authorization requirements.

Conciseness: 4/5

The description consists of three efficient sentences that are front-loaded with the core action. Each sentence serves a distinct purpose: defining the operation, explaining the behavioral consequence (URL sharing), and providing parameter sourcing guidance. There is minimal redundancy or extraneous information.

Completeness: 3/5

Given the tool's simplicity (2 parameters, 100% schema coverage, existing output schema, and comprehensive annotations), the description provides sufficient context for basic invocation. However, for a state-modifying operation, it could be improved by mentioning error scenarios (e.g., invalid session_id) or explicitly referencing the idempotent nature hinted at in annotations.

Parameters: 3/5

With 100% schema description coverage, the input schema fully documents both parameters ('is_public' semantics and 'session_id' format). The description reinforces the session_id requirement ('Use session_id from a previous debate session') but primarily restates information already present in the schema property descriptions, meeting the baseline expectation for high-coverage schemas.

Purpose: 4/5

The description clearly states the core action ('Set a thread as public or private') and identifies the resource being manipulated. It adds valuable scope clarification by explaining the consequence of public visibility ('can be shared via URL'). However, it does not explicitly differentiate from the sibling tool 'get-thread-link', which likely relates to URL generation.

Usage Guidelines: 3/5

The description provides implicit usage guidance by specifying the provenance of the session_id parameter ('from a previous debate session'), helping the agent understand when this tool is applicable. However, it lacks explicit guidance on when to prefer this over siblings like 'get-thread-link' or prerequisites such as checking current visibility state.
