Server Details

Multi-model AI debates: GPT-4o, Claude, Gemini & 200+ models discuss, then synthesize insight.

Status: Healthy
Transport: Streamable HTTP

Tool Descriptions: Grade A

Average 3.9/5 across 13 of 13 tools scored. Lowest: 3.1/5.

Server Coherence: Grade A
Disambiguation: 4/5

Most tools have distinct purposes, such as consult-council for AI discussions, debug-issue for bug diagnosis, and get-logs for log queries. However, some overlap exists between design-architecture and plan-implementation, both involving system design and outputting ADRs, which could cause mild confusion. The descriptions help clarify differences, but the boundaries are not perfectly sharp.

Naming Consistency: 5/5

All tool names follow a consistent verb-noun pattern joined by hyphens, such as assess-tradeoffs, check-usage, and list-sessions. This uniformity makes the set predictable and easy to navigate, with no deviations in naming conventions across the 13 tools.

Tool Count: 5/5

With 13 tools, the count is well-scoped for a server focused on AI council discussions and session management. Each tool serves a clear purpose, such as consult-council for core functionality and get-logs for monitoring, ensuring no tool feels redundant or unnecessary for the domain.

Completeness: 4/5

The tool set covers key aspects like initiating discussions (consult-council), debugging (debug-issue), and session management (list-sessions, get-session). Minor gaps exist, such as no tool for deleting or editing sessions, but agents can work around this with the available tools for viewing and setting visibility.

Available Tools

13 tools
assess-tradeoffs (Assess Tradeoffs): Grade B

Tradeoff assessment council. Pragmatist, Skeptic, and Futurist evaluate options from different angles — short-term vs long-term, risk vs reward, simplicity vs flexibility. Output as pros-cons.

Parameters (JSON Schema)

decision (required): The decision or question to evaluate
context (optional): Background context — codebase, team, timeline, constraints
options (optional): Specific options to compare
priorities (optional): What matters most (e.g., ["performance", "dx", "cost"])
webhook_url (optional): Webhook URL to POST results to after completion
thinking_level (optional, default: medium): Analysis depth
session_context (optional): Session IDs to use as context (max 3). Prior moderator summaries will be prepended.
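To make the schema concrete, here is a minimal sketch of the arguments an agent might send to this tool. The values are hypothetical; only the field names and the required/optional split come from the schema above.

```python
import json

# Hypothetical assess-tradeoffs arguments; 'decision' is the only required field.
args = {
    "decision": "Adopt a message queue or keep direct service calls?",
    "options": ["RabbitMQ", "direct HTTP calls"],
    "priorities": ["performance", "cost"],
    "thinking_level": "medium",  # schema default, shown here for clarity
}

payload = json.dumps(args)  # MCP tool arguments travel as JSON
```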
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

While annotations indicate openWorldHint=true and readOnlyHint=false (suggesting external interactions and state modification), the description adds valuable context about the internal decision-making mechanism (three specific analytical personas) not found in structured fields. However, it fails to clarify what external systems are contacted (via webhook_url) or what state is persisted (implied by session_context and readOnly=false).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description consists of two highly efficient sentences. The first establishes the conceptual model ('Tradeoff assessment council'), and the second details the methodology, evaluation dimensions, and output format. Every phrase carries semantic weight with no redundancy or filler.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of complex infrastructure parameters (webhook_url, session_context) and annotations indicating non-idempotent external interactions, the description adequately covers the core analytical behavior but leaves significant gaps regarding side effects, persistence mechanisms, and the asynchronous webhook behavior implied by the parameter schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the parameters are well-documented in the schema itself. The description implicitly references the 'options' parameter and illustrates the types of tradeoffs considered (which loosely map to 'priorities'), but does not explicitly explain parameter relationships, validation rules, or provide usage examples beyond what the schema already contains.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly establishes the tool's function using the 'Pragmatist, Skeptic, and Futurist' personas to evaluate options across specific dimensions (short-term vs long-term, risk vs reward, simplicity vs flexibility), and specifies the output format as 'pros-cons'. However, it relies on metaphorical framing ('council') without explicitly contrasting with the similar 'consult-council' sibling tool.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to select this tool versus alternatives like 'consult-council', 'design-architecture', or 'plan-implementation'. It lacks explicit prerequisites, exclusion criteria, or scenario-based recommendations for usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check-usage (Check Usage): Grade B
Read-only, Idempotent

Check your remaining credits, usage limits, and plan info

Parameters (JSON Schema)

verbose (optional): Include detailed per-window rate limit breakdown

Output Schema (JSON)

plan (required): Current plan tier
status (required): Overall usage status
credits (required): Remaining credit balance
rateLimits (required): Rate limit status across all windows
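A hedged sketch of consuming a check-usage result. The four top-level field names follow the output schema above; the concrete values and the nested shape of rateLimits are assumptions.

```python
# Hypothetical check-usage result; the four top-level fields are required
# by the output schema, so an agent can read them without guarding.
result = {
    "plan": "free",
    "status": "ok",
    "credits": 420,
    "rateLimits": {"minute": {"remaining": 10}},  # nested shape is assumed
}

low_on_credits = result["credits"] < 50
```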
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare read-only, idempotent, safe operations. Description adds value by specifying returned data types (credits, limits, plan info) but omits cache behavior, rate limit costs, or authentication scope beyond what annotations provide.


Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely efficient at 9 words. Front-loaded with action and resources. No filler or redundancy; every word earns its place.


Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Appropriate for a simple metadata tool. Output schema exists (reducing need to describe return values), annotations cover safety profile, and description identifies the domain. Could improve by noting if this reflects real-time or cached data.


Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the 'verbose' parameter fully documented. Description mentions no parameters, but baseline 3 is appropriate since the schema carries the full semantic burden adequately.


Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear verb 'Check' with specific resources (credits, usage limits, plan info). Distinct from sibling tools which focus on code, architecture, and debugging. However, lacks explicit differentiation text comparing to alternatives like get-session.


Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use versus alternatives, prerequisites, or frequency recommendations. Description states what it does but not when an agent should invoke it.


consult-council (Consult Council): Grade A

Consult the AI coding council — multiple models discuss your engineering question sequentially (each sees prior responses), then a moderator synthesizes. Auto-mode by default — AI picks optimal models, roles, and conversation mode from your prompt. Provide explicit models to override (manual mode). Fully configurable: mode, format, roles, models, thinking level.

Parameters (JSON Schema)

prompt (required): The question, topic, or problem to debate
mode (optional, default: debating): Conversation mode: analyzing (research), brainstorming (ideas), debating (tradeoffs), solving (action plans)
roles (optional): Inline role names for participants (e.g., ["Security Architect", "Backend Engineer"])
format (optional, default: discussion): Moderator output format: discussion (narrative), adr (architecture decision), comparison (table), pros-cons
models (optional): Override specific model IDs. Min 3 models. Use list-models to see available options
context (optional): Additional background context for the debate (code, docs, requirements)
knowledge (optional): Reference knowledge to inject as context
auto_route (optional): Auto-select optimal models based on prompt analysis and historical performance
webhook_url (optional): Webhook URL to POST results to after completion
thinking_level (optional, default: medium): Controls model quality and cost: low (fast/cheap), medium (balanced), high (maximum reasoning)
session_context (optional): Session IDs to use as context (max 3). Prior moderator summaries will be prepended.
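The auto-mode vs manual-mode split described above can be sketched as two argument sets. The model IDs below are placeholders, not real identifiers; list-models returns the actual options.

```python
# Auto mode (default): only the required prompt; the server picks models,
# roles, and conversation mode from the prompt itself.
auto_args = {"prompt": "Should we shard the Postgres database?"}

# Manual mode: explicit model IDs override auto-routing. The schema
# requires at least 3 models; these IDs are hypothetical placeholders.
manual_args = {
    "prompt": "Should we shard the Postgres database?",
    "models": ["model-a", "model-b", "model-c"],
    "mode": "debating",  # schema default
}

assert len(manual_args["models"]) >= 3  # "Min 3 models" per the schema
```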
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds valuable process context beyond annotations (sequential discussion flow, moderator synthesis, auto-routing logic). However, fails to disclose operational traits implied by parameters: that it creates a session/thread (see siblings get-session/get-thread-link), async webhook behavior, or cost/latency implications of multi-model calls.


Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four sentences, front-loaded with core mechanism. Every sentence earns its place: (1) what it does, (2) default behavior, (3) override path, (4) configurability scope. No redundancy.


Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for input configuration given rich schema, but incomplete regarding output behavior: no output schema exists, yet description doesn't clarify return structure (synthesis format) or session lifecycle implications critical for an agent invoking an async/multi-step tool.


Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage (baseline 3), the description adds conceptual framework linking auto_route and models parameters into 'auto-mode' vs 'manual mode' paradigm, helping agents understand the relationship between these settings.


Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb ('consult') + resource ('AI coding council') + mechanism ('multiple models discuss sequentially... moderator synthesizes'). Clearly distinguishes from siblings like 'review-code' or 'design-architecture' by emphasizing the multi-model debate/discussion pattern.


Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides operational guidance distinguishing auto-mode (default) from manual mode ('Provide explicit models to override'), but lacks explicit guidance on when to select this tool vs siblings like 'assess-tradeoffs' or 'design-architecture' for engineering problems.


debug-issue (Debug Issue): Grade B

Debugging council. Root Cause Analyst, Systems Engineer, and Edge Case Investigator collaboratively diagnose bugs, analyze errors, and propose fixes.

Parameters (JSON Schema)

problem (required): Describe the bug, failure, or unexpected behavior
code (optional): The relevant code where the bug occurs
error (optional): Error message, stack trace, or unexpected output
expected_behavior (optional): What should happen vs what actually happens
knowledge (optional): Reference knowledge to inject as context
webhook_url (optional): Webhook URL to POST results to after completion
thinking_level (optional, default: medium): Analysis depth
session_context (optional): Session IDs to use as context (max 3). Prior moderator summaries will be prepended.
expected_behaviorNoWhat should happen vs what actually happens
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false and openWorldHint=true, but the description doesn't clarify what side effects occur (e.g., whether debugging sessions are persisted, what the webhook receives, or that external URLs in the knowledge parameter will be fetched). It adds context about the collaborative 'council' process but omits critical behavioral details an agent needs to use the tool safely.


Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is brief (two sentences) but inefficiently structured. The first sentence 'Debugging council.' is a content-free fragment that restates the title. The second sentence is information-dense but front-loading the personas before the action creates slight cognitive overhead for the agent parsing intent.


Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 8 parameters including webhooks, session context references, and external knowledge injection, the description is underspecified. It fails to explain the interaction pattern (async via webhook vs sync), the role of session_context, or what output to expect (no output schema exists). While the schema documents individual parameters, the description doesn't tie them together into a complete operational picture.


Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema fully documents all 8 parameters including complex nested objects (knowledge) and enums (thinking_level). The description text adds no parameter-specific guidance, syntax details, or examples, earning the baseline score of 3.


Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The second sentence provides specific verbs (diagnose, analyze, propose) and resources (bugs, errors, fixes), while the persona list (Root Cause Analyst, Systems Engineer, Edge Case Investigator) suggests a specialized multi-agent approach. However, the first sentence 'Debugging council.' merely restates the concept without adding value, and the description doesn't explicitly differentiate this from the sibling 'consult-council' or 'review-code' tools.


Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to select this tool versus siblings like 'consult-council' (general advice) or 'review-code' (code review). It doesn't mention prerequisites (e.g., having error messages or code snippets ready) or when the webhook parameter should be utilized.


design-architecture (Design Architecture): Grade A

Architecture design council. Systems Architect, Infrastructure Engineer, and DX Advocate evaluate your system design. Always uses high thinking for maximum depth. Output as ADR.

Parameters (JSON Schema)

description (required): What the system should do
scale (optional, default: startup): Target scale: startup (small team), growth (scaling), enterprise (large org)
tech_stack (optional): Preferred technologies
focus_areas (optional): Priority areas (e.g., ["security", "performance"])
webhook_url (optional): Webhook URL to POST results to after completion
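A minimal sketch of design-architecture arguments, assuming the enum values listed for scale are the only valid ones. The system description is invented.

```python
# Hypothetical design-architecture arguments; 'description' is required.
args = {
    "description": "Multi-tenant SaaS for invoice processing",
    "scale": "growth",  # one of: startup (default), growth, enterprise
    "focus_areas": ["security", "performance"],
}

assert args["scale"] in {"startup", "growth", "enterprise"}
```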
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate openWorldHint=true and readOnlyHint=false. The description adds valuable behavioral context not in annotations: the computational cost ('high thinking'), the collaborative multi-agent nature of the evaluation, and the specific output format (ADR). It does not contradict the annotations.


Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three sentences with zero waste. It front-loads the core concept (architecture council), follows with specific roles and action, then adds operational characteristics (high thinking) and output format. Every clause earns its place.


Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex multi-agent tool with webhook support (implying potential async operation), the description is minimally adequate. It notes the ADR output but does not explain the webhook's role in async delivery versus synchronous return, nor does it define ADR for users unfamiliar with the acronym.


Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema adequately documents all parameters (description, scale, tech_stack, focus_areas, webhook_url). The description does not add parameter-specific semantics, but this is acceptable given the comprehensive schema documentation, meeting the baseline for this dimension.


Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the specific multi-persona 'council' structure (Systems Architect, Infrastructure Engineer, DX Advocate), the action (evaluate system design), and deliverable format (ADR). This effectively distinguishes it from the generic 'consult-council' sibling and tradeoff analysis tools.


Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides one explicit usage signal ('Always uses high thinking for maximum depth'), suggesting when to use it for deep analysis. However, it lacks explicit guidance on when NOT to use it (e.g., for quick/simple questions) and does not mention alternatives like 'assess-tradeoffs' for narrower scope decisions.


get-logs (Get Logs): Grade A
Read-only, Idempotent

Query structured logs from your MCP tool executions. Filter by session, severity level, event type, and time range. Useful for debugging and monitoring tool usage.

Parameters (JSON Schema)

event (optional): Filter by event name (e.g., "debate_completed")
level (optional): Filter by log level
limit (optional): Max results
offset (optional): Pagination offset
session_id (optional): Filter logs for a specific session
start_time (optional): Start timestamp (ms) for time range filter
end_time (optional): End timestamp (ms) for time range filter

Output Schema (JSON)

logs (required): Matching log entries
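Because start_time and end_time take millisecond timestamps, a "last hour of errors" query can be built as below. The session ID is a hypothetical placeholder.

```python
import time

# Build a get-logs filter for errors in the last hour. The time-range
# fields take epoch timestamps in milliseconds, per the schema above.
now_ms = int(time.time() * 1000)
args = {
    "level": "error",
    "session_id": "sess_123",               # hypothetical session ID
    "start_time": now_ms - 60 * 60 * 1000,  # one hour ago, in ms
    "end_time": now_ms,
    "limit": 50,
}

assert args["start_time"] < args["end_time"]
```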
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only, idempotent, non-destructive behavior. Description adds valuable context about data source (MCP tool executions) and available filter dimensions, but omits behavioral details like pagination behavior, log retention, or result ordering.


Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three well-structured sentences: purpose/scope, capabilities, and use cases. No redundancy or waste. Information is front-loaded with the core action.


Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Appropriate for complexity: covers data source, filtering capabilities, and use cases. With output schema present and rich annotations, description doesn't need to detail return values. Minor gap: could note that all filters are optional (0 required params).


Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, baseline is 3. Description adds semantic grouping by mapping parameters to filter categories (session, severity level, event type, time range), but does not add syntax details, examples, or constraints beyond the schema.


Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb 'Query' with clear resource 'structured logs from your MCP tool executions', distinguishing it from sibling tools like get-session (metadata) or debug-issue (procedural debugging).


Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides implied usage contexts ('Useful for debugging and monitoring tool usage') but lacks explicit when-to-use guidance versus similar siblings like debug-issue or check-usage, and does not specify prerequisites or exclusions.


get-session (Get Session): Grade A
Read-only, Idempotent

Get full details of a previous MCP session by ID. Returns the complete result including participant responses and moderator synthesis. Use list-sessions first to find session IDs.

Parameters (JSON Schema)

session_id (required): The session ID to retrieve

Output Schema (JSON)

id (required): Session identifier
prompt (required): Original prompt text
toolName (required): Tool that was invoked
createdAt (required): Session creation timestamp
durationMs (required): Duration in milliseconds
qualityScore (required): AI-rated quality score
totalCredits (required): Total credits consumed
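The list-sessions-then-get-session workflow can be sketched with plain data. pick_latest_session is a hypothetical helper, and createdAt is treated as an epoch-millisecond number per the output schema; the session records are fabricated.

```python
# Sketch of the documented workflow: list-sessions first to find IDs,
# then get-session for full details.
def pick_latest_session(sessions):
    """Return the ID of the most recently created session, or None."""
    if not sessions:
        return None
    return max(sessions, key=lambda s: s["createdAt"])["id"]

sessions = [
    {"id": "sess_1", "createdAt": 1700000000000},
    {"id": "sess_2", "createdAt": 1700000100000},
]
latest = pick_latest_session(sessions)  # pass this ID to get-session
```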
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

While annotations cover safety (readOnlyHint, idempotentHint), the description adds valuable context about the return content ('participant responses and moderator synthesis') that helps the agent understand what data to expect beyond the schema structure.


Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste: purpose declaration, return value details, and prerequisite workflow. Information is front-loaded with the action verb.


Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema, the description appropriately focuses on purpose and workflow. It mentions the key components of the result (participant responses, moderator synthesis) without redundantly detailing the full schema structure.


Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage for the single 'session_id' parameter, the schema carries the semantic load. The description implies the ID refers to a 'previous MCP session' but doesn't elaborate beyond the schema definition.


Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description states the specific action ('Get full details'), resource ('MCP session'), and scope ('by ID'), clearly distinguishing it from sibling 'list-sessions' which finds IDs rather than retrieving full details.


Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states the workflow dependency: 'Use list-sessions first to find session IDs,' providing clear guidance on when to use this tool versus its sibling.


list-models (List Models): Grade A
Read-only, Idempotent

List available AI models grouped by thinking level (low/medium/high). Shows default models, credit costs, capabilities for each tier. Use this before consult to understand model options.

Parameters (JSON Schema)

thinking_level (optional): Filter to a specific thinking level

Output Schema (JSON)

levels (required): Available thinking level tiers
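Since the output schema only promises a levels field, the per-tier shape below is an assumption, invented for illustration. The sketch shows how an agent might pick models for a given thinking level before calling consult-council.

```python
# Hypothetical list-models result; tier contents and the field names
# inside each level are assumptions, only 'levels' comes from the schema.
levels = {
    "low": {"models": ["fast-model"], "creditCost": 1},
    "medium": {"models": ["balanced-model"], "creditCost": 3},
    "high": {"models": ["deep-model"], "creditCost": 9},
}

high_tier_models = levels["high"]["models"]  # candidates for manual mode
```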
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnly/idempotent/destructive hints, so safety is covered. The description adds valuable business logic context: it discloses that results include 'credit costs' (billing implications), 'default models', and 'capabilities for each tier'—information beyond what annotations provide.


Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences each earning their place: (1) establishes core purpose and grouping, (2) details return payload contents, (3) provides workflow guidance. Front-loaded with the key action verb and zero wasted words.


Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the presence of an output schema and comprehensive annotations covering safety properties, the description appropriately focuses on workflow integration and data content (credit costs, capabilities) rather than redundant structural details. Complete for a discovery/listing tool.


Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with the 'thinking_level' parameter fully documented as 'Filter to a specific thinking level'. The description mentions the enum values (low/medium/high) and maps them to 'tiers', but per rubric baseline is 3 when schema carries the full documentation load.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses specific verb 'List' with resource 'AI models' and clarifies grouping method 'by thinking level'. It explicitly distinguishes from sibling 'consult-council' by stating this is used 'before consult', establishing a clear workflow boundary.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly states when to use ('before consult') and why ('to understand model options'), providing clear sequencing guidance relative to the consult-council workflow. This directly addresses the alternative tool in the sibling set.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list-sessions (List Sessions): Grade A
Read-only · Idempotent

List your previous MCP tool sessions. Returns session metadata including prompt, tool used, quality score, and credits consumed. Useful for reviewing past council discussions.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| limit | No | Max results to return | |
| offset | No | Pagination offset | |
| tool_name | No | Filter by tool name (e.g., "consult", "architect") | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| hasMore | Yes | Whether more results are available |
| sessions | Yes | List of sessions |
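The hasMore flag pairs with limit/offset for pagination. Below is a minimal sketch, under stated assumptions, of how an agent might drain every page; call_tool is a hypothetical stand-in for a real MCP client's tools/call, and the canned session data is purely illustrative.

```python
# Hypothetical sketch: paging through list-sessions with limit/offset
# until hasMore is false. call_tool stubs a real MCP client call.
def call_tool(name, arguments):
    # Canned data standing in for the server's session store.
    sessions = [{"id": f"sess_{i}", "tool": "consult"} for i in range(5)]
    offset = arguments.get("offset", 0)
    limit = arguments.get("limit", 20)
    page = sessions[offset:offset + limit]
    return {"sessions": page, "hasMore": offset + limit < len(sessions)}

def all_sessions(limit=2, tool_name=None):
    """Collect every session, page by page."""
    collected, offset = [], 0
    while True:
        args = {"limit": limit, "offset": offset}
        if tool_name is not None:
            args["tool_name"] = tool_name  # optional filter, e.g. "consult"
        result = call_tool("list-sessions", args)
        collected.extend(result["sessions"])
        if not result["hasMore"]:
            return collected
        offset += limit

print(len(all_sessions()))  # 5: three pages of 2, 2, and 1
```

The loop terminates on hasMore rather than an empty page, matching the output schema above.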

Behavior: 4/5

Annotations declare the safety profile (readOnly, idempotent, non-destructive). The description adds valuable behavioral context about what data is returned (specific metadata fields) and domain context (council discussions), though it omits pagination behavior details.

Conciseness: 5/5

Three sentences with zero waste: action declaration, return value specification, and usage context. Front-loaded with the core verb and appropriately sized.

Completeness: 5/5

Given the tool's moderate complexity (3 optional params, simple filtering/pagination), the existence of an output schema, and comprehensive annotations, the description provides sufficient context, including domain-specific terminology ('council discussions') relevant to the tool ecosystem.

Parameters: 3/5

Input schema has 100% description coverage (limit, offset, tool_name all documented). The description does not add parameter-specific semantics, but with full schema coverage, no additional description is necessary. Baseline score applies.

Purpose: 5/5

The description clearly states the verb (List) and resource (MCP tool sessions), distinguishes from sibling get-session by implying a plural/collection return, and specifically enumerates the metadata fields returned (prompt, tool used, quality score, credits), providing concrete scope.

Usage Guidelines: 4/5

Provides explicit usage context ('Useful for reviewing past council discussions') tying to the consult-council sibling workflow. However, it does not explicitly contrast with get-session for retrieving single session details vs. listing multiple sessions.

plan-implementation (Plan Implementation): Grade A

Implementation planning council. Tech Lead, Senior Engineer, and QA Strategist break down a feature into actionable steps, identify risks, and define acceptance criteria. Output as ADR.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| feature | Yes | The feature or change to plan | |
| knowledge | No | Reference knowledge to inject as context | |
| tech_stack | No | Current tech stack | |
| constraints | No | Constraints (e.g., ["no breaking changes", "must support offline"]) | |
| webhook_url | No | Webhook URL to POST results to after completion | |
| thinking_level | No | Planning depth | medium |
| session_context | No | Session IDs to use as context (max 3). Prior moderator summaries will be prepended. | |
| codebase_context | No | Relevant existing code, file structure, or architecture notes | |
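As a sketch of how these parameters combine, here is an illustrative (not authoritative) arguments object for a plan-implementation call; every value, including the session ID, is invented.

```python
# Illustrative plan-implementation arguments; only "feature" is required.
arguments = {
    "feature": "Add offline draft syncing to the editor",
    "tech_stack": "Next.js + PostgreSQL",
    "constraints": ["no breaking changes", "must support offline"],
    "thinking_level": "high",         # deeper planning than the "medium" default
    "session_context": ["sess_123"],  # at most 3; moderator summaries get prepended
}

# Sanity checks an agent could apply before issuing the call.
assert "feature" in arguments
assert len(arguments.get("session_context", [])) <= 3
print(sorted(arguments))
```

Omitting thinking_level falls back to the documented "medium" default.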

Behavior: 4/5

Excellent disclosure beyond annotations: describes the multi-agent 'council' mechanism with specific roles, reveals the output format (ADR), and explains the activities performed. openWorldHint=true aligns with the knowledge parameter's URL-fetching capability, though the description doesn't explicitly mention external data access.

Conciseness: 5/5

Two efficient sentences with zero waste. Front-loaded with the council pattern identifier, followed by specific actions and output format. No redundant or filler text.

Completeness: 4/5

For moderate complexity (8 params, nested objects), the description adequately covers the council composition and output format. Missing: webhook async behavior, session persistence details, and the relationship to the session_context parameter. However, 'Output as ADR' provides crucial behavioral context given that no output schema exists.

Parameters: 3/5

Schema description coverage is 100%, so the baseline of 3 is appropriate. The description adds no parameter-specific guidance, but the schema adequately documents all 8 parameters, including the nested knowledge object structure and thinking_level enum values.

Purpose: 5/5

The specific verb+resource combination ('break down a feature into actionable steps') clearly identifies the tool's function. Distinguishes from siblings like design-architecture (high-level structure) and assess-tradeoffs (evaluation) by specifying the implementation focus, specific personas (Tech Lead, Senior Engineer, QA Strategist), and ADR output format.

Usage Guidelines: 3/5

Implies usage through its description of activities (risk identification, acceptance criteria definition) but lacks explicit when-to-use guidance versus alternatives like consult-council or design-architecture. No mention of prerequisites or when NOT to use.

review-code (Review Code): Grade A

Code review council. Senior Engineer, Security Reviewer, and Performance Analyst analyze your code and a moderator synthesizes their findings.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| code | Yes | The code to review | |
| focus | No | Review focus areas (e.g., ["security", "performance"]) | |
| language | No | Programming language (auto-detected if not specified) | |
| webhook_url | No | Webhook URL to POST results to after completion | |
| thinking_level | No | Review depth: low (quick scan), medium (balanced), high (thorough) | medium |
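Several scores below note that webhook_url implies asynchronous delivery. Assuming results arrive as a JSON POST (the payload fields here are guesses, not a documented schema), a receiver might parse them like this:

```python
import json

# Hypothetical webhook handler body: parse the POSTed JSON and pull out
# the fields an agent would act on. All field names are assumptions.
def handle_review_webhook(body: bytes) -> dict:
    payload = json.loads(body or b"{}")
    return {
        "session_id": payload.get("session_id"),
        "findings": payload.get("findings", []),
    }

# Simulated delivery for illustration.
delivery = json.dumps(
    {"session_id": "sess_42", "findings": [{"severity": "high", "line": 10}]}
).encode()
parsed = handle_review_webhook(delivery)
print(parsed["session_id"], len(parsed["findings"]))
```

An empty or missing body degrades to empty findings rather than raising, which keeps the receiver tolerant of retries.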

Behavior: 3/5

Adds valuable context about the internal multi-agent process (the council metaphor with synthesis) not present in annotations. However, it fails to disclose the async behavior implied by the webhook_url parameter, does not clarify the openWorldHint implications (external AI calls), and omits idempotency behavior (each call creates a new review).

Conciseness: 5/5

A single sentence that efficiently establishes the metaphor, participants, and process flow. Every clause contributes to understanding the tool's unique multi-persona approach.

Completeness: 3/5

Adequate for basic invocation but incomplete given the tool's complexity. No output schema exists, yet the description doesn't characterize the return format (structured report? findings list?). It also fails to mention the asynchronous operation capability despite the webhook_url parameter.

Parameters: 3/5

With 100% schema coverage, the baseline is 3. The description implicitly maps 'Security Reviewer' and 'Performance Analyst' to likely focus-area values, adding semantic context, but does not explain webhook_url's async implication or thinking_level's depth trade-offs.

Purpose: 5/5

The description uses specific verbs and resources ('analyze your code,' 'synthesizes their findings') and distinguishes from sibling consult-council by specifying the unique 'council' composition (Senior Engineer, Security Reviewer, Performance Analyst, moderator) dedicated to code review.

Usage Guidelines: 2/5

No guidance on when to use this tool versus siblings like debug-issue, design-architecture, or assess-tradeoffs. Does not indicate prerequisites (e.g., code completeness) or when to prefer simpler alternatives.

set-thread-visibility (Set Thread Visibility): Grade A
Idempotent

Set a thread as public or private. Public threads can be shared via URL. Use session_id from a previous debate session.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| is_public | Yes | Set to true to make the thread publicly accessible, false to make it private | |
| session_id | Yes | The session ID from a previous debate | |

Output Schema

| Name | Required | Description |
| --- | --- | --- |
| isPublic | Yes | Current visibility state |
| publicUrl | Yes | Public share URL if thread is public |
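A small sketch of the round trip, with set_thread_visibility stubbed to mirror the output fields above; the URL format and the None-when-private behavior are assumptions, since the real server decides both.

```python
# Stub mirroring set-thread-visibility's documented output fields
# (isPublic, publicUrl). A real client would issue a tools/call instead.
def set_thread_visibility(session_id: str, is_public: bool) -> dict:
    url = f"https://example.com/t/{session_id}" if is_public else None
    return {"isPublic": is_public, "publicUrl": url}

first = set_thread_visibility("sess_42", True)
print(first["publicUrl"])

# Idempotent, per the tool's annotation: repeating the call is a no-op.
assert set_thread_visibility("sess_42", True) == first
# Making it private again clears the share URL (assumed behavior).
assert set_thread_visibility("sess_42", False)["publicUrl"] is None
```

The idempotency assertion reflects the tool's idempotentHint: setting the same visibility twice should yield the same state.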

Behavior: 4/5

The description adds valuable behavioral context beyond annotations by explaining that 'Public threads can be shared via URL', clarifying the functional consequence of the boolean flag. It correctly implies mutation (aligning with readOnlyHint=false) without contradicting the idempotentHint=true annotation.

Conciseness: 5/5

Three sentences efficiently cover: (1) core functionality, (2) the behavioral implication of public status, and (3) a critical usage prerequisite. Every sentence earns its place with no redundancy or filler.

Completeness: 5/5

For a simple 2-parameter mutation tool with full schema coverage, existing annotations declaring idempotency/safety, and an output schema present, the description provides complete context. It explains the 'why' (URL sharing) and the 'what' (visibility) without needing to detail return values.

Parameters: 4/5

Despite 100% schema coverage (baseline 3), the description adds provenance context for session_id ('from a previous debate session'), helping the agent understand the parameter's source. It also maps the abstract boolean to concrete concepts ('public or private').

Purpose: 5/5

The description opens with a precise verb ('Set') and resource ('thread visibility'), explicitly stating that the tool controls public/private status. It distinguishes from read-only siblings like 'get-thread-link' and 'get-session' by emphasizing the mutation aspect ('Set').

Usage Guidelines: 3/5

The description provides a critical prerequisite ('Use session_id from a previous debate session'), implying this tool requires an existing session. However, it lacks explicit guidance on when to choose this over alternatives like 'get-thread-link' or when visibility changes are appropriate.
