Canvas MCP Server
Server Quality Checklist
Latest release: v1.3.0
- Disambiguation: 3/5
Several tools have overlapping purposes, especially around announcement deletion (four distinct delete tools) and peer review reporting/analytics (multiple tools with similar functions). While most tools have distinct intents, the redundancy in certain areas could confuse an agent about which tool to use.
Naming Consistency: 5/5
Tool names consistently use a verb_noun pattern in snake_case (e.g., create_assignment, list_modules, delete_page). Even tools with 'get_my_' prefix follow a predictable convention. No mixing of styles.
Tool Count: 3/5
89 tools is a high count for a single server, but it reflects the breadth of Canvas LMS functionality (courses, assignments, modules, pages, discussions, announcements, rubrics, peer reviews, files, conversations, analytics, accessibility). The number is borderline but not unreasonable for a comprehensive integration.
Completeness: 4/5
The tool set covers most common Canvas operations: CRUD for key entities, bulk operations, analytics, and accessibility scanning. Minor gaps exist, such as no individual grade update (only bulk/submission grading) and no calendar event tools, but overall it's well-rounded.
Average 3.7/5 across 89 of 89 tools scored. Lowest: 2.6/5.
See the Tool Scores section below for per-tool breakdowns.
- 8 of 24 issues responded to in the last 6 months
- No commit activity data available
- Last stable release on
- No critical vulnerability alerts
- No high-severity vulnerability alerts
- No code scanning findings
- CI is passing
Add a LICENSE file by following GitHub's guide. Once GitHub recognizes the license, the system will automatically detect it within a few hours.
If the license does not appear after some time, you can manually trigger a new scan using the MCP server admin interface.
MCP servers without a LICENSE cannot be installed.
This repository includes a README.md file.
No tool usage detected in the last 30 days. Usage tracking helps demonstrate server value.
Tip: use the "Try in Browser" feature on the server page to seed initial usage.
This repository includes a glama.json configuration file.
This server has been verified by its author.
Add related servers to improve discoverability.
How to sync the server with GitHub?
Servers are automatically synced at least once per day, but you can also sync manually at any time to instantly update the server profile.
To manually sync the server, click the "Sync Server" button in the MCP server admin interface.
How is the quality score calculated?
The overall quality score combines two components: Tool Definition Quality (70%) and Server Coherence (30%).
Tool Definition Quality measures how well each tool describes itself to AI agents. Every tool is scored 1–5 across six dimensions: Purpose Clarity (25%), Usage Guidelines (20%), Behavioral Transparency (20%), Parameter Semantics (15%), Conciseness & Structure (10%), and Contextual Completeness (10%). The weighted result is the tool's definition quality score (TDQS). The server-level definition quality score is calculated as 60% mean TDQS + 40% minimum TDQS, so a single poorly described tool pulls the score down.
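As a rough illustration of the weighting described above, the following sketch computes a per-tool score and the server-level definition quality score. The function names and dictionary keys are hypothetical, and mapping the short labels used in the score cards (Purpose, Behavior, Parameters, and so on) onto the six named dimensions is an assumption. The example scores are those of the first tool listed under Tool Scores; the three-tool list passed to the second function is invented purely to show the arithmetic.

```python
# Dimension weights as stated in the paragraph above.
WEIGHTS = {
    "purpose": 0.25,       # Purpose Clarity
    "usage": 0.20,         # Usage Guidelines
    "behavior": 0.20,      # Behavioral Transparency
    "parameters": 0.15,    # Parameter Semantics
    "conciseness": 0.10,   # Conciseness & Structure
    "completeness": 0.10,  # Contextual Completeness
}


def tool_tdqs(scores):
    """Weighted 1-5 definition quality score for a single tool."""
    return sum(WEIGHTS[dim] * scores[dim] for dim in WEIGHTS)


def server_definition_quality(tdqs_values):
    """60% mean TDQS + 40% minimum TDQS across all scored tools."""
    mean = sum(tdqs_values) / len(tdqs_values)
    return 0.6 * mean + 0.4 * min(tdqs_values)


# Scores of the first tool in the Tool Scores section.
example = {
    "purpose": 4,
    "usage": 2,
    "behavior": 2,
    "parameters": 2,
    "conciseness": 3,
    "completeness": 2,
}

print(round(tool_tdqs(example), 2))  # 2.6

# Hypothetical three-tool server, only to show the mean/minimum blend.
print(round(server_definition_quality([2.6, 3.7, 4.8]), 2))  # 3.26
```

The 2.6 result equals the "Lowest: 2.6/5" figure reported in the summary, although the page does not say which tool that figure belongs to. Because the minimum carries 40% of the weight, improving the weakest description moves the server score more than polishing an already strong one.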
Server Coherence evaluates how well the tools work together as a set, scoring four dimensions equally: Disambiguation (can agents tell tools apart?), Naming Consistency, Tool Count Appropriateness, and Completeness (are there gaps in the tool surface?).
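Since the four coherence dimensions are weighted equally, the coherence score is presumably a plain average. A minimal sketch, with a hypothetical function name, using the four scores shown in the checklist at the top of this page:

```python
def server_coherence(disambiguation, naming_consistency, tool_count, completeness):
    """Equal-weight average of the four Server Coherence dimensions."""
    return (disambiguation + naming_consistency + tool_count + completeness) / 4


# Disambiguation 3, Naming Consistency 5, Tool Count 3, Completeness 4.
print(server_coherence(3, 5, 3, 4))  # 3.75
```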
Tiers are derived from the overall score: A (≥3.5), B (≥3.0), C (≥2.0), D (≥1.0), F (<1.0). B and above is considered passing.
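The 70/30 combination and the tier thresholds can be sketched as follows. The function names are hypothetical, and the inputs in the usage line are illustrative values, not this server's published score.

```python
def overall_score(definition_quality, coherence):
    """Tool Definition Quality (70%) combined with Server Coherence (30%)."""
    return 0.7 * definition_quality + 0.3 * coherence


def tier(score):
    """Map an overall score to a tier using the thresholds listed above."""
    for letter, floor in (("A", 3.5), ("B", 3.0), ("C", 2.0), ("D", 1.0)):
        if score >= floor:
            return letter
    return "F"


def is_passing(score):
    """B and above is considered passing."""
    return tier(score) in ("A", "B")


# Illustrative inputs only.
score = overall_score(3.26, 3.75)
print(tier(score), is_passing(score))  # B True
```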
Tool Scores
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description only mentions 'analyze' and 'send,' but does not disclose what analysis is performed, whether data is modified, or any side effects. For an action-oriented tool, this is insufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness: 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very short and lacks structure (no sections, examples, or context). While concise, it sacrifices necessary detail, making it under-specified for an agent to understand the tool's full purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness: 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having an output schema (not shown), the description does not explain the workflow's output or behavior. With multiple sibling tools handling peer review reminders, the description fails to clarify what makes this 'followup campaign' distinct or complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters: 2/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description adds minimal information: it lists parameter names with brief type hints ('Course code or Canvas ID' and 'Canvas assignment ID'), which are mostly redundant with the schema's anyOf definition. No explanation of format, constraints, or usage is provided.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description states 'analyze peer reviews and send targeted reminders,' clearly indicating the tool's action and resource. However, it does not differentiate from similar sibling tools like send_peer_review_reminders, which also sends reminders related to peer reviews.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
There is no guidance on when to use this tool versus alternatives. The description merely says 'Complete workflow' without specifying context, prerequisites, or conditions for use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
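The evaluations in this section repeatedly cite "schema description coverage" without defining it. One plausible reading, offered here as an assumption rather than the scorer's actual method, is the share of input-schema properties that carry a non-empty `description`. The sketch below also checks which parameter names a tool description mentions. The schema and description literals are hypothetical reconstructions based on the evaluation text above (two identifiers typed with `anyOf`); the real schema is not shown on this page.

```python
def schema_description_coverage(input_schema):
    """Fraction of schema properties that have a non-empty description."""
    props = input_schema.get("properties", {})
    if not props:
        return 1.0  # nothing to describe
    described = sum(
        1 for prop in props.values() if str(prop.get("description", "")).strip()
    )
    return described / len(props)


def params_missing_from_description(input_schema, description):
    """Parameter names that the free-text description never mentions."""
    return [
        name
        for name in input_schema.get("properties", {})
        if name not in description
    ]


# Hypothetical schema: no property carries a description.
schema = {
    "type": "object",
    "properties": {
        "course_identifier": {"anyOf": [{"type": "string"}, {"type": "integer"}]},
        "assignment_id": {"anyOf": [{"type": "string"}, {"type": "integer"}]},
    },
}

# Hypothetical description text in the Args style the evaluations mention.
description = (
    "Args:\n"
    "    course_identifier: Course code or Canvas ID\n"
    "    assignment_id: Canvas assignment ID\n"
)

print(schema_description_coverage(schema))  # 0.0
print(params_missing_from_description(schema, description))  # []
```

In this reading, 0% coverage means the free-text description is the only place an agent can learn what each parameter means, which is why the Parameters dimension rewards descriptions that compensate.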
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It does not mention that the tool overwrites existing content, requires specific permissions, or any potential side effects. The term 'edit' implies mutation, but no warnings or details about reversibility, rate limits, or authentication are given.
Conciseness: 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short and structured with an 'Args' block. It is efficient but could be slightly more concise by removing redundant phrasing.
Completeness: 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, return values need not be explained. However, the description lacks critical context like prerequisites (e.g., user permissions), the fact that it overwrites existing page content, and how it compares to sibling tools. This leaves an agent without enough information to use it correctly in all cases.
Parameters: 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description provides brief but helpful explanations for each parameter beyond what the schema (which lacks property descriptions) offers. For example, 'course_identifier: Course code or Canvas ID' clarifies valid input. However, the explanations are minimal: 'new_content' is described only as 'New HTML content', with no format constraints or size limits.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description states the verb 'Edit' and resource 'content of a specific page', clearly indicating the action. It distinguishes from sibling tools like 'update_page_settings' and 'create_page', though not explicitly.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives no guidance on when to use this tool versus alternatives like 'update_page_settings' or 'bulk_update_pages', and it does not mention prerequisites or context.
- Behavior: 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description's claim of listing is consistent. The description adds minimal behavioral context beyond the annotation, such as the scope of courses (authenticated user), but does not discuss pagination or other behaviors.
Conciseness: 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with no redundant words. It is efficiently front-loaded and earns its place.
Completeness: 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has two optional parameters and an output schema, the description provides the basic purpose but lacks explanation of the parameters and their effect on the return data. It is minimally complete for a simple list tool but could be more informative.
Parameters: 1/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are two boolean parameters (include_concluded, include_all) with 0% schema description coverage. The description does not explain these parameters or their purpose, leaving the agent without guidance on how to use them.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists courses for the authenticated user. While the verb and resource are specific, it does not differentiate from other list_* siblings, but the purpose is unambiguous.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. The description implies the scope (authenticated user) but does not provide context about when to include concluded or all courses, nor how this tool differs from other list tools.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behavioral traits. It only mentions that the tool posts a reply and gives no information on side effects, idempotency, restrictions (e.g., deadlines), or success/failure behavior.
Conciseness: 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: two lines for purpose then clean argument list. Every sentence is necessary. Front-loaded with the action. No wasted words.
Completeness: 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no annotations to lean on, the description is incomplete. It does not mention the output schema, required permissions, or how to obtain entry_id. This is minimal context for a write tool with 4 required parameters, and sibling tools are not differentiated.
Parameters: 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description must compensate. Provides brief but helpful explanations for each parameter (e.g., 'Course code or Canvas ID' for course_identifier), adding some meaning beyond schema titles. However, descriptions are minimal and could be more detailed.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Reply to a student's discussion entry/comment' with specific verb and resource. Distinguishes from sibling 'post_discussion_entry' by implying it's a reply to an existing entry, though not explicit.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives no guidance on when to use this tool versus alternatives like 'post_discussion_entry', and it names no prerequisites (e.g., needing a valid topic_id and entry_id). It just states the action without context.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description bears full responsibility for disclosing behavioral traits. It does not mention side effects, authentication requirements, rate limits, or error handling (e.g., what happens if the course doesn't exist). The description focuses solely on parameter descriptions.
Conciseness: 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is adequately structured with an Args block, but it is lengthy due to listing all 14 parameters. While no information is redundant, the verbosity could be trimmed by grouping similar parameters (e.g., date-related). It lacks a concise summary sentence at the start.
Completeness: 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (14 parameters, no annotations, 0% schema coverage), the description covers parameter meanings but omits the return value structure (despite having an output schema). It also does not specify behavior when optional fields are omitted. Overall, it meets minimum requirements but has gaps.
Parameters: 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It provides meaningful explanations for each parameter, such as 'Course code or Canvas ID' for course_identifier and ISO 8601 format for dates. This adds significant value over the raw schema titles.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool creates an assignment in a course, with a specific verb and resource. However, it does not differentiate from sibling tools like 'update_assignment' or 'list_assignments', missing an opportunity to clarify when to use this tool over others.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
There is no guidance on when to use this tool versus alternatives, nor any mention of prerequisites or situations where it should not be used. The description only explains the action, leaving the agent to infer context from sibling tool names.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are absent, so the description carries the full burden. It only mentions creating a CSV file without disclosing side effects (e.g., file overwrite behavior), authentication needs, or whether it modifies any data. The return value is not described despite an output schema existing.
Conciseness: 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very short (two lines plus an Args block) and front-loaded, with no fluff. However, it omits important details like output schema context, which would fit naturally without bloating.
Completeness: 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite having an output schema, the description does not mention what the tool returns (e.g., the map, a status). It also lacks context about the CSV file location (local to server or client) and any side effects. The tool is simple but incomplete for safe use.
Parameters: 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description clarifies the parameter 'course_identifier' as 'Course code or Canvas ID', adding meaning beyond the schema's type information. It could be more specific about allowed formats or constraints.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it creates a local CSV file mapping real student data to anonymous IDs for a course, specifying the verb 'Create' and the resource. However, it doesn't differentiate from the sibling tool 'get_anonymization_status' and could be more explicit about the scope.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, and no prerequisites or exclusions are mentioned. Siblings like 'get_anonymization_status' exist but are not referenced.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must convey behavioral traits. It only lists parameters and their basic meaning, omitting details like side effects (e.g., reordering on position change), failure modes (e.g., invalid item_id), or permission requirements. The 'move_to_module_id' parameter hints at cross-module moves, but the implications are not explained.
Conciseness: 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description begins with a clear one-line purpose, then enumerates all 12 parameters in a list format. This is structured but somewhat lengthy. Some parameter descriptions are redundant with common sense (e.g., 'New item title'), but the list is comprehensive and not excessively verbose.
Completeness: 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (12 parameters, 3 required, no annotations, but an output schema exists), the description covers all parameters but lacks broader context. Missing elements include error handling, permission requirements, valid value ranges for position/indent, and interaction effects when multiple fields are updated. The description is adequate for a straightforward update but not fully complete.
Parameters: 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Since the input schema provides no descriptions (0% coverage), the parameter list in the description is essential. Each parameter gets a brief explanation (e.g., 'Course code or Canvas ID', 'New completion type, or empty string to remove'), adding significant meaning beyond the schema names. Some details could be more precise (e.g., position indexing), but overall it compensates well.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Update an existing module item,' establishing a specific verb and resource. However, it does not explicitly differentiate from sibling tools like update_module or delete_module_item, which share the same module context.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., for creating items use add_module_item, for deleting use delete_module_item). There is no mention of prerequisites, setup steps, or common pitfalls.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits but only states the core operation. It omits side effects (e.g., whether it overwrites existing associations, triggers notifications) and does not explain the impact of the 'use_for_grading' and 'purpose' parameters beyond their defaults. This is insufficient for a tool that modifies state.
Conciseness: 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with a clear front-loaded purpose sentence followed by a structured parameter list. It avoids extraneous information, though the param descriptions could be slightly more compact. Overall, well-organized for quick scanning.
Completeness: 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters (3 required) and no annotations, the description covers purpose and parameter meanings but lacks usage guidance, behavioral details, and error handling context. The presence of an output schema reduces the need to describe return values, but the description still feels incomplete for a production tool.
Parameters: 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, so the description must compensate. It explains each parameter with helpful context (e.g., 'Course code or Canvas ID' for course_identifier, 'grading, bookmark' for purpose). This adds significant value beyond the bare schema types and titles.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Associate an existing rubric with an assignment,' which is a specific verb-resource combination. It distinguishes from siblings like create_rubric or grade_with_rubric by specifying 'existing rubric,' but additional context differentiating from similar association tools would improve clarity.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. There are no prerequisites mentioned (e.g., rubric must exist, assignment must exist), nor exclusions for when not to use it. This leaves the agent without decision context.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint: true, so the agent knows it's safe. The description adds no behavioral traits beyond that, such as auth needs or output structure.
Conciseness: 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two sentences, but the second sentence is an arg description that could be merged. Overall efficient and front-loaded.
Completeness: 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description does not need to detail return values. However, it omits any mention of what 'detailed information' includes, leaving some ambiguity.
Parameters: 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description clarifies the parameter 'course_identifier' as 'Course code or Canvas ID', adding semantic meaning beyond the schema's title. This compensates for the 0% schema description coverage.
Purpose: 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Get detailed information about a specific course.' This distinguishes it from sibling tools like get_assignment_details or get_course_structure, though not explicitly.
Usage Guidelines: 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives, such as get_course_content_overview or other detail tools. The description lacks any context for selection.
- Behavior: 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=true, so the tool is read-only. The description adds no additional behavioral traits beyond the name, such as what data is returned (grades, student names) or if pagination is involved.
Conciseness: 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise with one sentence plus parameter explanations. It is front-loaded with the main purpose. It could be slightly more structured, but it is efficient.
Completeness: 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 params, read-only, output schema exists), the description is adequate but lacks scoping details (e.g., all submissions or filtered?) and does not address differentiation from siblings. It meets minimum viability.
Parameters: 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description provides meaningful explanations for both parameters: course_identifier as 'Course code or Canvas ID' and assignment_id as 'Canvas assignment ID'. This adds value beyond the schema titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it lists submissions for a specific assignment, using a specific verb and resource. However, it does not differentiate from sibling tools like get_my_submission_status or list_assignments.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool vs alternatives (e.g., get_my_submission_status for own submissions). There are no exclusion criteria or context for usage.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
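To illustrate the "use X instead of Y when Z" pattern named above, here is a minimal sketch of a tool docstring that carries usage guidance. The function name `list_submissions_example`, the wording of the guidance, and the `has_usage_guidance` check are all hypothetical; the sibling tool names are the ones cited in this review, and the claim that the tool returns submissions from all students is an assumption.

```python
def list_submissions_example(course_identifier: str, assignment_id: int) -> list:
    """List submissions for one assignment in a course.

    When to use (illustrative wording, not the server's actual text):
    - Use this tool to see submissions from all students on one assignment
      (assumed scope).
    - Use get_my_submission_status instead when you only need your own
      submissions.
    - Use list_assignments first if you do not yet know the assignment ID.

    Args:
        course_identifier: Course code or Canvas ID.
        assignment_id: Canvas assignment ID.
    """
    # Stub body: the sketch is about the description, not the implementation.
    return []


def has_usage_guidance(description: str) -> bool:
    """Crude, hypothetical check: does the text name an alternative and a condition?"""
    text = description.lower()
    return "instead" in text and "when" in text
```

A keyword check like this is far weaker than the review's own scoring and is only meant to show what the guidance looks like in practice.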
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must disclose behavioral traits. It only states the action without detailing side effects, permission requirements, whether the operation is idempotent, or what the output schema contains. The agent lacks critical context for safe invocation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, consisting of two sentences with no extraneous information. It is front-loaded, immediately stating the purpose, then covering the parameter. Every word is necessary.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and the presence of an output schema (not shown), the description is minimally complete. It lacks context about error handling, relationship to other conversation tools, and use cases. For a standalone mutation, it suffices but is not comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds a brief explanation for the only parameter: 'List of conversation IDs to mark as read.' This clarifies the purpose beyond the schema, which only has a title 'Conversation Ids' and type. However, it provides no additional constraints like format or expected length.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Mark multiple conversations as read.' This verb+resource pair is specific and distinct from sibling tools, none of which perform a similar 'mark' operation. However, it could be more explicit about what 'mark as read' entails (e.g., whether it affects unread counts or acknowledges all messages).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'get_unread_count' or 'list_conversations.' It does not mention prerequisites, constraints, or scenarios where the tool should be avoided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must disclose behaviors. It only lists parameters; it omits side effects, permission requirements, or what happens after creation (e.g., visibility, notifications).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is compact: one sentence stating purpose followed by a bullet list of arguments. No unnecessary text, and the key action is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers parameter semantics but lacks behavioral context (e.g., does it require an existing course?). For a 7-parameter tool with no annotations, it is minimally adequate but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so descriptions add necessary meaning. Each parameter has a concise explanation (e.g., 'ISO 8601 datetime to schedule posting'). Some descriptions are tautological (e.g., 'Discussion topic title'), but overall they compensate well.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Create a new discussion topic for a course' with a specific verb and resource. The wording implicitly separates it from siblings like create_announcement, but it does not explicitly differentiate itself from them, leaving some ambiguity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., create_announcement). With many sibling creation tools, the agent needs context on selection criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
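The review above quotes one parameter as an 'ISO 8601 datetime to schedule posting'. As a rough sketch of what that format implies for a caller, the helper below parses such a string with the standard library. The name `parse_schedule_time` is hypothetical, and treating inputs without a timezone as UTC is an assumption, not documented server behavior.

```python
from datetime import datetime, timezone


def parse_schedule_time(value: str) -> datetime:
    """Parse an ISO 8601 datetime string and normalize it to UTC.

    Hypothetical helper; naive inputs are assumed to be UTC.
    """
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so rewrite it as an explicit offset for older versions.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    parsed = datetime.fromisoformat(value)  # raises ValueError on bad input
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)
```

Stating an accepted example value such as `2025-01-15T09:00:00Z` in the parameter description would spare an agent from guessing the format.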
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, indicating safe read operation. Description does not add further behavioral details (e.g., performance, error conditions). Acceptable but not enhanced.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Docstring format with Args block is clear but somewhat verbose for the number of parameters. Could be more concise, but front-loads purpose effectively.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers parameter semantics well but lacks usage context, behavioral notes, and differentiation from sibling tools. Output schema exists, so return values are handled elsewhere. Adequate but not comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description provides clear parameter meanings (e.g., 'Course code or Canvas ID') and specifies allowed values for report_type and format_type. However, defaults (e.g., comprehensive, markdown, false) are not mentioned.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb 'Create' and resource 'instructor-ready reports on peer review quality'. However, does not distinguish from sibling 'generate_peer_review_report'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'generate_peer_review_report' or 'analyze_peer_review_quality'. Missing context on prerequisites or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, indicating a safe read operation. The description adds 'detailed analytics' but does not elaborate on what data is returned (e.g., scores, statistics). While not contradictory, it offers minimal behavioral insight beyond the annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with a single opening sentence and an args list. The main purpose is front-loaded. The args list is somewhat redundant with the schema but remains clear and brief.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is simple (two required parameters), has an output schema to document return values, and annotations indicate read-only. The description, while sparse, is sufficient for a straightforward read operation given the supporting structured information.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage; the description provides brief explanations for each parameter (e.g., 'course_identifier: Course code or Canvas ID'). This adds some value but is minimal, leaving ambiguity about exact formats or allowed values.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves analytics on student performance for a specific assignment. However, among siblings like 'get_assignment_details' and 'get_student_analytics', it does not differentiate the type or scope of analytics provided.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. The description does not mention prerequisites, context, or when to avoid using it. Sibling tools exist but no comparisons are made.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already include 'readOnlyHint: true', so the read-only nature is clear. The description adds no further behavioral context (e.g., permissions, rate limits) beyond what is implied.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short and front-loaded with the main action. Including parameter descriptions is useful, but the Args block could be considered redundant given the schema. Overall, it's efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
An output schema exists, so explaining return values is not required. However, the description does not elaborate on what 'detailed information' includes, leaving some ambiguity. It meets minimal completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It explains that 'course_identifier' is a course code or Canvas ID and 'assignment_id' is a Canvas assignment ID, adding value beyond the schema titles. However, more detail on formats or constraints would help.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get detailed information about a specific assignment', which is a specific verb-resource pair. It distinguishes itself from sibling tools like 'list_assignments' or 'get_assignment_analytics' by emphasizing a single assignment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'list_assignments' or 'get_assignment_analytics'. It lacks context for choosing this tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true. The description adds no further behavioral context (e.g., performance, rate limits, or side effects). It only repeats the purpose and parameter list.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise: one sentence for purpose followed by a structured Args section. Every sentence is necessary, and the most important information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description is adequate but missing potential details like pagination, ordering, or what the tool returns. For a list tool, this could be more informative.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It provides meaningful explanations for both parameters (e.g., 'Course code or Canvas ID' for course_identifier, default value for include_announcements). This adds value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List discussion topics') and scope ('for a specific course'). It is unambiguous but does not explicitly differentiate from siblings like get_discussion_topic_details or list_discussion_entries.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. No when-to-use or when-not-to-use information is given, leaving the agent to infer from the name and context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, indicating a safe read operation. The description adds that it lists 'all' peer reviews, implying a full list return. However, it does not disclose details like pagination, ordering, or filtering options, which would be valuable for behavioral understanding.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, consisting of two short sentences followed by parameter explanations. It is front-loaded with the main purpose and avoids unnecessary details. However, the parameter list could be integrated more succinctly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description does not need to explain return values. It adequately identifies the two required parameters and the list operation. However, it lacks context about optional filters, result ordering, or any behavioral nuances that might affect agent decisions.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description partially compensates by adding brief clarifications: 'course_identifier' can be a course code or Canvas ID, and 'assignment_id' is specifically a Canvas assignment ID. This adds some meaning beyond the schema titles but is minimal and does not cover all aspects like formatting or validation.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists peer review assignments for a specific assignment, using a specific verb ('List') and resource ('peer review assignments'). However, it does not differentiate from a sibling tool 'get_peer_review_assignments', which likely serves a similar purpose, limiting clarity in distinguishing between them.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like 'get_peer_review_assignments' or when not to use it. It only lists required parameters, leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description should disclose behavioral traits like permissions needed, side effects (e.g., notifications sent), or constraints. It only says 'manually assign' without any such details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficient with a clear purpose sentence followed by a structured Args list. No unnecessary words, though the 'Args:' header could be omitted for brevity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is a mutation with no annotations and no output schema details. The description lacks information about return values, error handling, or specific behaviors (e.g., overwriting existing assignments). It is incomplete for safe invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description provides meaningful explanations for all four parameters beyond the schema titles, such as 'Course code or Canvas ID' and 'User ID of the student who will review'. This adds value given 0% schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (assign), resource (peer review), and target (student, assignment). It uses a specific verb and resource, and distinguishes itself from siblings like list_peer_reviews, which are read-only.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like get_peer_review_assignments or list_peer_reviews. There are no prerequisites or conditions mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already set readOnlyHint=true, so the description adds minimal behavioral context beyond the safety aspect. It does not disclose potential errors or authorization needs.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise two-sentence structure with an Args list. No unnecessary verbosity, though the phrase 'from Canvas course pages' could be implied.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the existence of sibling tools like parse_ufixit_violations and fix_accessibility_issues, the description could mention relationships. The presence of an output schema mitigates the need to describe return values.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description adds meaningful explanations: 'Course code or Canvas ID' for course_identifier and 'Title of the UFIXIT report page (default: "UFIXIT")' for page_title.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Fetch UFIXIT accessibility report from Canvas course pages' with a specific verb and resource. It implicitly differs from siblings like parse_ufixit_violations by focusing on fetching rather than parsing, but does not state that difference explicitly.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like parse_ufixit_violations. The description lacks any context about prerequisites or exclusion cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the safety profile is known. The description adds that it retrieves the user's own assignments across courses and mentions the `days` parameter, but does not disclose other behaviors like sorting or exclusion criteria.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is very concise (two sentences) and front-loads the purpose. However, it lacks structural elements like bullet points that could improve readability.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description is adequate for a simple tool with an output schema, but does not cover aspects like how assignments are sorted, whether past-due items are included, or any limitations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 2/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, yet the description only repeats the schema's default and purpose. It adds no new meaning beyond the schema, such as valid ranges or behavior when omitted.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get your upcoming assignments across all courses,' specifying the verb (Get), resource (upcoming assignments), and scope (across all courses). This distinguishes it from siblings like `list_assignments`, which are course-specific.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like `list_assignments` or `get_my_todo_items`. The description lacks when-not-to-use or contextual cues.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description does not add behavioral details beyond the annotation (readOnlyHint=true). It does not explain the output format or what happens when criteria are applied. The annotation already indicates no mutations, so the description is not harmful but minimal.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise—one line for purpose and a simple Args list. No unnecessary words. Could be improved by structuring the usage guidance, but it is efficient as is.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description need not detail return values. However, it lacks context on when to use the tool and how the criteria parameter impacts behavior. Adequate for basic use but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema description coverage, the description's 'Args' section explains each parameter: course_identifier as 'Course code or Canvas ID', assignment_id as 'Canvas assignment ID', and criteria as 'JSON string of custom flagging criteria'. This adds meaning beyond the schema's type-only constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool 'flags reviews that may need instructor attention', specifying the verb 'flag' and the resource 'reviews'. It distinguishes from siblings like 'analyze_peer_review_quality' by focusing on flagging for instructor attention rather than general analysis. However, it does not define what 'problematic' means.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives such as analyze_peer_review_quality or generate_peer_review_report. No exclusions or prerequisites mentioned, leaving the agent to infer usage context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
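As an illustration of the "use X instead of Y when Z" pattern, the sketch below assembles a description that names its alternatives. The sibling tool names come from the review above; the helper `with_usage_guidance` and the conditions attached to each tool are assumptions made for the example, not the server's documented behavior.

```python
def with_usage_guidance(purpose: str, use_when: str, alternatives: dict[str, str]) -> str:
    """Append explicit usage guidance to a one-line purpose statement."""
    lines = [purpose, "", f"Use when: {use_when}"]
    for tool, condition in alternatives.items():
        # One "use X instead when Z" line per sibling tool.
        lines.append(f"Use {tool} instead when {condition}.")
    return "\n".join(lines)


# The conditions below are illustrative assumptions only.
description = with_usage_guidance(
    purpose="Flag reviews that may need instructor attention.",
    use_when="you need a short list of individual reviews to follow up on.",
    alternatives={
        "analyze_peer_review_quality": "you want overall quality metrics",
        "generate_peer_review_report": "you need a shareable summary",
    },
)
print(description)
```

Keeping the purpose on the first line preserves the front-loading that the Conciseness dimension rewards, while the guidance lines address the Usage Guidelines gap.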
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description is consistent with the annotations (readOnlyHint=true) as it describes a list operation. However, it adds no behavioral details beyond the annotation—such as whether the tool returns only metadata or full objects, or any performance implications. The annotation already covers the safety profile, so a baseline score is appropriate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with a clear one-line purpose, followed by a structured list of parameters. It is appropriately sized for a 5-parameter tool, though the parameter comments could be more concise (e.g., 'Include participant info' could be 'Include participants'). No wasted sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema and annotations, the description is adequate for a list tool. It covers input parameters but lacks any mention of output behavior (e.g., pagination, ordering). With 5 parameters and 0% schema coverage, the description could be more complete, but it meets the minimum viable threshold.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 2/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description must compensate for parameter meaning. It lists parameter names and one-line labels (e.g., 'scope: unread, starred...'), but lacks detailed semantics: what does filter_mode 'and' vs 'or' actually do? How does include_all_ids differ from include_participants? The defaults are given but not explained. This provides minimal value beyond the schema itself.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists conversations for the current user, with a specific verb and resource. It distinguishes from sibling tools like send_conversation or get_conversation_details by being a list operation, and the parameter list further clarifies its scope.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like get_conversation_details or mark_conversations_read. The description does not mention context, prerequisites, or when to avoid using it. The agent receives no help in deciding between this and other conversation tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
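The "0% schema description coverage" figure cited in these reviews can be pictured with a small sketch. The source does not state how the metric is computed, so the function below assumes it is the fraction of top-level input properties that carry a non-empty `description`. The parameter names `scope` and `filter_mode` come from the review; the description strings are illustrative wording, not the server's text.

```python
def description_coverage(input_schema: dict) -> float:
    """Fraction of top-level properties that carry a non-empty description."""
    properties = input_schema.get("properties", {})
    if not properties:
        return 1.0  # assumption: a schema with no properties counts as fully covered
    described = sum(
        1 for spec in properties.values() if str(spec.get("description", "")).strip()
    )
    return described / len(properties)


# Type-only schema, as described in the review: structure without intent.
bare_schema = {
    "type": "object",
    "properties": {
        "scope": {"type": "string"},
        "filter_mode": {"type": "string"},
    },
}

# Same schema with illustrative descriptions added.
documented_schema = {
    "type": "object",
    "properties": {
        "scope": {
            "type": "string",
            "description": "Which conversations to list, e.g. unread or starred.",
        },
        "filter_mode": {
            "type": "string",
            "enum": ["and", "or"],
            "description": "How multiple filters combine (illustrative wording).",
        },
    },
}

print(description_coverage(bare_schema), description_coverage(documented_schema))
```

Moving parameter explanations into the schema itself would let the tool description focus on purpose and usage guidance.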
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavioral traits. It only names modes ('sync' or 'async') and mentions templates, but omits crucial details: idempotency, error handling, rate limits, return value, or side effects. The safety profile is unclear for a messaging tool.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise (8 lines) and front-loaded with the purpose. The 'Args:' section is structured, but some entries are redundant (e.g., repeating parameter names). No wasted sentences, though it could be slightly tighter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 2/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity of bulk messaging and lack of annotations, the description is incomplete. It does not explain behavior for async mode, failure handling, or whether the tool is destructive. Although an output schema exists (not mentioned), the description misses critical usage contexts for a tool that sends messages to many recipients.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description must explain parameters. It does provide brief explanations (e.g., 'List of dicts with recipient info and template variables'), but lacks details: expected keys in recipient_data dicts, exact placeholder syntax, and constraints on mode values. Partial guidance, not fully compensating for missing schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool sends customized messages to multiple recipients using templates, with a specific verb ('Send') and resource ('customized messages to multiple recipients'). It distinguishes from sibling tools like 'send_conversation' (single message) and peer review campaign tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for bulk templated messages by mentioning 'multiple recipients' and 'templates', but does not explicitly state when to prefer this over alternatives like 'send_conversation', nor does it provide when-not-to-use guidance or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
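For a tool reviewed as having "no annotations", a check along the following lines could show which behavioral hints are missing. The hint names are the tool-annotation fields defined by the MCP specification; the tool name `hypothetical_bulk_message_tool` and the hint values chosen for it are assumptions for illustration, not the server's actual configuration.

```python
# Hint names follow the MCP tool-annotation fields.
BEHAVIOR_HINTS = ("readOnlyHint", "destructiveHint", "idempotentHint", "openWorldHint")


def missing_hints(tool: dict) -> list[str]:
    """Return the behavioral hints a tool definition does not declare."""
    annotations = tool.get("annotations") or {}
    return [hint for hint in BEHAVIOR_HINTS if hint not in annotations]


# Hypothetical tool definition with no annotations at all.
unannotated_tool = {
    "name": "hypothetical_bulk_message_tool",
    "description": "Send customized messages to multiple recipients using templates.",
}

# Assumed values for a messaging tool: it writes, and repeated calls may resend.
annotated_tool = {
    **unannotated_tool,
    "annotations": {
        "readOnlyHint": False,
        "destructiveHint": False,
        "idempotentHint": False,
        "openWorldHint": True,
    },
}

print(missing_hints(unannotated_tool))
print(missing_hints(annotated_tool))
```

Annotations only cover the safety profile; details such as failure handling in async mode would still belong in the description text.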
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It does not disclose side effects, permission requirements, or behavioral traits like idempotency or error conditions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, with a single action sentence followed by a parameter list, but the structure could be improved by separating usage context from parameter details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Adequately covers basic purpose and parameters, but lacks behavioral context and additional usage details. Output schema exists, so return values are assumed covered.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description includes a docstring-style list that explains each parameter (e.g., 'ISO 8601 datetime to schedule posting'), compensating for the 0% schema description coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Create' and the resource 'announcement for a course', distinguishing it from sibling tools like create_assignment or list_announcements.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives, no prerequisites or conditions mentioned, and no explicit when-not-to-use advice.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description must fully disclose behavior. It indicates creation but does not mention permissions, what happens on duplicate modules, or return value expectations despite an output schema being present.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with a clear purpose sentence and then lists parameters in a structured format. It is reasonably concise but the list could be slightly tighter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers all parameters adequately but omits behavioral context (e.g., return value, side effects, permission requirements). Given the presence of an output schema, this is acceptable but not fully sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description provides clear inline explanations for each parameter (e.g., 'Course code or Canvas ID', 'Position in module list (1-indexed)'), adding meaningful context beyond the schema's titles and types. With 0% schema description coverage, this is valuable.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Create a new module in a course' with a specific verb and resource. It is easily distinguishable from sibling tools like update_module or add_module_item.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., update_module) or any prerequisites. The description only lists parameters without contextual usage advice.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are present, and the description does not disclose behavioral traits such as authorization needs, rate limits, or side effects (e.g., overwriting existing pages, failure scenarios). It only states defaults, not behaviors beyond that.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded with the purpose statement. The Args block is structured but could be slightly more compact. Every sentence adds value, with no redundant filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 6 parameters and lack of annotations, the description covers creation semantics adequately but omits usage guidelines and behavioral transparency. The presence of an output schema reduces the need to explain return values, but completeness is moderate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage, but the description's Args section provides clear meaning for each parameter, including types, defaults, and purpose (e.g., 'published: Whether to publish (default: True)'), significantly compensating for the schema gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Create a new page in a Canvas course' using a specific verb and resource, clearly distinguishing it from sibling tools like update_page_settings, edit_page_content, and delete_page.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., edit_page_content for modifying existing pages, bulk_update_pages for batch operations). The description implies usage for creation but does not specify exclusions or contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The annotations already declare readOnlyHint=true, indicating a safe read operation. The description adds minimal behavioral context beyond that, such as that it retrieves scores for a specific submission. It does not disclose any additional traits like return format (though output schema exists) or potential errors. Since annotations cover the safety profile, the description's value is moderate, earning a 3.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: one sentence plus a list of parameter descriptions. Every word is purposeful, and the structure is front-loaded with the action. No unnecessary information is included, making it efficient for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (3 parameters, read-only, with an output schema), the description is mostly complete. It identifies the parameters and the core action. It could mention that a rubric must be associated with the assignment, but this is not critical. The presence of an output schema reduces the need to describe return values. Overall, it is sufficiently complete for its context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, so the description's parameter descriptions are crucial. The description provides brief explanations for each parameter (e.g., 'Canvas ID' for assignment_id). However, these are very basic and do not add details about expected formats, constraints, or how to obtain the IDs. The description partially compensates for the schema gap but is minimal, resulting in a score of 3.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get rubric assessment scores') and the resource ('for a specific submission'). It distinguishes the tool from siblings like 'get_rubric' (which gets rubric details) and 'grade_with_rubric' (which writes grades), as it is explicitly for reading assessment scores. However, it does not explicitly differentiate from other similar read tools, so it loses a point.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide any guidance on when to use this tool versus alternatives. It merely states the action and parameters. There is no mention of prerequisites (e.g., assignment must have a rubric), nor any when-not-to-use instructions. This lack of usage context makes it hard for an AI to decide when this tool is appropriate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description should disclose behavioral traits such as side effects, permissions, or idempotency. It only states 'post' without explaining what happens on success, error conditions, or whether the action is reversible.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: one sentence plus a minimal parameter list with no extraneous information. The structure is front-loaded and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simplicity (3 simple parameters), the description is mostly complete. It could mention that the topic must already exist and that this creates a new thread. The output schema exists to define return values, so missing that info is acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 3/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description must compensate. It provides brief parameter descriptions: 'Course code or Canvas ID' for course_identifier, 'Discussion topic ID' for topic_id, and 'Entry message content' for message. This adds some clarity over schema titles but lacks constraints or examples.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Post a new top-level entry to a discussion topic,' which specifies the action (post), target (discussion topic), and scope (top-level entry). This distinguishes it from sibling tools like 'reply_to_discussion_entry' which is for replies.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It does not mention prerequisites (e.g., topic must exist), exclusions, or comparisons to similar tools like 'create_discussion_topic'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, so the description is consistent with a safe read operation. It adds that the tool returns the caller's own submissions and supports optional filtering. However, it does not disclose any behavioral details beyond what the name implies.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, with two sentences and a parameter description. No redundant information; every part earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (single param, output schema present), the description provides sufficient context for invocation. It explains the parameter and purpose. Missing details about output format are covered by the output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description adds meaningful semantics: it explains that course_identifier can be a course code or Canvas ID and that omitting it queries all courses. This compensates well for the lack of schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves submission status for assignments, and the name includes 'my' to indicate current user. However, it does not explicitly distinguish from sibling tools like 'list_submissions' which may be used by instructors.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. The only usage hint is about omitting course_identifier for all courses, but no context about prerequisites or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description states it retrieves the 'full content body', adding context beyond the readOnlyHint annotation. However, it does not disclose potential behaviors like large response size, required permissions, or error states. The annotation already covers the read-only nature, so a baseline of 3 is appropriate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: one sentence followed by a parameter list. Every word earns its place, and the structure is front-loaded with the purpose. No wasted content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has an output schema, so return value details are not needed. Given the two parameters and the description's coverage, it is sufficiently complete for a simple retrieval tool. Missing context about potential alternatives or usage scenarios, but not critical.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema coverage, the description compensates by explaining each parameter's meaning: 'Course code or Canvas ID' for course_identifier and 'Page URL slug or page ID' for page_url_or_id. This adds significant value over the schema's bare titles. However, it could be slightly more explicit about acceptable formats.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and the resource ('full content body of a specific page'), distinguishing it from sibling tools like 'get_page_details' (metadata) and 'list_pages' (listing). However, it does not specify the format of the content body (e.g., HTML, plain text), which slightly reduces clarity.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, such as 'get_page_details' or 'get_front_page'. It lacks any context about prerequisites, limitations, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide readOnlyHint=true, and the description's 'Get' aligns with read-only behavior. The description adds some behavioral context through the parameter descriptions (e.g., grouping options) but does not disclose additional traits like rate limits, permissions needed, or output format details beyond what annotations already cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, front-loading the purpose in a single sentence followed by parameter descriptions. No unnecessary words or repetition. Every sentence serves a purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description does not need to detail return values. It covers the key aspects: purpose and parameters. However, it could mention the scope of analytics (e.g., whether it includes all students or only those with reviews) or any constraints like date ranges. Still, it is largely complete for its intended use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has no descriptions for parameters (0% coverage). The description compensates by providing brief explanations for each parameter in the Args section, such as 'Course code or Canvas ID' for course_identifier and 'Include per-student breakdown' for include_student_details. This adds meaningful semantics beyond the schema's type-only information.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the purpose: 'Get peer review completion analytics with student-level breakdown and summary stats.' It specifies the resource (peer review completion analytics) and the action (get). While it distinguishes from siblings like 'analyze_peer_review_quality' or 'generate_peer_review_report' by focusing on completion stats, it does not explicitly differentiate from similar analytics tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. No mention of when-not-to-use or specific contexts that warrant this tool over other peer review related tools listed as siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true. Description adds that it includes pages, but does not disclose pagination, response size, or any side effects. It does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is short and includes purpose first, followed by an Args list. The Args list is somewhat redundant with the schema but not overly verbose. It is efficient overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With a read-only annotation and an output schema (not shown), the description covers the core purpose and required inputs. It mentions including pages, which is specific to this tool. Does not explain output, but output schema covers that.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters3/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so description must explain parameters. It provides a brief description for each: course_identifier as 'Course code or Canvas ID', module_id as 'The module ID', and include_content_details with default. This adds some meaning but is minimal.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists items within a module, including pages, and the name 'list_module_items' is self-explanatory. It distinguishes itself from siblings like 'list_modules' by specifying it operates on a specific module's items.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like 'add_module_item' or 'list_modules'. Does not mention prerequisites or when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description lacks disclosure of behavioral traits such as idempotency, validation behavior, error handling for partial failures, or whether settings are overwritten or merged. The note about front_page is helpful but insufficient for a bulk operation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with a one-sentence summary followed by a parameter list and a crucial note. It avoids unnecessary elaboration, though the parameter list is formatted as a docstring which could be streamlined.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (bulk update with 5 parameters) and the presence of an output schema, the description covers the input semantics and a key limitation. However, it lacks discussion of behavioral completeness like atomicity or partial failure handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
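The atomicity and partial-failure gap noted above can be illustrated with a minimal sketch. It assumes a non-atomic bulk operation that keeps going after an individual failure and reports per-item outcomes. The names bulk_apply and update_one are hypothetical and are not taken from the server.

```python
from typing import Callable, Dict, List


def bulk_apply(slugs: List[str], update_one: Callable[[str], None]) -> Dict[str, object]:
    """Apply update_one to every slug and report per-item outcomes.

    Hypothetical helper: the batch is NOT atomic, so earlier successes
    are kept even when a later item fails.
    """
    succeeded: List[str] = []
    failed: Dict[str, str] = {}
    for slug in slugs:
        try:
            update_one(slug)
            succeeded.append(slug)
        except Exception as exc:
            # Record the error and continue so one bad page does not abort the batch.
            failed[slug] = str(exc)
    return {"succeeded": succeeded, "failed": failed}
```

A description that states which of these behaviors applies (abort on first error, or continue and report) would let an agent decide whether a retry is safe.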
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description explains each parameter beyond the schema's empty descriptions, e.g., 'Comma-separated list of page URL slugs' for page_urls and specifying valid values for editing_roles. This adds significant meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
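As a rough illustration of the 'Comma-separated list of page URL slugs' format, the sketch below splits such a string into individual slugs. The function name parse_page_urls and the choice to drop empty entries are assumptions, not the server's confirmed behavior.

```python
from typing import List


def parse_page_urls(page_urls: str) -> List[str]:
    """Split a comma-separated string of page URL slugs into a clean list.

    Hypothetical helper: surrounding whitespace is stripped and empty
    entries (for example from a trailing comma) are dropped.
    """
    return [slug.strip() for slug in page_urls.split(",") if slug.strip()]
```

For example, parse_page_urls("syllabus, week-1-overview,") yields the two slugs without the empty trailing entry.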
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with 'Update settings for multiple pages at once,' which clearly states the verb (update), resource (pages), and scope (multiple). This differentiates it from sibling tools like update_page_settings for single pages.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines3/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implicitly indicates usage for bulk updates but does not explicitly state when to use this tool versus alternatives like update_page_settings or create_page. The limitation note about front_page provides context but no comparative guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description correctly implies read-only. However, it adds no extra behavioral traits like pagination, error handling, or performance considerations. The description does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
Very short (3 lines), front-loaded with purpose. Each sentence is useful. The 'Args:' line is a minor redundancy but acceptable. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema exists, so return values need not be described. The description covers the main purpose and key parameters. However, it does not mention sorting, pagination of messages, or scope (e.g., only user's own conversations). Adequate for a simple get operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so description must compensate. It adds brief but meaningful descriptions for each parameter: 'Conversation ID', 'Mark as read when viewed', 'Include all messages'. This clarifies boolean defaults and intent, though could be richer.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get detailed conversation information with messages', a specific verb+resource combination. It distinguishes the tool from siblings like 'list_conversations' (list only) and 'get_discussion_topic_details' (discussion topics vs conversations).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. No indication of prerequisites, context, or when NOT to use it. The parameters have defaults but no explanation of when to override them.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, so the description doesn't need to reiterate safety. However, it adds no additional behavioral context (e.g., data scope, freshness, permissions) beyond the annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with a clear first sentence. The parameter listing is functional but slightly redundant with the schema. No wasted words, but could be tighter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, return values don't need full explanation. However, the description is minimal and doesn't hint at the breadth of 'details' returned, leaving some uncertainty.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema coverage in descriptions, the text clarifies that course_identifier can be a course code or Canvas ID, and topic_id is a discussion topic ID. This adds meaning beyond the schema's generic titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves detailed information about a specific discussion topic. The verb 'Get' and resource 'discussion topic' are specific. The name and description are unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives like get_discussion_entry_details or get_discussion_with_replies. No prerequisites or context for usage are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint: true, so the description's 'Get' aligns with read-only behavior. However, the description adds no further behavioral context (e.g., permissions, performance, or what 'detailed information' entails beyond the schema).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence plus a concise Args block. It is front-loaded with the main purpose. The Args block is somewhat redundant with the schema but adds explanatory value, so no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has 2 required parameters and an output schema, but the description does not clarify what 'detailed information' covers (e.g., whether it includes page body, settings, etc.) compared to sibling get_page_content. Given the output schema exists, this is adequate but leaves ambiguity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description includes an Args section that explains both parameters: 'course_identifier: Course code or Canvas ID' and 'page_url_or_id: Page URL slug or page ID'. Since the schema has 0% description coverage, this adds meaningful semantics beyond the schema's types and titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get detailed information about a specific page', which is a specific verb+resource. It distinguishes the tool from sibling tools like list_pages (listing) and get_page_content (content retrieval), as 'details' implies comprehensive information.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives like get_page_content or list_pages. No exclusions or context are given, leaving the agent to infer without help.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
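One way the missing guidance could look is sketched below as a docstring in the Args style the descriptions already use. The guidance bullets are illustrative wording only. In particular, the split between this tool and get_page_content is an assumption inferred from the tool names, and the function body is a stub.

```python
def get_page_details(course_identifier: str, page_url_or_id: str) -> dict:
    """Get detailed information about a specific page.

    Hypothetical usage guidance (illustrative wording, not from the server):
    - Use list_pages first when the page slug or ID is unknown.
    - Use get_page_content instead when only the page body is needed.

    Args:
        course_identifier: Course code or Canvas ID
        page_url_or_id: Page URL slug or page ID
    """
    # Stub only: the real tool would call the Canvas API here.
    raise NotImplementedError("illustrative stub only")
```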
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint: true, so the read-only nature is clear. The description adds behavioral context by listing parameters (e.g., sorting, filtering by published status), but does not disclose whether results are paginated or limited. This is adequate given the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: one sentence stating the primary purpose followed by a parameter list. It is front-loaded and avoids fluff. The docstring format is acceptable, though it could be more streamlined.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the existence of an output schema and annotations, the description covers the core functionality and parameter details. However, it lacks information about result size limits, pagination, or that it returns page metadata (not full content). This is adequate but not fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
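On the pagination point: the Canvas REST API reports pagination through the HTTP Link response header. Whether this tool follows those links or returns only the first page is not stated in the description. The sketch below shows only how a "next" URL could be read from such a header. The function name and the example host are hypothetical.

```python
import re
from typing import Optional

# Matches entries of the form: <url>; rel="name"
_LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="([^"]+)"')


def next_page_url(link_header: str) -> Optional[str]:
    """Return the URL marked rel="next" in a Link header, or None if absent."""
    for url, rel in _LINK_RE.findall(link_header):
        if rel == "next":
            return url
    return None
```

A description that says whether the tool follows these links to the end, or stops at a fixed page size, would resolve the ambiguity about result limits.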
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description provides meaningful explanations for each parameter: 'Course code or Canvas ID' for course_identifier, 'Sort by title, created_at, or updated_at' for sort, 'asc or desc' for order, 'Filter pages containing this term' for search_term, and 'Filter by published status (None for all)'. This adds significant value beyond the schema titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'List pages for a specific course,' specifying the verb (List), resource (pages), and scope (for a course). This distinguishes it from sibling tools like get_page_content (single page) or create_page. The parameter list further clarifies filtering capabilities.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. It does not mention when not to use it (e.g., for retrieving full page content) or suggest alternative tools like get_page_content or search_canvas_tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, and the description's use of 'scan' aligns with a read-only operation. However, the description adds minimal behavioral context beyond the annotation, such as whether the scan is fast or what happens on errors.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and includes a structured args section. While efficient, it could benefit from a clearer separation of purpose and parameters, but overall it is well-organized and to the point.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
An output schema exists, so missing return value info is acceptable. However, the description does not mention error conditions, rate limits, or what 'basic accessibility issues' entail, leaving some ambiguity regarding the tool's output and behavior.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates by explaining the parameters: course_identifier as course code or Canvas ID, and content_types as comma-separated types listing examples (pages, assignments, discussions, syllabus). This adds meaning beyond the schema's default value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool scans Canvas course content for basic accessibility issues, using a specific verb and resource. It distinguishes the tool from siblings like 'fix_accessibility_issues' and 'format_accessibility_summary' by focusing on scanning.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives, such as when to scan vs. fix issues. It does not mention prerequisites or exclusions, leaving the agent to infer usage from the tool name and its siblings.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, and the description omits behavioral details such as whether reminders are sent immediately, whether completion status is checked, or how errors are handled. The description only states the action without transparency into side effects or limitations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: a single purpose line followed by a clear bullet list of parameters. There is no redundant information; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the purpose and parameters are clear, the description lacks context on outcomes (return value), error conditions, and behavior in edge cases. An output schema exists but is not referenced; the description could be more complete for a transactional tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters5/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Each parameter is explained with a brief description beyond the schema titles (e.g., 'course_identifier: Course code or Canvas ID'), adding clarity for agents. With 0% schema description coverage, the description fully compensates by detailing each parameter.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'send' and resource 'peer review completion reminders' targeting 'specific students', distinguishing it from siblings like 'send_bulk_messages_from_list' and 'send_peer_review_followup_campaign'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives. The description does not mention exclusions, prerequisites, or comparison to similar tools like 'send_peer_review_followup_campaign'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden. It does not disclose behavioral traits such as side effects, destructive actions, permissions, or behavior for omitted parameters (e.g., whether they remain unchanged). Only minimal per-parameter behavior is hinted (e.g., 'empty string to remove' for unlock_at).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
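The open question about omitted parameters can be made concrete with a sketch of one possible convention: None means "leave unchanged", while an empty string for unlock_at means "remove", matching the hint quoted above. This is an assumption for illustration. The function name build_module_update and the payload keys are hypothetical and do not confirm how the server actually behaves.

```python
from typing import Dict, Optional


def build_module_update(
    name: Optional[str] = None,
    unlock_at: Optional[str] = None,
    position: Optional[int] = None,
) -> Dict[str, object]:
    """Build a partial-update payload that only contains supplied fields.

    Assumed convention (not confirmed by the source):
    - None: the field is omitted, so the existing value is left unchanged.
    - "" for unlock_at: the field is sent empty, which removes the date.
    """
    payload: Dict[str, object] = {}
    if name is not None:
        payload["name"] = name
    if unlock_at is not None:
        payload["unlock_at"] = unlock_at  # an empty string is sent on purpose
    if position is not None:
        payload["position"] = position
    return payload
```

Stating this convention in one sentence of the description would tell an agent that it can safely pass only the fields it wants to change.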
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is structured as a concise bullet list with clear purpose in the first line. It is relatively efficient, though the parameter list could be more compact for an AI agent.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists, return values are covered. However, the description lacks behavioral context (e.g., partial update behavior) and does not explain the tool's full effect, making it minimally adequate for an 8-parameter update tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description adds meaningful context for each parameter (e.g., 'New position in module list', 'Students must complete items in order'). This significantly aids the AI in understanding parameter intent beyond the schema's titles and types.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Update an existing module's settings.' It uses a specific verb (update) and resource (module settings), distinguishing it from sibling tools like create_module, delete_module, and update_module_item.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines3/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for updating existing modules but does not explicitly state when to use this tool over alternatives, nor does it provide when-not-to-use guidance or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate destructiveHint: true, and the description adds valuable context: the deletion is permanent, removes item associations, but does not delete the actual content (pages, assignments, etc.). This clarifies the behavioral scope beyond the annotation alone.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: a one-line summary followed by a bullet-like list of parameters. Every sentence adds value, and the structure is easy to parse quickly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness3/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains the permanent nature and the non-deletion of content, but it lacks information about permissions, potential side effects, error scenarios, or what the output schema contains. For a destructive action, more context about safety and outcomes would be appropriate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters3/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It briefly describes course_identifier as 'Course code or Canvas ID' and module_id as 'Module ID to delete', adding some context beyond the schema's anyOf type. However, it does not explain formats, constraints, or how to obtain these identifiers.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete a module from a course') with a specific verb and resource. It distinguishes the tool from sibling tools like delete_module_item and update_module by specifying that it removes the module and its item associations but not the actual content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide guidance on when to use this tool versus alternatives such as update_module or delete_module_item. It lacks explicit when-to-use, when-not-to-use, or prerequisites, leaving the agent without clear decision criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description adds only that the report includes summary/analytics/recommendations. It doesn't disclose potential performance implications or output format (though output schema exists). No contradiction noted.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with a single sentence stating the purpose, followed by a list of parameters. It is efficient but the parameter list is somewhat lengthy; could be tighter.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness4/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a report generation tool with 9 parameters (2 required) and an output schema, the description covers the core function and parameter semantics. It lacks prerequisites or performance hints, but is adequate for basic usage.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description provides brief semantic details for each parameter (e.g., 'course_identifier: Course code or Canvas ID'). This adds meaning beyond schema titles, though descriptions are terse.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it generates a peer review completion report with summary, analytics, and follow-up recommendations. It distinguishes itself from sibling tools like get_peer_review_completion_analytics and get_peer_review_followup_list by packaging multiple components into one comprehensive report.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines2/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description offers no guidance on when to use this tool versus its numerous siblings (e.g., extract_peer_review_dataset, generate_peer_review_feedback_report). An agent has no explicit context for choosing this tool over alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The readOnlyHint annotation already indicates a read operation, and the description's 'Get' matches this. The description adds no further behavioral context (e.g., authentication, rate limits, or side effects). It is consistent but uninformative beyond the annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with a clear purpose sentence followed by an organized Args block. Every sentence adds value, though the structure could be slightly more compact.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Because an output schema exists, the description adequately covers what the tool returns, and it mentions that replies are included. However, it gives no context on prerequisites or edge cases, which would make it more complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, so the description's docstring compensates by explaining each parameter (e.g., 'include_replies: Fetch and include replies (default: True)'). This adds meaning beyond the schema's titles and types.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get detailed information') and the resource ('a specific discussion entry including all its replies'). It distinguishes from sibling tools like 'get_discussion_topic_details' by focusing on entries rather than topics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'get_discussion_with_replies' or 'get_discussion_topic_details'. No use cases, prerequisites, or exclusions are mentioned.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds context that the tool returns a mapping with completion status, which goes beyond the readOnlyHint annotation. However, it does not disclose other behavioral traits such as pagination, default filters, or any limits. Annotations already indicate read-only, so the description contributes moderately but not extensively.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: a single sentence stating the tool's purpose followed by a clean parameter list. Every element earns its place without redundancy. The structure is front-loaded with the main action and then details.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given that an output schema exists, the description does not need to explain return values. It adequately covers purpose and parameters. However, it lacks usage context or guidance on when to invoke this tool relative to siblings, which is a gap considering the number of related tools.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It provides brief but meaningful explanations for each parameter (e.g., 'Course code or Canvas ID', 'Include student names'), adding semantic value over the raw schema types. While more detail (e.g., accepted formats, defaults) could help, the current level is adequate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool retrieves a peer review assignment mapping showing who reviews whom with completion status. This distinguishes it from sibling tools like list_peer_reviews (which lists reviews) and get_peer_review_comments (which gets comments), making the purpose specific and actionable.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives. The sibling list includes other peer review tools, but the description does not mention scenarios where this tool is preferred, nor does it suggest any exclusions or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, and the description adds that it retrieves 'actual comment text'. However, it does not disclose behavioral details like pagination, rate limits, or what the include options imply for response size. The output schema exists but the description adds minimal behavioral context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is succinct, starting with a clear one-line purpose, followed by a parameter list. It is efficient and front-loaded, though the parameter list could be formatted more readably.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's read-only nature, 6 parameters (2 required), and the existence of an output schema, the description covers the essential purpose and param semantics. It does not mention prerequisites or filtering, but overall it is reasonably complete for a retrieval tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description lists all 6 parameters with brief explanations (e.g., 'Course code or Canvas ID', 'Replace names with anonymous IDs'), adding significant meaning beyond the schema's parameter definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Retrieve' and the specific resource 'actual comment text for peer reviews on an assignment', distinguishing it from sibling tools like get_peer_review_assignments or list_peer_reviews.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., analyze_peer_review_quality, generate_peer_review_report). The description does not mention prerequisites, context, or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already include readOnlyHint=true, so the description does not need to restate that. It adds behavioral context like prioritization, optional contact info, and days-based urgency calculation. However, it does not describe output format or sort order beyond priority, which is partially covered by the output schema.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single clear sentence followed by a concise parameter list. Every sentence adds necessary information, with no redundant or filler content. Front-loaded with the main purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters and an existing output schema, the description covers all necessary parameter semantics. It lacks an explanation of how priority is calculated beyond the days_threshold parameter, but this is sufficient for a list-retrieval tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description explicates all 5 parameters beyond what the input schema provides (which has 0% description coverage). It specifies acceptable values for priority_filter (urgent, medium, low, all) and clarifies that course_identifier accepts both code and Canvas ID. Defaults are omitted but visible in schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 4/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it retrieves a prioritized list of students needing follow-up on peer review completion. It identifies the resource (students) and action (get list with prioritization), but does not explicitly differentiate itself from siblings like 'generate_peer_review_report' or 'get_peer_review_completion_analytics'; doing so would earn a 5.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'send_peer_review_followup_campaign' or 'list_peer_reviews'. It lacks prerequisites, exclusions, or context about typical use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true. The description adds no extra behavioral details (e.g., pagination, permissions). Minimal value beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Two-line description plus Args section; concise and structured, though Args section is slightly redundant with schema.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Output schema covers return format. Description covers purpose, but omits potential details like how users are listed (e.g., role filter, pagination). Adequate for a simple list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but description explains course_identifier as 'Course code or Canvas ID', adding meaning beyond the schema's title.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description 'List users enrolled in a specific course' clearly states the action (list) and resource (users in a course), distinguishing it from sibling tools like list_assignments or list_courses.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., search or filter tools). Lacks context about when not to use it or prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations present, the description must fully disclose behavioral traits. It only states 'Update,' indicating mutation, but omits critical details such as error conditions (e.g., missing assignment), side effects, required permissions, rate limits, or return value structure. This falls short of meaningful transparency.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with the core purpose and uses a structured bullet-like list for parameters, which enhances readability. However, it is somewhat lengthy due to repeating parameter names, and could be slightly more compact. Overall, it is appropriately sized but not maximally concise.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 15 parameters and the presence of an output schema (not shown), the description adequately covers parameter semantics but omits behavioral context such as what happens when an optional field is omitted or how the updated assignment is returned. While the parameter details are thorough, the overall usage context is incomplete without more on return value and error handling.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description carries the entire burden. It provides concise yet informative descriptions for all 15 parameters, including example values for enums and formats (e.g., 'Comma-separated types (online_text_entry, ...)', 'ISO 8601 format'). This adds substantial meaning beyond the schema's minimal titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
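As a rough illustration of the two formats the parameter documentation calls out, the sketch below parses a comma-separated types string and an ISO 8601 timestamp. The helper names are hypothetical and not taken from the server, and `another_type` is a placeholder because the review elides every value after `online_text_entry`.

```python
from datetime import datetime
from typing import List


def parse_submission_types(raw: str) -> List[str]:
    """Split a comma-separated string into trimmed, non-empty type names."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def parse_iso8601(raw: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC.

    Older Python versions reject 'Z' in fromisoformat, so it is rewritten
    to the equivalent '+00:00' offset first.
    """
    if raw.endswith("Z"):
        raw = raw[:-1] + "+00:00"
    return datetime.fromisoformat(raw)


# Hypothetical inputs, for illustration only.
types = parse_submission_types("online_text_entry, another_type")
due = parse_iso8601("2025-01-15T23:59:00Z")
```

Whether the server normalizes these values itself or passes them straight to Canvas is not stated in the review; the sketch only shows what "comma-separated" and "ISO 8601" mean for a caller.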
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description begins with 'Update an existing assignment in a course,' which clearly states the action and resource. This verb+resource specification is specific enough to distinguish from unrelated tools, and there is no ambiguity with siblings like 'create_assignment' since the action differs.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when an assignment needs modifying, but it provides no explicit guidance on when to use this tool versus alternatives, nor does it mention prerequisites or exclusions. While the context is clear, the lack of explicit when/when-not instructions makes it only adequate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations include readOnlyHint=true, and the description's 'Analyze' aligns with a read-only operation. However, the description does not disclose any additional behavioral traits beyond the parameter definitions, such as performance or return characteristics.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: a single sentence for purpose followed by a structured Args list with clear parameter explanations. No redundant or extraneous content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the 4 parameters and 0% schema coverage, the description provides reasonable completeness by defining each parameter. The output schema exists, so return values are covered. However, the description lacks specifics on what the analysis entails beyond 'quality and content,' which could be improved.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates by explaining each parameter: course_identifier as 'Course code or Canvas ID', assignment_id as 'Canvas assignment ID', analysis_criteria as 'JSON string of custom criteria', and generate_report as 'Generate detailed analysis report'. This adds substantive meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
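Since analysis_criteria is documented only as a 'JSON string of custom criteria', a caller has to guess its shape. The sketch below shows one defensive way such a string could be decoded; the function name and the `min_word_count` key are hypothetical, and treating an empty value as "no custom criteria" is an assumption rather than documented behavior.

```python
import json
from typing import Any, Dict, Optional


def parse_analysis_criteria(raw: Optional[str]) -> Dict[str, Any]:
    """Decode a JSON-string parameter into a dict, with clear errors."""
    if raw is None or not raw.strip():
        # Assumption: a missing or blank value means "use defaults".
        return {}
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"analysis_criteria is not valid JSON: {exc.msg}") from exc
    if not isinstance(parsed, dict):
        raise ValueError("analysis_criteria must decode to a JSON object")
    return parsed


# Hypothetical criteria key, for illustration only.
criteria = parse_analysis_criteria('{"min_word_count": 20}')
```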
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Analyze the quality and content of peer review comments,' which is a specific verb+resource combination. It distinguishes this tool from siblings like 'extract_peer_review_dataset' or 'generate_peer_review_report,' as those focus on different aspects.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives, nor does it mention prerequisites or scenarios where it is appropriate. The Args list gives parameter details but lacks contextual usage advice.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations provide destructiveHint: true, but the description adds important context: it states that the deletion is permanent (though Canvas may retain a recycle-bin copy) and describes the require_title_match safety parameter.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with front-loaded purpose, includes a brief note on permanence, and uses a clear Args list. Minor improvement could combine the first two sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a delete tool with an output schema, the description covers key aspects (purpose, parameters, safety, permanence) but does not address return values or error behaviors. Adequate but not comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description carries full burden. It explains course_identifier as 'Course code or Canvas ID', page_url_or_id as 'Page URL slug or page ID', and require_title_match as 'Safety check — only delete if page title matches exactly', adding essential meaning.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
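The require_title_match safety check quoted above can be pictured as a guard evaluated before the delete request is sent. This is a minimal sketch under stated assumptions: the function name is hypothetical, and treating an omitted value as "no check requested" is a guess, since the review does not say how the server handles that case.

```python
from typing import Optional


def title_guard_passes(actual_title: str, require_title_match: Optional[str]) -> bool:
    """Return True only if deletion may proceed.

    Assumption: when require_title_match is None the caller asked for no
    safety check. Otherwise the page title must match exactly, including
    case and whitespace.
    """
    if require_title_match is None:
        return True
    return actual_title == require_title_match


# A near-miss title blocks the delete; an exact match allows it.
blocked = title_guard_passes("Week 1 Overview", "Week 1 overview")
allowed = title_guard_passes("Week 1 Overview", "Week 1 Overview")
```

An exact comparison is deliberately strict: for a destructive operation, a false refusal costs a retry, while a false match costs a page.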
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Delete a page from a Canvas course' with a specific verb and resource, distinguishing it from sibling tools like delete_announcement or delete_module.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives (e.g., when to use create_page or update_page). The description lacks explicit usage context or exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description is consistent with the 'readOnlyHint: true' annotation, indicating a read operation. It adds value by explaining each parameter's role, but does not disclose additional behavioral traits such as file handling when 'save_locally' is false or potential size limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is front-loaded with a clear purpose sentence, followed by a concise parameter list. It is efficient given the number of parameters, though the parameter details could be slightly more compact.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 7 parameters and an output schema, the description covers parameter meanings but omits details like prerequisites, synchronous behavior, or what happens when 'save_locally' is false. It is adequate but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, but the tool description compensates by explaining each parameter, e.g., 'output_format (csv, json, xlsx)' adds constraints absent from the schema. This provides meaningful context beyond types and titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
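The 'output_format (csv, json, xlsx)' constraint lives only in the description text, so nothing in the schema stops a caller from passing another value. A minimal sketch of enforcing it up front follows; the helper name is hypothetical, and the case-insensitive matching is an assumption rather than documented server behavior.

```python
ALLOWED_OUTPUT_FORMATS = ("csv", "json", "xlsx")  # values quoted in the description


def normalize_output_format(value: str) -> str:
    """Lower-case and validate an output_format argument."""
    normalized = value.strip().lower()
    if normalized not in ALLOWED_OUTPUT_FORMATS:
        allowed = ", ".join(ALLOWED_OUTPUT_FORMATS)
        raise ValueError(f"output_format must be one of: {allowed}")
    return normalized


chosen = normalize_output_format(" CSV ")
```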
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb 'Export' and resource 'all peer review data,' clearly distinguishing it from sibling tools that generate reports or list assignments. It also mentions 'in various formats for analysis,' which further clarifies the tool's function.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use for data extraction but does not explicitly state when to use this tool versus alternatives like 'generate_peer_review_report' or 'list_peer_reviews.' No when-not or alternative guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, so the description does not need to emphasize safety. The description adds the optional reply fetching behavior, but no further traits like rate limits or error handling are disclosed.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with one sentence plus an Args block. The block is helpful given the schema's lack of parameter descriptions, but could be slightly more streamlined.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema (not shown), the return values are documented externally. The description covers inputs and core behavior adequately, though edge cases or error conditions are omitted.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, but the description's Args block explains all three parameters clearly—course_identifier, topic_id, and include_replies with default—adding meaning beyond the raw schema types.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the verb 'get' and the resource 'discussion entries with optional reply fetching,' distinguishing it from siblings like 'get_discussion_entry_details' and 'list_discussion_entries', which serve different scopes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives such as 'get_discussion_entry_details' for a single entry or 'list_discussion_entries' for a list without replies, leaving the agent to infer.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
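To make the "use X instead of Y when Z" pattern concrete, here is a hedged sketch of how a docstring for this tool might carry the missing guidance. The two alternatives come from the review text above; the signature, parameter types, default value, and stub body are assumptions for illustration and do not reproduce the server's code.

```python
def get_discussion_with_replies(
    course_identifier: str,
    topic_id: str,
    include_replies: bool = False,  # assumed default; not stated in the review
) -> dict:
    """Get discussion entries with optional reply fetching.

    When to use:
        Use this tool to read the entries of a topic, optionally with
        their replies.
        Use get_discussion_entry_details instead when only a single
        entry is needed.
        Use list_discussion_entries instead when a list without replies
        is enough.

    Args:
        course_identifier: Course code or Canvas ID
        topic_id: Discussion topic ID
        include_replies: Fetch and include replies
    """
    # Illustrative stub only: a real tool would call the Canvas API here.
    raise NotImplementedError("documentation sketch, not an implementation")


guidance = get_discussion_with_replies.__doc__
```

The "When to use" block adds a few lines to the description, which is the trade-off the Conciseness dimension weighs against the Usage Guidelines dimension.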
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the description does not need to restate safety. However, the description adds no further behavioral details (e.g., pagination, ordering, side effects). It is adequate but minimal.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence plus a brief arg docstring. It is front-loaded and contains no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and the presence of an output schema, the description is complete enough. It covers the purpose and parameter. Missing details like listing order or filtering are not critical for this basic list operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema has 0% description coverage, but the description's docstring explicitly states that course_identifier can be a 'Course code or Canvas ID', adding meaningful context beyond the schema title.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('List announcements') and the target resource ('for a specific course'). It distinguishes from sibling tools like list_assignments or list_discussion_topics by specifying 'announcements' as the resource.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives (e.g., search tools or other list tools). There is no mention of context, prerequisites, or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description must convey behavioral traits. It explains the bracket-notation form-data encoding requirement and details the criteria parameter structure. However, it omits important information such as required permissions, idempotency, response format (an output schema exists, but the description does not document it), and potential side effects. The description adds value but leaves gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
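For readers unfamiliar with bracket-notation form data, the sketch below flattens a nested structure into keys of the form `prefix[a][0][b]`. It illustrates the general encoding only: the helper name is hypothetical, and the field names in the example (`rubric`, `title`, `criteria`, `description`, `points`) are assumptions, not a statement of what the server or Canvas requires.

```python
from typing import Any, Dict


def to_bracket_notation(value: Any, prefix: str) -> Dict[str, str]:
    """Flatten nested dicts and lists into bracket-notation form fields."""
    fields: Dict[str, str] = {}
    if isinstance(value, dict):
        for key, item in value.items():
            fields.update(to_bracket_notation(item, f"{prefix}[{key}]"))
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            fields.update(to_bracket_notation(item, f"{prefix}[{index}]"))
    else:
        # Leaf value: form data is text, so stringify it.
        fields[prefix] = str(value)
    return fields


# Hypothetical rubric payload, for illustration only.
payload = {
    "title": "Essay rubric",
    "criteria": [
        {"description": "Clarity", "points": 5},
    ],
}
form = to_bracket_notation(payload, "rubric")
```

With this payload, `form` holds three entries: `rubric[title]`, `rubric[criteria][0][description]`, and `rubric[criteria][0][points]`.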
Conciseness 3/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear intro, encoding note, criteria documentation, and parameter list. However, it is lengthy due to the extensive criteria example and technical encoding details. While every part is relevant, it could be more concise; the criteria explanation could be truncated and referenced in the parameter description section.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (7 parameters, JSON criteria), the description provides thorough coverage of inputs, especially the critical criteria structure. However, it does not describe the output or return value (despite an output schema existing), and lacks example calls or usage patterns. It is mostly complete for input but not fully self-contained.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must compensate. It does so excellently by thoroughly explaining each parameter, especially the 'criteria' parameter with a detailed docstring and a JSON example. It adds meaning beyond the schema's type and title fields for all 7 parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Create a new rubric in a course, optionally associating it with an assignment.' This is a specific verb-resource pair with scope, effectively distinguishing it from sibling tools like 'associate_rubric' and 'list_rubrics'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use the tool (to create a rubric) and mentions optional association with an assignment, but does not explicitly state when not to use it or contrast it with alternatives like 'associate_rubric' for linking existing rubrics. Usage context is clear but lacks exclusions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must fully disclose behavioral traits. It describes the action as 'Submit grades' but does not clarify whether grades are overwritten, if the submission is final, or any permission requirements. The description omits critical details about side effects (e.g., irreversible grading, notification to students) that are essential for an agent to use the tool safely.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a clear summary followed by an 'Args:' block. It is not overly verbose, but the bullet-like listing of arguments adds slight redundancy. Overall, it is efficient and front-loaded with key usage notes, earning a high score for conciseness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (nested rubric_assessment, 5 parameters) and the presence of an output schema, the description covers input semantics well but does not mention the return value or any post-submission behavior. The output schema exists, which reduces the burden, but a brief note on what the response contains would enhance completeness. The description is adequate but not fully comprehensive.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must fully explain parameters. It does so effectively, listing each parameter with clear definitions, especially the complex 'rubric_assessment' parameter: 'Dict mapping criterion_id to {points (required), rating_id?, comments?}'. This adds significant meaning beyond the schema's generic 'object' type, ensuring the agent constructs valid inputs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
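The quoted shape, 'Dict mapping criterion_id to {points (required), rating_id?, comments?}', can be checked before a call is made. The following is a minimal sketch under stated assumptions: the function name `validate_rubric_assessment` and the sample criterion ID are hypothetical, and the server may well apply different or stricter rules.

```python
def validate_rubric_assessment(assessment: dict) -> list[str]:
    """Return a list of problems found in a rubric_assessment-style dict."""
    allowed_keys = {"points", "rating_id", "comments"}
    errors: list[str] = []
    for criterion_id, entry in assessment.items():
        if not isinstance(entry, dict):
            errors.append(f"{criterion_id}: entry must be a dict")
            continue
        if "points" not in entry:
            # 'points' is the only key the description marks as required.
            errors.append(f"{criterion_id}: missing required 'points'")
        elif isinstance(entry["points"], bool) or not isinstance(entry["points"], (int, float)):
            errors.append(f"{criterion_id}: 'points' must be a number")
        unknown = set(entry) - allowed_keys
        if unknown:
            errors.append(f"{criterion_id}: unknown keys {sorted(unknown)}")
    return errors


# Hypothetical criterion ID; real IDs would come from get_rubric.
sample = {"_8027": {"points": 4, "comments": "Clear thesis"}}
print(validate_rubric_assessment(sample))
```

Running a check like this on the agent side turns a malformed input into a readable message instead of a failed grading request.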
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Submit grades using rubric criteria,' making the tool's purpose clear. The verb 'submit' and the resource 'grades using rubric criteria' precisely define the action. Among sibling tools like 'bulk_grade_submissions' and 'get_rubric_assessment', this tool uniquely focuses on single-student rubric-based grading, and the description effectively distinguishes it.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides important guidance: 'Use get_rubric to find criterion/rating IDs' and 'The rubric must be attached to the assignment and configured for grading (use_for_grading=true).' This clarifies prerequisites and directs the agent to necessary preparatory steps. However, it does not explicitly state when not to use this tool or mention alternatives like 'bulk_grade_submissions', which would strengthen guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The readOnlyHint annotation already indicates this is a read operation. The description adds no further behavioral context such as pagination, ordering, or filtering beyond the required course identifier.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise with two lines, no unnecessary words, and uses code formatting for parameters. Every part is purposeful.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 3/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
While the output schema is present, the description lacks details on return format, pagination, or how this differs from similarly named sibling tools like list_submissions. It is minimally complete for a simple list operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage for the parameter is 0%, but the description explains that course_identifier is a 'Course code or Canvas ID', adding meaning beyond the raw anyOf type. This compensates for the schema gap.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'List assignments for a specific course', specifying the verb (list), resource (assignments), and scope (course). This distinguishes it from siblings like get_assignment_details or create_assignment.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for listing assignments but does not explicitly guide when to use this tool versus alternatives like get_assignment_details for a single assignment or list_submissions for submissions.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry all behavioral disclosure. It mentions parameters like 'force_new' which hints at behavior, but fails to describe side effects, authentication needs, or error responses. The existence of an output schema partially mitigates this, but the description itself lacks depth.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with a clear summary sentence followed by a parameter list. It avoids redundancy but could be more structured (e.g., using bullet points). Every sentence adds value, and it's appropriately sized for the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (10 parameters, 4 required) and no annotations, the description covers key edge cases like bulk messaging and async mode. It references an output schema, so return values need not be detailed. However, it omits permissions or rate limits, which are relevant for a messaging tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description must explain parameters. It adds meaning for many, e.g., 'group_conversation: Create group conversation (required for custom subjects)' and 'mode: sync or async (use async for >100 recipients)'. Some parameters like 'course_identifier' lack format details, but overall it adds significant value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
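The quoted rule 'mode: sync or async (use async for >100 recipients)' is simple enough to state as code. This is a sketch only: the helper name `choose_mode` and the constant name are hypothetical, and the threshold is taken from the quoted description rather than from the server's implementation.

```python
ASYNC_THRESHOLD = 100  # from the quoted guidance: async for more than 100 recipients


def choose_mode(recipient_ids: list[str]) -> str:
    """Pick the 'mode' value suggested by the description's recipient-count rule."""
    return "async" if len(recipient_ids) > ASYNC_THRESHOLD else "sync"


# Hypothetical recipient IDs, used only to exercise the rule.
small_batch = [str(n) for n in range(100)]
large_batch = [str(n) for n in range(101)]
print(choose_mode(small_batch), choose_mode(large_batch))
```

Note that exactly 100 recipients stays on the sync path under this reading of '>100'.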
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Send messages to students via Canvas conversations' with specific verb and resource. It distinguishes from sibling tools like 'list_conversations' or 'mark_conversations_read' by focusing on sending new messages. The parameter list further clarifies the action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Internal usage guidelines are provided, e.g., 'use async for >100 recipients' and 'group_conversation required for custom subjects'. However, it does not compare with sibling tools like 'send_bulk_messages_from_list', leaving the choice between them ambiguous.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 2/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must carry the full burden. It does not mention side effects (e.g., ordering changes, permission requirements, or idempotency), leaving the agent unaware of behavioral implications beyond the action name.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description opens with a clear summary and then lists parameters in a readable format. It is somewhat lengthy but well-organized, though the 'IMPORTANT' line and subsequent parameter list overlap slightly.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the complexity (12 parameters, no schema descriptions), the description covers all usage requirements comprehensively. The presence of an output schema means missing return value information is acceptable.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema coverage, the description fully compensates by detailing each parameter's meaning, conditional requirements, and valid values (e.g., '1-indexed' for position, 'must_view' for completion_requirement_type). This adds significant value over the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
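Two of the constraints quoted above, 'content_id required for File' and the '1-indexed' position, can be expressed as a small pre-flight check. This is a minimal sketch: the function name `check_module_item_args` is hypothetical, only the two quoted rules are covered, and the requirements for other item types are deliberately left out because the assessment does not state them.

```python
from typing import Optional


def check_module_item_args(
    item_type: str,
    content_id: Optional[int] = None,
    position: Optional[int] = None,
) -> list[str]:
    """Check the two conditional rules quoted from the tool description."""
    problems: list[str] = []
    if item_type == "File" and content_id is None:
        problems.append("content_id is required for File items")
    if position is not None and position < 1:
        problems.append("position is 1-indexed, so it must be >= 1")
    return problems


print(check_module_item_args("File", position=0))
```

Keeping conditional requirements in one place like this mirrors how the description groups them, which is part of why the Parameters dimension scores well.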
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Add an item to a module.' and lists all item types and parameter requirements, making the tool's purpose and scope unambiguous. It distinguishes itself from sibling tools like delete_module_item or list_module_items by focusing on creation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit conditional requirements for different item types (e.g., 'content_id required for File'), which guides proper usage. However, it does not mention when to use alternatives like update_module_item or other tools, leaving some context unaddressed.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, so the agent knows this is a safe read operation. The description adds no additional behavioral details beyond stating it retrieves content. No contradiction.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise: one sentence for purpose and one line for the parameter. Front-loaded with the main action, no unnecessary text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read tool with one parameter, annotations, and an output schema, the description covers the essential parameter meaning. The output schema will define the return format, so no gaps remain.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates by explaining the parameter as 'Course code or Canvas ID', adding semantic meaning beyond the schema's type constraints.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get the front page content for a course.' This is a specific verb+resource combination that distinguishes it from siblings like 'get_page_content' (which likely retrieves any page).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 2/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives such as 'get_page_content' or 'get_page_details'. An agent would benefit from explicit differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds context beyond the readOnlyHint annotation by specifying 'you need to complete', indicating filtered results. However, it does not disclose rate limits, data scope, or response format.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise: one sentence plus an args line. No redundant information. The purpose is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and an existing output schema, the description covers the essential functionality and parameter meaning. Minor gap: could specify that it returns a list, but output schema likely handles that.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates well by explaining the single parameter 'course_identifier' as a course code or Canvas ID, and notes that omitting it returns all courses. This adds crucial meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves pending peer reviews for the user, using specific verb 'get' and resource 'peer reviews you need to complete'. It differentiates from siblings like 'list_peer_reviews' and 'get_peer_review_assignments' by focusing on pending items for the user.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for the user's own pending peer reviews, but it lacks explicit guidance on when not to use the tool and does not compare it with alternative tools. It does not mention prerequisites or context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description does not add behavioral details beyond the readOnlyHint annotation. Since annotations already indicate read-only behavior, the description is not contradictory but fails to disclose any additional traits like data freshness or scope.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, precise sentence. It is front-loaded and contains no extraneous information, making it highly efficient for an agent to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, output schema exists), the description provides adequate context. However, it could briefly clarify what constitutes a 'TODO list' (e.g., tasks, assignments) for full completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has zero parameters, and the description adds no further parameter details. Per the guideline, a baseline of 4 is appropriate for zero parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get') and resource ('your Canvas TODO list'), making the purpose unmistakable. It effectively differentiates from sibling tools like get_my_peer_reviews_todo and get_my_upcoming_assignments by specifying TODO items.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No explicit guidance on when to use this tool versus alternatives. While the name and description imply it's for the user's TODO list, the presence of similar sibling tools (e.g., get_my_upcoming_assignments) could cause confusion without additional context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds no behavioral details beyond the readOnlyHint annotation. It does not disclose scope or counting criteria, but it does not contradict the annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Extremely concise (5 words, one sentence) with no unnecessary information, appropriately front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple, parameterless tool with output schema, the description is mostly complete; could specify user scope (e.g., current user) but not critical.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so the description does not need to supplement schema; baseline score per dimension guidelines.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Get') and resource ('number of unread conversations'), clearly distinguishing it from sibling tools like 'list_conversations' which return full lists.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance on when to use this tool versus alternatives; the purpose is implied but no exclusions or context provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, so the description's read-only behavior is consistent. The description adds context that the tool consumes a JSON string from parse_ufixit_violations, but does not disclose additional behavioral traits beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with two efficient sentences, front-loading the purpose. No extraneous information; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool is simple with one parameter and an output schema. The description covers what the tool does and the input requirements fully, leaving no gaps for the agent to interpret.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% with no parameter descriptions, but the description adds meaning: 'violations_json: JSON string from parse_ufixit_violations.' This guides the agent on the expected input format and source, compensating for the lack of schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Format parsed violations into a human-readable summary.' It specifies the verb (format) and resource (parsed violations), and references the input source (parse_ufixit_violations), distinguishing it from sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage after parse_ufixit_violations but does not explicitly state when to use it over alternatives or provide exclusions. Usage context is implied but lacks direct guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, indicating a read operation. The description adds that it retrieves status and statistics, but does not disclose any additional behavioral traits beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, concise sentence that efficiently conveys the tool's purpose without unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has no parameters, an output schema exists to describe return values, and the description covers the purpose, the information is complete for effective use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has no parameters, and schema description coverage is 100%. The description does not need to add parameter meaning, and no further clarification is necessary.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly specifies the verb 'get' and the resource 'current data anonymization status and statistics', which is specific and distinguishes it from sibling tools like 'create_student_anonymization_map' which deals with creation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no explicit guidance on when to use this tool vs alternatives. However, the nature of the tool (getting status) is straightforward and sibling tools are distinct, so usage is implied.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint: true, so the description carries a lower burden. The description adds no behavioral details beyond the basic purpose. It does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence, no redundancy, front-loaded with the core action. Every word earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no parameters, a readOnlyHint annotation, and an output schema, the description provides sufficient context. It fully explains the tool's purpose in one sentence.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters, so description need not add parameter details. The phrase 'across all enrolled courses' implicitly confirms no filtering. Baseline for 0 params is 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Get'), the resource ('your current grades'), and the scope ('across all enrolled courses'). It is specific and distinguishes the tool from siblings that target individual assignments or courses.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context (when you need overall current grades) but does not explicitly state when to avoid this tool or mention alternatives. Given the list of siblings, no direct competitor exists, but explicit guidance would improve clarity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, and the description adds that optional full content and replies can be fetched. However, it does not disclose pagination behavior, rate limits, or any side effects beyond read-only nature. The description provides moderate additional context but is not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single clear sentence followed by a compact parameter list. Every sentence adds value, and the structure is front-loaded with the core purpose. No extraneous content.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description adequately covers purpose and parameters, and the presence of an output schema helps. However, it lacks mention of pagination, ordering, or explicit differentiation from similar tools like get_discussion_with_replies, which slightly reduces completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description fully explains each parameter: course_identifier as 'Course code or Canvas ID', topic_id as 'Discussion topic ID', and boolean options with defaults. This adds critical meaning beyond the bare schema types, compensating for the lack of schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'List discussion entries (posts) for a specific discussion topic', specifying the action (list) and resource (discussion entries). It distinguishes this from sibling tools like get_discussion_entry_details (single entry) and list_discussion_topics (lists topics), making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains the parameters but does not provide explicit guidance on when to use this tool versus alternatives like get_discussion_with_replies or get_discussion_entry_details. The usage context is implied but lacks direct exclusions or recommendations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true. The description adds that it lists groups and members, which is non-obvious, but does not disclose potential behavioral traits like pagination or permission requirements.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, consisting of a single sentence and an Args line. Every word adds value, and the key action is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one parameter, output schema present), the description fully covers what the tool does and what input it needs. No additional context is necessary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds meaningful semantic context to the sole parameter by stating it accepts 'Course code or Canvas ID,' supplementing the schema's minimal type-only definition (0% coverage).
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool lists 'all groups and their members for a specific course,' using a specific verb and resource that clearly differentiates it from siblings like list_courses or list_users.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates the tool requires a course identifier but does not provide when-not-to-use guidance or mention alternatives like other group-related tools. Context is clear but exclusion criteria are absent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark destructiveHint=true. The description adds behavioral details: permanent deletion with possible recycle-bin, dry_run behavior that fetches titles without deleting, and limits ignored when dry_run=True. This provides useful context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
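The dry-run and limit semantics noted in this assessment can be sketched as follows. This is a minimal illustration, not the server's implementation: `bulk_delete`, `fetch_title`, and `delete` are hypothetical names, and the default values and the way `limit` truncates the ID list are assumptions.

```python
from typing import Callable, Iterable, Optional


def bulk_delete(
    ids: Iterable[int],
    fetch_title: Callable[[int], str],
    delete: Callable[[int], None],
    limit: Optional[int] = None,
    dry_run: bool = False,  # assumption: the real default is not stated
    stop_on_error: bool = False,
) -> dict:
    """Hypothetical helper: preview or delete items by ID."""
    id_list = list(ids)
    if dry_run:
        # Preview only: fetch titles, delete nothing, and ignore `limit`.
        preview = [(item_id, fetch_title(item_id)) for item_id in id_list]
        return {"dry_run": True, "would_delete": preview}

    if limit is not None:
        # Assumption: the limit caps how many IDs are processed.
        id_list = id_list[:limit]

    deleted, failed = [], []
    for item_id in id_list:
        try:
            delete(item_id)
            deleted.append(item_id)
        except Exception as exc:
            failed.append((item_id, str(exc)))
            if stop_on_error:
                break  # leave the remaining IDs untouched
    return {"dry_run": False, "deleted": deleted, "failed": failed}
```

A description that states these interactions explicitly lets an agent run a preview first and only then issue the destructive call.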
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured with a summary line, a permanence note, and an Args list. It is concise but not terse; every sentence adds value. A minor improvement would be removing the 'Args:' header, since it is a docstring convention, but the description is effective overall.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 5 parameters and an existing output schema, the description covers parameter behavior comprehensively. It does not describe the output, but that is handled by the output schema. The dry_run and limit details ensure safe usage. Slightly more context on error behavior could boost completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% (no descriptions in input schema). The description fully compensates with an Args block explaining each parameter: course_identifier, announcement_ids, stop_on_error, limit, and dry_run. It clarifies limit ignored when dry_run=True, adding semantics missing from schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description starts with a specific verb+resource 'Delete multiple announcements from a Canvas course.' It clearly distinguishes from single-delete sibling tools like 'delete_announcement' and criteria-based 'delete_announcements_by_criteria.'
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives usage context (permanence, dry_run) but does not explicitly compare with alternative delete tools or state when to use this over siblings. The dry_run parameter is a safety guideline, but no exclusions or when-not-to-use advice is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true. The description adds no further behavioral details beyond what the annotation and schema cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise one-line purpose followed by parameter list. No fluff, front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose and all parameters. Has output schema, so return details are handled. Minor omission: no mention of pagination or ordering, but adequate for a list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description compensates by explaining each parameter's purpose (e.g., 'Course code or Canvas ID', 'Include summary of items'). Adds value beyond schema titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'List all modules in a course,' specifying the verb and resource. Distinguishes from sibling tools like create_module, delete_module, etc.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use for listing modules, and parameters clarify filtering options. However, it does not explicitly state when to prefer this over alternatives like list_module_items or get_course_structure.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint: true, so the description adds little behavioral transparency beyond stating 'parse'. It does not mention any additional behaviors like potential side effects or limitations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with one sentence and an Args block. The Args block is slightly redundant given the schema, but it provides useful context. Overall, no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and an output schema present, the description covers the essential: what the tool does, what input it expects, and its read-only nature. The output schema handles return values, so no further explanation is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has no description for the parameter (0% coverage). The description adds meaning by specifying that report_json is a JSON string from fetch_ufixit_report, which adds value beyond the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it parses UFIXIT report content to extract accessibility violations. This verb+resource combination is specific and distinguishes it from siblings like fetch_ufixit_report and fix_accessibility_issues.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly mentions that the input comes from fetch_ufixit_report, implying a prerequisite. However, it does not provide explicit when-not-to-use or alternatives, though the context is clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses key constraint that front page cannot be unpublished and notes no content changes, but lacks detail on permissions or side effects; with no annotations, the description carries the burden.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise with clear structure: purpose sentence, parameter list, and an important note. No unnecessary text.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers parameters and crucial constraints; an output schema exists, so return values need not be described. Lacks error-handling information but is generally sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 4/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema coverage, description explains each parameter clearly (e.g., allowed values for editing_roles), adding meaning beyond type/required info.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it updates settings for an existing page without changing content, distinguishing it from tools like edit_page_content.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
Implies when to use the tool (for settings only) but gives no explicit guidance on alternatives or when not to use it; the sibling list is large, and no direct comparisons are made.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 3/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description must convey behavior. It mentions concurrent processing and dry-run validation, but does not disclose potential side effects, required permissions, or whether grades are immediately saved. The presence of an output schema partially compensates, but more behavioral context would be beneficial.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
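To make the behaviors named in this assessment concrete, here is a rough sketch of dry-run validation followed by bounded concurrent submission. Everything in it is an assumption for illustration: `grade_many`, `submit`, `max_concurrent`, the flat `{user_id: points}` shape, and the "no negative points" rule are hypothetical and do not describe the server's real grading tool.

```python
import asyncio
from typing import Awaitable, Callable, Dict


async def grade_many(
    grades: Dict[str, float],
    submit: Callable[[str, float], Awaitable[None]],
    max_concurrent: int = 5,
    dry_run: bool = False,
) -> dict:
    """Hypothetical sketch: validate, then submit grades with bounded concurrency."""
    # Validation runs in both modes; negative points are rejected (assumed rule).
    invalid = [user_id for user_id, points in grades.items() if points < 0]
    if invalid:
        return {"ok": False, "invalid": invalid, "submitted": []}
    if dry_run:
        # Dry run: report that the input is valid, but save nothing.
        return {"ok": True, "invalid": [], "submitted": []}

    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(user_id: str, points: float) -> str:
        async with semaphore:  # at most `max_concurrent` submissions in flight
            await submit(user_id, points)
        return user_id

    submitted = await asyncio.gather(
        *(worker(user_id, points) for user_id, points in grades.items())
    )
    return {"ok": True, "invalid": [], "submitted": list(submitted)}
```

In this sketch a non-dry-run call saves grades immediately, which is exactly the kind of consequence the assessment says a description should state outright.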
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: a one-line summary followed by a well-structured Args block. Every sentence serves a purpose, no redundant information, and it is front-loaded with the key purpose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (6 parameters, nested objects, concurrency), the description covers core aspects: purpose, parameters, and concurrency controls. It lacks details on error handling, permissions, and confirmation of submission success, but the output schema likely addresses return values. Overall, it is fairly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, so the description's detailed Args section is essential. It explains each parameter's meaning, types, defaults, and the structure of the 'grades' object. This provides significant added value beyond the schema, making parameter semantics exceptionally clear.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Grade multiple submissions efficiently with concurrent processing' and details both rubric-based and simple point-based grading. This clearly identifies the action (grade) and resource (multiple submissions), distinguishing it from single-grading tools like 'grade_with_rubric'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies batch usage through its name and mention of concurrent processing, but does not explicitly state when to use this tool versus alternatives like single-submission grading. It provides clear context for batch grading but lacks direct exclusions or alternative recommendations.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate destructiveHint=true. The description adds useful nuance: the delete is permanent but Canvas may retain a recycle-bin copy depending on admin settings. This extra context helps the agent understand the actual behavior beyond the annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is short, front-loaded with the core purpose, and efficiently provides the permanence note and parameter explanations. Every sentence earns its place without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple delete tool with only 2 parameters and an existing output schema, the description is largely complete. It covers the action, permanence, and both parameters. However, it could further clarify that this tool has no confirmation prompt (compared to its sibling) and mention any required permissions, keeping it from a perfect score.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage—only titles exist. The description fills this gap with clear explanations: 'Course code or Canvas ID' for course_identifier, and 'Announcement ID to delete' for announcement_id. This adds essential meaning not present in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Delete an announcement from a Canvas course,' which clearly identifies the verb and resource. It distinguishes from siblings like 'bulk_delete_announcements' and 'delete_announcement_with_confirmation' by implying a single, immediate delete without confirmation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description mentions the action is permanent and notes a recycle bin possibility, but it does not explicitly state when to use this tool versus alternatives like 'delete_announcement_with_confirmation' or 'bulk_delete_announcements'. The intent must be inferred from the name and context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
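One way to supply the missing comparative guidance is to write it into the tool's docstring. The sketch below is a suggestion only: the sibling tool names come from this report, while the signature, the parameter types, and the guidance wording are assumptions, and the body is a placeholder.

```python
def delete_announcement(course_identifier: str, announcement_id: str) -> None:
    """Delete an announcement from a Canvas course.

    When to use:
        Use this tool to remove one announcement immediately, with no
        confirmation step.

    When not to use:
        - Several known IDs: use bulk_delete_announcements instead.
        - Announcements selected by title or age: use
          delete_announcements_by_criteria instead.
        - A confirmation step is wanted before deleting: use
          delete_announcement_with_confirmation instead.

    Args:
        course_identifier: Course code or Canvas ID
        announcement_id: Announcement ID to delete
    """
    # Placeholder body: this sketch only illustrates the docstring.
    raise NotImplementedError("illustration only")
```

The "When not to use" list costs a few extra tokens but gives an agent the "use X instead of Y when Z" signal that the four overlapping delete tools currently lack.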
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond the destructiveHint annotation, the description adds detail about permanence and potential recycle-bin retention depending on admin settings. It also highlights the dry_run default (True) as a safety mechanism, which is valuable behavioral context not captured by annotations alone.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, uses clear section headers, and front-loads the core purpose. Each sentence adds value, and the bulleted Args list improves readability without unnecessary verbosity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a criteria-based deletion tool with output schema and good annotations, the description covers key behaviors (permanence, dry run) and parameters. It could include potential error conditions or limit behavior, but is sufficiently complete for most use cases.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description fully compensates by explaining each parameter: course_identifier (course code or ID), criteria (dict with specific keys like title_contains, older_than, etc.), limit (safety max), and dry_run (preview mode). This adds crucial meaning beyond the raw schema types.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
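As an illustration of how a criteria dict with keys such as title_contains and older_than might be applied, consider the sketch below. The key names come from this assessment; the function name `select_announcements`, the ISO date format for older_than, the `title` and `posted_at` record fields, and the default limit are assumptions made for the example.

```python
from datetime import datetime
from typing import Any, Dict, List


def select_announcements(
    announcements: List[Dict[str, Any]],
    criteria: Dict[str, Any],
    limit: int = 10,  # assumption: safety cap on how many items match
) -> List[Dict[str, Any]]:
    """Hypothetical filter: return at most `limit` items matching all criteria."""
    needle = criteria.get("title_contains")
    cutoff = criteria.get("older_than")  # assumption: ISO 8601 date string
    cutoff_dt = datetime.fromisoformat(cutoff) if cutoff else None

    matches = []
    for item in announcements:
        # Case-insensitive substring match on the title.
        if needle and needle.lower() not in item["title"].lower():
            continue
        # Keep only items posted strictly before the cutoff.
        if cutoff_dt and datetime.fromisoformat(item["posted_at"]) >= cutoff_dt:
            continue
        matches.append(item)
    return matches[:limit]
```

Spelling out whether criteria combine with AND or OR, and what date format older_than expects, is the kind of detail a bare schema type cannot convey.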
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Delete announcements matching specific criteria') and the resource ('announcements'). It distinguishes itself from siblings like 'delete_announcement' (single) and 'bulk_delete_announcements' (bulk without criteria) by specifying criteria-based deletion.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description includes a note about permanence and recycle-bin retention, and mentions the dry_run safety feature, but does not explicitly guide when to use this tool over similar siblings like 'delete_announcement' or 'bulk_delete_announcements'. It provides some context but lacks direct comparative guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, so the description's statement of 'Get a comprehensive overview' aligns. It adds value by specifying the included components (pages, modules, syllabus) but does not disclose additional behavioral traits like aggregation or performance.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, front-loaded with the main purpose, and followed by a clear argument list. Every sentence serves a purpose with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description need not detail return values. It adequately explains the three components included. It could mention how to handle large courses or missing data, but overall it is complete for an overview tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description fully compensates by explaining each parameter: course_identifier as 'Course code or Canvas ID' and boolean flags with defaults and meanings. This adds critical meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Get a comprehensive overview of course content including pages, modules, and syllabus.' It uses specific verbs and resources, and distinguishes itself from sibling tools like get_course_details or get_course_structure by offering a combined view.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines 3/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies that this tool is a one-stop-shop for an overview, but it does not explicitly state when to use it versus alternatives like list_pages or list_modules. No when-not or alternative guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior 4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, and the description adds behavioral details (optional search, sorting options) without contradiction, providing useful context beyond the annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness 5/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is succinct, uses a clear parameter list format, and front-loads the core purpose with no redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 4/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (4 parameters, no schema descriptions, output schema exists), the description covers all necessary usage aspects but omits details like pagination or limits, which is acceptable for a list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema coverage, the description fully explains each parameter: course_identifier, search_term, sort (with allowed values), and order (asc/desc), adding critical meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
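As an illustration of how the four parameters above might be passed, here is a minimal sketch of an MCP `tools/call` request. The tool name `list_course_files` is inferred from the sibling references in this review, and every argument value (including `"name"` as a sort key) is a placeholder assumption rather than a documented allowed value.

```python
import json


def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical values; only the parameter names come from the review text.
payload = build_tool_call(
    1,
    "list_course_files",  # assumed tool name
    {
        "course_identifier": "12345",  # placeholder identifier
        "search_term": "syllabus",     # optional filter
        "sort": "name",                # assumed sort key, not confirmed
        "order": "asc",                # asc or desc
    },
)
print(payload)
```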
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('list files') and the resource ('in a Canvas course'), distinguishing it from sibling tools like download_course_file or read_course_file.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines3/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for listing files with optional search but does not explicitly state when to use this tool versus alternatives such as download_course_file or read_course_file.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=true, and the description, which describes a download action, does not contradict them. It goes beyond annotations by specifying the default behavior for save_directory (system temp dir) and the precondition that the directory must exist.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, front-loaded, and uses a structured format (Args section). Every sentence earns its place with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness4/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the straightforward nature of the tool and presence of an output schema, the description is mostly complete. It could mention potential errors (e.g., file not found) but is adequate for an AI agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Though schema description coverage is 0%, the description provides meaning for save_directory (default, existence requirement) and implicitly explains course_identifier and file_id. This adds value beyond the bare schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('download') and the resource ('file from a Canvas course to the local filesystem'). It is a specific verb+resource combination that distinguishes it from siblings like 'read_course_file' and 'upload_course_file'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
It advises using list_course_files or list_module_items to find file IDs, providing context for obtaining the required file_id parameter. However, it does not explicitly state when to use this tool versus alternatives like read_course_file.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true, and the description adds detail about using a specific Canvas endpoint and returning a ranked table with engagement scores. No contradictions. It could mention pagination or limits, but the provided info is sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is somewhat lengthy due to the Args block, but it is front-loaded with the purpose and use case. Every sentence adds value, though it could be slightly more compact.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness4/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With an output schema present, the description does not need to explain return values. It covers purpose, usage context, parameters, and the underlying endpoint. It is complete for practical use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters5/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, but the description includes a detailed 'Args:' section explaining each parameter, including defaults and purpose. This fully compensates for the missing schema descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Get per-student engagement analytics: page views, participations, and on-time/late/missing assignment counts.' It also mentions the engagement score and its use case, distinguishing it from sibling tools like get_assignment_analytics.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates the tool is 'useful for identifying disengaged students in participation/presentation-driven courses,' providing context for when to use it. However, it does not explicitly state when not to use or mention alternatives, though the sibling list implies differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description covers key behaviors: it uploads a local file, returns the file ID, and explains the parameters. However, it lacks details on permissions, error handling, and size limits.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness3/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is informative but somewhat verbose, including an unformatted Args list. Could be more concise while retaining key information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness5/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description includes return value (file ID) for downstream usage, covers all parameters, and provides integration context, making it complete for an upload tool with an output schema.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters5/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
All five parameters are explained with descriptions of purpose, defaults (folder_path, display_name, on_duplicate), and example values, compensating for the 0% schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool uploads a file to Canvas course storage and mentions integration with add_module_item and send_conversation, distinguishing its purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context for when to use (uploading files) and mentions integration with other tools, but does not explicitly state when not to use or compare with sibling tools like download_course_file.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare destructiveHint=true. The description adds context about permanence and the optional recycle-bin behavior, plus safety parameters. It does not contradict annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is brief, front-loaded with the main action, and uses a clean Args structure. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness4/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the low complexity and presence of an output schema, the description covers the core behavior well. It could mention error handling or consequences of mismatch, but it is largely complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters5/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description fully compensates by explaining all four parameters, including the optional require_title_match and dry_run params. This adds significant value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
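The safety parameters described above can be sketched as follows. This is an in-memory illustration only: the review does not show the real signature, so treating `require_title_match` as an optional expected title, the return strings, and the `Announcement` type are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Announcement:
    id: int
    title: str


def delete_with_guards(
    store: Dict[int, Announcement],
    announcement_id: int,
    require_title_match: Optional[str] = None,  # assumed: expected title, if given
    dry_run: bool = False,
) -> str:
    """Delete an announcement only if the guards pass (hypothetical sketch)."""
    announcement = store.get(announcement_id)
    if announcement is None:
        return "not_found"
    # Guard 1: refuse when the caller's expected title does not match.
    if require_title_match is not None and announcement.title != require_title_match:
        return "title_mismatch"
    # Guard 2: a dry run reports the outcome without mutating anything.
    if dry_run:
        return "would_delete"
    del store[announcement_id]
    return "deleted"
```

A caller could run once with `dry_run=True`, inspect the result, and only then repeat the call without it.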
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it deletes an announcement with safety checks. The name and sibling tools (delete_announcement, delete_announcements_by_criteria) differentiate it as a version with confirmation guards.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when safety checks are desired, and the presence of sibling tools suggests alternatives. However, it does not explicitly state when to prefer this tool over delete_announcement or bulk_delete_announcements.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Without annotations, the description carries the full burden. It states that fixes are applied to auto_fixable issues and that dry_run controls preview versus actual application. However, it does not explicitly mention whether changes are permanent when dry_run=False or which permissions are required, though the dry_run safeguard mitigates the risk.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with sections; front-loaded with purpose. Each sentence adds value. Could be slightly more concise, but overall efficient and clear.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness5/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (4 params, no annotations), the description covers purpose, usage, parameters, and behavioral aspects comprehensively. Output schema exists, so return value details are not required.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters5/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0%, so the description must explain the parameters fully. It does so, detailing course_identifier, fix_types (with comma-separated examples), content_types, and dry_run, along with their defaults. It provides meaningful context beyond the schema's titles and types.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool auto-fixes accessibility issues in Canvas course content. It specifies the verb 'auto-fix' and resource 'accessibility issues', and distinguishes itself from siblings like scan_course_content_accessibility and format_accessibility_summary by describing its action and prerequisite.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly instructs to run scan_course_content_accessibility first, and explains the dry_run parameter for safe preview. Provides clear context on when to use (after scanning) and how to apply changes, though it doesn't mention when not to use or alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
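To make the comma-separated fix_types filter and the auto_fixable restriction concrete, here is a rough sketch. The issue type names, the dictionary shape of a scan result, and the `dry_run=True` default are assumptions for illustration; the review does not state them.

```python
from typing import Dict, List


def select_fixes(issues: List[dict], fix_types: str = "", dry_run: bool = True) -> Dict:
    """Pick the auto-fixable issues that match a comma-separated filter."""
    # Split "a, b" into {"a", "b"}; an empty string means "no filter".
    wanted = {t.strip() for t in fix_types.split(",") if t.strip()}
    selected = [
        issue
        for issue in issues
        if issue["auto_fixable"] and (not wanted or issue["type"] in wanted)
    ]
    # With dry_run the caller only gets a preview; nothing is marked applied.
    return {"applied": not dry_run, "issues": selected}


# Hypothetical scan output, as might come from a prior accessibility scan.
scan = [
    {"type": "missing_alt_text", "auto_fixable": True},
    {"type": "heading_order", "auto_fixable": True},
    {"type": "low_contrast", "auto_fixable": False},
]
preview = select_fixes(scan, fix_types="missing_alt_text, low_contrast")
print(preview)
```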
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint: true. Description adds that results are organized by category with descriptions, adding useful behavioral context beyond the annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences with no fluff. The purpose is upfront, followed by additional detail on organization and use case.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness5/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no parameters and presence of output schema, the description fully covers what the tool does, its link to execute_typescript, and the structure of results. No gaps identified.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters, so schema coverage is 100%. The description adds no parameter semantics because none exist; baseline for 0 parameters is 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
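The "zero parameters, so coverage is 100%" reasoning can be written out as a small calculation. This is a sketch of one plausible definition (described parameters divided by total parameters, with the empty case treated as full coverage); the checklist's actual formula is not shown on this page.

```python
def schema_description_coverage(properties: dict) -> float:
    """Fraction of schema properties that carry a non-empty description."""
    if not properties:
        # Nothing to describe, so nothing is missing: treat as full coverage.
        return 1.0
    described = sum(1 for spec in properties.values() if spec.get("description"))
    return described / len(properties)


print(schema_description_coverage({}))  # no parameters
print(schema_description_coverage({"course_identifier": {"type": "string"}}))
```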
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool lists all available TypeScript modules for the code execution API, distinguishes it from other list tools (e.g., list_modules for Canvas modules), and specifies it is for discovering importable files for execute_typescript.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description advises use to discover available bulk operations, implying usage before execute_typescript. While it does not explicitly mention when not to use or alternatives, the context is clear for this discovery tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint=true. The description adds that include_criteria defaults to True, clarifying optional behavior. No contradictions. It doesn't discuss rate limits or auth, but given annotations cover safety profile, this is adequate.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: one line purpose plus a short Args section. No superfluous words. Front-loaded with the main action.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness5/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read operation with output schema, the description completely covers the tool's purpose and parameters. It's sufficient for an agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description explains course_identifier as 'course code or Canvas ID' and include_criteria as including detailed criteria, with a default of True. This adds meaning beyond the schema, which only has types. Schema description coverage is 0%, so the description compensates well.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'list', the resource 'rubrics', and the scope 'in a specific course'. It also mentions optional detailed criteria, which helps differentiate from siblings like get_rubric.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
It indicates when to use: to list rubrics in a course. Though it does not explicitly exclude alternatives, the context (siblings like get_rubric) implies when not to use it. A mention of 'use get_rubric for a specific rubric' would earn a 5.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
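A description that names the alternative might look like the stub below. The wording is a hypothetical rewrite, not the server's actual docstring; only the parameter names, the `get_rubric` sibling, and the default come from the review.

```python
def list_rubrics(course_identifier: str, include_criteria: bool = True) -> str:
    """List rubrics in a specific course.

    Use this to browse every rubric in a course. Use get_rubric instead
    when you already know the rubric ID or the assignment it is attached to.

    Args:
        course_identifier: Course code or Canvas ID.
        include_criteria: Include detailed criteria (default True).
    """
    # Illustrative stub only; the real tool would call the Canvas API.
    raise NotImplementedError("illustrative stub only")


print(list_rubrics.__doc__)
```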
- Behavior5/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds behavioral details beyond annotations: the tool returns base64 content, clamps max_size_mb server-side, and rejects files larger than the effective limit to avoid memory issues. There is no contradiction with the readOnlyHint annotation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
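The clamp-then-reject behavior described above can be sketched as follows. The server-side cap of 10 MB, the default of 5 MB, and the function names are assumptions chosen for illustration; the review does not state the real values.

```python
import base64

SERVER_MAX_MB = 10  # hypothetical server-side cap


def effective_limit_mb(requested_mb: float) -> float:
    """Clamp the caller's requested limit into the range [0, SERVER_MAX_MB]."""
    return max(0.0, min(requested_mb, SERVER_MAX_MB))


def read_file_base64(data: bytes, max_size_mb: float = 5) -> str:
    """Return base64 content, rejecting payloads above the effective limit."""
    limit_bytes = int(effective_limit_mb(max_size_mb) * 1024 * 1024)
    if len(data) > limit_bytes:
        # Refuse early instead of holding an oversized file in memory.
        raise ValueError(f"file exceeds effective limit of {limit_bytes} bytes")
    return base64.b64encode(data).decode("ascii")


print(effective_limit_mb(50))   # a request above the cap is clamped down
print(read_file_base64(b"hi"))
```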
Conciseness4/5Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured: starts with purpose, contrasts with sibling, provides lookup hints, then details parameters. Efficient despite being somewhat lengthy; no unnecessary sentences.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness4/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers usage context, alternative, behavioral constraints, and parameter details. Output schema exists so return value details are not needed. Minor gap: no mention of error cases, but acceptable for a read tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 0% description coverage, but the description explains each parameter (course_identifier, file_id, max_size_mb), including the default and clamping behavior, adding value over the schema's titles.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states it reads a file and returns base64 content, and distinguishes itself from sibling download_course_file by noting it returns content directly rather than saving to filesystem.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly directs the agent to use list_course_files or list_module_items to find file IDs, and explains when this tool is preferable (when the MCP server runs on a different machine). It does not explicitly state when not to use the tool, but it provides a clear alternative.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior5/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds critical behavioral context beyond the destructiveHint annotation, clarifying that the mutation is a partial removal rather than a full deletion. This prevents misunderstanding about the tool's effect.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: a single-sentence purpose, a critical note, and a clear parameter list. Every sentence adds value with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness4/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the tool's core action, side effect, and parameter meanings. It omits prerequisites (e.g., permissions) and error handling, but given the existence of an output schema for return values and the tool's simplicity, it is largely sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters4/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description provides brief but sufficient parameter explanations ('Course code or Canvas ID', 'Module ID containing the item', 'Item ID to remove'). While not exhaustive, these add meaning beyond the raw schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool removes an item from a module, with the important nuance that it only unlinks and does not delete the underlying content. This distinguishes it from sibling tools like delete_module or delete_page.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly highlights that the tool only unlinks, not deletes content, guiding the agent on appropriate use. However, it does not directly name alternative tools for full deletion, relying on the agent to infer from context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
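The unlink-versus-delete distinction can be shown with a toy in-memory course. The data layout and the function name `remove_item_from_module` are hypothetical; the point is only that the module loses its reference while the underlying page survives.

```python
# Toy course: one page, and one module item that links to it.
course = {
    "pages": {"intro": "Welcome to the course"},
    "modules": {1: [{"item_id": 10, "page": "intro"}]},
}


def remove_item_from_module(course: dict, module_id: int, item_id: int) -> None:
    """Unlink an item from a module without touching the content it points to."""
    items = course["modules"][module_id]
    course["modules"][module_id] = [i for i in items if i["item_id"] != item_id]


remove_item_from_module(course, module_id=1, item_id=10)
print(course["modules"][1])       # the module no longer lists the item
print("intro" in course["pages"])  # the page itself still exists
```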
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already set readOnlyHint=true, so the description does not need to restate safety. It adds value by clarifying that the call returns the entire structure and explaining parameters. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single, front-loaded sentence stating the main purpose, followed by a clean docstring-style parameter list. Every word earns its place with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness5/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the output schema exists (though not shown), the description does not need to detail return values. It provides the scope ('full module and item structure') and parameter semantics, making it complete for an agent to understand the tool's function.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters5/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 0% schema description coverage, the description fully compensates by explaining that course_identifier can be a course code or Canvas ID, and include_unpublished controls whether unpublished items are included (default True). This adds essential meaning beyond the schema's type-only definitions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves the full module and item structure for a course in a single call. This verb+resource combination is specific and distinct from sibling tools like list_modules or list_module_items which require multiple calls.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies efficient bulk retrieval with 'in a single call,' which helps the agent choose it over multiple calls. However, it does not explicitly state when not to use it or provide direct alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true, and the description consistently describes a read operation. It adds behavioral context by detailing parameter precedence and lookup methods, which goes beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is extremely concise, using three sentences plus a labeled Args list. Every sentence adds value, with the main purpose front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness5/5Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the presence of an output schema, the description needs only to explain input and behavior. It covers purpose, parameter selection logic, and parameter meanings completely.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters5/5Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema description coverage, the description fully explains each parameter's purpose and format: course_identifier as course code or ID, rubric_id for direct lookup, assignment_id for attached rubric. This adds significant value over the raw schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Purpose5/5Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves detailed rubric criteria, ratings, and points. It distinguishes itself from sibling tools like list_rubrics (listing) and create_rubric (creation) by focusing on retrieval with specific identifiers.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Usage Guidelines4/5Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use rubric_id vs assignment_id, including precedence when both are provided. It does not explicitly mention when not to use the tool, but the guidance is sufficient for correct invocation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
- Behavior4/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate readOnlyHint=true, and the description aligns with a read-only search. The description adds details about output granularity via detail_level, providing extra behavioral context beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Conciseness5/5Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, with a clear purpose statement and well-organized parameter list. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and presence of an output schema, the description sufficiently covers usage and parameters. No gaps are evident.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 0% schema description coverage, the description explains both parameters in detail, adding meaning about filtering and output format that the schema alone does not provide.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool searches Canvas code API tools by keyword, using a specific verb and resource. It distinguishes itself from sibling tools, which perform specific operations.
Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides guidance on parameters, including defaults and options. However, it does not explicitly state when to use this tool vs. alternatives, though its search nature makes it implicit.
- Behavior 5/5
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully discloses behavior: a sandboxed Node.js environment, temp file deletion, an optional network allowlist, timeout, memory, and CPU limits, and best-effort security. This is comprehensive.
Conciseness 4/5
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is informative and well-structured, with bulleted args and highlighted important points. The uppercase IMPORTANT markers add some verbosity, but the description remains efficient and easy to parse.
Completeness 5/5
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given there is an output schema, the description does not need to explain return values. It covers both parameters, provides crucial context about token savings and security, and is complete for the tool's complexity.
Parameters 5/5
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds meaning beyond the input schema: 'code' is explained as TypeScript code with import capability, and 'timeout' is described as the maximum execution time in seconds, with a default. Schema coverage is 0%, so the description fully compensates.
Purpose 5/5
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Execute TypeScript code in a Node.js environment with access to Canvas API,' specifying the verb, resource, and environment. It also mentions the ability to import from './canvas/*' modules, distinguishing it from sibling tools that are primarily Canvas CRUD operations.
Usage Guidelines 4/5
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives explicit guidance on when to use the tool (99.7% token savings for bulk operations) and notes important security considerations. It does not explicitly state when not to use the tool or compare it with alternatives, but the context is clear.
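As a rough illustration of how the six dimension scores of a tool roll up, the sketch below takes a plain unweighted mean of the scores listed above for the last two tools. This is an assumption for illustration only: the page does not say how Glama weights the dimensions, and the dictionary keys are hypothetical labels, not the tools' real names.

```python
from statistics import mean

# Dimension scores copied from the two most recent score cards above.
# The keys are hypothetical labels, not the actual tool names.
scores = {
    "keyword search tool": {
        "Behavior": 4, "Conciseness": 5, "Completeness": 5,
        "Parameters": 5, "Purpose": 5, "Usage Guidelines": 4,
    },
    "TypeScript execution tool": {
        "Behavior": 5, "Conciseness": 4, "Completeness": 5,
        "Parameters": 5, "Purpose": 5, "Usage Guidelines": 4,
    },
}


def unweighted_mean(dimensions: dict[str, int]) -> float:
    """Average the dimension scores equally (an assumed, not documented, weighting)."""
    return round(mean(dimensions.values()), 2)


tool_means = {name: unweighted_mean(dims) for name, dims in scores.items()}
```

Under this assumed equal weighting, both tools total 28 points over six dimensions, which rounds to 4.67 out of 5.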
GitHub Badge
Glama performs regular codebase and documentation scans to:
- Confirm that the MCP server is working as expected.
- Confirm that there are no obvious security issues.
- Evaluate tool definition quality.
Our badge communicates server capabilities, safety, and installation instructions.
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/vishalsachdev/canvas-mcp'
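The same request can be sketched in Python with only the standard library. The URL pattern is taken from the curl example above; the helper names `server_url` and `fetch_server` are hypothetical, and the sketch assumes the endpoint returns a JSON object, which the page does not confirm.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl example above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, repo: str) -> str:
    """Build the directory API URL for one server (hypothetical helper)."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(repo, safe='')}"


def fetch_server(owner: str, repo: str, timeout: float = 10.0) -> dict:
    """GET the server record and decode it, assuming a JSON response body."""
    request = urllib.request.Request(
        server_url(owner, repo),
        method="GET",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("vishalsachdev", "canvas-mcp")` issues the same GET as the curl command; the fields of the returned record are not described on this page, so inspect the result before relying on any key.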
If you have feedback or need assistance with the MCP directory API, please join our Discord server.