Vocab Voyage
Server Details
20 MCP tools + 17 widgets for SAT/ISEE/SSAT/GRE/GMAT/LSAT prep. Flashcards, quizzes & games. Hosted.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: jaideepdhanoa/vocab-voyage-mcp
- GitHub Stars: 0
- Server Listing: Vocab Voyage
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.2/5 across 31 of 31 tools scored. Lowest: 2.4/5.
Every tool has a distinct purpose with clear 'use when' and 'do not use' instructions. Overlapping tools like get_definition, explain_word_in_context, and get_word_of_the_day are carefully disambiguated through descriptions and usage directives.
All tool names follow a consistent verb_noun pattern using snake_case (e.g., get_definition, generate_quiz, play_game, record_word_result). No mixing of conventions or ambiguous verbs.
31 tools cover a broad vocabulary-learning platform spanning study, progress, actions, parent/tutor features, and utilities. The count is on the high side, but each tool earns its place and there is no obvious redundancy.
The tool surface covers core learning, progress tracking, mastery management, parent/tutor features, and support. Minor gaps exist (e.g., no search for words across courses) but most workflows are complete.
Available Tools
31 tools
award_game_xp · A
Award score-based XP from a game/activity (separate from study-time XP). Cascades to the leaderboard via DB trigger. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| xp | Yes | XP to award (>= 0). | |
| reason | No | Optional human label for analytics. | |
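As a rough illustration of how a client might invoke this tool, the sketch below assembles an MCP `tools/call` JSON-RPC request for `award_game_xp`. The helper name `build_award_xp_request` and the request id are made up for this example; only the tool name and the `xp`/`reason` parameters come from the listing, and the response shape is not documented here.

```python
import json
from typing import Optional


def build_award_xp_request(xp: int, reason: Optional[str] = None, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for award_game_xp.

    Hypothetical helper: it only assembles the payload and does not send it.
    """
    if xp < 0:
        # The schema documents xp as ">= 0", so reject negatives client-side.
        raise ValueError("xp must be >= 0")
    arguments = {"xp": xp}
    if reason is not None:
        arguments["reason"] = reason  # optional analytics label
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "award_game_xp", "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(build_award_xp_request(25, reason="word-match round"), indent=2))
```

Because the tool requires sign-in, a real client would also need an authenticated session; that part is left out of the sketch.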
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate it is not read-only or idempotent, and the description adds behavioral context: 'Cascades to the leaderboard via DB trigger' and 'Requires sign-in.' This provides useful disclosure beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise, front-loaded sentences with no unnecessary words. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
There is no output schema, and the description does not explain return values or error states. For a simple mutation with two parameters it is reasonably complete, though it could mention the success/failure response.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with both parameters described. The tool description adds no additional meaning beyond what the schema already provides, meeting the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (award), resource (score-based XP from game/activity), and distinguishes from study-time XP, providing specificity and differentiation from potential sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The only usage guidance is 'Requires sign-in.' No context on when to use this tool versus alternative tools (e.g., other XP-awarding tools or mutation tools) is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
explain_word_in_context · A · Read-only · Idempotent
Explain what a word means inside a specific sentence — useful when a word has multiple meanings.
| Name | Required | Description | Default |
|---|---|---|---|
| word | Yes | | |
| sentence | Yes | The sentence containing the word | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and idempotent behavior; description adds that it explains in context but does not mention return format or edge cases (e.g., handling of missing words). No contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence of 16 words, front-loaded with the action and free of redundancy. Efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Simple tool with two params and no output schema; description covers core purpose and use case, but lacks mention of output format (e.g., text explanation). Adequate for complexity level.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 50% coverage (only 'sentence' described). Description adds no further details about 'word' parameter or format constraints; baseline is 3 due to moderate coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the verb 'explain' and resource 'word in context', distinguishing it from sibling tools like 'get_definition' which gives general definitions. The phrase 'useful when a word has multiple meanings' adds specificity.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides some guidance ('useful when a word has multiple meanings') but does not explicitly exclude alternatives (e.g., use get_definition for general meaning) or state when not to use.
file_support_ticket · A
File a real human-followup support ticket on behalf of the signed-in user. Use this when the user reports a billing problem, bug, account lockout, complaint about a tutor, or anything Sparkle/the agent cannot resolve from data. The ticket is emailed to the support team and a confirmation is sent to the user with a 1-business-day SLA. Categories: billing, bug, account, complaint, feedback, other. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| summary | Yes | One-line description of the issue (what the user needs). | |
| category | Yes | Issue category. Use 'billing' for refunds/charges, 'bug' for crashes/data loss, 'account' for lockouts/access, 'complaint' for tutor/quality issues, 'feedback' for feature requests. | |
| conversation_snippet | No | Optional: last few turns of the conversation for context. | |
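The parameter rules above could be checked on the client before the call is made. The sketch below is one possible way to do that; `build_ticket_arguments` is a hypothetical helper, and the cut-off of three conversation turns is an assumption, since the schema only says "last few turns".

```python
from typing import List, Optional

# Category values listed in the tool description.
CATEGORIES = ("billing", "bug", "account", "complaint", "feedback", "other")


def build_ticket_arguments(summary: str, category: str,
                           turns: Optional[List[str]] = None,
                           max_turns: int = 3) -> dict:
    """Assemble arguments for file_support_ticket (hypothetical helper)."""
    summary = " ".join(summary.split())  # collapse whitespace into a single line
    if not summary:
        raise ValueError("summary is required")
    if category not in CATEGORIES:
        raise ValueError(f"category must be one of {CATEGORIES}")
    args = {"summary": summary, "category": category}
    if turns:
        # "Last few turns" is not quantified in the schema; 3 is an assumption.
        args["conversation_snippet"] = "\n".join(turns[-max_turns:])
    return args
```

Validating the category locally avoids a round trip for an obviously invalid value, though the server remains the authority on what it accepts.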
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses key behavioral traits beyond annotations: the ticket is emailed to support, a confirmation is sent to the user, and there is a 1-business-day SLA. It also notes the requirement for sign-in. These details are not captured in annotations and add important context for the agent.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at five sentences, front-loaded with the primary purpose. Each sentence serves a distinct function: purpose, usage guidance, process details, the list of categories, and the sign-in requirement. No extraneous information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers the tool's purpose, usage guidance, behavioral expectations, and parameter categories comprehensively. However, it does not mention what the tool returns (e.g., ticket ID or confirmation), and there is no output schema to compensate. For a tool that likely produces a result, this leaves a minor gap in understanding the full interaction.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with descriptions for all parameters. The tool description does not add significant new information about parameters beyond what the schema already provides (e.g., category enum descriptions are present in schema). Thus, the description adds minimal semantic value, earning a baseline score of 3.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('File a real human-followup support ticket'), the resource (a support ticket), and the scope (on behalf of the signed-in user). It also lists specific use cases, making it distinct from sibling tools like get_flashcards or award_game_xp.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says when to use this tool ('when the user reports a billing problem, bug, account lockout...') and, by limiting it to 'anything Sparkle/the agent cannot resolve from data', implies that it should not be used when the agent can resolve the issue itself. This provides clear decision criteria for the agent.
generate_quiz · A · Read-only
Use this when the user wants to practice, be quizzed, or test their knowledge across multiple words at once. Generates a 1–10 question multiple-choice quiz for a test family (isee, ssat, sat, psat, gre, gmat, lsat, general). Renders the interactive Vocab Voyage quiz widget on supporting hosts; per-answer taps persist mastery for signed-in users. Do not use for definition lookups — call get_definition instead. Do not use for spaced-repetition flashcards — call get_flashcards instead.
| Name | Required | Description | Default |
|---|---|---|---|
| count | No | Number of questions (1–10), default 5 | |
| level | No | Optional difficulty hint | |
| test_family | Yes | | |
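Since `test_family` has no schema description, a caller may want to check it against the families named in the tool description. The following is a minimal sketch under that assumption; `build_quiz_arguments` is a hypothetical helper, and lowercasing the input is a convenience of the sketch, not documented server behavior.

```python
from typing import Optional

# Test families listed in the tool description.
TEST_FAMILIES = {"isee", "ssat", "sat", "psat", "gre", "gmat", "lsat", "general"}


def build_quiz_arguments(test_family: str, count: int = 5, level: Optional[str] = None) -> dict:
    """Validate and assemble arguments for generate_quiz (hypothetical helper)."""
    family = test_family.strip().lower()  # normalisation is an assumption
    if family not in TEST_FAMILIES:
        raise ValueError(f"unknown test_family: {test_family!r}")
    if not 1 <= count <= 10:
        # The description documents a 1-10 question range, default 5.
        raise ValueError("count must be between 1 and 10")
    args = {"test_family": family, "count": count}
    if level is not None:
        args["level"] = level  # free-form hint; accepted values are not documented
    return args
```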
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description states that per-answer taps persist mastery for signed-in users, which implies a write operation that modifies state. The annotations, however, declare readOnlyHint=true, contradicting the description. Under the scoring rules, a contradiction mandates a score of 1.
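A contradiction of this kind can be caught with a simple lint pass over the tool definitions. The sketch below is a keyword heuristic, not the scoring method used for this listing: the word list in `WRITE_HINTS` and the function name are assumptions, while `readOnlyHint` is the annotation named in the review.

```python
import re
from typing import Dict, List

# Words that suggest a description is talking about writing state (assumed list).
WRITE_HINTS = re.compile(r"\b(persist|record|award|save|update|delete)s?\b", re.IGNORECASE)


def find_readonly_conflicts(tools: List[Dict]) -> List[str]:
    """Return names of tools marked read-only whose description hints at writes."""
    conflicts = []
    for tool in tools:
        read_only = tool.get("annotations", {}).get("readOnlyHint", False)
        if read_only and WRITE_HINTS.search(tool.get("description", "")):
            conflicts.append(tool["name"])
    return conflicts
```

A keyword match only flags candidates for human review; a description can mention persistence that happens in a widget rather than in the tool call itself.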
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is five sentences with no redundancy. It front-loads the use case, then describes functionality and exclusions. Every sentence adds necessary information without waste.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose, usage, and some behavior but leaves a critical gap: the contradiction with readOnlyHint is unresolved. It also does not detail how the quiz widget works or what happens on unsupported hosts. For a tool with 3 params and no output schema, this is adequate but incomplete.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 67% (count and level described, test_family not). The description adds context by listing test family options and the question count range (1–10), but does not fully compensate for the missing test_family description. Overall, it adds moderate value beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates a multiple-choice quiz for test families, specifies the question range, and distinguishes from siblings like get_definition and get_flashcards. The verb 'generate' and resource 'quiz' are explicit, making purpose unmistakable.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use the tool ('when the user wants to practice, be quizzed, or test their knowledge across multiple words at once') and provides two clear exclusions: call get_definition for lookups and get_flashcards for spaced repetition. This fully guides selection among siblings.
get_child_session_detail · A · Read-only · Idempotent
Auth-only. Parent-only. Detailed breakdown for a single child's study/quiz session — accuracy, missed words, duration. Defaults to the most recent session for the parent's first linked child if no child_user_id / session_id is supplied. Ownership-gated: returns an error for unlinked children.
| Name | Required | Description | Default |
|---|---|---|---|
| session_id | No | Optional. Defaults to the child's most recent session. | |
| child_user_id | No | Optional. Defaults to first linked child. | |
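The defaulting and ownership rules described above can be modelled in a few lines. This is only an in-memory sketch of the documented behavior: the `Session` fields, the exception types, and the choice to look only at the first linked child when `session_id` is given without `child_user_id` are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Session:
    session_id: str
    child_user_id: str
    started_at: int  # epoch seconds; field names here are illustrative


def resolve_session(linked_children: List[str], sessions: List[Session],
                    child_user_id: Optional[str] = None,
                    session_id: Optional[str] = None) -> Session:
    """Mimic the documented defaults: first linked child, most recent session."""
    if not linked_children:
        raise PermissionError("no linked children")
    child = child_user_id or linked_children[0]
    if child not in linked_children:
        # Ownership gate: unlinked children produce an error.
        raise PermissionError("child is not linked to this parent")
    own = [s for s in sessions if s.child_user_id == child]
    if session_id is not None:
        own = [s for s in own if s.session_id == session_id]
    if not own:
        raise LookupError("no matching session")
    # With no session_id, fall back to the most recent session.
    return max(own, key=lambda s: s.started_at)
```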
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds behavioral traits beyond annotations: 'Auth-only', 'Parent-only', 'Ownership-gated: returns an error for unlinked children'. This provides important behavioral context not captured by annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short access tags ('Auth-only', 'Parent-only') followed by three sentences, with no wasted words. It front-loads the key constraints and provides essential details in a compact, readable format.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a read-only tool with no output schema and 0 required parameters, the description covers purpose, default behavior, authentication, and ownership. While it doesn't list all return fields beyond accuracy, missed words, and duration, this is sufficient given the tool's simplicity and annotations.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds value by specifying default behavior for both parameters: defaults to the child's most recent session if session_id is omitted, and defaults to the first linked child if child_user_id is omitted. This behavioral context enhances the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves a detailed breakdown of a child's study/quiz session, listing specific fields (accuracy, missed words, duration). It distinguishes itself from siblings like get_session_detail by specifying parent-only and child scope.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear context: requires authentication and parent role, defaults to most recent session for first linked child if parameters omitted, and ownership-gated for unlinked children. It lacks explicit mention of when not to use this tool versus alternatives like get_session_trends, but the context is sufficient for an agent to infer.
get_class_session_trends · A · Read-only · Idempotent
Auth-only. Tutor-only. Aggregate class-level trends across the tutor's classes (default 14 days, max 30). Pass class_id to scope to one class; omit it to get a worst-first rollup across up to 25 classes plus 1–3 struggling students.
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Window size in days (default 14, max 30). | |
| class_id | No | Optional class id to scope to one class. | |
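To make the window limits and the "worst-first" rollup concrete, here is a small sketch. It is not the server's implementation: `accuracy` is a stand-in ranking metric, the minimum window of one day is assumed, and whether the server clamps or rejects out-of-range `days` values is not stated in the listing.

```python
from typing import Dict, List, Optional

DEFAULT_DAYS, MAX_DAYS, MAX_CLASSES = 14, 30, 25


def clamp_days(days: Optional[int]) -> int:
    """Apply the documented window: default 14, capped at 30 (minimum of 1 assumed)."""
    if days is None:
        return DEFAULT_DAYS
    return max(1, min(days, MAX_DAYS))


def worst_first(classes: List[Dict], limit: int = MAX_CLASSES) -> List[Dict]:
    """Order classes from weakest to strongest and keep at most `limit`.

    `accuracy` is a stand-in metric; the listing does not say what the
    server actually ranks by.
    """
    return sorted(classes, key=lambda c: c["accuracy"])[:limit]
```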
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint and destructiveHint, but the description adds valuable behavioral context: auth/role requirements, default and maximum time window, and the rollup behavior with/without class_id. It does not contradict annotations. Slight deduction for not mentioning potential data limits or caching.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with auth requirements, then behavior. Every sentence adds value, no fluff. Highly efficient.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema exists, but the description provides a reasonable notion of what is returned ('aggregate trends', 'worst-first rollup', 'struggling students'). Could be more explicit about the exact fields or format, but it's sufficient for an AI agent to understand the tool's output.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions, but the description adds extra meaning beyond the schema: it explains the effect of omitting 'class_id' (rollup across 25 classes + struggling students) and reinforces defaults/max for 'days'.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb ('Aggregate class-level trends') and resource ('class session trends'), and distinguishes from siblings like 'get_session_trends' by specifying scope ('across the tutor's classes').
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states auth and role requirement ('Auth-only. Tutor-only.'), and provides clear guidance on when to use the 'class_id' parameter versus omitting it, including the behavior difference ('worst-first rollup across up to 25 classes plus 1–3 struggling students').
get_class_standing · A · Read-only · Idempotent
Use this when a signed-in student asks how they're doing in their tutor class, who's ahead, who their rival is, or who they should challenge. Auth-only. Returns weekly XP rank inside the user's tutor class plus a winnable rival suggestion (similar weekly XP). NEVER name the class leader unless the user is rank #1 — the response uses '(top student)' as a deliberate placeholder. Renders the interactive class-standing widget on supporting hosts; falls back to markdown elsewhere. Anonymous callers receive a sign-in prompt.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
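The leader-privacy rule in the description can be illustrated with a tiny markdown renderer. This is a sketch of how such masking might look on the rendering side; the function and its parameter names are hypothetical, and the real response fields are not documented in the listing.

```python
from typing import Optional


def format_standing(rank: int, class_size: int, leader_name: str,
                    rival_name: Optional[str] = None) -> str:
    """Render a markdown fallback that hides the leader unless the caller is #1."""
    # Only the rank #1 student may see the leader's name (their own).
    leader = leader_name if rank == 1 else "(top student)"
    lines = [
        f"**Your weekly rank:** #{rank} of {class_size}",
        f"**Class leader:** {leader}",
    ]
    if rival_name is not None:
        lines.append(f"**Suggested rival:** {rival_name}")
    return "\n".join(lines)
```

An agent relaying this output should keep the placeholder as-is rather than trying to fill in the leader's name from other sources.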
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description reveals key behaviors: auth-only, privacy of class leader (placeholder 'top student' unless user is rank #1), and rendering differences based on host support. Annotations confirm read-only and idempotent nature; description adds context beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is somewhat verbose but front-loaded with purpose and usage. Each sentence contributes value, though a slight reduction could improve conciseness.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With no parameters and no output schema, the description adequately covers what the tool returns (rank, rival) and key behaviors (auth, leader privacy, rendering fallback). Complete for a zero-parameter tool.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so baseline is 4. The description adds no parameter info, which is appropriate since there are none.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states that the tool returns weekly XP rank inside the user's tutor class and a winnable rival suggestion. It uses specific verbs and resource names, and distinguishes itself from siblings by focusing on class standing and rival challenges.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly lists when to use this tool (signed-in student asking about class standing, rivals) and notes auth requirement and anonymous fallback. It lacks explicit alternative guidance but is clear enough for the intended scenarios.
get_course_word_list · A · Read-only · Idempotent
Get a sample of vocabulary words from a specific Vocab Voyage course. Use list_courses to discover slugs.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | 1–50, default 20 | |
| course_slug | Yes | | |
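The two-step workflow (discover a slug with `list_courses`, then fetch words) might be chained as in the sketch below. The `call_tool` callable stands in for whatever MCP client performs the round trip; the response keys (`courses`, `slug`, `title`, `words`), the empty argument object for `list_courses`, and the fixture data are all invented for illustration.

```python
from typing import Callable, Dict, List

# A caller-supplied function that performs the actual MCP tools/call round trip.
ToolCaller = Callable[[str, Dict], Dict]


def sample_course_words(call_tool: ToolCaller, course_title: str, limit: int = 20) -> List[str]:
    """Look up a course slug via list_courses, then fetch a word sample."""
    if not 1 <= limit <= 50:
        raise ValueError("limit must be between 1 and 50")
    courses = call_tool("list_courses", {})["courses"]  # response shape is assumed
    slug = next((c["slug"] for c in courses if c["title"] == course_title), None)
    if slug is None:
        raise LookupError(f"no course titled {course_title!r}")
    result = call_tool("get_course_word_list", {"course_slug": slug, "limit": limit})
    return result["words"]  # response shape is assumed


def fake_call_tool(name: str, arguments: Dict) -> Dict:
    """In-memory stand-in with made-up data so the sketch runs without a server."""
    if name == "list_courses":
        return {"courses": [{"slug": "sat-core", "title": "SAT Core"}]}
    if name == "get_course_word_list":
        return {"words": ["abate", "candor", "laconic"][: arguments["limit"]]}
    raise KeyError(name)
```

Passing the transport in as a function keeps the workflow testable without a network connection.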
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already provide readOnlyHint, idempotentHint, and destructiveHint, so the description does not need to repeat those. It adds value by describing the result as a 'sample' (implying non-exhaustive) and by mentioning the slug discovery workflow. It does not disclose rate limits or pagination, but annotations cover safety.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences with no extraneous information. It is front-loaded with the main action and immediately provides a critical usage hint (list_courses). Every part serves a purpose.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (2 parameters, no output schema, many sibling tools), the description adequately covers what the tool does and how to use it. It does not explain the return format or behavior for empty results, but for a sample retrieval tool this is acceptable.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 50% (limit parameter described, course_slug not). The description adds meaning by stating the course is a 'Vocab Voyage course' and instructing to use 'list_courses' to get the slug, clarifying the source of course_slug. For limit, the schema already provides details (1–50, default 20).
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool gets 'a sample of vocabulary words from a specific Vocab Voyage course', using a specific verb and resource. It distinguishes from sibling tools like 'get_definition' and 'get_flashcards' by specifying it returns a sample from a course, and directs users to 'list_courses' to find slugs.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description indicates when to use the tool (getting vocabulary from a course) and suggests using 'list_courses' to discover slugs. However, it does not explicitly state when not to use it or provide comparisons to alternative siblings like 'get_word_of_the_day' or 'get_flashcards', which also return words.
get_definition · A · Read-only · Idempotent
Use this when the user asks what a specific word means, requests its definition, part of speech, synonyms/antonyms, or an example sentence. Returns curated dictionary data from the Vocab Voyage corpus. Do not use for sentence-level meaning disambiguation (call explain_word_in_context) or for daily word prompts (call get_word_of_the_day).
| Name | Required | Description | Default |
|---|---|---|---|
| word | Yes | The word to define | |
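The "use when / do not use" guidance for the three overlapping word tools amounts to a small routing decision. The sketch below encodes one reading of it; `pick_word_tool` is a hypothetical helper, and the empty argument object for `get_word_of_the_day` is an assumption because that tool's parameters are not shown in this section.

```python
from typing import Dict, Optional, Tuple


def pick_word_tool(word: Optional[str] = None, sentence: Optional[str] = None) -> Tuple[str, Dict]:
    """Choose among the three overlapping word tools based on what the user supplied."""
    if word and sentence:
        # Meaning inside a specific sentence: use the disambiguation tool.
        return "explain_word_in_context", {"word": word, "sentence": sentence}
    if word:
        # Plain "what does X mean?" lookup.
        return "get_definition", {"word": word}
    # No word given: treat the request as a daily-word prompt.
    return "get_word_of_the_day", {}
```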
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare the tool read-only and idempotent, so the description only adds that the data comes from the 'Vocab Voyage corpus', which is a minor detail. No contradictions.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three clear, front-loaded sentences that each serve a purpose without waste.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one param, no enums, no output schema), the description fully covers the agent's needs.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% for the single parameter; description adds no additional meaning beyond what the schema already provides.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool is used for word definitions, part of speech, synonyms/antonyms, and example sentences, clearly distinguishing it from sibling tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use and when-not-to-use guidance, including alternatives explain_word_in_context and get_word_of_the_day.
get_flashcards · A · Read-only
Use this when the user asks for flashcards, wants to drill words individually, or wants a tap-to-flip review session. Returns 1–12 cards for a test family. Renders the interactive Vocab Voyage flashcards widget on supporting hosts; per-card 'I knew it / I didn't' buttons persist mastery for signed-in users. Do not use for multiple-choice testing (call generate_quiz) or for a single word lookup (call get_definition).
| Name | Required | Description | Default |
|---|---|---|---|
| count | No | Number of cards (1–12), default 5 | |
| test_family | No | isee, ssat, sat, psat, gre, gmat, lsat, general | |
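As a rough illustration of the parameter constraints above, a client could normalise its arguments before calling get_flashcards. The helper name `build_flashcards_args` is hypothetical, and clamping out-of-range counts is an assumption of this sketch; the server may instead reject such values.

```python
# Hypothetical client-side helper; not part of the Vocab Voyage server.
TEST_FAMILIES = {"isee", "ssat", "sat", "psat", "gre", "gmat", "lsat", "general"}


def build_flashcards_args(count=None, test_family=None):
    """Return an arguments dict for get_flashcards, omitting unset values."""
    args = {}
    if count is not None:
        # Documented range is 1-12; clamping is an assumption of this sketch.
        args["count"] = max(1, min(12, int(count)))
    if test_family is not None:
        family = test_family.strip().lower()
        if family not in TEST_FAMILIES:
            raise ValueError(f"unknown test_family: {test_family!r}")
        args["test_family"] = family
    return args
```

Leaving unset parameters out of the dict lets the server apply its own documented default of 5 cards.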
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false, so the safety profile is covered. The description adds useful context: interactive widget rendering, per-card mastery persistence for signed-in users. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four sentences, each purposeful. The first states when to use the tool, the second specifies the card range, the third covers widget rendering and mastery persistence, and the fourth provides exclusion guidance. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 2 parameters and no output schema, the description adequately explains the tool's function and return behavior (1–12 cards, interactive widget). Could mention that the output is a list of card objects, but the interactive widget implication covers it. Slight gap but still complete enough for an agent.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so the schema itself documents both parameters (count, test_family) with descriptions. The description does not add additional semantic detail beyond what the schema provides, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses specific verbs like 'returns' and 'renders' and clearly identifies the resource ('flashcards for a test family'). It distinguishes this tool from siblings: not for multiple-choice testing (use generate_quiz) nor single word lookup (use get_definition).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: user asks for flashcards, wants to drill words individually, or wants a tap-to-flip review session. Also provides explicit exclusions with alternative tool names (generate_quiz, get_definition).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_my_progress (A, Read-only)
Use this when the signed-in user asks about their own streak, XP, words mastered, recent activity, or 'how am I doing'. Auth-only personal dashboard. Renders the interactive Vocab Voyage progress widget on supporting hosts; falls back to markdown elsewhere. Anonymous callers receive a sign-in prompt. Do not use for global stats or other users' progress.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint and non-destructive behavior. The description adds that the tool is auth-only, renders an interactive widget on supporting hosts, falls back to markdown elsewhere, and returns a sign-in prompt to anonymous callers. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five concise sentences, front-loaded with the core purpose. Every sentence adds valuable context without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, the description fully explains the tool's behavior, including the rendered output (widget or markdown) and the data contained. Complete for a parameterless auth-bound tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters, so the baseline score is 4. No additional parameter information is needed.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool retrieves the signed-in user's personal progress including streak, XP, words mastered, and recent activity. It distinguishes from siblings by explicitly excluding global stats and other users' progress.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use ('the signed-in user asks about their own progress') and when not to use (not for global stats or other users). It also mentions auth-only requirement and fallback behavior.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_pending_invites (A, Read-only, Idempotent)
Use this when the signed-in user asks about pending parent invites, share codes, or whether their parent invite has been accepted yet. Returns each pending invite with hours_until_expiry. RULE: if any invite has hours_until_expiry < 24 (and not expired), proactively offer to resend it via the resend-parent-invite flow. If expired, offer to send a fresh invite. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
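The RULE in the description can be sketched as a small classifier over the returned invites. This is only an illustration: the `id` field, the action labels, and the convention that `hours_until_expiry <= 0` means "expired" are assumptions, since the listing names only `hours_until_expiry`.

```python
# Assumptions: each invite is a dict with a numeric "hours_until_expiry"
# field, and a value <= 0 means the invite has already expired.
def plan_invite_followups(invites):
    """Map each pending invite to the follow-up the RULE asks for."""
    followups = []
    for invite in invites:
        hours = invite["hours_until_expiry"]
        if hours <= 0:
            action = "offer_fresh_invite"  # expired: offer a new invite
        elif hours < 24:
            action = "offer_resend"  # expiring soon: offer to resend
        else:
            action = "none"
        followups.append((invite.get("id"), action))
    return followups
```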
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnly and idempotent. Description adds valuable behavioral context: returns hours_until_expiry, proactive resend rule, and sign-in requirement.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise, front-loaded with usage condition, and includes a clear rule. Every sentence adds value with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given zero parameters and no output schema, description covers purpose, usage, return info, and behavioral rules comprehensively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters, so schema coverage is 100% trivially. Description adds value by mentioning output fields (hours_until_expiry), fulfilling baseline expectations.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it handles pending parent invites, share codes, and invite acceptance status. Distinguishes from sibling tools as no other tool mentions invites.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('when the signed-in user asks about...'), includes a rule for proactive action based on expiry, and notes the sign-in requirement. Lacks explicit when-not-to-use guidance or alternatives, but is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_recent_mistakes (A, Read-only, Idempotent)
Use this when the signed-in user asks about words they've gotten wrong, missed words, words to review, or wants to revisit recent mistakes. Returns up to 25 words from the last N days (default 7) with miss-rate and last-seen timestamp, plus a link to the in-app Recent Mistakes page. SUMMARISE — never dump every row; tell the user the count, name 2–3 sample words, and recommend the page URL. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Lookback window in days (1–90, default 7) | |
| limit | No | Max words to return (1–50, default 10) | |
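The SUMMARISE instruction (report the count, name 2–3 sample words, recommend the page URL) might look like the following on the agent side. The row shape (`word` key) and the `page_url` argument are assumptions; the listing does not give the actual field names.

```python
# Assumptions: `rows` is a list of dicts with a "word" key, and `page_url`
# is the Recent Mistakes link returned alongside them.
def summarise_mistakes(rows, page_url, sample_size=3):
    """Build a short summary instead of dumping every row."""
    if not rows:
        return "No recent mistakes found."
    samples = [row["word"] for row in rows[:sample_size]]
    return (
        f"You missed {len(rows)} word(s) recently, including "
        f"{', '.join(samples)}. Review them all at {page_url}"
    )
```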
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and idempotent. Description adds concrete behavioral details: returns up to 25 words, default 7 days, fields (miss-rate, last-seen, URL), and requires sign-in. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four short sentences, front-loaded with the use case, followed by the return payload, the summarization rule, and the sign-in requirement. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Completely covers the tool's purpose, parameters, return fields, and usage guidelines. No gaps given the simple nature and existing schema/annotations.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with descriptions. Description adds default values for days (7) and hints at max limit (25, though schema allows up to 50—minor inconsistency, but still helpful). Also clarifies summarization behavior.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clear verb+resource: retrieves recent mistakes (words gotten wrong). Distinguishes from siblings like get_flashcards or get_word_of_the_day by specifying the use case exactly.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use ('when the signed-in user asks about words they've gotten wrong...') and provides a summarization instruction. Does not explicitly mention when not to use, but the context is clear given sibling tools.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_recommended_next_action (A, Read-only)
One-line 'do this next' hint derived from the user's current lifecycle phase. Useful when the agent wants a quick recommendation without rendering a full guidance card.
| Name | Required | Description | Default |
|---|---|---|---|
| persona | No | Optional persona override. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and destructiveHint=false; description adds that it's a quick hint, which is consistent but does not introduce new behavioral details beyond what is in annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two short sentences that efficiently convey purpose and usage, with no wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one optional parameter and no output schema, the description adequately covers purpose and usage context, though it could mention that no output schema means response may vary.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%; the description does not add meaning beyond the schema's description of 'persona' as an optional override. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description specifies verb 'get', resource 'recommended next action', and distinguishes from siblings by noting it's a quick hint without a full guidance card, unlike get_sparkle_guidance.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States 'useful when the agent wants a quick recommendation without rendering a full guidance card', providing clear usage context and implying when not to use (when full guidance is needed).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_session_detail (A, Read-only, Idempotent)
Use this when the signed-in user asks 'what did I miss in [that session]', 'which words tripped me up', or 'what was my accuracy on session X'. Pass a session_id (study_sessions.id or adaptive_sessions.id, usually obtained from get_recent_session_results / a picker chip). Returns title, accuracy %, wrong_words[] (max 10), and a per-card timeline (truncated to first 20 events). Cite at least one wrong word and the accuracy in your reply.
| Name | Required | Description | Default |
|---|---|---|---|
| session_id | Yes | study_sessions.id or adaptive_sessions.id (UUID) | |
| include_timeline | No | Include per-card timeline (default true). Truncated to 20 events. | |
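The closing instruction ("cite at least one wrong word and the accuracy") could be honoured with a reply builder along these lines. The result keys `title`, `accuracy`, and `wrong_words` mirror the names in the description, but their exact spelling and types in the real payload are assumptions.

```python
# Assumption: the result is a dict with "title", "accuracy" (a percentage)
# and "wrong_words" (a list of strings); the real field names may differ.
def session_reply(detail):
    """Compose a reply that cites the accuracy and at least one wrong word."""
    title = detail.get("title", "that session")
    line = f"In {title} you scored {detail['accuracy']}% accuracy."
    wrong = detail.get("wrong_words") or []
    if wrong:
        # Cite up to three missed words so the reply stays short.
        line += f" Words to revisit: {', '.join(wrong[:3])}."
    else:
        line += " You did not miss any words."
    return line
```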
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false, so the description's behavioral detail is supplementary. It adds truncation limits (max 10 wrong words, first 20 timeline events) which is useful, but not necessary beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four concise sentences. The first gives usage context, the second explains the parameter, the third lists the return values, and the fourth gives a reply instruction. No wasted words; front-loaded with crucial information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description fully explains return fields (title, accuracy percentage, wrong_words max 10, per-card timeline truncated to 20). Combined with input schema and annotations, the tool is completely specified for a read operation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for both parameters. The description adds context: session_id is obtained from get_recent_session_results or a picker chip, and include_timeline defaults to true with truncation. This adds meaning beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns session details (title, accuracy, wrong words, timeline) and provides example user queries. It distinguishes slightly by mentioning how to obtain the session_id from related tools, but does not explicitly differentiate from siblings.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly specifies when to use this tool with concrete user utterances (e.g., 'what did I miss in that session'). Provides context for when to invoke, but does not mention when not to use or suggest alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_session_trends (A, Read-only, Idempotent)
Auth-only. Personal study trends over a window (default 14 days, max 90): session count, total minutes, accuracy trend (up/down/flat), and top-missed words. Use after a user asks 'how am I trending / am I improving / which words keep tripping me up'.
| Name | Required | Description | Default |
|---|---|---|---|
| days | No | Window size in days (default 14, max 90). | |
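For orientation, a call to this tool over MCP is a JSON-RPC `tools/call` request. The sketch below builds one with the `days` window kept inside the documented maximum; the lower bound of 1 and the client-side clamping are assumptions, and transport and authentication are out of scope here.

```python
import json


def trends_request(days=14, request_id=1):
    """Build an MCP tools/call request for get_session_trends."""
    # Documented default is 14 and max is 90; the lower bound of 1 is assumed.
    days = max(1, min(90, int(days)))
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_session_trends", "arguments": {"days": days}},
    }


print(json.dumps(trends_request(days=120), indent=2))
```

Here a requested window of 120 days is reduced to 90 before the request is sent.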
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and idempotent. The description adds 'Auth-only', specifies the default window (14 days) and maximum (90), and lists the output components (session count, minutes, accuracy trend, top-missed words), providing transparency beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences cover purpose, output details, usage trigger, and parameter constraints. No redundant or extraneous text; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read-only tool with one optional parameter and no output schema, the description is complete. It explains what the tool returns, its default/max, and when to use it, meeting all needs for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and the schema description already includes 'Window size in days (default 14, max 90).' The tool description repeats this info but does not add new parameter semantics; however, it reinforces the parameter's role in context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool provides personal study trends (session count, total minutes, accuracy trend, top-missed words) over a window. It also gives example user queries that trigger usage, distinguishing it from siblings like get_class_session_trends.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Use after a user asks...' which defines the context. It lacks explicit 'do not use' statements but the intention is clear and the sibling get_class_session_trends exists as an alternative for class-level trends.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_sparkle_guidance (A, Read-only)
Returns Vocab Voyage's lifecycle-aware guidance: the user's current phase (e.g. student.at_risk), a friendly greeting, 2–3 recommended tool calls, and an optional CTA. Renders the session-debrief widget on supporting hosts. Anonymous callers get visitor.* phase suggestions.
| Name | Required | Description | Default |
|---|---|---|---|
| persona | No | Optional persona override: student \| parent \| tutor \| explorer. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations declare readOnlyHint=true and no destructive behavior. The description adds context about lifecycle awareness, anonymous caller handling, and widget rendering, which aligns with read-only behavior. No contradictions; behavior is well explained.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with the core function, followed by supplementary details. No unnecessary words; each sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains the return content (phase, greeting, recommendations, CTA) and behavior for anonymous callers. Minor gap: structure of recommended tool calls not specified, but overall sufficient.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers the persona parameter with 100% coverage but lacks enum values. The description adds the valid values (student, parent, tutor, explorer) beyond what the schema provides, improving semantic clarity.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states it returns lifecycle-aware guidance including the user's current phase, greeting, recommended tool calls, and CTA. It distinguishes itself from sibling tools like get_recommended_next_action by mentioning lifecycle awareness and widget rendering.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for obtaining guidance data, but does not explicitly state when to use this tool over alternatives like get_recommended_next_action or other get_* tools. No when-not-to-use guidance or exclusion criteria are provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_study_plan_recommendation (A, Read-only)
Auth-only. Returns a personalized N-day study plan (default 7, range 3–7) chosen from one of four focus modes (weak-topic-drill / streak-recovery / new-words / review-mastery) based on the user's recent trends. Inline only the first 3 days; full plan persists when the user clicks the Vocab Voyage start link.
| Name | Required | Description | Default |
|---|---|---|---|
| horizon_days | No | Plan length in days (3–7, default 7). | |
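The "inline only the first 3 days" instruction can be sketched as a formatter that truncates the plan and points to the start link for the rest. The per-day keys (`day`, `focus`) and the `start_url` argument are hypothetical; the listing does not document the response shape.

```python
# Assumption: the plan arrives as an ordered list of per-day entries with
# "day" and "focus" keys, plus a start link; real field names may differ.
def inline_plan(days, start_url, inline_limit=3):
    """Show only the first few days and point to the start link for the rest."""
    lines = [f"Day {d['day']}: {d['focus']}" for d in days[:inline_limit]]
    remaining = len(days) - len(lines)
    if remaining > 0:
        lines.append(f"(+{remaining} more days) Start the full plan: {start_url}")
    return "\n".join(lines)
```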
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint and non-destructive behavior. The description adds behavioral details on focus modes, the inline limit of the first 3 days, and persistence via a start link, enhancing transparency beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences: first covers authentication and core function, second details behavioral specifics. No wasted words, well-structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given a single parameter and no output schema, the description covers focus modes, default/range, and inline behavior, sufficiently complete for the tool's complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema describes the parameter with 100% coverage, and the description adds contextual information on the default (7) and range (3–7), adding value beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns a personalized N-day study plan, specifies focus modes, and differs from siblings like 'get_recommended_next_action' or 'study_plan_preview' by detailing output and behavior.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Indicates 'Auth-only' and describes when to use for personalized study plans based on recent trends, but does not explicitly exclude alternatives or state when not to use.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_word_of_the_day (A, Read-only, Idempotent)
Use this when the user asks for today's word, a daily vocabulary nudge, or a single-word warmup. Returns today's deterministic Word of the Day (definition, part of speech, example, synonyms/antonyms), optionally scoped to a test family (isee, ssat, sat, psat, gre, gmat, lsat, general). Do not use for arbitrary lookups — call get_definition instead.
| Name | Required | Description | Default |
|---|---|---|---|
| test_family | No | Optional test family: isee, ssat, sat, psat, gre, gmat, lsat, general | |
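To make "deterministic" concrete: one way a daily pick can be made repeatable is to hash the date together with the test family and use the result as an index. This is purely illustrative; the listing does not say how the server chooses its word, and the function name and word list are hypothetical.

```python
import hashlib
from datetime import date


def pick_word_of_the_day(words, day, test_family="general"):
    """Pick the same word for the same (day, test_family) pair every time."""
    # Hashing a stable key gives a repeatable index without stored state.
    key = f"{day.isoformat()}:{test_family}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(words)
    return words[index]
```

Because the result depends only on the inputs, repeated calls on the same day return the same word, which matches the idempotent annotation on this tool.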
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses return fields and deterministic nature. Annotations already indicate read-only and idempotent; description adds value by specifying the exact content of the response. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with purpose and usage, then return details. No unnecessary words or repetition.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one optional parameter and no output schema, the description fully covers what the tool returns and when to use it, making it complete for agent invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage for the single parameter. Description adds confirmation of the valid values and optionality but does not provide new information beyond the schema description.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns today's Word of the Day, specifying it includes definition, part of speech, example, synonyms/antonyms. It distinguishes from sibling get_definition by explicitly saying not to use for arbitrary lookups.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use scenarios (user asks for today's word, daily nudge, warmup) and when-not-to (arbitrary lookups) with a direct alternative (get_definition). Also mentions optional scoping to test families.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_courses (A, Read-only, Idempotent)
Lists all 13 Vocab Voyage courses with their slugs and descriptions.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnly, idempotent, and non-destructive behavior. The description adds the detail that exactly 13 courses are listed, which is a specific behavioral trait beyond annotations. It could also mention that the list is static, but the provided detail is sufficient.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence that immediately states the action and scope. Every word is necessary, and it is front-loaded with the verb 'Lists'.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple listing tool with no parameters and annotations covering safety, the description adequately specifies the output (slugs and descriptions) and the fixed size (13 courses). No additional context is needed.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are no parameters, and schema coverage is 100%. The description does not need to add parameter information. According to guidelines, zero parameters yields a baseline of 4.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists all 13 Vocab Voyage courses and specifies the returned fields (slugs and descriptions). It effectively distinguishes from sibling tools like list_starter_prompts which list different resources.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description does not provide explicit guidance on when to use this tool versus alternatives, but the lack of parameters and simple listing function makes usage obvious. No exclusions or alternatives are mentioned, which is acceptable for a straightforward tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_starter_prompts · A · Read-only · Idempotent
Lists Vocab Voyage's MCP starter prompts (also exposed via the standard MCP prompts/list endpoint). Useful for hosts that don't yet support prompts/list.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds no behavioral traits beyond what the annotations already provide (readOnlyHint, idempotentHint, destructiveHint). It only mentions the alternative endpoint, which is useful context but not a behavioral disclosure. Annotations already declare the tool as safe and idempotent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short sentences, roughly 20 words in total, concise and front-loaded with the core purpose. No redundant information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple list tool with no parameters, safety annotations in place, and no output schema, the description fully covers what the tool does and when to use it. It is complete within the context of the available structured fields.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has 0 parameters, and the schema coverage is 100% (empty schema). Per guidelines, baseline is 4. The description adds minimal extra meaning beyond the schema, but it is adequate for a parameterless tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Lists' and the resource 'Vocab Voyage's MCP starter prompts', and distinguishes it by mentioning it is also exposed via the standard prompts/list endpoint. This is specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Useful for hosts that don't yet support prompts/list', providing clear guidance on when to use this tool. It does not mention when not to use it, but the context is sufficiently clear for selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
mark_word_difficult · A · Idempotent
Manually mark a word as still-learning for the signed-in user (resets mastery toward learning band). Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| word | No | | |
| card_id | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=false, idempotentHint=true, destructiveHint=false. The description adds value by specifying the auth requirement ('Requires sign-in') and the behavioral effect on mastery ('resets mastery toward learning band'), which goes beyond what annotations provide. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, very concise, and front-loaded with the action. Every sentence provides essential information without waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description covers purpose and auth requirements but lacks details on output (no output schema), parameter usage, and whether the operation is reversible. For a tool that modifies mastery state, more context on consequences would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has two parameters (word, card_id) with no descriptions. Schema description coverage is 0%. The description does not explain what card_id represents or how the parameters relate. It only implies that 'word' is the word being marked. This leaves the agent unclear on how to use both parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Manually mark a word as still-learning'), the resource ('for the signed-in user'), and the effect ('resets mastery toward learning band'). It distinguishes from sibling tools like 'mark_word_known' which implies the opposite action.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies manual override ('Manually mark') but does not explicitly state when to use this tool versus alternatives like 'record_word_result' or 'get_recommended_next_action'. No when-not or alternative guidance is provided.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
mark_word_known · B · Idempotent
Manually mark a word as mastered for the signed-in user (same as the flashcard 'I knew this' override). Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| word | No | | |
| card_id | No | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare idempotentHint=true and destructiveHint=false, so the description adds little beyond confirming mutation and sign-in. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Single sentence with a concise note, front-loaded with the core purpose. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 2 undocumented parameters and no output schema, the description fails to explain parameter relationships, response behavior, or success/failure indicators, leaving the tool under-specified for correct invocation.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, and the description does not explain the meaning or format of the 'word' and 'card_id' parameters, leaving the agent without necessary context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'mark', the resource 'word as mastered', and the user context 'for the signed-in user'. It also distinguishes from sibling tools like 'mark_word_difficult' by specifying 'mastered'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context via the flashcard analogy and notes the sign-in requirement, but it does not explicitly state when not to use this tool or compare it with alternatives like 'mark_word_difficult'.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
nudge_child · A
Parent-only. Sends a 'check-in' push notification (and email fallback) to a linked child. Use when the parent says things like 'remind my kid to study', 'nudge my child', 'tell Sam to do their words today'. The server enforces a 24h cooldown per child — if rate-limited the response includes retry_after_hours. NEVER spoof a different parent — the calling user must already be linked to the child. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| reason | No | Optional short reason (≤200 chars), e.g. 'streak at risk' | |
| message | No | Optional personal message (≤280 chars) shown to the child | |
| child_user_id | Yes | user_id of the linked child to nudge | |
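A minimal sketch of how a host might assemble the MCP `tools/call` request for nudge_child while respecting the length limits in the table above. The helper name `build_nudge_request` and the choice to reject over-long text (rather than truncate it) are assumptions for illustration; only the tool name, parameter names, and limits come from this listing.

```python
import json

MAX_REASON_CHARS = 200   # limit listed for `reason`
MAX_MESSAGE_CHARS = 280  # limit listed for `message`


def build_nudge_request(child_user_id, reason=None, message=None, request_id=1):
    """Build a JSON-RPC `tools/call` payload for nudge_child (hypothetical helper)."""
    if not child_user_id:
        raise ValueError("child_user_id is required")
    if reason is not None and len(reason) > MAX_REASON_CHARS:
        raise ValueError("reason exceeds 200 characters")
    if message is not None and len(message) > MAX_MESSAGE_CHARS:
        raise ValueError("message exceeds 280 characters")

    # Only include optional fields that were actually provided.
    arguments = {"child_user_id": child_user_id}
    if reason is not None:
        arguments["reason"] = reason
    if message is not None:
        arguments["message"] = message

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "nudge_child", "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(build_nudge_request("child-123", reason="streak at risk"), indent=2))
```

The server still enforces the 24h cooldown, so a caller should be prepared to read retry_after_hours from a rate-limited response; the exact response shape is not documented here.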
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Describes server-enforced 24h cooldown with retry_after_hours in response, and email fallback. Annotations already declare non-read-only and non-destructive, so description adds value with these behavioral details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Six short sentences, no fluff. Front-loaded with key info. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers prerequisites, behavior, cooldown, and fallback. No output schema, but response details (retry_after_hours) are mentioned. Missing explicit error handling but sufficient for a tool of this complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
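Schema coverage is 100%. The examples and character limits (e.g., 'Optional short reason (≤200 chars)') come from the schema's own parameter descriptions; the tool description itself adds the linking requirement for child_user_id (the caller must already be linked to the child).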
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Explicitly states 'Sends a check-in push notification (and email fallback) to a linked child' and gives specific example phrases that trigger the tool. Clearly distinguishes itself from sibling tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use examples like 'remind my kid to study'. Also states constraints: parent-only, requires sign-in, never spoof parent, and mentions 24h cooldown. Does not explicitly mention alternative tools but given siblings are unrelated, this is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
play_game · A
Use this when the user wants to play a vocabulary game, asks for something fun, or wants to learn through play. Launches one of 11 mini-games inside the host chat. Renders the matching ui://vocab-voyage/game/{slug} widget on supporting hosts; falls back to a deep link elsewhere. Per-question answers persist via record_word_result; round completion fires record_session_complete + award_game_xp so MCP play counts toward streaks, XP, and mastery for signed-in users. Supported slugs: word_match, spelling_bee, speed_round, synonym_showdown, word_scramble, fill_in_blank, context_clues, word_guess, picture_match, crossword, word_search. Do not use for a serious test-prep quiz — call generate_quiz instead.
| Name | Required | Description | Default |
|---|---|---|---|
| slug | Yes | Game slug: word_match \| spelling_bee \| speed_round \| synonym_showdown \| word_scramble \| fill_in_blank \| context_clues \| word_guess \| picture_match \| crossword \| word_search | |
| count | No | Words in the round (4–12, default 8) | |
| test_family | No | Optional: isee, ssat, sat, psat, gre, gmat, lsat, general | |
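The constraints in the table above can be checked client-side before calling play_game. The following is a sketch only: the function name `build_play_game_arguments` is hypothetical, and whether the server rejects or clamps out-of-range values is not stated in this listing, so the sketch simply refuses to build invalid arguments.

```python
# Slugs and test families copied from the play_game listing.
GAME_SLUGS = {
    "word_match", "spelling_bee", "speed_round", "synonym_showdown",
    "word_scramble", "fill_in_blank", "context_clues", "word_guess",
    "picture_match", "crossword", "word_search",
}
TEST_FAMILIES = {"isee", "ssat", "sat", "psat", "gre", "gmat", "lsat", "general"}


def build_play_game_arguments(slug, count=8, test_family=None):
    """Return an arguments dict for play_game, or raise ValueError (hypothetical helper)."""
    if slug not in GAME_SLUGS:
        raise ValueError(f"unknown game slug: {slug!r}")
    if not 4 <= count <= 12:
        raise ValueError("count must be between 4 and 12")
    arguments = {"slug": slug, "count": count}
    if test_family is not None:
        if test_family not in TEST_FAMILIES:
            raise ValueError(f"unknown test family: {test_family!r}")
        arguments["test_family"] = test_family
    return arguments


if __name__ == "__main__":
    print(build_play_game_arguments("word_match", test_family="sat"))
```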
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Details side effects: renders UI widget, falls back to deep link, persists answers via record_word_result, fires record_session_complete and award_game_xp. This adds significant context beyond annotations (readOnlyHint: false) and does not contradict them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-organized: usage context, behavior, side effects, slug list, exclusion. Every sentence is informative and earns its place with no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given complexity (3 params, no output schema), the description is complete: it explains what the tool does, how it works, its side effects, and differentiates from siblings. An agent can confidently select and invoke this tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, so baseline is 3. Description lists slugs again but adds little new meaning beyond schema descriptions. No extra details on count or test_family beyond what schema provides.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool is for playing vocabulary games or having fun learning, with a specific verb ('play') and resource ('mini-game'). It distinguishes from sibling generate_quiz by explicitly excluding serious test-prep quizzes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit when-to-use criteria (wants to play, asks for fun, learn through play) and when-not-to-use (serious test-prep, directing to generate_quiz). Also explains that results persist for streaks and XP, guiding correct invocation context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
record_session_complete · A
Record a completed study session: writes study_sessions, awards study-time XP (+1/min, capped 30/day), and updates the daily streak. Use after a play_game / quiz / flashcard session ends. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| deck_id | No | Optional deck UUID; omit for ad-hoc MCP sessions. | |
| total_count | No | | |
| session_type | No | e.g. mcp_word_match, mcp_flashcard, mcp_quiz. | |
| cards_studied | Yes | | |
| correct_count | No | | |
| session_title | No | | |
| time_spent_seconds | Yes | | |
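The study-time XP rule in the description (+1 per minute, capped at 30 per day) can be illustrated with a small estimate. This is a sketch under stated assumptions: partial minutes are assumed not to count, and `xp_already_earned_today` is a hypothetical input the caller would have to track; the server's own calculation is authoritative.

```python
DAILY_STUDY_XP_CAP = 30  # "capped 30/day" per the tool description


def estimate_study_xp(time_spent_seconds, xp_already_earned_today=0):
    """Estimate study-time XP for one session (illustrative, not the server's code)."""
    if time_spent_seconds < 0 or xp_already_earned_today < 0:
        raise ValueError("inputs must be non-negative")
    whole_minutes = time_spent_seconds // 60  # assumption: partial minutes earn nothing
    remaining_today = max(0, DAILY_STUDY_XP_CAP - xp_already_earned_today)
    return min(whole_minutes, remaining_today)


if __name__ == "__main__":
    print(estimate_study_xp(600))        # 10-minute session
    print(estimate_study_xp(3600))       # 60-minute session hits the daily cap
    print(estimate_study_xp(600, 25))    # only 5 XP left under the cap
```

Score-based XP from award_game_xp is separate and is not covered by this estimate.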
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint=false, destructiveHint=false), description discloses specific behavioral details: writes to study_sessions, awards XP (+1/min, capped 30/day), updates streak, and requires authentication. No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences, front-loaded with purpose and key effects. Every sentence adds unique value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers main effects (DB writes, XP, streak), auth requirement. No output schema, but not needed. Could mention idempotency or error handling for completeness, but adequate for a mutation tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is only 29% (2 of 7 parameters have descriptions). The tool description does not compensate by explaining individual parameters or their roles, leaving ambiguity for cards_studied, correct_count, etc. Only XP calculation detail is added.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool records a completed study session, writes to study_sessions, awards study-time XP with specific rate/cap, and updates daily streak. It distinguishes itself from siblings like award_game_xp (separate XP) and play_game (different action).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Use after a play_game / quiz / flashcard session ends,' providing clear when-to-use context. Also notes 'Requires sign-in.' Could improve by mentioning when not to use or alternatives like award_game_xp for game-specific XP, but still clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
record_word_result · A
Persist a single word answer (correct/incorrect) to the user's mastery progress. Mirrors the web app's word-mastery scaling so MCP study counts toward leaderboards and streaks. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| word | No | The word answered (preferred for human input). | |
| card_id | No | Card UUID (preferred when known from a prior tool result). | |
| is_correct | Yes | | |
| question_type | No | e.g. multiple_choice, fill_in_blank, flashcard. | |
| quiz_attempt_id | No | Optional quiz_attempt UUID to record a per-question row. | |
| selected_answer | No | | |
| time_taken_seconds | No | | |
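The table above marks both word and card_id as optional, with card_id preferred when it is known from a prior tool result. One way a host might apply that preference is sketched below. The helper name is hypothetical, and two behaviours are assumptions rather than documented rules: requiring at least one identifier, and sending only one of the two.

```python
OPTIONAL_FIELDS = {
    "question_type", "quiz_attempt_id", "selected_answer", "time_taken_seconds",
}


def build_word_result_arguments(is_correct, word=None, card_id=None, **optional):
    """Return an arguments dict for record_word_result (hypothetical helper)."""
    if not isinstance(is_correct, bool):
        raise TypeError("is_correct must be a bool")
    if card_id is None and word is None:
        # Assumption: the server needs some way to identify the word.
        raise ValueError("provide card_id or word")

    arguments = {"is_correct": is_correct}
    if card_id is not None:
        arguments["card_id"] = card_id  # preferred when known from a prior result
    else:
        arguments["word"] = word        # fallback for human-typed input

    for key, value in optional.items():
        if key not in OPTIONAL_FIELDS:
            raise ValueError(f"unknown parameter: {key}")
        if value is not None:
            arguments[key] = value
    return arguments


if __name__ == "__main__":
    print(build_word_result_arguments(True, word="laconic", question_type="flashcard"))
```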
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate this is a non-read-only, non-destructive mutation. The description adds value by explaining that recording affects leaderboards and streaks, and requires authentication. This goes beyond the annotations' minimal info, though it doesn't detail idempotency or other side effects.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is three sentences: purpose first, then context, then the sign-in requirement. It is efficient with no filler words. A brief note on return behavior would add structure, but overall it is appropriately sized.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The tool has 7 parameters with only 1 required, no output schema, and moderate complexity. The description lacks information about return values, error conditions, or how missing optional parameters are handled. The note about leaderboards helps, but gaps remain in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 57% (4 of 7 parameters described). The description does not add any parameter-specific meaning or guidance beyond the schema. It does not compensate for the missing descriptions on is_correct, selected_answer, and time_taken_seconds, leaving the agent without clarity on those fields.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses 'Persist a single word answer (correct/incorrect)' which clearly states the verb and resource. It distinguishes from siblings like mark_word_known or record_session_complete by focusing on recording a single correctness result rather than marking difficulty or completing a session.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description states 'Requires sign-in' as a prerequisite but does not explicitly guide when to use this tool versus alternatives. The context of mirroring web app mastery scaling provides some implicit guidance, but no when-not or comparison to similar tools is given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
resend_pending_invite · A
Resend a pending parent invite by id. Use after get_pending_invites surfaces an invite expiring in <24h, or when the user explicitly asks to resend. Re-emails the existing invite_token; no new code is generated. 60s per-invite cooldown. Caller must own the invite. Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| invite_id | Yes | The id field returned by get_pending_invites. | |
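Because the description states a 60s per-invite cooldown, a host may want to avoid firing a resend that is bound to be rejected. The sketch below tracks the last resend per invite_id on the client side. The class name `ResendGuard` and the injectable clock are assumptions for illustration; the server remains the source of truth for the cooldown.

```python
import time

RESEND_COOLDOWN_SECONDS = 60  # "60s per-invite cooldown" per the tool description


class ResendGuard:
    """Client-side tracker for per-invite resend cooldowns (hypothetical helper)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._last_sent = {}  # invite_id -> clock reading at last resend

    def seconds_until_allowed(self, invite_id):
        last = self._last_sent.get(invite_id)
        if last is None:
            return 0.0
        elapsed = self._clock() - last
        return max(0.0, RESEND_COOLDOWN_SECONDS - elapsed)

    def mark_sent(self, invite_id):
        self._last_sent[invite_id] = self._clock()


if __name__ == "__main__":
    guard = ResendGuard()
    guard.mark_sent("invite-1")
    print(guard.seconds_until_allowed("invite-1") > 0)
```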
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds behavioral details beyond annotations: 60s cooldown, re-emails existing token, caller must own invite. Annotations already indicate mutation, so credit for extra context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Multiple sentences each add value; no fluff. Front-loaded with core action. Could be slightly more structured but clear and efficient.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple 1-param tool with no output schema, covers purpose, usage, behavioral traits, and permissions. Lacks error handling details but adequate given simplicity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers invite_id with adequate description. The tool description adds little beyond 'by id' and referencing get_pending_invites. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states 'Resend a pending parent invite by id' and provides specific use cases when to invoke, distinguishing from siblings like get_pending_invites.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells when to use (after expiring invite or user request), what it doesn't do (no new code), and includes cooldown and ownership prerequisites.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
set_persona · A · Read-only · Idempotent
Bias subsequent Sparkle guidance toward a persona (student | parent | tutor | explorer). Session-scoped: the host should pass the chosen persona back to get_sparkle_guidance.
| Name | Required | Description | Default |
|---|---|---|---|
| persona | Yes | student \| parent \| tutor \| explorer | |
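Since set_persona is described as session-scoped, with the host expected to pass the chosen persona back to get_sparkle_guidance, the host needs to hold that value itself. A minimal sketch follows. The class `PersonaSession` is hypothetical, and the argument name `persona` on get_sparkle_guidance is an assumption, as that tool's schema is not shown in this section.

```python
PERSONAS = ("student", "parent", "tutor", "explorer")


class PersonaSession:
    """Holds the chosen persona for one host session (hypothetical helper)."""

    def __init__(self):
        self.persona = None

    def set_persona_call(self, persona):
        """Validate and remember the persona; return the set_persona call params."""
        if persona not in PERSONAS:
            raise ValueError(f"unknown persona: {persona!r}")
        self.persona = persona
        return {"name": "set_persona", "arguments": {"persona": persona}}

    def guidance_call(self, **arguments):
        """Return get_sparkle_guidance call params with the persona attached."""
        merged = dict(arguments)
        if self.persona is not None:
            merged["persona"] = self.persona  # assumed parameter name
        return {"name": "get_sparkle_guidance", "arguments": merged}


if __name__ == "__main__":
    session = PersonaSession()
    session.set_persona_call("tutor")
    print(session.guidance_call())
```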
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and idempotent behavior. The description adds valuable context: it is session-scoped and requires passing the persona to get_sparkle_guidance. No contradiction with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with purpose and constraints. Every sentence adds value with no repetition or fluff.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple tool with one parameter and no output schema, the description is complete: explains purpose, allowed values, scope, and integration hint. No gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with a description listing allowed values. The tool description repeats these values but does not add new meaning, such as format or default. Baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool biases Sparkle guidance toward a persona and lists the allowed values (student, parent, tutor, explorer). While specific, it could be more explicit that this tool sets the persona for subsequent calls, not just 'bias'. Differentiates from siblings like get_sparkle_guidance which retrieves guidance.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
States the scope (session-scoped) and instructs the host to pass the persona back to get_sparkle_guidance. Provides useful context for correct usage, though does not explicitly contrast with alternative tools (which are mostly about getting data, not setting).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
study_plan_preview · A · Read-only · Idempotent
Use this when the user asks for a study plan, a multi-day prep schedule, or how to prepare for a test by date. Returns a 7-day plan (5 words/day) for a given test family. Renders the interactive Vocab Voyage study-plan widget on supporting hosts; tapping 'Start Day N' launches a flashcard session seeded with that day's words. Do not use for a single quiz session — call generate_quiz instead. Do not use for one-off lookups — call get_definition instead.
| Name | Required | Description | Default |
|---|---|---|---|
| target_date | No | Optional ISO date (YYYY-MM-DD) | |
| test_family | Yes | | |
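The only documented format constraint for study_plan_preview is that target_date is an ISO date (YYYY-MM-DD). A sketch of validating it before the call is shown below. The helper name is hypothetical, and test_family is passed through unchecked because this tool's listing does not enumerate its allowed values.

```python
import re
from datetime import date

ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def build_study_plan_arguments(test_family, target_date=None):
    """Return an arguments dict for study_plan_preview (hypothetical helper)."""
    if not test_family:
        raise ValueError("test_family is required")
    arguments = {"test_family": test_family}
    if target_date is not None:
        if not ISO_DATE.fullmatch(target_date):
            raise ValueError("target_date must look like YYYY-MM-DD")
        # Raises ValueError for impossible calendar dates such as 2025-02-30.
        date.fromisoformat(target_date)
        arguments["target_date"] = target_date
    return arguments


if __name__ == "__main__":
    print(build_study_plan_arguments("sat", "2025-06-07"))
```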
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already mark the tool as readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds context beyond annotations, such as the interactive widget behavior, flashcard session launch, and return value structure (7-day, 5 words/day). No contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise, front-loading the primary use case, then providing key details and exclusions. Every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, the description adequately explains the return value (7-day plan, 5 words/day) and interactive behavior. It could benefit from more detail on the response format, but the core functionality is clear.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 50% (target_date has description, test_family does not). The description mentions test family and date but adds no new details about parameter formats, constraints, or default behavior beyond what the schema provides for target_date. It does not compensate for the missing schema description of test_family.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
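The 50% figure quoted above can be reproduced with a simple count of described properties. The sketch below is one plausible way to compute such a number, not necessarily how the listing computes it; the function name `schema_description_coverage` and the reconstructed schema (including the `"type": "string"` entries) are assumptions for illustration.

```python
def schema_description_coverage(input_schema):
    """Return the fraction of top-level properties with a non-empty description."""
    properties = input_schema.get("properties", {})
    if not properties:
        return 1.0  # convention chosen here: nothing to document counts as covered
    described = sum(
        1
        for spec in properties.values()
        if isinstance(spec, dict) and str(spec.get("description", "")).strip()
    )
    return described / len(properties)


# Hypothetical reconstruction of the study_plan_preview input schema.
study_plan_schema = {
    "type": "object",
    "properties": {
        "target_date": {
            "type": "string",
            "description": "Optional ISO date (YYYY-MM-DD)",
        },
        "test_family": {"type": "string"},  # no description, as noted above
    },
    "required": ["test_family"],
}

if __name__ == "__main__":
    print(f"{schema_description_coverage(study_plan_schema):.0%}")  # prints 50%
```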
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: generating a 7-day study plan for test prep. It uses specific verbs ('returns', 'renders') and distinguishes from siblings like generate_quiz and get_definition, making its scope explicit.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly specifies when to use (multi-day study plan requests) and when not to (single quiz or one-off lookups), with direct references to alternative tools (generate_quiz, get_definition).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
update_adaptive_level (C)
Run the adaptive-mastery promotion logic for the signed-in user (delegates to the web app's update-adaptive-mastery function). Requires sign-in.
| Name | Required | Description | Default |
|---|---|---|---|
| course_id | No | | |
| total_count | Yes | | |
| correct_count | Yes | | |
| words_studied | Yes | | |
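Because none of these parameters is documented, a caller may want to sanity-check values before invoking the tool. The sketch below is a guess at reasonable constraints, not the server's actual contract: it assumes correct_count and total_count are non-negative integers with correct_count no larger than total_count, and it passes words_studied through unchanged because its type is unknown. The helper name `build_adaptive_level_arguments` is hypothetical.

```python
def build_adaptive_level_arguments(words_studied, correct_count, total_count, course_id=None):
    """Assemble an arguments dict for update_adaptive_level under assumed constraints."""
    # ASSUMPTION: both counts are non-negative integers (bools are rejected).
    for name, value in (("correct_count", correct_count), ("total_count", total_count)):
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    # ASSUMPTION: a session cannot have more correct answers than total answers.
    if correct_count > total_count:
        raise ValueError("correct_count cannot exceed total_count")
    arguments = {
        "words_studied": words_studied,  # type undocumented; passed through unchanged
        "correct_count": correct_count,
        "total_count": total_count,
    }
    if course_id is not None:
        arguments["course_id"] = course_id  # optional per the table above
    return arguments


if __name__ == "__main__":
    print(build_adaptive_level_arguments(5, 4, 5))
```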
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations indicate non-read-only and non-destructive, but the description adds little beyond the delegation to a web function and sign-in requirement. No mention of side effects or outcome.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single sentence, concise but missing important parameter and usage details. It front-loads the purpose but lacks completeness.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 4 parameters, no output schema, and a mutation tool, the description is insufficient. It does not explain return values or how parameters affect the logic.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 0% and the description does not explain any of the four parameters (words_studied, correct_count, total_count, course_id). The agent has no guidance on what values to provide.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (run adaptive-mastery promotion logic) and the target (signed-in user), distinguishing it from sibling tools that are read-only or record individual results. However, it could be more specific about the promotion logic's effect.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
No guidance is provided on when to use this tool versus alternatives like record_word_result or get_sparkle_guidance. The only prerequisite mentioned is sign-in.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
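For servers that serve static files from a directory, the claim file can be generated with a few lines of code. This is a minimal sketch: the helper name `write_glama_claim` and the `web_root` parameter are hypothetical, and it assumes the web root maps directly to the domain root so the file is reachable at /.well-known/glama.json.

```python
import json
import tempfile
from pathlib import Path


def write_glama_claim(web_root, maintainer_email):
    """Write <web_root>/.well-known/glama.json and return its path."""
    claim = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": maintainer_email}],
    }
    target = Path(web_root) / ".well-known" / "glama.json"
    # Create the .well-known directory if it does not exist yet.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(claim, indent=2) + "\n", encoding="utf-8")
    return target


if __name__ == "__main__":
    # Demonstrate against a temporary directory instead of a real web root.
    with tempfile.TemporaryDirectory() as tmp:
        path = write_glama_claim(tmp, "your-email@example.com")
        print(path.read_text(encoding="utf-8"))
```

Replace the placeholder address with the email tied to your Glama account before publishing.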
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.