Glama

Server Details

Generate, manage and explore your Switch AI image and video library, scoped to your account.

Status: Healthy
Last Tested
Transport: Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client -> Glama -> MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.4/5 across 34 of 34 tools scored. Lowest: 3.6/5.

Server Coherence: A
Disambiguation: 4/5

Most tools have distinct purposes, but some overlap exists, e.g., apply_iphone_realism and apply_ugc both create casual, amateurish looks. Descriptions are clear, so confusion is minimal.

Naming Consistency: 3/5

Naming is mixed: many tools use verb_noun (e.g., generate_image, list_my_videos) but there are 10+ apply_* tools that use a different pattern. While each subgroup is internally consistent, the overall mix reduces coherence.
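
The verb_noun versus apply_* split can be made concrete by grouping tool names by their leading token. This is a minimal sketch that uses only tool names mentioned on this page; the full 34-tool list is not reproduced here, so the counts are illustrative rather than the real totals.

```python
from collections import defaultdict

# Tool names mentioned on this page; a partial sample, not the full 34-tool list.
TOOL_NAMES = [
    "analyze_video", "generate_image", "list_my_videos", "list_my_assets",
    "upload_media", "apply_cinematic_anamorphic", "apply_movie_scene",
    "apply_product", "apply_travel", "apply_ugc",
]

def group_by_prefix(names):
    """Group tool names by the token before the first underscore."""
    groups = defaultdict(list)
    for name in names:
        prefix = name.split("_", 1)[0]
        groups[prefix].append(name)
    return dict(groups)

groups = group_by_prefix(TOOL_NAMES)
for prefix, members in sorted(groups.items()):
    print(f"{prefix}: {len(members)}")
```

Every name here is still verb-first; the apply_* group stands out mainly by its size and by the fact that its members differ in the look they name, not in the action.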

Tool Count: 3/5

34 tools is high for a media generation server. While the breadth of capabilities justifies many tools, several apply_* tools are redundant (all return prompt stacks) and could be consolidated, making the count feel heavy.
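
One way to picture the suggested consolidation is a single entry point that takes the look as a parameter. The sketch below is hypothetical: `apply_style` and the `look` parameter do not exist on the server, and the mapping covers only a few apply_* tools named on this page.

```python
# Hypothetical mapping from a "look" name to the existing tool it would replace.
LOOK_TO_TOOL = {
    "cinematic_anamorphic": "apply_cinematic_anamorphic",
    "movie_scene": "apply_movie_scene",
    "product": "apply_product",
    "travel": "apply_travel",
    "ugc": "apply_ugc",
}

def apply_style(look, style, subject=None):
    """Hypothetical consolidated entry point: returns (tool_name, arguments)."""
    if look not in LOOK_TO_TOOL:
        raise KeyError(f"unknown look: {look}")
    arguments = {"style": style}
    if subject is not None:  # subject is optional on every apply_* tool shown here
        arguments["subject"] = subject
    return LOOK_TO_TOOL[look], arguments

print(apply_style("travel", "drone_aerial"))
```

The trade-off is that a single tool would need one large `style` enum or per-look validation, which is the information the separate tool schemas currently carry.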

Completeness: 4/5

Core workflows for generating, listing, showing, and checking status are covered. Missing delete or update for assets is a minor gap, but users can manage tasks via cancel. The surface adequately supports the domain.

Available Tools

34 tools
analyze_video (Analyze Video): A
Read-only, Idempotent

Switch Vision — watch and understand a video (or image) like a human and answer a question about it: scenes, subjects, actions, on-screen text, pacing, mood and sentiment. Pass video_url (a public https video URL, including YouTube) OR one of your own Switch videos (a video/asset id from list_my_videos / list_my_assets / upload_media). Add an optional question to focus the analysis (e.g. "what is the tone and energy?", "list the cuts and what each shot shows"). Use this whenever the user gives you a reference video and wants its style, energy, structure or content understood — for example before making a new video that matches it.

Parameters (JSON Schema)
Name | Required | Description | Default
question | No | Optional. What to find out about the video: tone, structure, on-screen text, sentiment, etc. | -
video_url | Yes | A public https video URL (YouTube ok), OR one of your own Switch video/asset ids. | -
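
As a rough illustration, an MCP client would invoke this tool with a JSON-RPC `tools/call` request. The sketch below only builds the request body and sends nothing over the network; the video URL and the request id are placeholders, and `build_tool_call` is a helper name introduced here.

```python
import json

def build_tool_call(name, arguments, request_id=1):
    """Build an MCP JSON-RPC 2.0 `tools/call` request body."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }

request = build_tool_call(
    "analyze_video",
    {
        # Placeholder: substitute a public https video URL or a Switch video/asset id.
        "video_url": "https://example.com/reference.mp4",
        # Optional; this example question is taken from the tool description.
        "question": "what is the tone and energy?",
    },
)
print(json.dumps(request, indent=2))
```
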
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true and idempotentHint=true. The description adds valuable context about what the tool analyzes, but could be clearer on whether images are fully supported (mentions 'image' but schema requires video_url). Nevertheless, no contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two well-structured sentences, no redundancy. The first sentence packs the core purpose and capabilities; the second provides usage guidance. Every word earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema, the description sufficiently covers inputs, optional features, and usage context. It fits well among sibling tools as the analysis counterpart to generation tools.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema descriptions are clear (100% coverage), but the description elaborates further: explains video_url can be a public URL or Switch ID from specific tools, and gives example questions for the optional parameter. This adds significant meaning.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'watch and understand a video (or image) like a human and answer a question about it', listing specific aspects (scenes, subjects, actions, etc.). It distinguishes itself from sibling tools which are mostly apply-style tools, making its purpose unique.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says 'Use this whenever the user gives you a reference video and wants its style, energy, structure or content understood', and provides an example context ('before making a new video that matches it'). This guides when to use vs. the apply-style siblings.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

apply_cinematic_anamorphic (Apply Cinematic Anamorphic): A
Read-only, Idempotent

ARRI Alexa anamorphic widescreen film look. Choose grade: warm golden, cool noir, or moody desaturated. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | warm_golden = late-afternoon honey. cool_noir = neon-fill desaturated. moody_desaturated = soft window low-contrast. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
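
Because `style` is a required enum and `subject` is optional, a client can check arguments before calling the tool. A minimal sketch, assuming the three enum values listed in the table are the complete set; `cinematic_arguments` is a helper name introduced here, not part of the server.

```python
# Enum values copied from the `style` parameter description above.
CINEMATIC_STYLES = {"warm_golden", "cool_noir", "moody_desaturated"}

def cinematic_arguments(style, subject=None):
    """Return an arguments dict for apply_cinematic_anamorphic, rejecting unknown styles."""
    if style not in CINEMATIC_STYLES:
        raise ValueError(
            f"unknown style {style!r}; expected one of {sorted(CINEMATIC_STYLES)}"
        )
    arguments = {"style": style}
    if subject is not None:  # subject is optional per the schema
        arguments["subject"] = subject
    return arguments

args = cinematic_arguments("cool_noir", "a woman walking through a hotel lobby")
print(args)
```
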
Behavior: 3/5

Annotations already indicate readOnlyHint=true and idempotentHint=true, so description's mention of 'returns the styled prompt stack' confirms non-destructive behavior. However, description does not add significant behavioral context beyond annotations, so score is moderate.

Conciseness: 5/5

Two sentences efficiently convey purpose, options, and usage. No redundancy or unnecessary information. Well-suited for quick agent comprehension.

Completeness: 4/5

Given no output schema, description explains the return value (styled prompt stack) and how it integrates with generate_image. For a simple 2-param tool, this is sufficient to understand the workflow.

Parameters: 3/5

Input schema has 100% description coverage with clear enum values and subject description. Description's 'Choose grade' and listing of grades adds minimal extra meaning beyond schema. Baseline score due to high schema coverage.

Purpose: 5/5

Description clearly states the tool applies an ARRI Alexa anamorphic widescreen film look with specific grades. It specifies the output (styled prompt stack) and usage context (pair with generate_image). This distinguishes it from sibling 'apply_*' tools like apply_movie_scene or apply_magic_hour_portrait.

Usage Guidelines: 3/5

Description mentions pairing with generate_image, implying typical use case, but lacks explicit guidance on when to choose this tool over similar ones (e.g., when to use anamorphic vs other cinematic looks). No alternative or exclusion criteria provided.

apply_graphic_editorial_portrait (Apply Graphic Editorial Portrait): A
Read-only, Idempotent

Sharp graphic editorial portrait — premium fashion-magazine grade, hard graphic composition. Classic studio or golden-hour outdoor. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | classic = Hasselblad H6D studio. golden_hour = Canon R5 outdoor. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
Behavior: 3/5

The annotations already declare readOnlyHint and idempotentHint, so the description's note about returning a prompt stack is consistent but adds no new behavioral detail beyond what annotations provide. No contradictions.

Conciseness: 5/5

The description is a single concise sentence that conveys the tool's purpose, output, and a sibling pairing hint. No wasted words.

Completeness: 4/5

For a tool with 2 parameters and no output schema, the description adequately explains the tool's role and output. Given the sibling tools, it provides enough context for an agent to select and use it correctly.

Parameters: 3/5

The input schema has 100% coverage with descriptions for both style (enum with studio/outdoor camera references) and subject (free text). The description adds minor context about the output but not about parameters. Baseline 3 is appropriate.

Purpose: 5/5

The description clearly states it applies a 'sharp graphic editorial portrait' and specifies the two style options ('classic studio' or 'golden-hour outdoor'). It also explains the output is a prompt stack for use with generate_image, distinguishing it from sibling tools like apply_high_fashion_editorial or apply_magic_hour_portrait.

Usage Guidelines: 4/5

The description explicitly says to pair it with generate_image, indicating the intended workflow. It implies when to use this tool (for graphic editorial portraits) but does not explicitly say when not to use it or suggest alternatives, though the sibling list provides context.

apply_high_fashion_editorial (Apply High Fashion Editorial): A
Read-only, Idempotent

High-fashion magazine cover/editorial energy. Choose a photographer mood: Mario Testino glossy, Steven Klein dark cinematic, Inez & Vinoodh hard-flash, Annie Leibovitz painterly, Tim Walker dreamlike, Peter Lindbergh black-and-white natural, or Cass Bird off-duty. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | Photographer attribution drives the lighting + camera + grade stack. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
Behavior: 4/5

Annotations already declare readOnlyHint and idempotentHint. Description adds that it returns a prompt stack, not an actual image, clarifying its read-only nature. No contradictions.

Conciseness: 5/5

Two efficient sentences, front-loaded with purpose, examples, and usage instruction. No wasted words.

Completeness: 4/5

For a tool with 2 params and no output schema, description covers purpose, parameters, and integration with generate_image. Could mention optional subject or more on output format, but sufficient.

Parameters: 4/5

Schema coverage is 100% with descriptions. Tool description adds human-readable examples for style enum and clarifies subject as 'what you want to shoot', enhancing understanding.

Purpose: 5/5

Description clearly states it applies high-fashion editorial energy with specific photographer moods. Distinguishes from sibling tools like apply_cinematic_anamorphic or apply_graphic_editorial_portrait by naming its unique stylistic options.

Usage Guidelines: 4/5

Explicitly says 'pair it with generate_image', indicating it's a preparatory step. Implicitly tells when to use (for high-fashion) but doesn't explicitly exclude other cases. Naming photographer moods provides context.

apply_iphone_realism (Apply Iphone Realism): A
Read-only, Idempotent

Phone-shot amateur look — looks like a real person snapped it on their phone. Casual, candid, pore-level real, no professional gloss. Three flavors: digital phone, 35mm film point-and-shoot, or off-duty intimate. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | digital_phone = Sony A7IV + 50mm f/1.4 GM phone-style realism. film_pointshoot = Contax T2 35mm Portra 400. off_duty_intimate = Cass Bird natural-window editorial. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
Behavior: 4/5

Annotations already indicate readOnlyHint and idempotentHint. The description adds that the tool returns a prompt stack and must be paired with generate_image, providing useful behavioral context beyond the annotations.

Conciseness: 5/5

The description is only two sentences, front-loaded with the core purpose ('Phone-shot amateur look'), and contains no unnecessary words. Every sentence adds value.

Completeness: 4/5

The description explains the tool's output (styled prompt stack), its usage with generate_image, and the three style options. It does not detail the prompt format, but given the simplicity and good schema/annotations, it is sufficiently complete.

Parameters: 3/5

Schema coverage is 100% and the schema already provides detailed descriptions for both parameters. The tool description adds little new parameter information beyond listing the styles, so it meets the baseline without significant enhancement.

Purpose: 5/5

The description clearly states the tool applies a 'phone-shot amateur look' and returns a styled prompt stack for use with generate_image. It distinguishes itself from sibling style tools by specifying 'casual, candid, pore-level real, no professional gloss' and listing three specific flavors.

Usage Guidelines: 4/5

The description implies when to use this tool (when an amateur phone look is desired) and mentions pairing with generate_image. It does not explicitly state when not to use it or name alternatives, but the context is clear given the sibling tools.

apply_magic_hour_portrait (Apply Magic Hour Portrait): A
Read-only, Idempotent

Golden-hour rim-light editorial portrait. Choose camera: Canon R5 + 85mm f/1.2 or Hasselblad H6D + 80mm. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | canon_85mm = Canon R5 portrait standard. hasselblad_80mm = medium-format luxury. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
Behavior: 4/5

Annotations already indicate readOnly and idempotent; the description adds that it returns a 'styled prompt stack' for use with another tool, which gives behavioral context beyond the annotations.

Conciseness: 5/5

Two concise sentences that front-load the style and purpose, with no wasted words.

Completeness: 4/5

Given only two parameters and no output schema, the description adequately explains the return value and usage context, though it could mention the output format.

Parameters: 3/5

Schema coverage is 100%, so baseline is 3. The description repeats camera choices already in the schema's enum descriptions and does not add new meaning to the parameters.

Purpose: 5/5

The description clearly states 'Golden-hour rim-light editorial portrait' and specifies camera choices, making the tool's purpose distinct from sibling tools like 'apply_cinematic_anamorphic'.

Usage Guidelines: 4/5

It instructs to pair with 'generate_image' but does not explicitly state when to use this tool vs alternatives; however, the context of 'editorial portrait' provides implicit guidance.

apply_movie_scene (Apply Movie Scene): A
Read-only, Idempotent

Put me in a movie — full cinematic film look matching specific film genres. Choose: neon-noir action thriller, 80s finance excess, comic-book superhero blockbuster, video-game key art, or generic action thriller. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | neon_noir_action = wet streets + neon + anamorphic. glamour_finance_excess = 1980s Wall Street mahogany / gold. superhero_blockbuster = comic-book key art. video_game_character = Unreal-Engine character render. generic_action_thriller = ARRI cinematic. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
Behavior: 3/5

Annotations already declare readOnlyHint=true and idempotentHint=true, so the description's statement about returning a prompt stack is consistent but adds little beyond what annotations provide. No behavioral surprises or additional context like side effects or rate limits.

Conciseness: 5/5

Two concise sentences that front-load the purpose and immediately list the options and output. Every word serves a purpose; no fluff.

Completeness: 4/5

Given the tool has only 2 simple parameters, no nested objects, and no output schema, the description adequately covers purpose and usage. It could briefly explain what a 'styled prompt stack' looks like, but it's sufficient for an agent to understand the tool's role in a pipeline.

Parameters: 3/5

Schema description coverage is 100%, and the input schema already provides detailed enum values with descriptions. The tool description repeats the enum list but adds no new semantic meaning beyond the schema. Baseline 3 is appropriate.

Purpose: 4/5

The description clearly states it applies a cinematic film look to a scene, listing specific genres. It pairs with generate_image. However, it doesn't explicitly differentiate from sibling tools like apply_cinematic_anamorphic or apply_graphic_editorial_portrait, missing a chance to clarify its unique role.

Usage Guidelines: 3/5

The description implies usage by saying 'pair it with generate_image', providing context on when to use it. However, it lacks explicit guidance on when not to use it or how it compares to alternatives among the many sibling apply tools.

apply_product (Apply Product): A
Read-only, Idempotent

Product photography. Choose: clean studio hero shot, real-world lifestyle, extreme macro detail, or top-down flat lay. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | clean_studio = seamless backdrop hero. lifestyle = product in use. macro_detail = extreme close-up texture. flat_lay = top-down catalog. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
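
The two-step workflow named in the description (get a prompt stack, then pass it to generate_image) can be sketched as below. Several things are assumptions: `call_tool(name, arguments)` is a hypothetical client function returning an MCP tool result, the prompt stack is assumed to come back as the first text item in the result's `content` list, and the `prompt` argument of generate_image is a guess, since its schema is not shown in this section. A stub stands in for the server so the sketch runs offline.

```python
def first_text(result):
    """Return the first text item from an MCP tool result's `content` list."""
    for item in result.get("content", []):
        if item.get("type") == "text":
            return item["text"]
    raise ValueError("tool result contained no text content")

def styled_product_shot(call_tool, style, subject):
    """Chain apply_product into generate_image.

    `call_tool` is a hypothetical client function; the `prompt` argument
    name for generate_image is an assumption, not taken from its schema.
    """
    stack = first_text(call_tool("apply_product", {"style": style, "subject": subject}))
    return call_tool("generate_image", {"prompt": stack})

# Stub transport so the sketch runs without a server; it records each call.
calls = []

def fake_call_tool(name, arguments):
    calls.append((name, arguments))
    return {"content": [{"type": "text", "text": f"[{name} output]"}]}

result = styled_product_shot(fake_call_tool, "flat_lay", "a ceramic mug")
print([name for name, _ in calls])
```
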
Behavior: 4/5

Informs that tool returns a styled prompt stack and is read-only/idempotent, consistent with annotations. Provides step-by-step context that annotations alone do not capture.

Conciseness: 5/5

Two efficient sentences with no filler. Front-loaded with purpose, then actionable output and next step. Every sentence adds value.

Completeness: 5/5

Given 2 parameters, full schema coverage, and no output schema, the description explains the tool's role, options, and integration with generate_image. No gaps remain.

Parameters: 4/5

Schema covers both parameters with descriptions. Description reiterates the enum values and adds the output context (prompt stack), which enhances understanding beyond schema.

Purpose: 5/5

Description clearly states the tool applies product photography styles, listing four distinct options. It distinguishes from sibling tools like apply_cinematic_anamorphic by focusing on product shots, not cinematic or portrait styles.

Usage Guidelines: 4/5

Explicitly says to pair output with generate_image, indicating a two-step workflow. Does not mention when not to use or alternatives, but context is clear enough for proper invocation.

apply_travel (Apply Travel): A
Read-only, Idempotent

Luxury travel + hotel editorial. Real architecture is preserved exactly (no inventing buildings). Choose subject: hotel hero, rural property, scenic view, drone aerial, lifestyle moment, or interior. If you attach a reference image of a real property, the architecture lock kicks in automatically. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters (JSON Schema)
Name | Required | Description | Default
style | Yes | hotel_hero = property is the star. rural_property = country estate. scenic_view = pure landscape. drone_aerial = top-down or 45° from above. lifestyle = model + destination. interior = inside the property. | -
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". | -
Behavior: 4/5

Annotations (readOnlyHint, idempotentHint) already indicate safe, deterministic behavior. The description adds valuable behavioral context: 'Real architecture is preserved exactly' and automatic architecture lock with reference images. No contradictions.

Conciseness: 5/5

The description is concise—three sentences pack purpose, styles, behavioral rules, and output. Front-loaded with the core function. No superfluous words.

Completeness: 4/5

Given the tool's moderate complexity (2 params, no output schema), the description covers the return value ('styled prompt stack'), behavioral edge cases (architecture lock), and example subjects. Minor gap: no explicit mention of required vs optional parameters, but schema covers that.

Parameters: 3/5

Schema coverage is 100% with detailed enum descriptions (e.g., 'hotel_hero = property is the star'). The description restates the style options in plain words but does not significantly expand beyond the schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool applies a travel/hotel editorial style with specific subject options. It distinguishes from siblings by listing six concrete styles (hotel_hero, rural_property, etc.) and the architecture preservation rule.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description specifies when to use the tool ('luxury travel + hotel editorial') and provides subject examples. It also notes a reference image triggers architecture lock. It lacks explicit exclusion of other styles, but the sibling context makes alternatives obvious.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

apply_ugc | Apply Ugc | A
Read-only | Idempotent

User-generated content — looks like a real person captured it casually. Choose: phone shot, film point-and-shoot, mirror selfie, or car selfie. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters
Name | Required | Description | Default
style | Yes | phone_shot = iPhone-style snap. film_pointshoot = Contax T2 grain. mirror_selfie = bathroom/bedroom mirror. car_selfie = inside-the-car phone. |
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". |
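A minimal sketch of how a client might assemble the request for this tool, assuming the standard MCP JSON-RPC `tools/call` envelope. The helper name `build_apply_ugc_call` and the request id are hypothetical; the style keys are copied from the parameter table above.

```python
import json

# Style keys copied from the parameter table; anything else is rejected locally.
UGC_STYLES = {"phone_shot", "film_pointshoot", "mirror_selfie", "car_selfie"}


def build_apply_ugc_call(style, subject=None, request_id=1):
    """Return a JSON-RPC 2.0 `tools/call` request for apply_ugc (hypothetical helper)."""
    if style not in UGC_STYLES:
        raise ValueError(f"unknown style {style!r}; expected one of {sorted(UGC_STYLES)}")
    arguments = {"style": style}
    if subject is not None:  # subject is optional in the schema
        arguments["subject"] = subject
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "apply_ugc", "arguments": arguments},
    }


request = build_apply_ugc_call("mirror_selfie", "morning coffee on the balcony")
print(json.dumps(request, indent=2))
```

Validating the style key before sending keeps an obviously bad call from costing a round trip; the server remains the authority on which keys are accepted.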
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses that the tool returns a styled prompt stack, which is behavioral information beyond the readOnlyHint and idempotentHint annotations. It does not contradict annotations. The return format is not specified, but the description gives a good idea of the output.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise: three sentences that front-load the purpose and outcome. Every word is informative, with no redundancy or wasted text.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (two parameters, one required) and the presence of annotations, the description adequately covers what the tool does and how to use it. It could mention the subject parameter's role explicitly, but the schema already does, and the description implies it by referring to 'your shot.'

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description reiterates the style choices but does not add new meaning beyond the schema's detailed enum descriptions. No additional parameter semantics are provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it applies a user-generated content style and returns a styled prompt stack. It specifies the four sub-styles (phone shot, film point-and-shoot, mirror selfie, car selfie), effectively distinguishing it from other apply_* sibling tools by focusing on casual, amateur aesthetics.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for casual, user-generated content looks. It also advises pairing with generate_image. However, it does not explicitly state when not to use it or compare to other style tools, but the specificity of the styles makes the use case clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

apply_wellness | Apply Wellness | A
Read-only | Idempotent

Wellness / yoga / fitness / lifestyle campaign — warm amber tropical, tropical paradise cinematic, or high-key cyan beach. Returns the styled prompt stack for your shot — pair it with generate_image.

Parameters
Name | Required | Description | Default
style | Yes | warm_amber_tropical = warm honey grade with golden haze. hanalei_cinematic = soft golden mist + infinity pool reflection. high_key_cyan_beach = bright daylit cyan ocean. |
subject | No | What you want to shoot. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony". |
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and idempotentHint=true, so the tool is safe and deterministic. The description adds that it returns a 'styled prompt stack', which is the key behavioral outcome. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, with the purpose front-loaded. Efficient, but could benefit from clearer structure (e.g., setting the style options apart from the purpose statement).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 2 parameters, no output schema, and annotations present, the description covers all necessary aspects: what the tool does, what it returns, and how to use it. No gaps.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed enum descriptions. The description repeats the style names but doesn't add new parameter semantics. Baseline 3 is appropriate since the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it applies a wellness/yoga/fitness/lifestyle campaign style and returns a styled prompt stack. It names specific styles (warm amber tropical, tropical paradise cinematic, high-key cyan beach). This distinguishes it from sibling style tools like 'apply_cinematic_anamorphic' or 'apply_travel'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description advises pairing with 'generate_image', providing clear usage context. While it doesn't explicitly exclude alternative tools, the 'wellness' theme and specific style names guide appropriate selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

cancel_my_task | Cancel Task | A
Idempotent

Stop one of your generation tasks by task id — works on queued AND running tasks. Already-saved images stay in your library; nothing is deleted or refunded. Returns how many images were saved out of how many you requested.

Parameters
Name | Required | Description | Default
taskId | Yes | Task id from generate_image or list_my_tasks. |
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate idempotentHint=true and readOnlyHint=false. The description adds key behavioral details: nothing is deleted or refunded, and it returns a count of saved/requested images. This goes beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, zero fluff. The main purpose is front-loaded, and each sentence adds new information (scope, side effects, return value).

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter tool with no output schema and good annotations, the description fully covers purpose, scope, side effects, and return value. Nothing essential is missing.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description adds little beyond the schema for 'taskId', but does provide context that it works on tasks from specific generators.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description uses a specific verb ('Stop') and resource ('generation tasks'), clarifies scope ('queued AND running tasks'), and explicitly states what it does not affect ('Already-saved images stay'). This clearly distinguishes it from sibling tools like 'generate_image' or 'list_generations'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description states when to use (to cancel queued or running tasks). It does not explicitly mention when not to use or name alternatives, but the context is clear enough for an agent to decide.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_balance | Check Balance | A
Read-only | Idempotent

Check your daily Switch spending — what you have spent today, your daily limit, and what is remaining. Optionally pass an estimatedCost (USD) to also get whether you can afford it.

Parameters
Name | Required | Description | Default
estimatedCost | No | Optional dollar amount to test against your daily limit. |
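The affordability check described above comes down to simple arithmetic. The sketch below models it locally; the field names (`spentToday`, `dailyLimit`, `remaining`, `canAfford`) are assumptions for illustration, since the listing does not document the response shape.

```python
def check_balance(spent_today, daily_limit, estimated_cost=None):
    """Local model of the described check; all field names are assumptions."""
    # Remaining budget never goes below zero in this sketch.
    remaining = max(daily_limit - spent_today, 0.0)
    result = {
        "spentToday": spent_today,
        "dailyLimit": daily_limit,
        "remaining": remaining,
    }
    if estimated_cost is not None:
        # Only report affordability when the caller asked about a cost.
        result["canAfford"] = estimated_cost <= remaining
    return result


print(check_balance(7.5, 10.0, estimated_cost=2.0))
```

An agent could call the real tool with `estimatedCost` before a large batch and skip the generation call when the answer is negative.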
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations confirm read-only and idempotent behavior. The description adds value by detailing what information is returned (spent, limit, remaining) and the optional cost check.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences cover the tool's core function and optional capability without waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with one optional parameter and no output schema, the description fully covers what the agent needs to know.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema covers the single parameter fully. The description adds meaning by explaining the parameter's purpose (test against daily limit).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool checks daily Switch spending, limit, and remaining, with an optional affordability test. It is distinct from sibling creative tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains the tool's function and the optional parameter, though it does not explicitly contrast with siblings. The context makes the usage clear.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_job_status | Check Job Status | A
Read-only | Idempotent

Polling-friendly status check for one of your tasks. Returns a slim shape with status, progressPct, and eta so you can poll without refetching the full payload.

Parameters
Name | Required | Description | Default
taskId | Yes | Task id to check. |
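A polling loop built on this tool could look like the sketch below. `check_status` stands in for whatever function actually calls check_job_status, and the terminal status names are assumptions; only `status`, `progressPct`, and `eta` come from the description above.

```python
import time


def poll_until_done(check_status, task_id, interval_s=2.0, max_polls=30, sleep=time.sleep):
    """Poll `check_status(task_id)` until a terminal status or the poll budget runs out.

    The terminal status names below are assumptions, not documented values.
    """
    terminal = {"completed", "failed", "cancelled"}
    last = None
    for _ in range(max_polls):
        last = check_status(task_id)
        if last["status"] in terminal:
            return last
        sleep(interval_s)
    return last  # still not terminal: caller decides whether to keep waiting


# Fake status source so the sketch runs without a server.
_responses = iter([
    {"status": "running", "progressPct": 40, "eta": 12},
    {"status": "running", "progressPct": 90, "eta": 2},
    {"status": "completed", "progressPct": 100, "eta": 0},
])
final = poll_until_done(lambda task_id: next(_responses), "task-123", sleep=lambda s: None)
print(final["status"], final["progressPct"])
```

Bounding the number of polls keeps an agent from looping forever on a stuck task; the `sleep` parameter is injectable only so the example can run instantly.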
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare read-only and idempotent. Description adds valuable context: 'polling-friendly' and 'slim shape', enhancing transparency without contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, no fluff. Efficiently conveys purpose, return shape, and polling suitability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple tool with annotations and full schema coverage, the description is complete. It explains the return value and use case without needing more detail.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with clear parameter description. Description adds no additional meaning beyond 'Task id to check.' Baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it is a polling-friendly status check for a task, returning a slim shape with status, progressPct, and eta. Differentiates from siblings like get_video_status by emphasizing lightweight polling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies use for polling status but does not explicitly state when not to use or mention alternatives among the many sibling tools. No exclusion criteria given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

explore_models | Explore Models | A
Read-only | Idempotent

Browse the image-generation models available to your Switch account. Returns model id, display name, brand, and credits-per-image so you can pick one before calling generate_image.

Parameters

No parameters

Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint, so the safety profile is clear. Description adds account-specific scope and return fields but minimal extra behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two sentences, front-loaded with action verb, followed by key details. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a zero-parameter listing tool, the description fully covers purpose, output contents, and usage context. No additional details are needed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

No parameters exist, schema coverage is 100%. The description doesn't need to add param info, and baseline for 0 params is 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it browses image-generation models for the account, lists returned fields (id, name, brand, credits-per-image), and specifies its role before calling generate_image. Distinguishes from siblings like generate_image and list_video_models.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises using this before generate_image to pick a model. Although it lacks explicit when-not-to-use guidance, the context of a simple listing tool makes this adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_audio | Generate Audio | A

Generate spoken audio from text: narration, a voiceover, a read-aloud script, or a multi-voice dialogue. Pass text (up to 2048 chars) — the words to be spoken. To match a specific voice instantly, pass reference_audio_url (a short clip) or up to 3 reference_audio_urls and address them as @Audio1, @Audio2, @Audio3 in the text for dialogue. Alternatively pass image_url to voice a scene from a picture (cannot combine with reference audio). Optional voice (a saved voice id), speech_rate (-50..100), and pitch (-12..12). Returns a playable audio_url, duration_seconds, and generation_id (also saved to your library).

Parameters
Name | Required | Description | Default
text | Yes | The words to speak / narrate / perform. Max 2048 chars. For dialogue, address voices as @Audio1, @Audio2, @Audio3. |
pitch | No | Optional. Pitch, -12 to 12. 0 is normal. |
voice | No | Optional. A saved voice id to speak in. Omit for a natural default voice. |
format | No | Optional output format. Default mp3. |
image_url | No | Optional. Voice a scene from a picture. Cannot be combined with reference audio. |
speech_rate | No | Optional. Speaking speed, -50 (slower) to 100 (faster). 0 is normal. |
reference_audio_url | No | Optional. A short clip URL to instantly match that voice. |
reference_audio_urls | No | Optional. Up to 3 reference clip URLs for multi-voice dialogue. |
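The constraints in this table can be checked on the client before a call is made. The following is a sketch only: `validate_audio_args` is a hypothetical helper, and the server may apply further rules that are not listed here.

```python
def validate_audio_args(args):
    """Return a list of problems with a generate_audio argument dict (empty if none)."""
    problems = []
    text = args.get("text")
    if not text:
        problems.append("text is required")
    elif len(text) > 2048:
        problems.append("text exceeds 2048 characters")
    if not -12 <= args.get("pitch", 0) <= 12:
        problems.append("pitch must be between -12 and 12")
    if not -50 <= args.get("speech_rate", 0) <= 100:
        problems.append("speech_rate must be between -50 and 100")
    refs = args.get("reference_audio_urls", [])
    if len(refs) > 3:
        problems.append("at most 3 reference_audio_urls are allowed")
    # image_url and reference audio are documented as mutually exclusive.
    has_ref_audio = bool(refs) or bool(args.get("reference_audio_url"))
    if args.get("image_url") and has_ref_audio:
        problems.append("image_url cannot be combined with reference audio")
    return problems


print(validate_audio_args({"text": "Welcome aboard.", "pitch": 2, "speech_rate": -10}))
```

Returning a list rather than raising on the first failure lets an agent report every issue in one pass.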
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint: false (write operation). Description adds behavioral details: max text length (2048 chars), return format (audio_url, duration_seconds, generation_id), and that generations are saved to library. No contradictions.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is moderately sized and well-structured with logical flow. Could be slightly more concise, but every sentence adds information. Front-loaded with main purpose.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 8 parameters, no output schema, and no nested objects, the description covers all important aspects: generation process, parameter constraints, return values, and output behavior (saved to library). Comprehensive enough for an agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds significant context: it explains dialogue syntax using @Audio tags, clarifies that image_url cannot be combined with reference audio, and gives numeric ranges for speech_rate and pitch. It does not mention the format parameter, which is left to the schema. Adds value beyond raw schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clear verb and resource: 'Generate spoken audio from text'. Lists specific use cases (narration, voiceover, read-aloud, multi-voice dialogue) and distinguishes from siblings which are mostly visual tools (generate_image, generate_video, etc.).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit usage context: when to use reference_audio_url, image_url, and voice parameters. Implies restrictions (cannot combine image_url with reference audio). Could explicitly mention when not to use, but given sibling tools are mostly unrelated, it is clear enough.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_image | Generate Image | A

Generate one or more Switch images. Auto-routes to the right model based on subject (Nano Banana 2 default, GPT Image 2 for swimwear/beach, Switch Model/Ultra/Pro for sexier content, Nano Banana Pro for typography-heavy). Counts <= 8 render inline in chat; counts > 8 queue to your Switch Studio with progress polling. All images persist to your Studio library and folder. Pass an optional style (e.g. "wellness/warm_amber_tropical", "high_fashion_editorial/testino_glossy", "movie_scene/neon_noir_action") to apply a curated photographic stack from the apply_* skill tools.

Parameters
Name | Required | Description | Default
count | No | How many images to generate. Default 4. <= 8 returns inline, > 8 queues to Studio. Beta limit: max 50 per request; larger asks are capped at 50 and the response says so. |
model | No | Optional explicit model. If omitted, auto-routed based on subject content (see tool description). |
style | No | Optional curated style stack from the apply_* skill tools. Format "<skill>/<style_key>", e.g. "wellness/warm_amber_tropical" or "high_fashion_editorial/leibovitz_painterly". |
subject | Yes | Plain-English description of what to generate. E.g. "a woman walking through a hotel lobby" or "morning coffee on the balcony, model wearing a robe". |
folder_name | No | Optional Switch Studio folder name. Auto-created if missing. Defaults to the chat-derived title. |
aspect_ratio | No | Image aspect ratio. Default 9:16 (vertical, social-friendly). |
reference_image_urls | No | Optional public image URLs to use as face/body/scene references. |

Output Schema

Name | Required | Description
asset | No |
images | No |
_widget | No |
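The count thresholds and the style format described for generate_image can be sketched as a small planning helper. `plan_request` is hypothetical, and rejecting counts below 1 is an assumption; the limits 8 and 50 and the "<skill>/<style_key>" format come from the parameter descriptions.

```python
MAX_COUNT = 50     # beta cap stated in the `count` parameter description
INLINE_LIMIT = 8   # counts up to 8 render inline; larger counts queue to Studio


def plan_request(count=4, style=None):
    """Predict delivery for a generate_image call and split the optional style key."""
    if count < 1:
        raise ValueError("count must be at least 1")  # assumption, not documented
    effective = min(count, MAX_COUNT)
    plan = {
        "count": effective,
        "capped": effective < count,
        "delivery": "inline" if effective <= INLINE_LIMIT else "queued",
    }
    if style is not None:
        skill, sep, style_key = style.partition("/")
        if not (skill and sep and style_key):
            raise ValueError('style must look like "<skill>/<style_key>"')
        plan["skill"], plan["style_key"] = skill, style_key
    return plan


print(plan_request(12, style="wellness/warm_amber_tropical"))
```

Knowing in advance whether a request will be queued tells an agent whether it needs to follow up with status polling.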
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Description discloses key behaviors: auto-routing, persistence to library, count-based delivery, and style application. Annotations only provide readOnlyHint=false, so the description adds significant value beyond that.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph that front-loads the main purpose. While it is relatively long, every sentence serves a purpose. Some structuring (e.g., bullets) could improve readability but is not detrimental.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, 1 required, output schema available), the description covers model selection, count behavior, style integration, folder management, aspect ratio, and reference images. It is comprehensive for an AI agent to select and invoke the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, and the description adds further meaning: it explains auto-routing for model, count thresholds, and style format and source. These parameters are enriched beyond the schema, while the remaining ones (such as aspect_ratio and reference_image_urls) are documented in the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description begins with 'Generate one or more Switch images,' clearly stating the action and resource. It further specifies auto-routing based on subject, providing a clear purpose that distinguishes it from sibling style tools (apply_*).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides usage context such as auto-routing, inline vs. queue behavior based on count, and style options from apply_* tools. Lacks explicit 'when not to use' but offers clear situational guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_video | Generate Video | A

Generate Switch video across the real provider lineup (Kling, Seedance, Switch Video/WAN 2.7, Switch Video Edit, Topaz upscale) and modes (text-to-video, image-to-video, frame-to-frame, motion, omni, reference-to-video, video-edit, upscale). ALWAYS call list_video_models first to pick the right model + mode and see its required inputs. Pass one shot, or shots:[...] for a storyboard (max 4 by default, hard max 10) where EACH shot is DIFFERENT — never repeat one prompt to get copies. Renders async (~30-90s); a background job delivers each clip to your library. Returns a task_id per shot — poll get_video_status or list_my_videos.

ParametersJSON Schema
NameRequiredDescriptionDefault
modeNoVideo mode. Must be supported by the chosen model (see list_video_models).
audioNoOmni / Seedance refs: generate audio. Omni is ON by default; set false for a silent clip. Other models ignore this. See list_video_models for which models generate audio and the max seconds with vs without audio.
modelNoModel id from list_video_models (e.g. kling-v3, seedance-2.0-t2v, wan-2.7-t2v, topaz). Or prefer option_id from list_video_models.
shots | No | A storyboard of 1-10 DISTINCT shots. Each item takes the same fields as a single shot (subject, model, mode, image_url, etc.).
subject | No | The shot: subject + motion + scene (video needs motion language, e.g. "slow push-in").
duration | No | Clip length in seconds. Default 5. Seedance does 4-15s; Switch Video (WAN) does 5/10/15; Kling/Switch Video Edit cap at 10. See each model's durations in list_video_models.
image_url | No | Required for image-to-video / frame-to-frame / motion. Accepts EITHER a Switch asset id (from show_media / list_my_assets / upload_media) OR a public https url. An asset id is resolved server-side, so just pass the id you have; there is no need to fetch a url first.
option_id | No | Optional catalog id from list_video_models (e.g. "kling-image"); use instead of model+mode.
video_url | No | Required for video-edit and upscale (the source clip). Must be a publicly downloadable https URL.
resolution | No | Output resolution. Defaults to 1080p where the model supports it. 720p is cheaper and faster. 480p is the cheapest, only on Seedance 2.0 Mini (budget tier). 4K is only on Kling v3 text/image and Kling Omni; Seedance text-to-video is 720p only. Each model lists its available resolutions in list_video_models.
aspect_ratio | No | e.g. 9:16, 16:9, 1:1. Must be allowed for the model (see list_video_models).
end_image_url | No | End frame for frame-to-frame mode.
reference_audio_urls | No | Seedance reference/omni only: up to 3 reference audio files to drive synthesized audio. Requires at least one reference image or video.
reference_image_urls | No | Reference images. Each entry accepts EITHER a Switch asset id (from show_media / list_my_assets / upload_media / get_my_active_references) OR a public https url; asset ids are resolved server-side. Seedance reference/omni accepts up to 9; Kling Omni up to 7. For Seedance, at least one image or video reference is required.
reference_video_urls | No | Seedance reference/omni only: up to 3 reference video clips for motion/style guidance. A Seedance video ref can satisfy the required visual anchor. NOTE: the AUDIO track of these clips is IGNORED (never extracted or preserved).
character_orientation | No | Motion mode only: follow the character image (default) or the reference video.
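
The constraints in the parameter rows above can be checked on the client before a call is made. The following is a minimal sketch, not the server's own validation: the helper names `check_shot` and `check_storyboard` are hypothetical, and only limits stated in the rows above (1-10 shots, an https `video_url`, up to 3 reference videos and audio files, audio needing a visual reference) are encoded. Model-specific limits still have to come from list_video_models.

```python
from typing import Any


def check_shot(shot: dict[str, Any]) -> list[str]:
    """Return the problems found in one shot's fields (empty list if none)."""
    problems = []
    video_url = shot.get("video_url")
    if video_url is not None and not video_url.startswith("https://"):
        problems.append("video_url must be a public https URL")
    images = shot.get("reference_image_urls", [])
    videos = shot.get("reference_video_urls", [])
    audio = shot.get("reference_audio_urls", [])
    if len(videos) > 3:
        problems.append("at most 3 reference videos")
    if len(audio) > 3:
        problems.append("at most 3 reference audio files")
    if audio and not (images or videos):
        problems.append("reference audio needs at least one reference image or video")
    return problems


def check_storyboard(shots: list[dict[str, Any]]) -> list[str]:
    """Check the 1-10 shot limit, then every shot in the storyboard."""
    if not 1 <= len(shots) <= 10:
        return [f"expected 1-10 shots, got {len(shots)}"]
    return [
        f"shot {i}: {problem}"
        for i, shot in enumerate(shots)
        for problem in check_shot(shot)
    ]
```

An empty result means only that these few documented limits are satisfied; the server may still reject a request for reasons this sketch does not know about.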
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds substantial behavioral context beyond annotations: async rendering (~30-90s), background job delivery, return of task_id per shot, storyboard limits (max 4 default, hard max 10). No annotation contradiction exists as readOnlyHint is false, matching the mutation nature.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured paragraph that front-loads the core purpose, then covers prerequisites, usage patterns, async behavior, and return format. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with 16 parameters and async behavior, the description covers workflow, constraints, and return values (task_id per shot). It lacks explicit error handling or validation details, but the core usage is sufficiently explained.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, with each parameter having a rich description. The tool description adds minor usage context (e.g., 'shots' array, max shots, image_url asset id) but does not significantly enhance semantic understanding beyond what the schema already provides.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates Switch video across multiple providers and modes, and distinguishes itself from sibling tools like list_video_models and get_video_status. It specifies the verb 'generate' and the resource 'video', making its function unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly instructs to 'ALWAYS call list_video_models first' and explains async nature with polling. It also advises not to repeat prompts. However, it could more directly contrast with tools like analyze_video or stitch_videos to clarify when not to use it.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_my_active_references (Get Active References): A
Read-only, Idempotent

Read the user's staged references in Switch Studio. Returns TWO groups: (1) the image-generation reference strip (typed face/body/outfit/scenery/product slots) under refs, and (2) the VIDEO-tab references the user staged in the Omni/Image video tabs (the @Image1/@Image2 strip) under videoReferences, with usable signed URLs. Call this before generate_image or generate_video whenever the user says "use my refs" or refers to images they staged in Studio (including "the images in my video tab"). To make a video from the video-tab refs, pass videoReferences.imageUrls into generate_video reference_image_urls (and videoUrls into reference_video_urls) in reference-to-video / omni mode. Refs marked alive:false are dead (stored file gone) and are already excluded from the usable url lists. NOTE: a photo the user just attached in THIS chat is in neither group — for that, call upload_media and use its returned url/asset id directly.

Parameters

No parameters
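
The hand-off described above, from staged video-tab references to generate_video, might look like the sketch below. It assumes the tool result has already been parsed into a Python dict in which `videoReferences` holds `imageUrls` and `videoUrls` lists, as the description names them; the helper `video_ref_args` is hypothetical.

```python
from typing import Any


def video_ref_args(active_refs: dict[str, Any]) -> dict[str, list[str]]:
    """Map get_my_active_references output onto generate_video reference fields."""
    video_refs = active_refs.get("videoReferences") or {}
    image_urls = list(video_refs.get("imageUrls") or [])
    video_urls = list(video_refs.get("videoUrls") or [])
    args: dict[str, list[str]] = {}
    if image_urls:
        args["reference_image_urls"] = image_urls
    if video_urls:
        args["reference_video_urls"] = video_urls
    return args
```

Empty lists are left out so that the caller can tell "no staged references" apart from a usable set, for example to fall back to upload_media when the user attached a photo in the chat instead.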

Behavior: 4/5

Annotations already declare readOnlyHint=true and idempotentHint=true, so the description doesn't need to reiterate safety. It adds value by explaining that refs marked alive:false are excluded and that URLs are signed and usable. This provides useful behavioral context beyond what annotations offer.

Conciseness: 4/5

The description is longer than necessary but packs essential information without fluff. It front-loads the main purpose and then details groups and usage. Every sentence adds value, though it could be slightly more concise.

Completeness: 5/5

With no output schema, the description fully describes the return structure (refs, videoReferences, alive:false, signed URLs) and explains how to use the output for generate_video. It covers purpose, usage, edge cases, and integration with other tools, making it complete for a zero-parameter tool.

Parameters: 4/5

There are 0 parameters (schema coverage 100%), so the description has no parameter details to add. It compensates by explaining the return structure and how to use the output, which is valuable for a parameterless tool.

Purpose: 5/5

The description clearly states the tool reads the user's staged references in Switch Studio and returns two distinct groups: 'refs' (image-generation reference strip) and 'videoReferences' (VIDEO-tab references). This is specific and distinguishes it from sibling tools like upload_media or list_my_assets.

Usage Guidelines: 5/5

The description explicitly tells when to use the tool: 'Call this before generate_image or generate_video whenever the user says "use my refs" or refers to images they staged in Studio.' It also notes what is not included (a photo attached in chat) and directs to upload_media for that, and gives instructions for using videoReferences with generate_video.

get_video_status (Get Video Status): A
Read-only, Idempotent

Check the status of one of your video jobs by task_id (from generate_video) or job_id. Returns status, a viewable view_url when finished, or the error if it failed. Poll this every ~20s — do not loop rapidly.

Parameters

Name | Required | Description | Default
job_id | No | Alternatively, the job_id.
task_id | No | Task id returned by generate_video.
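
The polling guidance above can be sketched as a small loop. This is an illustration under stated assumptions: `call_tool` is a hypothetical stand-in for whatever MCP client function invokes a tool and returns its parsed result, and the terminal status names are borrowed from the list_my_videos filter values (`succeed`, `failed`) rather than confirmed for this tool.

```python
import time
from typing import Any, Callable

# Assumed terminal states, borrowed from the list_my_videos status filter values.
TERMINAL_STATES = {"succeed", "failed"}


def wait_for_video(
    call_tool: Callable[[str, dict[str, Any]], dict[str, Any]],
    task_id: str,
    interval_s: float = 20.0,
    max_polls: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict[str, Any]:
    """Poll get_video_status until the job finishes or max_polls is reached."""
    for attempt in range(max_polls):
        result = call_tool("get_video_status", {"task_id": task_id})
        if result.get("status") in TERMINAL_STATES:
            return result
        if attempt < max_polls - 1:
            sleep(interval_s)  # pause between polls instead of looping rapidly
    raise TimeoutError(f"task {task_id} not finished after {max_polls} polls")
```

The `sleep` argument is injectable only so the loop can be exercised without waiting; the 20-second default follows the interval the description recommends.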
Behavior: 4/5

Annotations already indicate readOnlyHint=true and idempotentHint=true, so no contradiction. The description adds value by explaining return values (status, view_url, error) and polling behavior, which goes beyond the annotations.

Conciseness: 5/5

Three concise sentences with front-loaded purpose. Every sentence adds value: purpose and parameters, returns, and polling guidance. No wasted words.

Completeness: 5/5

Despite no output schema, the description explains the return values (status, view_url, error) fully. It covers necessary context: identifiers, polling interval, and relationship to generate_video. Complete for a simple status-check tool.

Parameters: 3/5

Schema coverage is 100% with clear descriptions for both parameters. The description adds minimal extra context (e.g., that they are alternatives), but does not significantly augment what the schema already provides.

Purpose: 5/5

The description clearly states the tool checks the status of video jobs using task_id or job_id. It specifies the resource (video jobs) and the action (check status), and the mention of 'from generate_video' distinguishes it from general job status tools.

Usage Guidelines: 4/5

The description provides explicit polling guidance (poll roughly every 20s, do not loop rapidly) and context for when to use it (after generate_video). It does not explicitly state when not to use it or list alternatives among siblings.

lip_sync_video (Lip Sync Video): A

Lip-sync audio onto one of your videos. RECOMMENDED: action="create" with engine="best" + video_url + sound_file (base64 data URI) — syncs the whole clip on the highest-quality engine, no face step needed. Kling flow (manual timing control): (1) action="identify-face" with video_url (MP4/MOV, 2-60s, <=100MB, 720p/1080p); (2) action="create" with session_id + face_id + audio + timing IN MILLISECONDS (sound_start_time, sound_end_time, sound_insert_time) + optional speech_volume/original_audio_volume (0-100); (3) action="status" with the task_id to poll — returns a branded SwitchApp view_url when done. Charges credits on create; failed jobs are refunded.

Parameters

Name | Required | Description | Default
action | Yes | Which step to run.
engine | No | create: "best" = highest-quality whole-clip sync (needs only video_url + sound_file). Default "kling" (timeline flow).
face_id | No | create: a face_id from identify-face (one face supported).
task_id | No | status: the task_id from create.
audio_id | No | create: alternative to sound_file, an existing audio id.
video_url | No | identify-face: the source video (MP4/MOV, 2-60s, <=100MB, 720p/1080p). Use a SwitchApp/public URL.
session_id | No | create: from identify-face.
sound_file | No | create: base64 data URI of the audio (e.g. data:audio/mpeg;base64,...).
speech_volume | No | create: 0-100 (default 100).
sound_end_time | No | create: audio end, in MILLISECONDS.
sound_start_time | No | create: audio start, in MILLISECONDS.
sound_insert_time | No | create: where in the video to place the audio, in MILLISECONDS.
original_audio_volume | No | create: 0-100 (default 0).
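
Two details of the parameters above are easy to get wrong: `sound_file` must be a base64 data URI, and the Kling timing fields are in milliseconds. A minimal sketch of both, using only the Python standard library; the helper names are hypothetical, and the argument dict simply mirrors the recommended action="create" with engine="best" call.

```python
import base64
from typing import Any


def audio_data_uri(audio_bytes: bytes, mime: str = "audio/mpeg") -> str:
    """Encode raw audio bytes as a base64 data URI for the sound_file field."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"


def best_engine_args(video_url: str, audio_bytes: bytes) -> dict[str, Any]:
    """Arguments for the recommended whole-clip flow (no identify-face step)."""
    return {
        "action": "create",
        "engine": "best",
        "video_url": video_url,
        "sound_file": audio_data_uri(audio_bytes),
    }


def seconds_to_ms(seconds: float) -> int:
    """Convert seconds to the integer milliseconds the timing fields expect."""
    return round(seconds * 1000)
```

For the Kling flow, `seconds_to_ms` would be applied to sound_start_time, sound_end_time, and sound_insert_time before the create call.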
Behavior: 4/5

Annotations indicate readOnlyHint=false, but the description adds important behavioral details: charges credits on create, refunds failed jobs, requires specific video constraints (MP4/MOV, 2-60s, <=100MB, 720p/1080p). No contradiction with annotations.

Conciseness: 4/5

The description is relatively long but well-structured with a highlighted recommendation and step-by-step instructions. Every sentence adds value; no redundancy. The front-loaded recommendation helps quickly identify the simplest path.

Completeness: 4/5

Despite 13 parameters and a multi-step process, the description covers all necessary flows (recommended and Kling), timing constraints, file requirements, and return behavior (view_url on status). No output schema exists, but the description mentions the return value adequately.

Parameters: 4/5

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the recommended usage patterns, timing units (milliseconds), and relationship between parameters (e.g., sound_file vs audio_id, timing fields).

Purpose: 5/5

The description clearly states the tool's purpose: lip-sync audio onto a video. It specifies verb 'lip-sync' and resource 'video', and differentiates from siblings like talking_avatar_video or generate_video.

Usage Guidelines: 5/5

Explicitly recommends the best flow (action='create' with engine='best') and provides detailed step-by-step instructions for the Kling flow, including timing units and parameter dependencies.

list_generations (List Generations): A
Read-only

List your recent and active generation tasks. Returns counts per status (pending / running / completed / failed) plus an array of your tasks with id, status, prompts, model, ref counts, scheduledAt, finishedAt.

Parameters

Name | Required | Description | Default
limit | No | Default 10. Max 50.
status | No | "all" for everything, or array like ["pending","running"]. Default: active + recent.
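
The two parameters above accept a small set of shapes, which a client could normalize before calling the tool. A sketch under assumptions: the helper `list_generations_args` is hypothetical, the status names are taken from the counts listed in the description (pending / running / completed / failed), and the lower bound of 1 on `limit` is a guess since only the maximum is documented.

```python
from typing import Any, Optional, Sequence, Union

# Status names as listed in the tool description's per-status counts.
KNOWN_STATUSES = {"pending", "running", "completed", "failed"}


def list_generations_args(
    status: Optional[Union[str, Sequence[str]]] = None,
    limit: int = 10,
) -> dict[str, Any]:
    """Build list_generations arguments, clamping limit to the documented max of 50."""
    args: dict[str, Any] = {"limit": max(1, min(limit, 50))}
    if status is None:
        return args  # server default: active + recent
    if status == "all":
        args["status"] = "all"
        return args
    if isinstance(status, str):
        raise ValueError('status must be "all" or a list of status names')
    unknown = set(status) - KNOWN_STATUSES
    if unknown:
        raise ValueError(f"unknown status values: {sorted(unknown)}")
    args["status"] = list(status)
    return args
```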
Behavior: 4/5

Annotations already declare readOnlyHint=true. The description adds context about the return structure (counts per status, array of tasks with specific fields) and implies read-only behavior. No contradictions.

Conciseness: 4/5

The description is concise (2 sentences) and front-loads the core purpose. Every sentence adds value, though the parameter information is redundant with the schema.

Completeness: 4/5

Given the simplicity (2 params, no output schema), the description adequately covers the tool's behavior and return structure, including status counts and task fields. Minor gaps like pagination details are acceptable.

Parameters: 2/5

Schema coverage is 100% with detailed descriptions for limit and status. The description merely repeats the schema's defaults and allowed values without adding new semantic meaning or examples.

Purpose: 5/5

The description clearly states it lists your recent and active generation tasks, specifying the returned data including counts per status and an array with key fields. This distinguishes it from sibling tools like list_my_assets or show_generation.

Usage Guidelines: 3/5

The description provides default limits and status behavior but does not explicitly guide when to use this tool versus alternatives like show_generation or list_my_assets. No when-not-to-use guidance is given.

list_my_assets (List Assets): A
Read-only

Return asset METADATA only (id, truncated prompt, model, created date), newest first. This does NOT display images and must NOT be used to show pictures — if the user says "show me / display my last image(s)", call show_media instead (it renders them; pass count=N for several). Use list_my_assets only when you need ids/metadata for another tool (e.g. move_asset) or a plain text list.

Parameters

Name | Required | Description | Default
count | No | Default 20. Max 50.
Behavior: 4/5

Annotations already declare readOnlyHint=true. The description adds that it returns only metadata, not images, and orders by newest first. No contradictions, but it could include more detail on pagination or error behavior.

Conciseness: 5/5

Three sentences, no wasted words. Front-loaded with key information, efficient and clear.

Completeness: 5/5

Even without an output schema, the description explains the return structure (id, truncated prompt, model, created date) and ordering. For a single-parameter read-only tool, this is complete.

Parameters: 3/5

Schema coverage is 100% with a clear description for count (Default 20. Max 50.). The description does not add extra meaning beyond the schema, so the baseline score of 3 is appropriate.

Purpose: 5/5

The description explicitly states the tool returns asset metadata (id, truncated prompt, model, created date) newest first, and clearly distinguishes itself from show_media by stating it does not display images.

Usage Guidelines: 5/5

Provides explicit when-to-use and when-not-to-use guidance: use for ids/metadata for other tools or a plain text list; do not use for showing images, instead call show_media with count=N.

list_my_folders (List Folders): A
Read-only, Idempotent

List the folders in your Switch library (id, name, parent). Use this to find an existing folder before move_asset or create_folder.

Parameters

No parameters

Behavior: 3/5

Annotations already indicate readOnlyHint and idempotentHint. The description adds that it returns id, name, parent, but does not discuss pagination, ordering, or error behavior. Adequate but not extra.

Conciseness: 5/5

Two concise sentences, front-loaded with action and result, no wasted words.

Completeness: 4/5

For a simple list tool with no output schema, the description covers purpose, fields returned, and usage context. Could mention if folders are top-level only but generally complete.

Parameters: 4/5

No parameters exist, and schema coverage is 100%. Baseline is 4; no parameter explanation needed.

Purpose: 5/5

The description clearly states it lists folders, specifies returned fields (id, name, parent), and distinguishes from siblings like list_my_assets by targeting folders with a specific use case.

Usage Guidelines: 5/5

Explicitly advises using this tool to find an existing folder before move_asset or create_folder, providing clear context for when to invoke it.

list_my_videos (List Videos): A
Read-only

List your recent Switch videos, newest first — id, status, prompt, model, and a viewable view_url for finished clips. Use this to check whether videos finished and to let the user choose which one they want.

Parameters

Name | Required | Description | Default
count | No | How many to return. Default 10. Max 50.
status | No | Optional filter: submitted, processing, succeed, failed, or all.
Behavior: 4/5

Annotations already indicate readOnlyHint=true. Description adds that list is 'recent' and includes viewable URLs for finished clips. No contradiction; useful behavioral details beyond annotations.

Conciseness: 5/5

Two sentences, concise, front-loaded with purpose, no filler. Every sentence serves a purpose.

Completeness: 4/5

Covers purpose, returned fields, and use case. No output schema but the described fields suffice. Lacks mention of pagination, but max count 50 reduces need. Adequate for a simple list tool.

Parameters: 3/5

Both parameters (count, status) have complete descriptions in the input schema. Description does not add extra meaning beyond 'how many' and 'optional filter'. Schema coverage is 100%, so baseline 3 applies.

Purpose: 5/5

Description clearly states verb 'list', resource 'your recent Switch videos', sorting 'newest first', and lists returned fields (id, status, etc.). Differentiates from siblings like list_generations by specifying video-specific content.

Usage Guidelines: 4/5

Explicitly says to use for checking if videos finished and for user selection. Provides a clear use case. Lacks explicit exclusion of alternatives, but sufficient for a simple list tool.

list_video_models (List Video Models): A
Read-only, Idempotent

List the video providers, models, and modes available to your Switch account, with each model's required inputs, allowed aspect ratios and durations, and a rough per-second cost. Call this before generate_video so you pick a real model + mode and supply the right inputs.

Parameters

No parameters

Behavior: 4/5

Annotations already declare readOnlyHint and idempotentHint as true, making the safe, non-destructive nature clear. The description enriches this by detailing the output content (providers, models, modes, inputs, aspect ratios, durations, cost), adding behavioral context beyond annotations.

Conciseness: 5/5

Two sentences, no redundant words, front-loaded with actionable information. Every sentence serves a clear purpose: stating what the tool does and why it should be used.

Completeness: 5/5

Given the tool has no parameters and a read-only, idempotent nature, the description fully explains what it returns and its role in the workflow. No output schema is needed because the description lists the output categories.

Parameters: 4/5

With zero parameters and 100% schema coverage, the description has no need to explain parameters. Baseline score of 4 is appropriate as it adds no parameter detail required.

Purpose: 5/5

The description uses a specific verb ('List') and identifies the resource ('video providers, models, and modes') with detailed attributes (required inputs, aspect ratios, durations, cost). It clearly distinguishes from sibling tools like explore_models or generate_image by focusing on video-specific metadata for pre-generation selection.

Usage Guidelines: 5/5

Explicitly advises 'Call this before generate_video so you pick a real model + mode and supply the right inputs,' providing both when-to-use and the tool's role in a workflow. This directly guides an AI agent away from incorrect tool selection.

search_my_library (Search Library): A
Read-only

Search your library by prompt substring (metadata only — id, prompt, date). Optional folderId scopes to one folder. Only your own assets are returned. This does NOT display images; to show/display results to the user, pass their ids to show_media.

Parameters

Name | Required | Description | Default
limit | No | Default 20.
query | Yes |
folderId | No |
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, so the read-only behavior is covered. The description adds key behavioral details: only returns user's own assets, searches only metadata (not content), and does not display images. This adds value beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two well-structured sentences. No superfluous information; every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a search tool with no output schema, the description is fairly complete. It explains what is searched, what is returned, and what is not. Could mention pagination or response format, but not necessary for basic use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is low (33%), but the description compensates by explaining that 'query' is a prompt substring and 'folderId' scopes to one folder; the schema itself documents that 'limit' defaults to 20. This adds meaning beyond the schema's limited descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches the library by prompt substring (metadata only), which is a specific verb and resource. It distinguishes from siblings like show_media and list_my_assets by noting it does not display images and returns only metadata, not the assets themselves.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says when to use (searching metadata), when not to use (not for displaying images), and directs to the sibling tool show_media for displaying. It also mentions optional scoping by folderId, providing clear context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

show_generation: Show Generation (grade A)
Read-only

Get the full detail of one of your generations by task id — prompts, model, ref counts, saved/failed counts, ETA hint, asset ids.

Parameters
Name | Required | Description | Default
taskId | Yes | Task id from generate_image or list_generations. | -
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true. Description adds details on what the response contains (prompts, model, ref counts, etc.), providing behavioral context beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Single sentence that is front-loaded with the main purpose and includes specific fields in a list, no redundant information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Low complexity tool with one parameter and no output schema. Description adequately explains the return value fields, though a more structured list could be clearer. Sufficient for an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema covers taskId with 100% coverage and already notes that the id comes from generate_image or list_generations. The description adds that the lookup is by task id and returns one generation, which aids correct parameter selection.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description uses specific verb 'Get' and resource 'full detail of one of your generations' and lists included fields (prompts, model, ref counts, etc.), clearly distinguishing from sibling tools like list_generations.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

States usage context: requires a taskId from generate_image or list_generations, but does not explicitly mention when not to use or alternatives. Clear enough for this simple tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

show_media: Show Media (grade A)
Read-only, Idempotent

Display the user's images inline — one or many. Users speak plainly and will NOT know asset ids; never ask for one, resolve it yourself. For "show me" or "show me my last image" call with NO arguments (shows the most recent image). For "show me my last 4 images / my last 10 pictures" pass count=N (returns a clean grid, up to 12). For a specific known image pass assetId. Renders a branded SwitchApp media card with a Download action per result; do not just print URLs. (Videos are not shown here — use list_my_videos and return the newest finished video's view_url, which plays.)

Parameters
Name | Required | Description | Default
count | No | Optional. How many of the most recent images to show as a grid (default 1, max 12). Use when the user says "my last N images/pictures". | -
assetId | No | Optional. A specific image id (from list_my_assets, search_my_library, or show_generation). Omit to show the most recent image(s). | -

Output Schema

Parameters
Name | Required | Description
asset | No | -
images | No | -
_widget | No | -
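
The three calling patterns in the description (no arguments, count=N, assetId) can be sketched as a small argument builder. This is only an illustration: the function name `show_media_arguments` is hypothetical, and clamping to 12 on the client side is an assumption, since the listing does not say how the server treats larger counts.

```python
from typing import Optional

MAX_GRID = 12  # the description caps the grid at 12 images


def show_media_arguments(count: Optional[int] = None,
                         asset_id: Optional[str] = None) -> dict:
    """Return the arguments dict for a show_media call."""
    if asset_id is not None:
        # A specific image whose id the agent has already resolved.
        return {"assetId": asset_id}
    if count is None or count <= 1:
        # "show me" / "show me my last image": call with no arguments.
        return {}
    # "show me my last N images": clamp to the documented maximum (assumed
    # client-side behavior, not confirmed by the listing).
    return {"count": min(count, MAX_GRID)}


print(show_media_arguments())          # most recent image
print(show_media_arguments(count=4))   # grid of the last four images
```

The agent, not the user, supplies `asset_id`; it would come from list_my_assets, search_my_library, or show_generation.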
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already mark readOnlyHint and idempotentHint as true. Description adds crucial behavioral context: it renders a branded SwitchApp media card with a Download action, not just URLs. Also states users will not know asset IDs, so the agent must resolve them.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is compact for its scope, using seven short sentences to cover purpose, use cases, constraints, and sibling differentiation. No fluff; every sentence carries weight.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has two optional parameters, full schema coverage, output schema, and clear annotations, the description completes the picture by adding usage patterns, behavioral traits, and exclusions. Nothing missing for an AI agent to use correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. The tool description enriches understanding: count default is 1, max 12; assetId is for specific known images; omitting shows most recent. This adds practical meaning beyond schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states the tool displays user images inline. Distinguishes from list_my_videos by noting that videos are not shown here. Provides specific verbs like 'Display', 'show', and differentiates between showing most recent, a grid, or a specific asset.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly describes when to call with no arguments (most recent), count=N (grid up to 12), or assetId (specific image). Instructs not to ask users for asset IDs, resolving them automatically. Also tells agent to use list_my_videos for videos instead of this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

stitch_videos: Stitch Videos (grade A)

Stitch several of your Switch videos together into ONE video, played back-to-back in the order you give. Pass clip_asset_ids: an ORDERED list of your video ids (get them from list_my_videos) — the first id plays first. Optional orientation (landscape|portrait|square), fps, quality. Renders the combined video with ffmpeg and returns the finished, downloadable video url right away (also saved to list_my_videos). Use this whenever the user wants to combine, join, merge, or concatenate multiple clips into one.

Parameters
Name | Required | Description | Default
fps | No | Frames per second. Default 30. | -
quality | No | draft, standard (default), or high. | -
orientation | No | landscape (1920x1080, default), portrait (1080x1920), or square (1080x1080). | -
project_name | No | Optional name for the output video. | -
clip_asset_ids | Yes | Ordered list of your video ids (from list_my_videos). At least 2. Output order = this order. | -
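
A minimal sketch of client-side checks for the constraints listed above (at least two clips, order preserved, enumerated orientation and quality values). The helper name `stitch_videos_arguments` and the sample ids are hypothetical; the server may validate differently.

```python
from typing import Optional, Sequence

ORIENTATIONS = {"landscape", "portrait", "square"}
QUALITIES = {"draft", "standard", "high"}


def stitch_videos_arguments(clip_asset_ids: Sequence[str],
                            orientation: str = "landscape",
                            fps: int = 30,
                            quality: str = "standard",
                            project_name: Optional[str] = None) -> dict:
    """Return the arguments dict for a stitch_videos call."""
    if len(clip_asset_ids) < 2:
        raise ValueError("clip_asset_ids needs at least 2 video ids")
    if orientation not in ORIENTATIONS:
        raise ValueError(f"orientation must be one of {sorted(ORIENTATIONS)}")
    if quality not in QUALITIES:
        raise ValueError(f"quality must be one of {sorted(QUALITIES)}")
    arguments = {
        # Keep the caller's order: the first id plays first.
        "clip_asset_ids": list(clip_asset_ids),
        "orientation": orientation,
        "fps": fps,
        "quality": quality,
    }
    if project_name:
        arguments["project_name"] = project_name
    return arguments


# Hypothetical ids, as they might be returned by list_my_videos.
print(stitch_videos_arguments(["vid-intro", "vid-main"], orientation="portrait"))
```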
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations only indicate non-read-only. The description adds context: the rendering process uses ffmpeg, returns a downloadable URL immediately, and saves the result to list_my_videos. No contradictions; behavioral traits are well disclosed beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is five sentences, front-loaded with the core purpose, and every sentence carries needed information. No redundancy or fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 5 parameters and no output schema, the description covers input requirements (ordered list, source), optional parameters, output format (URL and saved to list), and even mentions the underlying technology (ffmpeg). It is sufficiently complete for an agent to invoke correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value by explaining the ordering of clip_asset_ids and where to obtain them, plus listing optional params with defaults. This extra context enhances understanding beyond the schema alone.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool stitches several Switch videos into one, with a specific verb ('stitch') and resource ('videos'). It distinguishes from sibling tools (e.g., apply_* effects, generate_video), as none of them perform video concatenation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says to use this tool when the user wants to combine, join, merge, or concatenate clips. It also advises getting clip_asset_ids from list_my_videos. However, it does not explicitly mention when not to use it or provide alternatives for other scenarios.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

talking_avatar_video: Talking Avatar Video (grade A)

Turn a face photo into a lip-synced talking-head video that speaks your text (or your audio). Provide image_url (a clear face photo) and either script (text to speak, max 2500 characters) or audio_url. Optional voice_id / language / voice_settings. Renders in ~1-5 minutes (single call, returns the finished branded video) and is saved to your library. Charged per video.

Parameters
Name | Required | Description | Default
script | No | Text the avatar speaks. Max 2500 characters. Required unless audio_url is given. | -
language | No | Optional language code (default en). | -
voice_id | No | Optional voice id (from clone_voice / your library). | -
audio_url | No | Pre-recorded audio URL to lip-sync instead of generating speech from script. | -
image_url | Yes | A clear face photo (Switch/public URL). Required. | -
voice_settings | No | Optional: { stability, similarityBoost, style, useSpeakerBoost } 0-1. | -
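
The "script or audio_url" rule and the 2500-character limit can be checked before the call, roughly as below. The helper name `talking_avatar_arguments` and the example URL are hypothetical. The listing does not say what happens when both script and audio_url are supplied, so this sketch simply passes both through.

```python
from typing import Optional

MAX_SCRIPT_CHARS = 2500  # limit stated in the description and schema


def talking_avatar_arguments(image_url: str,
                             script: Optional[str] = None,
                             audio_url: Optional[str] = None,
                             voice_id: Optional[str] = None,
                             language: Optional[str] = None) -> dict:
    """Return the arguments dict for a talking_avatar_video call."""
    if not image_url:
        raise ValueError("image_url is required")
    if script is None and audio_url is None:
        raise ValueError("provide script or audio_url")
    if script is not None and len(script) > MAX_SCRIPT_CHARS:
        raise ValueError(f"script exceeds {MAX_SCRIPT_CHARS} characters")
    arguments = {"image_url": image_url}
    optional = {"script": script, "audio_url": audio_url,
                "voice_id": voice_id, "language": language}
    # Send only the optional fields the caller actually set.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return arguments


# Hypothetical image URL, for illustration only.
print(talking_avatar_arguments("https://example.com/face.png", script="Hello there."))
```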
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description goes well beyond the minimal annotation (readOnlyHint=false) by detailing rendering time (~1-5 minutes), synchronous completion ('single call, returns the finished branded video'), persistence ('saved to your library'), and cost ('Charged per video'). This provides rich behavioral context for the agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise (~70 words) and front-loaded with the core action. Each sentence adds distinct information (purpose, required inputs, optional inputs, behavior, cost). Could be slightly more structured (e.g., bullet points), but it remains clear and efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool complexity (6 params, no output schema, nested objects), the description covers purpose, inputs, timing, storage, and cost. It lacks details on output format or error handling, but the mention of 'returns the finished branded video' provides some closure. The lack of differentiation from sibling 'lip_sync_video' is a minor gap.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema coverage is 100%, so baseline is 3. The description adds value by clarifying that image_url should be a 'clear face photo', emphasizing the required-or-alternative relationship between script and audio_url, and noting the optional voice_settings. It does not repeat all schema details, but the added nuance justifies a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it creates a talking-head video from a face photo and text/audio. It specifies the key inputs (image_url, script/audio_url) and output (lip-synced video). However, it does not explicitly differentiate from the sibling tool 'lip_sync_video', which appears similar, so purpose clarity is strong but not perfect.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use the tool (to create a talking-head video) but does not provide explicit guidance on when not to use it or alternatives. It mentions the two input modes (script or audio_url) but lacks context on scenarios where one might prefer the sibling 'lip_sync_video' or 'generate_video'.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

upload_media: Upload Media (grade A)

Upload one image into your Switch library in a single call. Pass url (any public https) OR base64 + mime. Switch fetches/decodes it server-side, stores it, and returns a clean public URL plus the new asset id. This is THE way to use a photo the user attached in chat as a reference: pass the returned url directly into generate_image's reference_image_urls, OR into generate_video's image_url (image-to-video) or reference_image_urls (reference / omni video). The returned URL is provider-fetchable as-is — no presigned PUT, no curl, no confirm-upload step. Do NOT call get_my_active_references for a chat-attached photo; that strip only holds Studio-managed refs.

Parameters
Name | Required | Description | Default
url | No | Any public https URL; Switch fetches it server-side. | -
mime | No | MIME type when sending base64. Default image/png. | -
base64 | No | Base64-encoded image bytes (use this when there is no public URL). | -
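
The "url OR base64 + mime" choice could be prepared as in this sketch. The helper name `upload_media_arguments` is hypothetical, and treating the two inputs as mutually exclusive is an assumption drawn from the wording of the description, not a confirmed server rule.

```python
import base64
from typing import Optional


def upload_media_arguments(url: Optional[str] = None,
                           data: Optional[bytes] = None,
                           mime: str = "image/png") -> dict:
    """Return the arguments dict for an upload_media call."""
    # Assumption: exactly one of url or raw bytes is supplied.
    if (url is None) == (data is None):
        raise ValueError("pass exactly one of url or data")
    if url is not None:
        if not url.startswith("https://"):
            raise ValueError("url must be a public https URL")
        return {"url": url}
    # No public URL: send the bytes base64-encoded along with their MIME type.
    return {"base64": base64.b64encode(data).decode("ascii"), "mime": mime}


print(upload_media_arguments(url="https://example.com/photo.jpg"))
print(upload_media_arguments(data=b"abc"))
```

Per the description, the URL returned by the tool would then go into generate_image's reference_image_urls or generate_video's image_url.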
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations only show readOnlyHint=false and idempotentHint=false. Description adds that Switch fetches/decodes server-side, stores, returns public URL and asset id, and notes no presigned PUT or confirm-upload step needed.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Dense paragraph with every sentence adding value. Could be slightly restructured for clarity, but no wasted words given the complexity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

No output schema, but description explicitly states return values (public URL and asset id). Provides usage context with generate_image/generate_video. Complete enough for agent to use tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but description adds value by explaining usage of url vs base64+mime, implying mutual exclusivity, and giving context like 'any public https' and 'when there is no public URL'.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

States verb (upload), resource (image into Switch library), and distinguishes from siblings like get_my_active_references. Also clarifies it's for chat-attached photos, not Studio-managed refs.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says when to use (for chat-attached photos) and when not to (do NOT call get_my_active_references). Provides clear alternative: pass returned URL into generate_image or generate_video.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

upload_reference_asset: Upload Reference (grade A)

Upload an image, video, or audio reference into Switch cloud and get a ready-to-use reference URL. Pass kind=image|video|audio. Returns reference_image_urls / reference_video_urls / reference_audio_urls for generate_image and generate_video. Image and video references are also added to your active Studio reference strip (the same one your desktop uses) unless activate=false. PREFERRED for real files: call with presign=true to get an upload_url, PUT the bytes straight to it (no base64 through the model), then call again with confirm_path to verify and add it — works for image, video, and audio. base64/url is only for tiny inline files.

Parameters
Name | Required | Description | Default
url | No | Public https URL to fetch server-side. | -
kind | Yes | Reference type to upload. | -
mime | No | MIME for base64. Images: jpg/png/webp/gif. Videos: mp4/mov. Audio: mp3/wav/m4a/aac. | -
base64 | No | Base64 bytes (optionally a data: URL). Best for small files; large video should use presign. | -
presign | No | Return an upload_url to PUT the file bytes directly to (no base64). Video always; image/audio when enabled. | -
activate | No | Image/video: add to the active Studio reference strip. Default true. Audio never touches the strip. | -
filename | No | Optional source filename for extension/display. | -
frame_type | No | Image strip label: ref (default), face, body, clothes, scenery, product, typography. | -
confirm_path | No | The storage_path from a presign call, after you PUT the file; verifies the object, records it, and adds it to your strip. | -
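
The preferred presign flow (request an upload_url, PUT the bytes, then confirm) might look roughly like this. The three helper names are hypothetical, and the Content-Type header on the PUT is an assumption. The names `upload_url`, `storage_path`, and `confirm_path` are taken from the listing, but the exact response shape is not documented here.

```python
import urllib.request
from typing import Optional

KINDS = {"image", "video", "audio"}


def presign_arguments(kind: str, filename: Optional[str] = None) -> dict:
    """Step 1: ask for an upload_url instead of sending base64."""
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {sorted(KINDS)}")
    arguments = {"kind": kind, "presign": True}
    if filename:
        arguments["filename"] = filename
    return arguments


def build_put_request(upload_url: str, data: bytes,
                      mime: str) -> urllib.request.Request:
    """Step 2: prepare the PUT of the raw bytes (send it with urlopen)."""
    return urllib.request.Request(
        upload_url, data=data, method="PUT",
        headers={"Content-Type": mime},  # assumed header, not confirmed
    )


def confirm_arguments(kind: str, storage_path: str,
                      activate: bool = True) -> dict:
    """Step 3: call the tool again so the upload is verified and recorded."""
    if kind not in KINDS:
        raise ValueError(f"kind must be one of {sorted(KINDS)}")
    return {"kind": kind, "confirm_path": storage_path, "activate": activate}


print(presign_arguments("video", filename="clip.mp4"))
```

Keeping the bytes out of the tool call is the point of the flow: only small JSON arguments pass through the model.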
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Adds significant behavioral context beyond annotations: describes the two-step presign workflow, the effect on the active Studio reference strip (and activate=false to skip), and that audio never touches the strip. Annotations indicate readOnlyHint=false (mutation) and idempotentHint=false, which align with the description.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured and front-loaded with the main purpose. It's slightly verbose but each sentence adds value (e.g., presign flow, activate behavior). Could be tightened slightly, but overall efficient.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (9 parameters, presign flow, no output schema), the description covers the key aspects: output URLs, strip addition, and the two-step upload process. Could mention error handling or size limits, but it's largely complete for a tool with good annotations.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, but the description adds meaning by explaining the presign/confirm_path workflow, the activate behavior, and the size recommendation for base64 vs presign. This goes beyond the schema's basic parameter descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: uploading image, video, or audio references into Switch cloud to obtain a ready-to-use URL. It explicitly links to downstream tools (generate_image, generate_video) and distinguishes from sibling tools like upload_media by specifying the reference context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Provides explicit guidance on preferred usage (presign for real files, base64/url for tiny inline) and explains the activate parameter difference between image/video and audio. However, it doesn't explicitly mention when to avoid this tool in favor of alternatives like upload_media.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

voice: Manage Voices (grade A)

Manage custom voices for talking_avatar_video. action="clone" registers a voice from audio_sample_url (a 10-30 second clip) under voice_name (charged 2 credits, sample stored durably) and returns a voice_id; action="list" returns your saved voices; action="delete" removes one by voice_id. Use the returned voice_id as talking_avatar_video.voice_id.

Parameters
Name | Required | Description | Default
action | Yes | Which operation to run. | -
voice_id | No | delete: the voice_id to remove. | -
voice_name | No | clone: a name for the voice (unique per account). | -
audio_sample_url | No | clone: a 10-30 second voice sample URL (reachable). | -
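
Because one tool multiplexes three actions, the per-action requirements are easy to get wrong. The sketch below encodes them as described above; the helper name `voice_arguments` and the sample values are hypothetical.

```python
from typing import Optional


def voice_arguments(action: str,
                    voice_id: Optional[str] = None,
                    voice_name: Optional[str] = None,
                    audio_sample_url: Optional[str] = None) -> dict:
    """Return the arguments dict for a voice call, checked per action."""
    if action == "list":
        return {"action": "list"}
    if action == "delete":
        if not voice_id:
            raise ValueError("delete requires voice_id")
        return {"action": "delete", "voice_id": voice_id}
    if action == "clone":
        # clone is the charged action; it needs a name and a sample clip.
        if not (voice_name and audio_sample_url):
            raise ValueError("clone requires voice_name and audio_sample_url")
        return {"action": "clone", "voice_name": voice_name,
                "audio_sample_url": audio_sample_url}
    raise ValueError("action must be clone, list, or delete")


# Hypothetical name and sample URL, for illustration only.
print(voice_arguments("clone", voice_name="narrator",
                      audio_sample_url="https://example.com/sample.mp3"))
```

The voice_id returned by a clone call is what talking_avatar_video expects in its voice_id parameter.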
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses side effects of clone (charges 2 credits, stores sample durably, returns voice_id) and implies mutation (delete, clone). Annotations indicate readOnlyHint=false, which is consistent. The description adds value beyond annotations by detailing costs and durability.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is only three sentences: the first states the purpose, the second covers all three actions and their parameters concisely, and the third provides a critical usage hint. No wasted words; every sentence earns its place.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description covers all three actions with parameter requirements, side effects, and usage context. However, it lacks details on the return format for 'list' and 'delete' operations (e.g., array of voice objects). Given no output schema, slightly more detail would enhance completeness.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema description coverage, the schema already defines the parameters, including that audio_sample_url must be reachable and voice_name is unique per account. The description adds operational context (e.g., the 10-30 second sample length, the 2-credit clone charge) and clarifies the relationship between action and required parameters, enhancing understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly identifies the tool as managing custom voices for talking_avatar_video, listing three distinct actions (clone, list, delete) with specific operations and resources, differentiating it from sibling tools like talking_avatar_video.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit instructions for each action, including constraints (e.g., audio sample length of 10-30 seconds, unique voice_name, cost of 2 credits) and how to use the returned voice_id. While it lacks explicit 'when not to use' guidance, the context is sufficient for selecting the correct action.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
