AetherWave Studio
Server Details
One tool surface for music, image, video, and audio generation across Suno, Grok Imagine, Seedance, Kling, Hailuo, Wan, VEO, Ideogram, and GPT Image 2. Generate, edit, upscale, reframe, and master through one credit pool. Connect in one click with OAuth, no API key required.
- Status
- Healthy
- Last Tested
- Transport
- Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.6/5 across 16 of 16 tools scored.
Each tool targets a distinct media operation (image, video, audio, listing, mastering, etc.) with clear boundaries. Even similar tools like generate_image and edit_image are differentiated by their primary intent (creation vs. modification) and model selection guidance.
All tools follow the 'aetherwave_verb_noun' pattern consistently, using snake_case. Verbs and nouns are descriptive and predictable (e.g., generate_image, list_video_models, remove_background_video).
16 tools is slightly above the ideal range (3-15) but remains well-scoped for a multimedia generation platform covering image, video, audio, and user management. Each tool serves a distinct purpose, and no obvious bloat exists.
The tool surface covers core creation, editing, listing, and enhancement workflows for images, videos, and audio. Minor gaps exist (e.g., no delete tool, no get-single-creation tool), but the essential lifecycle is well-covered.
Available Tools
16 tools
aetherwave_balance
Check credit balance
Grade: A, read-only
Returns the current AetherWave credit balance for the API key. Use this BEFORE a generation to confirm sufficient credits, especially for video which can cost 30-300+ credits depending on model/duration/resolution.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
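The "check before you generate" advice can be sketched as a small pre-flight helper. This is a minimal illustration, not the server's API: `call_tool` stands in for whatever MCP client function sends a tool call, and the `credits` field in the response is an assumption, since the listing does not document the response shape.

```python
from typing import Callable

# Hypothetical signature for the MCP client function that sends a tool call.
ToolCaller = Callable[[str, dict], dict]


def has_enough_credits(call_tool: ToolCaller, estimated_cost: int) -> bool:
    """Call aetherwave_balance and compare the result with an estimated cost."""
    result = call_tool("aetherwave_balance", {})
    # ASSUMPTION: the balance arrives as an integer under a "credits" key.
    return int(result["credits"]) >= estimated_cost


def fake_call_tool(name: str, arguments: dict) -> dict:
    """Stand-in for a real MCP client; always reports 120 credits."""
    return {"credits": 120}


print(has_enough_credits(fake_call_tool, 30))   # a cheap clip
print(has_enough_credits(fake_call_tool, 300))  # a long, expensive video
```

Passing the caller in as a parameter keeps the check testable without a live connection.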
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and openWorldHint=true, so the description is not required to cover safety. It adds value by specifying that the balance is tied to the API key and explaining the credit cost range for video, but does not elaborate on other behavioral aspects like rate limits or caching.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences with zero wasted words. The first sentence states the core functionality, and the second provides actionable usage guidance. Information is front-loaded.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple read-only balance check with no parameters and no output schema, the description is fully complete. It covers what the tool returns, its purpose, and when to use it, integrating cost context for better decision-making.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The tool has no parameters, and schema description coverage is 100%. Per scoring guidelines, baseline is 3. The description does not add parameter-specific information but provides context on when to use the tool.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Returns the current AetherWave credit balance for the API key.' This is a specific verb and resource, and it distinguishes from sibling generation tools by implying it is a prerequisite check.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises to use this tool BEFORE a generation to confirm sufficient credits, especially for video generation which can vary in cost. This provides clear context and when-to-use guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_edit_image
Edit image with AI (I2I)
Grade: A
Edits an existing image guided by a text prompt. Pass a public imageUrl plus a prompt describing the change ("add a moon to the sky", "swap the background for a neon city", "make it look like a comic panel"). Submits, polls, and returns the edited image URL(s). Default model is 'grok-imagine-i2i' (6 cr per call, returns 2 variations, ~30s, best cost-to-quality on standard edits). Other I2I-capable models: 'seedream-v4-edit', 'wan-2.5-spicy-i2i', 'flux-kontext-pro', 'qwen-image-edit', 'gpt-image-1.5-i2i' (slow, ~5min). Use list_image_models for full lineup. Note: source URLs with spaces or parentheses may fail upstream; prefer clean URLs.
Model selection guide for edits
Default: grok-imagine-i2i (6 cr per call, returns 2 variations = 3 cr/image effective, fast ~30s, strong general-purpose edit quality).
Pick a different model when:
- Need a single deterministic output, or 4K resolution -> seedream-v4-edit (7 cr per image, supports 1K/2K/4K, multi-image up to 6)
- Subtle edits / preserve composition / character consistency -> flux-kontext-pro or flux-kontext-max
- NSFW edits -> wan-2.5-spicy-i2i
- Highest quality, time is not a concern (~5 min OK) -> gpt-image-1.5-i2i or grok-imagine-quality-i2i (16 cr @ 1K, 22 cr @ 2K)
- Stylized / artistic transformation -> midjourney-i2i
If the user simply says "edit this image" with no other signal, default to grok-imagine-i2i.
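The edit-model guide above can be read as a small decision function. The sketch below is one possible encoding, not the server's own logic: the flag names and the order in which signals are checked are assumptions, while the model IDs come from the guide.

```python
def pick_edit_model(
    *,
    nsfw: bool = False,
    needs_4k: bool = False,
    single_output: bool = False,
    preserve_composition: bool = False,
    max_quality: bool = False,
    stylized: bool = False,
) -> str:
    """Map edit signals to a model ID; the priority order is an assumption."""
    if nsfw:
        return "wan-2.5-spicy-i2i"
    if needs_4k or single_output:
        return "seedream-v4-edit"
    if preserve_composition:
        return "flux-kontext-pro"
    if max_quality:
        return "gpt-image-1.5-i2i"  # slow, around 5 minutes per the guide
    if stylized:
        return "midjourney-i2i"
    # No signal at all: fall back to the documented default.
    return "grok-imagine-i2i"


print(pick_edit_model())
print(pick_edit_model(needs_4k=True))
```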
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model ID. Defaults to 'grok-imagine-i2i' (3 cr/image effective, 2 outputs). Other options: 'seedream-v4-edit', 'wan-2.5-spicy-i2i', 'flux-kontext-pro', 'qwen-image-edit', 'gpt-image-1.5-i2i', 'grok-imagine-quality-i2i'. Use list_image_models for the full list. | |
| prompt | Yes | Text description of the edit (e.g. 'replace the sky with sunset clouds'). | |
| quality | No | Quality preset for models that support it (e.g. GPT Image 2). | |
| imageUrl | Yes | Public URL of the source image to edit. Must be a real, fetchable URL. | |
| maxImages | No | Number of variations to return for multi-output models. | |
| resolution | No | Output resolution. Tiered-pricing models accept '1K' / '2K'. | |
| aspectRatio | No | Output aspect ratio (e.g. '1:1', '16:9'). Defaults to the source ratio for most models. | |
| renderingSpeed | No | Rendering speed preset for models that support it. | |
| negative_prompt | No | What to avoid in the output (supported by some models). | |
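Since source URLs with spaces or parentheses may fail upstream, a client can reject them before spending credits. The following is a sketch under stated assumptions: the helper names are hypothetical, and the check covers only the characters the description warns about, not everything an upstream provider might refuse.

```python
from typing import Optional
from urllib.parse import urlsplit

# Characters the tool description flags as risky in source URLs.
RISKY_CHARS = set(" ()")


def is_clean_url(url: str) -> bool:
    """True for an http(s) URL with a host and no spaces or parentheses."""
    parts = urlsplit(url)
    has_host = parts.scheme in ("http", "https") and bool(parts.netloc)
    return has_host and not (RISKY_CHARS & set(url))


def build_edit_args(image_url: str, prompt: str, model: Optional[str] = None) -> dict:
    """Assemble arguments for aetherwave_edit_image (parameter names from the table)."""
    if not is_clean_url(image_url):
        raise ValueError(f"imageUrl looks unsafe for upstream providers: {image_url!r}")
    args = {"imageUrl": image_url, "prompt": prompt}
    if model is not None:
        args["model"] = model  # omit to let the server use its default
    return args


print(build_edit_args("https://example.com/cat.png", "add a moon to the sky"))
```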
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses submission and polling behavior, return format (URL(s)), cost (6 cr per call), timing (~30s default), and model-specific behaviors (e.g., slow ~5min for gpt-image-1.5-i2i). Warns about URL pitfalls. No annotation contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with front-loaded purpose and detailed guide; slightly verbose but every section adds essential context. Model selection table is excellent but could be trimmed.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex tool with 9 parameters and no output schema, description fully covers return format, polling, edge cases, and model-specific behaviors. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds value by explaining default model behavior, prompt examples, and specific parameter contexts like resolution tiers and maxImages for multi-output models.
Does the description clearly state what the tool does and how it differs from similar tools?
Description uses specific verb 'edits' and resource 'existing image', with example prompts. Distinguishes from siblings like aetherwave_generate_image by focusing on image-to-image editing.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Contains an explicit model selection guide covering various use cases (single output, NSFW, high quality), default recommendation, and when to use alternatives. Mentions sibling tool list_image_models for full lineup.
aetherwave_generate_image
Generate image (Grok Imagine, GPT Image 2, Seedream V4, Wan, Imagen 4, Nano Banana, Ideogram V3, Z-Image Turbo)
Grade: A
Generates one or more images from a text prompt (T2I) or a text prompt + reference image(s) (I2I). Submits the job, polls until terminal, and returns the final image URLs. Default model is 'grok-imagine-t2i' (fast, 6 images per generation, 5 credits). Use list_image_models to see the full lineup with pricing. For I2I, pass referenceImages as an array of public image URLs and pick a model with I2I support (e.g. 'grok-imagine-i2i', 'wan-2.5-spicy-i2i').
Model selection guide (when the user does not specify a model)
Default: grok-imagine-t2i (5 cr, 6 outputs per call, fast, general purpose).
Strong recommendation: when a single high-quality output is what's wanted (most agent / one-shot workflows), prefer gpt-image-2-t2i (9 cr @ 1K / higher @ 2K, single deterministic image, best general quality across realism, illustration, typography, and composition; supports up to 2K resolution and most aspect ratios including auto). This is the front-runner for serious creative output where you don't need to pick from 6 variations.
Pick a different model when the prompt has these signals:
- "single best result" / "one image" / production / no time to pick from variations -> gpt-image-2-t2i (9 cr, 1 output, top general quality)
- "photoreal" / "photo of" / "realistic" -> gpt-image-2-t2i (9 cr, best general realism) or imagen-4 (12 cr, very high quality) or z-image-turbo (3 cr, fastest)
- "highest quality" / "premium" / no budget -> gpt-image-2-t2i at 2K, or grok-imagine-quality-t2i (16 cr @ 1K, 22 cr @ 2K), or imagen-4-ultra
- Text inside the image (signs, posters, typography) -> ideogram-v3-t2i (best in class) or gpt-image-2-t2i (also strong)
- Artistic / painterly / stylized -> midjourney-t2i
- Album art / cover art -> gpt-image-2-t2i for one strong image; grok-imagine-t2i for 6 variations to choose from; seedream-v4-t2i if 4K wanted
- Logo or design with embedded text -> ideogram-v3-t2i
- NSFW / adult / explicit -> wan-2.5-spicy-t2i (auto-tags creation as 18+; routes to adult gallery)
- Cheapest possible / quick test -> z-image-turbo (3 cr)
- Multiple variations to compare -> keep grok-imagine-t2i (6 outputs default) or use numImages on a multi-output model
For I2I (reference image provided): prefer the dedicated aetherwave_edit_image tool for "change something in this image" intent. Use aetherwave_generate_image with I2I models only when you specifically want style transfer (midjourney-i2i), premium quality (grok-imagine-quality-i2i), or adult content (wan-2.5-spicy-i2i).
Always pass an explicit aspectRatio (e.g. "1:1" for square album art, "16:9" for video thumbnails, "9:16" for shorts/reels). Some upstream providers reject submissions with no aspect ratio.
Ask the user only when:
- The prompt contradicts itself (e.g., "highest quality but cheapest")
- The user requested "the best model" with no context; in that case, surface 2-3 options with tradeoffs
- A single generation would cost more than 20 credits and the user has not confirmed
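As a rough illustration of the signal-based guide and the 20-credit confirmation rule, a client could scan the prompt for keywords before choosing a model. This is a deliberately naive sketch: the keyword lists, their priority order, and the function names are assumptions; only the model IDs and the threshold come from the description.

```python
# (keywords, model) pairs checked in order; the ordering is an assumption.
SIGNALS = [
    (("nsfw", "adult", "explicit"), "wan-2.5-spicy-t2i"),
    (("logo", "poster", "typography"), "ideogram-v3-t2i"),
    (("photoreal", "photo of", "realistic"), "gpt-image-2-t2i"),
    (("painterly", "stylized", "artistic"), "midjourney-t2i"),
    (("cheapest", "quick test"), "z-image-turbo"),
]
DEFAULT_MODEL = "grok-imagine-t2i"


def pick_image_model(prompt: str) -> str:
    """Return the first model whose keywords appear in the prompt."""
    text = prompt.lower()
    for keywords, model in SIGNALS:
        if any(keyword in text for keyword in keywords):
            return model
    return DEFAULT_MODEL


def needs_confirmation(cost: int, confirmed: bool, threshold: int = 20) -> bool:
    """Ask the user only when the cost exceeds the threshold and is unconfirmed."""
    return cost > threshold and not confirmed


print(pick_image_model("A photo of a red fox in snow"))
print(needs_confirmation(22, confirmed=False))
```

Substring matching will misfire on some prompts, so a real agent would likely rely on its own reading of intent rather than a keyword table.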
| Name | Required | Description | Default |
|---|---|---|---|
| seed | No | Seed for deterministic generation (supported by some models). | |
| model | No | Model ID. Defaults to 'grok-imagine-t2i'. Use list_image_models for the full list. | |
| prompt | Yes | Text description of the image to generate. | |
| numImages | No | Number of images for models that support multiple outputs. | |
| resolution | No | Output resolution. Most models accept '1K' or '2K'; some accept '480p'/'720p'. | |
| aspectRatio | No | Aspect ratio (e.g. '1:1', '16:9', '9:16'). Pass this explicitly when possible; some upstream providers reject submissions without an aspect ratio. Default ratios vary by model. | |
| negative_prompt | No | What to avoid in the output (supported by some models). | |
| referenceImages | No | Array of public image URLs for image-to-image generation. Required when using an I2I model. A single URL string is also accepted (wrapped as a one-element array). | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Covers process: submits job, polls until terminal, returns final image URLs. Discloses credit costs per model, default model behavior, and model-specific behaviors like multi-output vs single deterministic. Annotations (readOnlyHint=false, destructiveHint=false) are supplemented with full async workflow and mutation context.
Is the description appropriately sized, front-loaded, and free of redundancy?
Description is long but well-structured with sections (Model selection guide, I2I guidance, when to ask user). Every sentence adds value, but some redundancy could be trimmed (e.g., repeating default model). Still, clarity justifies length.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 8 parameters, no output schema, and complex model ecosystem, the description is very complete. It explains process, costs, model selection, parameters, and edge cases. Output is described as final image URLs. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers 100% of parameters with basic descriptions. The tool description adds significant meaning: explains default model, how to use referenceImages (array or single URL), which models support I2I, negative_prompt, seed, etc. The model selection guide ties parameters to use cases.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Generates one or more images from a text prompt (T2I) or a text prompt + reference image(s) (I2I).' It distinguishes from sibling aetherwave_edit_image for 'change something' intent and mentions list_image_models for model lineup. The verb+resource+scope is very specific.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly guides when to use this tool vs alternatives: For I2I, prefer aetherwave_edit_image unless specific style transfer, premium, or adult content needed. Provides a comprehensive model selection guide with signals like 'high quality', 'photoreal', 'text inside image', etc. Also advises when to ask user for clarification.
aetherwave_generate_music
Generate music (Suno)
Grade: A
Generates AI music via Suno. Returns two tracks per submission. Default model is V5.5 (newest, best quality). For instrumental output set instrumental: true. Music gen typically takes 30-90s - this tool polls with up to a 6-minute budget. Note: the title param is advisory for instrumentals - Suno often writes its own title from the prompt content for instrumental generations. Transient GENERATE_AUDIO_FAILED errors are common; retry once before degrading the model version.
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Suno model version. Defaults to V5_5 (current best). | |
| title | No | Optional title for the generated tracks. | |
| lyrics | No | Custom lyrics. If omitted, Suno will generate lyrics from the prompt (unless instrumental=true). | |
| prompt | Yes | Style/mood/topic description. E.g. 'Lo-fi ambient track, rain sounds, warm pads' or 'High-energy synthwave with driving bass'. | |
| instrumental | No | If true, no vocals. Default false. | |
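The "retry once before degrading the model version" advice could look like the sketch below. Everything about the transport is assumed: `call_tool` is a hypothetical client function, `ToolError` is a hypothetical exception carrying the error code, and the fallback model is supplied by the caller because the listing names only the default version.

```python
from typing import Callable


class ToolError(Exception):
    """Hypothetical error type carrying the code reported by the tool."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def generate_music_with_retry(
    call_tool: Callable[[str, dict], dict], args: dict, fallback_model: str
) -> dict:
    """Try twice with the given args, then once with an older model."""
    attempts = [args, args, {**args, "model": fallback_model}]
    last_error = None
    for attempt in attempts:
        try:
            return call_tool("aetherwave_generate_music", attempt)
        except ToolError as err:
            if err.code != "GENERATE_AUDIO_FAILED":
                raise  # only the transient failure is worth retrying
            last_error = err
    raise last_error
```

A caller would pass something like `{"prompt": "Lo-fi ambient track", "instrumental": True}` together with whichever older model ID the server actually accepts.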
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds significant behavioral context beyond annotations: returns two tracks, polling mechanism, common transient errors with retry advice, model version behavior. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Dense single paragraph with all key information, though could benefit from bullet points for improved scannability. Still efficient and front-loaded.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Lacks return value details despite no output schema. Mentions 'returns two tracks' but doesn't specify format (URLs, IDs). Polling is described but not how results are delivered. Slight gap in completeness.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but description adds crucial context: instrumental activation, title advisory for instrumentals, lyrics generation fallback, model default, and prompt examples. This meaningfully supplements the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states 'Generates AI music via Suno' with a specific verb and resource. Distinguished from sibling tools like aetherwave_generate_image and aetherwave_generate_video by focusing on music generation.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides detailed guidance on when to use: default model, instrumental option, timing and polling budget, error handling. While no explicit alternatives are named, the context of music generation vs other tools is clear.
aetherwave_generate_video
Generate video (Grok Imagine, Wan 2.7, Hailuo 02, Seedance, Kling 2.6, VEO 3.1, Happy Horse)
Grade: A
Generates a short-form video from a text prompt (T2V) or a text prompt + starting image (I2V). Submits, polls, and returns the final video URL. Default model is 'grok-imagine-t2v' (fast, 4-6 cr/s, with built-in KIE -> fal.ai fallback). Use list_video_models for the full lineup with credit cost per second. I2V models (e.g. 'grok-imagine-i2v', 'seedance-pro-i2v') require a public imageUrl. Video generation can take 30s to several minutes; this tool polls with up to an 8-minute budget.
Model selection guide for videos (when the user does not specify a model)
Default: grok-imagine-t2v (4-6 cr/s, fast, has KIE -> fal.ai fallback for redundancy. Best general-purpose).
Pick a different model when the prompt has these signals:
- "highest quality" / "premium" / broadcast / commercial -> veo3.1-quality or veo3-quality (Google's flagship, fixed 350-560 cr for 8s, 3-5 min)
- "fast premium" / quick high-quality -> veo3-fast or veo3.1-fast (84 cr fixed for 8s)
- Cinematic camera moves / dolly / pan -> seedance-pro-t2v (3-10 cr/s) or kling-3.0-pro-t2v (26 cr/s)
- Realistic human motion / faces -> hailuo-2.3-pro-i2v (I2V, supply imageUrl)
- Talking head / lip sync -> kling-avatar-pro (23 cr/s) or infinitalk (5-17 cr/s)
- Anime / stylized / fantasy -> wan-2.7-t2v
- NSFW / adult -> wan-22-nsfw-i2v (I2V only; auto-tags adult)
- Animate this exact image -> any I2V variant (grok-imagine-i2v, seedance-pro-i2v, hailuo-2.3-pro-i2v)
- First + last frame interpolation -> seedance-pro-i2v with both imageUrl + endImageUrl
- Cheapest test -> hailuo-2.0-standard @ 512p (3 cr/s, ~18 cr for 6s) or grok-imagine-t2v @ 480p (4 cr/s, ~24 cr for 6s)
- Clip 12-15s -> grok-imagine-t2v (accepts up to 15s)
- True 4K -> kling-3.0-4k-t2v (94 cr/s, expensive but native 4K)
Audio in generated video: grok-imagine-t2v, seedance-pro-t2v, and the VEO 3.x family include audio at base cost (no surcharge). Kling 2.6 and Kling 3.0 are the outliers: they price audio as a surcharge of roughly 46-100% (Kling 2.6 doubles the cost, Kling 3.0 Pro adds ~46%). Default to Grok / Seedance / VEO when sound matters and you don't want to think about audio pricing.
Cost framing: resolution and duration drive cost more than model choice. A 6-second 480p Grok generation costs ~24 cr; the same prompt at 1080p Seedance 2 is ~858 cr (roughly 36x more). Pick the lowest acceptable resolution + duration first.
For I2V models: imageUrl is required. For first+last-frame models, pass endImageUrl too.
Ask the user only when:
- Single generation would cost more than 100 credits and they haven't confirmed
- They asked for "the best" with no other signal; surface 2-3 options with cost ranges
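The per-second pricing and the 100-credit confirmation rule reduce to simple arithmetic. The helper below is a sketch with hypothetical function names; the rates and durations in the usage lines are the ones quoted in the guide, and fixed-price models such as the VEO family are not covered by the per-second formula.

```python
def estimate_video_cost(credits_per_second: float, duration_s: float) -> float:
    """Estimated cost for a per-second-priced model: rate times duration."""
    return credits_per_second * duration_s


def video_needs_confirmation(
    cost: float, confirmed: bool = False, threshold: float = 100
) -> bool:
    """Ask the user when a single generation exceeds the threshold unconfirmed."""
    return cost > threshold and not confirmed


grok_480p = estimate_video_cost(4, 6)    # grok-imagine-t2v @ 480p, 6 seconds
hailuo_512p = estimate_video_cost(3, 6)  # hailuo-2.0-standard @ 512p, 6 seconds
print(grok_480p, hailuo_512p)
print(video_needs_confirmation(grok_480p))
print(video_needs_confirmation(858))     # the 1080p Seedance figure from the text
```

Running the estimate before submitting makes it easy to apply the "lowest acceptable resolution + duration first" advice.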
| Name | Required | Description | Default |
|---|---|---|---|
| mode | No | Moderation mode for Grok Imagine. Defaults to 'normal'. | |
| model | No | Model ID. Defaults to 'grok-imagine-t2v'. Use list_video_models for the full list. | |
| prompt | Yes | Text description of the video scene. | |
| duration | No | Duration in seconds. Grok Imagine accepts 6-15; other models have their own ranges (see list_video_models). | |
| imageUrl | No | Public URL of starting image. Required for I2V models. | |
| resolution | No | Output resolution. Default depends on model. | |
| aspectRatio | No | Aspect ratio (e.g. '16:9', '9:16', '1:1'). | |
| endImageUrl | No | Public URL of ending image. Supported by some I2V models (first+last frame). | |
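The I2V requirements in the table can be enforced client-side before a call is submitted. In this sketch the builder function is hypothetical, and treating a model ID ending in `-i2v` as an I2V model is a heuristic that matches the IDs named in the guide but is not confirmed by the listing; `list_video_models` would be the authoritative source.

```python
from typing import Optional


def build_video_args(
    prompt: str,
    model: str = "grok-imagine-t2v",
    image_url: Optional[str] = None,
    end_image_url: Optional[str] = None,
    duration: Optional[int] = None,
) -> dict:
    """Assemble aetherwave_generate_video arguments and check I2V requirements."""
    # ASSUMPTION: I2V model IDs end in "-i2v", as in the examples in the guide.
    is_i2v = model.endswith("-i2v")
    if is_i2v and not image_url:
        raise ValueError(f"{model} is an I2V model and needs imageUrl")
    if end_image_url and not image_url:
        raise ValueError("endImageUrl only makes sense together with imageUrl")
    args = {"prompt": prompt, "model": model}
    optional = {
        "imageUrl": image_url,
        "endImageUrl": end_image_url,
        "duration": duration,
    }
    # Leave unset parameters out so the server applies its own defaults.
    args.update({key: value for key, value in optional.items() if value is not None})
    return args


print(build_video_args("waves at sunset", duration=6))
```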
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (readOnlyHint=false, openWorldHint=true), the description details polling behavior with an 8-minute budget, default model fallback (KIE -> fal.ai), cost ranges per model, audio inclusion, and resolution-driven cost differences. No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is long (multiple sections and bullet points) but well-structured and front-loaded with the core purpose. Each sentence serves a purpose given the tool's complexity. Could be slightly shortened, but clarity is not sacrificed.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 8 parameters, no output schema, and minimally informative annotations, the description is remarkably complete. It covers model selection, cost, duration, audio, resolution, I2V requirements, NSFW handling, and first+last frame interpolation. It leaves almost no ambiguity.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, but the description adds substantial value: for 'model' it explains defaults and fallback; for 'duration' it specifies model-specific ranges; for 'resolution' it links to cost; for 'imageUrl' it clarifies I2V requirement; for 'mode' it mentions moderation. It also provides practical constraints (e.g., max 15s for Grok).
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates a short-form video from text or text+image, including multiple models. It distinguishes itself from sibling tools like list_video_models by specifying that this tool submits, polls, and returns the final video URL.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides an exhaustive model selection guide with explicit signals (e.g., highest quality, cinematic camera moves, cheapest test). It instructs when to ask the user (e.g., single generation >100 credits) and when not to (e.g., default to grok-imagine-t2v). It also covers cost framing and audio pricing.
aetherwave_list_image_models
List available image models
Grade: A, read-only
Returns every image-generation model AetherWave supports, with its credit cost, default aspect ratio, supported inputs (T2I vs I2I), and any model-specific options. Call this before generate_image when you don't know the right model ID. The model key (e.g. 'grok-imagine-t2i') is what you pass as model to generate_image.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only and open-world; description adds beyond by listing returned fields (credit cost, aspect ratio, etc.) and noting the model key format.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, every word earns its place. Front-loaded with purpose and immediate usage context.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, description lists all relevant fields and cross-references generate_image, fully equipping the agent.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, so description need not explain them. Baseline 4 for zero-param tools.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns every image-generation model with details like credit cost, aspect ratio, supported inputs, and options. It is distinct from siblings like aetherwave_list_video_models.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises calling this before generate_image when model ID is unknown, providing clear when-to-use and how-to-follow-up guidance.
aetherwave_list_master_presets
List available audio mastering presets
Grade: A, read-only
Returns every AI mastering preset AetherWave supports, with target LUFS, tags, descriptions, and difficulty level. Call this before master_audio when you don't know which preset fits the track. 12 presets total covering streaming, hip hop, EDM, pop, rock, lo-fi, R&B, acoustic, cinematic, podcast, gentle, and loud-and-punchy mastering styles. Each preset has a target LUFS value (e.g. -14 for streaming, -9 for loud) so you can match the user's distribution target.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
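Matching a preset to the user's distribution target can be sketched as a nearest-LUFS lookup. The response shape here is assumed, since the listing gives no output schema: the `id` and `targetLufs` field names and the sample preset IDs are hypothetical, and only the -14 and -9 values come from the description.

```python
def closest_preset(presets: list, target_lufs: float) -> dict:
    """Return the preset whose target LUFS is nearest to the requested value."""
    return min(presets, key=lambda preset: abs(preset["targetLufs"] - target_lufs))


# Hypothetical excerpt of an aetherwave_list_master_presets response.
sample_presets = [
    {"id": "streaming", "targetLufs": -14},
    {"id": "loud-and-punchy", "targetLufs": -9},
]

print(closest_preset(sample_presets, -13)["id"])
print(closest_preset(sample_presets, -10)["id"])
```

Loudness alone is a coarse criterion; the genre tags returned with each preset would normally narrow the choice first.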
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and openWorldHint=true, indicating no destructive side effects and no exhaustive list. The description adds value by specifying that all 12 presets are returned with their attributes (e.g., target LUFS for streaming is -14), which goes beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with three sentences: first sentence states purpose and output, second sentence provides usage guidance, third sentence gives concrete details. No wasted words, front-loaded with key information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 0 parameters, no output schema, and annotations providing safety context, the description is complete. It covers the tool's purpose, when to use it, and what to expect (12 presets with relevant metadata). No obvious gaps.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
There are 0 parameters, and the schema coverage is 100% (empty properties). The description implicitly covers the parameter semantics by indicating no input is needed. According to guidelines, baseline is 4 for 0 params, and the description is adequate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Returns every AI mastering preset AetherWave supports' with specific attributes (LUFS, tags, etc.). It distinguishes from the sibling tool 'aetherwave_master_audio' by explicitly recommending to call this first when uncertain about the preset.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises: 'Call this before master_audio when you don't know which preset fits the track.' This gives clear context for when to use the tool versus alternatives, and implies that if the preset is known, this tool may be skipped.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_list_my_creations - List my AetherWave gallery items - A - Read-only
Returns items from the authenticated user's gallery — images, videos, audio tracks they've generated on AetherWave. Useful for agent workflows like 'find my last 5 images and reframe them all to 9:16' or 'list my recent songs and master each one'. Supports pagination and type filtering. Each item includes id, type, prompt, model, contentUrl, thumbnailUrl, createdAt, isFavorite, visibility, rating, and type-specific fields (duration for audio/video, width/height for images).
| Name | Required | Description | Default |
|---|---|---|---|
| type | No | Filter to a single media type. Omit for all types. | |
| limit | No | Max items to return. Defaults to 100, max 500. | 100 |
| offset | No | Pagination offset. Defaults to 0. | 0 |
| favoritesOnly | No | If true, only return items marked as favorite. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true and openWorldHint=true. The description adds valuable behavioral details: it lists the exact fields returned (id, type, prompt, model, etc.), mentions pagination and type filtering, and notes type-specific fields. This goes beyond the annotations to inform the agent about the output structure and capabilities.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is four sentences long, front-loaded with the core purpose, followed by usage examples, the supported pagination and filtering, and a detailed list of returned fields. Every sentence adds value without redundancy or filler.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a listing tool with 4 optional parameters and no output schema, the description adequately covers purpose, typical workflows, pagination, field details, and type-specific attributes. It provides enough context for an agent to understand what data to expect and how to use the parameters effectively.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% (all 4 parameters described in schema). The description mentions pagination and type filtering, which aligns with the parameters, but does not add significant semantic depth beyond the schema. The use case examples imply parameter usage but are not explicit about parameter values. Baseline of 3 is appropriate given schema completeness.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description uses a specific verb ('Returns') and resource ('items from the authenticated user's gallery'), clearly distinguishing from sibling tools which are about generation, editing, and other operations. It lists concrete subtypes (images, videos, audio tracks), making the tool's scope unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit use cases like 'find my last 5 images and reframe them' and 'list my recent songs and master each one', which illustrate when to use this tool. It mentions support for pagination and type filtering. However, it does not explicitly state when not to use it or direct to alternatives, though the sibling tool names imply the differences.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_list_video_models - List available video models - A - Read-only
Returns every video-generation model AetherWave supports (Grok Imagine, Wan 2.7, Hailuo 02, Seedance Pro/Lite, Kling 2.6 with audio, VEO 3.1, Happy Horse, etc.) with per-second credit cost, supported durations, resolutions, aspect ratios, and whether the model needs an input image (I2V). Call this before generate_video when you don't know the right model ID.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint and openWorldHint. The description adds valuable behavioral context about the returned data (cost, durations, resolutions, I2V flag), enhancing transparency beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two sentences, front-loaded with crucial details, no filler. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
No output schema, but the description enumerates returned fields comprehensively. For a listing tool with no parameters, this covers the context adequately.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
No parameters exist, and schema coverage is 100%. The description adds no param info, which is appropriate. Baseline 4 for zero parameters.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns every video-generation model with detailed info, naming specific models and attributes. It distinguishes from sibling tools like list_image_models by focusing on video models.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states 'Call this before generate_video when you don't know the right model ID,' providing clear context for use. Lacks explicit when-not-to-use, but the guidance is sufficient.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_master_audio - Master an audio track (AI mastering) - A
Submits an audio file for AI mastering and returns the mastered URL synchronously (route polls the Python service internally; expect 30s-5min). Useful as a final polish step after music generation. Cost: 20 credits per track. Producer, Mogul, and Ultimate plans get mastering free. Output is WAV (~50MB per 3-minute track, lossless for redistribution). Pick a preset to steer the mastering style; call aetherwave_list_master_presets for the full live list (12 presets including streaming, loud, gentle, hip_hop, edm, pop, rock, lofi, rnb, acoustic, cinematic, podcast). Each preset has a target LUFS value so you can match the distribution target.
| Name | Required | Description | Default |
|---|---|---|---|
| preset | Yes | Mastering preset name. Must be one of: 'streaming', 'loud', 'gentle', 'hip_hop', 'edm', 'pop', 'rock', 'lofi', 'rnb', 'acoustic', 'cinematic', 'podcast'. Call aetherwave_list_master_presets for full metadata (target LUFS, description, tags). | |
| audioUrl | Yes | Public URL to the source audio file (MP3 or WAV). | |
| trackTitle | No | Optional title for the mastered output (used in gallery row label). | |
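Because `preset` must be one of twelve fixed names and the cost depends on the plan, a client may want to check both before calling the tool. The following is a sketch under stated assumptions: the preset names, the 20-credit price, and the three plan names come from the description above, while the helper names and the lowercase plan spelling are hypothetical.

```python
from typing import Optional

# Preset names as listed in the parameter table.
PRESETS = {
    "streaming", "loud", "gentle", "hip_hop", "edm", "pop",
    "rock", "lofi", "rnb", "acoustic", "cinematic", "podcast",
}
# Plans that get mastering free; lowercase spelling is an assumption.
FREE_MASTERING_PLANS = {"producer", "mogul", "ultimate"}
CREDITS_PER_TRACK = 20


def build_master_args(audio_url: str, preset: str, track_title: Optional[str] = None) -> dict:
    """Build the argument dict for aetherwave_master_audio, rejecting unknown presets."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset {preset!r}; expected one of {sorted(PRESETS)}")
    args = {"audioUrl": audio_url, "preset": preset}
    if track_title:
        args["trackTitle"] = track_title
    return args


def mastering_cost(plan: str) -> int:
    """Credits charged for one track on the given plan."""
    return 0 if plan.lower() in FREE_MASTERING_PLANS else CREDITS_PER_TRACK


print(build_master_args("https://example.com/song.wav", "streaming"))
print(mastering_cost("Producer"))  # 0
```

Since the call is synchronous and may take 30s to 5min, a client timeout well above five minutes is a sensible choice.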
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses synchronous polling (30s-5min), cost, free plans, output format (WAV, ~50MB), and preset steering. Adds far beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Long but each sentence adds value. First sentence captures core purpose. Could be slightly tighter but not verbose.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite no output schema, covers timing, cost, plans, output details, and preset metadata. Highly complete for a complex tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. Description adds preset list reference, audioUrl format, and optional trackTitle purpose. Could include more on trackTitle usage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it submits audio for AI mastering and returns URL. Distinguishes as final polish after music generation, and references sibling tool for preset list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Specifies when to use (final polish), cost, and plan details. Does not explicitly state when not to use but provides enough context for decision.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_reframe_image - Reframe image to a new aspect ratio (Ideogram V3 Reframe) - A
Reframes an image to a new aspect ratio by intelligently outpainting the edges. Pass a public imageUrl and the target aspectRatio ('16:9', '9:16', '1:1', '4:3', '3:4', etc.). Three speed tiers: 'turbo' (5 cr, fast), 'balanced' (10 cr, default), 'quality' (14 cr, slowest, best edges). Returns the reframed image URL.
| Name | Required | Description | Default |
|---|---|---|---|
| speed | No | Rendering speed. 'turbo'=5cr, 'balanced'=10cr (default), 'quality'=14cr. | 'balanced' |
| imageUrl | Yes | Public URL of the source image. | |
| aspectRatio | Yes | Target aspect ratio (e.g. '16:9', '9:16', '1:1', '4:3', '3:4', '21:9'). | |
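The three speed tiers map directly to credit costs, so the cost of a call can be known before it is made. This sketch builds the arguments and returns the expected cost; the tier prices come from the table above, while the helper name and the client-side "W:H" format check are assumptions.

```python
import re

# Credit cost per speed tier, as stated in the parameter table.
SPEED_CREDITS = {"turbo": 5, "balanced": 10, "quality": 14}
_RATIO = re.compile(r"^\d+:\d+$")  # assumed client-side sanity check


def build_reframe_image_args(image_url: str, aspect_ratio: str, speed: str = "balanced"):
    """Return (args, credits) for aetherwave_reframe_image."""
    if speed not in SPEED_CREDITS:
        raise ValueError(f"speed must be one of {sorted(SPEED_CREDITS)}")
    if not _RATIO.match(aspect_ratio):
        raise ValueError("aspect_ratio should look like '16:9'")
    args = {"imageUrl": image_url, "aspectRatio": aspect_ratio, "speed": speed}
    return args, SPEED_CREDITS[speed]


args, credits = build_reframe_image_args("https://example.com/cat.png", "9:16", "quality")
print(credits)  # 14
```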
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant context beyond annotations: it explains the outpainting approach, speed tiers and their costs, default behavior, and return value. No contradiction with annotations (readOnlyHint=false, openWorldHint=true, etc.).
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four short sentences, front-loaded with purpose, then details. No wasted words. Efficient and easy to parse.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no output schema, it describes the return value (reframed image URL). It covers all key aspects: inputs, behavior, speed options, and output. Complete for the task.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Despite 100% schema description coverage, the description adds substantial value: explains speed tiers with cost, provides examples of aspect ratios, and mentions default speed. It enriches parameter understanding beyond the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it reframes an image to a new aspect ratio via outpainting. The verb 'Reframes' and the resource 'image' are specific. It distinguishes from siblings like reframe_video, remove_background, and upscale_image.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description tells when to use (reframing an image to a new aspect ratio) and provides speed tiers with costs, aiding selection. It does not explicitly exclude alternatives or state when not to use, but sibling context helps.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_reframe_video - Reframe video to a new aspect ratio (Luma Ray 2 Flash) - A
Reframes a video to a new aspect ratio by intelligently outpainting/cropping the edges. Pass a public videoUrl and target reframeAspectRatio. 17 credits per second. Optional reframePrompt lets you steer the new edge content (e.g. 'extend the sky with sunset clouds'). Returns the reframed video URL (R2-hosted).
| Name | Required | Description | Default |
|---|---|---|---|
| videoUrl | Yes | Public URL of the source video (MP4). | |
| reframePrompt | No | Optional prompt to steer the new edge content. | |
| reframeAspectRatio | Yes | Target aspect ratio. | |
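At 17 credits per second, the cost of a reframe scales linearly with clip length. A minimal estimate, assuming the clip duration is known in advance and leaving the result unrounded because the listing does not say how partial seconds are billed:

```python
from typing import Optional

CREDITS_PER_SECOND = 17  # stated in the tool description


def estimate_reframe_video_credits(duration_seconds: float) -> float:
    """Estimated credits for aetherwave_reframe_video (billing rounding is not documented)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return CREDITS_PER_SECOND * duration_seconds


def build_reframe_video_args(video_url: str, aspect_ratio: str, prompt: Optional[str] = None) -> dict:
    """Build the argument dict; reframePrompt is included only when given."""
    args = {"videoUrl": video_url, "reframeAspectRatio": aspect_ratio}
    if prompt:
        args["reframePrompt"] = prompt
    return args


print(estimate_reframe_video_credits(6))  # 102
```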
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate non-destructive (destructiveHint=false) and read-write (readOnlyHint=false). The description adds behavioral context: intelligent outpainting/cropping, credit cost, optional prompt to steer content, and return format (R2-hosted URL). No contradictions with annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is five short sentences: main action, required params, cost, optional prompt, and return. It is front-loaded with the core purpose, and every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (3 parameters, no output schema), the description covers purpose, inputs, cost, and output. It does not specify processing duration or error handling, but it provides enough for an AI agent to correctly invoke the tool. The openWorldHint suggests possible additional effects not described, but overall completeness is high.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions for each parameter. The description adds a usage example for reframePrompt ('extend the sky with sunset clouds'), which enhances meaning. For videoUrl and reframeAspectRatio, it merely restates the schema, adding little beyond the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Reframes' and the resource 'video', and specifies the method ('intelligently outpainting/cropping'). It highlights the required inputs (videoUrl, reframeAspectRatio) and distinguishes from sibling tools like aetherwave_reframe_image.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description gives clear context: how to use (pass public videoUrl and target aspect ratio), what to expect (outpainting/cropping), and cost information (17 credits per second). However, it does not explicitly exclude alternative tools or provide when-not scenarios, but the sibling names provide implicit differentiation.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_remove_background - Remove background from image (Recraft + fal.ai BiRefNet v2 fallback) - A
Strips the background from an image, returning a PNG with transparent alpha. Pass a public imageUrl. Useful for product shots, character cutouts, logo isolation, or compositing onto a new background. ~5 credits per image. Recraft is the primary provider; on outage the tool auto-falls back to fal.ai BiRefNet v2 so single-image calls never silently fail. Works best on photographic subjects (people, products, animals); transparent-PNG inputs have no foreground to segment.
| Name | Required | Description | Default |
|---|---|---|---|
| imageUrl | Yes | Public URL of the source image. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds value beyond annotations by mentioning credit cost per image, auto-fallback on provider outage, and behavior with transparent inputs. Annotations indicate non-readOnly and non-idempotent, but description does not contradict them and adds useful context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise at six short sentences: the first covers the core function, and the rest add key details (input, use cases, credits, fallback, limitations). No wasted words, front-loaded with essential information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (single required param, no output schema), the description is sufficiently complete. It covers purpose, usage, behavior, and limitations. Could potentially mention accepted image formats, but not a critical gap.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with only imageUrl, and the schema already describes it as a URI. The description adds that it must be a public URL, but adds minimal extra meaning beyond the schema. Baseline score of 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it strips background from an image and returns a transparent PNG. It uses specific verbs and resources, and effectively distinguishes from siblings like aetherwave_remove_background_video and aetherwave_edit_image.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides context for when to use (product shots, character cutouts, etc.) and mentions the fallback behavior and best-use case (photographic subjects). It lacks explicit 'when-not-to-use' guidance but implies limitations with transparent-PNG inputs.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_remove_background_video - Remove background from video - A
Strips the background from a video frame-by-frame using rembg (u2netp) on AetherWave's Python service. Pass a public videoUrl. Choose bgType: "transparent" for an alpha-channel WebM output (compositing) or bgType: "color" with a customColor hex for a solid replacement. 2 credits per second. Slowest tool in the surface (per-frame processing); a 6s clip takes ~4 min, a 30s clip ~15-20 min. Works best on subjects with clear edges (people, products). Returns the processed video URL (R2-hosted).
| Name | Required | Description | Default |
|---|---|---|---|
| bgType | No | 'transparent' = alpha WebM output (default). 'color' = solid replacement using customColor. | 'transparent' |
| videoUrl | Yes | Public URL of the source video (MP4). | |
| customColor | No | Hex color for solid background when bgType='color' (e.g. '#00ff00'). Default green. | |
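The `bgType` and `customColor` parameters interact: the color only matters when `bgType` is 'color'. The sketch below encodes that relationship and the 2 credits per second rate. Rejecting a color in transparent mode and requiring a six-digit hex value are client-side assumptions, not documented server behavior.

```python
import re
from typing import Optional

CREDITS_PER_SECOND = 2  # stated in the tool description
_HEX = re.compile(r"^#[0-9a-fA-F]{6}$")  # six-digit hex assumed


def build_remove_bg_video_args(
    video_url: str, bg_type: str = "transparent", custom_color: Optional[str] = None
) -> dict:
    """Build the argument dict for aetherwave_remove_background_video."""
    if bg_type not in ("transparent", "color"):
        raise ValueError("bg_type must be 'transparent' or 'color'")
    args = {"videoUrl": video_url, "bgType": bg_type}
    if custom_color is not None:
        if bg_type != "color":
            raise ValueError("custom_color only applies when bg_type='color'")
        if not _HEX.match(custom_color):
            raise ValueError("custom_color should look like '#00ff00'")
        args["customColor"] = custom_color
    return args


def estimate_credits(duration_seconds: float) -> float:
    """Estimated credits for a clip of the given length."""
    return CREDITS_PER_SECOND * duration_seconds


print(build_remove_bg_video_args("https://example.com/clip.mp4", "color", "#00ff00"))
print(estimate_credits(30))  # 60
```

Given the stated processing times (about 4 min for 6s, 15-20 min for 30s), callers should budget for long-running requests.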
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses key behavioral aspects: frame-by-frame processing, model used (rembg u2netp), output types (alpha WebM or solid color), credit cost, speed estimates, and output format (R2 URL). No annotation contradictions.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is compact, every sentence adds distinct information, and it is front-loaded with the core function. No unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity and lack of output schema, the description sufficiently covers input, behavior, performance, and output format, making it complete for agent use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds value by explaining the behavior of bgType options (alpha-channel WebM, solid replacement) and providing an example hex for customColor, which aids understanding beyond parameter names.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action ('Strips the background from a video frame-by-frame') and resource, differentiating from siblings like aetherwave_remove_background (likely for images) and other video tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides clear context on performance ('Slowest tool', time estimates) and suitability ('Works best on subjects with clear edges'), helping decide when to use. However, it doesn't explicitly mention alternatives or when not to use it.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_upscale_image - Upscale image (Topaz) - A
Upscales a source image using Topaz's high-fidelity upscaler. Pass a public imageUrl and an upscaleFactor. Credit cost depends on the source resolution × factor; small images cost less than large ones at the same factor. Returns the upscaled image URL.
| Name | Required | Description | Default |
|---|---|---|---|
| imageUrl | Yes | Public URL of the source image. | |
| upscaleFactor | No | Upscale multiplier. Defaults to '2x'. '8x' is heavy; use only on small sources. | '2x' |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Adds value beyond annotations by detailing credit cost dependency and noting that 8x is heavy. No contradictions with annotations; readOnlyHint=false appropriately indicates mutation.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four concise sentences: purpose, parameters, cost hint, return value. Front-loaded with key information, no redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Complete for a simple upscale tool: covers input, cost behavior, and output. No output schema needed. Context signals confirm low complexity.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema covers both parameters with descriptions. Description adds important context like cost calculation and caution for 8x factor, enhancing meaning beyond schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it upscales a source image using Topaz's high-fidelity upscaler, distinguishing it from sibling tools like aetherwave_upscale_video and other image operations.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit parameters (imageUrl, upscaleFactor) and cost guidance based on resolution and factor. Also advises against using 8x on large sources. Could be improved by stating when not to use the tool.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
aetherwave_upscale_video - Upscale video (Atlas Video Upscaler) - A
Upscales a source video to 1080p or 2K using Atlas. Pass a public videoUrl and the target resolution. Cost is per-second (7 cr/s @ 1080p, 9 cr/s @ 2K). Atlas-side limits: clips up to 53s at 1080p, 23s at 2K, source must be <=30fps. Returns the upscaled video URL (R2-hosted).
| Name | Required | Description | Default |
|---|---|---|---|
| videoUrl | Yes | Public URL of the source video (MP4). | |
| targetResolution | No | Target output resolution. Defaults to '1080p'. '2k' is more expensive and limited to ~23s clips. | '1080p' |
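The Atlas-side limits and per-second rates above can be checked locally before spending credits. This is a sketch: the rates (7 and 9 cr/s), clip limits (53s and 23s), and the 30fps cap come from the description, while the function name is hypothetical and the caller is assumed to already know the clip's duration and frame rate.

```python
# Per-second rates and clip-length limits as stated in the tool description.
RATES = {"1080p": 7, "2k": 9}
MAX_SECONDS = {"1080p": 53, "2k": 23}
MAX_FPS = 30


def plan_upscale(duration_seconds: float, fps: float, target_resolution: str = "1080p") -> float:
    """Validate a clip against the documented limits and return the estimated credits."""
    if target_resolution not in RATES:
        raise ValueError("target_resolution must be '1080p' or '2k'")
    if fps > MAX_FPS:
        raise ValueError(f"source must be <= {MAX_FPS}fps, got {fps}")
    limit = MAX_SECONDS[target_resolution]
    if duration_seconds > limit:
        raise ValueError(f"clips at {target_resolution} are limited to {limit}s")
    return RATES[target_resolution] * duration_seconds


print(plan_upscale(10, 24))        # 70
print(plan_upscale(20, 30, "2k"))  # 180
```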
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description fully discloses behavioral traits: it is a mutating operation (cost per second, limits), returns a URL, and has no idempotency guarantees. Annotations (readOnlyHint=false, destructiveHint=false) are consistent with the description, which adds crucial details like cost and platform limits that annotations do not cover.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is highly concise: two sentences plus a few additional facts, all essential. It front-loads the main purpose and then packs in constraints, costs, and return value without redundancy. Every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (2 parameters, no output schema), the description is complete: it covers input requirements, target resolution options with costs and limits, and the return format (R2-hosted URL). No gaps for an AI agent to interpret.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, but the description adds significant meaning beyond the schema: it explains that targetResolution defaults to '1080p', that '2k' is more expensive and limited to shorter clips, and that videoUrl must be a public MP4 URL. This helps the agent understand parameter choices and trade-offs.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: to upscale a source video to 1080p or 2K using Atlas. It specifies the action (upscale), resource (video), and target resolutions, distinguishing it from sibling tools like aetherwave_upscale_image (image upscaling) and other video manipulation tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear usage context: use it when you need to upscale a video, subject to constraints on source frame rate (≤30fps), clip length (53s at 1080p, 23s at 2K), and the public URL requirement. It implicitly signals when not to use the tool (for example, if the source exceeds those limits). It does not explicitly compare the tool to other upscaling methods in the same family, but the context is sufficient for an AI agent.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
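Before publishing, the file can be sanity-checked locally. The sketch below only verifies the shape shown above (a `$schema` URL and a non-empty `maintainers` list whose entries carry an `email`). `check_glama_json` is a hypothetical helper, and Glama's actual verification rules may be stricter or different.

```python
import json

# Schema URL copied from the example structure above.
EXPECTED_SCHEMA = "https://glama.ai/mcp/schemas/connector.json"


def check_glama_json(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the shape looks right.

    Hypothetical local check only; it does not reproduce Glama's own verification.
    """
    try:
        doc = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc}"]
    if not isinstance(doc, dict):
        return ["top level must be a JSON object"]

    problems = []
    if doc.get("$schema") != EXPECTED_SCHEMA:
        problems.append("missing or unexpected $schema")

    maintainers = doc.get("maintainers")
    if not isinstance(maintainers, list) or not maintainers:
        problems.append("maintainers must be a non-empty list")
    else:
        for i, entry in enumerate(maintainers):
            email = entry.get("email") if isinstance(entry, dict) else None
            # Very loose email check: a string containing "@".
            if not isinstance(email, str) or "@" not in email:
                problems.append(f"maintainers[{i}] needs an email address")
    return problems


if __name__ == "__main__":
    sample = json.dumps({
        "$schema": EXPECTED_SCHEMA,
        "maintainers": [{"email": "your-email@example.com"}],
    })
    print(check_glama_json(sample))
```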
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.