Schema | Aetherwave Studio

Aetherwave Studio

Official

by AetherWave-Studio

JavaScript

Remote

Server Configuration

Describes the environment variables used to configure the server.

Name	Required	Description	Default
`AETHERWAVE_API_KEY`	Yes	Your API key. Get one at /profile -> Developer tab. Must start with aw_live_.
`AETHERWAVE_BASE_URL`	No	Override the API base URL (useful for staging or self-hosted).	https://aetherwavestudio.com
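
The table above can be read as a small validation routine. The following is a minimal Python sketch, not the server's own code: the function name `load_config` and the returned dictionary shape are hypothetical, and only the variable names, the `aw_live_` prefix rule, and the default base URL come from the table.

```python
import os

# Values taken from the configuration table above.
DEFAULT_BASE_URL = "https://aetherwavestudio.com"
KEY_PREFIX = "aw_live_"


def load_config(env=None):
    """Read and validate the two documented variables (hypothetical helper)."""
    env = os.environ if env is None else env

    api_key = env.get("AETHERWAVE_API_KEY", "")
    # Only the documented prefix is checked here; this does not prove the key is valid.
    if not api_key.startswith(KEY_PREFIX):
        raise ValueError("AETHERWAVE_API_KEY is required and must start with aw_live_")

    # The base URL is optional and falls back to the documented default.
    base_url = env.get("AETHERWAVE_BASE_URL") or DEFAULT_BASE_URL
    return {"api_key": api_key, "base_url": base_url}
```

Passing a plain dictionary instead of relying on `os.environ` keeps the check easy to exercise, for example `load_config({"AETHERWAVE_API_KEY": "aw_live_example"})`.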

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": true }
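
In the Model Context Protocol, a server that declares `listChanged: true` under `tools` may send a `notifications/tools/list_changed` notification, after which a client is expected to request the tool list again. The sketch below shows one way a client might detect that message; the function name `needs_tool_refresh` is hypothetical and the surrounding transport handling is assumed to exist elsewhere.

```python
import json

# Method name defined by the Model Context Protocol for tool-list changes.
TOOLS_LIST_CHANGED = "notifications/tools/list_changed"


def needs_tool_refresh(raw_message: str) -> bool:
    """Return True when a JSON-RPC message signals that the tool list changed."""
    message = json.loads(raw_message)
    if not isinstance(message, dict):
        return False
    # Notifications carry no "id"; requests and responses do.
    return "id" not in message and message.get("method") == TOOLS_LIST_CHANGED
```

When this returns True, a client would typically re-issue `tools/list` and replace its cached tool definitions.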

Tools

Functions exposed to the LLM to take actions

Name	Description
aetherwave_balance	Returns the current AetherWave credit balance for the API key. Use this BEFORE a generation to confirm sufficient credits, especially for video which can cost 30-300+ credits depending on model/duration/resolution.
aetherwave_list_image_models	Returns every image-generation model AetherWave supports, with its credit cost, default aspect ratio, supported inputs (T2I vs I2I), and any model-specific options. Call this before generate_image when you don't know the right model ID. The model key (e.g. 'grok-imagine-t2i') is what you pass as `model` to generate_image.
aetherwave_list_master_presets	Returns every AI mastering preset AetherWave supports, with target LUFS, tags, descriptions, and difficulty level. Call this before master_audio when you don't know which preset fits the track. 12 presets total covering streaming, hip hop, EDM, pop, rock, lo-fi, R&B, acoustic, cinematic, podcast, gentle, and loud-and-punchy mastering styles. Each preset has a target LUFS value (e.g. -14 for streaming, -9 for loud) so you can match the user's distribution target.
aetherwave_list_video_models	Returns every video-generation model AetherWave supports (Grok Imagine, Wan 2.7, Hailuo 02, Seedance Pro/Lite, Kling 2.6 with audio, VEO 3.1, Happy Horse, etc.) with per-second credit cost, supported durations, resolutions, aspect ratios, and whether the model needs an input image (I2V). Call this before generate_video when you don't know the right model ID.
aetherwave_generate_image	Generates one or more images from a text prompt (T2I) or a text prompt + reference image(s) (I2I). Submits the job, polls until terminal, and returns the final image URLs. Default model is 'grok-imagine-t2i' (fast, 6 images per generation, 5 credits). Use list_image_models to see the full lineup with pricing. For I2I, pass `referenceImages` as an array of public image URLs and pick a model with I2I support (e.g. 'grok-imagine-i2i', 'wan-2.5-spicy-i2i').
Model selection guide (when the user does not specify a model):
Default: `grok-imagine-t2i` (5 cr, 6 outputs per call, fast, general purpose).
Strong recommendation: when a single high-quality output is what's wanted (most agent / one-shot workflows), prefer `gpt-image-2-t2i` (9 cr @ 1K / higher @ 2K, single deterministic image, best general quality across realism, illustration, typography, and composition; supports up to 2K resolution and most aspect ratios including auto). This is the front-runner for serious creative output where you don't need to pick from 6 variations.
Pick a different model when the prompt has these signals:
  "single best result" / "one image" / production / no time to pick from variations -> `gpt-image-2-t2i` (9 cr, 1 output, top general quality)
  "photoreal" / "photo of" / "realistic" -> `gpt-image-2-t2i` (9 cr, best general realism) or `imagen-4` (12 cr, very high quality) or `z-image-turbo` (3 cr, fastest)
  "highest quality" / "premium" / no budget -> `gpt-image-2-t2i` at 2K, or `grok-imagine-quality-t2i` (16 cr @ 1K, 22 cr @ 2K), or `imagen-4-ultra`
  Text inside the image (signs, posters, typography) -> `ideogram-v3-t2i` (best in class) or `gpt-image-2-t2i` (also strong)
  Artistic / painterly / stylized -> `midjourney-t2i`
  Album art / cover art -> `gpt-image-2-t2i` for one strong image; `grok-imagine-t2i` for 6 variations to choose from; `seedream-v4-t2i` if 4K wanted
  Logo or design with embedded text -> `ideogram-v3-t2i`
  NSFW / adult / explicit -> `wan-2.5-spicy-t2i` (auto-tags creation as 18+; routes to adult gallery)
  Cheapest possible / quick test -> `z-image-turbo` (3 cr)
  Multiple variations to compare -> keep `grok-imagine-t2i` (6 outputs default) or use `numImages` on a multi-output model
For I2I (reference image provided): prefer the dedicated `aetherwave_edit_image` tool for "change something in this image" intent. Use `aetherwave_generate_image` with I2I models only when you specifically want style transfer (`midjourney-i2i`), premium quality (`grok-imagine-quality-i2i`), or adult content (`wan-2.5-spicy-i2i`).
Always pass an explicit `aspectRatio` (e.g. "1:1" for square album art, "16:9" for video thumbnails, "9:16" for shorts/reels). Some upstream providers reject submissions with no aspect ratio.
Ask the user only when:
  The prompt contradicts itself (e.g., "highest quality but cheapest")
  The user requested "the best model" with no context (surface 2-3 options with tradeoffs)
  A single generation would cost more than 20 credits and the user has not confirmed
aetherwave_edit_image	Edits an existing image guided by a text prompt. Pass a public `imageUrl` plus a `prompt` describing the change ("add a moon to the sky", "swap the background for a neon city", "make it look like a comic panel"). Submits, polls, and returns the edited image URL(s). Default model is 'grok-imagine-i2i' (6 cr per call, returns 2 variations, ~30s, best cost-to-quality on standard edits). Other I2I-capable models: 'seedream-v4-edit', 'wan-2.5-spicy-i2i', 'flux-kontext-pro', 'qwen-image-edit', 'gpt-image-1.5-i2i' (slow, ~5min). Use list_image_models for full lineup. Note: source URLs with spaces or parentheses may fail upstream; prefer clean URLs.
Model selection guide for edits:
Default: `grok-imagine-i2i` (6 cr per call, returns 2 variations = 3 cr/image effective, fast ~30s, strong general-purpose edit quality).
Pick a different model when:
  Need a single deterministic output, or 4K resolution -> `seedream-v4-edit` (7 cr per image, supports 1K/2K/4K, multi-image up to 6)
  Subtle edits / preserve composition / character consistency -> `flux-kontext-pro` or `flux-kontext-max`
  NSFW edits -> `wan-2.5-spicy-i2i`
  Highest quality, time is not a concern (~5 min OK) -> `gpt-image-1.5-i2i` or `grok-imagine-quality-i2i` (16 cr @ 1K, 22 cr @ 2K)
  Stylized / artistic transformation -> `midjourney-i2i`
If the user simply says "edit this image" with no other signal, default to `grok-imagine-i2i`.
aetherwave_upscale_image	Upscales a source image using Topaz's high-fidelity upscaler. Pass a public `imageUrl` and an `upscaleFactor`. Credit cost depends on the source resolution × factor; small images cost less than large ones at the same factor. Returns the upscaled image URL.
aetherwave_remove_background	Strips the background from an image, returning a PNG with transparent alpha. Pass a public `imageUrl`. Useful for product shots, character cutouts, logo isolation, or compositing onto a new background. ~5 credits per image. Recraft is the primary provider; on outage the tool auto-falls back to fal.ai BiRefNet v2 so single-image calls never silently fail. Works best on photographic subjects (people, products, animals); transparent-PNG inputs have no foreground to segment.
aetherwave_reframe_image	Reframes an image to a new aspect ratio by intelligently outpainting the edges. Pass a public `imageUrl` and the target `aspectRatio` ('16:9', '9:16', '1:1', '4:3', '3:4', etc.). Three speed tiers: 'turbo' (5 cr, fast), 'balanced' (10 cr, default), 'quality' (14 cr, slowest, best edges). Returns the reframed image URL.
aetherwave_upscale_video	Upscales a source video to 1080p or 2K using Atlas. Pass a public `videoUrl` and the target resolution. Cost is per-second (7 cr/s @ 1080p, 9 cr/s @ 2K). Atlas-side limits: clips up to 53s at 1080p, 23s at 2K, source must be <=30fps. Returns the upscaled video URL (R2-hosted).
aetherwave_remove_background_video	Strips the background from a video frame-by-frame using rembg (u2netp) on AetherWave's Python service. Pass a public `videoUrl`. Choose `bgType: "transparent"` for an alpha-channel WebM output (compositing) or `bgType: "color"` with a `customColor` hex for a solid replacement. 2 credits per second. Slowest tool in the surface (per-frame processing); a 6s clip takes ~4 min, a 30s clip ~15-20 min. Works best on subjects with clear edges (people, products). Returns the processed video URL (R2-hosted).
aetherwave_reframe_video	Reframes a video to a new aspect ratio by intelligently outpainting/cropping the edges. Pass a public `videoUrl` and target `reframeAspectRatio`. 17 credits per second. Optional `reframePrompt` lets you steer the new edge content (e.g. 'extend the sky with sunset clouds'). Returns the reframed video URL (R2-hosted).
aetherwave_generate_video	Generates a short-form video from a text prompt (T2V) or a text prompt + starting image (I2V). Submits, polls, and returns the final video URL. Default model is 'grok-imagine-t2v' (fast, 4-6 cr/s, with built-in KIE -> fal.ai fallback). Use list_video_models for the full lineup with credit cost per second. I2V models (e.g. 'grok-imagine-i2v', 'seedance-pro-i2v') require a public `imageUrl`. Video generation can take 30s to several minutes; this tool polls with up to an 8-minute budget.
Model selection guide for videos (when the user does not specify a model):
Default: `grok-imagine-t2v` (4-6 cr/s, fast, has KIE -> fal.ai fallback for redundancy. Best general-purpose).
Pick a different model when the prompt has these signals:
  "highest quality" / "premium" / broadcast / commercial -> `veo3.1-quality` or `veo3-quality` (Google's flagship, fixed 350-560 cr for 8s, 3-5 min)
  "fast premium" / quick high-quality -> `veo3-fast` or `veo3.1-fast` (84 cr fixed for 8s)
  Cinematic camera moves / dolly / pan -> `seedance-pro-t2v` (3-10 cr/s) or `kling-3.0-pro-t2v` (26 cr/s)
  Realistic human motion / faces -> `hailuo-2.3-pro-i2v` (I2V, supply imageUrl)
  Talking head / lip sync -> `kling-avatar-pro` (23 cr/s) or `infinitalk` (5-17 cr/s)
  Anime / stylized / fantasy -> `wan-2.7-t2v`
  NSFW / adult -> `wan-22-nsfw-i2v` (I2V only; auto-tags adult)
  Animate this exact image -> any I2V variant (`grok-imagine-i2v`, `seedance-pro-i2v`, `hailuo-2.3-pro-i2v`)
  First + last frame interpolation -> `seedance-pro-i2v` with both `imageUrl` + `endImageUrl`
  Cheapest test -> `hailuo-2.0-standard` @ 512p (3 cr/s, ~18 cr for 6s) or `grok-imagine-t2v` @ 480p (4 cr/s, ~24 cr for 6s)
  Clip 12-15s -> `grok-imagine-t2v` (accepts up to 15s)
  True 4K -> `kling-3.0-4k-t2v` (94 cr/s, expensive but native 4K)
Audio in generated video: `grok-imagine-t2v`, `seedance-pro-t2v`, and the VEO 3.x family include audio at base cost (no surcharge). Kling 2.6 and Kling 3.0 are the outliers: they price audio as a +50-100% surcharge (Kling 2.6 doubles the cost, Kling 3.0 Pro adds ~46%). Default to Grok / Seedance / VEO when sound matters and you don't want to think about audio pricing.
Cost framing: resolution and duration drive cost more than model choice. A 6-second 480p Grok generation costs ~24 cr; the same prompt at 1080p Seedance 2 is ~858 cr (35x more). Pick the lowest acceptable resolution + duration first.
For I2V models: `imageUrl` is required. For first+last-frame models, pass `endImageUrl` too.
Ask the user only when:
  A single generation would cost more than 100 credits and they haven't confirmed
  They asked for "the best" with no other signal (surface 2-3 options with cost ranges)
aetherwave_generate_music	Generates AI music via Suno. Returns two tracks per submission. Default model is V5.5 (newest, best quality). For instrumental output set `instrumental: true`. Music gen typically takes 30-90s; this tool polls with up to a 6-minute budget. Note: the `title` param is advisory for instrumentals, since Suno often writes its own title from the prompt content for instrumental generations. Transient `GENERATE_AUDIO_FAILED` errors are common; retry once before degrading the model version.
aetherwave_master_audio	Submits an audio file for AI mastering and returns the mastered URL synchronously (route polls the Python service internally; expect 30s-5min). Useful as a final polish step after music generation. Cost: 20 credits per track. Producer, Mogul, and Ultimate plans get mastering free. Output is WAV (~50MB per 3-minute track, lossless for redistribution). Pick a `preset` to steer the mastering style; call `aetherwave_list_master_presets` for the full live list (12 presets including streaming, loud, gentle, hip_hop, edm, pop, rock, lofi, rnb, acoustic, cinematic, podcast). Each preset has a target LUFS value so you can match the distribution target.
aetherwave_list_my_creations	Returns items from the authenticated user's gallery: images, videos, and audio tracks they've generated on AetherWave. Useful for agent workflows like 'find my last 5 images and reframe them all to 9:16' or 'list my recent songs and master each one'. Supports pagination and type filtering. Each item includes id, type, prompt, model, contentUrl, thumbnailUrl, createdAt, isFavorite, visibility, rating, and type-specific fields (duration for audio/video, width/height for images).
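
The per-second pricing and the "ask the user" threshold described for `aetherwave_generate_video` can be combined into a simple pre-flight check. The Python sketch below is illustrative only: the function name `preflight_video` and its return shape are hypothetical, the rates passed in are the approximate figures quoted above, and real costs should come from `aetherwave_list_video_models` and `aetherwave_balance`.

```python
# Threshold quoted in the generate_video guidance: confirm with the user
# when a single generation would cost more than 100 credits.
VIDEO_CONFIRM_ABOVE = 100


def preflight_video(credits_per_second, duration_seconds, balance,
                    confirm_above=VIDEO_CONFIRM_ABOVE):
    """Estimate a per-second-priced video job before submitting it (hypothetical helper)."""
    cost = credits_per_second * duration_seconds
    return {
        "cost": cost,
        # Whether the current balance covers the estimate.
        "affordable": balance >= cost,
        # Whether the agent should ask the user before spending.
        "needs_confirmation": cost > confirm_above,
    }


# 6 seconds of `grok-imagine-t2v` at 480p, quoted at 4 cr/s.
cheap = preflight_video(4, 6, balance=50)
# 6 seconds of `kling-3.0-pro-t2v`, quoted at 26 cr/s.
pricey = preflight_video(26, 6, balance=100)
```

Here `cheap` estimates 24 credits and needs no confirmation, while `pricey` estimates 156 credits, exceeds both the 100-credit threshold and the example balance, and so should be raised with the user first. Models with fixed pricing (such as the VEO family) would need a separate branch.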

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AetherWave-Studio/aetherwave-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.