Glama

Server Details

Create ads inside any AI assistant with Avocado: create, edit, and make AI UGC in chat.

Status
Healthy
Last Tested
Transport
Streamable HTTP
URL

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

MCP client
Glama
MCP server

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.
Tool Descriptions: A

Average 4.4/5 across 17 of 17 tools scored. Lowest: 3.8/5.

Server Coherence: A
Disambiguation: 5/5

Each tool has a clearly distinct purpose: generation (image, video, music, sfx, speech) is separated by media type, editing and storyboard/flow creation are separate, and helper tools (upload, job polling, info) are well-defined. No two tools appear to do the same thing.

Naming Consistency: 4/5

Tools follow a consistent verb_noun pattern in snake_case, e.g., generate_image, list_storyboards, prepare_image_upload. Minor deviations like 'models_list' (noun_verb) and 'get_started' (phrasal verb) do not significantly hinder readability.

Tool Count: 5/5

With 17 tools, the server covers all major aspects of AI media generation, editing, and storyboard/flow management without being overwhelming. Each tool earns its place for a comprehensive creative workflow.

Completeness: 3/5

Core generation and editing are covered, but there are notable gaps: no tools to delete or update generated media, storyboards, or flows, and no gallery or asset management for previous creations. These missing operations could hinder some workflows.

Available Tools

17 tools
account_check_credits: A
Read-only, Idempotent
Inspect

Check your Avocado AI credit balance. Returns available credits, membership tier, and what you can generate with your current balance.

Parameters (JSON Schema)
Name | Required | Description | Default

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds value by detailing the return data (credits, tier, generation capability), which is not covered by annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, clear sentence that conveys all necessary information without extraneous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description covers the return values adequately. For a simple read tool with no inputs, this is complete.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With zero parameters and 100% schema coverage, the description does not need to explain parameters. It provides no further detail but also has no gaps.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it checks credit balance and specifies what it returns (credits, membership tier, what can be generated). It is distinct from sibling tools like generate_image, which consume credits.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explains when to use the tool (to check credit balance) but does not explicitly state when not to use it or provide alternatives. It implies usage before generation but lacks direct guidance.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

check_job: A
Read-only, Idempotent
Inspect

Always call this tool after generate_image, edit_image, or generate_video to retrieve the result. Pass the jobId returned by the generation tool. Returns status (queued, processing, completed, failed), result URLs when ready, and error details on failure. When an image job is completed, the resulting image(s) are returned as inline image content blocks so they render directly in chat alongside the JSON metadata. If status is queued or processing, wait 5 to 10 seconds and call again; image jobs typically finish in 10 to 60 seconds, video jobs in 2 to 10 minutes.

Parameters (JSON Schema)
Name | Required | Description | Default
jobId | Yes | The jobId returned by a generation tool. |
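
The polling guidance in the check_job description (wait 5 to 10 seconds between calls, stop on completed or failed) could be sketched as the loop below. This is a minimal sketch, not the server's client code: the `poll_job` helper, the injected `check_job` callable, and the assumption that the response is a dict with a `status` key are all hypothetical.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which polling should stop, per the tool description.
TERMINAL_STATUSES = {"completed", "failed"}


def poll_job(
    check_job: Callable[[str], Dict[str, Any]],  # hypothetical wrapper around the check_job tool
    job_id: str,
    interval_s: float = 5.0,    # description suggests waiting 5 to 10 seconds
    timeout_s: float = 600.0,   # video jobs are described as taking up to about 10 minutes
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call check_job until the job completes or fails, or the timeout elapses."""
    waited = 0.0
    while True:
        result = check_job(job_id)
        # Assumption: the response exposes the job state under a "status" key.
        if result.get("status") in TERMINAL_STATUSES:
            return result
        if waited + interval_s > timeout_s:
            raise TimeoutError(f"job {job_id} not finished after {waited:.0f}s")
        sleep(interval_s)
        waited += interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays; a caller would normally leave it at the default.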

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description adds significant behavioral detail beyond the annotations: it specifies return values (status, URLs, error details, inline image blocks), polling guidance, and typical durations. No contradiction with annotations (readOnly, idempotent, non-destructive).

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a concise paragraph that front-loads the critical usage instruction. It could be structured with bullet points for improved readability, but it efficiently conveys all necessary information.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity and lack of output schema, the description fully explains the return structure, behavior (polling), and typical timings. It covers all essential information for an agent to use the tool correctly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter jobId is fully described in the schema. The description repeats the same information without adding new semantics. With 100% schema coverage, baseline 3 is appropriate.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool checks the status of generation jobs and retrieves results. It explicitly mentions the generation tools it follows (generate_image, edit_image, generate_video) and what it returns, distinguishing it from siblings that create jobs.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear when-to-use instructions ('always call this tool after...'), what to pass (jobId), and retry behavior with time estimates. It doesn't explicitly say when not to use the tool, but the context is sufficiently clear.

create_flow: A
Inspect

Create a new Avocado AI Flow pre-built with a node-graph pipeline, and return its id and direct URL so the user can open it on the canvas. You design the whole pipeline: pass the nodes and edges and the server validates socket compatibility, aligns video models to the input shape, lays the graph out left-to-right, and adds a caption per step. Edges reference nodes by 0-based index in the nodes array. This creates (does not run) the flow — the user runs it from the editor.

Use the capability map below to choose node types, models, and handles:

You are Avo, a senior creative-workflow designer inside Avocado AI's Flow editor. The user describes a creative goal; you respond with a node-graph proposal that the editor previews on the canvas. Think like a production director: design the FULL pipeline needed to get a polished result, not the minimum number of nodes.

DESIGN PRINCIPLES — build capable, complete pipelines:

  • Match the pipeline's ambition to the request. A throwaway test is 2-3 nodes; a real deliverable (an ad, a UGC video, a product shot, a music video) is usually 5-12 nodes. Use up to 24 when it genuinely helps.

  • Prefer multi-stage quality: generate → refine (imageEditor) → upscale → animate, rather than a single generate node. Add an upscale step before any final image/video deliverable.

  • Use BRANCHING and FAN-OUT. One output can feed many nodes: e.g. one hero image → three different video models for variations the user can pick from; one script → both a voiceover and the video prompt.

  • Use PARALLEL TRACKS that converge: e.g. a voice track and an image track both feeding a lip-sync video; or a music track plus a visuals track.

  • Use the llm node to do creative thinking inside the graph — write or expand a script, brainstorm a prompt, turn a rough idea into a detailed image/video prompt — then wire its text output into the next node.

  • Pick the BEST model for each step (see the menus below). Don't leave everything on defaults — choosing models is a big part of the value.

  • Set per-node settings (aspect ratio, resolution, duration, voice, variations) when the request implies them (e.g. 'vertical' → 9:16, 'short' → duration 5, '3 options' → variations 3 or three branches).

HARD RULES:

  • Use only the node types listed below. Never invent new ones.

  • Every edge must connect compatible socket types (text→text, image→image, audio→audio, video→video).

  • Give every runnable node a short stepLabel ('Step N — …') — it renders as a caption beneath that node.

  • stickyNote is only for standalone notes; never use it to caption a node (use stepLabel). Optionally add ONE stickyNote describing the workflow.

  • Any schema field you don't need must be null (numbers like variations too).

MODEL MENUS (set the node's model to one of these ids):

image (text-to-image), model ids:

  • fal-ai/nano-banana-2: fast, strong all-rounder (default)

  • fal-ai/gpt-image-2: best instruction-following & legible text

  • fal-ai/bytedance/seedream/v5/lite/text-to-image: photoreal

  • fal-ai/flux-pro/v1.1-ultra: high detail / fidelity

  • fal-ai/nano-banana-pro: premium quality

  • fal-ai/recraft/v4/text-to-image: design, brand, vector-style

  • fal-ai/ideogram/v3: posters & typography

imageEditor (image + prompt → edited image), model ids:

  • fal-ai/nano-banana-2/edit: default, multi-image (up to 14 inputs)

  • openai/gpt-image-2/edit: precise instruction edits

  • fal-ai/bytedance/seedream/v5/lite/edit: photoreal edits

  • fal-ai/flux-pro/kontext/max/text-to-image: style / context transfer

  • fal-ai/gemini-25-flash-image/edit: fast edits

(the image input accepts MULTIPLE connections for compositing/restyle)

imageUpscale (image → larger image), model ids:

  • fal-ai/topaz/upscale/image: best quality (default)

  • fal-ai/recraft-crisp-upscale, fal-ai/clarity-upscaler, fal-ai/crystal-upscaler

llm (text → text) — model ids: claude-haiku (default), gpt-4o-mini, kimi-k2, seed-1.8. Put the instruction in prompt.

voice (text → speech) — pick a voice by name: Sarah (cheerful), Roger (deep), Laura (soft), Charlie (warm), George (bold), Callum (energetic), River (calm), Liam (reliable). The script comes from an upstream text/llm node wired into in — do NOT put the script in the voice node's prompt.

music (text → music) — set duration to one of 30,60,90,120,180,240,300 (seconds). Put the music description in prompt.

videoUpscale (video → sharper video) — add after a video node for final deliverables. No model field.

VIDEO node: choose model to match the input shape (it drives which input handles the node renders):

  • Text → video: kling3-pro, sora-2, veo3-1-fast, seedance-2.0-t2v. Wire text to prompt.

  • Image → video (I2V): veo3-1-fast, kling3-pro, seedance-2.0-i2v, hailuo-pro. Wire the image to image. For keyframe models (kling-o1, veo3-1) wire start-frame + end-frame.

  • Lip-sync / talking-head: fabric (image + audio, NO prompt; never wire text into Fabric) or infinitalk (prompt + image + audio). Wire audio to audio. Audio-over-stills narration: ltx2-audio.

  • Multi-image reference / character consistency: vidu (≤7), veo3-1-ref (≤10), kling-elements (2-4 ordered frames), happy-horse-ref (≤9). Wire EACH image to the SAME ref-images handle (it accepts multiple connections). Never use the plain image handle.

  • Seedance reference (image + video + audio refs): seedance-2.0-ref / seedance-2.0-ref-fast. Wire to ref-images / ref-videos / ref-audio.

  • Motion control (drive a character with a motion video): kling3-motion-control. Wire character to image, motion clip (videoUpload) to motion-video.

Edge handle hints:

  • When the target has multiple typed inputs (Video, Image Editor), set toHandle explicitly (prompt, image, audio, ref-images, start-frame, end-frame, motion-video). The editor otherwise picks the first type-compatible handle, which may be the wrong slot.

  • Never wire text into Fabric. Never wire a single image into a multi-ref model's image slot — use ref-images.

Available node types (id — purpose — inputs / outputs):

  • text — Prompt — in: in | out: out

  • llm — LLM — in: in | out: out

  • upload — Upload — in: — | out: out

  • videoUpload — Video Upload — in: — | out: out

  • image — Image — in: in | out: out

  • imageEditor — Image Editor — in: prompt, image | out: out

  • imageUpscale — Image Upscale — in: image | out: out

  • video — Video — in: prompt, image, start-frame, end-frame, ref-images, ref-videos, ref-audio, audio, motion-video | out: out

  • videoUpscale — Video Upscale — in: video | out: out

  • voice — Voice — in: in | out: out

  • music — Music — in: in | out: out

  • stickyNote — Sticky Note — in: in | out: out

Edges reference nodes by index in the nodes array (0-based). In the examples below, any field not shown is null.

EXAMPLES — study the PATTERNS (multi-stage, fan-out, parallel tracks), copy the handle names exactly:

Example 1 — UGC talking-head with scripted voice + final upscale: nodes=[ {type:"llm",stepLabel:"Step 1 — Write a punchy 15s script",prompt:"Write a 15-second energetic UGC script for the product.",model:"claude-haiku"}, {type:"voice",stepLabel:"Step 2 — Voiceover",voice:"George"}, {type:"upload",stepLabel:"Step 3 — Upload character photo"}, {type:"video",stepLabel:"Step 4 — Lip-sync video",model:"fabric"}, {type:"videoUpscale",stepLabel:"Step 5 — Upscale to deliver"} ] edges=[ {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"}, {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"audio"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"video"} ]

Example 2 — Text → image → refine → upscale (quality chain): nodes=[ {type:"text",stepLabel:"Step 1 — Prompt",prompt:"A cinematic product shot of a matte-black bottle on wet stone, golden hour"}, {type:"image",stepLabel:"Step 2 — Generate hero",model:"fal-ai/flux-pro/v1.1-ultra",aspectRatio:"4:3"}, {type:"imageEditor",stepLabel:"Step 3 — Add brand label",prompt:"Add a minimal embossed logo on the bottle",model:"fal-ai/nano-banana-2/edit"}, {type:"imageUpscale",stepLabel:"Step 4 — Upscale",model:"fal-ai/topaz/upscale/image"} ] edges=[ {fromIndex:0,toIndex:1,fromHandle:"out",toHandle:"in"}, {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"image"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"} ]

Example 3 — Fan-out: one image → three video variations (different models): nodes=[ {type:"upload",stepLabel:"Step 1 — Source image"}, {type:"text",stepLabel:"Step 2 — Motion brief",prompt:"Slow cinematic push-in, gentle parallax"}, {type:"video",stepLabel:"Variation A — Veo",model:"veo3-1-fast",aspectRatio:"9:16",duration:"5"}, {type:"video",stepLabel:"Variation B — Kling",model:"kling3-pro",aspectRatio:"9:16",duration:"5"}, {type:"video",stepLabel:"Variation C — Seedance",model:"seedance-2.0-i2v",aspectRatio:"9:16",duration:"5"} ] edges=[ {fromIndex:0,toIndex:2,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"image"}, {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"prompt"}, {fromIndex:1,toIndex:3,fromHandle:"out",toHandle:"prompt"}, {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"prompt"} ]

Example 4 — Multi-image reference video (character consistency): nodes=[ {type:"upload",stepLabel:"Ref 1 — Character front"}, {type:"upload",stepLabel:"Ref 2 — Character side"}, {type:"upload",stepLabel:"Ref 3 — Outfit detail"}, {type:"text",stepLabel:"Scene prompt",prompt:"The character walks through a neon market at night"}, {type:"video",stepLabel:"Generate with refs",model:"veo3-1-ref",aspectRatio:"16:9"} ] edges=[ {fromIndex:0,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:1,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:2,toIndex:4,fromHandle:"out",toHandle:"ref-images"}, {fromIndex:3,toIndex:4,fromHandle:"out",toHandle:"prompt"} ]

Example 5 — Music video: parallel music + visuals tracks converging: nodes=[ {type:"music",stepLabel:"Track 1 — Score",prompt:"Dreamy lo-fi beat, 90 BPM",duration:"60"}, {type:"text",stepLabel:"Track 2 — Scene",prompt:"A lone astronaut drifting past a glowing planet"}, {type:"image",stepLabel:"Keyframe",model:"fal-ai/nano-banana-pro",aspectRatio:"16:9"}, {type:"video",stepLabel:"Animate",model:"ltx2-audio",aspectRatio:"16:9"} ] edges=[ {fromIndex:1,toIndex:2,fromHandle:"out",toHandle:"in"}, {fromIndex:2,toIndex:3,fromHandle:"out",toHandle:"image"}, {fromIndex:0,toIndex:3,fromHandle:"out",toHandle:"audio"} ]

Return only the structured object — no prose, no markdown.

Parameters (JSON Schema)
Name | Required | Description | Default
edges | No | Connections between nodes (by index). Omit for a single-node flow. |
nodes | Yes | The pipeline's nodes (1-24). |
title | No | Flow title. Defaults to 'Untitled Flow'. |
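
Before calling create_flow, a client could run a rough local pre-check of the nodes and edges. The sketch below is an illustration under stated assumptions, not the server's validator: `check_flow` is a hypothetical helper, the handle table is transcribed from the "Available node types" list, and it only checks node count, node types, 0-based edge indices, and target handle names. Socket type compatibility is left to the server.

```python
from typing import Any, Dict, List, Set

# Input handles per node type, transcribed from the "Available node types" list.
INPUT_HANDLES: Dict[str, Set[str]] = {
    "text": {"in"},
    "llm": {"in"},
    "upload": set(),
    "videoUpload": set(),
    "image": {"in"},
    "imageEditor": {"prompt", "image"},
    "imageUpscale": {"image"},
    "video": {"prompt", "image", "start-frame", "end-frame", "ref-images",
              "ref-videos", "ref-audio", "audio", "motion-video"},
    "videoUpscale": {"video"},
    "voice": {"in"},
    "music": {"in"},
    "stickyNote": {"in"},
}

MAX_NODES = 24  # the nodes parameter is documented as 1-24


def check_flow(nodes: List[Dict[str, Any]], edges: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found; an empty list means none were found."""
    problems: List[str] = []
    if not 1 <= len(nodes) <= MAX_NODES:
        problems.append(f"expected 1-{MAX_NODES} nodes, got {len(nodes)}")
    for i, node in enumerate(nodes):
        if node.get("type") not in INPUT_HANDLES:
            problems.append(f"node {i}: unknown type {node.get('type')!r}")
    for n, edge in enumerate(edges):
        src, dst = edge.get("fromIndex"), edge.get("toIndex")
        # Edges reference nodes by 0-based index in the nodes array.
        if not all(isinstance(k, int) and 0 <= k < len(nodes) for k in (src, dst)):
            problems.append(f"edge {n}: index out of range")
            continue
        target_type = nodes[dst].get("type")
        if edge.get("toHandle") not in INPUT_HANDLES.get(target_type, set()):
            problems.append(
                f"edge {n}: {target_type} has no input handle {edge.get('toHandle')!r}"
            )
    return problems
```

Run against the nodes and edges of Example 2 (text, image, imageEditor, imageUpscale), this pre-check reports no problems; an edge into an `upload` node is flagged because that node type lists no inputs.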

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (readOnlyHint=false, destructiveHint=false). The description adds critical behavioral context: it creates but does not run flows, validates sockets, aligns models, and returns id/URL. This far exceeds what annotations provide.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is long but well-organized with sections: purpose, design principles, hard rules, model menus, node types, edge hints, and examples. Every sentence adds necessary information; no wasted text.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (numerous node types, models, edge rules, and constraints), the description is fully complete. It includes five detailed examples covering common patterns, and explains return values despite no output schema.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% but the description adds enormous value: valid model ids, voice names, node type purposes, edge handle hints, and example parameter values. It explains the structure of nodes and edges in detail, making the schema meaningful.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool creates a flow with a node-graph pipeline and returns its id and URL. It distinguishes itself from siblings (single media tools) by specifying it designs full pipelines.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides extensive usage guidance: when to use (creative goals), design principles (multi-stage, branching, parallel tracks), hard rules (node types, edge compatibility, step labels), and explicit contrast with running flows. Examples further clarify usage.

create_storyboard: A
Inspect

Create a new empty Avocado AI storyboard for the user. Returns the new board's id and direct URL so the user can open it.

Parameters (JSON Schema)
Name | Required | Description | Default
title | No | Title for the new storyboard. Defaults to 'Untitled Storyboard'. |

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (readOnlyHint: false, destructiveHint: false), so the description adds valuable context: it creates an empty board and returns the id and URL. No contradictions with annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two concise sentences that are front-loaded with the key purpose and return information. No unnecessary words or details.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple creation tool with one optional parameter and no output schema, the description adequately explains the action and return values (id and URL). It could mention error handling or authorization, but neither is essential.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage with a description for the only parameter (title). The description does not add extra meaning beyond what the schema already provides, meeting the baseline for high schema coverage.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Create', the resource 'new empty Avocado AI storyboard', and the return values (board id and direct URL). It effectively distinguishes from sibling tools like 'list_storyboards' and other creation tools.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for creating a storyboard but does not explicitly guide when to use it versus alternatives like 'generate_image_to_storyboard' or 'generate_video_to_storyboard'. No exclusions or prerequisites are mentioned.

describe_avocado: A
Read-only, Idempotent
Inspect

Describe what Avocado AI is and what it can do. Call this when a user asks about Avocado AI, wants to know what AI media tools are available, or is deciding whether to sign up. Returns capabilities, supported models, use cases, pricing overview, and how to connect.

Parameters (JSON Schema)
Name | Required | Description | Default

No parameters

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds value by listing what the tool returns (capabilities, supported models, pricing, etc.), providing context beyond annotations.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise at three sentences. It front-loads the purpose and adds usage scenarios efficiently. It could be slightly more streamlined, but overall it is well-structured.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no parameters, no output schema), the description covers all necessary context: purpose, when to use, and what information is returned. Annotations provide safety cues. No gaps.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Input schema has zero parameters with 100% schema coverage, so baseline is 4. The description does not need to add parameter information, and it correctly focuses on the tool's function.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: describing Avocado AI, its capabilities, and offerings. It specifies the verb 'Describe' and the resource 'Avocado AI', and distinguishes from siblings by being the only tool for general platform info.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use this tool (when the user asks about Avocado AI, wants to know which AI media tools are available, or is deciding whether to sign up). It doesn't specify when not to use it, but for an unambiguous tool this is sufficient.

edit_image (Edit Image): A
Inspect

Modify an existing image. REQUIRED input: exactly one of file_id OR image_url. base64 is NOT accepted — do not try to pass image bytes as a tool argument, the call will be rejected. For chat-attached images you MUST first call prepare_image_upload to get a signed PUT URL, upload the bytes there (via the inline widget on Claude.ai, or via curl on Claude Desktop / Claude Code), then call this tool with the returned file_id. For URLs the user has pasted, use image_url directly. Returns a jobId immediately; call check_job with the jobId to retrieve the edited image inline. Models (both 1 credit/image): 'nano-banana-2' (fast, default) and 'gpt-image-2' (higher quality).

Parameters (JSON Schema)
Name | Required | Description | Default
model | No | Edit model. 'nano-banana-2' is fast and cheap (default). 'gpt-image-2' is higher quality but costs more credits. |
prompt | Yes | What to change about the image. Be specific. Example: 'Replace the background with a sunset beach' or 'Add reading glasses to the person'. |
file_id | No | file_id returned by prepare_image_upload after the image was uploaded to the signed URL. This is the ONLY supported path for chat-attached images. Format: 'mcp-source/{userId}/{uuid}.{ext}'. |
quality | No | Quality tier. Only applies to 'gpt-image-2'. low=1 credit, medium=1-2, high=4-6 credits per image (varies by aspect). Defaults to 'high'. Ignored by 'nano-banana-2'. |
image_url | No | HTTPS URL of the image to edit. Use only when the user pasted a public URL. Otherwise call prepare_image_upload first. |
num_images | No | Number of edited variants to produce (1-4). Defaults to 1. |
aspect_ratio | No | Output aspect ratio. Omit to keep the source image's shape (best for retouching an existing photo). Set it when composing a new layout around the input, e.g. '9:16' or '3:4' for a vertical poster, '2:3' for a tall portrait. For 4:5-style feed posts use '3:4' (closest supported portrait). |
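
The input rule for edit_image (exactly one of file_id or image_url, HTTPS only for URLs, 1-4 variants) can be checked client-side before the call. The helper below is a hypothetical sketch of that check; `build_edit_image_args` is not part of the server, and only the parameter names and limits are taken from the table above.

```python
from typing import Any, Dict, Optional


def build_edit_image_args(
    prompt: str,
    file_id: Optional[str] = None,
    image_url: Optional[str] = None,
    model: str = "nano-banana-2",      # documented default edit model
    num_images: int = 1,               # documented range is 1-4
    aspect_ratio: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble edit_image arguments, rejecting combinations the description rules out."""
    # Exactly one image source is allowed: file_id XOR image_url.
    if (file_id is None) == (image_url is None):
        raise ValueError("pass exactly one of file_id or image_url")
    if image_url is not None and not image_url.startswith("https://"):
        raise ValueError("image_url must be an HTTPS URL")
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be between 1 and 4")

    args: Dict[str, Any] = {"prompt": prompt, "model": model, "num_images": num_images}
    if file_id is not None:
        args["file_id"] = file_id
    else:
        args["image_url"] = image_url
    # Omitting aspect_ratio keeps the source image's shape.
    if aspect_ratio is not None:
        args["aspect_ratio"] = aspect_ratio
    return args
```

Leaving `aspect_ratio` out of the dictionary, rather than sending a null, mirrors the "omit to keep the source image's shape" wording; whether the server also accepts an explicit null is not stated.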

Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations are minimal (no safety hints), but the description fully compensates by disclosing the async job pattern, the fact that it mutates an existing image, and credit costs. No contradictions with annotations.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single well-structured paragraph, front-loaded with purpose, then input requirements, async behavior, and model options. Every sentence is necessary, and there is no redundancy.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With 7 parameters, 1 required, no output schema, and no nested objects, the description covers all necessary context: input selection, async return, model differentiation, and quality behavior. The agent can invoke correctly without gaps.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds crucial semantics: mutual exclusivity of file_id and image_url (not in schema), credit costs per model, and quality tier behavior. This justifies a 4.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description starts with 'Modify an existing image,' which clearly states the verb and resource. It distinguishes from siblings like generate_image and prepare_image_upload by specifying input sources and workflow.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit when-to-use guidance: exactly one of file_id or image_url, no base64, steps for chat-attached vs URL images, and the async workflow with check_job. It also explains model selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_image (Generate Image) A

Generate an AI image using Avocado AI. Returns a jobId immediately; image generation completes in 10-60 seconds. After calling, use the check_job tool with the returned jobId to retrieve the result. Once the job is complete, check_job returns the image inline so it renders directly in chat. Run models_list to see available models. Costs 1-6 credits per image depending on model and quality.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| model | No | Model slug from models_list. Defaults to 'gpt-image-2'. | |
| prompt | Yes | Text description of the image to generate. Be descriptive for best results. | |
| quality | No | Quality tier. Only applies to 'gpt-image-2'. low=1 credit, medium=1-2 credits, high=4-6 credits per image (varies by aspect). Ignored by other models. Defaults to 'high'. | |
| num_images | No | Number of images to generate (1-4). Defaults to 1. | |
| aspect_ratio | No | Image aspect ratio. Defaults to '1:1'. | |
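
The start-then-poll flow described for generate_image can be sketched as follows. This is a minimal illustration, not the server's client code: `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and the check_job argument name (`jobId`) and the `status` field with value `"completed"` are assumptions, since this page does not show the check_job schema or its output.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical transport: any callable that sends (tool_name, arguments)
# to the MCP server and returns the tool result as a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def generate_image_and_wait(
    call_tool: ToolCaller,
    prompt: str,
    *,
    quality: str = "high",
    poll_seconds: float = 5.0,
    timeout_seconds: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Start a generate_image job, then poll check_job until it finishes."""
    job = call_tool("generate_image", {"prompt": prompt, "quality": quality})
    job_id = job["jobId"]  # the description says a jobId is returned immediately

    waited = 0.0
    while waited <= timeout_seconds:
        # ASSUMPTION: check_job accepts {"jobId": ...} and reports a "status" field.
        result = call_tool("check_job", {"jobId": job_id})
        if result.get("status") == "completed":
            return result
        sleep(poll_seconds)
        waited += poll_seconds
    raise TimeoutError(f"job {job_id} did not complete within {timeout_seconds}s")
```

The default timeout of 120 seconds is simply twice the upper end of the stated 10-60 second window; a real client would also want to handle a failed job, which the description does not document.
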
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description discloses async behavior, typical completion time (10-60 seconds), and credit cost (1-6 per image). It also explains that the image is rendered inline via check_job. Annotations are all false, so no contradiction, and the description adds valuable context beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single paragraph with clear front-loading: first sentence states purpose, then async flow, follow-up tool, model listing, and cost. Every sentence adds value without redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's async nature, multiple parameters, and cost model, the description covers the essential workflow and constraints. It explains what to do with the jobId and mentions inline rendering. Minor omission: no mention of rate limits or failure scenarios, but overall complete for typical use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, providing baseline 3. The description adds value by explaining credit costs for quality tiers, clarifying that quality only applies to 'gpt-image-2', and recommending descriptive prompts. This goes beyond the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates an AI image using Avocado AI, and explains the async behavior (returns jobId, completes in 10-60 seconds). It distinguishes itself from sibling tools like check_job (for retrieval) and edit_image (for editing).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly advises using check_job to retrieve the result and running models_list to see available models. It also mentions cost and time. However, it does not explicitly state when not to use this tool versus alternatives, though sibling tools provide context.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_image_to_storyboard (Generate Image to Storyboard) A

Generate an AI image and place it directly on a user's Avocado AI storyboard. Drops 'Generating...' placeholder(s) on the board immediately, then the webhook swaps each placeholder for the final image when generation completes (10-60s). Use list_storyboards or create_storyboard first to obtain the storyboard_id. If the user has the storyboard tab open, they may need to refresh once for the image to appear (the canvas does not yet support live realtime updates from MCP). Costs match generate_image (1-6 credits per image depending on model and quality).

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| model | No | Model slug from models_list. Defaults to 'gpt-image-2'. | |
| prompt | Yes | Text description of the image to generate. | |
| quality | No | Quality tier ('gpt-image-2' only). Defaults to 'high'. | |
| num_images | No | Number of images to generate (1-4). Defaults to 1. One placeholder per image. | |
| aspect_ratio | No | Image aspect ratio. Defaults to '1:1'. Also controls placeholder shape on the board. | |
| storyboard_id | Yes | The id of the storyboard to add the image to. Must be owned by, or shared with edit access to, the authenticated user. | |
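
The prerequisite step (obtain a storyboard_id first, then generate onto that board) might look like the sketch below. It assumes the list_storyboards result has already been parsed into a list of dicts with `id` and `title` keys; the description names those fields, but the exact key names and the helper functions here are hypothetical.

```python
from typing import Any, Dict, List


def pick_storyboard_id(boards: List[Dict[str, Any]], title: str) -> str:
    """Return the id of the single board whose title matches (case-insensitive)."""
    wanted = title.strip().lower()
    matches = [b for b in boards if str(b.get("title", "")).strip().lower() == wanted]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one board titled {title!r}, found {len(matches)}"
        )
    return str(matches[0]["id"])


def storyboard_image_args(
    storyboard_id: str, prompt: str, num_images: int = 1
) -> Dict[str, Any]:
    """Build arguments for generate_image_to_storyboard (num_images is 1-4)."""
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be between 1 and 4")
    return {"storyboard_id": storyboard_id, "prompt": prompt, "num_images": num_images}
```

Refusing to guess when zero or several boards share a title keeps the agent from placing placeholders on the wrong board; in that case it can ask the user to choose, or call create_storyboard.
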
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses async behavior with placeholder drops and webhook swaps, cost implications (1-6 credits), and the need for a page refresh. This adds context beyond annotations (readOnlyHint=false, destructiveHint=false), with no contradiction.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A single paragraph that front-loads the main purpose, then explains mechanism, prerequisites, user experience, and costs. Each sentence adds value, though slightly dense; could be split for readability.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers key aspects: behavior, prerequisites, UI impact, and cost. However, lacks output/return value description and error handling, which would be helpful given no output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with descriptions. Description adds meaning to num_images (one placeholder per image) and aspect_ratio (controls placeholder shape), enhancing understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates an AI image and places it on a storyboard, distinguishing it from sibling tools like generate_image (no storyboard placement) and generate_video_to_storyboard (video version).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly advises to use list_storyboards or create_storyboard first to obtain the storyboard_id. Implicitly differentiates from generate_image for standalone generation, but does not explicitly state when not to use this tool.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_music A

Generate AI music using Avocado AI. Create original music tracks from text prompts describing genre, mood, tempo, and instruments. Tracks can be 30 seconds to 5 minutes. Costs 4 credits per 30-second block. The track is saved to your Music Studio at https://www.avocadoai.co/music-studio.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| title | No | Title for the music track. | |
| prompt | Yes | Description of the music to generate. Include genre, mood, tempo, instruments, and style. Example: 'Upbeat electronic dance music with synth leads, punchy drums, 128 BPM, energetic and euphoric mood' | |
| duration_seconds | No | Duration in seconds (30-300). Defaults to 30. | |
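
As a worked example of the stated pricing (4 credits per 30-second block, 30-300 seconds), the estimate below may help an agent quote a cost before calling the tool. It assumes a partial block is billed as a full block; the description does not say how fractions are rounded, so treat the result as an upper-bound estimate. The function name is hypothetical.

```python
import math

CREDITS_PER_BLOCK = 4   # from the tool description
BLOCK_SECONDS = 30      # from the tool description


def music_credits(duration_seconds: int = 30) -> int:
    """Estimate credits for a generate_music call of the given length."""
    if not 30 <= duration_seconds <= 300:
        raise ValueError("duration_seconds must be between 30 and 300")
    # ASSUMPTION: partial 30-second blocks round up to a whole block.
    blocks = math.ceil(duration_seconds / BLOCK_SECONDS)
    return blocks * CREDITS_PER_BLOCK
```

Under that assumption a default 30-second track costs 4 credits and a full 5-minute track costs 40.
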
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses credit consumption and persistent storage behavior (saved to Music Studio), adding value beyond the annotations which only indicate non-destructive/non-readOnly. No contradiction with annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five concise sentences with no redundancy. The first sentence immediately clarifies the tool's purpose, and subsequent sentences add necessary constraints without verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Despite lacking an output schema, the description adequately explains the outcome (track saved to Music Studio) and the cost model, making the tool's behavior fully understandable for invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, and the description does not add significant new meaning beyond the parameter descriptions already present in the schema. Example provided in schema, so baseline score is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states 'Generate AI music' and details the creation of original tracks from text prompts, clearly distinguishing it from sibling tools like generate_sfx or generate_speech.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides concrete constraints on duration (30s-5min) and cost (4 credits per 30s block), along with the output destination (Music Studio URL). While it doesn't explicitly state when not to use, the context is sufficient for decision-making.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_sfx A

Generate AI sound effects using Avocado AI. Create short sound effects from text prompts describing the sound. Effects can be 1 to 22 seconds. Costs 1 credit per 5-second block. The effect is saved to your Music Studio at https://www.avocadoai.co/music-studio.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| title | No | Title for the sound effect. | |
| prompt | Yes | Description of the sound effect to generate. Example: 'Glass shattering on a tile floor with sharp reverberation' or 'Heavy footsteps on wet concrete in a dark alley' | |
| duration_seconds | No | Duration in seconds (1-22). Defaults to 5. | |
| prompt_influence | No | How closely to follow the prompt (0-1). Higher = more literal, lower = more creative. Defaults to 0.35. | |
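
The parameter ranges above can be checked client-side before a call is made. The builder below is a small sketch under that idea; the function name is hypothetical, and whether the server accepts fractional durations is not stated on this page.

```python
from typing import Any, Dict, Optional


def sfx_args(
    prompt: str,
    duration_seconds: float = 5,
    prompt_influence: float = 0.35,
    title: Optional[str] = None,
) -> Dict[str, Any]:
    """Build generate_sfx arguments, enforcing the documented ranges."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if not 1 <= duration_seconds <= 22:
        raise ValueError("duration_seconds must be between 1 and 22")
    if not 0.0 <= prompt_influence <= 1.0:
        raise ValueError("prompt_influence must be between 0 and 1")
    args: Dict[str, Any] = {
        "prompt": prompt,
        "duration_seconds": duration_seconds,
        "prompt_influence": prompt_influence,
    }
    if title is not None:  # optional, so only send it when provided
        args["title"] = title
    return args
```

For instance, `sfx_args("Glass shattering on a tile floor", prompt_influence=0.8)` asks for a more literal rendering than the 0.35 default.
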
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations (all false), the description adds important behavioral details: credit cost per 5-second block, duration limits, and where the effect is saved (Music Studio URL).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Five short sentences, front-loaded with the main purpose, then constraints and cost. No fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers purpose, parameters, cost, and destination. However, with no output schema and sibling tools including check_job, it likely returns a job ID asynchronously, but the description omits this, which is a significant gap for an agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%; description adds context about purpose of duration (1-22 seconds) and prompt influence (default 0.35) via examples, plus cost and saving location, enhancing interpretation.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it generates AI sound effects from text prompts, distinguishing from sibling tools like generate_music or generate_speech by specifying short sound effects and saving to Music Studio.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for short sound effects (1-22 seconds) and mentions credit cost, but does not explicitly say when to use this tool versus alternatives or provide exclusion criteria.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_speech A

Convert text to natural-sounding speech using Avocado AI. Supports multiple voices and languages. Costs 3 credits per 1000 characters. Audio will appear in your Avocado AI workspace.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| text | Yes | The text to convert to speech. | |
| voice | No | Voice to use. Defaults to 'rachel'. Options: rachel (female, calm), adam (male, deep), josh (male, young), bella (female, soft), sam (male, raspy). | |
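
Because the voice options are described only by short labels, an agent may want to map a user's request ("a deep male voice") onto a voice name. The lookup below is an illustrative sketch: the voice names and labels come from the table above, while the helper functions and the first-match selection rule are assumptions of this example.

```python
from typing import Any, Dict, Optional, Tuple

# Voice name -> (gender, style), as listed in the voice parameter description.
VOICES: Dict[str, Tuple[str, str]] = {
    "rachel": ("female", "calm"),
    "adam": ("male", "deep"),
    "josh": ("male", "young"),
    "bella": ("female", "soft"),
    "sam": ("male", "raspy"),
}


def pick_voice(gender: Optional[str] = None, style: Optional[str] = None) -> str:
    """Return the first listed voice matching the requested gender and style."""
    for name, (voice_gender, voice_style) in VOICES.items():
        if gender not in (None, voice_gender):
            continue
        if style not in (None, voice_style):
            continue
        return name
    raise LookupError(f"no voice matches gender={gender!r}, style={style!r}")


def speech_args(text: str, voice: str = "rachel") -> Dict[str, Any]:
    """Build generate_speech arguments, rejecting unknown voice names."""
    if voice not in VOICES:
        raise ValueError(f"unknown voice {voice!r}")
    return {"text": text, "voice": voice}
```

With no constraints, `pick_voice()` falls back to 'rachel', which matches the documented default.
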
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond annotations, the description adds cost per character and where the audio appears, which are useful behavioral details not present in annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Four short sentences, front-loaded with core purpose, then cost and result location. No wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the simple input schema and no output schema, the description covers the major aspects: what it does, cost, and where the result appears. Missing error or speed details but adequate.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, the baseline is 3. The description adds cost context but does not add new parameter-level meaning beyond the schema's descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states it converts text to speech using Avocado AI, mentions multiple voices and languages, and distinguishes from siblings like generate_music or generate_sfx.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies when to use (text-to-speech) but does not explicitly provide when-not or alternatives, leaving the agent to infer based on sibling tool names.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_video (Generate Video) A

Generate an AI video. Nine models: seedance-2.0-t2v / -t2v-fast (text only), seedance-2.0-i2v / -i2v-fast (REQUIRE an image), kling3-standard (720p, 5-10s), kling3-pro (1080p, 5-10s), kling3-4k & kling-o3-4k (4K, 3-15s; all four Kling 3.x variants support BOTH text-to-video and image-to-video — supplying image_url or file_id automatically picks image mode), grok-imagine-video-v1-5 (480p/720p, 1-15s, REQUIRES an image — image-to-video only). For image-to-video on any host: call prepare_image_upload first, then pass the returned file_id here. Renders take 2-10 minutes; the inline result card polls for completion. Pricing is per-second, varies by model and resolution.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| model | No | Model. Defaults to 'seedance-2.0-t2v'. Use the -i2v variant or any kling3 variant for image-to-video. | |
| prompt | Yes | Text description of the video. For image-to-video, describe the motion/action you want applied to the source image. | |
| file_id | No | file_id from prepare_image_upload (preferred for chat attachments). Required for seedance-2.0-i2v / -i2v-fast. Optional for kling3-* (presence triggers image-to-video mode). | |
| duration | No | Video duration in seconds. Per-model bounds: seedance i2v 4-15, seedance t2v 5-15, kling3-standard/pro 5-10, kling3-4k/o3-4k 3-15. Defaults to 5. | |
| fast_mode | No | Legacy alias. true picks seedance-2.0-t2v-fast or seedance-2.0-i2v-fast when no explicit model was given. Prefer setting model directly. | |
| image_url | No | HTTPS URL of the source image. Use only if you already have a public URL; otherwise call prepare_image_upload and pass file_id. | |
| resolution | No | Video resolution. Only meaningful for seedance (480p/720p/1080p; 1080p not allowed with seedance fast). Kling models lock resolution by variant. | |
| aspect_ratio | No | Aspect ratio. Defaults to '16:9'. Ignored for image-to-video (aspect derives from input). | |
| generate_audio | No | Generate audio (Kling 3 standard/pro only). Ignored for other models. | |
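
The per-model rules are spread across the description and the duration parameter, so a pre-flight check can save a failed render. The table below collects them in one place as a sketch. It assumes the -fast seedance variants share the duration bounds of their base variants, and that passing an image to a text-only model should be flagged; neither is stated outright on this page. All names in the code are hypothetical.

```python
from typing import Dict, List, NamedTuple


class ModelRule(NamedTuple):
    min_seconds: int
    max_seconds: int
    image: str  # "required", "optional", or "none"


# Bounds and image requirements as listed for generate_video.
# ASSUMPTION: -fast variants use the same duration bounds as their base model.
RULES: Dict[str, ModelRule] = {
    "seedance-2.0-t2v": ModelRule(5, 15, "none"),
    "seedance-2.0-t2v-fast": ModelRule(5, 15, "none"),
    "seedance-2.0-i2v": ModelRule(4, 15, "required"),
    "seedance-2.0-i2v-fast": ModelRule(4, 15, "required"),
    "kling3-standard": ModelRule(5, 10, "optional"),
    "kling3-pro": ModelRule(5, 10, "optional"),
    "kling3-4k": ModelRule(3, 15, "optional"),
    "kling-o3-4k": ModelRule(3, 15, "optional"),
    "grok-imagine-video-v1-5": ModelRule(1, 15, "required"),
}


def video_request_problems(
    model: str = "seedance-2.0-t2v", duration: int = 5, has_image: bool = False
) -> List[str]:
    """Return a list of human-readable problems; an empty list means it looks valid."""
    rule = RULES.get(model)
    if rule is None:
        return [f"unknown model {model!r}"]
    problems: List[str] = []
    if not rule.min_seconds <= duration <= rule.max_seconds:
        problems.append(
            f"{model} supports {rule.min_seconds}-{rule.max_seconds}s, got {duration}s"
        )
    if rule.image == "required" and not has_image:
        problems.append(f"{model} requires an image (file_id or image_url)")
    if rule.image == "none" and has_image:
        problems.append(f"{model} is text only; choose an i2v or Kling model")
    return problems
```

Returning a list rather than raising lets an agent report every problem to the user in one message before spending credits.
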
Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Beyond neutral annotations, the description discloses render time (2-10 minutes), polling mechanism ('inline result card polls for completion'), pricing (per-second, model-dependent), and prerequisite workflow (upload image first). This sets accurate expectations for the agent.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is dense but well-structured: starts with purpose, lists models with details, then prerequisites, timing, pricing. While long, it front-loads key information and avoids redundancy. Could be slightly tighter but is effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 9 parameters, 100% schema coverage, and no output schema, the description covers all essential aspects: model selection, parameter usage, prerequisites, timing, pricing, and polling behavior. It leaves no significant gaps for an agent to misuse the tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 100% schema coverage, baseline is 3. The description adds extra meaning: grouping models by capability, explaining fast_mode as legacy, clarifying resolution relevance per model, and noting aspect_ratio and generate_audio scope. This enhances understanding beyond the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states 'Generate an AI video' and lists specific models with their modes (text-to-video vs image-to-video). It distinguishes this from sibling tools like generate_image or generate_speech by focusing solely on video generation, making purpose unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides detailed guidance on when to use each model (e.g., the seedance i2v variants and grok-imagine-video-v1-5 require an image, while the Kling variants support both text and image input), prerequisites (prepare_image_upload for images), and constraints (duration bounds, resolution limits). It lacks an explicit comparison to siblings, but the purpose itself naturally guides selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

generate_video_to_storyboard (Generate Video to Storyboard) A

Generate an AI video and place it directly on a user's Avocado AI storyboard. Drops a 'Generating...' placeholder on the board immediately, then the storyboard's recovery hook swaps it for the final video when generation completes (2-10 minutes). Use list_storyboards or create_storyboard first to obtain the storyboard_id. If the user has the storyboard tab open, they may need to refresh once for the video to appear (the canvas does not yet support live realtime swap from MCP). Eight models supported: seedance-2.0-t2v / -t2v-fast (text only), seedance-2.0-i2v / -i2v-fast (REQUIRE an image), kling3-standard (720p, 5-10s), kling3-pro (1080p, 5-10s), kling3-4k & kling-o3-4k (4K, 3-15s; all four Kling 3.x variants support BOTH text-to-video and image-to-video). For image-to-video: call prepare_image_upload first, then pass the returned file_id here. Pricing is per-second, varies by model and resolution.

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| model | No | Model. Defaults to 'seedance-2.0-t2v'. Use the -i2v variant or any kling3 variant for image-to-video. | |
| prompt | Yes | Text description of the video. For image-to-video, describe the motion/action you want applied to the source image. | |
| file_id | No | file_id from prepare_image_upload (preferred for chat attachments). Required for seedance-2.0-i2v / -i2v-fast. Optional for kling3-* (presence triggers image-to-video mode). | |
| duration | No | Video duration in seconds. Per-model bounds: seedance i2v 4-15, seedance t2v 5-15, kling3-standard/pro 5-10, kling3-4k/o3-4k 3-15. Defaults to 5. | |
| image_url | No | HTTPS URL of the source image. Use only if you already have a public URL; otherwise call prepare_image_upload and pass file_id. | |
| resolution | No | Video resolution. Only meaningful for seedance (480p/720p/1080p; 1080p not allowed with seedance fast). Kling models lock resolution by variant. | |
| aspect_ratio | No | Aspect ratio. Defaults to '16:9'. Also controls placeholder shape on the board. Ignored for image-to-video (aspect derives from input). | |
| storyboard_id | Yes | The id of the storyboard to add the video to. Must be owned by, or shared with edit access to, the authenticated user. | |
| generate_audio | No | Generate audio (Kling 3 standard/pro only). Ignored for other models. | |
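
The choice between file_id and image_url, and the note that aspect_ratio is ignored for image-to-video, can be folded into a small argument builder. This is a sketch only: the function is hypothetical, preferring file_id when both sources are supplied is a choice made here rather than documented behavior, and it does not check that the chosen model accepts an image.

```python
from typing import Any, Dict, Optional
from urllib.parse import urlparse


def storyboard_video_args(
    storyboard_id: str,
    prompt: str,
    model: str,
    *,
    file_id: Optional[str] = None,
    image_url: Optional[str] = None,
    aspect_ratio: str = "16:9",
) -> Dict[str, Any]:
    """Build generate_video_to_storyboard arguments with one image source at most."""
    args: Dict[str, Any] = {
        "storyboard_id": storyboard_id,
        "prompt": prompt,
        "model": model,
    }
    if file_id is not None:
        # file_id (from prepare_image_upload) is the preferred source.
        args["file_id"] = file_id
    elif image_url is not None:
        if urlparse(image_url).scheme != "https":
            raise ValueError("image_url must be an HTTPS URL")
        args["image_url"] = image_url
    else:
        # Text-to-video: aspect_ratio applies (and shapes the placeholder).
        args["aspect_ratio"] = aspect_ratio
    return args
```

Leaving aspect_ratio out of image-to-video requests mirrors the parameter note that the aspect derives from the input image.
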
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Discloses that a placeholder is dropped immediately and the recovery hook swaps it after 2-10 minutes. Notes that refresh may be needed for live update. Annotations don't contradict; description adds rich behavioral context beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

A bit lengthy but well-structured with front-loaded main action. Every sentence adds necessary information. Could be slightly trimmed, but no waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Covers all essential aspects: prerequisites (storyboard_id, image upload), model selection, duration bounds, pricing, and output behavior (placeholder + replacement). Despite no output schema, the description fully informs the agent of what to expect.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage of parameter descriptions, but the description adds significant value: explains model variants in detail, per-model duration bounds, difference between file_id and image_url, and defaults. This helps the agent understand parameter constraints and relationships.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it generates an AI video and places it on a user's Avocado AI storyboard. Distinguishes from siblings like generate_video and generate_image_to_storyboard by specifying the placement on storyboard. Uses specific verb 'Generate' and resource 'video to storyboard'.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly tells the agent to first use list_storyboards or create_storyboard to obtain storyboard_id. Provides guidance on when to use image-to-video vs text-to-video, and that for image-to-video one should call prepare_image_upload first. Also warns about refresh if the tab is open.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

get_started A
Read-only, Idempotent

Get step-by-step instructions for connecting to Avocado AI via MCP. Call this when a user wants to sign up, authenticate, or connect Avocado AI to their AI assistant (Claude, ChatGPT, Cursor, Windsurf, Claude Code, etc.).

Parameters

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| client | No | Which AI assistant or client the user wants to connect from. | |
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description's 'Get step-by-step instructions' aligns with these hints but adds no new behavioral context beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is two sentences, with the first stating the core purpose and the second providing usage context. It is concise with no redundant or unnecessary information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (one optional parameter, no output schema), the description fully covers purpose, usage, and context. No additional details are needed for an agent to correctly invoke it.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single parameter 'client' is fully documented in the schema with enum values and description. The tool description does not add additional meaning or behavior details for the parameter, so it meets the baseline for 100% schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool provides step-by-step instructions for connecting to Avocado AI via MCP. It specifies the exact scenarios (sign up, authenticate, connect) and lists example clients, making it distinct from sibling tools which are primarily generation and editing tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly says 'Call this when a user wants to sign up, authenticate, or connect Avocado AI to their AI assistant', giving clear when-to-use guidance. It does not list exclusions, but the context of sibling tools makes the usage obvious.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

list_storyboards A
Read-only, Idempotent

List the user's Avocado AI storyboards. Returns owned and shared boards with id, title, last-updated time, thumbnail, and direct URL. Use this to let the user pick an existing board.
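As a rough illustration of the "let the user pick an existing board" step, the sketch below renders a numbered menu from a list of boards. The `Storyboard` field names, the ISO 8601 timestamp format, and the example URLs are assumptions based on the fields the description lists; the actual response shape is not documented here.

```python
from dataclasses import dataclass


@dataclass
class Storyboard:
    # Hypothetical field names mirroring the fields named in the description.
    id: str
    title: str
    updated_at: str  # assumed to be an ISO 8601 timestamp in a uniform format
    thumbnail: str
    url: str


def format_picker(boards: list[Storyboard]) -> str:
    """Render a numbered menu, most recently updated board first."""
    # Uniformly formatted ISO 8601 strings sort chronologically as plain text.
    ordered = sorted(boards, key=lambda b: b.updated_at, reverse=True)
    lines = [f"{i}. {b.title} ({b.url})" for i, b in enumerate(ordered, start=1)]
    return "\n".join(lines)


# Made-up sample data for illustration only.
boards = [
    Storyboard("sb_1", "Launch teaser", "2024-05-01T10:00:00Z",
               "https://example.com/t1.png", "https://example.com/b/sb_1"),
    Storyboard("sb_2", "Summer promo", "2024-06-12T08:30:00Z",
               "https://example.com/t2.png", "https://example.com/b/sb_2"),
]
print(format_picker(boards))
```

The user's choice maps back to a board `id`, which is the value a follow-up tool call would presumably need.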

Parameters (JSON Schema)
Name | Required | Description | Default

No parameters

Behavior4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds value by specifying the scope (owned and shared) and the exact fields returned, which goes beyond annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with the action and resource, followed by output details and usage guidance. No superfluous words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (no parameters, no output schema), the description fully covers purpose, output, and usage. Annotations handle safety. No missing elements.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With zero parameters and 100% schema coverage, the baseline is 4. The description does not need to explain parameters, but it compensates by describing the output structure, aiding understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action (list), the resource (storyboards), and the returned fields (id, title, last-updated time, thumbnail, URL). It distinguishes itself from sibling tools like create_storyboard, which create rather than list.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states when to use the tool: 'Use this to let the user pick an existing board.' It does not provide explicit exclusions or alternatives, but no alternative listing tool exists among siblings, making the guidance adequate.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

models_list A
Read-only, Idempotent

List all available AI image generation models on Avocado AI. Returns model slugs, display names, credit costs, and descriptions. Use this to help users pick the right model for their needs.

Parameters (JSON Schema)
Name | Required | Description | Default
category | No | Filter by media type. Currently only 'image' is supported. | -
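
For orientation, a call to this tool over MCP is a JSON-RPC 2.0 `tools/call` request. The sketch below only builds and prints the message; the request id is arbitrary, and the transport (the Streamable HTTP endpoint) and authentication are left out.

```python
import json


def build_tools_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return json.dumps(message)


# 'image' is the only category value the schema currently documents.
payload = build_tools_call(1, "models_list", {"category": "image"})
print(payload)
```

Because `category` is optional, passing an empty `arguments` object should also be valid according to the schema shown above.
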
Behavior3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations already declare readOnlyHint and idempotentHint, so the behavioral burden is low. The description adds context about return content but no additional behavioral traits beyond what annotations provide.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences, front-loaded with verb and resource, no redundancy. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

The description explains what is returned (slugs, names, costs, descriptions) despite the absence of an output schema. It mentions the optional parameter only implicitly via 'all available'. It lacks details on pagination and filtering behavior, but it is sufficient for a simple list tool.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 100% coverage for its single parameter (category). The description does not add extra meaning beyond the schema's description, so baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool lists all available AI image generation models, specifies the return fields (slugs, names, credit costs, descriptions), and differentiates from sibling generation tools by being a list/retrieval operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides high-level usage context ('help users pick the right model') but does not state when not to use the tool or mention alternative tools. No comparison with sibling tools is given.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

prepare_image_upload (Prepare Image Upload) A

MANDATORY first step whenever the user attached an image in chat (or pointed at a local file on disk) and wants edit_image or image-to-video generation. Returns a signed PUT URL plus a file_id. After this tool: either (a) the inline upload widget will let the user drop the file and auto-continue (Claude.ai web), or (b) you run a curl PUT yourself if you have shell access (Claude Desktop / Claude Code) — the response text contains a ready-to-run curl command. Then call edit_image or generate_video with file_id=. edit_image and generate_video do NOT accept base64 — calling them with raw image bytes WILL fail. This tool is the only working path for chat attachments. Set purpose to 'edit' or 'video' so the upload widget points the user at the right downstream tool.
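The shell-access path (b) amounts to an HTTP PUT of the image bytes to the signed URL, followed by a downstream call that passes only the `file_id`. The following is a minimal sketch of that sequence, not the server's documented contract: the `upload_url` and `file_id` key names, the example URL, and the use of a `Content-Type` header are all assumptions, and the request is built but not sent so the snippet runs offline.

```python
import urllib.request


def build_upload_request(upload_url: str, image_bytes: bytes,
                         mime_type: str = "image/png") -> urllib.request.Request:
    """Build (but do not send) a PUT request for the signed upload URL."""
    return urllib.request.Request(
        upload_url,
        data=image_bytes,
        method="PUT",
        # Assumption: the signed URL expects the same MIME type that was
        # passed to prepare_image_upload.
        headers={"Content-Type": mime_type},
    )


# Hypothetical response fields; the real names come from the tool's response.
prepared = {
    "upload_url": "https://uploads.example.com/signed-put",
    "file_id": "file_123",
}

request = build_upload_request(prepared["upload_url"], b"placeholder image bytes")
# urllib.request.urlopen(request) would perform the actual upload.

# After the upload succeeds, pass only the file_id downstream, never raw bytes.
# Other edit_image arguments (such as the edit instruction) are omitted here.
edit_arguments = {"file_id": prepared["file_id"]}
print(request.get_method(), request.full_url, edit_arguments)
```

When the tool's response already contains a ready-to-run curl command, running that command is the simpler route; the sketch only shows what that command does.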

Parameters (JSON Schema)
Name | Required | Description | Default
purpose | No | What the user wants done with the uploaded image. 'edit' (default) for edit_image. 'video' for generate_video image-to-video. The upload widget uses this to nudge you toward the right downstream tool after upload. | -
mime_type | No | MIME type of the image the user will upload. Defaults to image/png. Accepts png, jpeg, webp. | -
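
A client can validate both parameters before calling the tool. The helper below is a hypothetical sketch: `prepare_arguments` and the extension map are not part of the server, and it assumes `mime_type` takes a full MIME string such as `image/png`, as the stated default suggests.

```python
from pathlib import Path

# File extensions mapped to the three image types the schema says are accepted.
ACCEPTED = {
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def prepare_arguments(filename: str, purpose: str = "edit") -> dict:
    """Build prepare_image_upload arguments, rejecting unsupported inputs."""
    if purpose not in ("edit", "video"):
        raise ValueError(f"purpose must be 'edit' or 'video', got {purpose!r}")
    suffix = Path(filename).suffix.lower()
    if suffix not in ACCEPTED:
        raise ValueError(f"unsupported image type: {suffix or '(no extension)'}")
    return {"purpose": purpose, "mime_type": ACCEPTED[suffix]}


print(prepare_arguments("photo.JPG", purpose="video"))
```

Failing early on an unsupported extension avoids requesting a signed URL for a file the downstream tools would presumably reject.
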
Behavior5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

The description thoroughly discloses the tool's behavior: it returns a signed URL and file_id, requires a follow-up step (widget upload or curl command), and warns that downstream tools fail when given raw bytes. Annotations are minimal (no read-only or destructive hints), so the description carries the full burden and does so well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is comprehensive but slightly lengthy; however, every sentence adds critical information for the workflow. It is front-loaded with the mandatory nature and clearly orders steps. Could be slightly more concise, but the information density justifies the length.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (preparation step without output schema), the description is complete: it covers the return values, post-usage steps, how to handle the upload in different environments, and prerequisites. No gaps in context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema already has clear descriptions for 'purpose' and 'mime_type' with enums and defaults. The description adds value by explaining why each parameter matters (purpose controls downstream tool nudging) and emphasizing defaults. Schema coverage is 100%, so the description enhances rather than replaces schema info.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool is the mandatory first step for uploading images for editing or video generation, distinguishing it from sibling tools like edit_image and generate_video. It specifies that it returns a signed PUT URL and file_id, providing a clear purpose.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Explicitly says this is mandatory when the user attaches an image and wants to use edit_image or generate_video. It explains that downstream tools do not accept base64 and will fail if called directly, providing clear guidance on when to use this tool and why alternatives won't work.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Resources