Avocado AI
Server Details
Create ads inside any AI assistant with Avocado: create, edit, and make AI UGC in chat.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.3/5 across 16 of 16 tools scored.
Each tool has a clearly distinct purpose. While generate_image and generate_image_to_storyboard both create images, their descriptions make clear that the latter places the result on a storyboard; the same holds for the video variants. Other tools like check_job, models_list, and describe_avocado address unique needs without overlap.
All tool names use consistent snake_case and follow a verb_noun pattern (e.g., check_job, create_storyboard, generate_image). Compound names like generate_image_to_storyboard maintain the pattern. No mixing of conventions.
16 tools cover a broad range of media generation and storyboard management, which justifies the count. While slightly on the heavy side, each tool serves a specific function without redundancy.
Core generation workflows are well-covered, but storyboard management lacks update/delete tools, and there is no tool for deleting generated media or managing accounts beyond credit checks. These are minor gaps that can be worked around via the web interface.
Available Tools
16 tools
account_check_credits [A] [Read-only] [Idempotent]
Check your Avocado AI credit balance. Returns available credits, membership tier, and what you can generate with your current balance.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate read-only, idempotent, non-destructive behavior. The description adds value by detailing the return data (credits, tier, generation capability), which is not covered by annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two short, clear sentences that convey all necessary information without extraneous words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description covers the return values adequately. For a simple read tool with no inputs, this is complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With zero parameters and 100% schema coverage, the description does not need to explain parameters. It provides no further detail but also has no gaps.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it checks credit balance and specifies what it returns (credits, membership tier, what can be generated). It is distinct from sibling tools like generate_image, which consume credits.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use the tool (to check credit balance) but does not explicitly state when not to use it or provide alternatives. It implies usage before generation but lacks direct guidance.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
check_job [A] [Read-only] [Idempotent]
Always call this tool after generate_image, edit_image, or generate_video to retrieve the result. Pass the jobId returned by the generation tool. Returns status (queued, processing, completed, failed), result URLs when ready, and error details on failure. When an image job is completed, the resulting image(s) are returned as inline image content blocks so they render directly in chat alongside the JSON metadata. If status is queued or processing, wait 5 to 10 seconds and call again; image jobs typically finish in 10 to 60 seconds, video jobs in 2 to 10 minutes.
| Name | Required | Description | Default |
|---|---|---|---|
| jobId | Yes | The jobId returned by a generation tool. | |
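The polling guidance in the description can be sketched as a small loop. This is a minimal illustration, not Avocado's client code: `call_tool` is a hypothetical stand-in for whatever MCP client function invokes a tool and returns its parsed JSON result, and the `status` key is assumed from the description's wording.

```python
import time
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def poll_job(call_tool: ToolCaller,
             job_id: str,
             interval_s: float = 5.0,
             timeout_s: float = 600.0,
             sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Call check_job until the job completes or fails, or the timeout elapses.

    call_tool is a hypothetical MCP client helper; the "status" key is an
    assumption based on the tool description, not a documented schema.
    """
    waited = 0.0
    while True:
        result = call_tool("check_job", {"jobId": job_id})
        if result.get("status") in ("completed", "failed"):
            return result
        if waited >= timeout_s:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_s:.0f}s")
        # The description suggests waiting 5 to 10 seconds between calls.
        sleep(interval_s)
        waited += interval_s
```

The default timeout of 600 seconds mirrors the upper end of the stated video job duration (2 to 10 minutes); image jobs would normally return much sooner.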
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description adds significant behavioral detail beyond the annotations: it specifies return values (status, URLs, error details, inline image blocks), polling guidance, and typical durations. No contradiction with annotations (readOnly, idempotent, non-destructive).
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a concise paragraph that front-loads the critical usage instruction. It could be structured with bullet points for improved readability, but it efficiently conveys all necessary information.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity and lack of output schema, the description fully explains the return structure, behavior (polling), and typical timings. It covers all essential information for an agent to use the tool correctly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter jobId is fully described in the schema. The description repeats the same information without adding new semantics. With 100% schema coverage, baseline 3 is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool checks the status of generation jobs and retrieves results. It explicitly mentions the generation tools it follows (generate_image, edit_image, generate_video) and what it returns, distinguishing it from siblings that create jobs.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides clear when-to-use instructions ('always call this tool after...'), what to pass (jobId), and retry behavior with time estimates. It doesn't explicitly list when not to use, but the context is sufficiently clear.
create_storyboard [A]
Create a new empty Avocado AI storyboard for the user. Returns the new board's id and direct URL so the user can open it.
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | Title for the new storyboard. Defaults to 'Untitled Storyboard'. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are minimal (readOnlyHint: false, destructiveHint: false), so the description adds valuable context: it creates an empty board and returns the id and URL. No contradictions with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Two concise sentences that are front-loaded with the key purpose and return information. No unnecessary words or details.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a simple creation tool with 1 optional parameter and no output schema, the description adequately explains the action and return values (id and URL). Could mention error handling or authorization but not essential.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage with a description for the only parameter (title). The description does not add extra meaning beyond what the schema already provides, meeting the baseline for high schema coverage.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the verb 'Create', the resource 'new empty Avocado AI storyboard', and the return values (board id and direct URL). It effectively distinguishes from sibling tools like 'list_storyboards' and other creation tools.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for creating a storyboard but does not explicitly guide when to use it versus alternatives like 'generate_image_to_storyboard' or 'generate_video_to_storyboard'. No exclusions or prerequisites are mentioned.
describe_avocado [A] [Read-only] [Idempotent]
Describe what Avocado AI is and what it can do. Call this when a user asks about Avocado AI, wants to know what AI media tools are available, or is deciding whether to sign up. Returns capabilities, supported models, use cases, pricing overview, and how to connect.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, destructiveHint=false. The description adds value by listing what the tool returns (capabilities, supported models, pricing, etc.), providing context beyond annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise with three sentences. It front-loads the purpose and adds usage scenarios efficiently. Could be slightly more streamlined, but overall well-structured.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, no output schema), the description covers all necessary context: purpose, when to use, and what information is returned. Annotations provide safety cues. No gaps.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema has zero parameters with 100% schema coverage, so baseline is 4. The description does not need to add parameter information, and it correctly focuses on the tool's function.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: describing Avocado AI, its capabilities, and offerings. It specifies the verb 'Describe' and the resource 'Avocado AI', and distinguishes from siblings by being the only tool for general platform info.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly tells when to use this tool (when user asks about Avocado AI, wants available tools, or is deciding to sign up). It doesn't specify when not to use, but for a non-ambiguous tool this is sufficient.
edit_image: Edit Image [A]
Modify an existing image. REQUIRED input: exactly one of file_id OR image_url. base64 is NOT accepted; do not try to pass image bytes as a tool argument, because the call will be rejected. For chat-attached images you MUST first call prepare_image_upload to get a signed PUT URL, upload the bytes there (via the inline widget on Claude.ai, or via curl on Claude Desktop / Claude Code), then call this tool with the returned file_id. For URLs the user has pasted, use image_url directly. Returns a jobId immediately; call check_job with the jobId to retrieve the edited image inline. Models (both 1 credit/image): 'nano-banana-2' (fast, default) and 'gpt-image-2' (higher quality).
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Edit model. 'nano-banana-2' is fast and cheap (default). 'gpt-image-2' is higher quality but costs more credits. | |
| prompt | Yes | What to change about the image. Be specific. Example: 'Replace the background with a sunset beach' or 'Add reading glasses to the person'. | |
| file_id | No | file_id returned by prepare_image_upload after the image was uploaded to the signed URL. This is the ONLY supported path for chat-attached images. Format: 'mcp-source/{userId}/{uuid}.{ext}'. | |
| image_url | No | HTTPS URL of the image to edit. Use only when the user pasted a public URL. Otherwise call prepare_image_upload first. | |
| num_images | No | Number of edited variants to produce (1-4). Defaults to 1. | |
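The "exactly one of file_id OR image_url" rule is not enforced by the schema (only prompt is required), so a caller may want to check it before sending the request. The helper below is a hypothetical client-side sketch; the function name and error messages are illustrative and not part of the Avocado API.

```python
from typing import Any, Dict, Optional


def build_edit_image_args(prompt: str,
                          file_id: Optional[str] = None,
                          image_url: Optional[str] = None,
                          model: str = "nano-banana-2",
                          num_images: int = 1) -> Dict[str, Any]:
    """Build edit_image arguments, enforcing the rules stated in the description.

    Hypothetical helper: the checks mirror the documented constraints, but the
    server may validate differently.
    """
    # Exactly one image source must be supplied.
    if (file_id is None) == (image_url is None):
        raise ValueError("pass exactly one of file_id or image_url")
    # image_url is documented as an HTTPS URL; this also rejects data: URIs.
    if image_url is not None and not image_url.startswith("https://"):
        raise ValueError("image_url must be an HTTPS URL")
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be between 1 and 4")

    args: Dict[str, Any] = {"prompt": prompt, "model": model, "num_images": num_images}
    if file_id is not None:
        args["file_id"] = file_id
    else:
        args["image_url"] = image_url
    return args
```

For a chat-attached image, the file_id would come from a prior prepare_image_upload call, as the description requires.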
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations are minimal (no destructive/readOnly hints), but the description adds crucial behavioral details: returns a jobId immediately, requires polling check_job for result, and mentions model options and credit costs. It does not contradict annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph but well-structured, covering input requirements, workflow, async behavior, and models. It is concise and front-loaded, though bullet points could improve readability.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool has 5 parameters (1 required) and no output schema, the description covers all necessary aspects: input modes, workflow steps, async return, and model choices. It fits well among sibling tools (e.g., prepare_image_upload, generate_image, check_job) and provides complete guidance for an AI agent.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% and each parameter is described, but the description adds significant value beyond the schema: it clarifies that exactly one of file_id or image_url is required (despite schema only requiring prompt), explains the file_id format, rejects base64, and specifies default model and credit usage.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Modify an existing image', specifies the two input modes (file_id vs image_url), and explicitly warns against base64. It distinguishes from sibling tools like generate_image by focusing on editing an existing image rather than generation.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use file_id (chat-attached images via prepare_image_upload) vs image_url (user-pasted URLs). It also states the prerequisite workflow for chat-attached images and warns not to pass base64, which is a common mistake.
generate_image: Generate Image [A]
Generate an AI image using Avocado AI. Returns a jobId immediately; image generation completes in 10-60 seconds. After calling, use the check_job tool with the returned jobId to retrieve the result. Once complete, check_job returns the image inline so it renders directly in chat. Run models_list to see available models. Costs 1-6 credits per image depending on model and quality.
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model slug from models_list. Defaults to 'gpt-image-2'. | |
| prompt | Yes | Text description of the image to generate. Be descriptive for best results. | |
| quality | No | Quality tier. Only applies to 'gpt-image-2'. low=1 credit, medium=1-2 credits, high=4-6 credits per image (varies by aspect). Ignored by other models. Defaults to 'high'. | |
| num_images | No | Number of images to generate (1-4). Defaults to 1. | |
| aspect_ratio | No | Image aspect ratio. Defaults to '1:1'. | |
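Because the quality tiers are priced as ranges, a caller can only bound the cost of a request ahead of time. The sketch below computes that bound for 'gpt-image-2' from the figures in the quality row; the function and constant names are illustrative, and costs for other models are not covered because the listing does not state them.

```python
from typing import Dict, Tuple

# Per-image credit ranges for 'gpt-image-2', copied from the quality parameter.
QUALITY_CREDITS: Dict[str, Tuple[int, int]] = {
    "low": (1, 1),
    "medium": (1, 2),
    "high": (4, 6),
}


def estimate_image_credits(quality: str = "high", num_images: int = 1) -> Tuple[int, int]:
    """Return the (minimum, maximum) credits a gpt-image-2 request could cost."""
    if quality not in QUALITY_CREDITS:
        raise ValueError(f"unknown quality tier: {quality!r}")
    if not 1 <= num_images <= 4:
        raise ValueError("num_images must be between 1 and 4")
    low, high = QUALITY_CREDITS[quality]
    return low * num_images, high * num_images
```

Comparing the upper bound against the balance reported by account_check_credits before calling generate_image is one way to avoid a failed job.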
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description discloses async behavior, typical completion time (10-60 seconds), and credit cost (1-6 per image). It also explains that the image is rendered inline via check_job. Annotations are all false, so no contradiction, and the description adds valuable context beyond the annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single paragraph with clear front-loading: first sentence states purpose, then async flow, follow-up tool, model listing, and cost. Every sentence adds value without redundancy.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's async nature, multiple parameters, and cost model, the description covers the essential workflow and constraints. It explains what to do with the jobId and mentions inline rendering. Minor omission: no mention of rate limits or failure scenarios, but overall complete for typical use.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%, providing baseline 3. The description adds value by explaining credit costs for quality tiers, clarifying that quality only applies to 'gpt-image-2', and recommending descriptive prompts. This goes beyond the schema's descriptions.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates an AI image using Avocado AI, and explains the async behavior (returns jobId, completes in 10-60 seconds). It distinguishes itself from sibling tools like check_job (for retrieval) and edit_image (for editing).
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly advises using check_job to retrieve the result and running models_list to see available models. It also mentions cost and time. However, it does not explicitly state when not to use this tool versus alternatives, though sibling tools provide context.
generate_image_to_storyboard: Generate Image to Storyboard [A]
Generate an AI image and place it directly on a user's Avocado AI storyboard. Drops 'Generating...' placeholder(s) on the board immediately, then the webhook swaps each placeholder for the final image when generation completes (10-60s). Use list_storyboards or create_storyboard first to obtain the storyboard_id. If the user has the storyboard tab open, they may need to refresh once for the image to appear (the canvas does not yet support live realtime updates from MCP). Costs match generate_image (1-6 credits per image depending on model and quality).
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model slug from models_list. Defaults to 'gpt-image-2'. | |
| prompt | Yes | Text description of the image to generate. | |
| quality | No | Quality tier ('gpt-image-2' only). Defaults to 'high'. | |
| num_images | No | Number of images to generate (1-4). Defaults to 1. One placeholder per image. | |
| aspect_ratio | No | Image aspect ratio. Defaults to '1:1'. Also controls placeholder shape on the board. | |
| storyboard_id | Yes | The id of the storyboard to add the image to. Must be owned by, or shared with edit access to, the authenticated user. | |
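The prerequisite described above (obtain a storyboard_id first, then generate onto that board) might be chained as follows. This is a sketch under assumptions: `call_tool` is a hypothetical MCP client helper, and the `"id"` key is a guess at how create_storyboard names the board id it returns, since no output schema is published.

```python
from typing import Any, Callable, Dict

ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def generate_onto_new_board(call_tool: ToolCaller,
                            prompt: str,
                            title: str = "Untitled Storyboard",
                            num_images: int = 1) -> Dict[str, Any]:
    """Create a storyboard, then place generated image(s) on it.

    call_tool and the "id" response key are assumptions for illustration.
    """
    board = call_tool("create_storyboard", {"title": title})
    storyboard_id = board["id"]  # assumed key name; check the real response
    return call_tool("generate_image_to_storyboard", {
        "storyboard_id": storyboard_id,
        "prompt": prompt,
        "num_images": num_images,
    })
```

To reuse an existing board instead, the first call would be replaced by list_storyboards and a selection step.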
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses async behavior with placeholder drops and webhook swaps, cost implications (1-6 credits), and the need for a page refresh. This adds context beyond annotations (readOnlyHint=false, destructiveHint=false), with no contradiction.
Is the description appropriately sized, front-loaded, and free of redundancy?
A single paragraph that front-loads the main purpose, then explains mechanism, prerequisites, user experience, and costs. Each sentence adds value, though slightly dense; could be split for readability.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers key aspects: behavior, prerequisites, UI impact, and cost. However, lacks output/return value description and error handling, which would be helpful given no output schema.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100% with descriptions. Description adds meaning to num_images (one placeholder per image) and aspect_ratio (controls placeholder shape), enhancing understanding beyond the schema.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool generates an AI image and places it on a storyboard, distinguishing it from sibling tools like generate_image (no storyboard placement) and generate_video_to_storyboard (video version).
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises to use list_storyboards or create_storyboard first to obtain the storyboard_id. Implicitly differentiates from generate_image for standalone generation, but does not explicitly state when not to use this tool.
generate_music [A]
Generate AI music using Avocado AI. Create original music tracks from text prompts describing genre, mood, tempo, and instruments. Tracks can be 30 seconds to 5 minutes. Costs 4 credits per 30-second block. The track is saved to your Music Studio at https://www.avocadoai.co/music-studio.
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | Title for the music track. | |
| prompt | Yes | Description of the music to generate. Include genre, mood, tempo, instruments, and style. Example: 'Upbeat electronic dance music with synth leads, punchy drums, 128 BPM, energetic and euphoric mood' | |
| duration_seconds | No | Duration in seconds (30-300). Defaults to 30. | |
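The pricing rule of 4 credits per 30-second block can be turned into a quick estimate. The listing does not say how a partial block is billed, so the sketch below assumes it is rounded up to a full block; treat that rounding, and the function name, as assumptions.

```python
import math

CREDITS_PER_BLOCK = 4   # from the tool description
BLOCK_SECONDS = 30      # from the tool description


def estimate_music_credits(duration_seconds: int = 30) -> int:
    """Estimate generate_music cost, assuming partial blocks are rounded up."""
    if not 30 <= duration_seconds <= 300:
        raise ValueError("duration_seconds must be between 30 and 300")
    blocks = math.ceil(duration_seconds / BLOCK_SECONDS)  # rounding up is an assumption
    return blocks * CREDITS_PER_BLOCK
```

Under that assumption the default 30-second track costs 4 credits and a full 5-minute track costs 40.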
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses credit consumption and persistent storage behavior (saved to Music Studio), adding value beyond the annotations which only indicate non-destructive/non-readOnly. No contradiction with annotations.
Is the description appropriately sized, front-loaded, and free of redundancy?
Five concise sentences with no redundancy. The first sentence immediately clarifies the tool's purpose, and subsequent sentences add necessary constraints without verbosity.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Despite lacking an output schema, the description adequately explains the outcome (track saved to Music Studio) and the cost model, making the tool's behavior fully understandable for invocation.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, and the description does not add significant new meaning beyond the parameter descriptions already present in the schema. Example provided in schema, so baseline score is appropriate.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states 'Generate AI music' and details the creation of original tracks from text prompts, clearly distinguishing it from sibling tools like generate_sfx or generate_speech.
Does the description explain when to use this tool, when not to, or what alternatives exist?
It provides concrete constraints on duration (30s-5min) and cost (4 credits per 30s block), along with the output destination (Music Studio URL). While it doesn't explicitly state when not to use, the context is sufficient for decision-making.
generate_sfx [A]
Generate AI sound effects using Avocado AI. Create short sound effects from text prompts describing the sound. Effects can be 1 to 22 seconds. Costs 1 credit per 5-second block. The effect is saved to your Music Studio at https://www.avocadoai.co/music-studio.
| Name | Required | Description | Default |
|---|---|---|---|
| title | No | Title for the sound effect. | |
| prompt | Yes | Description of the sound effect to generate. Example: 'Glass shattering on a tile floor with sharp reverberation' or 'Heavy footsteps on wet concrete in a dark alley' | |
| duration_seconds | No | Duration in seconds (1-22). Defaults to 5. | |
| prompt_influence | No | How closely to follow the prompt (0-1). Higher = more literal, lower = more creative. Defaults to 0.35. | |
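The numeric ranges in the table (duration 1-22 seconds, prompt_influence 0-1) are easy to violate from free-form user input, so a caller may prefer to validate them locally. The builder below is a hypothetical sketch; whether the server accepts fractional durations is not stated, and the helper simply passes the value through.

```python
from typing import Any, Dict, Optional


def build_sfx_args(prompt: str,
                   duration_seconds: float = 5,
                   prompt_influence: float = 0.35,
                   title: Optional[str] = None) -> Dict[str, Any]:
    """Build generate_sfx arguments, checking the documented ranges.

    Hypothetical helper; defaults are copied from the parameter table.
    """
    if not 1 <= duration_seconds <= 22:
        raise ValueError("duration_seconds must be between 1 and 22")
    if not 0 <= prompt_influence <= 1:
        raise ValueError("prompt_influence must be between 0 and 1")
    args: Dict[str, Any] = {
        "prompt": prompt,
        "duration_seconds": duration_seconds,
        "prompt_influence": prompt_influence,
    }
    if title is not None:
        args["title"] = title  # title is optional, so omit it when unset
    return args
```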
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations (all false), the description adds important behavioral details: credit cost per 5-second block, duration limits, and where the effect is saved (Music Studio URL).
Is the description appropriately sized, front-loaded, and free of redundancy?
Five short sentences, front-loaded with the main purpose, then constraints and cost. No fluff.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers purpose, parameters, cost, and destination. However, there is no output schema, and the presence of sibling tools such as check_job suggests that the tool returns a job ID asynchronously. The description does not say so, which is a significant gap for an agent.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema coverage is 100%. The description restates the duration range (1-22 seconds) and adds cost and save-location context, which helps interpretation; the prompt_influence default (0.35) and the prompt examples appear only in the schema.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it generates AI sound effects from text prompts, distinguishing from sibling tools like generate_music or generate_speech by specifying short sound effects and saving to Music Studio.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage for short sound effects (1-22 seconds) and mentions credit cost, but does not explicitly say when to use this tool versus alternatives or provide exclusion criteria.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_speech · A
Convert text to natural-sounding speech using Avocado AI. Supports multiple voices and languages. Costs 3 credits per 1000 characters. Audio will appear in your Avocado AI workspace.
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | The text to convert to speech. | |
| voice | No | Voice to use. Defaults to 'rachel'. Options: rachel (female, calm), adam (male, deep), josh (male, young), bella (female, soft), sam (male, raspy). | |
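A minimal sketch of how a client might assemble the arguments for this tool and estimate its cost before calling it. The voice names come from the table above. The helper names are hypothetical, and the pro-rata cost estimate is an assumption, since the listing does not say how partial 1000-character blocks are billed.

```python
# Voice options as listed in the parameter table.
VOICES = {"rachel", "adam", "josh", "bella", "sam"}


def build_speech_arguments(text: str, voice: str = "rachel") -> dict:
    """Validate inputs and return the arguments object for generate_speech."""
    if not text:
        raise ValueError("text is required")
    if voice not in VOICES:
        raise ValueError(f"unknown voice: {voice!r}")
    return {"text": text, "voice": voice}


def estimate_speech_credits(text: str) -> float:
    """Estimate cost at 3 credits per 1000 characters.

    Assumption: cost scales pro-rata with length; actual rounding is unspecified.
    """
    return 3 * len(text) / 1000


arguments = build_speech_arguments("Welcome to the spring sale.", voice="bella")
print(arguments, estimate_speech_credits(arguments["text"]))
```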
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Beyond annotations, the description adds the cost per 1000 characters and where the audio appears, which are useful behavioral details not present in annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Four short sentences, front-loaded with core purpose, then cost and result location. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the simple input schema and no output schema, the description covers the major aspects: what it does, cost, and where the result appears. Error and latency details are missing, but the coverage is adequate.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, the baseline is 3. The description adds cost context but does not add new parameter-level meaning beyond the schema's descriptions.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it converts text to speech using Avocado AI, mentions multiple voices and languages, and distinguishes from siblings like generate_music or generate_sfx.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies when to use (text-to-speech) but does not explicitly provide when-not or alternatives, leaving the agent to infer based on sibling tool names.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_video · Generate Video · A
Generate an AI video. Eight models: seedance-2.0-t2v / -t2v-fast (text only), seedance-2.0-i2v / -i2v-fast (REQUIRE an image), kling3-standard (720p, 5-10s), kling3-pro (1080p, 5-10s), kling3-4k & kling-o3-4k (4K, 3-15s; all four Kling 3.x variants support BOTH text-to-video and image-to-video — supplying image_url or file_id automatically picks image mode). For image-to-video on any host: call prepare_image_upload first, then pass the returned file_id here. Renders take 2-10 minutes; the inline result card polls for completion. Pricing is per-second, varies by model and resolution.
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model. Defaults to 'seedance-2.0-t2v'. Use the -i2v variant or any kling3 variant for image-to-video. | |
| prompt | Yes | Text description of the video. For image-to-video, describe the motion/action you want applied to the source image. | |
| file_id | No | file_id from prepare_image_upload — preferred for chat attachments. Required for seedance-2.0-i2v / -i2v-fast. Optional for kling3-* (presence triggers image-to-video mode). | |
| duration | No | Video duration in seconds. Per-model bounds: seedance i2v 4-15, seedance t2v 5-15, kling3-standard/pro 5-10, kling3-4k/o3-4k 3-15. Defaults to 5. | |
| fast_mode | No | Legacy alias. true picks seedance-2.0-t2v-fast or seedance-2.0-i2v-fast when no explicit model was given. Prefer setting model directly. | |
| image_url | No | HTTPS URL of the source image. Use only if you already have a public URL; otherwise call prepare_image_upload and pass file_id. | |
| resolution | No | Video resolution. Only meaningful for seedance (480p/720p/1080p; 1080p not allowed with seedance fast). Kling models lock resolution by variant. | |
| aspect_ratio | No | Aspect ratio. Defaults to '16:9'. Ignored for image-to-video (aspect derives from input). | |
| generate_audio | No | Generate audio (Kling 3 standard/pro only). Ignored for other models. | |
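The per-model constraints above are easy to get wrong, so a client may want to check them before spending credits. The following is a sketch only: `build_video_arguments` is a hypothetical helper, the bounds are copied from the `duration` row, and it assumes the `-fast` variants share the bounds of their base models, which the listing implies but does not spell out.

```python
from typing import Optional

# Duration bounds in seconds, taken from the `duration` row above.
# Assumption: each -fast variant shares the bounds of its base model.
DURATION_BOUNDS = {
    "seedance-2.0-t2v": (5, 15),
    "seedance-2.0-t2v-fast": (5, 15),
    "seedance-2.0-i2v": (4, 15),
    "seedance-2.0-i2v-fast": (4, 15),
    "kling3-standard": (5, 10),
    "kling3-pro": (5, 10),
    "kling3-4k": (3, 15),
    "kling-o3-4k": (3, 15),
}


def build_video_arguments(
    prompt: str,
    model: str = "seedance-2.0-t2v",
    duration: int = 5,
    file_id: Optional[str] = None,
    image_url: Optional[str] = None,
) -> dict:
    """Validate model, duration, and image inputs; return the arguments object."""
    if model not in DURATION_BOUNDS:
        raise ValueError(f"unknown model: {model!r}")
    low, high = DURATION_BOUNDS[model]
    if not low <= duration <= high:
        raise ValueError(f"{model} supports {low}-{high}s, got {duration}s")
    has_image = file_id is not None or image_url is not None
    if "-i2v" in model and not has_image:
        raise ValueError(f"{model} requires file_id or image_url")
    arguments = {"model": model, "prompt": prompt, "duration": duration}
    # Prefer file_id (from prepare_image_upload) over a public image_url.
    if file_id is not None:
        arguments["file_id"] = file_id
    elif image_url is not None:
        arguments["image_url"] = image_url
    return arguments


print(build_video_arguments("slow pan across a beach", model="kling3-pro", duration=10))
```

The server remains the source of truth; a local check like this only avoids round trips for requests that are clearly out of range.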
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description reveals significant behavioral traits: 'Renders take 2-10 minutes' and 'the inline result card polls for completion.' It also explains prerequisites for image-to-video. Annotations (readOnlyHint=false, etc.) are not contradicted, and the description adds value beyond them.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single dense paragraph that front-loads the main action. It covers all necessary points without fluff, though it could benefit from structured formatting (e.g., bullet points for models) for easier scanning. Still, it is appropriately concise for the complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 9 parameters (1 required), no output schema, the description covers model selection, parameter constraints, prerequisites (prepare_image_upload), timing, and pricing. It does not explain return values directly, but the polling mechanism implies the result. Adequate for a complex tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With 100% schema coverage, baseline is 3, but the description adds substantial meaning: model parameter details (e.g., kling3 variants support both modes), file_id vs image_url usage, per-model duration bounds, and fast_mode legacy alias. This significantly enhances parameter understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states 'Generate an AI video' and enumerates eight specific models with their capabilities (text-only vs image-to-video), distinguishing this tool from siblings like generate_image. The purpose is specific and unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explains when to use image-to-video (requiring an image) vs text-to-video, and instructs the agent to call prepare_image_upload first for image-to-video. It also notes render times and pricing. It does not explicitly state when not to use this tool (e.g., for still images), though the context makes this clear.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
generate_video_to_storyboard · Generate Video to Storyboard · A
Generate an AI video and place it directly on a user's Avocado AI storyboard. Drops a 'Generating...' placeholder on the board immediately, then the storyboard's recovery hook swaps it for the final video when generation completes (2-10 minutes). Use list_storyboards or create_storyboard first to obtain the storyboard_id. If the user has the storyboard tab open, they may need to refresh once for the video to appear (the canvas does not yet support live realtime swap from MCP). Eight models supported: seedance-2.0-t2v / -t2v-fast (text only), seedance-2.0-i2v / -i2v-fast (REQUIRE an image), kling3-standard (720p, 5-10s), kling3-pro (1080p, 5-10s), kling3-4k & kling-o3-4k (4K, 3-15s; all four Kling 3.x variants support BOTH text-to-video and image-to-video). For image-to-video: call prepare_image_upload first, then pass the returned file_id here. Pricing is per-second, varies by model and resolution.
| Name | Required | Description | Default |
|---|---|---|---|
| model | No | Model. Defaults to 'seedance-2.0-t2v'. Use the -i2v variant or any kling3 variant for image-to-video. | |
| prompt | Yes | Text description of the video. For image-to-video, describe the motion/action you want applied to the source image. | |
| file_id | No | file_id from prepare_image_upload — preferred for chat attachments. Required for seedance-2.0-i2v / -i2v-fast. Optional for kling3-* (presence triggers image-to-video mode). | |
| duration | No | Video duration in seconds. Per-model bounds: seedance i2v 4-15, seedance t2v 5-15, kling3-standard/pro 5-10, kling3-4k/o3-4k 3-15. Defaults to 5. | |
| image_url | No | HTTPS URL of the source image. Use only if you already have a public URL; otherwise call prepare_image_upload and pass file_id. | |
| resolution | No | Video resolution. Only meaningful for seedance (480p/720p/1080p; 1080p not allowed with seedance fast). Kling models lock resolution by variant. | |
| aspect_ratio | No | Aspect ratio. Defaults to '16:9'. Also controls placeholder shape on the board. Ignored for image-to-video (aspect derives from input). | |
| storyboard_id | Yes | The id of the storyboard to add the video to. Must be owned by, or shared with edit access to, the authenticated user. | |
| generate_audio | No | Generate audio (Kling 3 standard/pro only). Ignored for other models. | |
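The description asks the agent to obtain a storyboard_id before calling this tool. The sketch below lays out that sequence as MCP `tools/call` JSON-RPC request bodies. The board list and its `id` and `title` keys are assumptions (the listing names the returned fields but not their exact keys), and `sb_123` is a made-up id.

```python
import json
from itertools import count

_request_ids = count(1)


def tools_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` JSON-RPC request body."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def pick_storyboard_id(boards: list, title: str) -> str:
    """Return the id of the first board with a matching title (key names assumed)."""
    for board in boards:
        if board["title"] == title:
            return board["id"]
    raise LookupError(f"no storyboard titled {title!r}")


# Step 1: request the user's boards.
list_request = tools_call("list_storyboards", {})

# Step 2: hypothetical parsed result of that call.
boards = [{"id": "sb_123", "title": "Launch ad"}]
storyboard_id = pick_storyboard_id(boards, "Launch ad")

# Step 3: generate the video directly onto the chosen board.
video_request = tools_call(
    "generate_video_to_storyboard",
    {
        "storyboard_id": storyboard_id,
        "prompt": "Product rotating on a marble table",
        "model": "kling3-standard",
        "duration": 5,
    },
)
print(json.dumps(video_request, indent=2))
```

When no suitable board exists, create_storyboard would take the place of steps 1 and 2.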
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Discloses that a placeholder is dropped immediately and the recovery hook swaps it after 2-10 minutes. Notes that refresh may be needed for live update. Annotations don't contradict; description adds rich behavioral context beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
A bit lengthy but well-structured with front-loaded main action. Every sentence adds necessary information. Could be slightly trimmed, but no waste.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Covers all essential aspects: prerequisites (storyboard_id, image upload), model selection, duration bounds, pricing, and output behavior (placeholder + replacement). Despite no output schema, the description fully informs the agent of what to expect.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema has 100% coverage of parameter descriptions, but the description adds significant value: explains model variants in detail, per-model duration bounds, difference between file_id and image_url, and defaults. This helps the agent understand parameter constraints and relationships.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states it generates an AI video and places it on a user's Avocado AI storyboard. Distinguishes from siblings like generate_video and generate_image_to_storyboard by specifying the placement on storyboard. Uses specific verb 'Generate' and resource 'video to storyboard'.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly tells the agent to first use list_storyboards or create_storyboard to obtain storyboard_id. Provides guidance on when to use image-to-video vs text-to-video, and that for image-to-video one should call prepare_image_upload first. Also warns about refresh if the tab is open.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
get_started · A · Read-only · Idempotent
Get step-by-step instructions for connecting to Avocado AI via MCP. Call this when a user wants to sign up, authenticate, or connect Avocado AI to their AI assistant (Claude, ChatGPT, Cursor, Windsurf, Claude Code, etc.).
| Name | Required | Description | Default |
|---|---|---|---|
| client | No | Which AI assistant or client the user wants to connect from. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint=true, idempotentHint=true, and destructiveHint=false. The description's 'Get step-by-step instructions' aligns with these hints but adds no new behavioral context beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, with the first stating the core purpose and the second providing usage context. It is concise with no redundant or unnecessary information.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (one optional parameter, no output schema), the description fully covers purpose, usage, and context. No additional details are needed for an agent to correctly invoke it.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The single parameter 'client' is fully documented in the schema with enum values and description. The tool description does not add additional meaning or behavior details for the parameter, so it meets the baseline for 100% schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool provides step-by-step instructions for connecting to Avocado AI via MCP. It specifies the exact scenarios (sign up, authenticate, connect) and lists example clients, making it distinct from sibling tools which are primarily generation and editing tools.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says 'Call this when a user wants to sign up, authenticate, or connect Avocado AI to their AI assistant', giving clear when-to-use guidance. It does not list exclusions, but the context of sibling tools makes the usage obvious.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_storyboards · A · Read-only · Idempotent
List the user's Avocado AI storyboards. Returns owned and shared boards with id, title, last-updated time, thumbnail, and direct URL. Use this to let the user pick an existing board.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already indicate readOnlyHint=true and destructiveHint=false. The description adds value by specifying the scope (owned and shared) and the exact fields returned, which goes beyond annotations.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with the action and resource, followed by output details and usage guidance. No superfluous words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's simplicity (no parameters, no output schema), the description fully covers purpose, output, and usage. Annotations handle safety. No missing elements.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With zero parameters and 100% schema coverage, the baseline is 4. The description does not need to explain parameters, but it compensates by describing the output structure, aiding understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the action (list), the resource (storyboards), and the returned fields (id, title, last-updated time, thumbnail, URL). It distinguishes itself from sibling tools like create_storyboard, which create rather than list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use the tool: 'Use this to let the user pick an existing board.' It does not provide explicit exclusions or alternatives, but no alternative listing tool exists among siblings, making the guidance adequate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
models_list · A · Read-only · Idempotent
List all available AI image generation models on Avocado AI. Returns model slugs, display names, credit costs, and descriptions. Use this to help users pick the right model for their needs.
| Name | Required | Description | Default |
|---|---|---|---|
| category | No | Filter by media type. Currently only 'image' is supported. | |
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Annotations already declare readOnlyHint and idempotentHint, so the behavioral burden is low. The description adds context about return content but no additional behavioral traits beyond what annotations provide.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, front-loaded with verb and resource, no redundancy. Every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
The description explains what is returned (slugs, names, costs, descriptions) despite no output schema. It mentions the optional parameter implicitly via 'all available'. It lacks details on pagination or filtering behavior but is sufficient for a simple list tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% coverage for its single parameter (category). The description does not add extra meaning beyond the schema's description, so baseline 3 is appropriate.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool lists all available AI image generation models, specifies the return fields (slugs, names, credit costs, descriptions), and differentiates from sibling generation tools by being a list/retrieval operation.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides a high-level usage context ('help users pick the right model') but does not explicitly state when not to use or mention alternative tools. No sibling comparison is given.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
prepare_image_upload · Prepare Image Upload · A
MANDATORY first step whenever the user attached an image in chat (or pointed at a local file on disk) and wants edit_image or image-to-video generation. Returns a signed PUT URL plus a file_id. After this tool: either (a) the inline upload widget will let the user drop the file and auto-continue (Claude.ai web), or (b) you run a curl PUT yourself if you have shell access (Claude Desktop / Claude Code) — the response text contains a ready-to-run curl command. Then call edit_image or generate_video with file_id=. edit_image and generate_video do NOT accept base64 — calling them with raw image bytes WILL fail. This tool is the only working path for chat attachments. Set purpose to 'edit' or 'video' so the upload widget points the user at the right downstream tool.
| Name | Required | Description | Default |
|---|---|---|---|
| purpose | No | What the user wants done with the uploaded image. 'edit' (default) for edit_image. 'video' for generate_video image-to-video. The upload widget uses this to nudge you toward the right downstream tool after upload. | |
| mime_type | No | MIME type of the image the user will upload. Defaults to image/png. Accepts png, jpeg, webp. | |
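For clients with shell access, the tool's response already includes a ready-to-run curl command, and that command is the authoritative one. Purely as an illustration, the same upload could be expressed in Python as below. The URL is a placeholder, `build_upload_request` is a hypothetical helper, and sending a `Content-Type` header that matches `mime_type` is an assumption about what the signed URL expects.

```python
import urllib.request

# MIME types the parameter table says are accepted.
ALLOWED_MIME_TYPES = {"image/png", "image/jpeg", "image/webp"}


def build_upload_request(
    upload_url: str, image_bytes: bytes, mime_type: str = "image/png"
) -> urllib.request.Request:
    """Build (but do not send) an HTTP PUT request for the signed upload URL."""
    if mime_type not in ALLOWED_MIME_TYPES:
        raise ValueError(f"unsupported mime_type: {mime_type!r}")
    return urllib.request.Request(
        upload_url,
        data=image_bytes,
        method="PUT",
        headers={"Content-Type": mime_type},
    )


# Placeholder values standing in for the tool's real response and the user's file.
request = build_upload_request(
    "https://uploads.example.com/signed-put", b"placeholder image bytes"
)
print(request.get_method(), request.full_url)

# To perform the upload, pass the request to urllib.request.urlopen(request),
# then call edit_image or generate_video with the file_id returned by the tool.
```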
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description thoroughly discloses the tool's behavior: it returns a signed URL and file_id, requires subsequent steps (widget upload or curl command), and warns that downstream tools fail with raw bytes. Annotations are minimal (no read-only, destructive hints), so the description carries the full burden and does it well.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is comprehensive but slightly lengthy; however, every sentence adds critical information for the workflow. It is front-loaded with the mandatory nature and clearly orders steps. Could be slightly more concise, but the information density justifies the length.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (preparation step without output schema), the description is complete: it covers the return values, post-usage steps, how to handle the upload in different environments, and prerequisites. No gaps in context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema already has clear descriptions for 'purpose' and 'mime_type' with enums and defaults. The description adds value by explaining why each parameter matters (purpose controls downstream tool nudging) and emphasizing defaults. Schema coverage is 100%, so the description enhances rather than replaces schema info.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool is the mandatory first step for uploading images for editing or video generation, distinguishing it from sibling tools like edit_image and generate_video. It specifies that it returns a signed PUT URL and file_id, providing a clear purpose.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says this is mandatory when the user attaches an image and wants to use edit_image or generate_video. It explains that downstream tools do not accept base64 and will fail if called directly, providing clear guidance on when to use this tool and why alternatives won't work.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
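A small sketch of generating the claim file with a script instead of writing it by hand. `write_glama_json` is a hypothetical helper, and the temporary directory stands in for your server's real web root, which depends on how the site is hosted.

```python
import json
import tempfile
from pathlib import Path


def write_glama_json(web_root: str, email: str) -> Path:
    """Write .well-known/glama.json under web_root and return its path."""
    target = Path(web_root) / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return target


# Demonstration against a throwaway directory; use your actual web root in practice.
with tempfile.TemporaryDirectory() as web_root:
    path = write_glama_json(web_root, "your-email@example.com")
    written = json.loads(path.read_text(encoding="utf-8"))
    print(path.relative_to(web_root))
```

The file must end up reachable at /.well-known/glama.json on the server's domain, so the web server also has to be configured to serve that path.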
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.