WanMCP

wan_generate_video_from_image

Generate AI video from a reference image. Provide a prompt describing motion and content, choose model and duration, and optionally add audio.

Instructions

Generate AI video from a reference image using Wan image-to-video models.

This supports three models:
- wan2.6-i2v: Standard image-to-video generation
- wan2.6-r2v: Reference video-to-video with character/timbre extraction
- wan2.6-i2v-flash: Fast image-to-video generation

Returns:
    Task ID and generated video information including URLs and state.
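
As an illustration of how a client might invoke this tool, the sketch below builds an MCP `tools/call` JSON-RPC request in Python. The `tools/call` method and the `name`/`arguments` envelope follow the MCP specification; the helper name `build_tool_call`, the request id, the prompt text, and the image URL are placeholder assumptions, and the transport (stdio or HTTP) is deliberately left out.

```python
import json


def build_tool_call(request_id, arguments):
    """Wrap tool arguments in an MCP JSON-RPC 2.0 `tools/call` request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "wan_generate_video_from_image",
            "arguments": arguments,
        },
    }


# Placeholder values; only `prompt` and `image_url` are required by the schema.
request = build_tool_call(
    1,
    {
        "prompt": "The camera slowly pushes in while leaves drift past.",
        "image_url": "https://example.com/reference.jpg",
        "model": "wan2.6-i2v",
        "duration": 5,
    },
)

# Serialized form that a client would hand to its transport layer.
payload = json.dumps(request)
```

The response is expected to carry a task ID rather than a finished video, so a caller would normally keep that ID for a later status check.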

Input Schema

Name	Required	Description	Default
`size`	No	The size of the generated video (e.g., '1280x720').
`audio`	No	Whether the generated video should include audio. Default is true.	true
`model`	No	Model to use. Options: 'wan2.6-i2v' (standard image-to-video), 'wan2.6-r2v' (reference video-to-video), 'wan2.6-i2v-flash' (fast image-to-video). Default: 'wan2.6-i2v'.	wan2.6-i2v
`prompt`	Yes	Description of the video motion and content. Describe what should happen in the video.
`timeout`	No	Timeout in seconds. Default is 1800.	1800
`duration`	No	Video duration in seconds. Options: 5, 10, or 15.
`audio_url`	No	URL of reference audio to use in the video.
`image_url`	Yes	URL of the reference image for video generation. The video will be generated based on this image.
`shot_type`	No	Shot type: 'single' for continuous shot, 'multi' for multi-cut editing.
`resolution`	No	Video resolution. Options: '480P', '720P' (default), '1080P'.	720P
`callback_url`	No	Webhook callback URL for asynchronous notifications.
`prompt_extend`	No	Enable LLM-based prompt rewriting. Default is true.	true
`negative_prompt`	No	Content to exclude from the video. Maximum 500 characters.
`reference_video_urls`	No	Comma-separated URLs of reference videos for character/timbre extraction. Used with wan2.6-r2v model.
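
The constraints listed in the table can be checked on the client before a call is made. The following is a minimal sketch, assuming the allowed values are exactly those named in the descriptions above; `check_arguments` is a hypothetical helper and not part of the server, and the rule tying `reference_video_urls` to `wan2.6-r2v` is an interpretation of "Used with wan2.6-r2v model", not a documented server-side check.

```python
ALLOWED_MODELS = {"wan2.6-i2v", "wan2.6-r2v", "wan2.6-i2v-flash"}
ALLOWED_DURATIONS = {5, 10, 15}
ALLOWED_RESOLUTIONS = {"480P", "720P", "1080P"}
ALLOWED_SHOT_TYPES = {"single", "multi"}
MAX_NEGATIVE_PROMPT_CHARS = 500
DEFAULT_MODEL = "wan2.6-i2v"


def check_arguments(args):
    """Return a list of problems found in `args`; an empty list means none."""
    problems = []

    # `prompt` and `image_url` are the only required parameters.
    for name in ("prompt", "image_url"):
        if not args.get(name):
            problems.append(f"missing required parameter: {name}")

    # Enumerated parameters are only checked when the caller supplies them.
    enumerated = {
        "model": ALLOWED_MODELS,
        "duration": ALLOWED_DURATIONS,
        "resolution": ALLOWED_RESOLUTIONS,
        "shot_type": ALLOWED_SHOT_TYPES,
    }
    for name, allowed in enumerated.items():
        if name in args and args[name] not in allowed:
            problems.append(f"unsupported {name}: {args[name]!r}")

    if len(args.get("negative_prompt", "")) > MAX_NEGATIVE_PROMPT_CHARS:
        problems.append("negative_prompt exceeds 500 characters")

    # Assumption: reference videos only make sense with the r2v model.
    model = args.get("model", DEFAULT_MODEL)
    if args.get("reference_video_urls") and model != "wan2.6-r2v":
        problems.append("reference_video_urls is intended for wan2.6-r2v")

    return problems
```

Returning a list of problems rather than raising on the first one lets an agent correct several arguments in a single retry.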

Output Schema

Name	Required	Description	Default
`result`	Yes

Tool Definition Quality

Grade B (3.1/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations provided, so description carries full burden. It mentions it generates video and returns task ID/info, but lacks disclosure of asynchronous behavior, timeouts (though timeout parameter exists), failure scenarios, or authentication needs. Minimal safety or side-effect info.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is 6 sentences, front-loaded with purpose. Model list is useful but duplicates schema parameter description somewhat. Could be more concise, but overall efficient and clear.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex async tool with 14 parameters, the description omits critical context: how to check task status (sibling wan_get_task not mentioned), typical usage flow, error handling, and prerequisites like image URL accessibility. Output schema exists but does not compensate for missing async workflow guidance.
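
The async workflow that this review finds missing could look roughly like the polling loop below. It is a sketch under stated assumptions: `get_task` stands in for whatever function calls the sibling `wan_get_task` tool, and `is_finished` is a caller-supplied predicate, because the task state names are not documented on this page. The default `timeout` of 1800 seconds mirrors the schema default; the 10-second interval is an arbitrary choice.

```python
import time


def wait_for_task(get_task, task_id, is_finished,
                  timeout=1800, interval=10,
                  sleep=time.sleep, clock=time.monotonic):
    """Poll `get_task(task_id)` until `is_finished(task)` is true.

    `get_task` and `is_finished` are hypothetical callables supplied by the
    caller. Raises TimeoutError if the next poll would pass the deadline.
    """
    deadline = clock() + timeout
    while True:
        task = get_task(task_id)
        if is_finished(task):
            return task
        # Stop before sleeping if the next poll would land after the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout}s")
        sleep(interval)
```

A caller would pass the task ID returned by `wan_generate_video_from_image` and a predicate that recognizes the server's terminal states. Where a `callback_url` is supplied, polling may be unnecessary.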

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so schema already describes all 14 parameters. The description adds context about model choices and return values but does not significantly enhance understanding of parameter semantics beyond schema. Baseline 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool generates AI video from a reference image, using Wan models. The name itself distinguishes it from siblings like wan_generate_video (likely text-to-video). It lists model variants, providing specific verb-resource clarity.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No explicit guidance on when to use this tool versus alternatives like wan_generate_video. No mention of prerequisites (e.g., image URL format) or when to choose each model variant. Implied by name but not elaborated.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/AceDataCloud/WanMCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.