image-to-video

Convert static images into dynamic videos using Vidu API. Specify duration, resolution, and movement amplitude for customized video generation. Supports text prompts for enhanced creative control.

Instructions

Generate a video from an image using Vidu API

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| duration | No | Duration of the output video in seconds (4 or 8) | |
| image_url | Yes | URL of the image to convert to video | |
| model | No | Model name for generation | vidu2.0 |
| movement_amplitude | No | Movement amplitude of objects in the frame | auto |
| prompt | No | Text prompt for video generation (max 1500 chars) | |
| resolution | No | Resolution of the output video | 720p |
| seed | No | Random seed for reproducibility | |
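
As an illustration of the input table, the arguments for a call might be assembled as in the sketch below. The `ImageToVideoArgs` type, the `withTableDefaults` helper, and the example URL are assumptions made for this example; only the field names and default values are taken from the table.

```typescript
// Hypothetical type mirroring the input table; the name is not from the server source.
type ImageToVideoArgs = {
  image_url: string;           // required
  duration?: number;           // seconds (4 or 8 per the table)
  model?: string;              // table default: "vidu2.0"
  movement_amplitude?: string; // table default: "auto"
  prompt?: string;             // at most 1500 characters
  resolution?: string;         // table default: "720p"
  seed?: number;               // set for reproducible output
};

// Hypothetical helper: fills in the table defaults for any omitted optional field.
function withTableDefaults(args: ImageToVideoArgs): ImageToVideoArgs {
  return {
    model: "vidu2.0",
    movement_amplitude: "auto",
    resolution: "720p",
    ...args,
  };
}

const exampleArgs = withTableDefaults({
  image_url: "https://example.com/cat.png", // placeholder URL
  prompt: "The cat slowly turns its head toward the camera",
  duration: 4,
});
```

Only `image_url` is required; every other field can be left out and the defaults apply.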

Implementation Reference

  • index.ts:81-284 (handler)
    The core handler function that performs image-to-video conversion using the Vidu API. It validates inputs based on model, starts the generation task, polls for completion (up to 5 min), and returns the video and cover URLs or errors.
      async ({ image_url, prompt, duration, model, resolution, movement_amplitude, seed, bgm, callback_url }) => {
        try {
          // Validate model-specific constraints
          let finalDuration = duration;
          let finalResolution = resolution;
          
          if (model === "viduq1") {
            // viduq1 only supports 5s duration and 1080p resolution
            finalDuration = 5;
            finalResolution = "1080p";
            if (duration && duration !== 5) {
              console.warn(`Model viduq1 only supports 5s duration. Using 5s instead of ${duration}s.`);
            }
            if (resolution && resolution !== "1080p") {
              console.warn(`Model viduq1 only supports 1080p resolution. Using 1080p instead of ${resolution}.`);
            }
          } else {
            // vidu1.5 and vidu2.0
            if (!duration || ![4, 8].includes(duration)) {
              finalDuration = 4; // Default to 4s
            } else {
              finalDuration = duration;
            }
            
            // Resolution constraints based on duration
            if (finalDuration === 4) {
              if (!resolution || !["360p", "720p", "1080p"].includes(resolution)) {
                finalResolution = "360p"; // Default for 4s
              } else {
                finalResolution = resolution;
              }
            } else if (finalDuration === 8) {
              finalResolution = "720p"; // Only option for 8s
              if (resolution && resolution !== "720p") {
                console.warn(`8s videos only support 720p resolution. Using 720p instead of ${resolution}.`);
              }
            }
          }
          
          // BGM validation
          const finalBgm = bgm === true && finalDuration === 4;
          if (bgm === true && finalDuration !== 4) {
            console.warn(`BGM is only supported for 4s videos. BGM will not be added for ${finalDuration}s video.`);
          }
          
          // Step 1: Start the generation task
          const startResponse = await fetch(`${VIDU_API_BASE_URL}/ent/v2/img2video`, {
            method: "POST",
            headers: {
              "Content-Type": "application/json",
              "Authorization": `Token ${VIDU_API_KEY}`
            },
            body: JSON.stringify({
              model,
              images: [image_url],
              prompt: prompt || "",
              duration: finalDuration,
              seed: seed !== undefined ? seed : Math.floor(Math.random() * 1000000),
              resolution: finalResolution,
              movement_amplitude,
              bgm: finalBgm,
              ...(callback_url && { callback_url })
            })
          });
    
          if (!startResponse.ok) {
            const errorData = await startResponse.text();
            return {
              isError: true,
              content: [
                {
                  type: "text",
                  text: `Error starting video generation: ${errorData}`
                }
              ]
            };
          }
    
          const startData = await startResponse.json() as StartResponse;
          const taskId = startData.task_id;
    
          // Step 2: Poll for completion
          let state = startData.state;
          let result: StatusResponse | null = null;
          
          // Add a message to indicate that we're processing
          let status = `Task created with ID: ${taskId}\nInitial state: ${state}\n`;
          status += "Waiting for processing to complete...\n";
    
          // Maximum wait time: 5 minutes
          const maxPolls = 60;
          let pollCount = 0;
          
          while (state !== "success" && state !== "failed" && pollCount < maxPolls) {
            // Wait for 5 seconds before polling again
            await new Promise(resolve => setTimeout(resolve, 5000));
            
            const statusResponse = await fetch(`${VIDU_API_BASE_URL}/ent/v2/tasks/${taskId}/creations`, {
              method: "GET",
              headers: {
                "Content-Type": "application/json",
                "Authorization": `Token ${VIDU_API_KEY}`
              }
            });
    
            if (!statusResponse.ok) {
              const errorData = await statusResponse.text();
              return {
                isError: true,
                content: [
                  {
                    type: "text",
                    text: `Error checking generation status: ${errorData}`
                  }
                ]
              };
            }
    
            const statusData = await statusResponse.json() as StatusResponse;
            state = statusData.state;
            pollCount++;
            
            status += `Current state: ${state}\n`;
            
            if (state === "success") {
              result = statusData;
              break;
            } else if (state === "failed") {
              return {
                isError: true,
                content: [
                  {
                    type: "text",
                    text: `Video generation failed: ${statusData.err_code || "Unknown error"}`
                  }
                ]
              };
            }
          }
    
          if (state !== "success") {
            return {
              isError: true,
              content: [
                {
                  type: "text",
                  text: `Timed out waiting for video generation to complete. Task ID: ${taskId}. Last state: ${state}`
                }
              ]
            };
          }
    
          // Format the successful result
          if (result && result.creations && result.creations.length > 0) {
            const videoUrl = result.creations[0].url;
            const coverUrl = result.creations[0].cover_url;
            const credits = result.credits;
            
            return {
              content: [
                {
                  type: "text",
                  text: `
    Video generation complete!
    
    Task ID: ${taskId}
    Status: ${state}
    Credits used: ${credits ?? 'N/A'}
    Video URL: ${videoUrl}
    Cover Image URL: ${coverUrl}
    
    Note: These URLs are valid for one hour.
    `
                }
              ]
            };
          } else {
            return {
              content: [
                {
                  type: "text",
                  text: `
    Video generation completed, but no download URLs were returned.
    
    Task ID: ${taskId}
    Status: ${state}
    `
                }
              ]
            };
          }
        } catch (error: any) {
          console.error("Error in image-to-video tool:", error);
          return {
            isError: true,
            content: [
              {
                type: "text",
                text: `An unexpected error occurred: ${error.message}`
              }
            ]
          };
        }
      }
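
The model, duration, resolution, and BGM rules at the top of the handler can be easier to follow as a pure function. The sketch below is one possible restatement of those rules, not code from the server: `resolveConstraints` and the `Resolved` type are hypothetical names, and warnings are returned in an array instead of being logged with `console.warn`.

```typescript
type Model = "viduq1" | "vidu1.5" | "vidu2.0";
type Resolution = "360p" | "720p" | "1080p";

interface Resolved {
  duration: number;
  resolution: Resolution;
  bgm: boolean;
  warnings: string[];
}

// Hypothetical helper applying the same rules as the handler's validation section.
function resolveConstraints(
  model: Model,
  duration?: number,
  resolution?: Resolution,
  bgm?: boolean
): Resolved {
  const warnings: string[] = [];
  let finalDuration: number;
  let finalResolution: Resolution;

  if (model === "viduq1") {
    // viduq1 is pinned to 5s at 1080p regardless of the request.
    finalDuration = 5;
    finalResolution = "1080p";
    if (duration && duration !== 5) {
      warnings.push(`viduq1 only supports 5s; ignoring ${duration}s`);
    }
    if (resolution && resolution !== "1080p") {
      warnings.push(`viduq1 only supports 1080p; ignoring ${resolution}`);
    }
  } else {
    // vidu1.5 and vidu2.0: anything other than 8 falls back to 4 seconds.
    finalDuration = duration === 8 ? 8 : 4;
    if (finalDuration === 8) {
      finalResolution = "720p"; // only option for 8s
      if (resolution && resolution !== "720p") {
        warnings.push(`8s videos only support 720p; ignoring ${resolution}`);
      }
    } else {
      finalResolution = resolution ?? "360p"; // default for 4s
    }
  }

  // Background music is only passed through for 4s videos.
  const finalBgm = bgm === true && finalDuration === 4;
  if (bgm === true && !finalBgm) {
    warnings.push(`BGM is only supported for 4s videos; disabled for ${finalDuration}s`);
  }

  return { duration: finalDuration, resolution: finalResolution, bgm: finalBgm, warnings };
}
```

For example, `resolveConstraints("vidu2.0", 8, "1080p")` resolves to an 8-second 720p video with one warning, and `resolveConstraints("viduq1", 8, "720p", true)` resolves to 5 seconds at 1080p without BGM.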
  • Zod schema defining input parameters for the image-to-video tool, including image_url (required), prompt, duration, model (default vidu2.0), etc.
    {
      image_url: z.string().url().describe("URL of the image to convert to video"),
      prompt: z.string().max(1500).optional().describe("Text prompt for video generation (max 1500 chars)"),
      duration: z.number().int().optional().describe("Duration of the output video in seconds (model-specific)"),
      model: z.enum(["viduq1", "vidu1.5", "vidu2.0"]).default("vidu2.0").describe("Model name for generation"),
      resolution: z.enum(["360p", "720p", "1080p"]).optional().describe("Resolution of the output video (model/duration-specific)"),
      movement_amplitude: z.enum(["auto", "small", "medium", "large"]).default("auto").describe("Movement amplitude of objects in the frame"),
      seed: z.number().int().optional().describe("Random seed for reproducibility"),
      bgm: z.boolean().optional().describe("Add background music (4s videos only)"),
      callback_url: z.string().url().optional().describe("Callback URL for async notifications")
    },
  • index.ts:67-285 (registration)
    MCP server tool registration call for 'image-to-video', specifying name, description, input schema, and handler function.
    server.tool(
      "image-to-video",
      "Generate a video from an image using Vidu API",
      {
        image_url: z.string().url().describe("URL of the image to convert to video"),
        prompt: z.string().max(1500).optional().describe("Text prompt for video generation (max 1500 chars)"),
        duration: z.number().int().optional().describe("Duration of the output video in seconds (model-specific)"),
        model: z.enum(["viduq1", "vidu1.5", "vidu2.0"]).default("vidu2.0").describe("Model name for generation"),
        resolution: z.enum(["360p", "720p", "1080p"]).optional().describe("Resolution of the output video (model/duration-specific)"),
        movement_amplitude: z.enum(["auto", "small", "medium", "large"]).default("auto").describe("Movement amplitude of objects in the frame"),
        seed: z.number().int().optional().describe("Random seed for reproducibility"),
        bgm: z.boolean().optional().describe("Add background music (4s videos only)"),
        callback_url: z.string().url().optional().describe("Callback URL for async notifications")
      },
      async ({ image_url, prompt, duration, model, resolution, movement_amplitude, seed, bgm, callback_url }) => {
        // Handler body omitted here; it is identical to the handler listing above (index.ts:81-284).
      }
    );
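
The handler's polling loop (up to 60 checks, 5 seconds apart, roughly 5 minutes in total) can be sketched as a small generic helper. This is an illustrative restatement under stated assumptions, not the server's code: `pollTask`, `PollResult`, and the injectable `sleep` option are hypothetical names, and `check` stands in for the GET request to the task's `creations` endpoint. Unlike the handler, the sketch does not inspect the initial state returned by the start request before the first wait.

```typescript
interface PollResult<T> {
  outcome: "success" | "failed" | "timeout";
  last: T | null; // most recent status payload, if any poll happened
  polls: number;  // number of status checks performed
}

// Hypothetical poller: waits, checks, and stops on a terminal state or after maxPolls checks.
async function pollTask<T extends { state: string }>(
  check: () => Promise<T>,
  opts: { maxPolls?: number; intervalMs?: number; sleep?: (ms: number) => Promise<void> } = {}
): Promise<PollResult<T>> {
  const maxPolls = opts.maxPolls ?? 60;       // same cap as the handler
  const intervalMs = opts.intervalMs ?? 5000; // same interval as the handler
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));

  let last: T | null = null;
  for (let polls = 1; polls <= maxPolls; polls++) {
    await sleep(intervalMs);
    last = await check();
    if (last.state === "success") return { outcome: "success", last, polls };
    if (last.state === "failed") return { outcome: "failed", last, polls };
  }
  return { outcome: "timeout", last, polls: maxPolls };
}
```

Passing a no-op `sleep` makes the loop testable without real delays; in production the default timer-based `sleep` would apply.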
  • TypeScript interfaces defining API response shapes used within the image-to-video handler (e.g., StartResponse, StatusResponse).
    interface StartResponse {
      task_id: string;
      state: string;
      model: string;
      images: string[];
      prompt: string;
      duration: number;
      seed: number;
      resolution: string;
      bgm: boolean;
      movement_amplitude: string;
      created_at: string;
    }
    
    interface CreationItem {
      id: string;
      url: string;
      cover_url: string;
    }
    
    interface StatusResponse {
      state: string;
      err_code?: string;
      credits?: number;
      creations?: CreationItem[];
    }
    
    interface UploadResponse {
      id: string;
      put_url: string;
      expires_at: string;
    }
    
    interface FinishResponse {
      uri: string;
    }
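
The handlers cast the parsed JSON with `as StatusResponse`, which TypeScript does not verify at run time. A runtime guard along the following lines could catch unexpected payloads before they are used. This is a sketch: `isStatusResponse` is a hypothetical helper, not part of the server, and it checks only the fields declared in the interfaces above.

```typescript
interface CreationItem {
  id: string;
  url: string;
  cover_url: string;
}

interface StatusResponse {
  state: string;
  err_code?: string;
  credits?: number;
  creations?: CreationItem[];
}

// Hypothetical type guard: returns true only if the value matches the StatusResponse shape.
function isStatusResponse(value: unknown): value is StatusResponse {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;

  if (typeof v["state"] !== "string") return false;
  if (v["err_code"] !== undefined && typeof v["err_code"] !== "string") return false;
  if (v["credits"] !== undefined && typeof v["credits"] !== "number") return false;

  const creations = v["creations"];
  if (creations !== undefined) {
    if (!Array.isArray(creations)) return false;
    for (const c of creations) {
      if (typeof c !== "object" || c === null) return false;
      const item = c as Record<string, unknown>;
      if (typeof item["id"] !== "string") return false;
      if (typeof item["url"] !== "string") return false;
      if (typeof item["cover_url"] !== "string") return false;
    }
  }
  return true;
}
```

A caller could then replace the cast with a check such as `if (!isStatusResponse(data)) { /* report an error */ }`.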
  • Companion tool to check status of tasks created by image-to-video, referencing its task_id.
    server.tool(
      "check-generation-status",
      "Check the status of a video generation task",
      {
        task_id: z.string().describe("Task ID returned by the image-to-video tool")
      },
      async ({ task_id }) => {
        try {
          const statusResponse = await fetch(`${VIDU_API_BASE_URL}/ent/v2/tasks/${task_id}/creations`, {
            method: "GET",
            headers: {
              "Content-Type": "application/json",
              "Authorization": `Token ${VIDU_API_KEY}`
            }
          });
    
          if (!statusResponse.ok) {
            const errorData = await statusResponse.text();
            return {
              isError: true,
              content: [
                {
                  type: "text",
                  text: `Error checking generation status: ${errorData}`
                }
              ]
            };
          }
    
          const statusData = await statusResponse.json() as StatusResponse;
          
          if (statusData.state === "success") {
            if (statusData.creations && statusData.creations.length > 0) {
              const videoUrl = statusData.creations[0].url;
              const coverUrl = statusData.creations[0].cover_url;
              const credits = statusData.credits;
              
              return {
                content: [
                  {
                    type: "text",
                    text: `
    Generation task complete!
    
    Task ID: ${task_id}
    Status: ${statusData.state}
    Credits used: ${credits ?? 'N/A'}
    Video URL: ${videoUrl}
    Cover Image URL: ${coverUrl}
    
    Note: These URLs are valid for one hour.
    `
                  }
                ]
              };
            } else {
              return {
                content: [
                  {
                    type: "text",
                    text: `
    Generation task complete but no download URLs available.
    
    Task ID: ${task_id}
    Status: ${statusData.state}
    `
                  }
                ]
              };
            }
          } else if (statusData.state === "failed") {
            return {
              isError: true,
              content: [
                {
                  type: "text",
                  text: `Generation task failed with error code: ${statusData.err_code || "Unknown error"}`
                }
              ]
            };
          } else {
            return {
              content: [
                {
                  type: "text",
                  text: `
    Generation task is still in progress.
    
    Task ID: ${task_id}
    Current Status: ${statusData.state}
    
    You can check again later using the same task ID.
    `
                }
              ]
            };
          }
        } catch (error: any) {
          console.error("Error in check-generation-status tool:", error);
          return {
            isError: true,
            content: [
              {
                type: "text",
                text: `An unexpected error occurred: ${error.message}`
              }
            ]
          };
        }
      }
    );
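
The status tool distinguishes four outcomes: complete with URLs, complete without URLs, failed, and still in progress. One way to make that branching explicit is a discriminated union, sketched below. `Summary` and `summarizeStatus` are hypothetical names introduced for illustration; the branching mirrors the companion tool listed above.

```typescript
interface CreationItem {
  id: string;
  url: string;
  cover_url: string;
}

interface StatusResponse {
  state: string;
  err_code?: string;
  credits?: number;
  creations?: CreationItem[];
}

// Hypothetical result type: one variant per branch of the status tool.
type Summary =
  | { kind: "complete"; videoUrl: string; coverUrl: string; credits: number | null }
  | { kind: "complete-no-urls" }
  | { kind: "failed"; errCode: string }
  | { kind: "in-progress"; state: string };

// Hypothetical helper: maps a status payload onto exactly one Summary variant.
function summarizeStatus(s: StatusResponse): Summary {
  if (s.state === "success") {
    const first = s.creations?.[0];
    if (!first) return { kind: "complete-no-urls" };
    return {
      kind: "complete",
      videoUrl: first.url,
      coverUrl: first.cover_url,
      credits: s.credits ?? null, // keeps a reported 0 distinct from "not reported"
    };
  }
  if (s.state === "failed") {
    return { kind: "failed", errCode: s.err_code || "Unknown error" };
  }
  // Any other state is treated as still running.
  return { kind: "in-progress", state: s.state };
}
```

Formatting the user-facing text then becomes a single `switch` on `kind`, separate from the HTTP call.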
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It states the tool generates a video but lacks details on execution time, rate limits, authentication needs, output format (e.g., video file type), error handling, or whether it's a synchronous/asynchronous operation. For a complex 7-parameter tool with no annotations, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's purpose without redundancy. It's front-loaded with the core action and resource, and every word earns its place by specifying the API used. No unnecessary details or fluff are included.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (7 parameters, video generation task) and lack of annotations and output schema, the description is incomplete. It doesn't cover behavioral aspects like performance, output details, or error handling, which are critical for an AI agent to use this tool effectively. The description alone is insufficient for a tool of this nature.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 7 parameters with descriptions, defaults, and constraints. The description adds no parameter-specific information beyond what's in the schema, such as explaining interactions between parameters (e.g., how 'prompt' influences generation). Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Generate a video') and resource ('from an image'), specifying it uses the Vidu API. It distinguishes from sibling tools like 'check-generation-status' and 'upload-image' by focusing on video generation rather than status checking or image uploading. However, it doesn't explicitly differentiate from potential non-sibling alternatives beyond mentioning the API.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites (e.g., needing an uploaded image first), when not to use it, or how it relates to sibling tools like 'check-generation-status' for monitoring generation progress. Usage is implied only by the tool name and description.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
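
To make the gaps noted in the review concrete, a fuller tool description might look like the sketch below. The wording is a hypothetical proposal, not the server's actual description; each behavioral claim in it is read off the handler code in the Implementation Reference (5-second polling capped at 60 checks, the one-hour URL validity note, the model and duration constraints, and the companion status tool).

```typescript
// Hypothetical expanded description; the constant name and wording are illustrative only.
const IMAGE_TO_VIDEO_DESCRIPTION = [
  "Generate a video from an image URL using the Vidu API.",
  "The call authenticates to Vidu with the server's VIDU_API_KEY and blocks while polling",
  "the task every 5 seconds for up to 60 checks (about 5 minutes).",
  "On success it returns text containing the task ID, credits used, a video URL,",
  "and a cover image URL; the URLs are valid for one hour.",
  "Constraints: viduq1 always renders 5s at 1080p;",
  "vidu1.5 and vidu2.0 render 4s (360p, 720p, or 1080p; default 360p) or 8s (720p only).",
  "Background music (bgm) applies to 4s videos only.",
  "Use check-generation-status to re-check a task by its task ID.",
].join(" ");
```

A description along these lines would address the execution time, output, parameter interaction, and sibling-tool points raised in the review, at the cost of some of the brevity praised under conciseness.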

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/el-el-san/vidu-mcp-server'
