Create Seedance 2.0 video generation task

seedance_create_task

Submit a video generation task to Seedance 2.0 using text prompts, reference images, videos, or audio. Returns a task ID for asynchronous status checking.

Instructions

Submit a Seedance 2.0 video generation task to the Volcengine ARK API and return the task_id immediately. Does NOT wait for the video to render - poll seedance_check_task afterwards. Reference media URLs must be publicly reachable.
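The submit-then-poll contract can be sketched as a small helper. This is illustrative only: `check` stands in for whatever wrapper you put around seedance_check_task, and `TaskStatus` is an assumed response shape, not one confirmed by this page. The default interval mirrors the 30-90 second guidance the tool returns.

```typescript
// Hypothetical polling helper; `check` wraps seedance_check_task and
// `TaskStatus` is an assumed response shape.
type TaskStatus = {
  status: "queued" | "running" | "succeeded" | "failed";
  video_url?: string;
};

async function pollUntilDone(
  check: (taskId: string) => Promise<TaskStatus>,
  taskId: string,
  { intervalMs = 30_000, maxAttempts = 10 } = {},
): Promise<TaskStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = await check(taskId);
    // Terminal states end the loop; anything else waits and retries.
    if (result.status === "succeeded" || result.status === "failed") {
      return result;
    }
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Task ${taskId} still pending after ${maxAttempts} checks`);
}
```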

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| prompt | Yes | Natural-language description of the desired video. Reference images / videos / audios with [Image1], [Video1], [Audio1] in 1-based order if you provided any. | — |
| model | No | Seedance 2.0 model id. doubao-seedance-2-0-260128 is the standard, highest-quality model. doubao-seedance-2-0-fast-260128 trades quality for latency. | doubao-seedance-2-0-260128 |
| duration | No | Video length in seconds. Must be an integer in [4, 15]. | 5 |
| ratio | No | Aspect ratio. Use 9:16 for vertical short-video, 16:9 for landscape. 'adaptive' lets the model pick the best fit when reference media is provided. | 16:9 |
| resolution | No | Output resolution. 720p is recommended; 480p is faster/cheaper. | 720p |
| generate_audio | No | Whether to generate synchronized audio (dialogue, SFX, music). Set false for silent video. | true |
| watermark | No | Whether to add the platform watermark. Some accounts cannot disable this. | true |
| web_search | No | Enable prompt enhancement via web search. Text-only input is required when this is true. | false |
| return_last_frame | No | Return the last frame as an image URL alongside the video URL. Useful for chaining segments. | false |
| image_urls | No | Up to 9 reference images. Each item is { url, role? }. role defaults to 'reference_image'. Use 'first_frame' (and optionally 'last_frame') for image-to-video animation. | — |
| video_urls | No | Up to 3 reference videos. Each item is { url, role? }. role defaults to 'reference_video'. | — |
| audio_urls | No | Up to 3 reference audios. Each item is { url, role? }. role defaults to 'reference_audio'. Audio MUST be paired with at least one image or video reference. | — |
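As a concrete illustration of these parameters, a first/last-frame animation request might look like the following. All values and URLs are made up; note how the prompt indexes the reference images in 1-based order.

```typescript
// Illustrative seedance_create_task arguments; URLs are placeholders.
const exampleInput = {
  prompt:
    "Animate [Image1] into a slow aerial pan across the skyline, ending on [Image2].",
  model: "doubao-seedance-2-0-260128",
  duration: 8,
  ratio: "adaptive",
  resolution: "720p",
  image_urls: [
    { url: "https://example.com/skyline-start.png", role: "first_frame" },
    { url: "https://example.com/skyline-end.png", role: "last_frame" },
  ],
};
```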

Implementation Reference

  • The actual handler function for the seedance_create_task tool. It validates input via validateCreateTaskInput, calls createSeedanceTask (from seedance.ts), and returns the task_id along with structured metadata.
    async (input: CreateTaskInput): Promise<CallToolResult> => {
      const validationError = validateCreateTaskInput(input);
      if (validationError) {
        return errorResult(validationError);
      }
      try {
        const result = await createSeedanceTask(input);
        const human = [
          `Task submitted. task_id: ${result.id}`,
          "",
          "Next step: wait 30-90 seconds, then call seedance_check_task with this task_id.",
          "A 15-second job on the standard model usually needs 2-5 minutes total.",
        ].join("\n");
        return {
          content: [{ type: "text", text: human }],
          structuredContent: {
            task_id: result.id,
            model: input.model,
            duration: input.duration,
            ratio: input.ratio,
            resolution: input.resolution,
            raw: result.raw,
          },
        };
      } catch (err) {
        return errorResult(formatApiError(err));
      }
    },
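The `errorResult` and `formatApiError` helpers the handler calls are not shown on this page. A minimal sketch of what `errorResult` plausibly does, assuming the standard MCP tool-result shape (a text content item plus `isError` so agents can detect failure):

```typescript
// Hypothetical sketch of errorResult, assuming the MCP tool-result shape.
type CallToolResult = {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
};

function errorResult(message: string): CallToolResult {
  return {
    content: [{ type: "text", text: message }],
    isError: true,
  };
}
```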
  • The core API function that builds the request body and POSTs to the ARK API to create a video generation task. Called by the handler.
    export async function createSeedanceTask(
      input: CreateTaskInput,
    ): Promise<CreateTaskResponse> {
      const apiKey = getApiKey();
      const baseUrl = getBaseUrl();
      const body = buildCreateTaskBody(input);
    
      const res = await fetch(`${baseUrl}/contents/generations/tasks`, {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${apiKey}`,
        },
        body: JSON.stringify(body),
      });
    
      const parsed = await parseJsonSafe(res);
    
      if (!res.ok) {
        throw new SeedanceApiError(
          extractApiErrorMessage(parsed, res.status),
          res.status,
          parsed,
        );
      }
    
      if (!parsed || typeof parsed !== "object") {
        throw new SeedanceApiError(
          "ARK API returned an unexpected non-JSON response",
          res.status,
          parsed,
        );
      }
    
      const obj = parsed as Record<string, unknown>;
      const id = obj.id;
      if (typeof id !== "string" || id.length === 0) {
        throw new SeedanceApiError(
          "ARK API response did not include a task id",
          res.status,
          parsed,
        );
      }
    
      return { id, raw: obj };
    }
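`parseJsonSafe` is referenced but not shown. A plausible sketch, assuming it reads the body as text and falls back gracefully when the API returns a non-JSON error page (the raw text is kept so `SeedanceApiError` can surface it):

```typescript
// Hypothetical sketch of parseJsonSafe: try to parse the body as JSON,
// keeping the raw text for error reporting when parsing fails.
async function parseJsonSafe(res: { text(): Promise<string> }): Promise<unknown> {
  const text = await res.text();
  if (text.length === 0) return null;
  try {
    return JSON.parse(text);
  } catch {
    return text; // non-JSON body, surfaced as-is in error details
  }
}
```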
  • Helper that transforms the validated CreateTaskInput into the API request body format (model, content items, ratio, resolution, etc.).
    export function buildCreateTaskBody(input: CreateTaskInput): CreateTaskRequestBody {
      return {
        model: input.model,
        content: buildContent(input),
        ratio: input.ratio,
        resolution: input.resolution,
        duration: input.duration,
        generate_audio: input.generate_audio,
        watermark: input.watermark,
        tools: input.web_search ? [{ type: "web_search" }] : undefined,
        return_last_frame: input.return_last_frame,
      };
    }
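`buildContent` is the one piece of the request body not shown. The exact ARK content-item shape is an assumption; what matters is the ordering (prompt first, then media in the 1-based order the [Image1]/[Video1]/[Audio1] prompt references rely on) and the role defaults documented in the schema:

```typescript
// Hypothetical sketch of buildContent; the ARK content-item shape is assumed.
type MediaItem = { url: string; role?: string };
type ContentItem = { type: string; role?: string; [key: string]: unknown };

function buildContent(input: {
  prompt: string;
  image_urls?: MediaItem[];
  video_urls?: MediaItem[];
  audio_urls?: MediaItem[];
}): ContentItem[] {
  // Prompt first, then media in schema order so [Image1] etc. resolve.
  const items: ContentItem[] = [{ type: "text", text: input.prompt }];
  for (const img of input.image_urls ?? []) {
    items.push({ type: "image_url", image_url: { url: img.url }, role: img.role ?? "reference_image" });
  }
  for (const vid of input.video_urls ?? []) {
    items.push({ type: "video_url", video_url: { url: vid.url }, role: vid.role ?? "reference_video" });
  }
  for (const aud of input.audio_urls ?? []) {
    items.push({ type: "audio_url", audio_url: { url: aud.url }, role: aud.role ?? "reference_audio" });
  }
  return items;
}
```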
  • Zod-based input schema for seedance_create_task defining all parameters: prompt, model, duration, ratio, resolution, generate_audio, watermark, web_search, return_last_frame, image_urls, video_urls, audio_urls.
    export const createTaskInputShape = {
      prompt: z
        .string()
        .min(1, "prompt cannot be empty")
        .describe(
          "Natural-language description of the desired video. Reference images / videos / audios with [Image1], [Video1], [Audio1] in 1-based order if you provided any.",
        ),
      model: z
        .enum(SEEDANCE_MODELS)
        .default("doubao-seedance-2-0-260128")
        .describe(
          "Seedance 2.0 model id. doubao-seedance-2-0-260128 is the standard, highest-quality model. doubao-seedance-2-0-fast-260128 trades quality for latency.",
        ),
      duration: z
        .number()
        .int()
        .min(4)
        .max(15)
        .default(5)
        .describe("Video length in seconds. Must be an integer in [4, 15]."),
      ratio: z
        .enum(RATIOS)
        .default("16:9")
        .describe(
          "Aspect ratio. Use 9:16 for vertical short-video, 16:9 for landscape. 'adaptive' lets the model pick the best fit when reference media is provided.",
        ),
      resolution: z
        .enum(RESOLUTIONS)
        .default("720p")
        .describe("Output resolution. 720p is recommended; 480p is faster/cheaper."),
      generate_audio: z
        .boolean()
        .default(true)
        .describe(
          "Whether to generate synchronized audio (dialogue, SFX, music). Set false for silent video.",
        ),
      watermark: z
        .boolean()
        .default(true)
        .describe(
          "Whether to add the platform watermark. Some accounts cannot disable this.",
        ),
      web_search: z
        .boolean()
        .default(false)
        .describe(
          "Enable prompt enhancement via web search. Text-only input is required when this is true.",
        ),
      return_last_frame: z
        .boolean()
        .default(false)
        .describe(
          "Return the last frame as an image URL alongside the video URL. Useful for chaining segments.",
        ),
      image_urls: z
        .array(imageItemSchema)
        .max(9, "image_urls accepts at most 9 items")
        .optional()
        .describe(
          "Up to 9 reference images. Each item is { url, role? }. role defaults to 'reference_image'. Use 'first_frame' (and optionally 'last_frame') for image-to-video animation.",
        ),
      video_urls: z
        .array(videoItemSchema)
        .max(3, "video_urls accepts at most 3 items")
        .optional()
        .describe(
          "Up to 3 reference videos. Each item is { url, role? }. role defaults to 'reference_video'.",
        ),
      audio_urls: z
        .array(audioItemSchema)
        .max(3, "audio_urls accepts at most 3 items")
        .optional()
        .describe(
          "Up to 3 reference audios. Each item is { url, role? }. role defaults to 'reference_audio'. Audio MUST be paired with at least one image or video reference.",
        ),
    };
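The zod shape enforces per-field constraints, but cross-field rules live in `validateCreateTaskInput`, which is not shown. One rule it likely enforces, per the audio_urls description above, can be sketched as:

```typescript
// Hypothetical sketch of one cross-field rule: audio references are rejected
// unless at least one image or video reference accompanies them.
function validateAudioPairing(input: {
  image_urls?: unknown[];
  video_urls?: unknown[];
  audio_urls?: unknown[];
}): string | null {
  const hasAudio = (input.audio_urls?.length ?? 0) > 0;
  const hasVisual =
    (input.image_urls?.length ?? 0) > 0 || (input.video_urls?.length ?? 0) > 0;
  if (hasAudio && !hasVisual) {
    return "audio_urls must be paired with at least one image or video reference";
  }
  return null; // null means this rule passed
}
```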
  • src/server.ts:49-90 (registration)
    Registration of seedance_create_task tool on the McpServer via server.registerTool(...), including its title, description, inputSchema, and annotations.
    server.registerTool(
      "seedance_create_task",
      {
        title: "Create Seedance 2.0 video generation task",
        description:
          "Submit a Seedance 2.0 video generation task to the Volcengine ARK API and return the task_id immediately. Does NOT wait for the video to render - poll seedance_check_task afterwards. Reference media URLs must be publicly reachable.",
        inputSchema: createTaskInputShape,
        annotations: {
          readOnlyHint: false,
          idempotentHint: false,
          openWorldHint: true,
        },
      },
      async (input: CreateTaskInput): Promise<CallToolResult> => {
        const validationError = validateCreateTaskInput(input);
        if (validationError) {
          return errorResult(validationError);
        }
        try {
          const result = await createSeedanceTask(input);
          const human = [
            `Task submitted. task_id: ${result.id}`,
            "",
            "Next step: wait 30-90 seconds, then call seedance_check_task with this task_id.",
            "A 15-second job on the standard model usually needs 2-5 minutes total.",
          ].join("\n");
          return {
            content: [{ type: "text", text: human }],
            structuredContent: {
              task_id: result.id,
              model: input.model,
              duration: input.duration,
              ratio: input.ratio,
              resolution: input.resolution,
              raw: result.raw,
            },
          };
        } catch (err) {
          return errorResult(formatApiError(err));
        }
      },
    );
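For completeness, the GET counterpart that seedance_check_task presumably performs can be sketched as follows. The path (POST endpoint with the task id appended) and response shape are assumptions, not confirmed by this page; `fetchFn` is injected so the sketch can be exercised without network access.

```typescript
// Hypothetical sketch of the status-check request; endpoint path is assumed.
type FetchLike = (
  url: string,
  init?: { headers?: Record<string, string> },
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

async function checkSeedanceTask(
  fetchFn: FetchLike,
  baseUrl: string,
  apiKey: string,
  taskId: string,
): Promise<unknown> {
  const res = await fetchFn(`${baseUrl}/contents/generations/tasks/${taskId}`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) {
    throw new Error(`Status check failed with HTTP ${res.status}`);
  }
  return res.json();
}
```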
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations indicate readOnlyHint=false, and the description confirms a mutation (creates a task). It adds context beyond annotations: the async nature ('Does NOT wait for the video to render') and the requirement for public URLs. However, it does not mention rate limits, error handling, or response structure.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences containing no fluff: first states core function, second explains async behavior with action, third adds constraint. Front-loaded with the most important information.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 12 parameters and no output schema, the description covers the essential workflow and a key constraint (public URLs). It lacks details on error cases or the exact format of the returned task_id, but the async instruction is clear. Overall adequate but could be slightly more detailed.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so baseline is 3. The description adds value beyond schema by emphasizing the async behavior and public URL requirement, which are not evident from individual parameter descriptions. It also clarifies the workflow (submit then poll).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the verb 'Submit' and the resource 'Seedance 2.0 video generation task'. It specifies that it returns a task_id immediately and does not wait for rendering, distinguishing it from sibling tools like seedance_check_task that are used for polling.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly instructs to poll seedance_check_task afterwards and notes that reference media URLs must be publicly reachable. It provides clear guidance on when to use this tool (to initiate a task) and what to do next (poll for results).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
