MCP Kling

generate_video

Create AI-generated videos from text descriptions using Kling AI models. Specify prompts, aspect ratios, durations, and camera movements to produce custom video content.

Instructions

Generate a video from text prompt using Kling AI

Input Schema


Name	Required	Description
`prompt`	Yes	Text prompt describing the video to generate (max 2500 characters)
`negative_prompt`	No	Text describing what to avoid in the video (optional, max 2500 characters)
`model_name`	No	Model version to use (default: kling-v2-master)
`aspect_ratio`	No	Video aspect ratio (default: 16:9)
`duration`	No	Video duration in seconds (default: 5)
`mode`	No	Video generation mode (default: standard)
`cfg_scale`	No	Creative freedom scale 0-1 (0=more creative, 1=more adherent to prompt, default: 0.5)
`camera_control`	No	Camera movement settings for V2 models
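
As an illustration of the parameters listed above, a `generate_video` arguments object might look like the following sketch. The `GenerateVideoArgs` type and the `checkArgs` helper are hypothetical names introduced here, not part of the server; the allowed values come from the tool's input schema, and `camera_control` is left out for brevity. Whether the server itself enforces the character limits is not shown on this page.

```typescript
// Hypothetical type mirroring the documented inputs; not taken from the server source.
type GenerateVideoArgs = {
  prompt: string;
  negative_prompt?: string;
  model_name?: 'kling-v1' | 'kling-v1.5' | 'kling-v1.6' | 'kling-v2-master';
  aspect_ratio?: '16:9' | '9:16' | '1:1';
  duration?: '5' | '10'; // the schema declares these as strings, not numbers
  mode?: 'standard' | 'professional';
  cfg_scale?: number;
};

// Hypothetical client-side check: returns a list of problems, empty when the arguments look fine.
function checkArgs(args: GenerateVideoArgs): string[] {
  const problems: string[] = [];
  if (args.prompt.length === 0) problems.push('prompt is required');
  if (args.prompt.length > 2500) problems.push('prompt exceeds 2500 characters');
  if ((args.negative_prompt ?? '').length > 2500) {
    problems.push('negative_prompt exceeds 2500 characters');
  }
  if (args.cfg_scale !== undefined && (args.cfg_scale < 0 || args.cfg_scale > 1)) {
    problems.push('cfg_scale must be between 0 and 1');
  }
  return problems;
}

const exampleArgs: GenerateVideoArgs = {
  prompt: 'A red kite drifting over a foggy harbor at sunrise',
  negative_prompt: 'text overlays, watermarks',
  model_name: 'kling-v2-master',
  aspect_ratio: '16:9',
  duration: '5',
  mode: 'standard',
  cfg_scale: 0.5,
};

console.log(checkArgs(exampleArgs)); // prints []
```

Only `prompt` is required; every other field can be omitted to accept the documented default.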

Implementation Reference

src/index.ts:480-502 (handler)

MCP CallToolRequest handler case for 'generate_video': constructs VideoGenerationRequest from tool arguments and delegates to klingClient.generateVideo()

case 'generate_video': {
  const videoRequest: VideoGenerationRequest = {
    prompt: args.prompt as string,
    negative_prompt: args.negative_prompt as string | undefined,
    model_name: (args.model_name as 'kling-v1' | 'kling-v1.5' | 'kling-v1.6' | 'kling-v2-master' | undefined) || 'kling-v2-master',
    aspect_ratio: (args.aspect_ratio as '16:9' | '9:16' | '1:1') || '16:9',
    duration: (args.duration as '5' | '10') || '5',
    mode: (args.mode as 'standard' | 'professional') || 'standard',
    cfg_scale: (args.cfg_scale as number) ?? 0.5,
    camera_control: args.camera_control as any,
  };

  const result = await klingClient.generateVideo(videoRequest);
  
  return {
    content: [
      {
        type: 'text',
        text: `Video generation started successfully!\nTask ID: ${result.task_id}\n\nUse the check_video_status tool with this task ID to check the progress.`,
      },
    ],
  };
}
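
Because the handler returns the task ID embedded in a human-readable message rather than as a structured field, a caller that wants the ID programmatically has to parse the text. The sketch below is one way to do that; `extractTaskId` is a hypothetical helper, and it assumes the message keeps the `Task ID: ...` line format shown in the excerpt above.

```typescript
// Hypothetical helper: pulls the task ID out of the handler's text response.
// Returns undefined when no "Task ID:" line with a non-empty value is present.
function extractTaskId(text: string): string | undefined {
  const match = /^Task ID:[ \t]*(\S+)/m.exec(text);
  return match?.[1];
}

const sampleText =
  'Video generation started successfully!\nTask ID: abc123\n\n' +
  'Use the check_video_status tool with this task ID to check the progress.';

console.log(extractTaskId(sampleText)); // prints abc123
```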

src/kling-client.ts:171-202 (handler)

Core KlingClient.generateVideo method: processes optional image URLs, builds request body with defaults, POSTs to Kling AI text2video endpoint, returns task_id

async generateVideo(request: VideoGenerationRequest): Promise<{ task_id: string }> {
  const path = '/v1/videos/text2video';
  
  // Process any image URLs
  const ref_image_url = await this.processImageUrl(request.ref_image_url);
  
  const body: any = {
    prompt: request.prompt,
    negative_prompt: request.negative_prompt || '',
    cfg_scale: request.cfg_scale || 0.8,
    aspect_ratio: request.aspect_ratio || '16:9',
    duration: request.duration || '5',
    model_name: request.model_name || 'kling-v2-master', // V2-master is default
    ...(request.image_url && { image_url: request.image_url }),
    ...(request.image_tail_url && { image_tail_url: request.image_tail_url }),
    ...(ref_image_url && { ref_image_url }),
    ...(request.ref_image_weight && { ref_image_weight: request.ref_image_weight }),
    ...(request.camera_control && { camera_control: request.camera_control }),
    ...(request.callback_url && { callback_url: request.callback_url }),
    ...(request.external_task_id && { external_task_id: request.external_task_id }),
  };

  try {
    const response = await this.axiosInstance.post(path, body);
    return response.data.data;
  } catch (error) {
    if (axios.isAxiosError(error)) {
      throw new Error(`Kling API error: ${error.response?.data?.message || error.message}`);
    }
    throw error;
  }
}
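
One detail in the excerpt above is worth noting: `request.cfg_scale || 0.8` falls back whenever the value is falsy, and `0` is falsy in JavaScript. As excerpted, an explicit `cfg_scale` of `0` (documented as "more creative") would therefore be sent as `0.8`, whereas the handler in `src/index.ts` uses `??`, which falls back only on `null` or `undefined`. The sketch below contrasts the two operators; the function names are hypothetical and exist only for this comparison.

```typescript
// Falls back on any falsy value, including 0.
function withOr(cfgScale: number | undefined): number {
  return cfgScale || 0.8;
}

// Falls back only when the value is null or undefined.
function withNullish(cfgScale: number | undefined): number {
  return cfgScale ?? 0.8;
}

console.log(withOr(0)); // prints 0.8
console.log(withNullish(0)); // prints 0
console.log(withOr(undefined), withNullish(undefined)); // prints 0.8 0.8
```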

src/index.ts:66-162 (schema)

Tool schema definition in TOOLS array: inputSchema for generate_video with properties, enums, descriptions matching VideoGenerationRequest

{
  name: 'generate_video',
  description: 'Generate a video from text prompt using Kling AI',
  inputSchema: {
    type: 'object',
    properties: {
      prompt: {
        type: 'string',
        description: 'Text prompt describing the video to generate (max 2500 characters)',
      },
      negative_prompt: {
        type: 'string',
        description: 'Text describing what to avoid in the video (optional, max 2500 characters)',
      },
      model_name: {
        type: 'string',
        enum: ['kling-v1', 'kling-v1.5', 'kling-v1.6', 'kling-v2-master'],
        description: 'Model version to use (default: kling-v2-master)',
      },
      aspect_ratio: {
        type: 'string',
        enum: ['16:9', '9:16', '1:1'],
        description: 'Video aspect ratio (default: 16:9)',
      },
      duration: {
        type: 'string',
        enum: ['5', '10'],
        description: 'Video duration in seconds (default: 5)',
      },
      mode: {
        type: 'string',
        enum: ['standard', 'professional'],
        description: 'Video generation mode (default: standard)',
      },
      cfg_scale: {
        type: 'number',
        description: 'Creative freedom scale 0-1 (0=more creative, 1=more adherent to prompt, default: 0.5)',
        minimum: 0,
        maximum: 1,
      },
      camera_control: {
        type: 'object',
        description: 'Camera movement settings for V2 models',
        properties: {
          type: {
            type: 'string',
            enum: ['simple', 'down_back', 'forward_up', 'right_turn_forward', 'left_turn_forward'],
            description: 'Camera movement type',
          },
          config: {
            type: 'object',
            description: 'Camera movement configuration (only for "simple" type)',
            properties: {
              horizontal: {
                type: 'number',
                description: 'Horizontal movement [-10, 10]',
                minimum: -10,
                maximum: 10,
              },
              vertical: {
                type: 'number',
                description: 'Vertical movement [-10, 10]',
                minimum: -10,
                maximum: 10,
              },
              pan: {
                type: 'number',
                description: 'Pan rotation [-10, 10]',
                minimum: -10,
                maximum: 10,
              },
              tilt: {
                type: 'number',
                description: 'Tilt rotation [-10, 10]',
                minimum: -10,
                maximum: 10,
              },
              roll: {
                type: 'number',
                description: 'Roll rotation [-10, 10]',
                minimum: -10,
                maximum: 10,
              },
              zoom: {
                type: 'number',
                description: 'Zoom [-10, 10]',
                minimum: -10,
                maximum: 10,
              },
            },
          },
        },
      },
    },
    required: ['prompt'],
  },
},
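
The `camera_control` constraints in this schema (a fixed set of movement types, a `config` object intended only for the `simple` type, and each axis limited to [-10, 10]) can be checked before a request is sent. The following is a minimal sketch under those assumptions; `CameraControlInput` and `validateCameraControl` are hypothetical names, and the server's own validation, if any, is not shown on this page.

```typescript
const CAMERA_TYPES = [
  'simple', 'down_back', 'forward_up', 'right_turn_forward', 'left_turn_forward',
] as const;
const CONFIG_KEYS = ['horizontal', 'vertical', 'pan', 'tilt', 'roll', 'zoom'] as const;

type CameraConfig = Partial<Record<(typeof CONFIG_KEYS)[number], number>>;

// Hypothetical input shape; `type` is a plain string so untrusted input can be checked.
interface CameraControlInput {
  type?: string;
  config?: CameraConfig;
}

// Returns a list of violations of the schema's stated constraints (empty when valid).
function validateCameraControl(cc: CameraControlInput): string[] {
  const errors: string[] = [];
  if (cc.type !== undefined && !(CAMERA_TYPES as readonly string[]).includes(cc.type)) {
    errors.push(`unknown camera type: ${cc.type}`);
  }
  if (cc.config !== undefined && cc.type !== 'simple') {
    errors.push('config is only used with the "simple" type');
  }
  for (const key of CONFIG_KEYS) {
    const value = cc.config?.[key];
    if (value !== undefined && (value < -10 || value > 10)) {
      errors.push(`${key} must be within [-10, 10]`);
    }
  }
  return errors;
}

console.log(validateCameraControl({ type: 'simple', config: { zoom: 5 } })); // prints []
```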

src/index.ts:467-469 (registration)

Registration of the list-tools handler, which exposes generate_video (along with the rest of the TOOLS array) via ListToolsRequestSchema
```
server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: TOOLS,
}));
```

src/kling-client.ts:23-38 (schema)

TypeScript interface VideoGenerationRequest defining typed inputs used by both tool handler and API client method.

export interface VideoGenerationRequest {
  prompt: string;
  negative_prompt?: string;
  model_name?: 'kling-v1' | 'kling-v1.5' | 'kling-v1.6' | 'kling-v2-master';
  aspect_ratio?: '16:9' | '9:16' | '1:1';
  duration?: '5' | '10';
  mode?: 'standard' | 'professional';
  cfg_scale?: number;
  image_url?: string;
  image_tail_url?: string;
  ref_image_url?: string;
  ref_image_weight?: number;
  camera_control?: CameraControl;
  callback_url?: string;
  external_task_id?: string;
}

Tool Definition Quality

Grade C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. While 'Generate a video' implies a creation/mutation operation, the description doesn't address critical behavioral aspects: whether this is an async operation (likely given sibling 'check_video_status'), what permissions or authentication might be required, rate limits, cost implications, or what format/quality the output video will have. This is inadequate for a complex generative tool.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
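
Since the handler returns a task ID rather than a finished video, a caller would typically poll until the task settles. The sketch below shows that pattern only in outline: `checkStatus` is a hypothetical stand-in for a `check_video_status` call, and the status values (`processing`, `succeed`, `failed`) are assumptions, because the real tool's response shape is not shown on this page.

```typescript
// Hypothetical status values; the actual check_video_status output may differ.
type TaskStatus = 'processing' | 'succeed' | 'failed';

// Polls checkStatus until the task leaves the 'processing' state or attempts run out.
async function waitForVideo(
  taskId: string,
  checkStatus: (taskId: string) => Promise<TaskStatus>,
  intervalMs = 5000,
  maxAttempts = 60,
): Promise<TaskStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const status = await checkStatus(taskId);
    if (status !== 'processing') return status;
    // Wait before the next check so the service is not polled in a tight loop.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Task ${taskId} still processing after ${maxAttempts} checks`);
}
```

Passing the status check in as a function keeps the loop independent of any particular MCP client and makes it easy to exercise with a fake.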

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that states the core purpose without unnecessary elaboration. It's appropriately sized and front-loaded with the essential information, making it easy for an agent to quickly understand what the tool does.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex video generation tool with 8 parameters (including nested objects), no annotations, and no output schema, the description is insufficient. It doesn't address the asynchronous nature suggested by sibling tools, doesn't explain what the tool returns (video file? URL? task ID?), and provides no context about the Kling AI service's capabilities or limitations. The description should do more to compensate for the lack of structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, so all parameters are documented in the schema itself. The description adds no additional parameter semantics beyond what's already in the schema descriptions. This meets the baseline expectation when schema coverage is complete, but doesn't provide extra value like explaining parameter interactions or practical usage patterns.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate a video from text prompt using Kling AI' - a specific verb ('Generate') with resource ('video') and technology context ('Kling AI'). However, it doesn't distinguish this from sibling tools like 'generate_image_to_video' or 'extend_video', which would require explicit differentiation for a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'generate_image_to_video', 'extend_video', or 'apply_video_effect'. There's no mention of prerequisites, appropriate contexts, or limitations that would help an agent choose between these video-related tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
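
To make the review's points concrete, a revised description might read along the following lines. This is only a sketch: the wording is hypothetical, the asynchronous behavior and returned task ID are taken from the handler excerpt on this page, and the one-line summaries of the sibling tools are guesses based on their names rather than on their documentation.

```typescript
// Hypothetical replacement for the tool's `description` field.
const revisedDescription = [
  'Generate a video from a text prompt using Kling AI.',
  'Starts an asynchronous task and returns a task ID, not the video itself;',
  'call check_video_status with that ID to track progress.',
  'To start from an existing image, use generate_image_to_video instead;',
  'to lengthen an existing video, use extend_video.',
].join(' ');

console.log(revisedDescription);
```

A description like this trades some conciseness for the behavior, completeness, and usage guidance the review found missing.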

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/199-mcp/mcp-kling'