mcp_gemini_generate_videos

Generate videos with the Google Veo model from a text prompt and an optional first-frame image. Returns the file path of each created video; aspect ratio, duration, and output directory are customizable.

Instructions

Generates videos using the Google Veo model. Returns the path of each generated video file; these paths must be relayed to the user.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| aspectRatio | No | Aspect ratio of the video | 16:9 |
| durationSeconds | No | Video length in seconds | 5 |
| fileName | No | File name for the saved video (without extension) | veo-1755209535150 |
| image | No | Image to use as the first frame of the video (optional) | |
| model | No | Model ID to use (e.g. veo-2.0-generate-001) | veo-2.0-generate-001 |
| numberOfVideos | No | Number of videos to generate (1-2) | 1 |
| personGeneration | No | Person generation policy | dont_allow |
| prompt | Yes | Text prompt for video generation | |
| saveDir | No | Directory to save the videos in | ./temp |
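
Putting the schema together: only `prompt` is required, and the nested `image` object pairs base64 bytes with a MIME type. Below is a sketch of two argument objects an MCP client might send; the prompts, the `toImageArg` helper, and the byte values are illustrative, not taken from the source.

```typescript
// Helper (illustrative) that wraps raw bytes into the shape the
// schema expects for the optional `image` parameter.
function toImageArg(bytes: Uint8Array, mimeType: string) {
  return { imageBytes: Buffer.from(bytes).toString("base64"), mimeType };
}

// Minimal call: every field except `prompt` falls back to its default.
const minimalArgs = {
  prompt: "A drone shot over a snow-covered pine forest at sunrise",
};

// Image-to-video call: the image becomes the first frame. The bytes here
// are just the 8-byte PNG signature, used to produce a plausible payload.
const pngSignature = new Uint8Array([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);
const imageArgs = {
  prompt: "Animate this still into a slow push-in shot",
  image: toImageArg(pngSignature, "image/png"),
  aspectRatio: "9:16",  // one of the two values the schema allows
  durationSeconds: 8,   // schema range is 5-8
};
```

An MCP client would pass either object as the `arguments` of a `callTool` request naming `mcp_gemini_generate_videos`.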

Implementation Reference

  • MCP tool handler for mcp_gemini_generate_videos: invokes geminiService.generateVideos with args and formats the response as ToolResponse, handling success and error cases.
    async handler(args: any): Promise<ToolResponse> {
      try {
        const result = await geminiService.generateVideos(args);
        return {
          content: [{
            type: 'text',
            text: `Videos generated successfully. Generated video files: ${JSON.stringify(result.videos)}\nA total of ${result.count} video(s) were generated.`
          }]
        };
      } catch (error) {
        return {
          content: [{
            type: 'text',
            text: `Gemini video generation error: ${error instanceof Error ? error.message : String(error)}`
          }]
        };
      }
    }
  • Input schema definition for the mcp_gemini_generate_videos tool, defining parameters like model, prompt, image, numberOfVideos, aspectRatio, etc.
    inputSchema: {
      type: 'object',
      properties: {
        model: {
          type: 'string',
          description: 'Model ID to use (e.g. veo-2.0-generate-001)',
          default: 'veo-2.0-generate-001',
        },
        prompt: {
          type: 'string',
          description: 'Text prompt for video generation',
        },
        image: {
          type: 'object',
          description: 'Image to use as the first frame of the video (optional)',
          properties: {
            imageBytes: {
              type: 'string',
              description: 'Base64-encoded image data',
            },
            mimeType: {
              type: 'string',
              description: 'Image MIME type (e.g. image/png)',
            },
          },
        },
        numberOfVideos: {
          type: 'number',
          description: 'Number of videos to generate (1-2)',
          default: 1,
          minimum: 1,
          maximum: 2,
        },
        aspectRatio: {
          type: 'string',
          description: 'Aspect ratio of the video',
          default: '16:9',
          enum: ['16:9', '9:16'],
        },
        personGeneration: {
          type: 'string',
          description: 'Person generation policy',
          default: 'dont_allow',
          enum: ['dont_allow', 'allow_adult'],
        },
        durationSeconds: {
          type: 'number',
          description: 'Video length in seconds',
          default: 5,
          minimum: 5,
          maximum: 8,
        },
        saveDir: {
          type: 'string',
          description: 'Directory to save the videos in',
          default: './temp',
        },
        fileName: {
          type: 'string',
          description: 'File name for the saved video (without extension)',
          default: `veo-${Date.now()}`,
        },
      },
      required: ['prompt']
    },
  • Registration of mcp_gemini_generate_videos in the main tools export array used by the MCP server for listTools and callTool handlers.
    {
      name: 'mcp_gemini_generate_videos',
      description: 'Generates videos using the Google Veo model. Returns the generated video file paths, which must be relayed to the user.',
      inputSchema: {
        type: 'object',
        properties: {
          model: {
            type: 'string',
            description: 'Model ID to use (e.g. veo-2.0-generate-001)',
            default: 'veo-2.0-generate-001',
          },
          prompt: {
            type: 'string',
            description: 'Text prompt for video generation',
          },
          image: {
            type: 'object',
            description: 'Image to use as the first frame of the video (optional)',
            properties: {
              imageBytes: {
                type: 'string',
                description: 'Base64-encoded image data',
              },
              mimeType: {
                type: 'string',
                description: 'Image MIME type (e.g. image/png)',
              },
            },
          },
          numberOfVideos: {
            type: 'number',
            description: 'Number of videos to generate (1-2)',
            default: 1,
            minimum: 1,
            maximum: 2,
          },
          aspectRatio: {
            type: 'string',
            description: 'Aspect ratio of the video',
            default: '16:9',
            enum: ['16:9', '9:16'],
          },
          personGeneration: {
            type: 'string',
            description: 'Person generation policy',
            default: 'dont_allow',
            enum: ['dont_allow', 'allow_adult'],
          },
          durationSeconds: {
            type: 'number',
            description: 'Video length in seconds',
            default: 5,
            minimum: 5,
            maximum: 8,
          },
          saveDir: {
            type: 'string',
            description: 'Directory to save the videos in',
            default: './temp',
          },
          fileName: {
            type: 'string',
            description: 'File name for the saved video (without extension)',
            default: `veo-${Date.now()}`,
          },
        },
        required: ['prompt']
      },
      async handler(args: any): Promise<ToolResponse> {
        try {
          const result = await geminiService.generateVideos(args);
          return {
            content: [{
              type: 'text',
              text: `Videos generated successfully. Generated video files: ${JSON.stringify(result.videos)}\nA total of ${result.count} video(s) were generated.`
            }]
          };
        } catch (error) {
          return {
            content: [{
              type: 'text',
              text: `Gemini video generation error: ${error instanceof Error ? error.message : String(error)}`
            }]
          };
        }
      }
    },
  • Core helper function implementing video generation: sends POST to Veo API, polls asynchronous operation status every 10s, downloads MP4 videos using URI with API key, saves to local temp directory, returns file paths.
    async generateVideos({
      model,
      prompt,
      image = null,
      numberOfVideos = 1,
      aspectRatio = '16:9',
      personGeneration = 'dont_allow',
      durationSeconds = 5,
      saveDir = './temp',
      fileName = `veo-${Date.now()}`,
    }: {
      model: string;
      prompt: string;
      image?: { imageBytes: string; mimeType: string } | null;
      numberOfVideos?: number;
      aspectRatio?: string;
      personGeneration?: string;
      durationSeconds?: number;
      saveDir?: string;
      fileName?: string;
    }) {
      try {
        const config = this.getRequestConfig();
        const url = `${this.baseUrl}/models/${model}:generateVideos`;
    
        const requestData: any = {
          prompt: {
            text: prompt,
          },
          config: {
            aspectRatio,
            numberOfVideos,
            durationSeconds,
            personGeneration,
          }
        };
    
        // Attach the image if one was provided
        if (image) {
          requestData.image = image;
        }
    
        // Kick off the video generation request
        const response = await axios.post(url, requestData, config);
        
        // Grab the long-running operation name
        const operationName = response.data.name;
        
        if (!operationName) {
          throw new Error('Failed to start the video generation operation.');
        }
    
        // Poll the asynchronous operation until it completes
        const operationUrl = `${this.baseUrl}/${operationName}`;
        let operation: { done: boolean; response: any } = { done: false, response: null };
        
        while (!operation.done) {
          // Wait 10 seconds
          await new Promise(resolve => setTimeout(resolve, 10000));
          
          // Check the operation status
          const statusResponse = await axios.get(operationUrl, config);
          operation = statusResponse.data;
        }
    
        // Download and save the videos
        const fs = await import('fs');
        const path = await import('path');
        
        // Create the save directory if it does not exist
        if (!fs.existsSync(saveDir)) {
          fs.mkdirSync(saveDir, { recursive: true });
        }
    
        const savedFiles = [];
        const generatedVideos = operation.response?.generatedVideos || [];
    
        for (let i = 0; i < generatedVideos.length; i++) {
          const videoUri = generatedVideos[i]?.video?.uri;
          
          if (videoUri) {
            // Append the API key
            const downloadUrl = `${videoUri}&key=${this.apiKey}`;
            
            // Download the video
            const videoResponse = await axios.get(downloadUrl, { responseType: 'arraybuffer' });
            const filePath = path.join(saveDir, `${fileName}-${i + 1}.mp4`);
            
            fs.writeFileSync(filePath, Buffer.from(videoResponse.data));
            savedFiles.push(filePath);
          }
        }
    
        return {
          model: model,
          prompt: prompt,
          videos: savedFiles,
          count: savedFiles.length,
        };
      } catch (error) {
        throw this.formatError(error);
      }
    }
  • src/index.ts:49-49 (registration)
    Tool capability declaration in MCP server initialization (set to false, indicating disabled but registered).
    mcp_gemini_generate_videos: false,
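
The generateVideos helper above polls the operation every 10 seconds with no upper bound, so a stalled operation would block forever. A bounded variant is sketched below; the generic shape and the attempt cap are additions of this sketch, not part of the source code.

```typescript
// Generic bounded poller: invokes `check` until it reports done (with a
// non-null response) or the attempt budget runs out. The 10-second default
// interval mirrors the source; the cap is new.
async function pollUntilDone<T>(
  check: () => Promise<{ done: boolean; response: T | null }>,
  intervalMs = 10_000,
  maxAttempts = 60, // about 10 minutes at the default interval
): Promise<T> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const op = await check();
    if (op.done && op.response !== null) return op.response;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Operation still not done after ${maxAttempts} attempts`);
}
```

Inside generateVideos, the while loop could then be replaced by something like `await pollUntilDone(() => axios.get(operationUrl, config).then(r => r.data))`, assuming the operation payload keeps the `{ done, response }` shape shown above.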
Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions that the tool returns a generated video file path that must be communicated to the user, which is useful. However, it lacks critical behavioral details: whether this is a read-only or write operation, potential rate limits, authentication requirements, file size implications, or what happens if generation fails. For a complex video generation tool with 9 parameters, this is insufficient.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with two sentences that each serve a purpose: stating the core function and specifying the return value requirement. It's front-loaded with the main purpose. However, the second sentence about file paths could be more integrated with the first for better flow.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex video generation tool with 9 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain what the tool returns beyond file paths, doesn't mention error conditions, doesn't provide context about the Google Veo model's capabilities/limitations, and doesn't guide usage relative to similar tools. The agent would need to infer too much from the schema alone.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no parameter-specific information beyond what's already in the schema (which has 100% coverage). It doesn't explain relationships between parameters (e.g., how 'image' interacts with 'prompt'), provide examples, or clarify edge cases. With complete schema documentation, the baseline is 3, but the description doesn't enhance parameter understanding.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Google Veo 모델을 사용하여 비디오를 생성합니다' (uses Google Veo model to generate videos). It specifies the verb (generate) and resource (videos) with the model context. However, it doesn't explicitly differentiate from sibling tools like mcp_gemini_create_image or mcp_imagen_generate, which are also media generation tools.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention when video generation is appropriate compared to image generation tools (like mcp_gemini_generate_image) or other video-related tools. The only usage hint is about returning the file path, which is operational rather than contextual.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
