generate_video
Generate a video from a text description using AI. Supports async processing with optional image conditioning for first/last frames or style references.
Instructions
Generate a video from a text prompt using an OpenRouter video-generation model (default: google/veo-3.1). Submits an async job, polls until completion or max_wait_ms, then downloads the result. Optionally conditioned on first/last-frame images or reference images. Large outputs are auto-saved when save_path is provided and path-sandboxed.
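The submit → poll → download-or-resume lifecycle described above can be sketched as follows. This is a hedged illustration only: the `client` shape, `generateVideoFlow`, and the `resumeWith` handle are stand-ins invented for this sketch, not the actual OpenRouter client API (the real tool returns a `video_id` for `get_video_status` on timeout).

```typescript
// Illustrative sketch of the async job lifecycle (all names hypothetical).
type Job = { id: string; status: 'processing' | 'completed' | 'failed'; url?: string };

async function generateVideoFlow(
  client: { submit(): Promise<Job>; poll(id: string): Promise<Job> },
  maxWaitMs = 600_000,     // mirrors the documented max_wait_ms default
  pollIntervalMs = 15_000, // mirrors the documented poll_interval_ms default
): Promise<Job | { resumeWith: string }> {
  const job = await client.submit();
  const deadline = Date.now() + maxWaitMs;
  while (Date.now() < deadline) {
    const status = await client.poll(job.id);
    if (status.status !== 'processing') return status; // terminal: completed or failed
    await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  }
  // Deadline hit: hand back a resumable id, analogous to the tool's
  // JOB_STILL_RUNNING response that points callers at get_video_status.
  return { resumeWith: job.id };
}
```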
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | Text description of the desired video. | |
| model | No | Override the video model ID. | |
| resolution | No | 480p / 720p / 1080p / 1K / 2K / 4K (model-dependent). | |
| aspect_ratio | No | 16:9 / 9:16 / 1:1 / 4:3 / 3:4 / 21:9 / 9:21 (model-dependent). | |
| duration | No | Duration in seconds (model-dependent). | |
| seed | No | Deterministic seed when supported. | |
| first_frame_image | No | Optional image (path, URL, or data URL) used as the first frame for image-to-video. | |
| last_frame_image | No | Optional image used as the last frame for frame transitions. | |
| reference_images | No | Optional style/content reference images. | |
| provider | No | Provider-specific passthrough options keyed by provider slug. | |
| save_path | No | Where to save the video. Routed through the OPENROUTER_OUTPUT_DIR sandbox; extension auto-corrected. | |
| max_wait_ms | No | Total time to wait for the async job before returning a resumable handle (default 600000 ms). | |
| poll_interval_ms | No | Polling cadence (default 15000 ms). | |
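For illustration, a minimal call needs only `prompt`; a fuller, image-conditioned call might look like the second object below. All values are hypothetical examples, not defaults pulled from the implementation (except `model`, which restates the documented default):

```typescript
// Illustrative argument objects for generate_video (values are made up).
const minimalArgs = {
  prompt: 'A timelapse of clouds rolling over snowy mountains',
};

const conditionedArgs = {
  prompt: 'Slow dolly-in continuing the motion implied by the still frame',
  model: 'google/veo-3.1',       // the documented default, stated explicitly
  resolution: '1080p',
  aspect_ratio: '16:9',
  duration: 8,                   // seconds, model-dependent
  first_frame_image: './frames/start.png',
  save_path: 'clips/dolly.mp4',  // resolved inside the OPENROUTER_OUTPUT_DIR sandbox
  max_wait_ms: 300_000,
  poll_interval_ms: 15_000,
};
```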
Implementation Reference
- Main handler function for the generate_video tool. Validates input, submits a video-generation job via the OpenRouter API, polls until completion or timeout, then downloads and returns/saves the result.

```typescript
export async function handleGenerateVideo(
  request: { params: { arguments: GenerateVideoToolRequest } },
  apiClient: OpenRouterAPIClient,
  progress?: ProgressHook,
) {
  const args = request.params.arguments ?? ({} as GenerateVideoToolRequest);
  if (!args.prompt || !args.prompt.trim()) {
    return toolError(ErrorCode.INVALID_INPUT, 'prompt is required.');
  }

  // Fail-fast on unsafe save_path BEFORE spending credits on the job.
  let safeSavePath: string | null = null;
  if (args.save_path) {
    try {
      safeSavePath = await resolveSafeOutputPath(args.save_path);
    } catch (err) {
      if (err instanceof UnsafeOutputPathError) return toolErrorFrom(ErrorCode.UNSAFE_PATH, err);
      return toolErrorFrom(ErrorCode.INTERNAL, err);
    }
  }

  const model = args.model || process.env.OPENROUTER_DEFAULT_VIDEO_GEN_MODEL || FALLBACK_MODEL;
  const body = buildRequestBody(args, model);

  try {
    await attachFrameImages(args, body);
  } catch (err) {
    // Sandbox violation → UNSAFE_PATH; all other decode failures stay
    // as UNSUPPORTED_FORMAT (couldn't read, invalid data URL, etc.).
    if (err instanceof UnsafeOutputPathError) {
      return toolErrorFrom(ErrorCode.UNSAFE_PATH, err, 'Reference/frame image');
    }
    return toolErrorFrom(ErrorCode.UNSUPPORTED_FORMAT, err, 'Reference/frame image');
  }

  let envelope: VideoJobEnvelope;
  try {
    logger.info('generate_video.submit', { model, keys: Object.keys(body) });
    envelope = await apiClient.submitVideoJob(body);
  } catch (err) {
    return classifyUpstreamError(err, 'generate_video.submit');
  }

  const pollIntervalMs = Math.max(
    MIN_POLL_INTERVAL_MS,
    args.poll_interval_ms ?? getDefaultPollInterval(),
  );
  const maxWaitMs = Math.max(100, args.max_wait_ms ?? getDefaultMaxWait());
  const deadlineAt = Date.now() + maxWaitMs;

  const outcome = await pollUntilTerminal(apiClient, envelope, {
    pollIntervalMs,
    deadlineAt,
    onProgress: progress,
  });

  if (outcome.kind === 'failed') {
    return toolError(ErrorCode.JOB_FAILED, extractJobError(outcome.status), {
      video_id: outcome.status.id,
    });
  }

  if (outcome.kind === 'timeout') {
    return {
      content: [
        {
          type: 'text' as const,
          text: `Video still generating after ${maxWaitMs}ms. Use get_video_status with video_id=${envelope.id} to resume.`,
        },
      ],
      isError: false as const,
      _meta: {
        code: ErrorCode.JOB_STILL_RUNNING,
        video_id: envelope.id,
        polling_url: envelope.polling_url ?? `https://openrouter.ai/api/v1/videos/${envelope.id}`,
        last_status: outcome.last?.status,
      },
    };
  }

  try {
    const { content, _meta } = await finalizeCompletedJob(
      apiClient,
      outcome.status,
      safeSavePath,
    );
    return { content, _meta };
  } catch (err) {
    if (err instanceof UnsafeOutputPathError) {
      return toolErrorFrom(ErrorCode.UNSAFE_PATH, err);
    }
    return toolErrorFrom(ErrorCode.UPSTREAM_HTTP, err, 'Download');
  }
}
```

- Input/request type definition for the generate_video tool, accepting prompt, model, resolution, aspect_ratio, duration, seed, first/last frame images, reference images, provider options, save_path, and polling parameters.
```typescript
export interface GenerateVideoToolRequest {
  prompt: string;
  model?: string;
  resolution?: string;
  aspect_ratio?: string;
  duration?: number;
  seed?: number;
  first_frame_image?: string;
  last_frame_image?: string;
  reference_images?: string[];
  provider?: Record<string, unknown>;
  save_path?: string;
  max_wait_ms?: number;
  poll_interval_ms?: number;
}
```

- src/tool-handlers.ts:371-438 (registration): Tool registration in ListToolsRequestSchema — defines the 'generate_video' tool name, description, annotations, and inputSchema with all parameters.
```typescript
{
  name: 'generate_video',
  description:
    'Generate a video from a text prompt using an OpenRouter video-generation model (default: google/veo-3.1). ' +
    'Submits an async job, polls until completion or max_wait_ms, then downloads the result. ' +
    'Optionally conditioned on first/last-frame images or reference images. ' +
    'Large outputs are auto-saved when save_path is provided and path-sandboxed.',
  annotations: {
    readOnlyHint: false,
    destructiveHint: false,
    idempotentHint: false,
  },
  inputSchema: {
    type: 'object',
    properties: {
      prompt: { type: 'string', description: 'Text description of the desired video.' },
      model: { type: 'string', description: 'Override the video model ID.' },
      resolution: {
        type: 'string',
        description: '480p / 720p / 1080p / 1K / 2K / 4K (model-dependent).',
      },
      aspect_ratio: {
        type: 'string',
        description: '16:9 / 9:16 / 1:1 / 4:3 / 3:4 / 21:9 / 9:21 (model-dependent).',
      },
      duration: {
        type: 'number',
        minimum: 1,
        description: 'Duration in seconds (model-dependent).',
      },
      seed: { type: 'number', description: 'Deterministic seed when supported.' },
      first_frame_image: {
        type: 'string',
        description: 'Optional image (path, URL, or data URL) used as the first frame for image-to-video.',
      },
      last_frame_image: {
        type: 'string',
        description: 'Optional image used as the last frame for frame transitions.',
      },
      reference_images: {
        type: 'array',
        items: { type: 'string' },
        description: 'Optional style/content reference images.',
      },
      provider: {
        type: 'object',
        description: 'Provider-specific passthrough options keyed by provider slug.',
      },
      save_path: {
        type: 'string',
        description: 'Where to save the video. Routed through the OPENROUTER_OUTPUT_DIR sandbox; extension auto-corrected.',
      },
      max_wait_ms: {
        type: 'number',
        minimum: 10000,
        description: 'Total time to wait for the async job before returning a resumable handle (default 600000 ms).',
      },
      poll_interval_ms: {
        type: 'number',
        minimum: 2000,
        description: 'Polling cadence (default 15000 ms).',
      },
    },
    required: ['prompt'],
  },
},
```

- src/tool-handlers.ts:521-525 (registration): Tool dispatch in CallToolRequestSchema — routes 'generate_video' requests to handleGenerateVideo with wrapped arguments and the API client.
```typescript
case 'generate_video':
  return handleGenerateVideo(
    wrapToolArgs(args as GenerateVideoToolRequest | undefined),
    this.apiClient,
  );
```

- Helper function to prepare image inputs (file path, HTTP URL, or data URL) for the first/last frame-image and reference_images fields.
```typescript
async function prepareImageInput(
  source: string,
): Promise<{ data: string; mime: string } | null> {
  if (!source) return null;

  if (source.startsWith('data:')) {
    const match = source.match(/^data:([^;,]+)(?:;[^,]*)*;base64,(.+)$/);
    if (!match) throw new Error(`Invalid image data URL: ${source.slice(0, 40)}…`);
    return { mime: match[1]!, data: match[2]! };
  }

  if (source.startsWith('http://') || source.startsWith('https://')) {
    const { fetchHttpResource } = await import('./fetch-utils.js');
    const { buffer, contentType } = await fetchHttpResource(source, {
      timeoutMs: 30_000,
      maxBytes: 25 * 1024 * 1024,
      maxRedirects: 8,
    });
    const mime = (contentType?.split(';')[0]?.trim() || 'image/jpeg').toLowerCase();
    return { mime, data: buffer.toString('base64') };
  }

  // Local file: sandbox via path-safety's resolveSafeInputPath so
  // generate_video's first_frame_image / last_frame_image /
  // reference_images fields enforce the same OPENROUTER_INPUT_DIR /
  // OPENROUTER_OUTPUT_DIR / cwd scope that generate_image's
  // input_images already uses. Callers can still bypass with
  // OPENROUTER_ALLOW_UNSAFE_PATHS=1 for legacy scripts.
  const abs = await resolveSafeInputPath(source);
  const buf = await fs.readFile(abs);
  const ext = extname(abs).toLowerCase();
  const mime =
    ext === '.png' ? 'image/png'
    : ext === '.webp' ? 'image/webp'
    : ext === '.gif' ? 'image/gif'
    : 'image/jpeg';
  return { mime, data: buf.toString('base64') };
}
```
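The two pure pieces of prepareImageInput — the data-URL regex and the extension-to-MIME fallback — can be exercised standalone. This sketch extracts them for illustration (the HTTP and file-reading branches need I/O and are omitted); unlike the handler above, `parseDataUrl` here returns `null` for a malformed data URL instead of throwing:

```typescript
// Standalone illustration of prepareImageInput's pure logic (extracted
// for demonstration; not the functions the module actually exports).
function parseDataUrl(source: string): { mime: string; data: string } | null {
  // Same pattern as the handler: capture the MIME type, skip any extra
  // parameters (e.g. ;charset=utf-8), then require ;base64,<payload>.
  const match = source.match(/^data:([^;,]+)(?:;[^,]*)*;base64,(.+)$/);
  return match ? { mime: match[1]!, data: match[2]! } : null;
}

function mimeFromExt(ext: string): string {
  // Mirrors the local-file branch: known extensions map directly,
  // everything else falls back to image/jpeg.
  return ext === '.png' ? 'image/png'
    : ext === '.webp' ? 'image/webp'
    : ext === '.gif' ? 'image/gif'
    : 'image/jpeg';
}
```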