# 🛠️ Veo 3.1 MCP - Complete Tools Reference
## Overview
The Veo 3.1 MCP server provides **6 tools** for AI video generation. This document describes each tool in detail with examples and best practices.
---
## Tool 1: `upload_image`
### Purpose
Upload an image to Google Files API and get a reusable `fileUri`. Use this for:
- Reference images (style guidance)
- First/last frames (interpolation)
- Frequently reused images (48h validity)
### Why This Tool?
**Token Efficiency:** Instead of passing huge base64 blobs, upload once and reuse the short fileUri.
### Input Schema
```typescript
{
source: 'url' | 'file_path', // Required
url?: string, // If source='url'
filePath?: string, // If source='file_path'
displayName?: string // Optional friendly name
}
```
### Output
```json
{
"success": true,
"file": {
"uri": "files/abc123",
"name": "files/abc123",
"displayName": "My Reference Image",
"mimeType": "image/jpeg",
"sizeBytes": "524288",
"expirationTime": "2025-11-24T12:00:00Z"
}
}
```
### Examples
**Upload local file:**
```json
{
"source": "file_path",
"filePath": "C:\\Users\\woute\\Pictures\\style-ref.jpg",
"displayName": "Corporate Style Reference"
}
```
**Upload from URL:**
```json
{
"source": "url",
"url": "https://example.com/brand-guide.png"
}
```
### Best Practices
✅ **Pre-upload** frequently used references
✅ **Reuse fileUris** for 48 hours
✅ **Name descriptively** for tracking
⚠️ **Files expire** after 48h - re-upload if needed
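Because files expire, a caller that caches `fileUri` values may want to check `expirationTime` before reusing one. The following is a minimal sketch of such a check, not part of the server: the `isFileUsable` helper and its 10-minute safety margin are assumptions made for illustration, and only the `uri` and `expirationTime` fields come from the output shown above.

```typescript
// Subset of the `file` object returned by upload_image.
interface UploadedFile {
  uri: string;
  expirationTime: string; // ISO 8601 timestamp
}

// Hypothetical helper: true when the file should still be usable at `now`.
// The safety margin (an arbitrary 10 minutes by default) avoids starting a
// job with a reference that expires moments later.
function isFileUsable(
  file: UploadedFile,
  now: Date = new Date(),
  safetyMarginMs: number = 10 * 60 * 1000
): boolean {
  const expiresAt = Date.parse(file.expirationTime);
  if (isNaN(expiresAt)) {
    return false; // Unparseable timestamp: treat as expired and re-upload.
  }
  return now.getTime() + safetyMarginMs < expiresAt;
}

const cachedFile: UploadedFile = {
  uri: "files/abc123",
  expirationTime: "2025-11-24T12:00:00Z",
};

console.log(isFileUsable(cachedFile, new Date("2025-11-23T12:00:00Z"))); // true
console.log(isFileUsable(cachedFile, new Date("2025-11-24T11:55:00Z"))); // false
```

When the check returns `false`, calling `upload_image` again yields a fresh `fileUri`.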
---
## Tool 2: `start_video_generation`
### Purpose
Start a Veo 3.1 video generation job. Returns immediately with an operation ID that you poll with `get_video_job`.
### Input Schema (Complete)
```typescript
{
// Required
prompt: string, // Video description
// Model & Quality
model?: 'veo-3.1-generate-001' | // Quality ($0.20/sec)
'veo-3.1-fast-generate-001', // Speed ($0.10/sec) [default]
// Format
durationSeconds?: 4 | 6 | 8, // Default: 8
aspectRatio?: '16:9' | '9:16', // Default: 16:9
resolution?: '720p' | '1080p', // Default: 1080p
// Control
seed?: number, // Reproducibility
sampleCount?: 1 | 2 | 3 | 4, // Videos per request (default: 1)
generateAudio?: boolean, // Include audio (default: false)
// Visual Guidance (token-efficient!)
referenceImages?: ReferenceImage[], // 0-3 images
firstFrame?: ReferenceImage, // Must have both or neither
lastFrame?: ReferenceImage, // Must have both or neither
// Advanced
negativePrompt?: string, // Things to avoid
resizeMode?: 'pad' | 'crop' // Ref image fitting
}
```
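The schema refers to `ReferenceImage` without defining it. Judging only from the request examples in this document, it appears to be an object whose `source` field selects one of `url`, `filePath`, or `fileUri`. The sketch below models that as a discriminated union; the exact type in the server may differ, and `locationOf` and `needsUpload` are hypothetical helpers.

```typescript
// Assumed shape, inferred from the request examples in this document.
type ReferenceImage =
  | { source: "url"; url: string }
  | { source: "file_path"; filePath: string }
  | { source: "file_uri"; fileUri: string };

// Returns the one location field that matches `source`. Narrowing on
// `source` lets the compiler reject, say, `ref.url` on a "file_uri" entry.
function locationOf(ref: ReferenceImage): string {
  if (ref.source === "url") {
    return ref.url;
  }
  if (ref.source === "file_path") {
    return ref.filePath;
  }
  return ref.fileUri;
}

// Only "file_uri" references are already in the Files API.
function needsUpload(ref: ReferenceImage): boolean {
  return ref.source !== "file_uri";
}

console.log(locationOf({ source: "file_uri", fileUri: "files/brand123" })); // files/brand123
console.log(needsUpload({ source: "url", url: "https://cdn.example.com/product-style.jpg" })); // true
```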
### Output
```json
{
"success": true,
"operationName": "operations/abc123xyz",
"done": false,
"message": "Video generation started. Use get_video_job to poll.",
"estimatedTime": "30-120 seconds depending on complexity"
}
```
### Examples
#### Simple Text-to-Video
```json
{
"prompt": "A serene Japanese garden at sunrise, gentle wind moving bamboo leaves, cinematic camera pan"
}
```
Uses defaults: 8s, 16:9, 1080p, fast model, no audio
#### High-Quality with Audio
```json
{
"prompt": "Epic space battle with explosions and lasers",
"model": "veo-3.1-generate-001",
"durationSeconds": 8,
"resolution": "1080p",
"generateAudio": true
}
```
Cost: $3.20
#### With Reference Images (Token-Efficient!)
```json
{
"prompt": "A modern tech product showcase, rotating view",
"referenceImages": [
{
"source": "url",
"url": "https://cdn.example.com/product-style.jpg"
}
],
"durationSeconds": 6,
"resolution": "1080p"
}
```
The server uploads the image at the URL to the Files API and uses the short `fileUri` internally.
#### With Pre-Uploaded References (Most Efficient!)
```json
{
"prompt": "Corporate brand video with our signature style",
"referenceImages": [
{
"source": "file_uri",
"fileUri": "files/brand123"
}
]
}
```
Zero upload overhead: the `fileUri` returned by `upload_image` is used directly!
#### First/Last Frame Interpolation
```json
{
"prompt": "Smooth cinematic transition with camera zoom",
"firstFrame": {
"source": "file_path",
"filePath": "C:\\frames\\opening.jpg"
},
"lastFrame": {
"source": "file_path",
"filePath": "C:\\frames\\closing.jpg"
},
"durationSeconds": 8
}
```
Veo creates a coherent video interpolating between frames!
#### Multiple Variations with Seeds
```json
{
"prompt": "Abstract art video with flowing colors",
"seed": 42,
"sampleCount": 4
}
```
Generates 4 deterministic variations. Cost: 4× the base cost.
### Validation Rules
✅ **Duration:** Must be exactly 4, 6, or 8
✅ **References:** Max 3 images
✅ **Samples:** Max 4 per request
✅ **First/Last:** Both present or both absent
⚠️ **9:16 + refs:** May not be supported (use 16:9)
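The rules above can be checked on the client before a request is sent. This is a sketch of one way to do it, not the server's own validator: `validateRequest` and the `VideoRequest` subset type are illustrative names, and the 9:16 case is reported as a warning because the document only says it may not be supported.

```typescript
// Subset of the start_video_generation input covered by the rules above.
interface VideoRequest {
  prompt: string;
  durationSeconds?: number;
  sampleCount?: number;
  aspectRatio?: string;
  referenceImages?: unknown[];
  firstFrame?: unknown;
  lastFrame?: unknown;
}

// Hypothetical client-side check. Empty `errors` means the request passed.
function validateRequest(req: VideoRequest): { errors: string[]; warnings: string[] } {
  const errors: string[] = [];
  const warnings: string[] = [];
  const refCount = req.referenceImages ? req.referenceImages.length : 0;

  if (req.durationSeconds !== undefined && [4, 6, 8].indexOf(req.durationSeconds) === -1) {
    errors.push("durationSeconds must be exactly 4, 6, or 8");
  }
  if (refCount > 3) {
    errors.push("referenceImages allows at most 3 images");
  }
  if (req.sampleCount !== undefined) {
    const isWhole = Math.floor(req.sampleCount) === req.sampleCount;
    if (!isWhole || req.sampleCount < 1 || req.sampleCount > 4) {
      errors.push("sampleCount must be 1, 2, 3, or 4");
    }
  }
  // Exactly one of the two frames present is the invalid case.
  if ((req.firstFrame === undefined) !== (req.lastFrame === undefined)) {
    errors.push("firstFrame and lastFrame must both be present or both absent");
  }
  if (req.aspectRatio === "9:16" && refCount > 0) {
    warnings.push("9:16 with reference images may not be supported; consider 16:9");
  }
  return { errors, warnings };
}

const checked = validateRequest({
  prompt: "Smooth cinematic transition",
  durationSeconds: 5,
  firstFrame: { source: "file_uri", fileUri: "files/first" },
});
console.log(checked.errors.length); // 2
```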
---
## Tool 3: `get_video_job`
### Purpose
Poll a video generation operation until complete, then retrieve video URLs.
### Input
```json
{
"operationName": "operations/abc123xyz"
}
```
### Output (In Progress)
```json
{
"done": false,
"status": "RUNNING",
"operationName": "operations/abc123xyz"
}
```
### Output (Complete)
```json
{
"done": true,
"status": "SUCCEEDED",
"operationName": "operations/abc123xyz",
"videos": [
{
"videoUri": "https://generativelanguage.googleapis.com/v1beta/files/video123:download?alt=media",
"mimeType": "video/mp4",
"durationSeconds": 8,
"resolution": "1080p"
}
]
}
```
### Polling Strategy
```
1. Start generation
2. Wait 15-30 seconds
3. Poll with get_video_job
4. If done=false, wait 15-30s more
5. Repeat until done=true
6. Download video from videoUri
```
**Don't poll too frequently** (< 10s) - wastes resources.
### Typical Flow
```
Start: operations/abc123
Poll @15s: {done: false, status: "RUNNING"}
Poll @45s: {done: false, status: "RUNNING"}
Poll @90s: {done: true, videos: [...]}
```
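The polling strategy above can be sketched as a small loop. How `get_video_job` is actually invoked depends on your MCP client, so the sketch takes it as an injected function (`getVideoJob`, a stand-in name); `pollUntilDone`, `defaultSleep`, and the default of 40 attempts are likewise assumptions for illustration.

```typescript
// Minimal view of the get_video_job output shown above.
interface JobStatus {
  done: boolean;
  status?: string;
  videos?: { videoUri: string }[];
}

function defaultSleep(ms: number): Promise<void> {
  return new Promise<void>((resolve) => setTimeout(resolve, ms));
}

// Waits `intervalMs`, polls, and repeats until done=true or attempts run out.
async function pollUntilDone(
  operationName: string,
  getVideoJob: (operationName: string) => Promise<JobStatus>,
  intervalMs: number = 15000,
  maxAttempts: number = 40,
  sleep: (ms: number) => Promise<void> = defaultSleep
): Promise<JobStatus> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await sleep(intervalMs); // Wait before every poll, including the first.
    const job = await getVideoJob(operationName);
    if (job.done) {
      return job;
    }
  }
  throw new Error(`${operationName} not done after ${maxAttempts} polls`);
}

// Demo with a fake tool call that reports done on the third poll.
let demoPolls = 0;
const fakeGetVideoJob = async (_operationName: string): Promise<JobStatus> => {
  demoPolls += 1;
  return demoPolls < 3
    ? { done: false, status: "RUNNING" }
    : { done: true, status: "SUCCEEDED", videos: [{ videoUri: "https://example.com/video.mp4" }] };
};

pollUntilDone("operations/abc123xyz", fakeGetVideoJob, 10).then((job) => {
  console.log(job.status, "after", demoPolls, "polls"); // SUCCEEDED after 3 polls
});
```

Bounding the number of attempts keeps a stuck operation from being polled forever; the demo shortens the interval to 10 ms only so it finishes quickly.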
---
## Tool 4: `extend_video`
### Purpose
Extend a Veo-generated video by additional seconds.
### Input
```json
{
"videoFileUri": "files/video_abc123",
"additionalSeconds": 7,
"prompt": "The camera continues to pan left revealing more of the landscape",
"model": "veo-3.1-fast-generate-001",
"seed": 42
}
```
### Output
```json
{
"success": true,
"operationName": "operations/extend_xyz",
"done": false,
"message": "Video extension started. Use get_video_job to poll."
}
```
### Important Notes
⚠️ **Only Veo-generated videos** can be extended
⚠️ **Cannot extend arbitrary MP4s** you didn't create with Veo
⚠️ **Use videoFileUri** from previous Veo generation result
### Example Flow
```
1. Generate 8s video → get files/video1
2. extend_video {videoFileUri: "files/video1", additionalSeconds: 7}
3. Result: 15s video total
4. Can extend again if needed
```
---
## Tool 5: `start_batch_video_generation`
### Purpose
Generate multiple videos with controlled concurrency to respect rate limits.
### Input
```json
{
"jobs": [
{
"key": "scene1_take1",
"request": {
"prompt": "Opening scene of a tech startup office",
"seed": 1
}
},
{
"key": "scene1_take2",
"request": {
"prompt": "Opening scene of a tech startup office",
"seed": 2
}
},
{
"key": "scene2",
"request": {
"prompt": "Product demo with smooth camera movement",
"durationSeconds": 6
}
}
],
"concurrency": 3
}
```
### Output
```json
{
"success": true,
"batchId": "batch_1732298400000",
"totalJobs": 3,
"operations": [
{
"key": "scene1_take1",
"operationName": "operations/op1"
},
{
"key": "scene1_take2",
"operationName": "operations/op2"
},
{
"key": "scene2",
"operationName": "operations/op3"
}
],
"message": "Use get_video_job for each operationName"
}
```
### Concurrency Recommendations
| Concurrency | Use Case |
|-------------|----------|
| 1-2 | Conservative, low rate-limit risk |
| 3 | **Recommended** for most batches |
| 4-5 | Aggressive (monitor for 429 errors) |
| 6+ | ⚠️ Will likely hit rate limits |
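To make the `concurrency` setting concrete, here is a sketch of a generic limiter that keeps at most `limit` calls in flight. It is not the server's implementation, which this document does not show; `runWithConcurrency` and the fake `fakeStart` worker are illustrative names.

```typescript
// Runs `worker` over `items` with at most `limit` calls in flight at once.
// Results are returned in input order.
async function runWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array<R>(items.length);
  let next = 0; // Index of the next unclaimed item.

  // Each lane claims the next item as soon as its previous one finishes.
  async function lane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  const lanes: Promise<void>[] = [];
  for (let i = 0; i < laneCount; i++) {
    lanes.push(lane());
  }
  await Promise.all(lanes);
  return results;
}

// Demo: a fake "start job" call that records how many calls overlap.
let inFlight = 0;
let peakInFlight = 0;
const fakeStart = async (key: string): Promise<string> => {
  inFlight += 1;
  peakInFlight = Math.max(peakInFlight, inFlight);
  await new Promise<void>((resolve) => setTimeout(resolve, 5));
  inFlight -= 1;
  return `operations/${key}`;
};

runWithConcurrency(["v1", "v2", "v3", "v4", "v5"], 3, fakeStart).then((operations) => {
  console.log(operations.length, "jobs started, peak in flight:", peakInFlight); // 5 jobs started, peak in flight: 3
});
```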
### Example: Generate 10 Video Variations
```json
{
"jobs": [
{"key": "v1", "request": {"prompt": "Product demo", "seed": 1}},
{"key": "v2", "request": {"prompt": "Product demo", "seed": 2}},
{"key": "v3", "request": {"prompt": "Product demo", "seed": 3}},
// ... up to v10
],
"concurrency": 3
}
```
**Time:** ~4-6 minutes total (vs 10-20 minutes sequential)
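Since the ten jobs above differ only in `key` and `seed`, the `jobs` array can be generated rather than typed out. A small sketch, where `buildSeedVariations` is a hypothetical helper and the `v1`..`v10` key scheme simply mirrors the example:

```typescript
interface BatchJob {
  key: string;
  request: { prompt: string; seed: number };
}

// Builds `count` jobs that share a prompt and differ only in seed (1..count).
function buildSeedVariations(prompt: string, count: number): BatchJob[] {
  const jobs: BatchJob[] = [];
  for (let seed = 1; seed <= count; seed++) {
    jobs.push({ key: `v${seed}`, request: { prompt, seed } });
  }
  return jobs;
}

const batchInput = {
  jobs: buildSeedVariations("Product demo", 10),
  concurrency: 3,
};

console.log(batchInput.jobs.length); // 10
console.log(JSON.stringify(batchInput.jobs[9])); // {"key":"v10","request":{"prompt":"Product demo","seed":10}}
```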
---
## Tool 6: `estimate_veo_cost`
### Purpose
Calculate cost in USD before generating, to plan budgets.
### Input
```json
{
"model": "veo-3.1-fast-generate-001",
"durationSeconds": 8,
"sampleCount": 1,
"generateAudio": false
}
```
### Output
```json
{
"estimatedCostUsd": 0.80,
"unitPricePerSec": 0.10,
"secondsBilled": 8,
"breakdown": "veo-3.1-fast-generate-001 (video only): $0.10/sec × 8s × 1 sample(s) = $0.80"
}
```
### Pricing Table
| Model | Video Only | Video + Audio |
|-------|------------|---------------|
| veo-3.1-generate-001 | $0.20/sec | $0.40/sec |
| veo-3.1-fast-generate-001 | $0.10/sec | $0.15/sec |
### Example Calculations
**Single 8s video (fast, no audio):**
```
$0.10/sec × 8s × 1 = $0.80
```
**Batch of 10 (fast, no audio):**
```
$0.10/sec × 8s × 10 = $8.00
```
**Quality with audio:**
```
$0.40/sec × 8s × 1 = $3.20
```
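The calculations above all follow the same formula: price per second × seconds × samples. The sketch below reproduces it locally using the prices from the pricing table; it is an illustration of the arithmetic, not the tool's actual code, and `estimateCostUsd` is a hypothetical name. Prices are held in cents so the multiplication stays in integers.

```typescript
type VeoModel = "veo-3.1-generate-001" | "veo-3.1-fast-generate-001";

// Prices from the pricing table above, in cents per second.
const CENTS_PER_SECOND: Record<VeoModel, { videoOnly: number; withAudio: number }> = {
  "veo-3.1-generate-001": { videoOnly: 20, withAudio: 40 },
  "veo-3.1-fast-generate-001": { videoOnly: 10, withAudio: 15 },
};

// cost = price per second × seconds × samples, converted to USD at the end.
function estimateCostUsd(
  model: VeoModel,
  durationSeconds: number,
  sampleCount: number = 1,
  generateAudio: boolean = false
): number {
  const price = CENTS_PER_SECOND[model];
  const centsPerSecond = generateAudio ? price.withAudio : price.videoOnly;
  return (centsPerSecond * durationSeconds * sampleCount) / 100;
}

console.log(estimateCostUsd("veo-3.1-fast-generate-001", 8)); // 0.8
console.log(estimateCostUsd("veo-3.1-fast-generate-001", 8, 10)); // 8
console.log(estimateCostUsd("veo-3.1-generate-001", 8, 1, true)); // 3.2
```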
### Use Before Batches
```
estimate_veo_cost {
"model": "veo-3.1-fast-generate-001",
"durationSeconds": 8,
"sampleCount": 100
}
Returns: $80.00 estimate
→ Decide if budget allows
→ Adjust sampleCount or duration
```
---
## 🎯 Tool Combination Patterns
### Pattern 1: Simple Generation
```
1. start_video_generation {prompt: "..."}
2. get_video_job (poll until done)
3. Download video
```
### Pattern 2: Style-Guided Generation
```
1. upload_image {source: "file_path", filePath: "style.jpg"}
→ Returns files/style123
2. start_video_generation {
prompt: "...",
referenceImages: [{source: "file_uri", fileUri: "files/style123"}]
}
3. get_video_job (poll)
4. Reuse files/style123 for more videos (next 48h)
```
### Pattern 3: Frame Interpolation Workflow
```
1. upload_image {source: "file_path", filePath: "first.jpg"} → files/first
2. upload_image {source: "file_path", filePath: "last.jpg"} → files/last
3. start_video_generation {
prompt: "Smooth cinematic transition",
firstFrame: {source: "file_uri", fileUri: "files/first"},
lastFrame: {source: "file_uri", fileUri: "files/last"}
}
4. get_video_job (poll)
```
### Pattern 4: Video Series Creation
```
1. estimate_veo_cost {model: "veo-3.1-fast-generate-001", durationSeconds: 8, sampleCount: 5}
→ Check budget
2. upload_image {source: "file_path", filePath: "brand-style.jpg"} → files/brand
3. start_batch_video_generation {
jobs: [
{key: "intro", request: {prompt: "Intro scene..."}},
{key: "main", request: {prompt: "Main content..."}},
{key: "outro", request: {prompt: "Outro..."}}
],
concurrency: 3
}
4. Poll each operation
5. Stitch videos together
```
### Pattern 5: Iterative Refinement
```
1. Generate test at 720p + fast:
start_video_generation {
prompt: "...",
resolution: "720p",
model: "veo-3.1-fast-generate-001",
seed: 42 // Set a seed so the test can be repeated
}
2. Review result
3. Generate final at 1080p + quality:
start_video_generation {
prompt: "...", // Same or refined
resolution: "1080p",
model: "veo-3.1-generate-001",
seed: 42 // Reuse the test seed if you liked the result
}
```
---
## 🎨 Reference Image Modes
### Single Reference (Style Transfer)
```json
{
"prompt": "Product rotating on a pedestal",
"referenceImages": [{
"source": "url",
"url": "https://example.com/product-style.jpg"
}]
}
```
### Multiple References (Style Combination)
```json
{
"prompt": "Futuristic car commercial",
"referenceImages": [
{"source": "file_uri", "fileUri": "files/lighting-ref"},
{"source": "file_uri", "fileUri": "files/color-ref"},
{"source": "file_uri", "fileUri": "files/composition-ref"}
]
}
```
Veo blends all 3 styles!
---
## ⚙️ Parameter Deep-Dive
### `durationSeconds`
| Value | Use Case | Cost Multiplier |
|-------|----------|-----------------|
| 4 | Quick clips, loops | 0.5× |
| 6 | Standard clips | 0.75× |
| 8 | Full scenes | 1.0× |
### `resolution`
| Value | Dimensions | Use Case |
|-------|------------|----------|
| 720p | 1280×720 | Testing, social media |
| 1080p | 1920×1080 | Final delivery, HD quality |
### `model`
| Model | Speed | Quality | Cost/sec (no audio) |
|-------|-------|---------|---------------------|
| veo-3.1-fast-generate-001 | Fast | Good | $0.10 |
| veo-3.1-generate-001 | Slower | Best | $0.20 |
### `generateAudio`
- `false` (default): Video only
- `true`: Synchronized audio/SFX/music
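**Cost Impact:** Doubles the per-second cost of the quality model ($0.20 to $0.40) and adds 50% to the fast model ($0.10 to $0.15); see the pricing table under `estimate_veo_cost`.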
### `seed`
- **Omit**: Different result each time
- **Set (e.g., 42)**: Reproducible output
- **Use Case:** Generate variations with different seeds
### `sampleCount`
Generate multiple videos in one request:
```json
{
"prompt": "Abstract art video",
"sampleCount": 4,
"seed": 100
}
```
Returns 4 different videos (seed increments automatically).
**Cost:** 4× the single video cost
---
## 🎬 Advanced Workflows
### Workflow 1: Product Video Campaign
```
1. Upload product images:
- upload_image → files/product_front
- upload_image → files/product_side
- upload_image → files/brand_style
2. Generate videos:
- Scene 1: {firstFrame: product_front, lastFrame: product_side}
- Scene 2: {prompt: "...", referenceImages: [brand_style]}
- Scene 3: {extend previous video}
3. Download all and edit together
```
### Workflow 2: A/B Testing
```
1. estimate_veo_cost for 10 variations
2. start_batch_video_generation {
jobs: Array(10).fill().map((_, i) => ({
key: `variant_${i}`,
request: {
prompt: "Product demo video",
seed: i,
resolution: "720p" // Test quality first
}
})),
concurrency: 3
}
3. Poll all operations
4. Pick best variation
5. Regenerate winner at 1080p with quality model
```
### Workflow 3: Long-Form Video
```
1. Generate segment 1 (8s) → files/seg1
2. Extend seg1 by 7s → files/seg2 (15s total)
3. Extend seg2 by 7s → files/seg3 (22s total)
4. Continue as needed
```
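The running totals in this workflow (8s, 15s, 22s) can be planned ahead for any target length. The sketch below assumes an 8-second initial clip and 7 seconds per `extend_video` call, as in the example; this document does not state whether other extension lengths or a maximum total length apply, and `planSegments` is a hypothetical helper.

```typescript
// Returns the running total after each step: the initial clip first, then
// one entry per extend_video call, until `targetSeconds` is reached.
function planSegments(
  targetSeconds: number,
  initialSeconds: number = 8,
  extensionSeconds: number = 7
): number[] {
  if (initialSeconds <= 0 || extensionSeconds <= 0) {
    throw new Error("durations must be positive");
  }
  const totals: number[] = [initialSeconds];
  while (totals[totals.length - 1] < targetSeconds) {
    totals.push(totals[totals.length - 1] + extensionSeconds);
  }
  return totals;
}

const plan = planSegments(20);
console.log(plan.join(", ")); // 8, 15, 22
console.log(plan.length - 1); // 2 (number of extend_video calls)
```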
---
## 💡 Token Efficiency in Practice
### Scenario: Generate 20 Videos with Same Style
**❌ Naive (inline base64):**
```
Each call includes 500KB base64 = ~50,000 tokens
20 calls × 50,000 = 1,000,000 tokens!
```
**✅ Token-Efficient (this MCP):**
```
Upload once: files/style123 (one-time upload)
20 calls with fileUri = 20 × 20 tokens = 400 tokens
```
**Savings: 99.96%**
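The savings figure follows directly from the two totals above. As a quick check, using the document's own estimates of roughly 50,000 tokens per inline image and 20 tokens per `fileUri` (`tokenSavingsPercent` is an illustrative helper):

```typescript
// Percentage of tokens saved by sending a fileUri instead of inline base64.
function tokenSavingsPercent(
  calls: number,
  tokensPerInlineImage: number,
  tokensPerFileUri: number
): number {
  const inlineTotal = calls * tokensPerInlineImage;
  const fileUriTotal = calls * tokensPerFileUri;
  return (1 - fileUriTotal / inlineTotal) * 100;
}

console.log(tokenSavingsPercent(20, 50000, 20).toFixed(2) + "%"); // 99.96%
```

The call count cancels out of the ratio, so the percentage is the same for 1 call or 1,000; what grows with the number of calls is the absolute number of tokens saved.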
---
## Security & Best Practices
### API Key Security
✅ Store in `.env` (gitignored)
✅ Never commit to git
✅ Regenerate if exposed
❌ Never hardcode
❌ Never expose to clients
### Cost Management
✅ Estimate before large batches
✅ Start with 720p for testing
✅ Use fast model for iterations
✅ Monitor Cloud Console billing
### Rate Limit Management
✅ Use batch tool (concurrency: 3)
✅ Add delays in custom scripts
✅ Monitor 429 errors
❌ Don't burst > 50 req/min
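For custom scripts, one common way to "add delays" after a 429 response is exponential backoff with jitter. This technique is not described in this document; the sketch below is a generic illustration, and `backoffDelayMs`, the 2-second base, and the 60-second cap are all assumed values to tune for your own quota.

```typescript
// Delay before retry number `attempt` (1-based): base, 2×base, 4×base, ...
// capped at `maxMs`. `random` is injectable so the jitter can be tested.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 2000,
  maxMs: number = 60000,
  random: () => number = Math.random
): number {
  const exponential = Math.min(maxMs, baseMs * Math.pow(2, attempt - 1));
  // "Equal jitter": half the delay is fixed, the other half is randomized
  // so that parallel scripts do not all retry at the same moment.
  return Math.floor(exponential / 2 + random() * (exponential / 2));
}

const midpoint = (): number => 0.5; // Fixed "random" value for a repeatable demo.
console.log(backoffDelayMs(1, 2000, 60000, midpoint)); // 1500
console.log(backoffDelayMs(2, 2000, 60000, midpoint)); // 3000
console.log(backoffDelayMs(10, 2000, 60000, midpoint)); // 45000
```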
### Quality Optimization
✅ Detailed prompts work better
✅ Reference images improve consistency
✅ Seeds enable reproducibility
✅ Test at 720p, finalize at 1080p
---
## Comparison: All 6 Tools
| Tool | Purpose | Async? | Token-Efficient? | Cost |
|------|---------|--------|------------------|------|
| upload_image | Pre-upload refs | No | ✅ Enables efficiency | Free |
| start_video_generation | Generate video | Yes | ✅ Auto-uploads refs | $0.40-$3.20 |
| get_video_job | Check status | No | N/A | Free |
| extend_video | Add seconds | Yes | ✅ Uses fileUri | $0.40-$3.20 |
| start_batch_video_generation | Bulk generate | Yes | ✅ Shared cache | Multiple× |
| estimate_veo_cost | Cost preview | No | N/A | Free |
---
## Summary
You have **6 powerful tools** for Veo 3.1 video generation:
1. **Upload** images once, reuse 48h (token-efficient!)
2. **Start** video generation (async, returns immediately)
3. **Poll** for completion (get video URLs)
4. **Extend** videos seamlessly
5. **Batch** generate with concurrency control
6. **Estimate** costs before generating
**All designed for maximum token efficiency and production reliability!**
---
**See also:**
- [README.md](README.md) - Full guide
- [QUICK-REFERENCE.md](QUICK-REFERENCE.md) - Cheat sheet
- [IMPLEMENTATION-SUMMARY.md](IMPLEMENTATION-SUMMARY.md) - Technical details