de en es ja ko ru zh

Genkit MCP

Official

by firebase

Overview Schema Related Servers Score Discussions

TypeScript

Hybrid

genkit
py
samples
media-models-demo

README.md•6.93 KiB

# Media Generation Models Demo This sample demonstrates all media generation capabilities in the Google GenAI plugin: | Model | Output | Pattern | Latency | |-------|--------|---------|---------| | **TTS** | Audio (WAV) | Standard | ~1-5 seconds | | **Gemini Image** | Image (PNG) | Standard | ~5-15 seconds | | **Lyria** | Audio (WAV) | Standard | ~5-30 seconds | | **Veo** | Video (MP4) | Background | 30s - 5 minutes | ## Quick Start ```bash # Run with simulated models (no API key needed) ./run.sh # Open DevUI at http://localhost:4000 ``` ## Testing with Real Models ### Google AI Models (TTS, Image, Veo) ```bash # Set your Google AI API key export GEMINI_API_KEY=your_api_key_here # Run the demo ./run.sh ``` Get an API key from [Google AI Studio](https://aistudio.google.com/apikey). ### Vertex AI Models (Lyria) Lyria is only available through Vertex AI: ```bash # Authenticate with Google Cloud gcloud auth application-default login # Set your project export GOOGLE_CLOUD_PROJECT=your_project_id # Run the demo ./run.sh ``` ## Available Flows ### 1. TTS Speech Generator (`tts_speech_generator`) Converts text to natural-sounding speech. ```python # In DevUI or code: result = await tts_speech_generator_flow( text="Hello! Welcome to Genkit.", voice="Kore" # Options: Zephyr, Puck, Charon, Kore, Fenrir, Leda, etc. ) ``` **Voices Available:** - Zephyr, Puck, Charon, Kore, Fenrir, Leda - Orus, Aoede, Callirrhoe, Autonoe, Enceladus, Iapetus ### 2. Gemini Image Generator (`gemini_image_generator`) Generates images from text descriptions. ```python result = await gemini_image_generator_flow( prompt="A serene Japanese garden with cherry blossoms", aspect_ratio="16:9" # Options: 1:1, 16:9, 9:16, 4:3, 3:4 ) ``` ### 3. Lyria Audio Generator (`lyria_audio_generator`) Generates music and audio from text descriptions. Requires Vertex AI. ```python result = await lyria_audio_generator_flow( prompt="A peaceful piano melody with gentle rain", negative_prompt="loud, aggressive" ) ``` ### 4. Veo Video Generator (`veo_video_generator`) Generates videos using the background model pattern (long-running operation). ```python result = await veo_video_generator_flow( prompt="A cat playing piano in a jazz club", aspect_ratio="16:9", duration_seconds=5 ) ``` **How Veo Works:** ``` ┌─────────────────────────────────────────────────────────────┐ │ Veo Background Model Flow │ ├─────────────────────────────────────────────────────────────┤ │ │ │ 1. start() 2. poll every 3-5s 3. complete │ │ ┌────────┐ ┌────────────────┐ ┌────────────┐ │ │ │ Prompt │───────►│ Operation │─────►│ Video URL │ │ │ └────────┘ │ (job ID) │ └────────────┘ │ │ │ 10%...50%... │ │ │ └────────────────┘ │ │ │ │ Total time: 30 seconds to 5 minutes │ └─────────────────────────────────────────────────────────────┘ ``` ### 5. Media Models Overview (`media_models_overview`) Returns information about all available models and their status. ```python result = await media_models_overview_flow() # Returns which models are available based on environment ``` ## Testing Checklist ### Without API Keys (Simulated) - [ ] Run `./run.sh` without any API keys set - [ ] Open DevUI at http://localhost:4000 - [ ] Test `tts_speech_generator` - should return fake audio in ~1s - [ ] Test `gemini_image_generator` - should return fake image in ~2s - [ ] Test `lyria_audio_generator` - should return fake audio in ~3s - [ ] Test `veo_video_generator` - should show progress updates over ~10s - [ ] Test `media_models_overview` - should show all models as simulated ### With GEMINI_API_KEY - [ ] Set `GEMINI_API_KEY` environment variable - [ ] Run `./run.sh` - [ ] Test `tts_speech_generator` with different voices - [ ] Test `gemini_image_generator` with different prompts - [ ] Test `veo_video_generator` (may take 30s-5min) - [ ] Verify returned URLs/data are real media content ### With GOOGLE_CLOUD_PROJECT (Vertex AI) - [ ] Authenticate with `gcloud auth application-default login` - [ ] Set `GOOGLE_CLOUD_PROJECT` environment variable - [ ] Run `./run.sh` - [ ] Test `lyria_audio_generator` - should generate real audio ## Model Configuration ### TTS Config ```python config = { 'speech_config': { 'voice_config': { 'prebuilt_voice_config': {'voice_name': 'Kore'} }, # Multi-speaker (optional): 'multi_speaker_voice_config': { 'speaker_voice_configs': [ {'speaker': 'Alice', 'voice_config': {...}}, {'speaker': 'Bob', 'voice_config': {...}}, ] } } } ``` ### Image Config ```python config = { 'image_config': { 'aspect_ratio': '16:9', # or 1:1, 9:16, 4:3, 3:4, etc. 'image_size': '2K', # 1K, 2K, or 4K } } ``` ### Veo Config ```python config = { 'aspect_ratio': '16:9', # or 9:16 'duration_seconds': 5, # 5-8 seconds 'enhance_prompt': True, # AI-enhanced prompts 'negative_prompt': 'blurry', # What to avoid 'person_generation': 'allow', # or 'block' } ``` ### Lyria Config ```python config = { 'negative_prompt': 'noise, distortion', 'seed': 12345, # For reproducibility 'sample_count': 1, # Number of samples 'location': 'global', # Required for Lyria } ``` ## Troubleshooting ### "Model not found" error Ensure the API key is set correctly: ```bash echo $GEMINI_API_KEY # Should show your key ``` ### Lyria not working Lyria requires Vertex AI. Ensure: 1. `GOOGLE_CLOUD_PROJECT` is set 2. You've run `gcloud auth application-default login` 3. The project has Vertex AI API enabled ### Veo timeout Video generation can take up to 5 minutes. The demo has a 5-minute timeout. For longer videos or complex prompts, consider increasing the timeout. ## See Also - [Veo Documentation](https://ai.google.dev/gemini-api/docs/video) - [TTS Documentation](https://ai.google.dev/gemini-api/docs/speech) - [Gemini Image](https://ai.google.dev/gemini-api/docs/image-generation) - [Lyria (Vertex AI)](https://cloud.google.com/vertex-ai/docs/generative-ai/audio)

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/firebase/genkit'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

README.md•6.93 KiB