gemini-3.1-flash-lite-preview vs gemini-2.0-flash-lite-preview-02-05
Pricing, Performance & Features Comparison
Gemini 2.0 Flash-Lite is Google's most cost-efficient Gemini 2.0 model, optimized for cost efficiency and low latency. It is a multimodal model supporting inputs including text, code, images, audio, and video, with text-only output. It features a maximum input capacity of 1,048,576 tokens and is cost-optimized for large-scale text output use cases.
Input$0.25
Output$1.5
Latency (p50)2.2s
Output Limit8K
Function Calling
JSON Mode
InputText, Image, Audio, Video
OutputText
in$0.25out$1.5cache$0.025–
Latency (24h)
Success Rate (24h)
Gemini 2.0 Flash-Lite delivers better quality than Gemini 1.5 Flash at the same speed and cost. It has a 1M token context window and multimodal input.
Input$0.075
Output$0.3
Latency (p50)–
Output Limit8K
Function Calling
-
JSON Mode
InputText, Image
OutputText
google-vertexCheapest
in$0.075out$0.3––
google-ai-studioCheapest
in$0.075out$0.3––