gemini-3.1-flash-lite-preview vs gpt-5.4-2026-03-05

Pricing, Performance & Features Comparison

gemini-3.1-flash-lite-preview

Authorgoogle

Context Length1M

Reasoning

Providers1

ReleasedFeb 2025

Knowledge CutoffJun 2024

License–

Gemini 2.0 Flash-Lite is Google's most cost-efficient Gemini 2.0 model, optimized for cost efficiency and low latency. It is a multimodal model supporting inputs including text, code, images, audio, and video, with text-only output. It features a maximum input capacity of 1,048,576 tokens and is cost-optimized for large-scale text output use cases.

Input$0.25

Output$1.5

Latency (p50)1.8s

Output Limit8K

Function Calling

JSON Mode

InputText, Image, Audio, Video

OutputText

google-vertex

in$0.25out$1.5cache$0.025–

Latency (24h)

Success Rate (24h)

gpt-5.4-2026-03-05

Authoropenai

Context Length1M

Reasoning

Providers1

ReleasedMar 2026

Knowledge CutoffAug 2025

License–

GPT-5.4 is OpenAI's most capable and efficient frontier model for professional work, combining advanced reasoning, coding, and native computer-use capabilities into a single model. It supports up to 1 million tokens of context, enabling agents to plan, execute, and verify tasks across long horizons. It is also the most factual model OpenAI has released, with individual claims 33% less likely to be false compared to GPT-5.2.

Input$2.5

Output$15

Latency (p50)1.7s

Output Limit128K

Function Calling

JSON Mode

InputText, Image

OutputText, Image

openai

in$2.5out$15cache$0.25–