gemini-3.1-flash-lite-preview vs o3-mini-2025-01-31

Pricing, Performance & Features Comparison

gemini-3.1-flash-lite-preview

Authorgoogle

Context Length1M

Reasoning

Providers1

ReleasedFeb 2025

Knowledge CutoffJun 2024

License-

Gemini 2.0 Flash-Lite is Google's most cost-efficient Gemini 2.0 model, optimized for cost efficiency and low latency. It is a multimodal model supporting inputs including text, code, images, audio, and video, with text-only output. It features a maximum input capacity of 1,048,576 tokens and is cost-optimized for large-scale text output use cases.

Input$0.25

Output$1.5

Latency (p50)1.2s

Output Limit8K

Function Calling

JSON Mode

InputText, Image, Audio, Video

OutputText

google-vertex

in$0.25out$1.5cache$0.025-

Latency (24h)

Success Rate (24h)

o3-mini-2025-01-31

Authoropenai

Context Length200K

Reasoning

Providers1

ReleasedJan 2025

Knowledge Cutoff-

License-

o3-mini is OpenAI's most recent small reasoning model, providing high intelligence at the same cost and latency targets of o1-mini. o3-mini also supports key developer features, like Structured Outputs, function calling, Batch API, and more. Like other models in the o-series, it is designed to excel at science, math, and coding tasks.

Input$1

Output$4.4

Latency (p50)2.1s

Output Limit100K

Function Calling

JSON Mode

InputText

OutputText

openai

in$1out$4.4cache$0.55-