gemini-2.5-flash vs kimi-latest-128k

Pricing, Performance & Features Comparison

gemini-2.5-flash

Author: google

Context Length: 1M

Reasoning

Providers: 1

Released: Jun 2025

Knowledge Cutoff: Jan 2025

License: –

Google describes Gemini 2.5 Flash as its best model in terms of price and performance, with well-rounded capabilities.

Input: $0.15

Output: $0.6

Latency (p50): 1.8s

Output Limit: 66K

Function Calling

JSON Mode

Input: –

Output: –

google-vertex

in: $0.15, out: $0.6, –, –

Latency (24h)

Success Rate (24h)

kimi-latest-128k

Author: moonshot

Context Length: 128K

Reasoning

Providers: 1

Released: Jul 2025

Knowledge Cutoff: –

License: –

Kimi-latest-128k refers to the Kimi K2 model, a state-of-the-art Mixture-of-Experts (MoE) language model with 32 billion activated and 1 trillion total parameters. It features a 128K context length and is optimized for agentic use: tool calling, reasoning, and autonomous problem-solving.

Input: $2

Output: $5

Latency (p50): 2.5s

Output Limit: 128K

Function Calling

JSON Mode

Input: Text, Image, Audio, Video

Output: Text, Audio

moonshot

in: $2, out: $5, cache: $0.15, –
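
To make the price gap concrete, here is a minimal sketch that estimates the cost of a single request on each model. The listing does not state a pricing unit, so the code assumes the figures are USD per 1 million tokens; the function name `request_cost` and the example token counts are hypothetical and chosen only for illustration.

```python
# Input/output prices copied from the listings above.
# ASSUMPTION: prices are USD per 1,000,000 tokens (the page gives no unit).
PRICES_USD_PER_MTOK = {
    "gemini-2.5-flash": {"input": 0.15, "output": 0.60},
    "kimi-latest-128k": {"input": 2.00, "output": 5.00},
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper)."""
    price = PRICES_USD_PER_MTOK[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: 10,000 prompt tokens and 2,000 completion tokens.
    for model in PRICES_USD_PER_MTOK:
        cost = request_cost(model, 10_000, 2_000)
        print(f"{model}: ${cost:.4f}")
```

Under that unit assumption, the example workload comes to roughly $0.0027 on gemini-2.5-flash and $0.03 on kimi-latest-128k. The sketch ignores the cache price listed for the moonshot provider, since the page does not say how cached tokens are billed.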