Skip to main content
Glama

gemini-3-flash vs glm-4.7-flash

Pricing, Performance & Features Comparison

Price unit:
Authorgoogle
Context Length1M
Reasoning
Providers1
ReleasedDec 2025
Knowledge CutoffJan 2025
License-

Gemini 3 Flash combines Gemini 3 Pro's reasoning capabilities with the Flash line's levels on latency, efficiency, and cost. It not only enables everyday tasks with improved reasoning, but is designed to tackle the most complex agentic workflows.

Input$0.5
Output$3
Latency (p50)4.5s
Output Limit66K
Function Calling
JSON Mode
-
Input-
Output-
in$0.5out$3--
Latency (24h)
Success Rate (24h)
Authorzai
Context Length200K
Reasoning
-
Providers1
ReleasedJan 2026
Knowledge Cutoff-
License-

GLM-4.7-Flash is a 30B Mixture-of-Experts (MoE) reasoning model with approximately 3.6B active parameters, designed for local deployment with best-in-class performance for coding, agentic workflows, and chat. It supports a 200K context window and achieves open-source state-of-the-art scores on benchmarks like SWE-bench Verified and τ²-Bench, excelling particularly in frontend and backend development capabilities.

Input$0.07
Output$0.4
Latency (p50)8.2s
Output Limit131K
Function Calling
JSON Mode
-
InputText
OutputText
in$0.07out$0.4-write$0.01
Latency (24h)
Success Rate (24h)