Glama

gpt-oss-120b vs qwen-flash-2025-07-28

Pricing, Performance & Features Comparison

Author: openai
Context Length: 131K
Reasoning: -
Providers: 0
Released: Aug 2025
Knowledge Cutoff: -
License: -

OpenAI's most powerful open-weight model, which fits on a single H100 GPU.

Input: -
Output: -
Latency (p50): -
Output Limit: 131K
Function Calling
JSON Mode: -
Input: Text
Output: Text
Author: alibaba
Context Length: 1M
Reasoning: -
Providers: 1
Released: Jul 2025
Knowledge Cutoff: -
License: -

Qwen-Flash is the fastest and most cost-effective model in the Qwen series and is suitable for simple jobs.

Input: $0.25
Output: $2
Latency (p50): 2.1s
Output Limit: 33K
Function Calling: -
JSON Mode: -
Input: -
Output: -
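
The listed qwen-flash-2025-07-28 prices can be turned into a rough per-request estimate. The sketch below is only illustrative: the price unit is not stated on this page, so it assumes USD per 1M tokens, and `estimate_cost` is a hypothetical helper name, not part of any provider API.

```python
from decimal import Decimal

# Prices listed above for qwen-flash-2025-07-28.
# ASSUMPTION: the unit is USD per 1M tokens; the page does not state it.
INPUT_PRICE = Decimal("0.25")
OUTPUT_PRICE = Decimal("2")
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    total = Decimal(input_tokens) * INPUT_PRICE + Decimal(output_tokens) * OUTPUT_PRICE
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token completion.
    print(estimate_cost(100_000, 2_000))
```

No equivalent estimate is possible for gpt-oss-120b from this page, since its input and output prices are listed as "-".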