gpt-oss-120b vs qwen-flash-2025-07-28

Pricing, Performance & Features Comparison

Author: openai

Context Length: 131K

Reasoning

Providers: 0

Released: Aug 2025

Knowledge Cutoff: –

License: –

OpenAI's most powerful open-weight model, which fits on a single H100 GPU.

Input: –

Output: –

Latency (p50): –

Output Limit: 131K

Function Calling

JSON Mode

Input: Text

Output: Text


Author: alibaba

Context Length: 1M

Reasoning

Providers: 1

Released: Jul 2025

Knowledge Cutoff: –

License: –

Qwen-Flash is the fastest and most cost-effective model in the Qwen series and is suited to simple tasks.

Input: $0.25

Output: $2

Latency (p50): 2.8s

Output Limit: 33K

Function Calling

JSON Mode

Input: –

Output: –
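
To make the listed figures easier to compare, here is a minimal sketch that estimates the cost of a request and checks it against the listed limits. It rests on several assumptions that the listing itself does not confirm: prices are taken to be USD per 1,000,000 tokens, "1M" and "33K" are read as the round numbers 1,000,000 and 33,000, and the context length is assumed to cover input and output tokens together. The names `ModelListing`, `estimate_cost`, and `fits_limits` are hypothetical helpers, not part of any provider API.

```python
from dataclasses import dataclass

# Assumption: listed prices are USD per 1,000,000 tokens.
# The listing does not state the unit.
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class ModelListing:
    """Hypothetical container for the figures shown in the comparison."""
    name: str
    input_price: float    # USD per TOKENS_PER_PRICE_UNIT input tokens
    output_price: float   # USD per TOKENS_PER_PRICE_UNIT output tokens
    context_length: int   # tokens (rounded reading of the listed value)
    output_limit: int     # tokens (rounded reading of the listed value)


def estimate_cost(model: ModelListing, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated request cost in USD under the unit assumption above."""
    total = input_tokens * model.input_price + output_tokens * model.output_price
    return total / TOKENS_PER_PRICE_UNIT


def fits_limits(model: ModelListing, input_tokens: int, output_tokens: int) -> bool:
    """Check a request against the listed limits.

    Assumption: the context length covers input and output tokens combined.
    """
    within_output = output_tokens <= model.output_limit
    within_context = input_tokens + output_tokens <= model.context_length
    return within_output and within_context


# Figures from the listing; "1M" and "33K" are read as round numbers.
qwen_flash = ModelListing(
    name="qwen-flash-2025-07-28",
    input_price=0.25,
    output_price=2.0,
    context_length=1_000_000,
    output_limit=33_000,
)

if __name__ == "__main__":
    cost = estimate_cost(qwen_flash, input_tokens=100_000, output_tokens=10_000)
    print(f"{qwen_flash.name}: estimated cost ${cost:.4f}")
    print("fits limits:", fits_limits(qwen_flash, 100_000, 10_000))
```

No pricing is listed for gpt-oss-120b, so the same calculation cannot be run for it from this comparison alone.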
