gpt-oss-120b vs qwen-flash-2025-07-28

Pricing, Performance & Features Comparison

Author: openai

Context Length: 131K

Reasoning

Providers: 0

Released: Aug 2025

Knowledge Cutoff: -

License: -

OpenAI's most powerful open-weight model, which fits on a single H100 GPU.

Input: -

Output: -

Latency (p50): -

Output Limit: 131K

Function Calling

JSON Mode

Input: Text

Output: Text

Author: alibaba

Context Length: 1M

Reasoning

Providers: 1

Released: Jul 2025

Knowledge Cutoff: -

License: -

Qwen-Flash is the fastest and most cost-effective model in the Qwen series and is suitable for simple jobs.

Input: $0.25

Output: $2

Latency (p50): 2s

Output Limit: 33K

Function Calling

JSON Mode

Input: -

Output: -
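
To make the qwen-flash-2025-07-28 prices easier to apply, here is a minimal sketch of a per-request cost estimate. The listing does not state the billing unit, so the code assumes the prices are in USD per 1 million tokens. That unit, the constant names, and the function name `estimate_cost_usd` are all illustrative assumptions, not part of any provider's API.

```python
# Prices copied from the listing; the per-1M-token unit is an ASSUMPTION,
# since the listing does not state what the dollar amounts are measured against.
QWEN_FLASH_INPUT_USD = 0.25
QWEN_FLASH_OUTPUT_USD = 2.00
TOKENS_PER_PRICE_UNIT = 1_000_000  # assumed billing unit


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request in USD under the assumed unit."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / TOKENS_PER_PRICE_UNIT * QWEN_FLASH_INPUT_USD
    output_cost = output_tokens / TOKENS_PER_PRICE_UNIT * QWEN_FLASH_OUTPUT_USD
    return input_cost + output_cost


if __name__ == "__main__":
    # Example: a 100,000-token prompt with a 2,000-token reply.
    print(f"${estimate_cost_usd(100_000, 2_000):.4f}")
```

Under this assumption, output tokens cost eight times as much as input tokens, so long replies dominate the bill even when prompts are large. No equivalent estimate is possible for gpt-oss-120b from this listing, because its prices are not given.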
