deepseek-v4-flash vs glm-5.1
Pricing, Performance & Features Comparison
deepseek-v4-flash
Mixture-of-Experts model with 284B total parameters and 13B activated per token. Features a hybrid attention architecture for efficient 1M-token context processing.

Context length: 1M
Providers: 1
Released: Apr 2026
Knowledge cutoff: -
License: MIT
Input modality: Text
Output modality: Text
Output limit: 384K
Latency (p50): 3.3s
Features: Reasoning, Function Calling, JSON Mode
Pricing: input $0.14 · output $0.28 · cache read $0.028 · cache write $0.14
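Cache pricing matters when the same long prompt prefix is reused across requests. A minimal sketch of the effect for deepseek-v4-flash, assuming the listed prices are USD per 1M tokens (the page does not state the unit) and that "cache" is the rate for reading cached input tokens while "write" is the rate to populate the cache:

```python
# Hypothetical cost model; assumes USD per 1M tokens, and that "cache"
# means cached-input reads and "write" means cache writes.
INPUT, CACHE_READ, CACHE_WRITE = 0.14, 0.028, 0.14

def input_cost(fresh_tokens: int, cached_tokens: int) -> float:
    """Input-side cost of one request, splitting fresh vs. cached tokens."""
    return (fresh_tokens * INPUT + cached_tokens * CACHE_READ) / 1_000_000

# A 50K-token prompt reused 10 times: pay the input rate plus the cache
# write once, then the cached rate on the remaining 9 calls.
first_call = 50_000 * (INPUT + CACHE_WRITE) / 1_000_000
reused_calls = 9 * input_cost(0, 50_000)
no_cache = 10 * input_cost(50_000, 0)
print(first_call + reused_calls, no_cache)
```

Under these assumptions the cached runs cost roughly a third of paying the full input rate every time.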
glm-5.1
Post-training upgrade to GLM-5. Mixture-of-Experts model with 744B total parameters and 40B activated per token. Trained on Huawei Ascend 910B chips with enhanced RL for agentic capabilities.

Input modality: Text
Output modality: Text
Output limit: 131K
Latency (p50): 6.7s
Features: Function Calling, JSON Mode
Pricing: input $1.40 · output $4.40 · cache read $0.26 · cache write $1.40
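For a rough sense of the price gap between the two models, a minimal sketch of per-request cost, assuming the listed prices are USD per 1M tokens (the page does not state the unit) and ignoring cache discounts:

```python
# Hypothetical per-request cost comparison; assumes USD per 1M tokens.
PRICES = {
    "deepseek-v4-flash": {"input": 0.14, "output": 0.28},
    "glm-5.1": {"input": 1.40, "output": 4.40},
}

def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of a single request for the given model."""
    p = PRICES[model]
    return (input_tokens * p["input"] + output_tokens * p["output"]) / 1_000_000

# Example: a request with 100K input tokens and 10K output tokens.
for model in PRICES:
    print(model, round(request_cost(model, 100_000, 10_000), 4))
```

At these list prices glm-5.1 works out to roughly an order of magnitude more expensive per request, so the comparison largely comes down to whether its larger activated-parameter budget pays off for the workload.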