Glama

deepseek-v4-flash vs kimi-k2.6

Pricing, Performance & Features Comparison

deepseek-v4-flash

Author: deepseek
Context Length: 1M
Reasoning
Providers: 1
Released: Apr 2026
Knowledge Cutoff: -
License: MIT License

Mixture-of-Experts model with 284B total parameters and 13B activated per token. Features hybrid attention architecture for efficient 1M context processing.

Input Price: $0.14
Output Price: $0.28
Latency (p50): 3.3s
Output Limit: 384K
Function Calling
JSON Mode
-
Input Modalities: Text
Output Modalities: Text
Pricing: in $0.14, out $0.28, cache $0.028, write $0.14
kimi-k2.6

Author: moonshot
Context Length: 262K
Reasoning
Providers: 1
Released: Apr 2026
Knowledge Cutoff: Apr 2025
License: MIT License

Mixture-of-Experts model with 1T total parameters and 32B activated per token. Features MLA attention, MoonViT vision encoder, and agent swarm orchestration.
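The two descriptions quote total and per-token activated parameter counts, and the ratio between them is a rough indicator of how sparse each Mixture-of-Experts model is. The sketch below computes that ratio from the figures quoted on this page. It assumes "1T" means exactly 1e12 parameters (the page gives a rounded figure), and the names `MODELS` and `active_fraction` are illustrative, not part of any API.

```python
# Parameter counts as quoted in the two model descriptions on this page.
# Assumption: "1T" is treated as exactly 1e12; the page gives a rounded figure.
MODELS = {
    "deepseek-v4-flash": {"total": 284e9, "active": 13e9},
    "kimi-k2.6": {"total": 1e12, "active": 32e9},
}


def active_fraction(total: float, active: float) -> float:
    """Return the share of parameters activated per token."""
    if total <= 0:
        raise ValueError("total parameter count must be positive")
    return active / total


if __name__ == "__main__":
    for name, params in MODELS.items():
        share = active_fraction(params["total"], params["active"])
        print(f"{name}: {share:.1%} of parameters active per token")
```

Under these figures, both models activate only a few percent of their parameters per token, with kimi-k2.6 being the sparser of the two.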

Input Price: $0.95
Output Price: $4
Latency (p50): 6s
Output Limit: 66K
Function Calling
JSON Mode
Input Modalities: Text, Image, Video
Output Modalities: Text
Pricing: in $0.95, out $4, cache $0.16, write $0.95
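To make the price gap between deepseek-v4-flash and kimi-k2.6 concrete, the listed rates can be applied to a sample workload. This is a minimal sketch under two assumptions that the page does not state: that prices are in USD per 1M tokens, and that the "cache" figure is the rate for input tokens served from a prompt cache. The names `Pricing`, `PRICES`, and `estimate_cost` are hypothetical helpers, not part of any provider SDK.

```python
from dataclasses import dataclass

# Assumption: the listed prices are USD per 1M tokens (the page gives no unit).
TOKENS_PER_PRICE_UNIT = 1_000_000


@dataclass(frozen=True)
class Pricing:
    input: float       # price for uncached input tokens
    output: float      # price for generated tokens
    cache_read: float  # assumption: the "cache" figure is the cached-input rate


# Figures copied from the two pricing blocks on this page.
PRICES = {
    "deepseek-v4-flash": Pricing(input=0.14, output=0.28, cache_read=0.028),
    "kimi-k2.6": Pricing(input=0.95, output=4.00, cache_read=0.16),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request under the assumptions above."""
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    price = PRICES[model]
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * price.input
             + cached_tokens * price.cache_read
             + output_tokens * price.output)
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Example workload: 200K input tokens (half cached) and 20K output tokens.
    for name in PRICES:
        cost = estimate_cost(name, 200_000, 20_000, cached_tokens=100_000)
        print(f"{name}: ${cost:.4f}")
```

The "write" figure is left out of the calculation because the page does not say how it is billed.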