llama-3.3-70b-instruct vs qwq-32b-preview

Pricing, Performance & Features Comparison

llama-3.3-70b-instruct

Author: meta

Context Length: 128K

Reasoning

Providers: 1

Released: Dec 2024

Knowledge Cutoff: Dec 2023

License: -

Llama 3.3 is a text-only 70B instruction-tuned model that outperforms Llama 3.1 70B, and also Llama 3.2 90B when the latter is used for text-only applications. For some applications, Llama 3.3 70B approaches the performance of Llama 3.1 405B.

Input Price: $0.45

Output Price: $0.45

Latency (p50): -

Output Limit: 4K

Function Calling

JSON Mode

Input Modality: Text

Output Modality: Text

Provider: avian (in $0.45, out $0.45)

qwq-32b-preview

Author: alibaba

Context Length: 33K

Reasoning

Providers: 1

Released: Nov 2024

Knowledge Cutoff: Oct 2023

License: Apache License 2.0

QwQ-32B-Preview is an experimental research model focused on AI reasoning, with strong capabilities in math and coding. It has 32.5 billion parameters and a 32,768-token context window, and uses a transformer architecture with RoPE (rotary position embeddings) and advanced attention mechanisms. Despite these strengths, it has known limitations in language mixing and reasoning that remain areas of active research.

Input Price: $0.17

Output Price: $0.70

Latency (p50): 9.5s

Output Limit: 512

Function Calling

JSON Mode

Input Modality: Text, Image

Output Modality: Text

Provider: deepinfra (in $0.17, out $0.70)
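
The input and output prices listed for the two models can be compared on a concrete workload. The sketch below assumes the prices are US dollars per one million tokens, which the listing itself does not state. The function name `request_cost` and the token counts are hypothetical and chosen only for illustration.

```python
# Listed prices, ASSUMED to be USD per 1M tokens (the page does not state the unit).
PRICES = {
    "llama-3.3-70b-instruct": {"input": 0.45, "output": 0.45},
    "qwq-32b-preview": {"input": 0.17, "output": 0.70},
}

TOKENS_PER_UNIT = 1_000_000  # assumption: prices are quoted per million tokens


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD of a workload on the given model."""
    price = PRICES[model]
    return (
        input_tokens * price["input"] + output_tokens * price["output"]
    ) / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Hypothetical workload: 1M input tokens and 200K output tokens.
    for name in PRICES:
        print(f"{name}: ${request_cost(name, 1_000_000, 200_000):.2f}")
```

Under the same assumption, qwq-32b-preview is the cheaper option only while output tokens stay below about 1.12 times the input tokens (0.28 / 0.25, from the price differences). Output-heavy workloads favor llama-3.3-70b-instruct.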

[Charts: Latency (24h) and Success Rate (24h), llama-3.3-70b-instruct vs qwq-32b-preview]