Meta launches Llama 3.3 70B, achieving a level of intelligence previously reserved for Llama 3.1 405B and leapfrogging the November release of GPT-4o
We have completed our first round of independent evals on Llama 3.3 70B and are seeing a jump in Artificial Analysis Quality Index from 68 to 74, now matching Llama 3.1 405B’s score.
Congratulations to @AIatMeta on an excellent update!
Details:
➤ Biggest increases in MATH-500 (64% to 76%), GPQA Diamond (43% to 49%) and HumanEval (80% to 85%)
➤ Smaller increase in MMLU (84% to 86%)
➤ Llama 3.3 70B now leads Llama 3.1 405B in Math-500, and scores nearly equal to 405B in MMLU, GPQA Diamond and HumanEval
➤ With no change to model size, we anticipate that most providers serving Llama 3.1 70B APIs will imminently launch Llama 3.3 70B endpoints at equivalent price and speed to the 3.1 70B endpoints
See below for a full breakdown.
Dec 6, 2024 · 8:25 PM UTC
12
51
382

