Nvidia has broken through prior barriers with its B200 GPUs.
We have conducted independent benchmarking and are seeing >1,000 output tokens/s on Llama 4 Maverick, >10X the speed of some other providers.
This represents the fastest Maverick endpoint that we have benchmarked yet.
Exciting times ahead for developers once B200-based APIs become publicly available.
May 23, 2025 · 12:04 AM UTC

