Whisper no longer wears the open weights transcription accuracy crown, with new entrants achieving better Artificial Analysis Word Error Rate scores

Once considered the default choice for open weights transcription, OpenAI’s Whisper has now been surpassed by newer open weights models on the Artificial Analysis Word Error Rate (AA-WER) benchmark, which measures transcription accuracy.

AA-WER comprises three challenging datasets aligned with real-world use cases: AMI-SDM (multi-speaker meetings), Earnings-22 (earnings calls), and VoxPopuli (parliamentary proceedings).

Top open weights performers: @NVIDIA’s Canary Qwen 2.5B and Parakeet TDT 0.6B V2, followed by @Mistral’s Voxtral Small and Mini, and @IBM’s Granite Speech 3.3 8B.

Open weights Speech to Text models offer deployment flexibility, cost benefits, and the potential for customization/fine-tuning. They also enable use cases such as privacy-sensitive workloads that need to run locally.

Nov 5, 2025 · 3:58 PM UTC
