Voxtral comprehensively outperforms Whisper large-v3, the current leading open-source speech transcription model. It beats GPT-4o mini Transcribe and Gemini 2.5 Flash across all tasks and achieves state-of-the-art results on English short-form and Mozilla Common Voice, surpassing ElevenLabs Scribe and demonstrating strong multilingual capabilities.
Jul 15, 2025 · 2:35 PM UTC

