Voxtral comprehensively outperforms Whisper large-v3, the current leading open-source speech transcription model. It beats GPT-4o mini Transcribe and Gemini 2.5 Flash across all tasks, and achieves state-of-the-art results on English short-form and Mozilla Common Voice, surpassing ElevenLabs Scribe and demonstrating its strong multilingual capabilities.

Jul 15, 2025 · 2:35 PM UTC
