
Whisper-v3 Hallucinations on Real World Data

What's this blog post about?

Whisper-v3, the latest version of OpenAI's automatic speech recognition (ASR) model, hallucinates more often than its predecessor, Whisper-v2, when tested on real-world data. The median Word Error Rate (WER) is 53.4 for Whisper-v3, compared with 12.7 for Whisper-v2. Users have also reported hallucinations in languages such as Japanese and Korean. The author tested the model on a variety of audio files and found that it handles edge cases well but struggles with real-world data, which leads to the high error rates.
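For readers unfamiliar with the metric, here is a minimal sketch of how a median WER could be computed. It is not the author's evaluation code: the function names, the whitespace tokenization, and the percent scale are all assumptions made for illustration. WER is taken here as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words.

```python
from statistics import median


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage (assumed scale), using word-level edit distance."""
    ref = reference.split()  # naive whitespace tokenization (an assumption)
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return 100.0 * prev[-1] / len(ref)


def median_wer(pairs: list[tuple[str, str]]) -> float:
    """Median WER over (reference, hypothesis) transcript pairs."""
    return median(word_error_rate(ref, hyp) for ref, hyp in pairs)
```

Because every hallucinated word counts as an insertion, a transcript with invented text can score above 100 under this definition. For example, the reference "hello world" against the hypothesis "hello world thank you for watching" gives 4 insertions over 2 reference words, or 200.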

Company
Deepgram

Date published
Nov. 14, 2023

Author(s)
Jose Nicholas Francisco

Word count
1762

Language
English

Hacker News points
None found.