How Well Does AI Transcribe Song Lyrics?
Recent advances in deep learning have significantly improved AI's ability to recognize speech. Human transcription remains the gold standard, but Automatic Speech Recognition (ASR) models can now transcribe song lyrics with a surprising level of accuracy. In tests on 15 songs from three genres, ASR achieved word error rates (WERs) ranging from 0.473 to 0.878, where a lower WER means a more accurate transcript. Female voices were generally recognized better than male voices across all genres. The most accurately transcribed song was "Hotline Bling" by Drake, with a WER of 0.473, while the least accurately transcribed was Michael Jackson's "Thriller," with a WER of 0.878. Overall, ASR models were able to transcribe about 20-30% of song lyrics accurately.
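For readers unfamiliar with the metric, WER is conventionally the number of word-level substitutions, deletions, and insertions needed to turn the ASR output into the reference transcript, divided by the number of words in the reference. The sketch below shows one common way to compute it with a word-level edit distance. The function name, the lowercase-and-split normalization, and the sample sentences are illustrative assumptions, not the scoring procedure used in the tests summarized above.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is normalized by lowercasing and splitting on
    whitespace only; the original tests may have normalized differently.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one substitution ("fox" -> "box") and one deletion ("jumps")
# against a five-word reference.
score = word_error_rate("the quick brown fox jumps", "the quick brown box")
print(score)  # 0.4
```

Because insertions count as errors, a WER above 1.0 is possible when the hypothesis is much longer than the reference, so the value is not simply one minus the fraction of words transcribed correctly.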
Company
AssemblyAI
Date published
Sept. 21, 2021
Author(s)
Yujian Tang
Word count
1984
Language
English
Hacker News points
None found.