How Well Does AI Transcribe Song Lyrics?

What's this blog post about?

Recent advances in deep learning have significantly improved AI's ability to recognize speech. While human transcription remains the gold standard, Automatic Speech Recognition (ASR) models can now transcribe song lyrics with a surprising level of accuracy. In tests on 15 songs from three genres, ASR achieved word error rates (WERs) ranging from 0.473 to 0.878, where a lower WER means a more accurate transcript. Female voices were generally recognized better than male voices across all genres. The most accurately transcribed song was Drake's "Hotline Bling," with a WER of 0.473, while the least accurately transcribed was Michael Jackson's "Thriller," with a WER of 0.878. Overall, the post estimates that ASR models transcribed about 20-30% of song lyrics accurately.
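
For readers unfamiliar with the metric, WER is conventionally the number of word-level substitutions, deletions, and insertions needed to turn the transcript into the reference, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not the code used in the original post; the function name `word_error_rate` and the sample phrases are hypothetical, and the normalization (lowercasing and whitespace splitting) is an assumption that may differ from what the author applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; real evaluations often also strip punctuation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over 5 words.
score = word_error_rate("the quick brown fox jumps", "the quick brown box")
print(score)
```

Because insertions count as errors, WER can exceed 1.0 when the transcript contains many extra words, so it should not be read as a simple percentage of words missed.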

Company
AssemblyAI

Date published
Sept. 21, 2021

Author(s)
Yujian Tang

Word count
1984

Language
English

Hacker News points
None found.


By Matt Makai. 2021-2024.