Getting Started with ESPnet

What's this blog post about?

The post walks through transcribing audio files into text using ESPnet's pretrained models for Automatic Speech Recognition (ASR). Audio files are first converted to .wav format if they are not already in it, then passed through the speech2text object. The resulting transcriptions are post-processed with a helper function named "text_normalizer", which removes punctuation and converts all text to uppercase. The normalized transcriptions are then compared with their corresponding true transcriptions to check accuracy.
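
The normalization and comparison steps can be sketched as follows. This is a minimal illustration using only the Python standard library, not the post's exact code: the body of "text_normalizer" is assumed from the description above (uppercase, punctuation removed), "matches_reference" is a hypothetical helper, and the sample strings are made up rather than taken from real ESPnet output.

```python
import string


def text_normalizer(text: str) -> str:
    """Uppercase the text and strip ASCII punctuation (assumed behavior)."""
    text = text.upper()
    # str.maketrans with a third argument maps each listed character to None,
    # so translate() deletes every ASCII punctuation mark.
    return text.translate(str.maketrans("", "", string.punctuation))


def matches_reference(hypothesis: str, reference: str) -> bool:
    """Hypothetical helper: compare two transcripts word by word after normalizing."""
    return text_normalizer(hypothesis).split() == text_normalizer(reference).split()


# Illustrative strings only; in the post the hypothesis would come from speech2text.
hypothesis = "Hello, world! This is a test."
reference = "HELLO WORLD THIS IS A TEST"

print(text_normalizer(hypothesis))
print(matches_reference(hypothesis, reference))
```

Normalizing both sides before comparing keeps differences in casing or punctuation from counting as transcription errors.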

Company
AssemblyAI

Date published
June 6, 2022

Author(s)
Ryan O'Connor

Word count
1714

Language
English

Hacker News points
None found.