Improved Accuracy on AssemblyAI’s Real Time Speech-to-Text API

Company

AssemblyAI

Date Published

Oct. 6, 2021

Author

Yujian Tang

Word count

504

Language

English

Hacker News points

None

URL

www.assemblyai.com/blog/improved-accuracy-on-assemblyais-real-time-speech-to-text-api

Summary

AssemblyAI has updated its real-time Speech Recognition system, improving accuracy while keeping the same model. The gains come from improved training methods and a change to vocabulary tokenization. Training now uses an intermediate CTC (Connectionist Temporal Classification) loss and a bidirectional loss for AED (attention-based encoder-decoder) models, which improve the quality of lower-level text representations and give the model a better understanding of language. In addition, the updated model learns to mark the start of words instead of emitting a "blank" or "separator" token, which makes sentence structure easier to predict.
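
The tokenization change can be illustrated with a small sketch. It is not AssemblyAI's implementation: the `_` word-start marker, the `<sep>` separator token, and both function names are hypothetical, chosen only to contrast a vocabulary that spends an extra token on each word boundary with one that folds the boundary into the first token of each word.

```python
# Hypothetical markers for illustration only; the real vocabulary is not public here.
WORD_START = "_"      # prefix on the first sub-word token of each word
SEPARATOR = "<sep>"   # standalone token placed between words


def decode_with_separator(tokens):
    """Rebuild text when word boundaries are explicit separator tokens."""
    words, current = [], []
    for tok in tokens:
        if tok == SEPARATOR:
            # A separator closes the word collected so far.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(tok)
    if current:
        words.append("".join(current))
    return " ".join(words)


def decode_with_word_start(tokens):
    """Rebuild text when the first token of each word carries a marker."""
    words = []
    for tok in tokens:
        if tok.startswith(WORD_START):
            # A marked token opens a new word; drop the marker itself.
            words.append(tok[len(WORD_START):])
        elif words:
            # An unmarked token continues the current word.
            words[-1] += tok
        else:
            # Unmarked token at the very start: treat it as the first word.
            words.append(tok)
    return " ".join(words)


separator_tokens = ["he", "llo", SEPARATOR, "wor", "ld"]
word_start_tokens = ["_he", "llo", "_wor", "ld"]

text_a = decode_with_separator(separator_tokens)
text_b = decode_with_word_start(word_start_tokens)
```

Both sequences decode to the same text, but the word-start scheme needs one fewer token per word boundary, so the model has no separate boundary symbol to predict.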