Company
Date Published
Author
Yujian Tang
Word count
504
Language
English
Hacker News points
None

Summary

AssemblyAI has updated its real-time speech recognition system, improving accuracy while keeping the same model. The upgrades cover two areas: training methods and vocabulary tokenization. On the training side, the new system uses an intermediate CTC (Connectionist Temporal Classification) loss and a bidirectional loss for AED (attention-based encoder-decoder) models. These raise the quality of the lower-level text representations and give the model a better understanding of language. On the tokenization side, the updated model learns the start of words instead of relying on a "blank" or "separator" token, which makes sentence structure easier to predict.
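
To make the tokenization change concrete, here is a minimal sketch contrasting the two ways of marking word boundaries. It is an illustration only, not AssemblyAI's actual tokenizer: the character-level vocabulary, the `<sep>` token, the `WORD_START` marker, and all function names are assumptions introduced for this example.

```python
from typing import List

# Hypothetical word-start marker (U+2581), chosen for illustration only.
WORD_START = "\u2581"


def tokenize_with_separator(text: str, sep: str = "<sep>") -> List[str]:
    """Character tokens with a dedicated separator token between words."""
    tokens: List[str] = []
    for i, word in enumerate(text.split()):
        if i > 0:
            tokens.append(sep)  # the boundary is its own token to predict
        tokens.extend(word)
    return tokens


def tokenize_with_word_start(text: str) -> List[str]:
    """Character tokens where the first character of each word carries the marker."""
    tokens: List[str] = []
    for word in text.split():
        for j, ch in enumerate(word):
            # Word-initial characters get a distinct token; no separator is emitted.
            tokens.append(WORD_START + ch if j == 0 else ch)
    return tokens


def detokenize_word_start(tokens: List[str]) -> str:
    """Rebuild the text: every marked token opens a new word."""
    words: List[str] = []
    for tok in tokens:
        # Marked tokens are exactly two characters: the marker plus one character.
        if len(tok) == 2 and tok[0] == WORD_START:
            words.append(tok[1])
        elif words:
            words[-1] += tok
        else:
            words.append(tok)
    return " ".join(words)


if __name__ == "__main__":
    print(tokenize_with_separator("hi all"))
    print(tokenize_with_word_start("hi all"))
```

In this toy scheme the word-start variant produces a shorter sequence, because the boundary information rides on the first character of each word rather than occupying a position of its own.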