Improved Accuracy on AssemblyAI’s Real Time Speech-to-Text API
AssemblyAI has updated its real-time speech recognition system, improving accuracy while keeping the same model architecture. The upgrades cover training methods and vocabulary tokenization. Training now uses an intermediate CTC (connectionist temporal classification) loss and a bidirectional loss for AED (attention-based encoder-decoder) models, which together improve the quality of lower-level text representations and give the model a better understanding of language. In addition, the updated model learns to mark the start of each word instead of emitting a "blank" or "separator" token between words, which makes sentence structure easier to predict.
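The tokenization change can be illustrated with a small sketch. This is not AssemblyAI's tokenizer: the marker `▁`, the `<sep>` token, the function names, and the fixed three-character chunking (a stand-in for a learned subword vocabulary) are all assumptions made for illustration. It only shows how a word-start marker carries the same word-boundary information as a separator token while producing a shorter sequence.

```python
# Illustrative sketch only; all names and the chunking rule are hypothetical.
WORD_START = "▁"   # assumed marker attached to the first piece of each word
SEP = "<sep>"      # assumed standalone separator token placed between words


def pieces(word, size=3):
    """Split a word into fixed-size chunks (stand-in for learned subwords)."""
    return [word[i:i + size] for i in range(0, len(word), size)]


def tokenize_with_separator(text):
    """Older-style scheme: emit an extra token between consecutive words."""
    tokens = []
    for i, word in enumerate(text.split()):
        if i > 0:
            tokens.append(SEP)
        tokens.extend(pieces(word))
    return tokens


def tokenize_with_word_start(text):
    """Word-start scheme: the first piece of every word carries the marker."""
    tokens = []
    for word in text.split():
        parts = pieces(word)
        parts[0] = WORD_START + parts[0]
        tokens.extend(parts)
    return tokens


def detokenize_separator(tokens):
    """Rebuild text by turning each separator token into a space."""
    return "".join(" " if t == SEP else t for t in tokens)


def detokenize_word_start(tokens):
    """Rebuild text by turning each word-start marker into a space."""
    return "".join(tokens).replace(WORD_START, " ").strip()


if __name__ == "__main__":
    text = "speech recognition works"
    print(tokenize_with_separator(text))
    print(tokenize_with_word_start(text))
```

In this toy setup both schemes round-trip to the original text, but the word-start scheme needs no extra boundary tokens, so the model predicts fewer symbols per utterance.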
Company
AssemblyAI
Date published
Oct. 6, 2021
Author(s)
Yujian Tang
Word count
504
Hacker News points
None found.
Language
English