Improved Accuracy on AssemblyAI’s Real Time Speech-to-Text API

What's this blog post about?

AssemblyAI has updated its real-time speech recognition system, improving accuracy while keeping the same model. The upgrade consists of improved training methods and a change to vocabulary tokenization. Training now uses an intermediate CTC (connectionist temporal classification) loss and a bidirectional loss for attention-based encoder-decoder (AED) models, which improve the quality of lower-level text representations and give the model a better understanding of language. In addition, the updated model learns where words start instead of relying on a "blank" or "separator" token, which makes sentence structure easier to predict.
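
The tokenization change described above can be illustrated with a small sketch. The summary does not give AssemblyAI's actual vocabulary or marker symbol, so everything here is an assumption for illustration: the `<sep>` token, the "▁" word-start marker (borrowed from the SentencePiece convention), the example tokens, and the helper names `join_with_separator` and `join_with_word_start` are all hypothetical. The sketch only contrasts the two ways of encoding word boundaries; it is not the model's decoder.

```python
# Hypothetical word-start marker, following the SentencePiece convention.
# The marker AssemblyAI actually uses is not stated in the summary.
WORD_START = "\u2581"  # the character "▁"


def join_with_separator(tokens, sep="<sep>"):
    """Decode tokens where a standalone separator token marks word boundaries."""
    words, current = [], []
    for tok in tokens:
        if tok == sep:
            # A separator closes the word collected so far.
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(tok)
    if current:
        words.append("".join(current))
    return " ".join(words)


def join_with_word_start(tokens, marker=WORD_START):
    """Decode tokens where a prefix on the token itself marks a word start."""
    return "".join(tokens).replace(marker, " ").strip()


# Separator scheme: word boundaries cost extra tokens the model must predict.
sep_tokens = ["real", "<sep>", "time", "<sep>", "speech"]

# Word-start scheme: the boundary is part of the token that begins the word.
start_tokens = [WORD_START + "real", WORD_START + "time", WORD_START + "spe", "ech"]

print(join_with_separator(sep_tokens))
print(join_with_word_start(start_tokens))
```

Both calls decode to the same text. In the word-start scheme the boundary information travels with the first piece of each word, so the model has no separate boundary token to emit.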

Company
AssemblyAI

Date published
Oct. 6, 2021

Author(s)
Yujian Tang

Word count
504

Language
English

Hacker News points
None found.