
Introducing Our New Punctuation Restoration and Truecasing Models

What's this blog post about?

AssemblyAI has introduced new models for Punctuation Restoration and Truecasing that outperform the previous production models across a range of datasets and metrics. The largest gains are in casing for linguistically challenging word types: mixed-case words (+39% F1 score), acronyms (+20% F1 score), and capital-case words (+11% F1 score). Overall, upper-case letter prediction shows a 17% relative improvement on average across test datasets, and punctuation accuracy improves by 11% (F1 score). The new models are already in production, so API users benefit from the upgrade automatically.
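To make the metrics above concrete, here is a minimal Python sketch of how a per-class casing F1 score and a relative improvement could be computed. The summary does not say how AssemblyAI defines its casing classes or evaluates them, so the `casing_class` scheme, the function names, and the example words and scores are all assumptions chosen for illustration, not the author's method or reported data.

```python
def casing_class(word: str) -> str:
    """Assign a word to a coarse casing class (hypothetical scheme)."""
    letters = [c for c in word if c.isalpha()]
    if not letters or all(c.islower() for c in letters):
        return "lower"
    if all(c.isupper() for c in letters):
        # A single upper-case letter such as "I" is treated as capital-case.
        return "acronym" if len(letters) > 1 else "capital"
    if letters[0].isupper() and all(c.islower() for c in letters[1:]):
        return "capital"
    return "mixed"


def f1_for_class(reference: list[str], predicted: list[str], label: str) -> float:
    """F1 score for one casing class over word-aligned sequences."""
    tp = fp = fn = 0
    for ref, pred in zip(reference, predicted):
        in_ref = casing_class(ref) == label
        in_pred = casing_class(pred) == label
        tp += in_ref and in_pred
        fp += in_pred and not in_ref
        fn += in_ref and not in_pred
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def relative_improvement(old: float, new: float) -> float:
    """Relative change of `new` over `old`, e.g. 0.17 means +17%."""
    return (new - old) / old


# Made-up example: the prediction gets "iPhone" wrong as "Iphone".
reference = ["I", "bought", "an", "iPhone", "from", "NASA"]
predicted = ["I", "bought", "an", "Iphone", "from", "NASA"]

for label in ("mixed", "acronym", "capital"):
    print(label, round(f1_for_class(reference, predicted, label), 3))

# Illustrative scores only: 0.50 -> 0.585 is a 17% relative improvement.
print(round(relative_improvement(0.50, 0.585), 2))
```

In this toy case the single mis-cased word lowers the mixed-case score (a miss) and the capital-case score (a false positive) at the same time, which is one reason per-class F1 is more informative for truecasing than a single overall accuracy figure.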

Company
AssemblyAI

Date published
Nov. 8, 2023

Author(s)
Marco Ramponi

Word count
1759

Language
English

Hacker News points
None found.