Deep Learning Paper Recap - Language Models
The paper "Prune Once For All: Sparse Pre-Trained Language Models" introduces an architecture-agnostic method of training sparse pre-trained language models, allowing for pruning only during the pre-training phase. This technique results in better compression-to-accuracy ratios and eliminates the need to reconsider the model's architecture or task when applying pruning techniques during fine-tuning. The best scores were achieved with 85% and 90% weight pruning, while Quantized Aware Training (QAT) with 85% pruning led to an even more accurate and smaller model.
Company: AssemblyAI
Date published: July 7, 2022
Author(s): Taufiquzzaman Peyash
Word count: 273
Language: English
Hacker News points: None found.