Deep Learning Paper Recap - Diffusion and Transformer Models
This week's Deep Learning Paper Reviews discuss two research papers. The first applies continuous diffusion models to controllable natural language generation (NLG), improving text generation through its "rounding" and "embedding" steps, which map between discrete tokens and the continuous space in which the diffusion process runs. Results show it outperforms existing methods such as PPLM and FUDGE, though slow decoding speed remains a major bottleneck. The second paper proposes representation pooling to sparsify transformer architectures, achieving sublinear time and memory complexity. The analysis shows a 1.8x speedup during training and a 4.5x speedup during inference on long-document summarization tasks, though the approach may be less useful for short input sequences.
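To make the "embedding" and "rounding" steps of the first paper more concrete, here is a minimal sketch of how such a mapping between discrete tokens and continuous vectors could look. The function names, shapes, and the nearest-embedding rounding rule are illustrative assumptions, not the paper's reference implementation.

```python
import torch

def embed(token_ids: torch.Tensor, emb: torch.nn.Embedding) -> torch.Tensor:
    """Embedding step (sketch): map discrete token ids into the
    continuous space where the diffusion process operates."""
    return emb(token_ids)  # (batch, seq_len, dim)

def round_to_tokens(x: torch.Tensor, emb: torch.nn.Embedding) -> torch.Tensor:
    """Rounding step (sketch): map denoised continuous vectors back to
    discrete tokens by choosing the nearest word embedding."""
    # x: (batch, seq_len, dim); emb.weight: (vocab_size, dim)
    vocab = emb.weight.unsqueeze(0).expand(x.size(0), -1, -1)
    dists = torch.cdist(x, vocab)          # (batch, seq_len, vocab_size)
    return dists.argmin(dim=-1)            # (batch, seq_len) token ids
```

In this sketch, generation would run the learned denoising process in the continuous embedding space and then call the rounding step once at the end to recover a token sequence.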
Company
AssemblyAI
Date published
Aug. 24, 2022
Author(s)
Dillon Pulliam, Sergio Ramirez Martin
Word count
373
Hacker News points
None found.
Language
English