FlexGen: High-throughput generative inference of large language models with a single GPU
Company
Together AI
Date published
March 13, 2023
Author(s)
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
Word count
317
Hacker News points
None found.
Language
English