Content Deep Dive
FlexGen: High-throughput generative inference of large language models with a single GPU
Company
Together AI
Date Published
March 13, 2023
Author
Ying Sheng, Lianmin Zheng, Binhang Yuan, Zhuohan Li, Max Ryabinin, Daniel Y. Fu, Zhiqiang Xie, Beidi Chen, Clark Barrett, Joseph E. Gonzalez, Percy Liang, Christopher Ré, Ion Stoica, Ce Zhang
Word count
317
Language
English
Hacker News points
None
URL
www.together.ai/blog/flexgen-high-throughput-generative-inference-of-large-language-models-with-a-single-gpu
Summary
No summary generated yet.