Medusa: Simple framework for accelerating LLM generation with multiple decoding heads
What's this blog post about?
Company
Together AI
Date published
Sept. 11, 2023
Author(s)
Tianle Cai*, Yuhong Li*, Zhengyang Geng, Hongwu Peng, Tri Dao (* Equal contribution)
Word count
2817
Language
English
Hacker News points
None found.