Content Deep Dive
SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices
Company
Together AI
Date Published
June 18, 2024
Authors
Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin
Word count
1308
Language
English
Hacker News points
None
URL
www.together.ai/blog/specexec
Summary
No summary generated yet.