SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices

What's this blog post about?

Company
Together AI

Date published
June 18, 2024

Author(s)
Ruslan Svirschevski, Avner May, Zhuoming Chen, Beidi Chen, Zhihao Jia, Max Ryabinin

Word count
1308

Hacker News points
None found.

Language
English


By Matt Makai. 2021-2024.