FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

What's this blog post about?

Company
Together AI

Date published
July 11, 2024

Author(s)
Jay Shah (Colfax Research), Ganesh Bikshandi (Colfax Research), Ying Zhang (Meta), Vijay Thakkar (NVIDIA), Pradeep Ramani (NVIDIA), Tri Dao (Princeton University, Together AI)

Word count
1753

Hacker News points
287

Language
English


By Matt Makai. 2021-2024.