FlashAttention-3: Fast and Accurate Attention with Asynchrony and Low-precision

What's this blog post about?

Company
Together AI

Date published
July 11, 2024

Author(s)
Jay Shah (Colfax Research), Ganesh Bikshandi (Colfax Research), Ying Zhang (Meta), Vijay Thakkar (NVIDIA), Pradeep Ramani (NVIDIA), Tri Dao (Princeton University, Together AI)

Word count
1753

Language
English

Hacker News points
287


By Matt Makai. 2021-2024.