Mamba-3B-SlimPJ: State-space models rivaling the best Transformer architecture
Company
Together AI
Date published
Dec. 12, 2023
Author(s)
Tri Dao, Albert Gu
Word count
550
Hacker News points
None found.
Language
English