Introducing Jamba: AI21's Groundbreaking SSM-Transformer Model
AI21 Labs introduces Jamba, a language model that combines the Mamba structured state space model (SSM) with elements of the traditional Transformer architecture. This hybrid approach addresses the limitations of pure SSM models and delivers substantial gains in throughput and memory efficiency. Jamba offers a 256K-token context window and matches or outperforms other state-of-the-art models in its size class on a range of benchmarks. Releasing Jamba with open weights invites further experimentation and optimization, opening new possibilities for language model development.
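The summary above doesn't spell out how SSM and Transformer layers are interleaved. As a rough illustration only, a hybrid stack can be described as a repeating schedule of mixer types; the ratios below (one attention layer per block, MoE on alternating layers) are illustrative assumptions, not figures taken from this announcement:

```python
# Hypothetical sketch of a hybrid SSM-Transformer layer schedule.
# The specific ratios here are assumptions for illustration, not
# the announced Jamba configuration.
def hybrid_layer_schedule(n_layers=8, attn_every=8, moe_every=2):
    """Return (mixer, ffn) labels for one block of the stack."""
    layers = []
    for i in range(n_layers):
        # One attention layer per attn_every layers; the rest use Mamba.
        mixer = "attention" if i % attn_every == 0 else "mamba"
        # Mixture-of-experts FFN on alternating layers, dense MLP otherwise.
        ffn = "moe" if i % moe_every == 1 else "mlp"
        layers.append((mixer, ffn))
    return layers

schedule = hybrid_layer_schedule()
print(schedule)
```

The idea such a schedule captures is that a few attention layers restore global, content-based routing that pure SSM stacks lack, while the Mamba layers keep per-token cost and KV-cache memory low at long context lengths.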
Company
AI21 Labs
Date published
March 28, 2024
Author(s)
-
Word count
3385
Hacker News points
8
Language
English