Long Context, But Actually
AI21 Labs Co-CEO Yoav Shoham discusses how the company built Jamba-Instruct, a foundation model with a 256K-token context window, to close the gap between claimed and effective context length. The model is designed to serve long-context workloads efficiently and offers a longer context window than most competing models. The post addresses three key questions: whether a model with a long context window actually does something useful with it, whether long-context models can be served with acceptable latency and unit economics, and whether long context still matters in the age of RAG.
Company
AI21 Labs
Date published
June 26, 2024
Author(s)
-
Word count
3912
Language
English