Company
Date Published
April 19, 2024
Author
Stephen Oladele
Word count
2947
Language
English
Hacker News points
None

Summary

Meta has released Llama 3, pre-trained and instruction-fine-tuned language models with 8 billion (8B) and 70 billion (70B) parameters, setting a new state of the art for openly available models at these sizes. The instruction-fine-tuned variants are tuned to follow user instructions, reflecting Meta's commitment to building helpful and safe AI systems. Llama 3 was trained on over 15 trillion tokens; the 8B model scores 66.6 on MMLU and 45.9 on AGIEval English, while the 70B model outperforms other state-of-the-art models on various benchmarks. The models are available across many platforms, including cloud providers and hosting services, making them accessible to researchers, developers, and businesses. Meta plans to push LLM capabilities further with larger models, multimodality, multilingual support, and longer context windows, with a continued focus on instruction-following and real-world impact.
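
For readers who want to try the instruction-tuned weights, below is a minimal sketch of loading the 8B Instruct model through the Hugging Face transformers library. This is an illustration, not code from the article: it assumes access to the gated meta-llama/Meta-Llama-3-8B-Instruct repository, which requires accepting Meta's license and authenticating with a Hugging Face token.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Gated checkpoint; assumes the license has been accepted on Hugging Face
# and the environment is authenticated (e.g. via `huggingface-cli login`).
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # half-precision to fit the 8B model on one GPU
    device_map="auto",           # place weights on available device(s)
)

# Format a chat turn with the model's built-in chat template.
messages = [{"role": "user", "content": "Summarize what Llama 3 is in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=64)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```

The same checkpoint is served by the cloud providers and hosting platforms mentioned above, so the local-loading step can be swapped for a hosted API call without changing the prompt format.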