Together AI has partnered with Meta to release Llama 3.1 models for inference and fine-tuning, offering accelerated performance at full accuracy. The largest openly available foundation model, Llama 3.1 405B, rivals the best closed source models in AI and is now available on the Together Inference Platform. This platform delivers horizontal scalability with industry-leading performance, empowering developers to build Generative AI applications at production scale. With innovations like FlashAttention-3 kernels and custom-built speculators, the Together Inference Engine enables unmatched performance, accuracy, and cost-efficiency for Llama 3.1 models. Over 100,000 developers and companies are already building and running their Generative AI applications on the Together Platform, while new serverless endpoints and dedicated instances are available for deployment. The launch of Llama 3.1 models also includes significant advancements like expanded context length to 128K, support across 8 languages, and new security and safety tools for responsible development.