Company:
Date Published:
Author: Richard Liaw, Jules S. Damji, Jiajun Yao
Word count: 1692
Language: English
Hacker News points: None

Summary

The Ray 2.4 release brings improvements across the Ray ecosystem: ease-of-use, stability, and observability enhancements in Ray Data; better Ray Serve observability; a new RLModule abstraction in RLlib for defining custom reinforcement learning models; improved scalability for large clusters, with support for workloads of up to 2,000 nodes; and a new LightningTrainer to scale PyTorch Lightning on Ray. The release aims to make Ray a pivotal compute substrate for generative AI workloads and to address the challenges of open-source generative AI infrastructure. New examples show how to run Stable Diffusion and LLMs such as GPT-J on Ray, fine-tune these models with DeepSpeed and Hugging Face, and build an open-source search engine with Ray and LangChain.
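The core idea behind RLlib's RLModule abstraction is separating the model (its forward passes for inference and exploration) from the training algorithm that drives it. The following is a hypothetical plain-Python sketch of that separation; the class and method names here are illustrative only and are not the actual RLlib API.

```python
import random


class RLModule:
    """Hypothetical minimal analogue of an RL module: it owns the model
    and its forward passes, while the training loop lives elsewhere.
    Names are illustrative, not the real RLlib interface."""

    def forward_inference(self, obs):
        """Deterministic action selection for serving/evaluation."""
        raise NotImplementedError

    def forward_exploration(self, obs):
        """Possibly stochastic action selection for data collection."""
        raise NotImplementedError


class EpsilonGreedyModule(RLModule):
    """A custom module: greedy at inference, epsilon-greedy when exploring."""

    def __init__(self, q_values, epsilon=0.1):
        self.q_values = q_values  # maps observation -> list of action values
        self.epsilon = epsilon

    def forward_inference(self, obs):
        values = self.q_values[obs]
        # Pick the action index with the highest estimated value.
        return max(range(len(values)), key=values.__getitem__)

    def forward_exploration(self, obs):
        if random.random() < self.epsilon:
            return random.randrange(len(self.q_values[obs]))
        return self.forward_inference(obs)


module = EpsilonGreedyModule({"s0": [0.1, 0.9]}, epsilon=0.0)
print(module.forward_inference("s0"))
```

Because the algorithm only ever calls the module's forward methods, swapping in a different custom model never requires touching the training code, which is the motivation behind this kind of abstraction.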