Company
Date Published
Author
Frank Sifei Luan, UC Berkeley
Word count
1257
Language
English
Hacker News points
36

Summary

We have achieved a new world record on the CloudSort benchmark using Ray, a general-purpose distributed execution framework, by developing Exoshuffle, a simple and flexible architecture for building distributed shuffle that achieves high performance. This record is 33% more cost-efficient than the previous one, set in 2016, and demonstrates Ray's ability to simplify high-performance distributed programming and achieve scalability and performance required for running challenging distributed data processing jobs. The success of Exoshuffle-CloudSort can be attributed to innovations in both open-source software and cloud infrastructure, including disaggregated storage that allows scaling compute and storage independently and elastically, and Ray's distributed futures abstraction that enables building complex distributed applications with fine-grained control. This achievement opens up the possibility for running a rich set of data processing workloads on Ray, and we are working to bring Exoshuffle capabilities into Ray Datasets to support use cases such as shuffling data loading for ML training.