
Online Resource Allocation with Ray at Ant Group

What's this blog post about?

Ant Group built a flexible, high-performance, stable, and scalable online resource allocation system on Ray to support Double 11, the largest online shopping event in the world. The deployment has reached more than 6,000 CPU cores and now serves several application scenarios, including marketing and order allocation. Online resource allocation is a complex engineering problem: it depends on both offline and real-time data, and the algorithm combines real-time and iterative computation. Ray offers a simple, easy-to-use API, convenient resource scheduling, and fault-tolerant recovery within seconds, which keeps the service available. The Ray-based system has run stably in production at Ant Group through large-scale events such as Double 11 and Double 12.

Company
Anyscale

Date published
March 30, 2021

Author(s)
Xingyu Lu, Yang Liu, Tengwei Cai, Fengbin Fang

Word count
1838

Hacker News points
1

Language
English


By Matt Makai. 2021-2024.