Streaming distributed execution across CPUs and GPUs

Company

Anyscale

Date Published

May 11, 2023

Author

Eric Liang, Stephanie Wang, Cheng Su

Word count

2067

Language

English

Hacker News points

None

URL

www.anyscale.com/blog/streaming-distributed-execution-across-cpus-and-gpus

Summary

Ray Data provides streaming execution for large-scale batch inference workloads, improving performance on heterogeneous clusters that combine CPU and GPU devices. Execution is pipelined across the entire cluster, which avoids unnecessary overheads associated with bulk synchronous parallel frameworks. With end-to-end pipelining, Ray Data can handle demanding use cases such as video decoding, annotation, and classification, and it adds optimizations for memory stability, data locality, and fault tolerance. The streaming backend is fully backwards compatible with the existing API: users can still transform datasets lazily with map operations, run shuffle operations, cache or materialize data in memory, and more. Early users are using Ray Data streaming to build efficient large-scale inference pipelines over unstructured data, including video and audio.
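
The pipelined execution described in the summary can be illustrated with a small, standard-library-only sketch. It is not Ray Data's implementation or API. It only shows the general idea of streaming items through stages connected by bounded queues, so that a downstream stage (standing in for GPU inference) starts working before the upstream stage (standing in for CPU decoding) has finished the whole dataset. The names `stream_pipeline`, `run_stage`, `decode`, `classify`, and the `max_in_flight` parameter are all hypothetical and chosen for illustration.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def run_stage(fn, inbox, outbox):
    """Apply fn to each item from inbox and forward the result to outbox."""
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate end-of-stream to the next stage
            return
        outbox.put(fn(item))


def stream_pipeline(source, stages, max_in_flight=2):
    """Yield each source item after it has passed through every stage.

    Each stage runs in its own thread. The bounded queues between stages
    give backpressure: a fast producer blocks once max_in_flight items
    are waiting, so buffered data stays small regardless of input size.
    """
    queues = [queue.Queue(maxsize=max_in_flight) for _ in range(len(stages) + 1)]
    for i, fn in enumerate(stages):
        threading.Thread(
            target=run_stage, args=(fn, queues[i], queues[i + 1]), daemon=True
        ).start()

    def feed():
        for item in source:
            queues[0].put(item)  # blocks when the first stage is saturated
        queues[0].put(_DONE)

    threading.Thread(target=feed, daemon=True).start()

    while True:
        out = queues[-1].get()
        if out is _DONE:
            return
        yield out


# Hypothetical stand-ins for a CPU decode step and a GPU classify step.
def decode(frame_id):
    return {"frame": frame_id}


def classify(record):
    return {**record, "label": record["frame"] % 2}


if __name__ == "__main__":
    for result in stream_pipeline(range(5), [decode, classify]):
        print(result)
```

Because each stage here is a single thread reading from a FIFO queue, results come out in input order. A bulk synchronous version would instead run `decode` over every item before calling `classify` on any of them, which means holding all intermediate results at once.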