The third generation of machine learning (ML) architectures improves performance by exploiting the full bandwidth of distributed memory. Distributed libraries such as Ray Datasets and Ray Train offer greater programmability and lower operational and development overheads, while achieving better performance by passing data in memory and pipelining ingest with training. A key capability of these architectures is composability: developers can combine existing distributed systems to solve complex problems. The example code snippet below shows how a Ray Dataset pipeline can be connected to a distributed Ray Train job, expressing the ML ingest and training pipeline in just a few lines of Python. Compared to second-generation architectures, this approach yields lower operational and development overheads, better performance, and greater programmability.
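The sketch below illustrates this composition using the Ray Data and Ray Train APIs from a recent Ray 2.x release; the bucket path, preprocessing step, batch size, and worker count are illustrative placeholders rather than values from the original snippet, and the exact API surface may differ across Ray versions.

```python
# A minimal sketch: compose Ray Data (ingest) with Ray Train (distributed training).
# Paths and hyperparameters are hypothetical; assumes a recent Ray 2.x release.
import ray
from ray import train
from ray.train import ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Each training worker receives a streamed shard of the "train" dataset,
    # so ingest and training overlap without materializing data to disk.
    shard = train.get_dataset_shard("train")
    for _ in range(config["num_epochs"]):
        for batch in shard.iter_batches(batch_size=config["batch_size"]):
            pass  # forward/backward pass over `batch` would go here


# Ingest: read a Parquet dataset and apply a (placeholder) preprocessing step.
ds = ray.data.read_parquet("s3://example-bucket/training-data")  # hypothetical path
ds = ds.map_batches(lambda batch: batch)  # replace with real preprocessing

# Training: pass the dataset directly to a distributed Ray Train job.
trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"num_epochs": 2, "batch_size": 256},
    scaling_config=ScalingConfig(num_workers=4, use_gpu=False),
    datasets={"train": ds},
)
result = trainer.fit()
```

Because the dataset object is handed to the trainer in the same Python program, data moves between the ingest and training stages through Ray's distributed object store rather than through external storage, which is the in-memory, pipelined data passing the text describes.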