The latest release, Ray 2.5, introduces several key features and enhancements across the Ray ecosystem, including support for training large language models (LLMs) with Ray Train, serving LLMs with Ray Serve, and a new multi-node, multi-GPU learner stack in RLlib for cost-efficient and scalable reinforcement learning agent training. The new RLlib stack removes previous bottlenecks in agent training and delivers roughly 1.7x cost savings. Ray Serve now supports streaming responses to HTTP requests, improving the experience for interactive workloads, and adds model multiplexing, which lets a shared pool of replicas efficiently serve many models of similar shape, such as fine-tuned variants of the same architecture. Ray Data is also easier to use for batch inference, with a strict mode that requires a schema for every dataset and drops support for standalone Python objects. Overall, the Ray 2.5 release aims to improve ease of use, performance, and stability across the Ray ecosystem.
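To give a rough sense of the model multiplexing feature, the sketch below shows a deployment that lazily loads whichever model a request asks for. It assumes the experimental `serve.multiplexed` / `serve.get_multiplexed_model_id` API introduced around this release (exact argument names may vary by version), and `DummyModel` is a stand-in for real weight loading:

```python
from ray import serve
from starlette.requests import Request


class DummyModel:
    """Stand-in for a real model; echoes its id so routing is visible."""

    def __init__(self, model_id: str):
        self.model_id = model_id

    def predict(self, payload):
        return {"model_id": self.model_id, "input": payload}


@serve.deployment
class MultiplexedModel:
    @serve.multiplexed(max_num_models_per_replica=3)
    async def get_model(self, model_id: str) -> DummyModel:
        # In a real application this would download and instantiate the
        # weights for `model_id`; here we just build a dummy object.
        return DummyModel(model_id)

    async def __call__(self, request: Request):
        # Serve reads the requested model id from the incoming request
        # and prefers replicas that already have that model loaded.
        model_id = serve.get_multiplexed_model_id()
        model = await self.get_model(model_id)
        return model.predict(await request.json())


app = MultiplexedModel.bind()
```

After `serve.run(app)`, clients select a model per request (via the multiplexed model id header or handle option), and Serve routes to a replica that has it cached, evicting least-recently-used models once the per-replica limit is reached.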
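For Ray Data's strict mode, the practical change is that batches are dictionaries mapping column names to NumPy arrays rather than arbitrary Python objects, and user functions must return the same kind of structure. A minimal sketch (the column name `value` is illustrative):

```python
import ray

# Under strict mode, every dataset has a schema; from_items() with dicts
# yields a dataset with a single "value" column.
ds = ray.data.from_items([{"value": i} for i in range(8)])


def double(batch: dict) -> dict:
    # Batches arrive as {column name: numpy array}; return the same
    # structure so the output dataset keeps an explicit schema.
    return {"value": batch["value"] * 2}


print(ds.map_batches(double).take(4))
# e.g. [{'value': 0}, {'value': 2}, {'value': 4}, {'value': 6}]
```

Code that previously mapped over bare Python objects needs to be updated to this columnar form when upgrading.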