Company
Date Published
Author
Sven Mika
Word count
1475
Language
English
Hacker News points
None

Summary

RLlib, an open-source reinforcement learning library, has introduced two new features: support for attention networks as custom models, and a "trajectory view API". Models such as recurrent neural networks (RNNs) and attention nets need access to previous observations or actions, and the trajectory view API addresses two problems they raise: it makes such complex policy models possible to support, and it makes sample collection and retrieval faster, which in turn speeds up training. The API is dictionary-based: a model maps each input key to a "view requirement" object that tells RLlib how to assemble the data for that view, for example which previous observations or actions to include. Because the model declares exactly what it needs, RLlib can reduce memory complexity and improve performance. The feature is particularly useful for environments like "stateless" CartPole, which can only be solved by stacking past observations and actions.
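The idea of a dictionary that maps keys to view requirements can be illustrated with a small, library-free sketch. It does not use RLlib's actual classes: `ViewSpec`, `build_view`, and `build_model_input` are hypothetical names invented for this example, and the scalar observations are placeholder data. The sketch assumes each view is described by a source column and a set of relative timestep shifts, and that timesteps before the episode start are padded with a default value.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class ViewSpec:
    """Hypothetical stand-in for a "view requirement" (not RLlib's class)."""
    data_col: str            # collected column to read, e.g. "obs" or "actions"
    shifts: Tuple[int, ...]  # relative timesteps to include, e.g. (-2, -1, 0)


def build_view(trajectory: Dict[str, List], spec: ViewSpec, t: int, pad=0.0) -> List:
    """Read spec.data_col at timesteps t + shift, padding outside the episode."""
    column = trajectory[spec.data_col]
    view = []
    for shift in spec.shifts:
        idx = t + shift
        view.append(column[idx] if 0 <= idx < len(column) else pad)
    return view


def build_model_input(trajectory: Dict[str, List],
                      view_specs: Dict[str, ViewSpec],
                      t: int) -> Dict[str, List]:
    """Assemble one model input dict from the key -> ViewSpec mapping."""
    return {key: build_view(trajectory, spec, t) for key, spec in view_specs.items()}


# Placeholder trajectory: each column is stored once, one entry per timestep.
trajectory = {
    "obs": [0.1, 0.2, 0.3, 0.4],
    "actions": [0, 1, 1, 0],
}

# The model declares what it wants to see at each step.
view_specs = {
    "prev_obs_stack": ViewSpec(data_col="obs", shifts=(-2, -1, 0)),
    "prev_action": ViewSpec(data_col="actions", shifts=(-1,)),
}

model_input = build_model_input(trajectory, view_specs, t=1)
print(model_input)
```

In this sketch the stacked view is computed on demand from a single stored copy of each column, which is one way a declared requirement can keep memory use down compared with storing a pre-stacked copy of the history at every timestep.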