Attention Nets and More with RLlib's Trajectory View API

Company

Anyscale

Date Published

April 21, 2021

Author

Sven Mika

Word count

1475

Language

English

Hacker News points

None

URL

www.anyscale.com/blog/attention-nets-and-more-with-rllibs-trajectory-view-api

Summary

RLlib, an open-source reinforcement learning library, has introduced two new features: support for attention networks as custom models and a "trajectory view API". The trajectory view API lets complex policy models declare which parts of a trajectory they need, so that RLlib can collect and retrieve samples efficiently. This speeds up training of RL algorithms with models such as recurrent neural networks (RNNs) and attention nets, which require access to previous observations or actions. It addresses two problems: supporting complex models in the first place, and making sample collection and retrieval faster. The API is dictionary-based: it maps keys to "view requirement" objects, each defining how RLlib should handle the input data for that view. Models can thereby state requirements such as a view of previous observations or actions, which reduces memory complexity and improves performance. The feature is particularly useful for environments like "stateless" CartPole, which can only be solved by stacking past observations and actions.
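
To make the dictionary-of-view-requirements idea concrete, here is a minimal, library-free sketch. It does not use RLlib's actual classes: `ViewSpec`, `build_view`, the `shifts` field, and the toy trajectory are all hypothetical names and data introduced for illustration, and the zero-padding before the episode start is an assumption. The sketch shows how a model might request "the last three observations" and "the previous action" by key, with each view resolved on demand from a single stored trajectory instead of from pre-stacked copies.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class ViewSpec:
    """Hypothetical stand-in for a "view requirement" (not RLlib's class)."""
    data_col: str          # collected column to read from, e.g. "obs"
    shifts: Sequence[int]  # time offsets relative to the current step t


def build_view(
    trajectory: Dict[str, List[List[float]]],
    specs: Dict[str, ViewSpec],
    t: int,
) -> Dict[str, List[List[float]]]:
    """Resolve every requested view at timestep t.

    Steps that fall outside the trajectory are zero-padded
    (an assumption made for this sketch).
    """
    view = {}
    for key, spec in specs.items():
        column = trajectory[spec.data_col]
        zero = [0.0] * len(column[0])
        view[key] = [
            column[t + s] if 0 <= t + s < len(column) else zero
            for s in spec.shifts
        ]
    return view


# Toy data: observations are stored once, one row per timestep.
trajectory = {
    "obs": [[0.0, 0.1], [0.2, 0.3], [0.4, 0.5], [0.6, 0.7]],
    "actions": [[0.0], [1.0], [1.0], [0.0]],
}

# The model declares what it needs as a dict of key -> view spec.
specs = {
    "prev_3_obs": ViewSpec("obs", shifts=(-2, -1, 0)),
    "prev_action": ViewSpec("actions", shifts=(-1,)),
}

view_t0 = build_view(trajectory, specs, t=0)  # padded at episode start
view_t3 = build_view(trajectory, specs, t=3)  # fully inside the episode
```

Because the stacked views are assembled from offsets into one buffer, each observation is stored only once, which is the kind of memory saving the summary attributes to the API.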