Company
Date Published
Author: Misha Laskin
Word count: 1189
Language: English
Hacker News points: None

Summary

This series on reinforcement learning explores Q functions and their role in Q-learning algorithms. The goal of an RL algorithm is to learn a policy that achieves maximum expected return in its environment. A Q function predicts how much return an agent expects to receive if it takes a specific action in a given state, so the agent can act by choosing the action with the highest predicted value. The Bellman error serves as a loss function for RL and can be minimized by training a neural network to approximate the Q function. The Q-learning algorithm trains an agent to minimize the Bellman error by sampling transitions from a replay buffer and choosing actions with an epsilon-greedy strategy. This simple algorithm powered breakthroughs like Deep Q-Networks (DQN) and is a foundation for many other algorithms in Deep RL.
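The loop the summary describes (epsilon-greedy action selection, a replay buffer of transitions, and updates that shrink the Bellman error) can be sketched in a few lines. This is a minimal illustration, not the article's code: it invents a toy 5-state chain environment and uses a tabular Q function in place of the neural network the article discusses, since the Bellman-error update has the same shape either way.

```python
import random
import collections

# Hypothetical toy environment: a 1-D chain of 5 states.
# Actions: 0 = left, 1 = right. Reward 1 only for reaching the goal state.
N_STATES, N_ACTIONS, GOAL = 5, 2, 4

def step(state, action):
    """One environment transition."""
    next_state = max(0, min(GOAL, state + (1 if action == 1 else -1)))
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train(episodes=400, alpha=0.5, gamma=0.9, epsilon=0.2, batch=8, seed=0):
    rng = random.Random(seed)
    Q = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # tabular stand-in for a Q network
    buffer = collections.deque(maxlen=1000)           # replay buffer of transitions
    for _ in range(episodes):
        s = 0
        for _ in range(100):                          # cap episode length
            # epsilon-greedy: explore with probability epsilon, else act greedily
            if rng.random() < epsilon:
                a = rng.randrange(N_ACTIONS)
            else:
                a = max(range(N_ACTIONS), key=lambda x: Q[s][x])
            s2, r, done = step(s, a)
            buffer.append((s, a, r, s2, done))
            # sample past transitions and nudge Q toward the Bellman target,
            # i.e. take a step that reduces the Bellman error
            for bs, ba, br, bs2, bdone in rng.sample(list(buffer), min(batch, len(buffer))):
                target = br + (0.0 if bdone else gamma * max(Q[bs2]))
                Q[bs][ba] += alpha * (target - Q[bs][ba])
            s = s2
            if done:
                break
    return Q

Q = train()
```

After training, the greedy policy (pick the action with the largest Q value in each state) moves right toward the goal from every state. In Deep Q Networks the table is replaced by a network and the per-transition update becomes a gradient step on the squared Bellman error, but the sampling and action-selection structure is the same.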