The wisdom of the crowd: reliable deep reinforcement learning through ensembles of Q-functions

Date

2018

Authors

Elliott, Daniel L., author
Anderson, Charles W., advisor
Draper, Bruce, committee member
Kirby, Michael, committee member
Chong, Edwin, committee member

Journal Title

Journal ISSN

Volume Title

Abstract

Reinforcement learning agents learn by exploring the environment and then exploiting what they have learned. This frees human trainers from having to know the preferred action or intrinsic value of each encountered state. The cost of this freedom is that reinforcement learning can be slow and unstable during learning, exhibiting performance like that of a randomly initialized Q-function just a few parameter updates after solving the task. We explore the possibility that ensemble methods can remedy these shortcomings by investigating a novel technique which harnesses the wisdom of the crowd by bagging Q-function approximator estimates. Our results show that the proposed approach improves performance on all tasks and reinforcement learning approaches attempted. We demonstrate that this is a direct result of the increased stability of the action portion of the state-action-value function used by Q-learning to select actions and by policy gradient methods to train the policy. Recently developed methods attempt to solve these RL challenges at the cost of increasing the number of interactions with the environment by several orders of magnitude. The proposed approach, by contrast, has little downside: it addresses these RL challenges while reducing the number of interactions with the environment.
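The core idea of bagging Q-function estimates can be sketched in a few lines. This is a minimal illustration, not the dissertation's implementation: it assumes an ensemble of simple linear Q-functions (the thesis uses neural network approximators trained on bootstrap samples), averages their state-action-value estimates, and acts greedily on the average.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: an ensemble of K linear Q-functions over a small
# discrete-action problem (4 state features, 3 actions). Each member is
# just a random weight matrix here; in practice each would be a network
# trained on its own bootstrap sample of experience.
K, n_features, n_actions = 5, 4, 3
ensemble = [rng.normal(size=(n_features, n_actions)) for _ in range(K)]

def q_values(weights, state):
    """Q(s, a) for every action under one ensemble member."""
    return state @ weights

def ensemble_action(ensemble, state):
    """Average the members' Q-estimates (bagging) and act greedily."""
    avg_q = np.mean([q_values(w, state) for w in ensemble], axis=0)
    return int(np.argmax(avg_q))

state = rng.normal(size=n_features)
action = ensemble_action(ensemble, state)
```

Averaging over members smooths the per-member noise in the Q-estimates, which is the stability effect the abstract attributes to the crowd: a single member's momentary misestimate is unlikely to flip the argmax of the averaged values.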

Description

Rights Access

Subject

machine learning
Q-learning
ensemble
reinforcement learning
neural networks

Citation

Associated Publications