The wisdom of the crowd: reliable deep reinforcement learning through ensembles of Q-functions

Date

2018

Authors

Elliott, Daniel L., author
Anderson, Charles W., advisor
Draper, Bruce, committee member
Kirby, Michael, committee member
Chong, Edwin, committee member

Journal Title

Journal ISSN

Volume Title

Abstract

Reinforcement learning agents learn by exploring the environment and then exploiting what they have learned. This frees human trainers from having to know the preferred action or intrinsic value of each encountered state. The cost of this freedom is that reinforcement learning can be slow and unstable during learning, exhibiting performance like that of a randomly initialized Q-function just a few parameter updates after solving the task. We explore the possibility that ensemble methods can remedy these shortcomings by investigating a novel technique which harnesses the wisdom of the crowd by bagging Q-function approximator estimates. Our results show that the proposed approach improves performance on all tasks and reinforcement learning approaches attempted. We demonstrate that this is a direct result of the increased stability of the action portion of the state-action-value function used by Q-learning to select actions and by policy gradient methods to train the policy. Recently developed methods attempt to solve these RL challenges at the cost of increasing the number of interactions with the environment by several orders of magnitude. The proposed approach, by contrast, has little downside: it addresses these RL challenges while reducing the number of interactions with the environment.
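The core idea of bagging Q-function estimates can be sketched in a few lines. This is a minimal illustration, not the dissertation's implementation: it assumes an ensemble of simple linear Q-functions (the thesis uses neural network approximators trained on bootstrap samples), averages their state-action-value estimates, and acts greedily on the average.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: an ensemble of K linear Q-functions over a small
# discrete-action problem (4 state features, 3 actions). Each member is
# just a random weight matrix here; in practice each would be a network
# trained on its own bootstrap sample of experience.
K, n_features, n_actions = 5, 4, 3
ensemble = [rng.normal(size=(n_features, n_actions)) for _ in range(K)]

def q_values(weights, state):
    """Q(s, a) for every action under one ensemble member."""
    return state @ weights

def ensemble_action(ensemble, state):
    """Average the members' Q-estimates (bagging) and act greedily."""
    avg_q = np.mean([q_values(w, state) for w in ensemble], axis=0)
    return int(np.argmax(avg_q))

state = rng.normal(size=n_features)
action = ensemble_action(ensemble, state)
```

Averaging over members smooths the per-member noise in the Q-estimates, which is the stability effect the abstract attributes to the crowd: a single member's momentary misestimate is unlikely to flip the argmax of the averaged values.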

Description

Rights Access

Subject

machine learning
Q-learning
ensemble
reinforcement learning
neural networks

Citation

Associated Publications