Ensemble and Auxiliary Tasks for Data-Efficient Deep Reinforcement Learning

Muhammad Rizki Maulana; Wee Sun Lee

arXiv:2107.01904 · cs.LG · July 7, 2021


TL;DR

This paper studies how ensembles and auxiliary tasks interact when combined with deep Q-learning under a limited-data constraint on ATARI games, and supports the case study with a theoretical analysis.

Contribution

The paper analyzes the combination of ensembles and auxiliary tasks in deep Q-learning and derives a refined bias-variance-covariance decomposition to explain their effects.
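
For orientation, the classical bias-variance-covariance decomposition for an averaged ensemble is sketched below. This is the standard textbook form, not the paper's refined version, which is not reproduced on this page. The symbols are assumptions introduced here: $f_i$ is the prediction of member $i$, $M$ is the ensemble size, $y$ is a fixed noise-free target, and expectations are taken over training randomness.

```latex
\bar{f} = \frac{1}{M}\sum_{i=1}^{M} f_i,
\qquad
\mathbb{E}\!\left[(\bar{f} - y)^2\right]
  = \overline{\mathrm{bias}}^{\,2}
  + \frac{1}{M}\,\overline{\mathrm{var}}
  + \left(1 - \frac{1}{M}\right)\overline{\mathrm{covar}},

\overline{\mathrm{bias}} = \frac{1}{M}\sum_{i=1}^{M}\bigl(\mathbb{E}[f_i] - y\bigr),
\qquad
\overline{\mathrm{var}} = \frac{1}{M}\sum_{i=1}^{M}\mathbb{E}\!\left[(f_i - \mathbb{E}[f_i])^2\right],

\overline{\mathrm{covar}} = \frac{1}{M(M-1)}\sum_{i=1}^{M}\sum_{j \neq i}
  \mathbb{E}\!\left[(f_i - \mathbb{E}[f_i])(f_j - \mathbb{E}[f_j])\right].
```

Read this way, a larger ensemble shrinks the variance term by $1/M$, while the covariance term remains unless the members are decorrelated.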

Findings

01

Ensemble and auxiliary tasks improve data efficiency in deep RL.

02

The combined approach outperforms individual methods in ATARI games.

03

Theoretical analysis clarifies how these methods influence bias and variance.

Abstract

Ensemble and auxiliary tasks are both well known to improve the performance of machine learning models when data is limited. However, the interaction between these two methods is not well studied, particularly in the context of deep reinforcement learning. In this paper, we study the effects of ensemble and auxiliary tasks when combined with the deep Q-learning algorithm. We perform a case study on ATARI games under a limited-data constraint. Moreover, we derive a refined bias-variance-covariance decomposition to analyze the different ways of learning ensembles and using auxiliary tasks, and use the analysis to help provide some understanding of the case study. Our code is open source and available at https://github.com/NUS-LID/RENAULT.
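
As a rough illustration of what a bias-variance-covariance decomposition measures, the sketch below numerically checks the classical identity for an averaged ensemble: ensemble squared error = (average bias)^2 + (average variance)/M + (1 - 1/M) * (average covariance). It is not the refined decomposition derived in the paper and is not taken from the RENAULT repository; the function names `bvc_terms` and `ensemble_mse`, the synthetic predictions, and all constants are assumptions made for this example.

```python
import numpy as np


def bvc_terms(preds, target):
    """Average bias, variance, and pairwise covariance of ensemble members.

    preds: array of shape (n_trials, M), one column per ensemble member,
           one row per independent training run (hypothetical data).
    target: the fixed, noise-free value every member tries to predict.
    """
    n_trials, m = preds.shape
    member_means = preds.mean(axis=0)            # estimate of E[f_i]
    centered = preds - member_means
    cov = centered.T @ centered / n_trials       # (M, M) covariance, ddof=0
    avg_bias = (member_means - target).mean()
    avg_var = np.trace(cov) / m
    avg_cov = (cov.sum() - np.trace(cov)) / (m * (m - 1))
    return avg_bias, avg_var, avg_cov


def ensemble_mse(preds, target):
    """Squared error of the averaged ensemble prediction, averaged over trials."""
    return ((preds.mean(axis=1) - target) ** 2).mean()


rng = np.random.default_rng(0)
M, n_trials, target = 5, 10_000, 1.0

# Synthetic members: a constant offset (bias), a noise source shared by all
# members (which induces covariance), and independent per-member noise.
shared = rng.normal(size=(n_trials, 1))
own = rng.normal(size=(n_trials, M))
preds = target + 0.3 + 0.5 * shared + 0.5 * own

bias, var, cov = bvc_terms(preds, target)
lhs = ensemble_mse(preds, target)
rhs = bias**2 + var / M + (1 - 1 / M) * cov
print(f"ensemble MSE = {lhs:.6f}, decomposition = {rhs:.6f}")
```

With empirical moments computed this way the two sides agree up to floating-point error. Raising the weight on `shared` increases the covariance term, which averaging over more members does not remove.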

Code & Models

Repositories

NUS-LID/RENAULT
PyTorch · Official

Taxonomy

Methods: Q-Learning