Preventing Value Function Collapse in Ensemble Q-Learning by Maximizing Representation Diversity

Hassam Ullah Sheikh; Ladislau Bölöni

arXiv:2006.13823·cs.LG·January 24, 2022

Open Access

TL;DR

This paper introduces a regularization method that increases diversity among the learners in ensemble Q-learning, reducing overestimation bias and outperforming the Maxmin and Ensemble Q-learning algorithms.

Contribution

It proposes and compares five regularization functions, inspired by economic theory and consensus optimization, that keep the learners in a Q-learning ensemble from collapsing to the same point in parameter or representation space.

Findings

01. Regularization significantly improves ensemble diversity.

02. The regularized ensembles outperform Maxmin and Ensemble Q-learning, as well as non-ensemble baselines.

03. The approach reduces overestimation bias.

Abstract

The classic DQN algorithm is limited by the overestimation bias of the learned Q-function. Subsequent algorithms have proposed techniques to reduce this problem, without fully eliminating it. Recently, the Maxmin and Ensemble Q-learning algorithms have used the different estimates provided by ensembles of learners to reduce the overestimation bias. Unfortunately, these learners can converge to the same point in the parametric or representation space, falling back to the classic single-network DQN. In this paper, we describe a regularization technique to maximize ensemble diversity in these algorithms. We propose and compare five regularization functions inspired by economic theory and consensus optimization. We show that the regularized approach significantly outperforms the Maxmin and Ensemble Q-learning algorithms as well as non-ensemble baselines.
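
To make the mechanism in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. `ensemble_targets` shows the two standard ways an ensemble's next-state estimates are combined into a bootstrap target: the minimum over members (Maxmin Q-learning) or the mean over members (Ensemble Q-learning). `gini_of_norms` is a hypothetical diversity score: the abstract does not name the five regularization functions, so the Gini coefficient (a standard inequality measure from economics) applied to the L2 norms of per-member representation vectors is used here purely as an illustrative assumption. All function and argument names are invented for this sketch, and terminal-state handling is omitted.

```python
import numpy as np


def ensemble_targets(q_next, rewards, gamma=0.99, mode="maxmin"):
    """Bootstrap targets from an ensemble of next-state Q-value estimates.

    q_next:  array of shape (N, B, A) for N ensemble members, a batch of
             B transitions, and A actions.
    rewards: array of shape (B,).
    mode:    "maxmin" takes the minimum over members before the max over
             actions; "mean" averages over members instead.
    """
    if mode == "maxmin":
        combined = q_next.min(axis=0)   # shape (B, A)
    elif mode == "mean":
        combined = q_next.mean(axis=0)  # shape (B, A)
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # Greedy value of the combined estimate, discounted and added to reward.
    return rewards + gamma * combined.max(axis=1)


def gini_of_norms(representations):
    """Gini coefficient of the L2 norms of per-member representations.

    representations: array of shape (N, D), one vector per ensemble member.
    Returns 0.0 when all norms are equal (a collapsed ensemble under this
    measure) and approaches 1 as the norms become more unequal.
    Illustrative assumption only; not taken from the source text.
    """
    norms = np.sort(np.linalg.norm(representations, axis=1))
    n = norms.size
    total = norms.sum()
    if total == 0.0:
        return 0.0
    ranks = np.arange(1, n + 1)
    return float(((2 * ranks - n - 1) * norms).sum() / (n * total))


if __name__ == "__main__":
    # Two members, one transition, two actions.
    q_next = np.array([[[1.0, 5.0]], [[3.0, 2.0]]])
    rewards = np.array([0.0])
    print(ensemble_targets(q_next, rewards, gamma=0.5, mode="maxmin"))
    print(ensemble_targets(q_next, rewards, gamma=0.5, mode="mean"))
    print(gini_of_norms(np.ones((4, 3))))
```

In the toy batch, the first member alone would bootstrap from its largest estimate (5.0), while both combined targets use a lower value. If every member produced identical estimates, the minimum and the mean would both equal the single-network value, which is the collapse the abstract describes. In a training loop, a score such as `gini_of_norms` would presumably enter the loss with a negative sign and a weighting coefficient so that higher diversity is rewarded; that weighting is likewise an assumption here.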

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM · Reinforcement Learning in Robotics

Methods: Convolution · Dense Connections · Deep Q-Network · Q-Learning