Ensemble Elastic DQN: A novel multi-step ensemble approach to address overestimation in deep value-based reinforcement learning

Adrian Ly; Richard Dazeley; Peter Vamplew; Francisco Cruz; Sunil Aryal

arXiv:2506.05716 · cs.LG · June 9, 2025

TL;DR

This paper introduces Ensemble Elastic Step DQN (EEDQN), a new reinforcement learning algorithm combining ensemble and multi-step techniques to reduce overestimation bias, improve stability, and enhance sample efficiency in deep Q-learning.

Contribution

EEDQN unifies ensemble and elastic step updates, demonstrating improved stability and performance over standard and existing ensemble DQN variants on MinAtar benchmarks.

Findings

1. EEDQN outperforms baseline DQN methods across MinAtar environments.
2. EEDQN matches or exceeds state-of-the-art ensemble DQNs in final returns.
3. Systematic combination of ensemble and multi-step methods yields significant gains.

Abstract

While many algorithmic extensions to Deep Q-Networks (DQN) have been proposed, there remains limited understanding of how different improvements interact. In particular, multi-step and ensemble style extensions have shown promise in reducing overestimation bias, thereby improving sample efficiency and algorithmic stability. In this paper, we introduce a novel algorithm called Ensemble Elastic Step DQN (EEDQN), which unifies ensembles with elastic step updates to stabilise algorithmic performance. EEDQN is designed to address two major challenges in deep reinforcement learning: overestimation bias and sample efficiency. We evaluated EEDQN against standard and ensemble DQN variants across the MinAtar benchmark, a set of environments that emphasise behavioral learning while reducing representational complexity. Our results show that EEDQN achieves consistently robust performance across all…
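To make the two ingredients named in the abstract concrete, here is a minimal sketch of a multi-step bootstrapped target that combines an ensemble of Q-estimates. It is not the authors' EEDQN: the abstract does not say how ensemble members are combined or how the step length is chosen. This sketch assumes a fixed step length (the number of rewards passed in) and a per-action minimum across members. The names `n_step_ensemble_target` and `bootstrap_bias` are hypothetical.

```python
import random
from statistics import mean


def n_step_ensemble_target(rewards, next_q_values, gamma=0.99, reduce=min):
    """Discounted n-step return plus a bootstrapped ensemble value.

    rewards: rewards collected over n consecutive steps (n = len(rewards)).
    next_q_values: one list of action values per ensemble member, all for
        the state reached after those n steps.
    reduce: how members are combined per action. Using min is an
        assumption of this sketch; pass statistics.mean to average instead.
    """
    n = len(rewards)
    discounted = sum(gamma ** k * r for k, r in enumerate(rewards))
    num_actions = len(next_q_values[0])
    # Combine the ensemble per action, then act greedily on the result.
    combined = [reduce(member[a] for member in next_q_values)
                for a in range(num_actions)]
    return discounted + gamma ** n * max(combined)


def bootstrap_bias(num_members, num_actions=4, noise=1.0, trials=2000, seed=0):
    """Average bootstrapped value when every true action value is zero.

    Each member's estimate is the true value (0) plus Gaussian noise, so
    any nonzero average is bias introduced by the max over actions.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        members = [[rng.gauss(0.0, noise) for _ in range(num_actions)]
                   for _ in range(num_members)]
        total += n_step_ensemble_target([], members, gamma=1.0)
    return total / trials


if __name__ == "__main__":
    print("single estimator bias:", bootstrap_bias(1))
    print("5-member min bias:   ", bootstrap_bias(5))
```

With a single noisy estimator, taking the max over actions gives a positive average even though every true value is zero, which is the overestimation bias the abstract refers to. The per-action minimum over several members pushes the estimate down and can overshoot into underestimation, so the choice of reduction is a trade-off rather than a fix.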

Taxonomy

Topics: Reinforcement Learning in Robotics · Software-Defined Networks and 5G · Amyotrophic Lateral Sclerosis Research