Ensemble Elastic DQN: A novel multi-step ensemble approach to address overestimation in deep value-based reinforcement learning
Adrian Ly, Richard Dazeley, Peter Vamplew, Francisco Cruz, Sunil Aryal

TL;DR
This paper introduces Ensemble Elastic Step DQN (EEDQN), a new reinforcement learning algorithm combining ensemble and multi-step techniques to reduce overestimation bias, improve stability, and enhance sample efficiency in deep Q-learning.
Contribution
EEDQN unifies ensemble and elastic step updates, demonstrating improved stability and performance over standard and existing ensemble DQN variants on MinAtar benchmarks.
Findings
EEDQN outperforms baseline DQN methods across MinAtar environments.
EEDQN matches or exceeds state-of-the-art ensemble DQNs in final returns.
Systematic combination of ensemble and multi-step methods yields significant gains.
Abstract
While many algorithmic extensions to Deep Q-Networks (DQN) have been proposed, there remains limited understanding of how different improvements interact. In particular, multi-step and ensemble style extensions have shown promise in reducing overestimation bias, thereby improving sample efficiency and algorithmic stability. In this paper, we introduce a novel algorithm called Ensemble Elastic Step DQN (EEDQN), which unifies ensembles with elastic step updates to stabilise algorithmic performance. EEDQN is designed to address two major challenges in deep reinforcement learning: overestimation bias and sample efficiency. We evaluated EEDQN against standard and ensemble DQN variants across the MinAtar benchmark, a set of environments that emphasise behavioral learning while reducing representational complexity. Our results show that EEDQN achieves consistently robust performance across all…
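To make the idea of combining ensembles with multi-step updates concrete, the following is a minimal sketch of a multi-step bootstrap target that uses a pessimistic ensemble estimate. It is an illustration under stated assumptions, not the authors' implementation: the function name `multi_step_ensemble_target`, its parameters, and the aggregation rule (per-action minimum across ensemble members, then a greedy maximum) are all hypothetical choices made here. The abstract does not say how EEDQN chooses the step length or aggregates its ensemble, so the step length is simply taken to be the number of rewards the caller passes in.

```python
from typing import Sequence


def multi_step_ensemble_target(
    rewards: Sequence[float],
    bootstrap_q_values: Sequence[Sequence[float]],
    gamma: float = 0.99,
    terminal: bool = False,
) -> float:
    """Hypothetical multi-step target with a pessimistic ensemble bootstrap.

    rewards: rewards r_t, ..., r_{t+n-1} collected over n steps; n may vary
        from call to call (how n is chosen is left to the caller).
    bootstrap_q_values: one list of per-action Q-values per ensemble member,
        all evaluated at the state reached after the n steps.
    terminal: True if the episode ended within the n steps, in which case
        no bootstrap value is added.
    """
    # Discounted sum of the observed rewards.
    n_step_return = sum((gamma ** k) * r for k, r in enumerate(rewards))
    if terminal:
        return n_step_return

    # Assumption: take the minimum across ensemble members for each action,
    # which counteracts overestimation, then act greedily on that estimate.
    num_actions = len(bootstrap_q_values[0])
    per_action_min = [
        min(member[a] for member in bootstrap_q_values)
        for a in range(num_actions)
    ]
    bootstrap = max(per_action_min)

    return n_step_return + (gamma ** len(rewards)) * bootstrap


# Example: three observed rewards, two ensemble members, two actions.
target = multi_step_ensemble_target(
    rewards=[1.0, 0.0, 2.0],
    bootstrap_q_values=[[1.0, 4.0], [3.0, 2.0]],
    gamma=0.5,
)
print(target)
```

In this example a single network would bootstrap from its own maximum (4.0 or 3.0), whereas the ensemble minimum gives 2.0, which shows how disagreement between members pulls the target down.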
Taxonomy
Topics: Reinforcement Learning in Robotics · Software-Defined Networks and 5G · Amyotrophic Lateral Sclerosis Research
