Modified Double DQN: addressing stability

Shervin Halat; Mohammad Mehdi Ebadzadeh; Kiana Amani

arXiv:2108.04115 · cs.AI · October 30, 2024 · 1 citation

Open Access

TL;DR

This paper proposes three modifications to the Double DQN algorithm to improve its stability and reduce overestimation, supported by empirical and theoretical evaluations.

Contribution

The paper introduces three novel modifications to DDQN that enhance stability and maintain or improve performance over the original algorithm.

Findings

01. The modified algorithms show improved stability over DDQN.

02. None of the modifications underperforms DDQN in overestimation correction.

03. Empirical and theoretical results validate the effectiveness of the proposed modifications.

Abstract

Inspired by the Double Q-learning algorithm, the Double DQN (DDQN) algorithm was originally proposed to address the overestimation issue in the original DQN algorithm. DDQN has shown, both theoretically and empirically, the importance of decoupling action selection from action evaluation when computing target values. These benefits were obtained with only a simple adaptation of the DQN algorithm, the minimal possible change, as its authors noted. Nevertheless, DDQN appears to take a step backward: the parameters of the policy network re-emerge in the target value function, although DQN had originally removed them in the hope of tackling the serious issue of moving targets and the instability they cause during learning. Therefore, in this paper three modifications to the DDQN…
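To make the decoupling described in the abstract concrete, the following is a minimal NumPy sketch of the standard DQN and Double DQN target computations. It does not implement the paper's three modifications, which are not specified in the text above. The function names, array names, and example values are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def dqn_target(reward, done, q_target_next, gamma=0.99):
    """DQN target: the target network both selects and evaluates the next action."""
    max_next = q_target_next.max(axis=1)
    return reward + gamma * (1.0 - done) * max_next


def ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: the online (policy) network selects the next action,
    and the target network evaluates it."""
    best_action = q_online_next.argmax(axis=1)
    evaluated = q_target_next[np.arange(len(best_action)), best_action]
    return reward + gamma * (1.0 - done) * evaluated


# Hypothetical batch of two transitions with three actions each.
reward = np.array([1.0, 0.0])
done = np.array([0.0, 1.0])  # the second transition is terminal
q_online_next = np.array([[1.0, 2.0, 0.5],
                          [0.2, 0.1, 0.9]])  # Q(s', .) from the online network
q_target_next = np.array([[1.5, 1.0, 3.0],
                          [0.3, 0.4, 0.2]])  # Q(s', .) from the target network

y_dqn = dqn_target(reward, done, q_target_next, gamma=0.9)
y_ddqn = ddqn_target(reward, done, q_online_next, q_target_next, gamma=0.9)
```

In `ddqn_target`, the `argmax` depends on the online network's outputs, so the target shifts whenever the online parameters are updated. This is the reappearance of policy-network parameters in the target that the abstract points to as a source of moving targets.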

Taxonomy

Topics: Network Security and Intrusion Detection

Methods: Experience Replay · Q-Learning · Dense Connections · Double Q-learning · Double DQN · Convolution · Deep Q-Network