Finite-Time Analysis of Simultaneous Double Q-learning

Hyunjun Na; Donghwan Lee

arXiv:2406.09946·cs.LG·January 13, 2026

TL;DR

This paper introduces simultaneous double Q-learning (SDQ), a variant of double Q-learning that drops the random selection between the two Q-estimators. It provides a finite-time convergence analysis of SDQ and reports empirically that SDQ converges faster than double Q-learning while still mitigating overestimation bias.

Contribution

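The paper proposes SDQ, a double Q-learning variant that eliminates the random selection between the two Q-estimators. This change allows a finite-time analysis through a switching system framework, and in the paper's experiments it also improves convergence speed.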

Findings

01. In the paper's empirical studies, SDQ converges faster than double Q-learning.

02. SDQ retains double Q-learning's ability to mitigate the overestimation (maximization) bias.

03. Finite-time error bounds are derived for SDQ.

Abstract

$Q$-learning is one of the most fundamental reinforcement learning (RL) algorithms. Despite its widespread success in various applications, it is prone to overestimation bias in the $Q$-learning update. To address this issue, double $Q$-learning employs two independent $Q$-estimators which are randomly selected and updated during the learning process. This paper proposes a modified double $Q$-learning, called simultaneous double $Q$-learning (SDQ), with its finite-time analysis. SDQ eliminates the need for random selection between the two $Q$-estimators, and this modification allows us to analyze double $Q$-learning through the lens of a novel switching system framework facilitating efficient finite-time analysis. Empirical studies demonstrate that SDQ converges faster than double $Q$-learning while retaining the ability to mitigate the maximization bias. Finally, we derive a…
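
The contrast the abstract draws between the two algorithms can be sketched in tabular form. This is only an illustrative sketch, not the authors' algorithm: the abstract does not state the SDQ update rule, so the simultaneous step below assumes that each estimator keeps the standard double Q-learning target (select the greedy action with one table, evaluate it with the other) and that both tables are updated from the same transition. The function names, the array layout `q[state, action]`, and the parameters `alpha` and `gamma` are all hypothetical.

```python
import numpy as np


def double_q_update(qa, qb, s, a, r, s_next, alpha, gamma, rng):
    """Standard double Q-learning step: update ONE randomly chosen table."""
    if rng.random() < 0.5:
        # Select the greedy action with A, evaluate it with B.
        a_star = int(np.argmax(qa[s_next]))
        qa[s, a] += alpha * (r + gamma * qb[s_next, a_star] - qa[s, a])
    else:
        # Select the greedy action with B, evaluate it with A.
        b_star = int(np.argmax(qb[s_next]))
        qb[s, a] += alpha * (r + gamma * qa[s_next, b_star] - qb[s, a])


def simultaneous_double_q_update(qa, qb, s, a, r, s_next, alpha, gamma):
    """Assumed SDQ-style step: update BOTH tables, no random selection."""
    a_star = int(np.argmax(qa[s_next]))
    b_star = int(np.argmax(qb[s_next]))
    # Compute both targets from the old tables before modifying either one.
    target_a = r + gamma * qb[s_next, a_star]
    target_b = r + gamma * qa[s_next, b_star]
    qa[s, a] += alpha * (target_a - qa[s, a])
    qb[s, a] += alpha * (target_b - qb[s, a])
```

Under this assumption, every transition updates both estimators, whereas the randomized version updates each one on roughly half of the transitions. That is one plausible reading of why removing the random selection could speed up convergence.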

Taxonomy

Topics: Image Processing Techniques and Applications · Machine Learning and ELM · Optical Systems and Laser Technology