ADDQ: Adaptive Distributional Double Q-Learning

Leif Döring; Benedikt Wille; Maximilian Birr; Mihail Bîrsan; Martin Slowik

arXiv:2506.19478 · cs.LG · June 26, 2025

TL;DR

This paper introduces ADDQ, a simple, locally adaptive method built on distributional RL that reduces overestimation bias in Q-value estimation and improves convergence in tabular, Atari, and MuJoCo environments.

Contribution

It presents a novel, easy-to-implement framework for locally adaptive overestimation control in distributional RL algorithms, supported by theoretical analysis and empirical results.

Findings

01. Improved convergence in tabular, Atari, and MuJoCo environments.

02. Effective reduction of overestimation bias.

03. Easy integration with existing distributional RL algorithms.

Abstract

Bias problems in the estimation of $Q$-values are a well-known obstacle that slows down the convergence of $Q$-learning and actor-critic methods. Part of the success of modern RL algorithms is due to direct or indirect overestimation-reduction mechanisms. We propose an easy-to-implement method built on top of distributional reinforcement learning (DRL) algorithms to deal with overestimation in a locally adaptive way. Our framework is simple to implement: existing distributional algorithms can be improved with a few lines of code. We provide theoretical evidence and use double $Q$-learning to show how to include locally adaptive overestimation control in existing algorithms. Experiments are provided for tabular, Atari, and MuJoCo environments.
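The overestimation the abstract refers to can be illustrated with a small simulation. The sketch below is not the ADDQ method and is not taken from the paper or its repository; it only contrasts the standard single (max) estimator with the standard double estimator that underlies double Q-learning. The setup is an assumption chosen for illustration: every action has true value 0 and sampled returns carry unit Gaussian noise. All function names and parameters are hypothetical.

```python
import random


def single_estimate(samples_per_action):
    # Max over sample means: the estimator implicit in the Q-learning max target.
    return max(sum(s) / len(s) for s in samples_per_action)


def double_estimate(samples_per_action):
    # Double estimator: split each action's samples into two halves,
    # pick the best action using half A, then evaluate it using half B.
    half_a = [s[: len(s) // 2] for s in samples_per_action]
    half_b = [s[len(s) // 2:] for s in samples_per_action]
    means_a = [sum(s) / len(s) for s in half_a]
    means_b = [sum(s) / len(s) for s in half_b]
    best = max(range(len(means_a)), key=means_a.__getitem__)
    return means_b[best]


def average_bias(estimator, n_actions=10, n_samples=10, n_trials=5000, seed=0):
    # Illustrative assumption: every action has true value 0, so the true
    # maximum is 0 and the average estimate equals the estimator's bias.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        samples = [
            [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
            for _ in range(n_actions)
        ]
        total += estimator(samples)
    return total / n_trials


single_bias = average_bias(single_estimate)
double_bias = average_bias(double_estimate)
print(f"single estimator bias: {single_bias:+.3f}")
print(f"double estimator bias: {double_bias:+.3f}")
```

In this symmetric setup the single estimator is clearly biased upward, while the double estimator averages close to zero. When action values differ, the double estimator can instead underestimate, which is one motivation for adjusting the amount of correction locally rather than applying a fixed rule everywhere.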

Code & Models

Repositories

bommehd/addq (PyTorch, official)

Videos

ADDQ: Adaptive Distributional Double Q-Learning (SlidesLive)