$\beta$-DQN: Improving Deep Q-Learning By Evolving the Behavior

Hongming Zhang; Fengshuo Bai; Chenjun Xiao; Chao Gao; Bo Xu; Martin Müller

arXiv:2501.00913·cs.LG·October 29, 2025

TL;DR

The paper introduces $\beta$-DQN, a simple and efficient exploration method for deep Q-learning that uses a behavior function to generate diverse policies, improving exploration without significant computational cost.

Contribution

It proposes $\beta$-DQN, a novel exploration approach that combines a behavior function with an adaptive policy selection mechanism, enhancing exploration in deep reinforcement learning.

Findings

01. $\beta$-DQN outperforms baseline methods on various tasks.

02. The method is easy to implement and adds minimal computational overhead to the standard DQN.

03. It balances exploration between state-action coverage and overestimation bias correction.

Abstract

While many sophisticated exploration methods have been proposed, their lack of generality and high computational cost often lead researchers to favor simpler methods like $\epsilon$-greedy. Motivated by this, we introduce $\beta$-DQN, a simple and efficient exploration method that augments the standard DQN with a behavior function $\beta$. This function estimates the probability that each action has been taken at each state. By leveraging $\beta$, we generate a population of diverse policies that balance exploration between state-action coverage and overestimation bias correction. An adaptive meta-controller is designed to select an effective policy for each episode, enabling flexible and explainable exploration. $\beta$-DQN is straightforward to implement and adds minimal computational overhead to the standard DQN. Experiments on both simple and challenging exploration domains show…
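The three ingredients named in the abstract (a behavior function, a population of policies derived from it, and a per-episode meta-controller) can be illustrated with a small tabular sketch. This is not the authors' implementation: the abstract does not specify how $\beta$ is trained, how the policies are constructed, or which selection rule the meta-controller uses. The count-based estimate, the threshold parameter `delta`, and the UCB1 rule below are all assumptions chosen for illustration, and every class and function name is hypothetical.

```python
import math
from collections import defaultdict


class TabularBehavior:
    """Assumed count-based estimate of beta(a | s): the fraction of visits
    to state s in which action a was taken. A deep variant would instead
    fit a network to logged (state, action) pairs."""

    def __init__(self, n_actions):
        self.n_actions = n_actions
        self.counts = defaultdict(lambda: [0] * n_actions)

    def update(self, state, action):
        self.counts[state][action] += 1

    def probs(self, state):
        c = self.counts[state]
        total = sum(c)
        if total == 0:
            # Unvisited state: fall back to a uniform estimate.
            return [1.0 / self.n_actions] * self.n_actions
        return [x / total for x in c]


def make_policy(delta):
    """Hypothetical policy family indexed by a threshold delta.
    If some action has beta <= delta (rarely tried), take the least-tried
    one; otherwise act greedily on the Q-values. Larger delta explores more."""

    def act(beta_probs, q_values):
        rare = [a for a, p in enumerate(beta_probs) if p <= delta]
        if rare:
            return min(rare, key=lambda a: beta_probs[a])
        return max(range(len(q_values)), key=lambda a: q_values[a])

    return act


class UCBMetaController:
    """Assumed UCB1 bandit over policy indices; the reward is the episode
    return obtained by the policy selected for that episode."""

    def __init__(self, n_policies, c=1.0):
        self.counts = [0] * n_policies
        self.means = [0.0] * n_policies
        self.c = c

    def select(self):
        # Try every policy once before applying the UCB rule.
        for i, n in enumerate(self.counts):
            if n == 0:
                return i
        t = sum(self.counts)
        return max(
            range(len(self.counts)),
            key=lambda i: self.means[i]
            + self.c * math.sqrt(math.log(t) / self.counts[i]),
        )

    def update(self, i, episode_return):
        self.counts[i] += 1
        self.means[i] += (episode_return - self.means[i]) / self.counts[i]


# Usage: a population of three policies, one chosen per episode.
policies = [make_policy(d) for d in (0.0, 0.05, 0.2)]
controller = UCBMetaController(len(policies))
behavior = TabularBehavior(n_actions=3)
chosen = controller.select()  # index of the policy to run this episode
```

In this sketch the policy with `delta = 0.0` deviates from greedy only for actions never taken in a state, while larger thresholds push harder toward state-action coverage. After each episode, calling `controller.update(chosen, episode_return)` lets the bandit shift toward whichever policy has recently paid off.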

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and ELM · Face and Expression Recognition

Methods: Dense Connections · Q-Learning · Convolution · Deep Q-Network