M$^2$DQN: A Robust Method for Accelerating Deep Q-learning Network

Zhe Zhang; Yukun Zou; Junjie Lai; Qing Xu

arXiv:2209.07809·cs.LG·September 19, 2022

TL;DR

M$^2$DQN introduces a Max-Mean loss framework that enhances data efficiency and accelerates learning in Deep Q-Networks by minimizing the maximum TD-error across multiple experience batches, improving speed and performance.

Contribution

The paper proposes a novel Max-Mean loss framework for DQN that improves data efficiency and learning speed by focusing on the maximum TD-error among multiple experience batches.

Findings

01 Significant improvement in learning speed and performance in gym games.

02 Effective integration with existing DQN techniques like Double DQN.

03 Enhanced data efficiency in reinforcement learning applications.

Abstract

Deep Q-learning Network (DQN) is a successful method that combines reinforcement learning with deep neural networks and has led to widespread application of reinforcement learning. One challenging problem when applying DQN or other reinforcement learning algorithms to real-world problems is data collection. Improving data efficiency is therefore one of the most important problems in reinforcement learning research. In this paper, we propose a framework that uses the Max-Mean loss in Deep Q-Network (M$^2$DQN). Instead of sampling one batch of experiences in each training step, we sample several batches from the experience replay and update the parameters such that the maximum TD-error of these batches is minimized. The proposed method can be combined with most existing DQN techniques by replacing the loss function. We verify the effectiveness of this framework…
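The sampling-and-selection step described in the abstract can be illustrated with a minimal sketch. It assumes a tabular Q-function, a squared one-step TD error averaged within each batch, and no separate target network; the abstract does not specify these details, so they are illustrative choices rather than the authors' implementation. The names `mean_td_loss`, `max_mean_loss`, and `sample_batches` are hypothetical.

```python
import random

def mean_td_loss(q, batch, gamma=0.99):
    """Mean squared one-step TD error of a tabular Q-function over one batch.

    Assumption: squared error and no target network (not stated in the abstract).
    Each transition is (state, action, reward, next_state, done).
    """
    total = 0.0
    for state, action, reward, next_state, done in batch:
        # Bootstrapped target: r + gamma * max_a' Q(s', a'), or just r at episode end.
        target = reward if done else reward + gamma * max(q[next_state])
        total += (target - q[state][action]) ** 2
    return total / len(batch)

def max_mean_loss(q, batches, gamma=0.99):
    """Return (loss, index) of the batch whose mean TD loss is largest."""
    losses = [mean_td_loss(q, batch, gamma) for batch in batches]
    worst = max(range(len(losses)), key=losses.__getitem__)
    return losses[worst], worst

def sample_batches(replay, num_batches, batch_size, rng=random):
    """Draw several independent mini-batches from the replay buffer."""
    return [rng.sample(replay, batch_size) for _ in range(num_batches)]

# Toy example: two states, two actions, two hand-built single-transition batches.
q = [[0.0, 0.0], [1.0, 0.0]]
batch_a = [(0, 0, 1.0, 1, False)]  # target = 1 + 0.5 * 1 = 1.5, Q = 0
batch_b = [(1, 0, 1.0, 0, True)]   # terminal: target = 1, Q = 1
loss, worst = max_mean_loss(q, [batch_a, batch_b], gamma=0.5)
```

In a gradient-based DQN, the value returned by `max_mean_loss` would stand in for the usual single-batch loss before the optimizer step, which matches the abstract's remark that the method works by replacing the loss function.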

Code & Models

Repositories

myyura/minimax-dqn (PyTorch, official)

Taxonomy

Topics: Sports Analytics and Performance · Reinforcement Learning in Robotics

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Dense Connections · Double Q-learning · Double DQN · Q-Learning · Convolution · Experience Replay · Deep Q-Network