Accelerated Target Updates for Q-learning

Bowen Weng; Huaqing Xiong; Wei Zhang

arXiv:1905.02841 · cs.LG · May 14, 2019

Open Access

TL;DR

This paper introduces an accelerated Q-learning method whose target update incorporates historical iterates of the Q function, an idea inspired by momentum methods in optimization. In the reported simulations, it improves convergence over vanilla Q-learning.

Contribution

The paper proposes an accelerated target update scheme for Q-learning, inspired by momentum-based methods, and establishes conditions under which the resulting algorithms converge.

Findings

01

In the reported simulations, the accelerated algorithms can improve convergence performance over vanilla Q-learning.

02

Validated on the FrozenLake grid world game, two discrete-time LQR problems from the DeepMind Control Suite, and Atari 2600 games.

03

The reported improvements concern convergence performance on commonly adopted RL test problems.

Abstract

This paper studies accelerations in Q-learning algorithms. We propose an accelerated target update scheme that incorporates the historical iterates of Q functions. The idea is conceptually inspired by momentum-based accelerated methods in optimization theory. Conditions under which the proposed accelerated algorithms converge are established. The algorithms are validated using commonly adopted test problems in reinforcement learning, including the FrozenLake grid world game, two discrete-time LQR problems from the DeepMind Control Suite, and the Atari 2600 games. Simulation results show that the proposed accelerated algorithms can improve convergence performance compared with the vanilla Q-learning algorithm.
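To make the idea of building a target from historical iterates concrete, here is a minimal sketch. It is not the authors' algorithm: this page does not state the paper's update rule, so the extrapolation form `Q_k + beta * (Q_k - Q_{k-1})` is an assumption borrowed from generic momentum methods, and the three-state MDP, `GAMMA`, and `beta` are hypothetical values chosen only so the toy iteration is guaranteed to converge.

```python
# Hypothetical deterministic MDP with 3 states and 2 actions.
# NEXT[s][a] is the next state, REWARD[s][a] the immediate reward.
NEXT = [[0, 1], [0, 2], [2, 2]]
REWARD = [[0.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
GAMMA = 0.5  # discount factor (assumed value for this toy problem)


def bellman(q):
    """Apply the Bellman optimality operator T to a Q table."""
    return [
        [REWARD[s][a] + GAMMA * max(q[NEXT[s][a]]) for a in range(2)]
        for s in range(3)
    ]


def iterate(beta, steps):
    """Q-iteration whose target is computed from an extrapolated iterate.

    beta = 0 recovers the vanilla update Q_{k+1} = T(Q_k).
    beta > 0 uses the previous iterate as well (assumed momentum-style form).
    """
    prev = [[0.0, 0.0] for _ in range(3)]
    curr = [row[:] for row in prev]
    for _ in range(steps):
        # Extrapolate along the direction of the last change.
        look = [
            [curr[s][a] + beta * (curr[s][a] - prev[s][a]) for a in range(2)]
            for s in range(3)
        ]
        prev, curr = curr, bellman(look)
    return curr


def residual(q):
    """Largest absolute Bellman residual |T(Q) - Q| over all entries."""
    t = bellman(q)
    return max(abs(t[s][a] - q[s][a]) for s in range(3) for a in range(2))


# beta = 0.2 satisfies GAMMA * (1 + 2 * beta) < 1, a simple sufficient
# condition for this toy iteration (not the condition from the paper).
q_vanilla = iterate(beta=0.0, steps=200)
q_momentum = iterate(beta=0.2, steps=200)
print(residual(q_vanilla), residual(q_momentum))
```

Both runs approach the same fixed point of the Bellman operator here. The sketch only illustrates where a historical iterate enters the target, and it says nothing about the speedups or convergence conditions reported in the paper.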

Taxonomy

Topics: Machine Learning and ELM · Domain Adaptation and Few-Shot Learning · Advanced Bandit Algorithms Research

Methods: Q-Learning