Periodic Q-Learning

Donghwan Lee; Niao He

arXiv:2002.09795 · cs.LG · February 25, 2020 · 1 citation

Open Access

TL;DR

This paper analyzes the periodic Q-learning algorithm, providing a theoretical understanding and demonstrating its improved sample complexity over standard Q-learning in reinforcement learning.

Contribution

It offers the first finite-time analysis of periodic Q-learning, explaining its effectiveness and better sample efficiency in solving infinite-horizon discounted Markov decision processes in the tabular setting.

Findings

01

Periodic Q-learning has a simpler finite-time analysis.

02

It achieves better sample complexity than standard Q-learning for finding an epsilon-optimal policy.

03

It provides a preliminary theoretical justification for target network techniques.

Abstract

The use of target networks is a common practice in deep reinforcement learning for stabilizing the training; however, theoretical understanding of this technique is still limited. In this paper, we study the so-called periodic Q-learning algorithm (PQ-learning for short), which resembles the technique used in deep Q-learning for solving infinite-horizon discounted Markov decision processes (DMDP) in the tabular setting. PQ-learning maintains two separate Q-value estimates - the online estimate and target estimate. The online estimate follows the standard Q-learning update, while the target estimate is updated periodically. In contrast to the standard Q-learning, PQ-learning enjoys a simple finite time analysis and achieves better sample complexity for finding an epsilon-optimal policy. Our result provides a preliminary justification of the effectiveness of utilizing target estimates or…
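The two-estimate scheme described in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' exact algorithm: the function name `periodic_q_learning`, the step size, the period length, the uniform sampling of state-action pairs, and the toy two-state MDP are all hypothetical choices made here. The sketch also assumes, by analogy with deep Q-learning, that the online update bootstraps from the target estimate, and that the target is overwritten with the online estimate at the end of each period.

```python
import random


def periodic_q_learning(P, R, gamma=0.9, alpha=0.1, period=50,
                        n_periods=200, seed=0):
    """Tabular sketch of periodic Q-learning (illustrative assumptions).

    P[s][a] is a list of (probability, next_state) pairs.
    R[s][a] is the reward for taking action a in state s.
    """
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    q_online = [[0.0] * n_actions for _ in range(n_states)]
    q_target = [row[:] for row in q_online]

    for _ in range(n_periods):
        for _ in range(period):
            # Assumption: state-action pairs are sampled uniformly at random.
            s = rng.randrange(n_states)
            a = rng.randrange(n_actions)
            probs, next_states = zip(*P[s][a])
            s_next = rng.choices(next_states, weights=probs)[0]
            # Assumption: the online update bootstraps from the frozen target.
            td_target = R[s][a] + gamma * max(q_target[s_next])
            q_online[s][a] += alpha * (td_target - q_online[s][a])
        # Periodic step: copy the online estimate into the target estimate.
        q_target = [row[:] for row in q_online]
    return q_online


# Hypothetical toy MDP: action a always leads to state a; action 1 pays 1.
P = [[[(1.0, 0)], [(1.0, 1)]],
     [[(1.0, 0)], [(1.0, 1)]]]
R = [[0.0, 1.0],
     [0.0, 1.0]]

q = periodic_q_learning(P, R)
greedy = [max(range(2), key=lambda a: q[s][a]) for s in range(2)]
print(q, greedy)
```

In this toy problem the optimal values are 1/(1 - 0.9) = 10 for action 1 and 0.9 × 10 = 9 for action 0 in both states, so the learned table can be checked against them. Holding the target fixed within a period means each inner loop tracks a fixed regression target, which is the stabilizing effect the abstract attributes to target estimates.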

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Machine Learning and Algorithms

Methods: Q-Learning