Periodic Regularized Q-Learning
Hyukjun Yang, Han-Dong Lim, Donghwan Lee

TL;DR
This paper introduces Periodic Regularized Q-Learning (PRQ), a new RL algorithm with finite-time convergence guarantees under linear function approximation, achieved through a novel regularization of the projection operator.
Contribution
It proposes a regularized projected value iteration method and extends it to a stochastic setting, ensuring stable convergence in RL with function approximation.
Findings
PRQ comes with finite-time convergence guarantees under linear function approximation.
Appropriately regularizing the projection operator makes the resulting projected value iteration a contraction.
Theoretical analysis confirms the stability and convergence of PRQ.
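To make the contraction finding concrete, here is a minimal numerical sketch of a regularized projected value iteration. It is not the paper's construction: the ridge-style form Phi (Phi^T D Phi + eta I)^{-1} Phi^T D, the names `Phi`, `d`, `eta`, `regularized_projection`, `bellman_optimality`, and `rp_vi`, and the small random MDP are all assumptions chosen for illustration.

```python
import numpy as np

def regularized_projection(Phi, d, eta):
    # Assumed ridge-style form: Phi (Phi^T D Phi + eta I)^{-1} Phi^T D,
    # where D = diag(d). This is an illustrative choice, not the paper's definition.
    G = Phi.T @ (d[:, None] * Phi) + eta * np.eye(Phi.shape[1])
    return Phi @ np.linalg.solve(G, Phi.T * d)

def bellman_optimality(q, P, R, gamma):
    # q is a flattened (S*A,) vector; P has shape (S, A, S); R has shape (S, A).
    S, A = R.shape
    v = q.reshape(S, A).max(axis=1)          # greedy state values
    return (R + gamma * (P @ v)).reshape(-1)  # (T q)(s, a), flattened

def rp_vi(Phi, d, P, R, gamma, eta, iters=500):
    # Iterate q <- Pi_eta T q starting from zero.
    Pi = regularized_projection(Phi, d, eta)
    q = np.zeros(Phi.shape[0])
    for _ in range(iters):
        q = Pi @ bellman_optimality(q, P, R, gamma)
    return q, Pi

# Hypothetical toy problem: 4 states, 3 actions, 3 features.
rng = np.random.default_rng(0)
S, A, p, gamma, eta = 4, 3, 3, 0.9, 3.0
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)            # rows become probability vectors
R = rng.random((S, A))
Phi = rng.uniform(-1.0, 1.0, size=(S * A, p))
d = np.full(S * A, 1.0 / (S * A))            # uniform state-action weighting

q_fix, Pi = rp_vi(Phi, d, P, R, gamma, eta)

# Sufficient condition for a sup-norm contraction of q -> Pi_eta T q.
modulus = gamma * np.linalg.norm(Pi, ord=np.inf)
# Fixed-point residual of the final iterate.
residual = np.max(np.abs(q_fix - Pi @ bellman_optimality(q_fix, P, R, gamma)))
print("contraction modulus bound:", modulus)
print("fixed-point residual:", residual)
```

In this toy setup, `eta` is large enough that `gamma` times the induced infinity-norm of the regularized projection is below 1, which is a sufficient condition for the composed map to contract in the sup norm. That is a property of the chosen numbers, not a statement of the conditions the paper proves.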
Abstract
In reinforcement learning (RL), Q-learning is a fundamental algorithm whose convergence is guaranteed in the tabular setting. However, this convergence guarantee does not hold under linear function approximation. To overcome this limitation, a significant line of research has introduced regularization techniques to ensure stable convergence under function approximation. In this work, we propose a new algorithm, periodic regularized Q-learning (PRQ). We first introduce regularization at the level of the projection operator and explicitly construct a regularized projected value iteration (RP-VI), subsequently extending it to a sample-based RL algorithm. By appropriately regularizing the projection operator, the resulting projected value iteration becomes a contraction. By extending this regularized projection into the stochastic setting, we establish the PRQ algorithm and provide a…
Taxonomy
Topics: Reinforcement Learning in Robotics · Stochastic Gradient Optimization Techniques · Adaptive Dynamic Programming Control
