Accelerating Primal-dual Methods for Regularized Markov Decision Processes
Haoya Li, Hsiang-fu Yu, Lexing Ying, and Inderjit Dhillon

TL;DR
This paper introduces a quadratically convexified primal-dual formulation for entropy-regularized Markov decision processes. Natural gradient ascent descent on this formulation converges globally at an exponential rate, a new interpolating metric accelerates convergence further, and numerical experiments demonstrate the improvement.
Contribution
The paper presents a quadratically convexified primal-dual formulation for entropy-regularized MDPs and an interpolating metric that accelerates convergence, supported by theoretical guarantees and numerical validation.
Findings
Natural gradient ascent descent on the new formulation has a global convergence guarantee
The convergence rate of that scheme is exponential
The interpolating metric yields significant further acceleration in the numerical results
Abstract
Entropy-regularized Markov decision processes have been widely used in reinforcement learning. This paper is concerned with the primal-dual formulation of the entropy-regularized problems. Standard first-order methods suffer from slow convergence due to the lack of strict convexity and concavity. To address this issue, we first introduce a new quadratically convexified primal-dual formulation. The natural gradient ascent descent of the new formulation enjoys a global convergence guarantee and an exponential convergence rate. We also propose a new interpolating metric that further accelerates the convergence significantly. Numerical results are provided to demonstrate the performance of the proposed methods under multiple settings.
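For readers unfamiliar with the problem class, the sketch below solves a tiny entropy-regularized MDP by iterating the soft Bellman equation V(s) = τ log Σ_a exp(Q(s,a)/τ), where Q(s,a) = r(s,a) + γ Σ_s' P(s'|s,a) V(s'). This is the standard fixed-point baseline, not the paper's primal-dual method or its interpolating metric. The reward-maximization convention, the function names, and the toy transition and reward tables are all assumptions made for illustration.

```python
import math


def logsumexp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def soft_value_iteration(P, r, gamma, tau, tol=1e-10, max_iter=100_000):
    """Fixed-point iteration on the soft Bellman equation (illustrative only).

    P[s][a][s2] : transition probability from state s to s2 under action a
    r[s][a]     : reward (assumed reward-maximization convention)
    gamma       : discount factor in [0, 1)
    tau         : entropy regularization strength, tau > 0
    """
    n = len(r)
    V = [0.0] * n
    for _ in range(max_iter):
        # Q(s, a) = r(s, a) + gamma * E[V(s')]
        Q = [[r[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
              for a in range(len(r[s]))]
             for s in range(n)]
        # Soft maximum over actions replaces the hard max of the unregularized MDP.
        V_new = [tau * logsumexp([q / tau for q in Q[s]]) for s in range(n)]
        diff = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if diff < tol:
            break
    # Softmax policy: pi(a|s) = exp((Q(s, a) - V(s)) / tau)
    policy = [[math.exp((q - V[s]) / tau) for q in Q[s]] for s in range(n)]
    return V, policy


# Hypothetical 2-state, 2-action MDP used only as a usage example.
P = [[[0.9, 0.1], [0.2, 0.8]],
     [[0.5, 0.5], [0.0, 1.0]]]
r = [[1.0, 0.0],
     [0.0, 2.0]]
V, pi = soft_value_iteration(P, r, gamma=0.9, tau=0.5)
```

Because the soft Bellman operator is a γ-contraction in the sup norm, this iteration converges linearly. It does not exhibit the slow convergence the abstract attributes to first-order methods on the primal-dual formulation, which is the difficulty the paper addresses.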
Taxonomy
Topics: Adaptive Dynamic Programming Control · Optimization and Variational Analysis · Reinforcement Learning in Robotics
