Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation
Jing Dong, Li Shen, Yinggan Xu, Baoxiang Wang

TL;DR
This paper proves the first efficient convergence guarantee for a primal-dual actor-critic algorithm with nonlinear function approximation in RL, and supports the analysis with experiments.
Contribution
It introduces a novel convergence analysis for primal-dual actor-critic with nonlinear function approximation that applies to a broad range of RL scenarios, including multi-agent settings.
Findings
Convergence rate of $\mathcal{O}\left(\sqrt{\ln(N d G^2)/N}\right)$ under Markovian sampling.
Empirical results on OpenAI Gym tasks support theoretical claims.
Applicable to various RL settings, including multi-agent RL.
Abstract
We study the convergence of the actor-critic algorithm with nonlinear function approximation under a nonconvex-nonconcave primal-dual formulation. Stochastic gradient descent ascent is applied with an adaptive proximal term for robust learning rates. We show the first efficient convergence result for primal-dual actor-critic, with a convergence rate of $\mathcal{O}\left(\sqrt{\ln(N d G^2)/N}\right)$ under Markovian sampling, where $G$ is the element-wise maximum of the gradient, $N$ is the number of iterations, and $d$ is the dimension of the gradient. Our result requires only the Polyak-Łojasiewicz condition for the dual variables, which is easy to verify and applicable to a wide range of reinforcement learning (RL) scenarios. The algorithm and analysis are general enough to be applied to other RL settings, like multi-agent RL. Empirical results on…
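To make the update rule described in the abstract more concrete, here is a minimal sketch of stochastic gradient descent ascent with an adaptive, per-coordinate step size. It is not the paper's algorithm: the AdaGrad-style scaling is only one possible reading of an "adaptive proximal term", and the toy objective f(x, y) = 0.5·||x||² + x·y − 0.5·||y||², the function names (`adaptive_sgda`, `toy_grad_x`, `toy_grad_y`), the learning rates, and the Gaussian noise model are all assumptions chosen for illustration. The toy objective is strongly concave in the dual variable y, so it satisfies the Polyak-Łojasiewicz condition in y. The sampling here is i.i.d. noise, not the Markovian sampling analyzed in the paper.

```python
import numpy as np


def toy_grad_x(x, y):
    # Gradient of f(x, y) = 0.5*||x||^2 + x.y - 0.5*||y||^2 with respect to x.
    return x + y


def toy_grad_y(x, y):
    # Gradient of the same toy objective with respect to y.
    return x - y


def adaptive_sgda(grad_x, grad_y, x0, y0, steps=5000,
                  lr_x=0.1, lr_y=0.2, noise=0.01, eps=1e-8, seed=0):
    """Descend in the primal variable x and ascend in the dual variable y.

    Each coordinate's step is scaled by the root of its accumulated squared
    gradients (AdaGrad-style). This scaling is an assumption made for this
    sketch, not the update analyzed in the paper.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    vx = np.zeros_like(x)  # accumulated squared primal gradients
    vy = np.zeros_like(y)  # accumulated squared dual gradients
    for _ in range(steps):
        # Noisy gradient estimates stand in for sampled trajectories.
        gx = grad_x(x, y) + noise * rng.standard_normal(x.shape)
        gy = grad_y(x, y) + noise * rng.standard_normal(y.shape)
        vx += gx ** 2
        vy += gy ** 2
        x -= lr_x * gx / (np.sqrt(vx) + eps)  # primal descent step
        y += lr_y * gy / (np.sqrt(vy) + eps)  # dual ascent step
    return x, y


if __name__ == "__main__":
    x, y = adaptive_sgda(toy_grad_x, toy_grad_y, [1.0, -1.0], [0.5, 0.5])
    print("primal:", x, "dual:", y)
```

On this toy problem the unique saddle point is the origin, so both returned vectors should end up close to zero. The per-coordinate scaling reduces sensitivity to the choice of `lr_x` and `lr_y`, which is the general motivation for adaptive step sizes.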
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Mathematical Biology Tumor Growth
