Provably Efficient Convergence of Primal-Dual Actor-Critic with Nonlinear Function Approximation

Jing Dong; Li Shen; Yinggan Xu; Baoxiang Wang

arXiv:2202.13863·cs.LG·March 1, 2022·1 cites

TL;DR

This paper proves what the authors present as the first efficient convergence result for primal-dual actor-critic with nonlinear function approximation: a rate of $O\left(\frac{\ln(N d G^{2})}{N}\right)$ under Markovian sampling, accompanied by empirical results.

Contribution

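It gives a convergence analysis for primal-dual actor-critic with nonlinear function approximation under a nonconvex-nonconcave formulation, assuming only the Polyak-Łojasiewicz condition on the dual variables. The authors state that the algorithm and analysis carry over to other RL settings, such as multi-agent RL.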

Findings

01

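Convergence rate of $O\left(\frac{\ln(N d G^{2})}{N}\right)$ under Markovian sampling, where $N$ is the number of iterations, $d$ is the dimension of the gradient, and $G$ is the element-wise maximum of the gradient.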

02

Empirical results on OpenAI Gym tasks support theoretical claims.

03

Applicable to various RL settings, including multi-agent RL.
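To get a feel for how the stated rate behaves, the short sketch below evaluates the expression $\ln(N d G^{2})/N$ for a few iteration counts. It is only an illustration of the formula's shape: the constants hidden by the $O(\cdot)$ notation are omitted, the function name `rate_bound` and its parameter names are hypothetical, and the values `dim=1000` and `grad_max=10.0` are arbitrary choices, not figures from the paper.

```python
import math


def rate_bound(n_iters: int, dim: int, grad_max: float) -> float:
    """Evaluate ln(N * d * G^2) / N, ignoring the constants hidden by O(.).

    n_iters  -- N, the number of iterations
    dim      -- d, the dimension of the gradient
    grad_max -- G, the element-wise maximum of the gradient
    """
    return math.log(n_iters * dim * grad_max ** 2) / n_iters


if __name__ == "__main__":
    # Arbitrary illustrative values for d and G (assumptions, not from the paper).
    for n in (10**2, 10**4, 10**6):
        print(n, rate_bound(n, dim=1000, grad_max=10.0))
```

Because $d$ and $G$ enter only through the logarithm, increasing either one loosens the expression slowly, while the $1/N$ factor dominates as the iteration count grows.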

Abstract

We study the convergence of the actor-critic algorithm with nonlinear function approximation under a nonconvex-nonconcave primal-dual formulation. Stochastic gradient descent ascent is applied with an adaptive proximal term for robust learning rates. We show the first efficient convergence result with primal-dual actor-critic with a convergence rate of $O\left(\frac{\ln(N d G^{2})}{N}\right)$ under Markovian sampling, where $G$ is the element-wise maximum of the gradient, $N$ is the number of iterations, and $d$ is the dimension of the gradient. Our result is presented with only the Polyak-Łojasiewicz condition for the dual variables, which is easy to verify and applicable to a wide range of reinforcement learning (RL) scenarios. The algorithm and analysis are general enough to be applied to other RL settings, like multi-agent RL. Empirical results on…
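The general idea of stochastic gradient descent ascent with adaptive step sizes can be sketched on a toy saddle-point problem. The code below is not the paper's algorithm: it swaps in an AdaGrad-style per-coordinate step size as a simple stand-in for the adaptive proximal term, uses i.i.d. Gaussian gradient noise rather than Markovian sampling, and replaces the actor-critic objective with the hypothetical function $f(x, y) = \tfrac{1}{2}\|x\|^2 + x^\top y - \tfrac{1}{2}\|y\|^2$. That objective is strongly concave in the dual variable $y$, which implies the Polyak-Łojasiewicz condition in $y$. All function names and hyperparameters are assumptions made for illustration.

```python
import numpy as np


def sgda_adagrad(grad_x, grad_y, x0, y0, n_iters=2000, lr=0.5,
                 eps=1e-8, noise=0.01, seed=0):
    """Simultaneous stochastic gradient descent (x) / ascent (y).

    Step sizes are AdaGrad-style: lr / sqrt(sum of squared past gradients),
    per coordinate. This is an illustrative stand-in, not the paper's update.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    y = np.array(y0, dtype=float)
    sum_sq_x = np.zeros_like(x)  # accumulated squared primal gradients
    sum_sq_y = np.zeros_like(y)  # accumulated squared dual gradients
    for _ in range(n_iters):
        # Noisy gradients, both evaluated at the current (x, y).
        gx = grad_x(x, y) + noise * rng.standard_normal(x.shape)
        gy = grad_y(x, y) + noise * rng.standard_normal(y.shape)
        sum_sq_x += gx ** 2
        sum_sq_y += gy ** 2
        x = x - lr * gx / (np.sqrt(sum_sq_x) + eps)  # descent on primal
        y = y + lr * gy / (np.sqrt(sum_sq_y) + eps)  # ascent on dual
    return x, y


# Toy objective f(x, y) = 0.5*||x||^2 + x.y - 0.5*||y||^2, saddle point at 0.
def toy_grad_x(x, y):
    return x + y


def toy_grad_y(x, y):
    return x - y


if __name__ == "__main__":
    x_final, y_final = sgda_adagrad(toy_grad_x, toy_grad_y,
                                    x0=[1.0, -1.0], y0=[0.5, 0.5])
    print("x:", x_final, "y:", y_final)
```

On this toy problem the iterates should settle near the saddle point at the origin, up to a small residual driven by the gradient noise. The accumulated squared gradients shrink the step automatically, which is the sense in which an adaptive scheme reduces sensitivity to the base learning rate `lr`.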

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Mathematical Biology Tumor Growth