Finite-Time Analysis of Entropy-Regularized Neural Natural Actor-Critic Algorithm

Semih Cayci; Niao He; R. Srikant

arXiv:2206.00833 · cs.LG · June 3, 2022 · 1 citation

Open Access

TL;DR

This paper provides a finite-time theoretical analysis of neural natural actor-critic algorithms, highlighting the roles of entropy regularization, averaging, and neural network approximation in ensuring stability, exploration, and optimality in large state space MDPs.

Contribution

It offers the first finite-time analysis of neural NAC, revealing how regularization and optimization techniques contribute to sample efficiency and policy optimality.

Findings

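1. Entropy regularization and averaging ensure stability by providing enough exploration to avoid near-deterministic, strictly suboptimal policies.

2. Regularization yields sharp bounds on sample complexity and network width in the regularized MDPs.

3. The uniform approximation power of neural networks is crucial for global optimality.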

Abstract

Natural actor-critic (NAC) and its variants, equipped with the representation power of neural networks, have demonstrated impressive empirical success in solving Markov decision problems with large state spaces. In this paper, we present a finite-time analysis of NAC with neural network approximation, and identify the roles of neural networks, regularization and optimization techniques (e.g., gradient clipping and averaging) to achieve provably good performance in terms of sample complexity, iteration complexity and overparametrization bounds for the actor and the critic. In particular, we prove that (i) entropy regularization and averaging ensure stability by providing sufficient exploration to avoid near-deterministic and strictly suboptimal policies and (ii) regularization leads to sharp sample complexity and network width bounds in the regularized MDPs, yielding a favorable…
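
To make the three techniques named in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's algorithm: the helper names (`regularized_reward`, `clip_by_norm`, `running_average`), the norm-based form of clipping, and the uniform average over iterates are all illustrative assumptions, and the regularization weight `lam` is a hypothetical parameter.

```python
import math


def softmax(logits):
    """Softmax policy over a finite action set (numerically stabilized)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def entropy(probs):
    """Shannon entropy of a policy; larger values mean more exploration."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_reward(reward, prob_of_action, lam):
    """Entropy-regularized reward r - lam * log pi(a|s).

    Assumption: this is the standard form of entropy regularization;
    `lam` is a hypothetical regularization weight.
    """
    return reward - lam * math.log(prob_of_action)


def clip_by_norm(grad, max_norm):
    """Rescale a gradient so its Euclidean norm is at most max_norm.

    Assumption: norm-based clipping; the paper's exact rule may differ.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm <= max_norm or norm == 0.0:
        return list(grad)
    scale = max_norm / norm
    return [g * scale for g in grad]


def running_average(iterates):
    """Uniform average of a list of equally sized parameter vectors."""
    n = len(iterates)
    dim = len(iterates[0])
    return [sum(v[i] for v in iterates) / n for i in range(dim)]


policy = softmax([0.0, 0.0])               # uniform policy over two actions
bonus = regularized_reward(1.0, policy[0], lam=0.1)
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
averaged = running_average([[0.0, 2.0], [2.0, 4.0]])
```

The regularized reward exceeds the raw reward whenever the chosen action has probability below one, which is how the entropy term discourages near-deterministic policies in this sketch.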

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Reinforcement Learning in Robotics

Methods: Entropy Regularization · Gradient Clipping