Understanding the Effects of Second-Order Approximations in Natural Policy Gradient Reinforcement Learning

Brennan Gebotys; Alexander Wong; David A. Clausi

arXiv:2201.09104·cs.LG·October 12, 2022

TL;DR

This paper systematically compares five second-order approximations in natural policy gradient reinforcement learning, revealing that better approximations and hyperparameter tuning significantly enhance performance and sample efficiency.

Contribution

It provides a comprehensive analysis of second-order approximations in natural policy gradient methods, highlighting their impact on performance, stability, and efficiency.

Findings

1. Improved second-order approximations lead to better performance.

2. Proper hyperparameter tuning can increase sample efficiency by up to 181%.

3. Using the natural gradient to optimize the critic network enhances results.

Abstract

Natural policy gradient methods are popular reinforcement learning methods that improve the stability of policy gradient methods by utilizing second-order approximations to precondition the gradient with the inverse of the Fisher-information matrix. However, to the best of the authors' knowledge, there has not been a study that has investigated the effects of different second-order approximations in a comprehensive and systematic manner. To address this, five different second-order approximations were studied and compared across multiple key metrics including performance, stability, sample efficiency, and computation time. Furthermore, hyperparameters which aren't typically acknowledged in the literature are studied including the effect of different batch sizes and optimizing the critic network with the natural gradient. Experimental results show that on average, improved second-order…
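To make the preconditioning step in the abstract concrete, here is a minimal sketch of a natural gradient computed with an empirical Fisher matrix and a conjugate-gradient solve. It is an illustration under stated assumptions, not the authors' implementation: the function names (`conjugate_gradient`, `natural_gradient`), the damping term, and the choice of the empirical Fisher are all hypothetical, and this is not claimed to be one of the five approximations the paper studies.

```python
import numpy as np

def conjugate_gradient(matvec, b, iters=50, tol=1e-10):
    """Approximately solve A x = b using only products A @ v (A symmetric positive definite)."""
    x = np.zeros_like(b)
    r = b.copy()   # residual b - A x, with x starting at zero
    p = r.copy()   # current search direction
    rs = r @ r
    for _ in range(iters):
        if rs < tol:
            break
        Ap = matvec(p)
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x

def natural_gradient(score_vectors, advantages, damping=1e-2):
    """Return (natural gradient, vanilla gradient).

    score_vectors: (N, D) array, row i is grad_theta log pi(a_i | s_i).
    advantages:    (N,) array of advantage estimates.
    damping:       assumed Tikhonov term that keeps the system well conditioned.
    """
    n = score_vectors.shape[0]
    g = score_vectors.T @ advantages / n  # vanilla policy gradient estimate

    def fisher_vec(v):
        # Empirical Fisher F = (1/N) S^T S, applied to v without forming F.
        return score_vectors.T @ (score_vectors @ v) / n + damping * v

    return conjugate_gradient(fisher_vec, g), g

# Usage with synthetic data (illustrative only).
rng = np.random.default_rng(0)
scores = rng.normal(size=(200, 5))
adv = rng.normal(size=200)
nat_grad, grad = natural_gradient(scores, adv)
```

Using Fisher-vector products instead of an explicit inverse keeps the cost linear in the number of parameters per conjugate-gradient iteration, which is one common reason such solves are preferred when the policy is a neural network.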

Code & Models

Repositories

gebob19/natural-policy-gradient-reinforcement-learning (PyTorch, official)

Taxonomy

Topics: Machine Learning and ELM · Fuel Cells and Related Materials · Advancements in Semiconductor Devices and Circuit Design