Understanding the Effects of Second-Order Approximations in Natural Policy Gradient Reinforcement Learning
Brennan Gebotys, Alexander Wong, David A. Clausi

TL;DR
This paper systematically compares five second-order approximations in natural policy gradient reinforcement learning, revealing that better approximations and hyperparameter tuning significantly enhance performance and sample efficiency.
Contribution
It provides a comprehensive analysis of second-order approximations in natural policy gradient methods, highlighting their impact on performance, stability, and efficiency.
Findings
Improved second-order approximations lead to better performance.
Proper hyperparameter tuning can increase sample efficiency by up to 181%.
Using the natural gradient to optimize the critic network enhances results.
Abstract
Natural policy gradient methods are popular reinforcement learning methods that improve the stability of policy gradient methods by using second-order approximations to precondition the gradient with the inverse of the Fisher information matrix. However, to the best of the authors' knowledge, no study has investigated the effects of different second-order approximations comprehensively and systematically. To address this, five different second-order approximations were studied and compared across multiple key metrics, including performance, stability, sample efficiency, and computation time. Furthermore, hyperparameters that are not typically acknowledged in the literature are studied, including the effect of different batch sizes and of optimizing the critic network with the natural gradient. Experimental results show that on average, improved second-order…
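To make the preconditioning step in the abstract concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a toy single-state softmax policy over three actions, made-up advantage values, and a hypothetical damping constant; the function names are illustrative only. It computes the natural gradient (F + λI)⁻¹g twice: once with an exact linear solve and once with conjugate gradient, which needs only Fisher-vector products. Neither variant is claimed to be one of the five approximations the paper compares.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over action logits.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def fisher_softmax(logits):
    # Exact Fisher information of a categorical policy w.r.t. its logits:
    # F = diag(p) - p p^T (singular along the all-ones direction).
    p = softmax(logits)
    return np.diag(p) - np.outer(p, p)

def conjugate_gradient(matvec, b, iters=50, tol=1e-20):
    # Solves A x = b for symmetric positive-definite A using only A @ v products.
    x = np.zeros_like(b)
    r = b.copy()          # residual b - A x, with x = 0
    d = r.copy()          # search direction
    rs = r @ r
    for _ in range(iters):
        if rs < tol:
            break
        Ad = matvec(d)
        alpha = rs / (d @ Ad)
        x += alpha * d
        r -= alpha * Ad
        rs_new = r @ r
        d = r + (rs_new / rs) * d
        rs = rs_new
    return x

# Toy setup (all values are assumptions for illustration).
logits = np.array([0.5, -1.0, 2.0])
advantages = np.array([1.0, 0.0, -0.5])
damping = 1e-2  # hypothetical damping term, keeps F + damping * I invertible

p = softmax(logits)
# Vanilla policy gradient of E_a[A(a)] w.r.t. the logits.
g = p * (advantages - p @ advantages)

F = fisher_softmax(logits)
# Natural gradient via an exact solve with the damped Fisher matrix.
nat_exact = np.linalg.solve(F + damping * np.eye(len(logits)), g)
# Same quantity via conjugate gradient, using only Fisher-vector products.
nat_cg = conjugate_gradient(lambda v: F @ v + damping * v, g)

print("vanilla gradient:", g)
print("natural gradient (exact):", nat_exact)
print("natural gradient (CG):", nat_cg)
```

The exact solve is only practical for tiny parameter vectors like this one; the conjugate-gradient route avoids forming or inverting F explicitly, which is the kind of trade-off between approximation quality and computation time that the abstract refers to.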
Taxonomy
Topics: Machine Learning and ELM · Fuel Cells and Related Materials · Advancements in Semiconductor Devices and Circuit Design
