Quasi-Newton Iteration in Deterministic Policy Gradient

Arash Bahari Kordabad; Hossein Nejatbakhsh Esfahani; Wenqi Cai; Sebastien Gros

arXiv:2203.13854 · cs.LG · March 29, 2022

Open Access

TL;DR

This paper introduces a model-free Hessian approximation for deterministic policy gradients in reinforcement learning. The approximation allows superlinear convergence when the policy parametrization is rich, and the natural policy gradient arises as a particular case.

Contribution

It proposes a quasi-Newton method for deterministic policy optimization, built on a model-free approximation of the Hessian of the policy performance, that includes the natural policy gradient method as a particular case.

Findings

01 The approximate Hessian converges to the exact Hessian at the optimal policy.

02 The method allows superlinear convergence, provided that the policy parametrization is rich.

03 In a nonlinear example, the method shows improved convergence compared with the natural policy gradient.

Abstract

This paper presents a model-free approximation for the Hessian of the performance of deterministic policies to use in the context of Reinforcement Learning based on Quasi-Newton steps in the policy parameters. We show that the approximate Hessian converges to the exact Hessian at the optimal policy, and allows for a superlinear convergence in the learning, provided that the policy parametrization is rich. The natural policy gradient method can be interpreted as a particular case of the proposed method. We analytically verify the formulation in a simple linear case and compare the convergence of the proposed method with the natural policy gradient in a nonlinear example.
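To make the difference between a plain gradient step and a quasi-Newton step concrete, here is a minimal Python sketch on a toy problem. It is not the paper's estimator: the quadratic cost, the matrix `A`, the target `theta_star`, the step sizes, and the helper names `preconditioned_step` and `run` are all hypothetical, and the sketch assumes a cost-minimization convention with an exactly known Hessian. It only illustrates the generic update theta <- theta - alpha * M^{-1} * grad, where choosing M as the Hessian (or an approximation of it) gives a Newton-type step and choosing M as the identity gives plain gradient descent.

```python
import numpy as np

def preconditioned_step(theta, grad, M, alpha=1.0):
    """Return theta - alpha * M^{-1} grad (solves a linear system, no explicit inverse)."""
    return theta - alpha * np.linalg.solve(M, grad)

# Hypothetical toy cost: J(theta) = 0.5 * (theta - theta_star)^T A (theta - theta_star).
# A is symmetric positive definite and ill-conditioned, so plain gradient steps are slow.
A = np.array([[10.0, 2.0],
              [2.0, 1.0]])
theta_star = np.array([1.0, -2.0])

def grad_J(theta):
    """Exact gradient of the toy cost; its Hessian is the constant matrix A."""
    return A @ (theta - theta_star)

def run(M, alpha, tol=1e-8, max_iter=10_000):
    """Iterate from theta = 0 until within tol of theta_star; return (iterations, theta)."""
    theta = np.zeros(2)
    for k in range(max_iter):
        if np.linalg.norm(theta - theta_star) < tol:
            return k, theta
        theta = preconditioned_step(theta, grad_J(theta), M, alpha)
    return max_iter, theta

# Newton-type step: precondition with the (here exactly known) Hessian.
newton_iters, theta_newton = run(M=A, alpha=1.0)

# Plain gradient descent: precondition with the identity and a small step size.
gd_iters, theta_gd = run(M=np.eye(2), alpha=0.05)

print("Newton-type iterations:", newton_iters)
print("Gradient-descent iterations:", gd_iters)
```

On this quadratic the Hessian-preconditioned update reaches the minimizer in a single step, while the identity-preconditioned update needs many more iterations. In a model-free reinforcement learning setting neither the gradient nor the Hessian is available in closed form, and both must be estimated from data, which is the setting the abstract addresses.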

Taxonomy

Topics: Reinforcement Learning in Robotics · Model Reduction and Neural Networks · Machine Learning and ELM