A homotopic approach to policy gradients for linear quadratic regulators with nonlinear controls

Craig Xu Chen; Andrea Agazzi

arXiv:2112.07612 · math.OC · December 15, 2021 · CDC

TL;DR

This paper introduces a homotopic policy gradient method that gradually increases the discount factor between policy gradient iterations, so that the iterates converge to the global optimum of the undiscounted LQR problem over a class of nonlinear policies.

Contribution

The paper proposes a homotopic variant of policy gradient that avoids the local minima arising when the LQR policy class is extended beyond linear policies.

Findings

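01 The homotopic policy gradient method converges to the global optimum of the undiscounted LQR problem for nonlinear policies.

02 A counterexample shows that extending the policy class to piecewise linear functions results in local minima of the policy gradient algorithm.

03 The method applies to a large class of Lipschitz, nonlinear policies.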

Abstract

We study the convergence of deterministic policy gradient algorithms in continuous state and action space for the prototypical Linear Quadratic Regulator (LQR) problem when the search space is not limited to the family of linear policies. We first provide a counterexample showing that extending the policy class to piecewise linear functions results in local minima of the policy gradient algorithm. To solve this problem, we develop a new approach that involves sequentially increasing a discount factor between iterations of the original policy gradient algorithm. We finally prove that this homotopic variant of policy gradient methods converges to the global optimum of the undiscounted Linear Quadratic Regulator problem for a large class of Lipschitz, non-linear policies.
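The discount-factor continuation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or experimental setup: it assumes a scalar system x_{t+1} = a x_t + b u_t with x_0 = 1, a linear policy u = -k x (so the cost has a closed form), an unstable open loop (a = 1.5), a hand-picked discount schedule, and a backtracking line search. All function names and parameter values are hypothetical. The point it shows is that the undiscounted cost is infinite at the initial policy k = 0, whereas a small discount factor makes the cost finite, and warm-starting each stage from the previous one carries the policy to the undiscounted optimum.

```python
import math


def discounted_cost(k, gamma, a=1.5, b=1.0, q=1.0, r=1.0):
    """Cost sum_t gamma^t (q x_t^2 + r u_t^2) for u = -k x and x_0 = 1.

    With closed loop c = a - b k the sum is geometric and equals
    (q + r k^2) / (1 - gamma c^2) when gamma c^2 < 1, else infinity.
    """
    c = a - b * k
    denom = 1.0 - gamma * c * c
    if denom <= 0.0:
        return math.inf
    return (q + r * k * k) / denom


def cost_gradient(k, gamma, a=1.5, b=1.0, q=1.0, r=1.0):
    """Derivative of discounted_cost with respect to k (finite-cost region)."""
    c = a - b * k
    denom = 1.0 - gamma * c * c
    numer = q + r * k * k
    # Quotient rule: d(numer)/dk = 2 r k, d(denom)/dk = 2 gamma b c.
    return (2.0 * r * k * denom - numer * 2.0 * gamma * b * c) / denom ** 2


def gradient_descent(k, gamma, steps=200, lr=0.5):
    """Gradient descent on k at a fixed discount factor.

    The backtracking (Armijo) line search is an assumption of this sketch;
    it rejects steps that leave the finite-cost region.
    """
    for _ in range(steps):
        g = cost_gradient(k, gamma)
        current = discounted_cost(k, gamma)
        step = lr
        while step > 1e-12 and (
            discounted_cost(k - step * g, gamma) > current - 0.5 * step * g * g
        ):
            step *= 0.5
        if step <= 1e-12:
            break  # no acceptable step: numerically stationary
        k -= step * g
    return k


def homotopic_policy_gradient(gammas=(0.4, 0.55, 0.7, 0.85, 1.0), k0=0.0):
    """Run gradient descent for each discount factor, warm-starting each stage."""
    k = k0
    for gamma in gammas:
        if not math.isfinite(discounted_cost(k, gamma)):
            raise ValueError("discount factor increased too quickly")
        k = gradient_descent(k, gamma)
    return k


k_final = homotopic_policy_gradient()
```

Starting plain gradient descent at k = 0 with gamma = 1 is not possible here because the cost is infinite; the first stage (gamma = 0.4) is what produces a stabilizing gain. The schedule above is illustrative only and says nothing about how fast the discount factor may be increased for the nonlinear policy classes treated in the paper.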

Taxonomy

Topics: Mathematical Biology Tumor Growth