Imitation Learning of Stabilizing Policies for Nonlinear Systems

Sebastian East

arXiv:2109.10854 · math.OC · September 23, 2021

TL;DR

This paper extends stabilizing imitation learning methods from linear to polynomial systems using sum of squares techniques, proposing algorithms and demonstrating their effectiveness through experiments.

Contribution

It introduces a novel extension of stabilizing imitation learning to polynomial systems via sum of squares methods, with practical algorithms and numerical validation.

Findings

01. The proposed projected gradient descent and ADMM algorithms stabilize polynomial systems

02. Sum of squares techniques extend stabilizing imitation learning from linear to polynomial systems and controllers

03. Numerical experiments illustrate the performance of both algorithms

Abstract

There has been a recent interest in imitation learning methods that are guaranteed to produce a stabilizing control law with respect to a known system. Work in this area has generally considered linear systems and controllers, for which stabilizing imitation learning takes the form of a biconvex optimization problem. In this paper it is demonstrated that the same methods developed for linear systems and controllers can be readily extended to polynomial systems and controllers using sum of squares techniques. A projected gradient descent algorithm and an alternating direction method of multipliers algorithm are proposed as heuristics for solving the stabilizing imitation learning problem, and their performance is illustrated through numerical experiments.
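To make the idea of projected gradient descent for stabilizing imitation learning concrete, here is a minimal toy sketch. It is not the paper's method: it assumes a hypothetical scalar, discrete-time linear system `x_next = a*x + b*u` with policy `u = k*x`, for which the stabilizing set `|a + b*k| < 1` is an interval and the projection reduces to clipping. The polynomial setting described in the abstract would instead rely on sum of squares programs, which this sketch does not attempt. All names, the safety `margin`, and the demonstration data are illustrative assumptions.

```python
# Toy scalar illustration (an assumption, not the paper's formulation):
# system x_next = a*x + b*u, policy u = k*x, stable iff |a + b*k| < 1.

def project_to_stabilizing(k, a, b, margin=0.05):
    """Clip k onto the interval where |a + b*k| <= 1 - margin (requires b != 0)."""
    bound = 1.0 - margin
    end_one = (-bound - a) / b
    end_two = (bound - a) / b
    lo, hi = min(end_one, end_two), max(end_one, end_two)
    return min(max(k, lo), hi)


def imitation_loss_grad(k, xs, us):
    """Gradient in k of the mean squared error between k*x and expert actions u."""
    return sum(2.0 * (k * x - u) * x for x, u in zip(xs, us)) / len(xs)


def fit_stabilizing_gain(xs, us, a, b, step=0.1, iters=200):
    """Projected gradient descent: gradient step on the loss, then projection."""
    k = project_to_stabilizing(0.0, a, b)
    for _ in range(iters):
        k = k - step * imitation_loss_grad(k, xs, us)
        k = project_to_stabilizing(k, a, b)
    return k


# Hypothetical data: an unstable plant and demonstrations whose plain
# least-squares fit (k = -0.2) would leave the closed loop unstable.
a, b = 1.5, 1.0
xs = [-1.0, -0.5, 0.5, 1.0]
us = [-0.2 * x for x in xs]

k = fit_stabilizing_gain(xs, us, a, b)
closed_loop = a + b * k
print(k, closed_loop)
```

In this toy case the learned gain sits on the boundary of the (margin-shrunk) stabilizing interval: it is the stabilizing gain closest to the unconstrained imitation fit. In the matrix and polynomial cases the stabilizing set has no such closed form, which is why the abstract describes the proposed algorithms as heuristics.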

Code & Models

Repositories

sebastian-east/sos-imitation-learning (JAX, official)

Taxonomy

Topics: Zebrafish Biomedical Research Applications · Iterative Learning Control Systems · Adaptive Dynamic Programming Control