Learning Over Contracting and Lipschitz Closed-Loops for Partially-Observed Nonlinear Systems (Extended Version)
Nicholas H. Barbara, Ruigang Wang, Ian R. Manchester

TL;DR
This paper introduces a policy parameterization for nonlinear, partially-observed systems that guarantees closed-loop stability (contraction) and robustness (Lipschitz bounds) by construction, enabling safe learning-based control with no additional constraints or projections.
Contribution
It develops a Youla-REN parameterization that inherently satisfies stability and Lipschitz robustness, advancing control methods for nonlinear, partially-observed systems.
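For reference, the two closed-loop properties named here are commonly formalized as below. This is a sketch in assumed notation (state x_t, input u_t, output y_t, constants α, β, γ, and the truncated ℓ2 norm ‖·‖_T), none of which is taken from this summary; the paper's own definitions may differ in detail.

```latex
% Contraction (stability): two trajectories a and b driven by the same input
% forget their initial conditions exponentially fast.
\[
  \| x_t^a - x_t^b \| \le \alpha \, \beta^{t} \, \| x_0^a - x_0^b \|,
  \qquad \alpha > 0, \quad \beta \in [0, 1).
\]

% Lipschitz bound (robustness): starting from the same initial state,
% the output gap is at most gamma times the input gap over every horizon T.
\[
  \| y^a - y^b \|_T \le \gamma \, \| u^a - u^b \|_T
  \qquad \text{for all } T,
  \qquad \| z \|_T := \sqrt{\textstyle\sum_{t=0}^{T} \| z_t \|^2 }.
\]
```

A smaller γ means the closed loop is less sensitive to disturbances, which is the sense in which the robustness level is described as user-tunable.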
Findings
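Performs similarly to existing learning-based and optimal control methods in simulation on two reinforcement learning tasks (magnetic suspension and rotary-arm pendulum inversion)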
Ensures stability and robustness without additional constraints
Shows improved robustness to adversarial disturbances
Abstract
This paper presents a policy parameterization for learning-based control on nonlinear, partially-observed dynamical systems. The parameterization is based on a nonlinear version of the Youla parameterization and the recently proposed Recurrent Equilibrium Network (REN) class of models. We prove that the resulting Youla-REN parameterization automatically satisfies stability (contraction) and user-tunable robustness (Lipschitz) conditions on the closed-loop system. This means it can be used for safe learning-based control with no additional constraints or projections required to enforce stability or robustness. We test the new policy class in simulation on two reinforcement learning tasks: 1) magnetic suspension, and 2) inverting a rotary-arm pendulum. We find that the Youla-REN performs similarly to existing learning-based and optimal control methods while also ensuring stability and…
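To make the terms "contraction" and "Lipschitz" concrete, the sketch below checks both numerically on a toy stable linear system. It is not the Youla-REN and not the paper's code: the matrices `A`, `B`, `C`, the helper `simulate`, and the conservative bound `gamma_bound` are all hypothetical choices made for illustration only.

```python
import numpy as np

# Toy stable linear system (hypothetical values, not from the paper):
#   x[t+1] = A x[t] + B u[t],   y[t] = C x[t]
A = np.array([[0.5, 0.1], [0.0, 0.4]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])


def simulate(x0, u):
    """Roll the system forward; return arrays of states and outputs."""
    x = np.asarray(x0, dtype=float)
    xs, ys = [], []
    for u_t in u:
        xs.append(x)
        ys.append(C @ x)
        x = A @ x + B @ u_t
    return np.array(xs), np.array(ys)


rng = np.random.default_rng(0)
T = 50
u_a = rng.normal(size=(T, 1))

# Contraction: same input, different initial states -> the state gap decays.
xs_a, _ = simulate([1.0, -1.0], u_a)
xs_b, _ = simulate([-2.0, 3.0], u_a)
state_gap = np.linalg.norm(xs_a - xs_b, axis=1)

# Lipschitz: same initial state, different inputs -> the output gap is
# bounded by a constant times the input gap.
u_b = u_a + 0.1 * rng.normal(size=(T, 1))
_, ys_a = simulate([0.0, 0.0], u_a)
_, ys_b = simulate([0.0, 0.0], u_b)
empirical_gain = np.linalg.norm(ys_a - ys_b) / np.linalg.norm(u_a - u_b)

# Conservative Lipschitz bound, valid here because the spectral norm of A
# is below 1: gamma <= ||C|| * ||B|| / (1 - ||A||).
gamma_bound = (
    np.linalg.norm(C, 2) * np.linalg.norm(B, 2) / (1.0 - np.linalg.norm(A, 2))
)

print("state gap at t=0 and t=T-1:", state_gap[0], state_gap[-1])
print("empirical gain:", empirical_gain, "bound:", gamma_bound)
```

For a linear system both properties follow from the spectral norm of `A` being below one. The point of the parameterization described in the abstract is that analogous guarantees hold for a nonlinear learned policy in closed loop, without having to check or enforce them during training.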
Taxonomy
Topics: Reinforcement Learning in Robotics · Fuel Cells and Related Materials · Adaptive Dynamic Programming Control
