Stochastic Reinforcement Learning with Stability Guarantees for Control of Unknown Nonlinear Systems
Thanin Quartz, Ruikun Zhou, Hans De Sterck, Jun Liu

TL;DR
This paper introduces a reinforcement learning algorithm that stabilizes unknown nonlinear systems by learning a local linear model of the dynamics and integrating the resulting gain matrix into the neural policy. It outperforms SAC and PPO on stabilization tasks and comes with theoretical stability guarantees.
Contribution
The paper presents a novel RL algorithm that guarantees stability for nonlinear systems by learning local linear models and incorporating them into control policies, with proven convergence and stability.
Findings
Outperforms SAC and PPO in stabilizing high-dimensional systems
Provides theoretical analysis of convergence and stability guarantees
Verifies asymptotic stability of learned control policies
Abstract
Designing a stabilizing controller for nonlinear systems is a challenging task, especially for high-dimensional problems with unknown dynamics. Traditional reinforcement learning algorithms applied to stabilization tasks tend to drive the system close to the equilibrium point. However, these approaches often fall short of achieving true stabilization and result in persistent oscillations around the equilibrium point. In this work, we propose a reinforcement learning algorithm that stabilizes the system by learning a local linear representation of the dynamics. The main component of the algorithm is integrating the learned gain matrix directly into the neural policy. We demonstrate the effectiveness of our algorithm on several challenging high-dimensional dynamical systems. In these simulations, our algorithm outperforms popular reinforcement learning algorithms, such as soft actor-critic…
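The idea in the abstract (fit a local linear model, derive a gain matrix, embed it in the policy) can be illustrated with a minimal sketch. This is not the paper's algorithm: the discrete-time setting, the least-squares fit, the LQR gain, the sign convention u = -Kx, the toy pendulum-like system in `step`, and the hand-written `residual` (standing in for a neural network) are all assumptions made for illustration. The residual is chosen to be quadratic so that it does not alter the linearization at the origin.

```python
import numpy as np

def fit_local_linear(X, U, X_next):
    """Least-squares fit of x_next ~ A x + B u from samples near the origin."""
    Z = np.hstack([X, U])                               # shape (N, n + m)
    Theta, *_ = np.linalg.lstsq(Z, X_next, rcond=None)  # shape (n + m, n)
    n = X.shape[1]
    return Theta[:n].T, Theta[n:].T                     # A is (n, n), B is (n, m)

def dlqr_gain(A, B, Q, R, iters=500):
    """Discrete-time LQR gain via fixed-point iteration on the Riccati equation."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

def make_policy(K, residual):
    """Policy u = -K x + residual(x) - residual(0), which gives u = 0 at x = 0."""
    r0 = residual(np.zeros(K.shape[1]))
    def policy(x):
        return -K @ x + residual(x) - r0
    return policy

def step(x, u, dt=0.05):
    """Hypothetical unknown system: an Euler-discretized inverted pendulum."""
    theta, omega = x
    return np.array([theta + dt * omega, omega + dt * (np.sin(theta) + u[0])])

# Collect transitions close to the equilibrium, where a linear model is accurate.
rng = np.random.default_rng(0)
X = 0.01 * rng.standard_normal((200, 2))
U = 0.01 * rng.standard_normal((200, 1))
X_next = np.array([step(x, u) for x, u in zip(X, U)])

A_hat, B_hat = fit_local_linear(X, U, X_next)
K = dlqr_gain(A_hat, B_hat, Q=np.eye(2), R=np.eye(1))

# Toy stand-in for a neural network term (quadratic, so zero gradient at 0).
residual = lambda x: np.array([0.5 * x[0] ** 2])
policy = make_policy(K, residual)

# Roll the closed loop out from a small initial angle.
x_final = np.array([0.1, 0.0])
for _ in range(400):
    x_final = step(x_final, policy(x_final))

# Spectral radius of the fitted closed-loop matrix; below 1 means locally stable.
rho = np.max(np.abs(np.linalg.eigvals(A_hat - B_hat @ K)))
```

Subtracting `residual(0)` keeps the origin an equilibrium of the closed loop. A general learned residual with a nonzero gradient at the origin would shift the closed-loop linearization away from A - BK, so a real implementation would need to account for that.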
Taxonomy
Topics: Adaptive Dynamic Programming Control
