Certifying Stability of Reinforcement Learning Policies using Generalized Lyapunov Functions
Kehan Long, Jorge Cortés, Nikolay Atanasov

TL;DR
This paper introduces a method to certify the stability of reinforcement learning policies by learning generalized Lyapunov functions, relaxing classical conditions and enabling stability guarantees for complex systems.
Contribution
It proposes a novel approach to learn generalized Lyapunov functions from RL value functions, extending stability certification to nonlinear systems with learned policies.
Findings
Successfully certifies stability of RL policies on benchmark tasks
Extends Lyapunov methods with neural residuals for broader applicability
Achieves larger regions of attraction compared to classical Lyapunov approaches
Abstract
Establishing stability certificates for closed-loop systems under reinforcement learning (RL) policies is essential to move beyond empirical performance and offer guarantees of system behavior. Classical Lyapunov methods require a strict stepwise decrease in the Lyapunov function but such certificates are difficult to construct for learned policies. The RL value function is a natural candidate but it is not well understood how it can be adapted for this purpose. To gain intuition, we first study the linear quadratic regulator (LQR) problem and make two key observations. First, a Lyapunov function can be obtained from the value function of an LQR policy by augmenting it with a residual term related to the system dynamics and stage cost. Second, the classical Lyapunov decrease requirement can be relaxed to a generalized Lyapunov condition requiring only decrease on average over multiple…
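The contrast between the classical stepwise decrease and a decrease on average can be illustrated on a small LQR example. The sketch below is not the paper's method. The double-integrator system, the cost weights, the helper names (`lqr`, `one_step_margin`, `averaged_margin`), and the choice of a uniform average over M steps are all assumptions made for illustration, since the abstract is truncated before it states the exact condition. It checks that the undiscounted LQR value function V(x) = xᵀPx decreases at every step, while the candidate V(x) = ‖x‖² fails the one-step test for the same closed loop yet passes an M-step averaged test.

```python
import numpy as np

# Hypothetical discrete-time double integrator (illustrative, not from the paper).
dt = 0.1
A = np.array([[1.0, dt], [0.0, 1.0]])
B = np.array([[0.0], [dt]])
Q = np.eye(2)              # stage cost: x^T Q x + u^T R u
R = np.array([[1.0]])


def lqr(A, B, Q, R, iters=5000, tol=1e-12):
    """Solve the discrete-time Riccati equation by fixed-point iteration."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        P_next = 0.5 * (P_next + P_next.T)   # keep P symmetric
        converged = np.max(np.abs(P_next - P)) < tol
        P = P_next
        if converged:
            break
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K


def max_eig(S):
    """Largest eigenvalue of the symmetric part of S."""
    return float(np.max(np.linalg.eigvalsh(0.5 * (S + S.T))))


def one_step_margin(A_cl, W):
    """Negative iff V(x) = x^T W x strictly decreases at every step."""
    return max_eig(A_cl.T @ W @ A_cl - W)


def averaged_margin(A_cl, W, M):
    """Negative iff the uniform average of V over the next M steps is below V(x).

    Uniform weights are an assumption made for this sketch.
    """
    acc = np.zeros_like(W)
    Ai = np.eye(A_cl.shape[0])
    for _ in range(M):
        Ai = A_cl @ Ai
        acc += Ai.T @ W @ Ai
    return max_eig(acc / M - W)


P, K = lqr(A, B, Q, R)
A_cl = A - B @ K                       # closed loop under u = -K x
rho = float(np.max(np.abs(np.linalg.eigvals(A_cl))))

m_value = one_step_margin(A_cl, P)          # candidate V(x) = x^T P x
m_norm = one_step_margin(A_cl, np.eye(2))   # candidate V(x) = ||x||^2
# Smallest horizon M (searched up to 200) for which the averaged test passes.
M_star = next(
    (M for M in range(1, 201) if averaged_margin(A_cl, np.eye(2), M) < 0),
    None,
)

print("spectral radius of closed loop:", rho)
print("one-step margin, x^T P x:", m_value)
print("one-step margin, ||x||^2:", m_norm)
print("smallest averaging horizon for ||x||^2:", M_star)
```

Because the closed loop is linear and both candidates are quadratic, each condition reduces to a matrix inequality that an eigenvalue test can decide. For nonlinear dynamics or neural candidates, no such closed form is available in general, so a check of this kind would have to be replaced by sampling or formal verification.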
Taxonomy
Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Model Reduction and Neural Networks
