Reinforcement Learning for Control with Probabilistic Stability Guarantee: A Finite-Sample Approach
Minghao Han, Lixian Zhang, Chenliang Liu, Zhipeng Zhou, Jun Wang, Wei Pan

TL;DR
This paper introduces a finite-sample reinforcement learning method with probabilistic stability guarantees for control systems, combining Lyapunov-based theory with a new RL algorithm that ensures stability in a model-free setting.
Contribution
The paper develops a probabilistic stability theorem that relies only on finite data and proposes L-REINFORCE, an RL algorithm tailored to stabilizing control, bridging RL and control theory.
Findings
L-REINFORCE outperforms the baseline in ensuring stability on the Cartpole task
The probability of stability increases with the number and length of sampled trajectories
Finite-sample stability guarantees are established
Abstract
This paper presents a novel approach to reinforcement learning (RL) for control systems that provides probabilistic stability guarantees using finite data. Leveraging Lyapunov's method, we propose a probabilistic stability theorem that ensures mean square stability using only a finite number of sampled trajectories. The probability of stability increases with the number and length of trajectories, converging to certainty as data size grows. Additionally, we derive a policy gradient theorem for stabilizing policy learning and develop an RL algorithm, L-REINFORCE, that extends the classical REINFORCE algorithm to stabilization problems. The effectiveness of L-REINFORCE is demonstrated through simulations on a Cartpole task, where it outperforms the baseline in ensuring stability. This work bridges a critical gap between RL and control theory, enabling stability analysis and controller…
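To make the idea of checking a Lyapunov-style condition from a finite set of sampled trajectories more concrete, the following is a minimal sketch on a toy problem. It is not the paper's theorem or the L-REINFORCE algorithm. The scalar system `x_{t+1} = (gain + noise_std * w_t) * x_t`, the Lyapunov candidate `L(x) = x^2`, the cost `c(x) = x^2`, the constant `alpha`, and all function names are assumptions chosen purely for illustration.

```python
import random


def rollout(gain, noise_std, horizon, rng):
    """Simulate the toy system x_{t+1} = (gain + noise_std * w_t) * x_t, w_t ~ N(0, 1)."""
    x = rng.uniform(-1.0, 1.0)  # random initial state (illustrative choice)
    states = [x]
    for _ in range(horizon):
        x = (gain + noise_std * rng.gauss(0.0, 1.0)) * x
        states.append(x)
    return states


def lyapunov_candidate(x):
    """Hypothetical Lyapunov candidate L(x) = x^2."""
    return x * x


def empirical_decrease(trajectories, alpha):
    """Sample mean of L(x_{t+1}) - L(x_t) + alpha * c(x_t), with assumed cost c(x) = x^2.

    A negative value means the sampled data is consistent with an
    average decrease of the candidate along the trajectories.
    """
    total, count = 0.0, 0
    for states in trajectories:
        for x, x_next in zip(states[:-1], states[1:]):
            total += lyapunov_candidate(x_next) - lyapunov_candidate(x) + alpha * x * x
            count += 1
    return total / count


rng = random.Random(0)  # fixed seed for reproducibility
# 200 trajectories of length 20 for a contracting and an expanding gain.
stable = [rollout(0.8, 0.1, 20, rng) for _ in range(200)]
unstable = [rollout(1.2, 0.1, 20, rng) for _ in range(200)]

stable_margin = empirical_decrease(stable, alpha=0.1)
unstable_margin = empirical_decrease(unstable, alpha=0.1)
print("contracting gain:", stable_margin)
print("expanding gain:", unstable_margin)
```

In this toy setting the sample mean is negative for the contracting gain and positive for the expanding one. How a finite number and length of trajectories translate into a probability of stability is the subject of the paper's theorem, which this sketch does not reproduce.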
Taxonomy
Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Stability and Control of Uncertain Systems
