Reinforcement Learning for Control with Probabilistic Stability Guarantee: A Finite-Sample Approach

Minghao Han; Lixian Zhang; Chenliang Liu; Zhipeng Zhou; Jun Wang; Wei Pan

arXiv:2603.00043 · cs.LG · March 3, 2026

Open Access

TL;DR

This paper introduces a finite-sample reinforcement learning method with probabilistic stability guarantees for control systems, combining Lyapunov-based theory with a new RL algorithm that ensures stability in a model-free setting.

Contribution

The paper develops a probabilistic stability theorem that relies only on finite data and proposes L-REINFORCE, an RL algorithm tailored to stabilizing control, bridging RL and control theory.

Findings

01. L-REINFORCE outperforms the baseline in ensuring stability on the Cartpole task.

02. The probability of stability increases with the number and length of sampled trajectories.

03. Finite-sample stability guarantees are established.

Abstract

This paper presents a novel approach to reinforcement learning (RL) for control systems that provides probabilistic stability guarantees using finite data. Leveraging Lyapunov's method, we propose a probabilistic stability theorem that ensures mean square stability using only a finite number of sampled trajectories. The probability of stability increases with the number and length of trajectories, converging to certainty as data size grows. Additionally, we derive a policy gradient theorem for stabilizing policy learning and develop an RL algorithm, L-REINFORCE, that extends the classical REINFORCE algorithm to stabilization problems. The effectiveness of L-REINFORCE is demonstrated through simulations on a Cartpole task, where it outperforms the baseline in ensuring stability. This work bridges a critical gap between RL and control theory, enabling stability analysis and controller…
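For orientation, the classical REINFORCE estimator that the abstract says L-REINFORCE extends can be sketched as below. This is a minimal illustration of plain REINFORCE only, not the paper's L-REINFORCE and not its Lyapunov-based stability condition. The scalar system s' = 1.1 s + a, the linear-Gaussian policy, the quadratic cost, and every function name are assumptions made for this example.

```python
import numpy as np


def returns_to_go(costs, gamma=1.0):
    """Discounted cost-to-go G_t = sum_{k >= t} gamma^(k - t) * c_k."""
    out = np.zeros(len(costs))
    running = 0.0
    for t in reversed(range(len(costs))):
        running = costs[t] + gamma * running
        out[t] = running
    return out


def rollout(theta, sigma, horizon, rng):
    """Sample one trajectory of a hypothetical scalar system.

    Assumed dynamics: s' = 1.1 * s + a (open-loop unstable).
    Assumed policy:   a ~ N(theta * s, sigma^2).
    Assumed cost:     c = s^2.
    """
    s = 1.0
    states, actions, costs = [], [], []
    for _ in range(horizon):
        a = theta * s + sigma * rng.standard_normal()
        states.append(s)
        actions.append(a)
        costs.append(s ** 2)
        s = 1.1 * s + a
    return np.array(states), np.array(actions), np.array(costs)


def reinforce_gradient(theta, sigma, n_traj, horizon, rng):
    """Monte Carlo score-function estimate of d/dtheta E[sum_t c_t]."""
    grads = []
    for _ in range(n_traj):
        s, a, c = rollout(theta, sigma, horizon, rng)
        # Gradient of log N(a; theta * s, sigma^2) with respect to theta.
        score = (a - theta * s) * s / sigma ** 2
        grads.append(np.sum(score * returns_to_go(c)))
    return np.mean(grads)


# One gradient estimate from a finite batch of sampled trajectories.
rng = np.random.default_rng(0)
g = reinforce_gradient(theta=0.0, sigma=0.1, n_traj=32, horizon=10, rng=rng)
```

Because the estimate is built from a finite batch of rollouts, it is noisy; a descent step of the form theta = theta - step * g would lower the expected cumulative cost on average, but it carries no stability guarantee by itself.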

Taxonomy

Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics · Stability and Control of Uncertain Systems