Lyapunov-based uncertainty-aware safe reinforcement learning
Ashkan B. Jeddi, Nariman L. Dehghani, Abdollah Shafieezadeh

TL;DR
This paper introduces a Lyapunov-based, uncertainty-aware safe reinforcement learning framework that enhances safety and performance in complex, partially observable environments by integrating trajectory constraints, uncertainty quantification, and memory mechanisms.
Contribution
It proposes a novel Lyapunov-based model with uncertainty quantification and Transformer memory to improve safety and optimality in safe RL, especially in out-of-distribution scenarios.
Findings
Significant safety improvements in grid-world navigation tasks.
Enhanced performance in both fully and partially observable environments.
Effective identification of risk-averse actions through uncertainty estimation.
Abstract
Reinforcement learning (RL) has shown promising performance in learning optimal policies for a variety of sequential decision-making tasks. However, in many real-world RL problems, the agent is expected not only to optimize its main objectives but also to satisfy a certain level of safety (e.g., avoiding collisions in autonomous driving). While RL problems are commonly formalized as Markov decision processes (MDPs), safety constraints are incorporated via constrained Markov decision processes (CMDPs). Although recent advances in safe RL have enabled learning safe policies in CMDPs, these safety requirements should be satisfied during both training and deployment. Furthermore, it has been shown that in memory-based and partially observable environments, these methods fail to maintain safety over unseen out-of-distribution observations. To address these limitations, we propose a…
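For readers unfamiliar with CMDPs, the constrained problem the abstract refers to is commonly written as follows. This is the standard textbook form, not necessarily the paper's own notation: the policy \pi, discount factor \gamma, reward r, safety cost c, and cost budget d are symbols assumed here for illustration.

```latex
\begin{aligned}
\max_{\pi} \quad & J_r(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t)\right] \\
\text{s.t.} \quad & J_c(\pi) = \mathbb{E}_{\pi}\!\left[\sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t)\right] \le d
\end{aligned}
```

Under this reading, an unconstrained MDP keeps only the first line, and a policy counts as safe when its expected discounted cost stays within the budget d.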
Taxonomy
Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Software Reliability and Analysis Research
