Safety Guarantees in Zero-Shot Reinforcement Learning for Cascade Dynamical Systems
Shima Rabiei, Sandipan Mishra, Santiago Paternain

TL;DR
This paper introduces a method for zero-shot safety guarantees in cascade dynamical systems: a safe RL policy is trained on a reduced-order model, and the safety probability is then bounded for deployment on the full-order system.
Contribution
It combines safe RL training on a reduced-order model with a theoretical bound on the safety probability of the full-order cascade system.
Findings
Safety guarantees are preserved when the low-level controller effectively tracks the inner-state reference provided by the RL policy.
Theoretical bounds relate safety probability to tracking quality and bandwidth.
Validated on a quadrotor navigation task with demonstrated safety preservation.
Abstract
This paper considers the problem of zero-shot safety guarantees for cascade dynamical systems. These are systems where a subset of the states (the inner states) affects the dynamics of the remaining states (the outer states) but not vice versa. We define safety as remaining in a set deemed safe for all times with high probability. We propose to train a safe RL policy on a reduced-order model, which ignores the dynamics of the inner states but treats them as an action that influences the outer states, thus reducing the complexity of training. When deployed in the full system, the trained policy is combined with a low-level controller whose task is to track the reference provided by the RL policy. Our main theoretical contribution is a bound on the safety probability in the full-order system. In particular, we establish the interplay between the probability of remaining safe after the…
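The cascade structure described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's method or experiment: the scalar dynamics, the hand-written `policy` (a stand-in for a trained safe RL policy), the first-order tracking controller, and every numeric parameter are hypothetical choices made only for illustration. It estimates, by Monte Carlo, how often the outer state stays in a safe interval when the inner state tracks the policy's reference perfectly (reduced-order model) versus through a low-level controller with a given tracking gain (full-order model).

```python
import random


def policy(x, goal=0.8, gain=2.0, u_max=1.0):
    """Hypothetical stand-in for a trained safe RL policy.

    Maps the outer state x to a reference for the inner state,
    clipped to [-u_max, u_max].
    """
    u = gain * (goal - x)
    return max(-u_max, min(u_max, u))


def rollout(tracking_gain, rng, steps=200, dt=0.05, noise_std=0.02, x_safe=1.0):
    """Simulate one episode; return True if |x| <= x_safe at every step.

    tracking_gain=None runs the reduced-order model, in which the inner
    state v equals the policy's reference instantly. Otherwise v follows
    the reference through a first-order low-level controller.
    """
    x, v = 0.0, 0.0  # outer state, inner state
    for _ in range(steps):
        v_ref = policy(x)
        if tracking_gain is None:
            v = v_ref  # reduced-order model: inner state acts as the action
        else:
            v += dt * tracking_gain * (v_ref - v)  # low-level tracking
        # The outer state depends on the inner state, but not vice versa.
        x += dt * v + noise_std * rng.gauss(0.0, 1.0)
        if abs(x) > x_safe:
            return False
    return True


def safety_probability(tracking_gain, episodes=2000, seed=0):
    """Monte Carlo estimate of the probability of staying safe for a full episode."""
    rng = random.Random(seed)
    safe = sum(rollout(tracking_gain, rng) for _ in range(episodes))
    return safe / episodes


if __name__ == "__main__":
    print("reduced-order model:", safety_probability(None))
    print("full model, fast tracking:", safety_probability(10.0))
    print("full model, slow tracking:", safety_probability(0.5))
```

In this toy setting a slow tracking controller lets the outer state overshoot the goal and leave the safe interval, while a fast one keeps the estimate close to the reduced-order value. That matches the qualitative claim that safety carries over when the inner states are tracked well, though it says nothing about the form of the paper's bound.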
