Runtime-Safety-Guided Policy Repair
Weichao Zhou, Ruihan Gao, BaekGyu Kim, Eunsuk Kang, Wenchao Li

TL;DR
This paper introduces a method for repairing learned control policies using runtime data from a safety controller, so that the safety controller has to take over less often. The repair uses trajectory optimization to keep the system safe while deviating as little as possible from the original policy.
Contribution
The paper proposes a policy-repair approach that reduces control switching by integrating safety constraints into trajectory optimization, and that remains effective even when the system model is unknown.
Findings
Reduces control switching in safety-critical control systems.
Effective even with approximate or unknown system models.
Improves safety assurances while maintaining high performance.
Abstract
We study the problem of policy repair for learning-based control policies in safety-critical settings. We consider an architecture where a high-performance learning-based control policy (e.g. one trained as a neural network) is paired with a model-based safety controller. The safety controller is endowed with the abilities to predict whether the trained policy will lead the system to an unsafe state, and take over control when necessary. While this architecture can provide added safety assurances, intermittent and frequent switching between the trained policy and the safety controller can result in undesirable behaviors and reduced performance. We propose to reduce or even eliminate control switching by 'repairing' the trained policy based on runtime data produced by the safety controller in a way that deviates minimally from the original policy. The key idea behind our approach is the…
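The switching architecture described in the abstract can be illustrated with a toy simulation. The sketch below is not the paper's implementation: the one-dimensional dynamics, the safe set, the look-ahead horizon, and both controllers are hypothetical stand-ins chosen only to show how a model-based safety check can take over from a learned policy, and how often that takeover can occur. It does not implement the repair step itself.

```python
from typing import Callable, List, Tuple

DT = 0.1       # integration step (assumed for illustration)
X_MAX = 1.0    # safe set is |x| <= X_MAX (assumed)
HORIZON = 5    # look-ahead steps used by the safety check (assumed)


def step(x: float, u: float) -> float:
    """Toy single-integrator model: next state is x + u * DT."""
    return x + u * DT


def learned_policy(x: float) -> float:
    """Hypothetical stand-in for a trained policy; it always pushes right."""
    return 1.0


def safe_controller(x: float) -> float:
    """Hypothetical safety controller; steers back toward the origin."""
    return -x


def predicts_unsafe(x: float, policy: Callable[[float], float]) -> bool:
    """Roll the model forward under `policy` and report any violation."""
    for _ in range(HORIZON):
        x = step(x, policy(x))
        if abs(x) > X_MAX:
            return True
    return False


def run(x0: float, steps: int) -> Tuple[List[float], int]:
    """Simulate the switched system; return visited states and takeover count."""
    x, states, takeovers = x0, [x0], 0
    for _ in range(steps):
        if predicts_unsafe(x, learned_policy):
            # The safety controller takes over for this step.
            u = safe_controller(x)
            takeovers += 1
        else:
            u = learned_policy(x)
        x = step(x, u)
        states.append(x)
    return states, takeovers


if __name__ == "__main__":
    states, takeovers = run(x0=0.0, steps=50)
    print("max |x|:", max(abs(s) for s in states))
    print("takeovers:", takeovers)
```

In this toy setup the system stays inside the safe set, but the safety controller intervenes repeatedly because the stand-in learned policy keeps heading toward the boundary. That repeated takeover is the kind of switching the paper aims to reduce by repairing the policy.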
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Fault Detection and Control Systems
Methods: Repair
