Back to Base: Towards Hands-Off Learning via Safe Resets with Reach-Avoid Safety Filters
Azra Begzadić, Nikhil Uday Shinde, Sander Tonkens, Dylan Hirsch, Kaleb Ugalde, Michael C. Yip, Jorge Cortés, Sylvia Herbert

TL;DR
This paper presents a novel safety filter based on reach-avoid value functions that enables robots to autonomously reset to safe states, improving safety and efficiency in reinforcement learning training.
Contribution
It introduces a reach-avoid based safety filter that minimally intervenes in control to ensure safety and facilitate hands-off training for real-world robots.
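To make "minimally intervenes" concrete, here is a minimal sketch of a switching safety filter on a toy one-dimensional system. It is not the paper's filter. The dynamics `step`, the stand-in value function `toy_value`, the fallback `reset_policy`, and the `margin` parameter are all hypothetical names chosen for illustration, and the sign convention (positive value means the system can still return to the target without entering unsafe states) is an assumption.

```python
# Toy 1D system: state x is a position, control u is a velocity command.
# Assumed setup: the target (reset) region is near x = 0 and |x| > 1.5 is unsafe.

def step(x, u, dt=0.1):
    """Hypothetical one-step dynamics: simple Euler integrator."""
    return x + dt * u

def toy_value(x):
    """Hypothetical stand-in for a reach-avoid value function.

    Assumed convention: a positive value means the state can still reach
    the target while avoiding the unsafe set.
    """
    return 1.5 - abs(x)

def reset_policy(x):
    """Hypothetical fallback controller that drives the state back toward 0."""
    return -1.0 if x > 0 else 1.0

def filter_action(x, u_nominal, margin=0.0):
    """Keep the nominal action unless it would leave the assumed safe set.

    The filter predicts the next state under the nominal action and only
    overrides when the stand-in value at that state is not above `margin`.
    """
    x_next = step(x, u_nominal)
    if toy_value(x_next) > margin:
        return u_nominal      # no intervention needed
    return reset_policy(x)    # intervene and steer back toward the target

# Far from the boundary the nominal (e.g., exploratory) action passes through;
# near the boundary the filter overrides it.
print(filter_action(0.0, 1.0))
print(filter_action(1.45, 1.0))
```

In this sketch the learning policy would supply `u_nominal` at every step, so the filter changes the action only when the predicted state falls outside the assumed safe set. A value function computed for a real system would replace `toy_value`.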
Findings
Successfully applied to a cartpole swing-up task
Ensures safety while maintaining control performance
Enables autonomous resets without human intervention
Abstract
Designing controllers that accomplish tasks while guaranteeing safety constraints remains a significant challenge. We often want an agent to perform well in a nominal task, such as environment exploration, while ensuring it can avoid unsafe states and return to a desired target by a specific time. In particular, we are motivated by the setting of safe, efficient, hands-off training for reinforcement learning in the real world. By enabling a robot to safely and autonomously reset to a desired region (e.g., a charging station) without human intervention, we can enhance efficiency and facilitate training. Safety filters, such as those based on control barrier functions, decouple safety from nominal control objectives and rigorously guarantee safety. Despite their success, constructing these functions for general nonlinear systems with control constraints and system uncertainties remains an…
Taxonomy
Topics: Machine Learning and Algorithms · Network Packet Processing and Optimization · Software Testing and Debugging Techniques
