Back to Base: Towards Hands-Off Learning via Safe Resets with Reach-Avoid Safety Filters

Azra Begzadić; Nikhil Uday Shinde; Sander Tonkens; Dylan Hirsch; Kaleb Ugalde; Michael C. Yip; Jorge Cortés; Sylvia Herbert

arXiv:2501.02620 · eess.SY · June 4, 2025

TL;DR

This paper presents a novel safety filter based on reach-avoid value functions that enables robots to autonomously reset to safe states, improving safety and efficiency in reinforcement learning training.

Contribution

It introduces a reach-avoid based safety filter that minimally intervenes in control to ensure safety and facilitate hands-off training for real-world robots.
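To make the idea of a minimally intervening filter concrete, here is a minimal sketch on a toy 1D single integrator. It is not the paper's formulation: the dynamics, the hand-built value function `reach_avoid_value`, the constants, and the switching rule in `filter_control` are all assumptions chosen for illustration. The filter passes the nominal control through whenever the predicted next state can still return to the target in the remaining time without leaving the safe region, and otherwise falls back to a return-to-base control.

```python
import math

# All constants below are illustrative assumptions, not values from the paper.
U_MAX = 1.0      # control bound: |u| <= U_MAX
X_UNSAFE = 2.0   # failure set: |x| > X_UNSAFE
X_TARGET = 0.1   # target ("base") set: |x| <= X_TARGET
DT = 0.1         # time step of the toy dynamics x_next = x + u * DT


def reach_avoid_value(x: float, t_left: float) -> float:
    """Hand-built toy value: non-negative when the target can still be
    reached within t_left while staying inside the safe region."""
    safety_margin = X_UNSAFE - abs(x)
    time_to_target = max(0.0, abs(x) - X_TARGET) / U_MAX
    return min(safety_margin, t_left - time_to_target)


def return_control(x: float) -> float:
    """Fallback control: move toward the origin at full speed."""
    if x == 0.0:
        return 0.0
    return -math.copysign(U_MAX, x)


def filter_control(x: float, t_left: float, u_nominal: float) -> float:
    """Switching filter: keep the (clipped) nominal control if the predicted
    next state keeps the toy value non-negative, else use the fallback."""
    u = max(-U_MAX, min(U_MAX, u_nominal))
    x_next = x + u * DT
    if reach_avoid_value(x_next, t_left - DT) >= 0.0:
        return u
    return return_control(x)


def rollout(x0: float, horizon: float, u_nominal: float) -> list[float]:
    """Simulate the filtered system under a constant nominal control."""
    xs, x, t_left = [x0], x0, horizon
    for _ in range(round(horizon / DT)):
        x = x + filter_control(x, t_left, u_nominal) * DT
        t_left -= DT
        xs.append(x)
    return xs


if __name__ == "__main__":
    trajectory = rollout(x0=0.0, horizon=3.0, u_nominal=1.0)
    print("max |x|:", max(abs(x) for x in trajectory))
    print("final x:", trajectory[-1])
```

In this sketch the nominal controller pushes away from the base for as long as a timely return remains possible, after which the filter overrides it. A switching rule like this is only one way to build such a filter; an optimization-based filter that stays as close as possible to the nominal control would be another.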

Findings

01. Successfully applied to a cartpole swing-up task

02. Ensures safety while maintaining control performance

03. Enables autonomous resets without human intervention

Abstract

Designing controllers that accomplish tasks while guaranteeing safety constraints remains a significant challenge. We often want an agent to perform well in a nominal task, such as environment exploration, while ensuring it can avoid unsafe states and return to a desired target by a specific time. In particular, we are motivated by the setting of safe, efficient, hands-off training for reinforcement learning in the real world. By enabling a robot to safely and autonomously reset to a desired region (e.g., charging stations) without human intervention, we can enhance efficiency and facilitate training. Safety filters, such as those based on control barrier functions, decouple safety from nominal control objectives and rigorously guarantee safety. Despite their success, constructing these functions for general nonlinear systems with control constraints and system uncertainties remains an…

Taxonomy

Topics: Machine Learning and Algorithms · Network Packet Processing and Optimization · Software Testing and Debugging Techniques