TAIL-Safe: Task-Agnostic Safety Monitoring for Imitation Learning Policies
Riad Ahmed, Momotaz Begum

TL;DR
TAIL-Safe introduces a safety monitoring approach for imitation learning policies, enabling them to identify safe execution regions and recover from unsafe actions, thus improving robustness in manipulation tasks.
Contribution
The paper proposes a Lipschitz-continuous Q-value function and a recovery mechanism, intended to keep policy execution safe and to preserve task success in out-of-distribution scenarios.
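To make the idea of a Lipschitz-continuous Q-value function concrete, the following is a minimal sketch of one common way to bound a network's Lipschitz constant: rescaling each weight matrix so its spectral norm is at most 1 and using a 1-Lipschitz activation (ReLU). This is an illustrative assumption, not the construction used in the paper; the names `spectral_normalize` and `LipschitzQ`, the layer sizes, and the random weights are all hypothetical.

```python
import numpy as np

def spectral_normalize(W, max_norm=1.0):
    """Rescale W so its largest singular value is at most max_norm."""
    sigma = np.linalg.norm(W, 2)  # spectral norm of a 2-D array
    return W if sigma <= max_norm else W * (max_norm / sigma)

class LipschitzQ:
    """Hypothetical two-layer Q-function with Lipschitz constant <= 1 (L2 norm).

    Assumption: this is a generic construction for illustration only,
    not the architecture described in the paper.
    """
    def __init__(self, state_dim, action_dim, hidden=32, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = spectral_normalize(rng.normal(size=(hidden, state_dim + action_dim)))
        self.b1 = np.zeros(hidden)
        self.W2 = spectral_normalize(rng.normal(size=(1, hidden)))
        self.b2 = np.zeros(1)

    def __call__(self, state, action):
        x = np.concatenate([state, action])
        h = np.maximum(self.W1 @ x + self.b1, 0.0)  # ReLU is 1-Lipschitz
        return float((self.W2 @ h + self.b2)[0])
```

Because each layer is at most 1-Lipschitz, the composed function satisfies |Q(s, a) - Q(s', a')| <= ||(s, a) - (s', a')||, so nearby state-action pairs cannot receive wildly different scores.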
Findings
Flow-matching policies succeed when guided by TAIL-Safe.
TAIL-Safe effectively identifies safe state-action regions.
Recovery mechanism maintains task success under perturbations.
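The findings above describe two operations: checking whether a state-action pair lies in a safe region, and recovering when it does not. The summary does not specify how either is implemented, so the sketch below is only one plausible reading: a pair is treated as safe when its Q-value meets a threshold, and an unsafe action is nudged by gradient ascent on Q until it does. The function `toy_q`, the threshold, the step size, and the finite-difference gradient are all hypothetical stand-ins.

```python
import numpy as np

def toy_q(state, action):
    # Hypothetical stand-in for a learned Q-function:
    # the score is highest when the action equals -state.
    return 1.0 - float(np.linalg.norm(action + state))

def is_safe(q_fn, state, action, threshold):
    """Treat a state-action pair as safe when its Q-value meets the threshold."""
    return q_fn(state, action) >= threshold

def numerical_grad(q_fn, state, action, eps=1e-5):
    """Central finite-difference gradient of Q with respect to the action."""
    grad = np.zeros_like(action)
    for i in range(action.size):
        step = np.zeros_like(action)
        step[i] = eps
        grad[i] = (q_fn(state, action + step) - q_fn(state, action - step)) / (2 * eps)
    return grad

def recover_action(q_fn, state, action, threshold, lr=0.05, max_steps=200):
    """Assumed recovery rule: gradient ascent on Q over the action until safe."""
    a = np.asarray(action, dtype=float).copy()
    for _ in range(max_steps):
        if is_safe(q_fn, state, a, threshold):
            break
        a = a + lr * numerical_grad(q_fn, state, a)
    return a

state = np.array([0.5, -0.2])
proposed = np.array([1.0, 1.0])   # action proposed by some policy (made up)
recovered = recover_action(toy_q, state, proposed, threshold=0.8)
```

In this toy setup the proposed action starts below the threshold and the recovered action ends at or above it; a real monitor would use the learned Q-value function and the policy's own action proposals in place of `toy_q` and `proposed`.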
Abstract
Recent imitation learning (IL) algorithms such as flow-matching and diffusion policies demonstrate remarkable performance in learning complex manipulation tasks. However, these policies often fail even when operating within their training distribution due to extreme sensitivity to initial conditions and irreducible approximation errors that lead to compounding drift. This makes it unsafe to deploy IL policies in the field where out-of-distribution scenarios are prevalent. A prerequisite for safe deployment is enabling the policy to determine whether it can execute a task the way it was learned from demonstrations. This paper presents TAIL-Safe, a principled approach to identify, for a trained IL policy, a safe set from where the policy empirically succeeds in completing the learned task. We propose a Lipschitz-continuous Q-value function that maps state-action pairs to a long-term…
