Failure-Aware RL: Reliable Offline-to-Online Reinforcement Learning with Self-Recovery for Real-World Manipulation

Huanyu Li; Kun Lei; Sheng Zang; Kaizhe Hu; Yongyuan Liang; Bo An; Xiaoli Li; Huazhe Xu

arXiv:2601.07821 · cs.RO · January 13, 2026

Open Access

TL;DR

This paper introduces FARL, an offline-to-online reinforcement learning paradigm that reduces Intervention-requiring Failures (IR Failures) during real-world robotic exploration by combining a world-model-based safety critic with a recovery policy trained offline, validated in simulation and real-world experiments.

Contribution

The paper proposes FARL, a new offline-to-online RL framework with a safety critic and recovery policy, and introduces FailureBench, a benchmark for failure scenarios in robotics.

Findings

01. FARL reduces IR Failures by 73.1% in real-world tasks.

02. FARL improves online RL performance by 11.3% on average.

03. Extensive experiments validate FARL's effectiveness in safety and generalization.

Abstract

Post-training algorithms based on deep reinforcement learning can push the limits of robotic models for specific objectives, such as generalizability, accuracy, and robustness. However, Intervention-requiring Failures (IR Failures) (e.g., a robot spilling water or breaking fragile glass) during real-world exploration happen inevitably, hindering the practical deployment of such a paradigm. To tackle this, we introduce Failure-Aware Offline-to-Online Reinforcement Learning (FARL), a new paradigm minimizing failures during real-world reinforcement learning. We create FailureBench, a benchmark that incorporates common failure scenarios requiring human intervention, and propose an algorithm that integrates a world-model-based safety critic and a recovery policy trained offline to prevent failures during online exploration. Extensive simulation and real-world experiments demonstrate the…
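The idea of a safety critic guarding exploration, with a recovery policy as fallback, can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the abstract does not state the gating rule, so the threshold test, the function names (`select_action`, `toy_task_policy`, `toy_recovery_policy`, `toy_safety_critic`), the `risk_threshold` parameter, and the one-dimensional toy environment are all hypothetical. In particular, the hand-written critic stands in for the learned, world-model-based critic described above.

```python
from typing import Callable, List, Tuple

State = List[float]
Action = List[float]


def select_action(
    state: State,
    task_policy: Callable[[State], Action],
    recovery_policy: Callable[[State], Action],
    safety_critic: Callable[[State, Action], float],
    risk_threshold: float = 0.5,  # hypothetical hyperparameter
) -> Tuple[Action, bool]:
    """Return (action, recovered): use the task policy's action unless the
    critic's estimated failure risk exceeds the threshold, in which case
    fall back to the recovery policy."""
    proposed = task_policy(state)
    risk = safety_critic(state, proposed)
    if risk > risk_threshold:
        return recovery_policy(state), True
    return proposed, False


# Toy 1-D setting (assumed): the state is a position x, and any
# position at or beyond 1.0 counts as a failure.
def toy_task_policy(state: State) -> Action:
    return [0.3]  # always push forward


def toy_recovery_policy(state: State) -> Action:
    return [-0.3]  # always back away from the failure region


def toy_safety_critic(state: State, action: Action) -> float:
    # Stand-in for a learned critic: risk is 1.0 if the next position
    # would enter the failure region, else 0.0.
    return 1.0 if state[0] + action[0] >= 1.0 else 0.0


if __name__ == "__main__":
    for x in (0.5, 0.8):
        action, recovered = select_action(
            [x], toy_task_policy, toy_recovery_policy, toy_safety_critic
        )
        print(f"x={x}: action={action}, recovered={recovered}")
```

In this toy setting the proposed action passes through at x = 0.5 and is replaced by the recovery action at x = 0.8, where moving forward would cross into the failure region. A real system would replace the hand-written critic and policies with learned models and tune the threshold empirically.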

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Adversarial Robustness in Machine Learning