RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning
Yinpei Dai, Jayjun Lee, Nima Fazeli, Joyce Chai

TL;DR
RACER introduces a scalable framework combining rich language annotations and failure recovery data to improve robotic visuomotor policies, enabling better error correction and task success in diverse environments.
Contribution
The paper presents RACER, a novel supervisor-actor framework that integrates a vision-language model with failure recovery data to enhance robot control and robustness.
Findings
RACER outperforms state-of-the-art methods on RLBench tasks.
RACER demonstrates superior zero-shot generalization to unseen tasks.
RACER achieves effective real-world robot manipulation performance.
Abstract
Developing robust and correctable visuomotor policies for robotic manipulation is challenging due to the lack of self-recovery mechanisms from failures and the limitations of simple language instructions in guiding robot actions. To address these issues, we propose a scalable data generation pipeline that automatically augments expert demonstrations with failure recovery trajectories and fine-grained language annotations for training. We then introduce Rich languAge-guided failure reCovERy (RACER), a supervisor-actor framework, which combines failure recovery data with rich language descriptions to enhance robot control. RACER features a vision-language model (VLM) that acts as an online supervisor, providing detailed language guidance for error correction and task execution, and a language-conditioned visuomotor policy as an actor to predict the next actions. Our experimental results…
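The supervisor-actor split described in the abstract can be illustrated with a minimal control-loop sketch. This is not the authors' implementation: the names `run_episode`, `Step`, `ToyEnv`, `toy_supervisor`, and `toy_actor`, the `"done"` sentinel, and the 1-D toy world are all assumptions made for illustration. In RACER the supervisor is a vision-language model and the actor is a learned visuomotor policy; here both are replaced by hand-written rules so the loop can run on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

Observation = Sequence[float]  # stand-in for camera images / robot state
Action = Sequence[float]       # stand-in for an end-effector command


@dataclass
class Step:
    instruction: str  # language guidance issued by the supervisor
    action: Action    # action the actor predicted from that guidance


def run_episode(
    task: str,
    supervisor: Callable[[str, Observation, List[Step]], str],
    actor: Callable[[Observation, str], Action],
    env_step: Callable[[Action], Observation],
    first_obs: Observation,
    max_steps: int = 10,
) -> List[Step]:
    """Alternate supervisor guidance and actor actions until 'done'."""
    history: List[Step] = []
    obs = first_obs
    for _ in range(max_steps):
        # Supervisor looks at the current observation and emits guidance,
        # which may be a recovery instruction if something went wrong.
        instruction = supervisor(task, obs, history)
        if instruction == "done":  # hypothetical stop signal
            break
        # Actor is conditioned on the observation and the instruction.
        action = actor(obs, instruction)
        history.append(Step(instruction, action))
        obs = env_step(action)
    return history


# --- Toy stand-ins (assumptions, not part of RACER) -------------------
TARGET = 3  # goal position on a 1-D line


class ToyEnv:
    def __init__(self) -> None:
        self.x = 0

    def step(self, action: Action) -> Observation:
        self.x += action[0]
        return [self.x]


def toy_supervisor(task: str, obs: Observation, history: List[Step]) -> str:
    if obs[0] == TARGET:
        return "done"
    if obs[0] > TARGET:
        return "you overshot the target; move back left"  # recovery
    return "move right toward the target"


def toy_actor(obs: Observation, instruction: str) -> Action:
    # A coarse policy: big steps right, small corrective steps left.
    return [-1] if "left" in instruction else [2]


def demo() -> List[Step]:
    env = ToyEnv()
    return run_episode("reach the target", toy_supervisor, toy_actor,
                       env.step, first_obs=[env.x])


if __name__ == "__main__":
    for step in demo():
        print(step.instruction, "->", step.action)
```

In this toy run the actor overshoots on its second step, and the supervisor responds with a corrective instruction rather than repeating the original one. That is the kind of language-guided recovery the framework targets, reduced here to a single dimension.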
Taxonomy
Topics: Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI
Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Dropout · Dense Connections
