RACER: Rich Language-Guided Failure Recovery Policies for Imitation Learning

Yinpei Dai; Jayjun Lee; Nima Fazeli; Joyce Chai

arXiv:2409.14674 · cs.RO · September 24, 2024

TL;DR

RACER introduces a scalable framework combining rich language annotations and failure recovery data to improve robotic visuomotor policies, enabling better error correction and task success in diverse environments.

Contribution

The paper presents RACER, a novel supervisor-actor framework that integrates a vision-language model with failure recovery data to enhance robot control and robustness.

Findings

01

RACER outperforms state-of-the-art methods on RLBench tasks.

02

RACER demonstrates superior zero-shot generalization to unseen tasks.

03

RACER achieves effective real-world robot manipulation performance.

Abstract

Developing robust and correctable visuomotor policies for robotic manipulation is challenging due to the lack of self-recovery mechanisms from failures and the limitations of simple language instructions in guiding robot actions. To address these issues, we propose a scalable data generation pipeline that automatically augments expert demonstrations with failure recovery trajectories and fine-grained language annotations for training. We then introduce Rich languAge-guided failure reCovERy (RACER), a supervisor-actor framework, which combines failure recovery data with rich language descriptions to enhance robot control. RACER features a vision-language model (VLM) that acts as an online supervisor, providing detailed language guidance for error correction and task execution, and a language-conditioned visuomotor policy as an actor to predict the next actions. Our experimental results…
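The supervisor-actor split described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `SupervisorActorLoop`, the toy functions, the observation keys, and the instruction strings are all hypothetical stand-ins, and a real system would replace them with a VLM and a learned visuomotor policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types for illustration only.
Observation = Dict[str, bool]
Action = List[float]


@dataclass
class SupervisorActorLoop:
    """One control step: the supervisor issues language guidance, the actor acts on it."""
    supervisor: Callable[[Observation, str], str]  # (observation, task goal) -> instruction
    actor: Callable[[Observation, str], Action]    # (observation, instruction) -> next action

    def step(self, obs: Observation, task_goal: str) -> Tuple[str, Action]:
        instruction = self.supervisor(obs, task_goal)
        action = self.actor(obs, instruction)
        return instruction, action


def toy_supervisor(obs: Observation, task_goal: str) -> str:
    # Stand-in for the VLM: emit a recovery instruction when a failure flag is set.
    if obs.get("gripper_misaligned", False):
        return "recover: move the gripper back above the target"
    return f"continue: {task_goal}"


def toy_actor(obs: Observation, instruction: str) -> Action:
    # Stand-in for the language-conditioned policy: a fixed action per instruction type.
    if instruction.startswith("recover"):
        return [0.0, 0.0, 0.1]   # retreat upward
    return [0.0, 0.0, -0.1]      # keep descending


loop = SupervisorActorLoop(toy_supervisor, toy_actor)
instruction, action = loop.step({"gripper_misaligned": True}, "close the jar")
```

The point of the sketch is the interface: the actor never sees the raw task goal, only the supervisor's current instruction, so a correction can be injected at any step without retraining the actor.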

Taxonomy

Topics: Multimodal Machine Learning Applications · Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI

Methods: Attention Is All You Need · Linear Layer · Position-Wise Feed-Forward Layer · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Dropout · Dense Connections