SPRINT: Scalable Policy Pre-Training via Language Instruction Relabeling
Jesse Zhang, Karl Pertsch, Jiahui Zhang, Joseph J. Lim

TL;DR
SPRINT is a scalable offline policy pre-training method that uses large language models and offline reinforcement learning to automatically expand a base set of pre-training tasks into a more diverse set of skills, substantially speeding up downstream task learning in robotics.
Contribution
SPRINT introduces instruction relabeling via large language models and cross-trajectory skill chaining via offline reinforcement learning, which together reduce human annotation effort and increase skill diversity in robot pre-training.
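To make the instruction relabeling idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each trajectory is already split into contiguous, language-annotated segments, and it merges every run of adjacent segments into one longer segment with a new combined instruction. The names `Segment`, `relabel`, and `naive_summarize` are hypothetical. `naive_summarize` is only a stand-in for a large language model call, which would produce a shorter, higher-level instruction.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Segment:
    """A slice of one trajectory with its language instruction (hypothetical type)."""
    start: int        # index of the first step, inclusive
    end: int          # index one past the last step, exclusive
    instruction: str


def naive_summarize(instructions: List[str]) -> str:
    # Placeholder for an LLM call: simply joins the instructions in order.
    return ", then ".join(instructions)


def relabel(
    segments: List[Segment],
    summarize: Callable[[List[str]], str] = naive_summarize,
    max_span: Optional[int] = None,
) -> List[Segment]:
    """Build new, longer segments from every run of two or more adjacent segments."""
    relabeled = []
    n = len(segments)
    for i in range(n):
        for j in range(i + 1, n):
            if max_span is not None and j - i + 1 > max_span:
                break
            # Only merge segments that are contiguous in time.
            if segments[j].start != segments[j - 1].end:
                break
            window = segments[i : j + 1]
            relabeled.append(
                Segment(
                    start=window[0].start,
                    end=window[-1].end,
                    instruction=summarize([s.instruction for s in window]),
                )
            )
    return relabeled


if __name__ == "__main__":
    base = [
        Segment(0, 10, "pick up the mug"),
        Segment(10, 25, "place the mug in the sink"),
        Segment(25, 40, "turn on the faucet"),
    ]
    for seg in relabel(base):
        print(seg)
```

The merged segments can be added to the pre-training set alongside the original ones, so the base annotations are expanded without further human labeling. Cross-trajectory skill chaining is not covered by this sketch.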
Findings
SPRINT learns new tasks faster, both in a household simulator and on a real robot.
SPRINT reduces the human effort needed to annotate pre-training data.
Pre-training with SPRINT improves downstream task performance.
Abstract
Pre-training robot policies with a rich set of skills can substantially accelerate the learning of downstream tasks. Prior works have defined pre-training tasks via natural language instructions, but doing so requires tedious human annotation of hundreds of thousands of instructions. Thus, we propose SPRINT, a scalable offline policy pre-training approach which substantially reduces the human effort needed for pre-training a diverse set of skills. Our method uses two core ideas to automatically expand a base set of pre-training tasks: instruction relabeling via large language models and cross-trajectory skill chaining through offline reinforcement learning. As a result, SPRINT pre-training equips robots with a much richer repertoire of skills. Experimental results in a household simulator and on a real robot kitchen manipulation task show that SPRINT leads to substantially faster…
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Natural Language Processing Techniques
Methods: Balanced Selection
