Inverse Dynamics Pretraining Learns Good Representations for Multitask Imitation
David Brandfonbrener, Ofir Nachum, Joan Bruna

TL;DR
This paper explores how inverse dynamics pretraining can learn effective representations for multitask imitation learning, demonstrating empirical benefits in simulated visuomotor tasks and providing a new theoretical analysis.
Contribution
It argues that inverse dynamics modeling is a well-suited pretraining objective for multitask imitation learning and offers a novel theoretical explanation for its empirical success.
Findings
Inverse dynamics pretraining improves transfer to new tasks.
Experiments on simulated visuomotor problems show empirical benefits from inverse dynamics pretraining.
A new theoretical analysis explains why inverse dynamics pretraining helps.
Abstract
In recent years, domains such as natural language processing and image recognition have popularized the paradigm of using large datasets to pretrain representations that can be effectively transferred to downstream tasks. In this work we evaluate how such a paradigm should be done in imitation learning, where both pretraining and finetuning data are trajectories collected by experts interacting with an unknown environment. Namely, we consider a setting where the pretraining corpus consists of multitask demonstrations and the task for each demonstration is set by an unobserved latent context variable. The goal is to use the pretraining corpus to learn a low dimensional representation of the high dimensional (e.g., visual) observation space which can be transferred to a novel context for finetuning on a limited dataset of demonstrations. Among a variety of possible pretraining objectives,…
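As a rough illustration of the setting described in the abstract, the sketch below computes an inverse dynamics loss: an encoder maps each high dimensional observation to a low dimensional representation, and a prediction head recovers the action from the representations of the observations before and after it. Everything here is an assumption made for illustration and is not taken from the paper: the linear encoder and head, the names `encoder_w` and `head_w`, the dimensions, the squared-error loss, and the random placeholder data.

```python
import numpy as np


def inverse_dynamics_loss(encoder_w, head_w, obs, next_obs, actions):
    """Mean squared error of predicting a_t from (phi(o_t), phi(o_{t+1})).

    phi is a hypothetical linear encoder, phi(o) = o @ encoder_w.
    """
    z = obs @ encoder_w                              # (n, rep_dim)
    z_next = next_obs @ encoder_w                    # (n, rep_dim)
    features = np.concatenate([z, z_next], axis=1)   # (n, 2 * rep_dim)
    predicted_actions = features @ head_w            # (n, act_dim)
    return float(np.mean((predicted_actions - actions) ** 2))


# Placeholder data standing in for (o_t, a_t, o_{t+1}) transitions taken
# from multitask demonstrations. All shapes are illustrative assumptions.
rng = np.random.default_rng(0)
n, obs_dim, rep_dim, act_dim = 256, 32, 4, 2
obs = rng.normal(size=(n, obs_dim))
next_obs = rng.normal(size=(n, obs_dim))
actions = rng.normal(size=(n, act_dim))

encoder_w = 0.1 * rng.normal(size=(obs_dim, rep_dim))  # representation to pretrain
head_w = np.zeros((2 * rep_dim, act_dim))              # inverse dynamics head

loss = inverse_dynamics_loss(encoder_w, head_w, obs, next_obs, actions)
```

In a pretraining run, this loss would be minimized over both `encoder_w` and `head_w` on the multitask corpus; the encoder alone would then be reused when finetuning a policy on the small dataset from the novel context.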
Taxonomy
Topics: Robot Manipulation and Learning · Human Pose and Action Recognition · Multimodal Machine Learning Applications
