Multi-Environment Pretraining Enables Transfer to Action Limited Datasets
David Venuto, Sherry Yang, Pieter Abbeel, Doina Precup, Igor Mordatch, Ofir Nachum

TL;DR
This paper introduces ALPT, a pretraining method that uses inverse dynamics modeling to generate action labels in unannotated datasets, enabling effective transfer learning in environments with limited action data.
Contribution
The paper presents a novel pretraining approach combining multi-environment data and inverse dynamics modeling to improve reinforcement learning in action-sparse settings.
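As a rough illustration of the inverse dynamics idea, the sketch below trains a model to predict the action from a pair of consecutive states, using a fully labelled source dataset plus a handful of labelled target transitions, and then labels the remaining target transitions. Everything here is an assumption made for illustration: the 2-D toy environments, the function names (`make_transitions`, `train_idm`, `label_actions`), the step sizes, and the use of plain softmax regression, which stands in for whatever model the authors actually use. It is not the paper's implementation.

```python
import numpy as np

# Hypothetical toy setup: states are 2-D points, actions are 4 discrete moves.
MOVES = np.array([[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])

def make_transitions(n, step, rng):
    """Sample (state, next_state, action) triples from a toy environment."""
    s = rng.normal(size=(n, 2))
    a = rng.integers(0, 4, size=n)
    s_next = s + step * MOVES[a] + 0.05 * rng.normal(size=(n, 2))
    return s, s_next, a

def features(s, s_next):
    # The inverse dynamics model sees the state change plus a bias term.
    d = s_next - s
    return np.hstack([d, np.ones((len(d), 1))])

def train_idm(x, y, n_actions=4, lr=0.5, epochs=300):
    """Fit a softmax-regression inverse dynamics model by gradient descent."""
    w = np.zeros((x.shape[1], n_actions))
    onehot = np.eye(n_actions)[y]
    for _ in range(epochs):
        logits = x @ w
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        w -= lr * x.T @ (p - onehot) / len(x)  # cross-entropy gradient step
    return w

def label_actions(w, s, s_next):
    """Predict the most likely action for each (state, next_state) pair."""
    return (features(s, s_next) @ w).argmax(axis=1)

rng = np.random.default_rng(0)
# Source environment: fully labelled. Target environment: smaller step size,
# and only the first n_labeled transitions are assumed to carry action labels.
src_s, src_s2, src_a = make_transitions(2000, 1.0, rng)
tgt_s, tgt_s2, tgt_a = make_transitions(2000, 0.5, rng)
n_labeled = 20

x_train = np.vstack([
    features(src_s, src_s2),
    features(tgt_s[:n_labeled], tgt_s2[:n_labeled]),
])
y_train = np.concatenate([src_a, tgt_a[:n_labeled]])
w = train_idm(x_train, y_train)

# Generate action labels for the target transitions that had none.
pseudo = label_actions(w, tgt_s[n_labeled:], tgt_s2[n_labeled:])
accuracy = (pseudo == tgt_a[n_labeled:]).mean()
print(f"pseudo-label accuracy on unlabelled target transitions: {accuracy:.3f}")
```

In this toy case the ground-truth target actions are known, so the generated labels can be scored directly; in a real action-limited dataset they would instead be used to train a downstream policy on the newly labelled target data.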
Findings
Significant performance improvements in game environments with minimal annotated data.
Effective action label generation even across environments with no shared actions.
Enhanced generalization capabilities demonstrated on benchmark tasks.
Abstract
Using massive datasets to train large-scale models has emerged as a dominant approach for broad generalization in natural language and vision applications. In reinforcement learning, however, a key challenge is that available data of sequential decision making is often not annotated with actions - for example, videos of game-play are much more available than sequences of frames paired with their logged game controls. We propose to circumvent this challenge by combining large but sparsely-annotated datasets from a target environment of interest with fully-annotated datasets from various other source environments. Our method, Action Limited PreTraining (ALPT), leverages the generalization capabilities of inverse dynamics modelling (IDM) to label missing action data in the target environment. We show that utilizing even one additional environment dataset of labelled data…
Taxonomy
Topics: Reinforcement Learning in Robotics · Multimodal Machine Learning Applications · Model Reduction and Neural Networks
