PolyTask: Learning Unified Policies through Behavior Distillation
Siddhant Haldar, Lerrel Pinto

TL;DR
PolyTask introduces a unified policy learning framework for embodied tasks that combines a few demonstrations per task with behavior distillation, enabling efficient multi-task and lifelong learning in simulation and on real robots.
Contribution
It proposes a novel 'learn then distill' approach with behavior distillation, improving multi-task and lifelong learning in embodied environments.
Findings
PolyTask outperforms prior methods in simulated environments.
PolyTask achieves significant improvements in lifelong learning.
The method is effective in real-robot experiments.
Abstract
Unified models capable of solving a wide variety of tasks have gained traction in vision and NLP due to their ability to share regularities and structures across tasks, which improves individual task performance and reduces computational footprint. However, the impact of such models remains limited in embodied learning problems, which present unique challenges due to interactivity, sample inefficiency, and sequential task presentation. In this work, we present PolyTask, a novel method for learning a single unified model that can solve various embodied tasks through a 'learn then distill' mechanism. In the 'learn' step, PolyTask leverages a few demonstrations for each task to train task-specific policies. Then, in the 'distill' step, task-specific policies are distilled into a single policy using a new distillation method called Behavior Distillation. Given a unified policy, individual…
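To make the 'learn then distill' idea concrete, here is a minimal sketch of the 'distill' step only. It is not the paper's Behavior Distillation, whose details are not given in this summary. It assumes a much simpler stand-in: plain action regression, where a single task-conditioned student is fit to reproduce the actions of per-task teachers. The linear teachers, the dimensions, and every function name below are hypothetical and chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, ACT_DIM, NUM_TASKS = 4, 2, 3  # hypothetical sizes

# Hypothetical stand-ins for the task-specific policies produced by the
# 'learn' step: one random linear map from observation to action per task.
teacher_weights = [rng.normal(size=(OBS_DIM, ACT_DIM)) for _ in range(NUM_TASKS)]


def teacher_action(task_id, obs):
    """Action of the task-specific (teacher) policy for a batch of observations."""
    return obs @ teacher_weights[task_id]


def student_features(task_id, obs):
    """Task-conditioned input: the observation is placed in the block for task_id."""
    feats = np.zeros((obs.shape[0], NUM_TASKS * OBS_DIM))
    feats[:, task_id * OBS_DIM:(task_id + 1) * OBS_DIM] = obs
    return feats


def distill(samples_per_task=200):
    """Fit one student policy to all teachers by least-squares action regression."""
    inputs, targets = [], []
    for task_id in range(NUM_TASKS):
        obs = rng.normal(size=(samples_per_task, OBS_DIM))
        inputs.append(student_features(task_id, obs))
        targets.append(teacher_action(task_id, obs))
    X = np.concatenate(inputs)
    Y = np.concatenate(targets)
    student_w, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return student_w


def student_action(student_w, task_id, obs):
    """Action of the single unified (student) policy for a given task."""
    return student_features(task_id, obs) @ student_w


student_w = distill()

# Compare student and teacher actions on held-out observations for every task.
test_obs = rng.normal(size=(5, OBS_DIM))
max_err = max(
    np.abs(student_action(student_w, t, test_obs) - teacher_action(t, test_obs)).max()
    for t in range(NUM_TASKS)
)
print(f"max student-teacher action gap: {max_err:.2e}")
```

In this toy setting the student can represent every teacher exactly, so the gap is at the level of numerical error. With real visual observations and neural-network policies, the student would be trained by gradient descent on a distillation loss instead of a closed-form solve.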
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications
