Refactoring Policy for Compositional Generalizability using Self-Supervised Object Proposals
Tongzhou Mu, Jiayuan Gu, Zhiwei Jia, Hao Tang, Hao Su

TL;DR
This paper introduces a two-stage framework that refactors a high-reward teacher policy into a generalizable student policy using self-supervised object proposals, enhancing compositional generalizability in complex tasks.
Contribution
It presents an object-centric GNN-based student policy whose input objects are learned from images via self-supervised learning, with the aim of improving compositional generalizability in reinforcement learning tasks.
Findings
Superior performance on four challenging tasks
Effective use of self-supervised object proposals
Enhanced generalization compared to baselines
Abstract
We study how to learn a policy with compositional generalizability. We propose a two-stage framework, which refactors a high-reward teacher policy into a generalizable student policy with strong inductive bias. In particular, we implement an object-centric GNN-based student policy, whose input objects are learned from images through self-supervised learning. Empirically, we evaluate our approach on four difficult tasks that require compositional generalizability, and achieve superior performance compared to baselines.
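The teacher-to-student step described in the abstract can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not the authors' implementation: `teacher_action` is a hand-written rule standing in for a trained high-reward teacher, objects are given directly as (x, y) pairs rather than proposed from images, and permutation-invariant sum pooling stands in for the GNN. The sketch only shows the general idea of fitting a student with an object-set inductive bias to a teacher's actions.

```python
import math
import random


def teacher_action(objects):
    # Hypothetical teacher: a hand-written rule standing in for a trained,
    # high-reward policy. Action 1 if the objects' x-coordinates sum to > 0.
    return 1 if sum(x for x, _y in objects) > 0 else 0


def pool(objects):
    # Permutation-invariant sum pooling over per-object (x, y) features.
    # This is a stand-in for the GNN's aggregation, not a GNN itself.
    return (sum(x for x, _y in objects), sum(y for _x, y in objects))


def student_prob(weights, objects):
    # Probability that the student picks action 1 for this set of objects.
    sx, sy = pool(objects)
    z = weights[0] * sx + weights[1] * sy + weights[2]
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))


def random_scene(rng, num_objects):
    # Sample a scene as a list of (x, y) object features in [-1, 1].
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(num_objects)]


def distill(num_scenes=2000, lr=0.1, seed=0):
    # Second stage (sketch): fit the student to the teacher's actions with
    # stochastic gradient descent on the logistic (cross-entropy) loss.
    rng = random.Random(seed)
    weights = [0.0, 0.0, 0.0]
    for _ in range(num_scenes):
        scene = random_scene(rng, rng.randint(1, 3))  # small training scenes
        error = student_prob(weights, scene) - teacher_action(scene)
        sx, sy = pool(scene)
        weights[0] -= lr * error * sx
        weights[1] -= lr * error * sy
        weights[2] -= lr * error
    return weights


def agreement(weights, num_objects, num_scenes=500, seed=1):
    # Fraction of unseen scenes where student and teacher pick the same action.
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_scenes):
        scene = random_scene(rng, num_objects)
        student = 1 if student_prob(weights, scene) > 0.5 else 0
        hits += student == teacher_action(scene)
    return hits / num_scenes
```

Calling `agreement(distill(), num_objects=5)` compares student and teacher on scenes with more objects than any seen during distillation, which is a crude proxy for the compositional setting the abstract describes. Because the student only sees pooled object features, its output does not depend on the order or, beyond the pooled sums, the number of objects.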
Taxonomy
Topics: AI-based Problem Solving and Planning · Reinforcement Learning in Robotics · Robot Manipulation and Learning
