Reinforcement Learning for Sparse-Reward Object-Interaction Tasks in a First-person Simulated 3D Environment
Wilka Carvalho, Anthony Liang, Kimin Lee, Sungryull Sohn, Honglak Lee, Richard L. Lewis, Satinder Singh

TL;DR
This paper introduces an object-centric reinforcement learning approach that learns an attentive object-model as an auxiliary task in a first-person 3D simulated environment, improving sample efficiency on sparse-reward object-interaction tasks without reward shaping, ground-truth object information, or expert demonstrations.
Contribution
It proposes a novel attentive object-model as an auxiliary task, enabling faster and more effective learning without supervision in complex 3D environments.
Findings
The method learns faster than alternative auxiliary tasks.
It achieves success rates close to those of models given ground-truth object information.
Object-attention improves object-category and object-state representation learning.
Abstract
First-person object-interaction tasks in high-fidelity, 3D, simulated environments such as the AI2Thor virtual home-environment pose significant sample-efficiency challenges for reinforcement learning (RL) agents learning from sparse task rewards. To alleviate these challenges, prior work has provided extensive supervision via a combination of reward-shaping, ground-truth object-information, and expert demonstrations. In this work, we show that one can learn object-interaction tasks from scratch without supervision by learning an attentive object-model as an auxiliary task during task learning with an object-centric relational RL agent. Our key insight is that learning an object-model that incorporates object-attention into forward prediction provides a dense learning signal for unsupervised representation learning of both objects and their relationships. This, in turn, enables faster…
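To make the idea of "object-attention in forward prediction" concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the architecture or the loss, so the dot-product attention, the residual update, the mean-squared-error objective, and every name below (`attentive_forward_prediction`, `auxiliary_loss`, `W_q`, `W_k`, `W_out`) are hypothetical stand-ins.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attentive_forward_prediction(objects, action, params):
    """Predict next-step object embeddings (hypothetical sketch).

    objects: (n, d) array, one embedding per object.
    action:  (a,) array, e.g. a one-hot action encoding.
    """
    # Each object attends over all objects (assumed dot-product attention).
    q = objects @ params["W_q"]
    k = objects @ params["W_k"]
    scores = q @ k.T / np.sqrt(q.shape[-1])
    attn = softmax(scores, axis=-1)          # (n, n), rows sum to 1
    context = attn @ objects                 # (n, d) attended summary

    # Condition the prediction on the object, its context, and the action.
    n = objects.shape[0]
    act = np.broadcast_to(action, (n, action.shape[0]))
    inputs = np.concatenate([objects, context, act], axis=-1)
    delta = np.tanh(inputs @ params["W_out"])
    return objects + delta, attn             # assumed residual update


def auxiliary_loss(pred_next, true_next):
    # MSE is an assumption; it gives a signal at every step, reward or not.
    return float(np.mean((pred_next - true_next) ** 2))


rng = np.random.default_rng(0)
n, d, a = 4, 8, 3
params = {
    "W_q": rng.normal(scale=0.1, size=(d, d)),
    "W_k": rng.normal(scale=0.1, size=(d, d)),
    "W_out": rng.normal(scale=0.1, size=(2 * d + a, d)),
}
objects = rng.normal(size=(n, d))      # placeholder current embeddings
action = np.eye(a)[1]                  # placeholder one-hot action
true_next = rng.normal(size=(n, d))    # placeholder next-step embeddings

pred_next, attn = attentive_forward_prediction(objects, action, params)
loss = auxiliary_loss(pred_next, true_next)
```

In a full agent, an auxiliary loss of this kind would typically be added to the RL objective with a weighting coefficient, so that the representation receives a gradient on every transition even when the task reward is zero.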
