Environment Generation for Zero-Shot Compositional Reinforcement Learning
Izzeddin Gur, Natasha Jaques, Yingjie Miao, Jongwook Choi, Manoj Tiwari, Honglak Lee, Aleksandra Faust

TL;DR
This paper introduces CoDE, a method for automatically generating compositional environments to train RL agents, enabling better learning, robustness, and zero-shot generalization to complex, multi-step tasks.
Contribution
The paper proposes a novel environment generation algorithm for compositional tasks, along with two benchmark frameworks, improving RL training and generalization in complex scenarios.
Findings
CoDE achieves a 4x higher success rate than baselines.
Successfully trains agents on real web navigation tasks.
Generates environments with multiple pages or rooms for complex task learning.
Abstract
Many real-world problems are compositional - solving them requires completing interdependent sub-tasks, either in series or in parallel, that can be represented as a dependency graph. Deep reinforcement learning (RL) agents often struggle to learn such complex tasks due to the long time horizons and sparse rewards. To address this problem, we present Compositional Design of Environments (CoDE), which trains a Generator agent to automatically build a series of compositional tasks tailored to the RL agent's current skill level. This automatic curriculum not only enables the agent to learn more complex tasks than it could have otherwise, but also selects tasks where the agent's performance is weak, enhancing its robustness and ability to generalize zero-shot to unseen tasks at test-time. We analyze why current environment generation techniques are insufficient for the problem of generating…
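The abstract describes compositional tasks as sub-tasks linked by a dependency graph, some of which must run in series and some of which can run in parallel. The sketch below illustrates only that representation, not the CoDE Generator itself. The task name `checkout_task`, the sub-task names, and the helper `execution_stages` are hypothetical and do not come from the paper. It uses Python's standard `graphlib` module to group sub-tasks into stages, where every sub-task in a stage has all of its prerequisites completed.

```python
from graphlib import TopologicalSorter


def execution_stages(dependencies):
    """Group sub-tasks into ordered stages.

    `dependencies` maps each sub-task to the set of sub-tasks that must
    be finished before it. Sub-tasks in the same stage do not depend on
    each other, so they could be completed in parallel.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    stages = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted only for stable output
        stages.append(ready)
        sorter.done(*ready)  # mark the stage finished to unlock successors
    return stages


# Hypothetical web-navigation task: two form fields can be filled in any
# order (parallel), then the remaining pages must be handled in series.
checkout_task = {
    "enter_name": set(),
    "enter_address": set(),
    "submit_shipping": {"enter_name", "enter_address"},
    "enter_card": {"submit_shipping"},
    "confirm_order": {"enter_card"},
}

stages = execution_stages(checkout_task)
for index, stage in enumerate(stages, start=1):
    print(f"stage {index}: {', '.join(stage)}")
```

Under this illustrative encoding, a longer chain of stages corresponds to a longer horizon and a sparser reward, which is the difficulty the abstract attributes to compositional tasks.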
