Go Beyond Imagination: Maximizing Episodic Reachability with World Models
Yao Fu, Run Peng, Honglak Lee

TL;DR
This paper introduces GoBI, an intrinsic reward method using world models to enhance exploration in reinforcement learning, significantly improving performance on challenging navigation and locomotion tasks by maximizing reachability expansion.
Contribution
The paper proposes a novel intrinsic reward called GoBI that combines lifelong novelty with episodic reachability, utilizing learned world models to predict and incentivize exploration of new states.
Findings
Outperforms previous state-of-the-art methods on 12 of the most challenging Minigrid navigation tasks
Improves sample efficiency on DeepMind Control Suite locomotion tasks
Effectively maximizes state reachability through world model predictions
Abstract
Efficient exploration is a challenging topic in reinforcement learning, especially for sparse reward tasks. To deal with the reward sparsity, people commonly apply intrinsic rewards to motivate agents to explore the state space efficiently. In this paper, we introduce a new intrinsic reward design called GoBI - Go Beyond Imagination, which combines the traditional lifelong novelty motivation with an episodic intrinsic reward that is designed to maximize the stepwise reachability expansion. More specifically, we apply learned world models to generate predicted future states with random actions. States with more unique predictions that are not in episodic memory are assigned high intrinsic rewards. Our method greatly outperforms previous state-of-the-art methods on 12 of the most challenging Minigrid navigation tasks and improves the sample efficiency on locomotion tasks from DeepMind…
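The episodic part of the reward described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: `toy_world_model` is a hypothetical hand-written 5x5 grid standing in for a learned world model, and the function name, the sample count, and the choice to use the raw count of new predictions as the bonus are all assumptions made for illustration. The combination with lifelong novelty is left out because the text above does not specify it.

```python
import random
from typing import Callable, Optional, Set, Tuple

State = Tuple[int, int]

# Hypothetical stand-in for a learned world model: a deterministic 5x5 grid.
MOVES = [(0, 1), (0, -1), (1, 0), (-1, 0)]


def toy_world_model(state: State, action: int) -> State:
    """Predict the next state for an action, clipping at the grid border."""
    x, y = state
    dx, dy = MOVES[action]
    return (min(max(x + dx, 0), 4), min(max(y + dy, 0), 4))


def episodic_reachability_bonus(
    state: State,
    memory: Set[State],
    model: Callable[[State, int], State],
    n_actions: int = 4,
    n_samples: int = 16,
    rng: Optional[random.Random] = None,
) -> int:
    """Count predicted next states that are not yet in episodic memory.

    The current state and all new predictions are added to `memory`,
    so revisiting an already-expanded region earns no further bonus.
    """
    rng = rng or random.Random(0)
    memory.add(state)
    # Imagine one-step futures under randomly sampled actions.
    predicted = {model(state, rng.randrange(n_actions)) for _ in range(n_samples)}
    new_states = predicted - memory
    memory |= new_states
    return len(new_states)


if __name__ == "__main__":
    episode_memory: Set[State] = set()  # reset at the start of every episode
    first = episodic_reachability_bonus((2, 2), episode_memory, toy_world_model)
    again = episodic_reachability_bonus((2, 2), episode_memory, toy_world_model)
    print(first, again)
```

In this sketch the bonus is large when the agent steps into a state from which many unseen states are predicted to be reachable, and it drops to zero once those predictions are already in the episodic memory, which is the "reachability expansion" idea in the abstract.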
Taxonomy
Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Multimodal Machine Learning Applications
