A Study of Global and Episodic Bonuses for Exploration in Contextual MDPs
Mikael Henaff, Minqi Jiang, Roberta Raileanu

TL;DR
This paper analyzes the effectiveness of global and episodic bonuses for exploration in contextual MDPs, revealing their complementary strengths and proposing a combined approach that improves performance across diverse tasks.
Contribution
The paper provides a conceptual framework for understanding global and episodic bonuses, and introduces a new algorithm that outperforms previous methods on multiple benchmarks.
Findings
Episodic bonuses are most effective when there is little shared structure across episodes.
Global bonuses are more effective when there is more shared structure.
Combining both bonuses yields robust performance across various environments.
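The distinction behind these findings can be made concrete with a small sketch. A global bonus is computed from the agent's entire training experience, while an episodic bonus uses only the current episode. The code below is an illustration under stated assumptions, not the paper's algorithm: the `CountBonus` class, the `1/sqrt(N)` count-based bonus, the `run_episodes` helper, and the product used to combine the two bonuses are all hypothetical choices made for this example.

```python
from collections import Counter
import math


class CountBonus:
    """Count-based novelty bonus 1/sqrt(N(s)).

    An illustrative stand-in for a novelty bonus, not the paper's method.
    """

    def __init__(self):
        self.counts = Counter()

    def update(self, state):
        # Record the visit, then return the bonus for the updated count.
        self.counts[state] += 1
        return 1.0 / math.sqrt(self.counts[state])

    def reset(self):
        self.counts.clear()


def run_episodes(episodes):
    """Return per-step (global, episodic, combined) bonuses for each episode."""
    global_bonus = CountBonus()    # never reset: sees all training experience
    episodic_bonus = CountBonus()  # reset at the start of every episode
    results = []
    for states in episodes:
        episodic_bonus.reset()
        steps = []
        for s in states:
            g = global_bonus.update(s)
            e = episodic_bonus.update(s)
            # The product is an assumed combination rule for illustration.
            steps.append((g, e, g * e))
        results.append(steps)
    return results


if __name__ == "__main__":
    episodes = [["a", "b", "a"], ["a", "c"]]
    for i, steps in enumerate(run_episodes(episodes)):
        for g, e, c in steps:
            print(f"episode {i}: global={g:.3f} episodic={e:.3f} combined={c:.3f}")
```

In the second episode, state "a" is new to the episodic counter but already familiar to the global one, so the two bonuses disagree. Whether that disagreement helps or hurts depends on how much structure the episodes share, which is the question the findings above address.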
Abstract
Exploration in environments which differ across episodes has received increasing attention in recent years. Current methods use some combination of global novelty bonuses, computed using the agent's entire training experience, and episodic novelty bonuses, computed using only experience from the current episode. However, the use of these two types of bonuses has been ad-hoc and poorly understood. In this work, we shed light on the behavior of these two types of bonuses through controlled experiments on easily interpretable tasks as well as challenging pixel-based settings. We find that the two types of bonuses succeed in different settings, with episodic bonuses being most effective when there is little shared structure across episodes and global bonuses being effective when more structure is shared. We develop a conceptual framework which makes this notion of shared structure…
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Multimodal Machine Learning Applications
