Contingency-Aware Exploration in Reinforcement Learning

Jongwook Choi; Yijie Guo; Marcin Moczulski; Junhyuk Oh; Neal Wu; Mohammad Norouzi; Honglak Lee

arXiv:1811.01483·cs.LG·March 5, 2019·28 cites

Open Access

TL;DR

This paper introduces an attentive dynamics model that discovers the controllable elements of an environment and uses them to improve exploration in reinforcement learning, achieving state-of-the-art results on challenging Atari games such as Montezuma's Revenge.

Contribution

The paper presents a contingency-aware exploration method built on an attentive dynamics model that is trained in a self-supervised manner, and shows that it improves exploration in sparse-reward environments.

Findings

01. Achieved more than 11,000 points on Montezuma's Revenge without expert demonstrations or other supervisory data.

02. Demonstrated the effectiveness of contingency-awareness for exploration.

03. Improved performance over existing methods on challenging Atari games.

Abstract

This paper investigates whether learning contingency-awareness and controllable aspects of an environment can lead to better exploration in reinforcement learning. To investigate this question, we consider an instantiation of this hypothesis evaluated on the Arcade Learning Environment (ALE). In this study, we develop an attentive dynamics model (ADM) that discovers controllable elements of the observations, which are often associated with the location of the character in Atari games. The ADM is trained in a self-supervised fashion to predict the actions taken by the agent. The learned contingency information is used as a part of the state representation for exploration purposes. We demonstrate that combining an actor-critic algorithm with count-based exploration using our representation achieves impressive results on a set of Atari games that are notoriously challenging due to sparse rewards. For…
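The last two ideas in the abstract, using the learned contingency information as part of the state representation and pairing it with count-based exploration, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the attention output is a small 2D grid whose largest entry marks the controllable element, and that the bonus decays as the inverse square root of the visit count, a common choice in count-based exploration. The names `argmax_location` and `CountBonus` are hypothetical.

```python
import math
from collections import Counter


def argmax_location(attention):
    """Return (row, col) of the largest entry in a 2D attention map.

    `attention` is a list of rows; the argmax stands in for the
    location of the controllable element (an assumption of this sketch).
    """
    cells = ((r, c) for r, row in enumerate(attention) for c in range(len(row)))
    return max(cells, key=lambda rc: attention[rc[0]][rc[1]])


class CountBonus:
    """Count-based exploration bonus over a discrete, hashable state key."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.counts = Counter()

    def update(self, key):
        """Record one visit to `key` and return scale / sqrt(visit count)."""
        self.counts[key] += 1
        return self.scale / math.sqrt(self.counts[key])


# Usage: the bonus shrinks as the same location is revisited.
bonus = CountBonus()
attention = [[0.1, 0.7], [0.1, 0.1]]
key = argmax_location(attention)
first_visit = bonus.update(key)
second_visit = bonus.update(key)
```

In an actor-critic setup, a bonus of this kind would typically be added to the environment reward before computing returns. How the bonus is weighted and what else goes into the key are design choices this sketch leaves open.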

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Evolutionary Algorithms and Applications