Explore and Control with Adversarial Surprise

Arnaud Fickinger; Natasha Jaques; Samyak Parajuli; Michael Chang; Nicholas Rhinehart; Glen Berseth; Stuart Russell; Sergey Levine

arXiv:2107.07394·cs.LG·December 30, 2021

TL;DR

This paper introduces an adversarial unsupervised RL method in which two policies, Explore and Control, control a single body in a stochastic environment and compete over the observation entropy the agent experiences, leading to diverse skill acquisition and improved transfer performance.

Contribution

The paper proposes a novel adversarial RL framework that maximizes environment coverage and skill diversity in high-dimensional stochastic settings, with theoretical and empirical validation.

Findings

01. Maximizes state coverage in stochastic environments

02. Leads to emergence of complex, meaningful skills

03. Outperforms existing unsupervised RL methods in exploration and transfer

Abstract

Unsupervised reinforcement learning (RL) studies how to leverage environment statistics to learn useful behaviors without the cost of reward engineering. However, a central challenge in unsupervised RL is to extract behaviors that meaningfully affect the world and cover the range of possible outcomes, without getting distracted by inherently unpredictable, uncontrollable, and stochastic elements in the environment. To this end, we propose an unsupervised RL method designed for high-dimensional, stochastic environments based on an adversarial game between two policies (which we call Explore and Control) controlling a single body and competing over the amount of observation entropy the agent experiences. The Explore agent seeks out states that maximally surprise the Control agent, which in turn aims to minimize surprise, and thereby manipulate the environment to return to familiar and…
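The opposing objectives described in the abstract can be illustrated with a minimal sketch. It assumes discrete observations and a Laplace-smoothed count model as the density estimate; the class `CountDensityModel` and the function `adversarial_rewards` are hypothetical names chosen for illustration and are not taken from the paper or its repository. Surprise is taken here to be the negative log-probability of an observation under the model, so that its average over visited observations serves as an estimate of observation entropy.

```python
import math
from collections import Counter


class CountDensityModel:
    """Hypothetical Laplace-smoothed categorical model over discrete observations."""

    def __init__(self, num_observations: int):
        self.num_observations = num_observations  # size of the observation alphabet
        self.counts = Counter()
        self.total = 0

    def update(self, obs: int) -> None:
        # Record one more visit to this observation.
        self.counts[obs] += 1
        self.total += 1

    def surprise(self, obs: int) -> float:
        # Negative log-probability under add-one smoothing.
        prob = (self.counts[obs] + 1) / (self.total + self.num_observations)
        return -math.log(prob)


def adversarial_rewards(model: CountDensityModel, obs: int) -> tuple[float, float]:
    """Return (explore_reward, control_reward) for one observation.

    Assumption: the two rewards are exact negatives of each other,
    which makes the game zero-sum in this sketch.
    """
    s = model.surprise(obs)
    return s, -s


# Usage: observation 0 has been seen often, observation 1 never.
model = CountDensityModel(num_observations=4)
for _ in range(3):
    model.update(0)

familiar = adversarial_rewards(model, 0)
novel = adversarial_rewards(model, 1)
```

In this sketch the novel observation yields a higher Explore reward and a lower Control reward than the familiar one, which mirrors the competition over surprise that the abstract describes. A high-dimensional setting would need a richer density model than raw counts.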

Code & Models

Repositories

ArnaudFickinger/adversarial-surprise
pytorch

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Neural dynamics and brain function