Explore and Control with Adversarial Surprise
Arnaud Fickinger, Natasha Jaques, Samyak Parajuli, Michael Chang, Nicholas Rhinehart, Glen Berseth, Stuart Russell, Sergey Levine

TL;DR
This paper introduces an adversarial unsupervised RL method where two policies compete to explore and control a stochastic environment, leading to diverse skill acquisition and improved transfer performance.
Contribution
The paper proposes a novel adversarial RL framework that maximizes environment coverage and skill diversity in high-dimensional stochastic settings, with theoretical and empirical validation.
Findings
Maximizes state coverage in stochastic environments
Leads to emergence of complex, meaningful skills
Outperforms existing unsupervised RL methods in exploration and transfer
Abstract
Unsupervised reinforcement learning (RL) studies how to leverage environment statistics to learn useful behaviors without the cost of reward engineering. However, a central challenge in unsupervised RL is to extract behaviors that meaningfully affect the world and cover the range of possible outcomes, without getting distracted by inherently unpredictable, uncontrollable, and stochastic elements in the environment. To this end, we propose an unsupervised RL method designed for high-dimensional, stochastic environments based on an adversarial game between two policies (which we call Explore and Control) controlling a single body and competing over the amount of observation entropy the agent experiences. The Explore agent seeks out states that maximally surprise the Control agent, which in turn aims to minimize surprise, and thereby manipulate the environment to return to familiar and…
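The opposing objectives described in the abstract can be illustrated with a small sketch. It assumes, for illustration only, that surprise is the negative log-probability of an observation under a density model fitted to the observations seen so far, and that observations are discrete. The names `CountDensity` and `surprise_rewards` are hypothetical and are not taken from the paper's code.

```python
import math
from collections import Counter


class CountDensity:
    """Hypothetical Laplace-smoothed categorical density over discrete observations."""

    def __init__(self, num_obs):
        self.num_obs = num_obs   # number of distinct observation values (assumed known)
        self.counts = Counter()  # visit count per observation
        self.total = 0           # total observations seen so far

    def update(self, obs):
        self.counts[obs] += 1
        self.total += 1

    def log_prob(self, obs):
        # Add-one smoothing keeps the probability of unseen observations nonzero.
        return math.log((self.counts[obs] + 1) / (self.total + self.num_obs))


def surprise_rewards(obs, density):
    """Return zero-sum rewards for one observation, then update the density model.

    Assumption: Explore is rewarded by surprise and Control by its negation.
    """
    surprise = -density.log_prob(obs)
    density.update(obs)
    return {"explore": surprise, "control": -surprise}


if __name__ == "__main__":
    density = CountDensity(num_obs=4)
    for obs in [0, 0, 0, 3]:
        print(obs, surprise_rewards(obs, density))
```

Under these assumptions, repeatedly visiting the same observation lowers its surprise, which favors Control, while reaching a rarely seen observation raises it, which favors Explore.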
Taxonomy
Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Neural dynamics and brain function
