Generative Adversarial Exploration for Reinforcement Learning

Weijun Hong; Menghui Zhu; Minghuan Liu; Weinan Zhang; Ming Zhou; Yong Yu; Peng Sun

arXiv:2201.11685 · cs.LG · January 28, 2022 · 1 citation

TL;DR

This paper introduces GAEX, a novel exploration method for reinforcement learning that uses a generative adversarial network to encourage agents to visit less familiar states, improving performance on complex exploration tasks.

Contribution

The paper presents the first use of GANs for RL exploration, pairing a generator with a discriminator to identify novel states and encourage the agent to visit them.

Findings

1. GAEX improves exploration in challenging RL environments.

2. GAEX outperforms baseline methods on Montezuma's Revenge and Super Mario Bros.

3. The approach is simple to implement and computationally efficient.

Abstract

Exploration is crucial for training an optimal reinforcement learning (RL) policy, and the key is to discriminate whether a visited state is novel. Most previous work focuses on designing heuristic rules or distance metrics to check whether a state is novel, without considering that this discrimination process can itself be learned. In this paper, we propose a novel method called generative adversarial exploration (GAEX), which encourages exploration in RL by introducing an intrinsic reward output by a generative adversarial network, where the generator provides fake samples of states that help the discriminator identify less frequently visited states. The agent is thus encouraged to visit states that the discriminator is less confident to judge as visited. GAEX is easy to implement and trains efficiently. In our experiments, we apply GAEX to DQN and the DQN-GAEX…
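The intrinsic-reward idea in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: states are single numbers in [0, 1], the discriminator is a one-feature logistic model, the reward form `beta * (1 - D(s))` is one plausible choice, and the learned generator is replaced by a fixed uniform sampler, so no adversarial training of the generator takes place here. The names `Discriminator`, `intrinsic_reward`, and `beta` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class Discriminator:
    """Toy logistic model: D(s) estimates P(state s was visited by the agent)."""

    def __init__(self):
        self.w = 0.0
        self.b = 0.0

    def prob_visited(self, s):
        return sigmoid(self.w * s + self.b)

    def update(self, s, label, lr=0.1):
        # One SGD step on binary cross-entropy (label 1 = visited, 0 = fake).
        err = self.prob_visited(s) - label
        self.w -= lr * err * s
        self.b -= lr * err


def intrinsic_reward(disc, s, beta=1.0):
    # Assumed reward form: larger when the discriminator is less confident
    # that s has been visited.
    return beta * (1.0 - disc.prob_visited(s))


rng = random.Random(0)
disc = Discriminator()
for _ in range(5000):
    # "Visited" states cluster near 0.2, standing in for the agent's experience.
    visited = min(1.0, max(0.0, rng.gauss(0.2, 0.05)))
    # Stand-in for generator output: a fixed uniform sampler (assumption).
    fake = rng.uniform(0.0, 1.0)
    disc.update(visited, 1.0)
    disc.update(fake, 0.0)

r_familiar = intrinsic_reward(disc, 0.2)  # state the agent sees often
r_novel = intrinsic_reward(disc, 0.9)     # state the agent rarely sees
print(r_familiar, r_novel)
```

In a DQN-style agent, a bonus like this would typically be added to the environment reward before the Q-learning update. How GAEX weights and combines the two rewards is not stated in the text above.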

Taxonomy

Topics: Reinforcement Learning in Robotics · Generative Adversarial Networks and Image Synthesis · Artificial Intelligence in Games

Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network