State Entropy Maximization with Random Encoders for Efficient Exploration

Younggyo Seo; Lili Chen; Jinwoo Shin; Honglak Lee; Pieter Abbeel; Kimin Lee

arXiv:2102.09430 · cs.LG · June 22, 2021 · 22 cites

TL;DR

This paper introduces RE3, an exploration method for deep reinforcement learning that uses state entropy as an intrinsic reward. It estimates that entropy with a fixed, randomly initialized encoder, which improves sample-efficiency in environments with high-dimensional observations.

Contribution

The paper proposes using fixed, randomly initialized encoders to estimate state entropy in high-dimensional spaces, enabling stable and efficient exploration in deep RL.

Findings

01. RE3 improves sample-efficiency in locomotion and navigation tasks.
02. RE3 enables learning diverse behaviors without extrinsic rewards.
03. RE3 outperforms existing exploration methods on benchmark tasks.

Abstract

Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL). However, efficient exploration in high-dimensional observation spaces still remains a challenge. This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that utilizes state entropy as an intrinsic reward. In order to estimate state entropy in environments with high-dimensional observations, we utilize a k-nearest neighbor entropy estimator in the low-dimensional representation space of a convolutional encoder. In particular, we find that the state entropy can be estimated in a stable and compute-efficient manner by utilizing a randomly initialized encoder, which is fixed throughout training. Our experiments show that RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on locomotion and…
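The idea in the abstract can be illustrated with a minimal sketch: encode observations with a fixed random encoder, then reward each state by the distance to its k-th nearest neighbor in representation space. Everything below is an assumption made for illustration and is not taken from this page: a random linear projection stands in for the convolutional encoder, the reward uses a log(distance + 1) form, and the names `make_random_encoder` and `knn_intrinsic_reward` are hypothetical.

```python
import numpy as np


def make_random_encoder(obs_dim, rep_dim, seed=0):
    """Return a fixed random encoder (hypothetical stand-in for a conv encoder).

    The weights are sampled once and never updated, mirroring the
    'randomly initialized, fixed throughout training' idea.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(obs_dim, rep_dim)) / np.sqrt(obs_dim)

    def encode(obs):
        # obs: array of shape (n, obs_dim) -> representations (n, rep_dim)
        return np.asarray(obs, dtype=np.float64) @ weights

    return encode


def knn_intrinsic_reward(reps, k=3):
    """k-NN distance based intrinsic reward in representation space.

    Assumed reward form: log(distance to the k-th nearest neighbor + 1).
    States in sparsely visited regions get larger rewards.
    """
    reps = np.asarray(reps, dtype=np.float64)
    n = reps.shape[0]
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of states")
    # Pairwise Euclidean distances, shape (n, n).
    diffs = reps[:, None, :] - reps[None, :, :]
    dists = np.linalg.norm(diffs, axis=-1)
    # After sorting each row, column 0 is the zero self-distance,
    # so column k holds the distance to the k-th nearest neighbor.
    knn_dist = np.sort(dists, axis=1)[:, k]
    return np.log(knn_dist + 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    observations = rng.normal(scale=0.01, size=(10, 8))  # tight cluster
    observations[-1] = 100.0                             # one novel state
    encode = make_random_encoder(obs_dim=8, rep_dim=4)
    rewards = knn_intrinsic_reward(encode(observations), k=3)
    print(rewards.round(3))  # the last (novel) state receives the largest reward
```

The brute-force pairwise distance matrix costs O(n²) memory, which is acceptable for a small batch in a sketch like this; a practical implementation would likely compute distances against a replay buffer in chunks.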

Code & Models

Videos

State Entropy Maximization with Random Encoders for Efficient Exploration · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Adversarial Robustness in Machine Learning