State Entropy Maximization with Random Encoders for Efficient Exploration
Younggyo Seo, Lili Chen, Jinwoo Shin, Honglak Lee, Pieter Abbeel, Kimin Lee

TL;DR
This paper introduces RE3, a novel exploration method in deep reinforcement learning that uses random encoders to estimate state entropy efficiently, significantly enhancing sample-efficiency in high-dimensional environments.
Contribution
The paper proposes using fixed, randomly initialized encoders to estimate state entropy in high-dimensional spaces, enabling stable and efficient exploration in deep RL.
Findings
RE3 improves sample-efficiency in locomotion and navigation tasks.
RE3 enables learning diverse behaviors without extrinsic rewards.
RE3 outperforms existing exploration methods on benchmark tasks.
Abstract
Recent exploration methods have proven to be a recipe for improving sample-efficiency in deep reinforcement learning (RL). However, efficient exploration in high-dimensional observation spaces still remains a challenge. This paper presents Random Encoders for Efficient Exploration (RE3), an exploration method that utilizes state entropy as an intrinsic reward. In order to estimate state entropy in environments with high-dimensional observations, we utilize a k-nearest neighbor entropy estimator in the low-dimensional representation space of a convolutional encoder. In particular, we find that the state entropy can be estimated in a stable and compute-efficient manner by utilizing a randomly initialized encoder, which is fixed throughout training. Our experiments show that RE3 significantly improves the sample-efficiency of both model-free and model-based RL methods on locomotion and…
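The idea in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the encoder here is a fixed random linear projection with a tanh nonlinearity standing in for the convolutional encoder, the names `make_random_encoder` and `knn_intrinsic_reward` are hypothetical, and the log(1 + distance) form of the reward is an assumption about how the k-nearest neighbor distance is turned into an intrinsic reward.

```python
import numpy as np


def make_random_encoder(obs_dim, rep_dim, seed=0):
    """Return an encoder whose weights are drawn once and never updated.

    Assumption: a random linear map plus tanh stands in for the
    randomly initialized convolutional encoder described in the abstract.
    """
    rng = np.random.default_rng(seed)
    weights = rng.normal(0.0, 1.0 / np.sqrt(obs_dim), size=(obs_dim, rep_dim))

    def encode(obs):
        # Weights are captured in the closure and stay fixed for every call.
        return np.tanh(np.asarray(obs, dtype=np.float64) @ weights)

    return encode


def knn_intrinsic_reward(reps, k=3):
    """Per-sample reward from the distance to the k-th nearest neighbor.

    reps: array of shape (N, rep_dim) with N > k.
    Assumption: reward = log(1 + distance), so isolated states score higher.
    """
    reps = np.asarray(reps, dtype=np.float64)
    # Pairwise Euclidean distances, shape (N, N).
    dists = np.linalg.norm(reps[:, None, :] - reps[None, :, :], axis=-1)
    # After sorting each row, column 0 is the distance to itself (zero),
    # so column k is the distance to the k-th nearest other sample.
    kth = np.sort(dists, axis=1)[:, k]
    return np.log1p(kth)


# Usage: encode a batch of synthetic observations and score them.
encode = make_random_encoder(obs_dim=64, rep_dim=8)
observations = np.random.default_rng(1).normal(size=(100, 64))
rewards = knn_intrinsic_reward(encode(observations), k=3)
```

Because the encoder is never trained, the representation of a given observation does not drift over time, which is one plausible reading of why the abstract describes the estimate as stable. The pairwise distance matrix costs O(N²) memory, so a practical version would compute distances in batches against a replay buffer.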
Taxonomy
Topics: Reinforcement Learning in Robotics · Human Pose and Action Recognition · Adversarial Robustness in Machine Learning
