ELEMENT: Episodic and Lifelong Exploration via Maximum Entropy
Hongming Li, Shujian Yu, Bin Liu, Jose C. Principe

TL;DR
ELEMENT introduces a multiscale, intrinsically motivated RL framework that enhances exploration without extrinsic rewards, using episodic entropy maximization and efficient kNN-based entropy estimation to improve transfer and scalability.
Contribution
It proposes a novel multiscale entropy optimization, an episodic maximum entropy approach, and a kNN-based entropy estimation method for improved lifelong exploration in RL.
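For readers unfamiliar with kNN-based entropy estimation, here is a minimal sketch of the classic Kozachenko-Leonenko estimator. It is a generic textbook estimator, not necessarily the variant ELEMENT uses; the function name `knn_entropy`, the default `k=3`, and the brute-force neighbor search are assumptions made for illustration.

```python
import math
import random


def knn_entropy(states, k=3):
    """Kozachenko-Leonenko kNN estimate of differential entropy (in nats).

    Illustrative only: brute-force O(n^2) neighbor search, Euclidean metric.
    `states` is a sequence of equal-length numeric tuples.
    """
    n = len(states)
    d = len(states[0])
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < number of states")

    # log-volume of the unit ball in d dimensions: pi^(d/2) / Gamma(d/2 + 1)
    log_unit_ball = (d / 2) * math.log(math.pi) - math.lgamma(d / 2 + 1)

    # digamma(n) - digamma(k) for integers equals sum_{j=k}^{n-1} 1/j
    digamma_diff = sum(1.0 / j for j in range(k, n))

    log_dist_sum = 0.0
    for i, s in enumerate(states):
        # distances from state i to every other state, sorted ascending
        dists = sorted(math.dist(s, t) for j, t in enumerate(states) if j != i)
        # distance to the k-th nearest neighbor; clamp to avoid log(0) on duplicates
        log_dist_sum += math.log(max(dists[k - 1], 1e-12))

    return digamma_diff + log_unit_ball + d * log_dist_sum / n


# Hypothetical usage: states spread over a wider area yield a higher estimate.
rng = random.Random(0)
narrow = [(rng.random(), rng.random()) for _ in range(100)]
wide = [(4 * x, 4 * y) for x, y in narrow]
print(knn_entropy(narrow), knn_entropy(wide))
```

The brute-force search is quadratic in the number of states, which hints at why naive kNN entropy estimation becomes costly over millions of observations; a spatial index or batched GPU distance computation would be the usual remedy.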
Findings
Outperforms state-of-the-art intrinsic rewards in exploration tasks.
Effective in task-agnostic pre-training and offline RL data collection.
Reduces computational costs significantly compared to previous methods.
Abstract
This paper proposes Episodic and Lifelong Exploration via Maximum ENTropy (ELEMENT), a novel, multiscale, intrinsically motivated reinforcement learning (RL) framework that explores environments without using any extrinsic reward and effectively transfers the learned skills to downstream tasks. We advance the state of the art in three ways. First, we propose a multiscale entropy optimization to address the fact that previous maximum state entropy methods, when applied to lifelong exploration with millions of state observations, suffer from vanishing rewards and become computationally very expensive across iterations. Therefore, we add an episodic maximum entropy objective over each episode to further speed up the search. Second, we propose a novel intrinsic reward for episodic entropy maximization, named average episodic state entropy, which provides the optimal solution for a theoretical…
Taxonomy
Topics: Advanced Thermodynamics and Statistical Mechanics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
