A Max-Min Entropy Framework for Reinforcement Learning
Seungyul Han, Youngchul Sung

TL;DR
This paper introduces a max-min entropy framework for reinforcement learning that improves exploration by learning to visit low-entropy states and maximizing the entropy of those states; the authors report large performance gains over existing algorithms.
Contribution
It proposes a novel max-min entropy framework for RL that disentangles exploration and exploitation, improving exploration efficiency and overall performance.
Findings
Drastic performance improvements over state-of-the-art RL algorithms in the reported numerical results.
The policy learns to visit low-entropy states and to maximize the entropy of those states.
Framework applicable to general Markov decision processes.
Abstract
In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome a limitation of the soft actor-critic (SAC) algorithm, which implements maximum entropy RL, in model-free sample-based learning. Whereas maximum entropy RL guides policies toward states with high entropy in the future, the proposed max-min entropy framework aims to learn to visit states with low entropy and to maximize the entropy of these low-entropy states, thereby promoting better exploration. For general Markov decision processes (MDPs), an efficient algorithm is constructed under the proposed max-min entropy framework based on the disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields drastic performance improvements over current state-of-the-art RL algorithms.
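For reference, the maximum entropy objective that the abstract contrasts against adds an entropy bonus to each reward along a trajectory. The following is a minimal sketch in Python, assuming a discrete action space; the names `policy_entropy`, `soft_return`, `alpha`, and `gamma` are hypothetical and chosen for illustration. It shows only the standard maximum entropy return, not the max-min algorithm proposed in the paper.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    # Terms with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_return(rewards, action_probs, alpha=0.2, gamma=0.99):
    """Discounted sum of r_t + alpha * H(pi(.|s_t)) along one trajectory.

    rewards:      list of scalar rewards r_0, r_1, ...
    action_probs: list of action distributions pi(.|s_t), one per step
    alpha:        entropy temperature (assumed value, not from the paper)
    gamma:        discount factor (assumed value, not from the paper)
    """
    total = 0.0
    for t, (r, probs) in enumerate(zip(rewards, action_probs)):
        total += (gamma ** t) * (r + alpha * policy_entropy(probs))
    return total
```

With `alpha` set to zero, `soft_return` reduces to the ordinary discounted return; a larger `alpha` rewards trajectories that pass through states where the policy is more random.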
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Adversarial Robustness in Machine Learning
