A Max-Min Entropy Framework for Reinforcement Learning

Seungyul Han; Youngchul Sung

arXiv:2106.10517·cs.LG·December 21, 2021·5 cites

TL;DR

This paper introduces a max-min entropy framework for reinforcement learning that improves exploration by learning to visit low-entropy states and maximizing the entropy of those states. Numerical results show drastic performance improvement over current state-of-the-art algorithms.

Contribution

The paper proposes a max-min entropy framework for RL and, for general Markov decision processes, constructs an efficient algorithm based on disentangling exploration from exploitation.

Findings

01 Numerical results show drastic performance improvement over current state-of-the-art RL algorithms.

02 The policy learns to visit low-entropy states and to maximize the entropy of those states.

03 An efficient algorithm is constructed for general Markov decision processes (MDPs).

Abstract

In this paper, we propose a max-min entropy framework for reinforcement learning (RL) to overcome the limitation of the soft actor-critic (SAC) algorithm implementing the maximum entropy RL in model-free sample-based learning. Whereas the maximum entropy RL guides learning for policies to reach states with high entropy in the future, the proposed max-min entropy framework aims to learn to visit states with low entropy and maximize the entropy of these low-entropy states to promote better exploration. For general Markov decision processes (MDPs), an efficient algorithm is constructed under the proposed max-min entropy framework based on disentanglement of exploration and exploitation. Numerical results show that the proposed algorithm yields drastic performance improvement over the current state-of-the-art RL algorithms.
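
To make the quantities in the abstract concrete, the following is a minimal sketch of policy entropy per state and of the entropy-regularized return that maximum entropy RL (as in SAC) optimizes. It is not the paper's algorithm: the tabular policy, the state names, and the helper names `policy_entropy` and `soft_return` are hypothetical, and the values of `alpha` and `gamma` are arbitrary defaults chosen for illustration.

```python
import math


def policy_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def soft_return(rewards, entropies, alpha=0.2, gamma=0.99):
    """Discounted sum of r_t + alpha * H(pi(.|s_t)) along one trajectory.

    This is the standard maximum entropy RL objective evaluated on a
    single trajectory; alpha weights the entropy bonus.
    """
    return sum(
        (gamma ** t) * (r + alpha * h)
        for t, (r, h) in enumerate(zip(rewards, entropies))
    )


# Hypothetical tabular policy over 4 actions in three made-up states.
policy = {
    "s0": [0.25, 0.25, 0.25, 0.25],  # uniform: highest possible entropy
    "s1": [0.70, 0.10, 0.10, 0.10],
    "s2": [1.00, 0.00, 0.00, 0.00],  # deterministic: zero entropy
}

# Per-state policy entropy, and the state where the policy is least random.
entropies = {s: policy_entropy(p) for s, p in policy.items()}
lowest_entropy_state = min(entropies, key=entropies.get)
```

In this toy setting, `lowest_entropy_state` picks out the kind of state the abstract says the max-min framework learns to visit, so that the entropy there can then be raised. Maximum entropy RL, by contrast, rewards reaching states where the entropy term in `soft_return` is already high.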

Code & Models

Repositories

seungyulhan/mme · tf · Official

Videos

A Max-Min Entropy Framework for Reinforcement Learning · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Adversarial Robustness in Machine Learning