Maximum Entropy RL (Provably) Solves Some Robust RL Problems

Benjamin Eysenbach; Sergey Levine

arXiv:2103.06257 · cs.LG · May 6, 2022 · 28 cites

TL;DR

This paper proves that maximum entropy (MaxEnt) reinforcement learning maximizes a lower bound on a robust RL objective, so the policies it learns are robust to some disturbances in the dynamics and the reward function without extra modifications.

Contribution

To the best of the authors' knowledge, the work offers the first rigorous proof and theoretical characterization of the MaxEnt RL robust set, that is, the disturbances to the dynamics and reward function against which MaxEnt RL is robust.

Findings

01

MaxEnt RL maximizes a lower bound on a robust RL objective.

02

MaxEnt RL is robust to certain disturbances without additional modifications.

03

The paper provides formal guarantees for MaxEnt RL's robustness.
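
The first finding can be stated schematically by writing out the two objectives it relates. The notation below is an assumption chosen for illustration and is not copied from the paper: \pi is the policy, p the dynamics, r the reward, \alpha > 0 an entropy weight, and \tilde{\mathcal{P}}, \tilde{\mathcal{R}} are generic sets of perturbed dynamics and rewards.

```latex
% Standard textbook forms of the two objectives; the symbols are
% illustrative assumptions, not the paper's own notation.
\begin{align*}
J_{\text{MaxEnt}}(\pi; p, r)
  &= \mathbb{E}_{\pi,\, p}\Big[\sum_{t} r(s_t, a_t)
     + \alpha\, \mathcal{H}\big(\pi(\cdot \mid s_t)\big)\Big], \\
J_{\text{robust}}(\pi)
  &= \min_{\tilde{p} \in \tilde{\mathcal{P}},\; \tilde{r} \in \tilde{\mathcal{R}}}
     \mathbb{E}_{\pi,\, \tilde{p}}\Big[\sum_{t} \tilde{r}(s_t, a_t)\Big].
\end{align*}
```

The paper's claim is that maximizing the first quantity maximizes a lower bound on the second for particular disturbance sets. The exact form of that bound and of the sets is given in the paper and is not reproduced here.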

Abstract

Many potential applications of reinforcement learning (RL) require guarantees that the agent will perform well in the face of disturbances to the dynamics or reward function. In this paper, we prove theoretically that maximum entropy (MaxEnt) RL maximizes a lower bound on a robust RL objective, and thus can be used to learn policies that are robust to some disturbances in the dynamics and the reward function. While this capability of MaxEnt RL has been observed empirically in prior work, to the best of our knowledge our work provides the first rigorous proof and theoretical characterization of the MaxEnt RL robust set. While a number of prior robust RL algorithms have been designed to handle similar disturbances to the reward function or dynamics, these methods typically require additional moving parts and hyperparameters on top of a base RL algorithm. In contrast, our results suggest…
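
As a minimal sketch of what the MaxEnt objective rewards, consider the single-step (bandit) case, where the objective is expected reward plus a weighted policy entropy. The reward values, the temperature `alpha`, and all function names below are hypothetical and are not taken from the paper; the example only illustrates the well-known fact that a softmax policy maximizes this one-step objective, and it says nothing about the paper's robustness bound itself.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def maxent_objective(probs, rewards, alpha):
    """One-step MaxEnt objective: expected reward + alpha * entropy."""
    expected_reward = sum(p * r for p, r in zip(probs, rewards))
    return expected_reward + alpha * entropy(probs)


def softmax_policy(rewards, alpha):
    """Policy proportional to exp(r / alpha), computed stably."""
    m = max(rewards)
    weights = [math.exp((r - m) / alpha) for r in rewards]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical three-action problem (values chosen for illustration only).
rewards = [1.0, 0.5, 0.0]
alpha = 0.5

soft = softmax_policy(rewards, alpha)
greedy = [1.0, 0.0, 0.0]          # deterministic, zero entropy
uniform = [1.0 / 3.0] * 3         # maximum entropy, ignores rewards

soft_value = maxent_objective(soft, rewards, alpha)
greedy_value = maxent_objective(greedy, rewards, alpha)
uniform_value = maxent_objective(uniform, rewards, alpha)

print("softmax policy:", [round(p, 3) for p in soft])
print("objective (softmax, greedy, uniform):",
      round(soft_value, 4), round(greedy_value, 4), round(uniform_value, 4))
```

The softmax policy keeps probability mass on suboptimal actions, which is the stochasticity that the robustness results concern; the greedy policy scores lower on this objective because it has zero entropy.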

Videos

Maximum Entropy RL (Provably) Solves Some Robust RL Problems · slideslive

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research · Adversarial Robustness in Machine Learning