State Entropy Regularization for Robust Reinforcement Learning

Yonatan Ashlag; Uri Koren; Mirco Mutti; Esther Derman; Pierre-Luc Bacon; Shie Mannor

arXiv:2506.07085 · cs.LG · December 2, 2025

TL;DR

This paper shows that state entropy regularization improves robustness in reinforcement learning to structured, spatially correlated perturbations. It gives formal guarantees under reward and transition uncertainty and contrasts the method with policy entropy regularization.

Contribution

The paper gives a theoretical analysis of the robustness benefits of state entropy regularization, which it describes as previously unstudied, and compares the method with policy entropy regularization, highlighting its effectiveness against correlated perturbations.

Findings

01  State entropy improves robustness to structured perturbations.

02  Theoretical guarantees are provided for reward and transition uncertainties.

03  Robustness benefits are sensitive to the number of rollouts.

Abstract

State entropy regularization has empirically shown better exploration and sample complexity in reinforcement learning (RL). However, its theoretical guarantees have not been studied. In this paper, we show that state entropy regularization improves robustness to structured and spatially correlated perturbations. These types of variation are common in transfer learning but often overlooked by standard robust RL methods, which typically focus on small, uncorrelated changes. We provide a comprehensive characterization of these robustness properties, including formal guarantees under reward and transition uncertainty, as well as settings where the method performs poorly. Much of our analysis contrasts state entropy with the widely used policy entropy regularization, highlighting their different benefits. Finally, from a practical standpoint, we illustrate that compared with policy entropy,…
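To make the contrast between the two regularizers concrete, here is a minimal tabular sketch, not the authors' implementation. It assumes a finite MDP, takes the state distribution to be the discounted occupancy d_pi = (1 - gamma) * mu^T (I - gamma * P_pi)^{-1}, and measures both entropies in nats. The function names, array layouts, and the weight `alpha` are illustrative assumptions, and the paper's exact definitions may differ.

```python
import numpy as np


def state_occupancy(P, pi, mu, gamma):
    """Discounted state occupancy of policy pi (assumed definition).

    P:  (S, A, S) transition probabilities
    pi: (S, A) action probabilities per state
    mu: (S,) initial state distribution
    """
    n_states = P.shape[0]
    # State-to-state transition matrix induced by the policy.
    P_pi = np.einsum("sa,sat->st", pi, P)
    # Solve d^T = (1 - gamma) * mu^T (I - gamma * P_pi)^{-1} for d.
    A = (np.eye(n_states) - gamma * P_pi).T
    return (1.0 - gamma) * np.linalg.solve(A, mu)


def entropy(p, eps=1e-12):
    """Shannon entropy in nats; clipping avoids log(0)."""
    p = np.clip(p, eps, 1.0)
    return float(-np.sum(p * np.log(p)))


def entropy_terms(P, r, pi, mu, gamma):
    """Return, state entropy, and occupancy-weighted policy entropy.

    r: (S, A) rewards.
    """
    d = state_occupancy(P, pi, mu, gamma)
    r_pi = np.sum(pi * r, axis=1)              # expected reward per state
    ret = float(d @ r_pi) / (1.0 - gamma)      # discounted return
    state_ent = entropy(d)                     # entropy of visited states
    policy_ent = float(sum(d[s] * entropy(pi[s]) for s in range(len(d))))
    return ret, state_ent, policy_ent


def state_entropy_objective(P, r, pi, mu, gamma, alpha):
    """Return plus alpha times state entropy (alpha is a hypothetical weight)."""
    ret, state_ent, _ = entropy_terms(P, r, pi, mu, gamma)
    return ret + alpha * state_ent
```

In this sketch the state entropy term depends on which states the policy reaches, while the policy entropy term depends only on how random the action choice is in each state. A policy can therefore score high on one and low on the other.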

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control