Viability of Future Actions: Robust Safety in Reinforcement Learning via Entropy Regularization

Pierre-François Massiani; Alexander von Rohr; Lukas Haverbeck; Sebastian Trimpe

arXiv:2506.10871·cs.LG·December 23, 2025

TL;DR

This paper examines how entropy regularization in reinforcement learning promotes robust safety by biasing learning toward maximizing the number of future viable actions. It also shows that safety constraints can be relaxed into penalties, so that standard model-free RL methods can achieve robust safety.

Contribution

The paper reveals a connection between entropy regularization and robustness in constrained RL, and proposes approximating safety constraints with penalties so that the resulting unconstrained problem can be solved with standard model-free RL.

Findings

01. Entropy regularization biases policies toward future viable actions.

02. Relaxing safety constraints with penalties approximates constrained RL with unconstrained RL.

03. The approach empirically improves robustness to disturbances while maintaining safety and optimality.
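To make the second finding concrete, the two problems can be written side by side. This is a sketch in assumed notation, not the paper's own statement: the reward r, discount γ, temperature α, failure set S_fail, and penalty weight p are symbols introduced here for illustration, and the paper's exact formulation may differ.

```latex
% Hard-constrained, entropy-regularized problem (assumed notation)
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}
  \big( r(s_t, a_t) + \alpha \, \mathcal{H}(\pi(\cdot \mid s_t)) \big) \right]
\quad \text{s.t.} \quad s_t \notin \mathcal{S}_{\mathrm{fail}}
\;\; \text{for all } t \text{ almost surely}

% Penalized, unconstrained relaxation (assumed notation)
\max_{\pi} \; \mathbb{E}_{\pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}
  \big( r(s_t, a_t) - p \, \mathbf{1}\{ s_{t+1} \in \mathcal{S}_{\mathrm{fail}} \}
  + \alpha \, \mathcal{H}(\pi(\cdot \mid s_t)) \big) \right]
```

In this reading, the penalty term replaces the hard constraint, and the abstract's claim is that the second problem can approximate the first arbitrarily closely, presumably as the penalty weight p grows.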

Abstract

Despite the many recent advances in reinforcement learning (RL), the question of learning policies that robustly satisfy state constraints under unknown disturbances remains open. In this paper, we offer a new perspective on achieving robust safety by analyzing the interplay between two well-established techniques in model-free RL: entropy regularization and constraint penalization. We reveal empirically that entropy regularization in constrained RL inherently biases learning toward maximizing the number of future viable actions, thereby promoting constraint satisfaction that is robust to action noise. Furthermore, we show that by relaxing strict safety constraints through penalties, the constrained RL problem can be approximated arbitrarily closely by an unconstrained one and thus solved using standard model-free RL. This reformulation preserves both safety and optimality while empirically…
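The interplay described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: a hypothetical 1-D corridor whose leftmost state is a terminal failure state, a zero task reward, a fixed penalty for entering the failure state, and tabular soft (entropy-regularized) value iteration in place of a model-free learner. The function name `soft_value_iteration` and all parameter names are hypothetical.

```python
import numpy as np

LEFT, STAY, RIGHT = 0, 1, 2


def soft_value_iteration(n_states=6, penalty=10.0, alpha=1.0, gamma=0.9, n_iters=500):
    """Soft value iteration on a toy corridor (illustrative assumption only).

    State 0 is a terminal failure state (constraint violation).
    The task reward is zero; entering state 0 costs `penalty`.
    `alpha` is the entropy temperature, `gamma` the discount factor.
    """
    moves = np.array([-1, 0, 1])          # displacement for LEFT, STAY, RIGHT
    V = np.zeros(n_states)                # V[0] stays 0: failure is terminal
    Q = np.zeros((n_states, 3))
    for _ in range(n_iters):
        for s in range(1, n_states):
            for a, m in enumerate(moves):
                s_next = min(s + m, n_states - 1)      # right wall: stay in place
                r = -penalty if s_next == 0 else 0.0   # penalized constraint
                Q[s, a] = r + gamma * V[s_next]
        # Soft Bellman backup: V(s) = alpha * logsumexp(Q(s, .) / alpha)
        q = Q[1:] / alpha
        q_max = q.max(axis=1, keepdims=True)
        V[1:] = alpha * (q_max[:, 0] + np.log(np.exp(q - q_max).sum(axis=1)))
    # Softmax policy induced by the soft values (row 0 unused: terminal state)
    pi = np.zeros_like(Q)
    pi[1:] = np.exp((Q[1:] - V[1:, None]) / alpha)
    return V, pi


if __name__ == "__main__":
    V, pi = soft_value_iteration()
    np.set_printoptions(precision=4, suppress=True)
    print("soft values:", V)
    print("policy next to the failure state (left, stay, right):", pi[1])
```

In this toy setting the only source of value is the entropy bonus, so states from which more actions avoid failure receive higher soft values. Next to the failure state, the resulting policy therefore prefers moving away from the boundary over staying, and assigns very little probability to the failing action. This mirrors, in a much simplified form, the bias toward future viable actions that the abstract describes.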

Code & Models

Repositories

data-science-in-mechanical-engineering/entropy_robustness (official)

Taxonomy

Methods: Entropy Regularization