Improving Stochastic Action-Constrained Reinforcement Learning via Truncated Distributions

Roland Stolz; Michael Eichelbeck; Matthias Althoff

arXiv:2511.22406 · cs.LG · December 1, 2025

Open Access

TL;DR

This paper introduces efficient numerical approximations of the entropy, log-probability, and their gradients for truncated normal policies in action-constrained reinforcement learning, replacing the non-truncated approximations used in prior work and improving policy performance.

Contribution

The paper proposes numerical approximations and a sampling strategy for truncated policy distributions, making policy updates in action-constrained RL more accurate.

Findings

01 Significant performance improvements on benchmark environments

02 Accurate estimation of entropy and log-probability is crucial

03 Efficient sampling reduces computational overhead
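
To illustrate why the choice of sampler affects runtime, the sketch below compares two textbook ways of sampling a one-dimensional normal distribution truncated to an interval. It is an illustration under simplifying assumptions, not the sampling strategy proposed in the paper. The function names and the interval bounds are hypothetical. Rejection sampling needs a random number of draws that grows as the feasible interval shrinks, while inverse-CDF sampling always uses exactly one uniform draw.

```python
import random
from statistics import NormalDist


def sample_truncated_normal_rejection(mu, sigma, low, high, rng):
    """Draw from N(mu, sigma^2) until the sample lands in [low, high].

    Returns the accepted sample and the number of draws it took.
    """
    draws = 0
    while True:
        draws += 1
        x = rng.gauss(mu, sigma)
        if low <= x <= high:
            return x, draws


def sample_truncated_normal_inverse_cdf(mu, sigma, low, high, rng):
    """Sample the same truncated normal with exactly one uniform draw."""
    dist = NormalDist(mu, sigma)
    p_low, p_high = dist.cdf(low), dist.cdf(high)
    # Uniform sample restricted to the probability mass inside [low, high].
    u = p_low + (p_high - p_low) * rng.random()
    # Keep u strictly inside (0, 1), as required by inv_cdf.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    x = dist.inv_cdf(u)
    # Guard against tiny floating-point overshoot at the bounds.
    return min(max(x, low), high)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical feasible interval far from the mean (low acceptance rate).
    mu, sigma, low, high = 0.0, 1.0, 1.5, 2.0
    results = [sample_truncated_normal_rejection(mu, sigma, low, high, rng)
               for _ in range(1000)]
    mean_draws = sum(n for _, n in results) / len(results)
    print("average draws per accepted rejection sample:", mean_draws)
    print("inverse-CDF sample:",
          sample_truncated_normal_inverse_cdf(mu, sigma, low, high, rng))
```

The inverse-CDF approach only works this directly for interval constraints in one dimension. More complex constraint sets, which the paper targets, require other techniques.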

Abstract

In reinforcement learning (RL), it is often advantageous to consider additional constraints on the action space to ensure safety or action relevance. Existing work on such action-constrained RL faces challenges regarding effective policy updates, computational efficiency, and predictable runtime. Recent work proposes to use truncated normal distributions for stochastic policy gradient methods. However, the computation of key characteristics, such as the entropy, log-probability, and their gradients, becomes intractable under complex constraints. Hence, prior work approximates these using the non-truncated distributions, which severely degrades performance. We argue that accurate estimation of these characteristics is crucial in the action-constrained RL setting, and propose efficient numerical approximations for them. We also provide an efficient sampling strategy for truncated policy…
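
The gap between truncated and non-truncated quantities can be seen in the simplest possible case, a one-dimensional normal distribution truncated to an interval, where closed forms exist. The sketch below is an illustration under that assumption and not the paper's numerical approximation, which addresses constraints where no such closed form is available. All function names are hypothetical.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_log_prob(x, mu, sigma):
    """Log-density of the non-truncated normal N(mu, sigma^2)."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def truncated_normal_log_prob(x, mu, sigma, low, high):
    """Log-density of N(mu, sigma^2) truncated to [low, high]."""
    if not (low <= x <= high):
        return -math.inf
    alpha = (low - mu) / sigma
    beta = (high - mu) / sigma
    # Probability mass of the non-truncated normal inside the interval.
    mass = std_normal_cdf(beta) - std_normal_cdf(alpha)
    return normal_log_prob(x, mu, sigma) - math.log(mass)


def normal_entropy(sigma):
    """Differential entropy of the non-truncated normal."""
    return 0.5 * math.log(2.0 * math.pi * math.e * sigma * sigma)


def truncated_normal_entropy(mu, sigma, low, high):
    """Differential entropy of N(mu, sigma^2) truncated to [low, high]."""
    alpha = (low - mu) / sigma
    beta = (high - mu) / sigma
    mass = std_normal_cdf(beta) - std_normal_cdf(alpha)
    correction = (alpha * std_normal_pdf(alpha)
                  - beta * std_normal_pdf(beta)) / (2.0 * mass)
    return math.log(math.sqrt(2.0 * math.pi * math.e) * sigma * mass) + correction


if __name__ == "__main__":
    # Hypothetical policy output and action bounds.
    mu, sigma, low, high = 0.0, 1.0, -1.0, 1.0
    print("log-prob at 0, non-truncated:", normal_log_prob(0.0, mu, sigma))
    print("log-prob at 0, truncated:    ",
          truncated_normal_log_prob(0.0, mu, sigma, low, high))
    print("entropy, non-truncated:", normal_entropy(sigma))
    print("entropy, truncated:    ",
          truncated_normal_entropy(mu, sigma, low, high))
```

In this example the non-truncated formulas underestimate the log-probability of feasible actions and overestimate the entropy, and the error grows as the feasible interval becomes narrower relative to sigma.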

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning