Safety-Aware Reinforcement Learning for Control via Risk-Sensitive Action-Value Iteration and Quantile Regression

Clinton Enwerem; Aniruddh G. Puranic; John S. Baras; Calin Belta

arXiv:2506.06954 · cs.LG · December 9, 2025

TL;DR

This paper introduces a safety-aware reinforcement learning algorithm that uses risk-sensitive quantile regression and CVaR to improve safety and performance in stochastic control tasks.

Contribution

It proposes a novel risk-regularized quantile-based RL method with theoretical guarantees and practical benefits for safety-critical applications.

Findings

01. Achieves higher goal success rates in simulations.

02. Reduces collision rates compared to risk-neutral methods.

03. Provides convergence guarantees for the risk-sensitive Bellman operator.

Abstract

Mainstream approximate action-value iteration reinforcement learning (RL) algorithms suffer from overestimation bias, leading to suboptimal policies in high-variance stochastic environments. Quantile-based action-value iteration methods reduce this bias by learning a distribution of the expected cost-to-go using quantile regression. However, ensuring that the learned policy satisfies safety constraints remains a challenge when these constraints are not explicitly integrated into the RL framework. Existing methods often require complex neural architectures or manual tradeoffs due to combined cost functions. To address this, we propose a risk-regularized quantile-based algorithm integrating Conditional Value-at-Risk (CVaR) to enforce safety without complex architectures. We also provide theoretical guarantees on the contraction properties of the risk-sensitive distributional Bellman…
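
To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of quantile regression over a cost distribution and a CVaR estimate computed from the learned quantiles. It is an illustration under stated assumptions, not the authors' implementation: the function names, the midpoint quantile levels, the upper-tail CVaR convention for costs, and the additive penalty weight `lam` are all hypothetical choices, since this excerpt does not give the paper's exact loss or regularizer.

```python
import numpy as np

def quantile_midpoints(n):
    # Assumed quantile levels tau_i = (i + 0.5) / n, a common choice in quantile-based RL.
    return (np.arange(n) + 0.5) / n

def pinball_loss(theta, samples):
    """Mean quantile-regression (pinball) loss of quantile estimates `theta`
    against sampled costs `samples`."""
    theta = np.asarray(theta, dtype=float)
    samples = np.asarray(samples, dtype=float)
    taus = quantile_midpoints(theta.size)[:, None]
    u = samples[None, :] - theta[:, None]  # residuals, shape (n_quantiles, n_samples)
    return float(np.where(u >= 0, taus * u, (taus - 1.0) * u).mean())

def cvar_upper_tail(theta, alpha):
    """CVaR of a cost distribution, approximated as the mean of the quantile
    estimates whose level is at or above `alpha` (assumed convention: high cost is bad)."""
    theta = np.sort(np.asarray(theta, dtype=float))
    tail = theta[quantile_midpoints(theta.size) >= alpha]
    if tail.size == 0:  # alpha above the highest level: fall back to the worst quantile
        tail = theta[-1:]
    return float(tail.mean())

def risk_regularized_values(theta_per_action, alpha, lam):
    """Hypothetical risk-regularized score per action: mean cost + lam * CVaR.
    The greedy action minimizes this score."""
    theta = np.asarray(theta_per_action, dtype=float)
    cvars = np.array([cvar_upper_tail(row, alpha) for row in theta])
    return theta.mean(axis=1) + lam * cvars

# Two actions: the first is cheap on average but has a heavy cost tail.
theta = [[1.0, 1.0, 1.0, 1.0, 10.0],
         [3.0, 3.0, 3.0, 3.0, 3.0]]
risk_neutral_action = int(np.argmin(risk_regularized_values(theta, alpha=0.8, lam=0.0)))
risk_averse_action = int(np.argmin(risk_regularized_values(theta, alpha=0.8, lam=1.0)))
```

With `lam=0.0` the first action is chosen because its mean cost is lower (2.8 versus 3.0). With `lam=1.0` the tail penalty raises its score to 12.8 against 6.0, so the second action is chosen. This shows how a CVaR term can steer a greedy policy away from rare high-cost outcomes without changing the network architecture.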

Taxonomy

Topics: Reinforcement Learning in Robotics · Robot Manipulation and Learning · Model Reduction and Neural Networks