Risk-Sensitive Exponential Actor Critic

Alonso Granados; Jason Pacheco

arXiv:2602.07202·cs.LG·February 10, 2026

TL;DR

This paper introduces rsEAC, a risk-sensitive actor-critic algorithm whose updates are more numerically stable and which learns risk-aware policies in complex continuous tasks, backed by new policy gradient theorems for the entropic risk measure.

Contribution

The paper provides a theoretical foundation for policy gradients on the entropic risk measure and proposes rsEAC, a novel off-policy method avoiding explicit exponential value functions.

Findings

01 rsEAC achieves more stable updates than existing methods.

02 rsEAC successfully learns risk-sensitive policies in MuJoCo tasks.

03 The paper provides theoretical justification for risk-sensitive policy gradients.

Abstract

Model-free deep reinforcement learning (RL) algorithms have achieved tremendous success on a range of challenging tasks. However, safety concerns remain when these methods are deployed on real-world applications, necessitating risk-aware agents. A common utility for learning such risk-aware agents is the entropic risk measure, but current policy gradient methods optimizing this measure must perform high-variance and numerically unstable updates. As a result, existing risk-sensitive model-free approaches are limited to simple tasks and tabular settings. In this paper, we provide a comprehensive theoretical justification for policy gradient methods on the entropic risk measure, including on- and off-policy gradient theorems for the stochastic and deterministic policy settings. Motivated by theory, we propose risk-sensitive exponential actor-critic (rsEAC), an off-policy model-free…
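
To illustrate why exponentials make the entropic risk measure numerically delicate, here is a minimal sketch. It assumes the common definition (1/beta) * log E[exp(beta * R)] with beta < 0 read as risk-averse; the paper's own sign convention and notation may differ. The function names and sample returns are hypothetical, and the log-sum-exp rescaling shown is a standard numerical trick, not the rsEAC method itself.

```python
import numpy as np


def entropic_risk_naive(returns, beta):
    """Direct evaluation of (1/beta) * log(mean(exp(beta * returns)))."""
    returns = np.asarray(returns, dtype=np.float64)
    return np.log(np.mean(np.exp(beta * returns))) / beta


def entropic_risk_stable(returns, beta):
    """Same quantity, with the largest exponent factored out (log-sum-exp trick)."""
    returns = np.asarray(returns, dtype=np.float64)
    z = beta * returns
    z_max = np.max(z)
    # exp(z - z_max) is at most 1, so it cannot overflow,
    # and at least one term equals exactly 1, so the mean is never 0.
    return (z_max + np.log(np.mean(np.exp(z - z_max)))) / beta


# Hypothetical episode returns on the scale of a long control task.
returns = np.array([1000.0, 1200.0, 800.0])
beta = -1.0  # assumed convention: negative beta is risk-averse

# exp(-1000) underflows to 0, so the naive form takes log(0).
with np.errstate(over="ignore", divide="ignore"):
    naive = entropic_risk_naive(returns, beta)

stable = entropic_risk_stable(returns, beta)

print("naive :", naive)
print("stable:", stable)
```

With these inputs the naive form is not finite, while the rescaled form gives a value between the worst return and the mean return, as expected for a risk-averse setting under the assumed convention.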

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Adaptive Dynamic Programming Control