Beyond CVaR: Leveraging Static Spectral Risk Measures for Enhanced Decision-Making in Distributional Reinforcement Learning
Mehrdad Moghimi, Hyejin Ku

TL;DR
This paper introduces a new distributional reinforcement learning algorithm that optimizes a broad class of static spectral risk measures, providing better risk management and interpretability in decision-making tasks.
Contribution
The paper develops a novel distributional reinforcement learning (DRL) algorithm with convergence guarantees that optimizes static spectral risk measures, a class that extends beyond Conditional Value-at-Risk (CVaR), and offers a clearer interpretation of the learned policies.
Findings
Outperforms existing risk-neutral DRL models in various tasks.
Learns policies aligned with spectral risk measures.
Provides theoretical convergence guarantees.
Abstract
In domains such as finance, healthcare, and robotics, managing worst-case scenarios is critical, as failure to do so can lead to catastrophic outcomes. Distributional Reinforcement Learning (DRL) provides a natural framework to incorporate risk sensitivity into decision-making processes. However, existing approaches face two key limitations: (1) the use of fixed risk measures at each decision step often results in overly conservative policies, and (2) the interpretation and theoretical properties of the learned policies remain unclear. While optimizing a static risk measure addresses these issues, its use in the DRL framework has been limited to the simple static CVaR risk measure. In this paper, we present a novel DRL algorithm with convergence guarantees that optimizes for a broader class of static Spectral Risk Measures (SRM). Additionally, we provide a clear interpretation of the…
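To make the relation between CVaR and the broader spectral class concrete, here is a minimal numerical sketch. It is not the paper's algorithm. It assumes the standard definition of a spectral risk measure over returns (higher is better), SRM_phi(Z) = integral over u in [0, 1] of phi(u) * F_Z^{-1}(u), where the spectrum phi is non-negative, non-increasing, and integrates to 1. The function names (`spectral_risk`, `cvar_spectrum`, `exponential_spectrum`) and the Gaussian sample data are hypothetical and chosen only for illustration.

```python
import numpy as np


def spectral_risk(samples, phi, n_grid=10_000):
    """Approximate SRM_phi(Z) = int_0^1 phi(u) F_Z^{-1}(u) du from samples.

    Assumption: `samples` are returns (higher is better) and `phi` is a
    non-negative, non-increasing spectrum on [0, 1].
    """
    # Midpoint grid on (0, 1) for the quantile levels u.
    u = (np.arange(n_grid) + 0.5) / n_grid
    quantiles = np.quantile(samples, u)
    weights = phi(u)
    # Normalise so the discretised spectrum sums to exactly 1.
    weights = weights / weights.sum()
    return float(np.sum(weights * quantiles))


def cvar_spectrum(alpha):
    """CVaR_alpha as a spectrum: uniform weight 1/alpha on the worst alpha-fraction."""
    return lambda u: (u <= alpha).astype(float) / alpha


def exponential_spectrum(lam):
    """A smooth non-increasing spectrum that integrates to 1 on [0, 1]."""
    return lambda u: lam * np.exp(-lam * u) / (1.0 - np.exp(-lam))


# Illustrative data only: 100,000 simulated returns from a standard normal.
rng = np.random.default_rng(0)
returns = rng.normal(size=100_000)

risk_neutral = spectral_risk(returns, lambda u: np.ones_like(u))  # constant spectrum
cvar_10 = spectral_risk(returns, cvar_spectrum(0.1))              # worst 10% only
srm_exp = spectral_risk(returns, exponential_spectrum(5.0))       # graded weighting
print(risk_neutral, cvar_10, srm_exp)
```

A constant spectrum recovers the risk-neutral mean, and a step spectrum recovers CVaR. A smooth spectrum such as the exponential one weights every quantile while still emphasising the worst outcomes, which is the extra flexibility the spectral class offers over CVaR.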
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: style-based recalibration module
