Risk-Averse Reinforcement Learning with Itakura-Saito Loss

Igor Udovichenko; Olivier Croissant; Anita Toleutaeva; Evgeny Burnaev; Alexander Korotin

arXiv:2505.16925·cs.LG·May 27, 2025

Open Access

TL;DR

This paper proposes a loss function based on the Itakura-Saito divergence for risk-averse reinforcement learning with exponential utility, and reports that it outperforms established alternatives across multiple scenarios.

Contribution

It presents a numerically stable, mathematically sound Itakura-Saito divergence-based loss function for risk-averse RL with exponential utility, along with theoretical and empirical validation.

Findings

01  Itakura-Saito loss outperforms alternatives in multiple scenarios.

02  The method is stable and compatible with existing RL algorithms.

03  The approach effectively incorporates risk aversion via exponential utility.

Abstract

Risk-averse reinforcement learning finds application in various high-stakes fields. Unlike classical reinforcement learning, which aims to maximize expected returns, risk-averse agents choose policies that minimize risk, occasionally sacrificing expected value. These preferences can be framed through utility theory. We focus on the specific case of the exponential utility function, where one can derive the Bellman equations and employ various reinforcement learning algorithms with few modifications. To address this, we introduce to the broad machine learning community a numerically stable and mathematically sound loss function based on the Itakura-Saito divergence for learning state-value and action-value functions. We evaluate the Itakura-Saito loss function against established alternatives, both theoretically and empirically. In the experimental section, we explore multiple scenarios,…
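
To make the idea concrete, here is a minimal sketch of how an Itakura-Saito style loss can recover a risk-averse value under exponential utility. It is an illustration under stated assumptions, not the authors' implementation: the utility convention u(x) = -exp(-lam * x), the risk-aversion coefficient `lam`, and the function names `certainty_equivalent` and `itakura_saito_loss` are all hypothetical choices made for this example. The Itakura-Saito divergence between positive numbers x and y is x/y - log(x/y) - 1; applying it to x = exp(-lam * R) and y = exp(-lam * V) gives exp(d) - d - 1 with d = lam * (V - R), so the loss can be evaluated without ever forming the two exponentials separately.

```python
import numpy as np


def certainty_equivalent(returns, lam):
    """Closed-form risk-averse value: -(1/lam) * log E[exp(-lam * R)].

    Uses the log-sum-exp shift so large |lam * R| does not overflow.
    """
    z = -lam * np.asarray(returns, dtype=float)
    m = z.max()
    log_mean_exp = m + np.log(np.mean(np.exp(z - m)))
    return -log_mean_exp / lam


def itakura_saito_loss(v, returns, lam):
    """Mean Itakura-Saito divergence between exp(-lam*R) and exp(-lam*v).

    With d = lam * (v - R), the divergence is exp(d) - d - 1, which is
    non-negative and equals zero only when v == R.
    """
    d = lam * (v - np.asarray(returns, dtype=float))
    return float(np.mean(np.exp(d) - d - 1.0))


# Illustrative data (assumed, not from the paper).
returns = np.array([1.0, 2.0, 3.0])
lam = 0.5

# Grid search over candidate values; the minimizer of the loss should
# coincide with the closed-form certainty equivalent.
grid = np.linspace(0.0, 4.0, 4001)
losses = np.array([itakura_saito_loss(v, returns, lam) for v in grid])
v_star = float(grid[np.argmin(losses)])
ce = float(certainty_equivalent(returns, lam))
```

Setting the derivative of the expected loss with respect to V to zero gives exp(lam * V) * E[exp(-lam * R)] = 1, which is exactly the certainty equivalent, so `v_star` and `ce` should agree up to the grid resolution. The value lies below the plain mean of the returns, reflecting the risk-averse preference described in the abstract.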

Taxonomy

Topics: Reinforcement Learning in Robotics · Risk and Portfolio Optimization · Explainable Artificial Intelligence (XAI)
