Risk-Sensitive Soft Actor-Critic for Robust Deep Reinforcement Learning under Distribution Shifts

Tobias Enders; James Harrison; Maximilian Schiffer

arXiv:2402.09992·cs.LG·February 16, 2024·1 cites

TL;DR

This paper introduces a novel risk-sensitive deep reinforcement learning algorithm based on Soft Actor-Critic, designed to enhance robustness against distribution shifts in complex optimization problems, with empirical validation showing superior performance over existing methods.

Contribution

The paper derives a new risk-sensitive DRL algorithm using entropic risk measures and provides the first structured analysis of robustness under distribution shifts in this domain.

Findings

01. The proposed algorithm outperforms risk-neutral Soft Actor-Critic.
02. It demonstrates robustness to realistic distribution shifts.
03. It maintains performance on training distributions.

Abstract

We study the robustness of deep reinforcement learning algorithms against distribution shifts within contextual multi-stage stochastic combinatorial optimization problems from the operations research domain. In this context, risk-sensitive algorithms promise to learn robust policies. While this field is of general interest to the reinforcement learning community, most studies to date focus on theoretical results rather than real-world performance. With this work, we aim to bridge this gap by formally deriving a novel risk-sensitive deep reinforcement learning algorithm while providing numerical evidence for its efficacy. Specifically, we introduce discrete Soft Actor-Critic for the entropic risk measure by deriving a version of the Bellman equation for the respective Q-values. We establish a corresponding policy improvement result and infer a practical algorithm. We introduce an…
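As background for the entropic risk measure named in the abstract, the following is a minimal sketch of its standard textbook form for a random return X, (1/beta) * log E[exp(beta * X)], estimated from samples. It is not the paper's algorithm or its Bellman equation. The function name `entropic_risk`, the sign convention (negative beta is risk-averse when X is a reward), and the sample returns are all assumptions made for illustration.

```python
import math


def entropic_risk(samples, beta):
    """Sample estimate of (1/beta) * log E[exp(beta * X)].

    Assumed convention (not taken from the paper): X is a return to be
    maximized, so beta < 0 is risk-averse, beta > 0 is risk-seeking, and
    beta == 0 is treated as the risk-neutral limit, the plain mean.
    """
    if beta == 0.0:
        return sum(samples) / len(samples)
    scaled = [beta * x for x in samples]
    # Log-mean-exp with the maximum subtracted for numerical stability.
    m = max(scaled)
    log_mean_exp = m + math.log(sum(math.exp(s - m) for s in scaled) / len(scaled))
    return log_mean_exp / beta


# Hypothetical episode returns: mostly good, with one rare bad outcome.
returns = [10.0, 10.0, 10.0, -20.0]

mean_return = entropic_risk(returns, 0.0)    # risk-neutral value
risk_averse = entropic_risk(returns, -0.5)   # weights the bad outcome heavily
risk_seeking = entropic_risk(returns, 0.5)   # weights the good outcomes heavily

print(mean_return, risk_averse, risk_seeking)
```

Under this convention the risk-averse value sits below the mean and the risk-seeking value above it, so a policy trained to maximize the risk-averse objective is penalized for rare poor outcomes.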

Code & Models

Repositories

tumbais/risksensitivesacforrobustdrlunderdistshifts (tf, Official)

Taxonomy

Topics: Evolutionary Algorithms and Applications
