ACReL: Adversarial Conditional value-at-risk Reinforcement Learning

M. Godbout; M. Heuillet; S. Chandra; R. Bhati; A. Durand

arXiv:2109.09470 · cs.LG · May 19, 2022

Open Access

TL;DR

ACReL is an adversarial reinforcement learning approach that optimizes the CVaR objective to learn risk-averse policies for safety-critical domains, with a guarantee that a policy near the game's equilibrium is near the CVaR-optimal one.

Contribution

The paper proposes a novel adversarial meta-algorithm for CVaR optimization in RL, together with a gradient-based training method and a theoretical guarantee relating proximity to the game's equilibrium to proximity to the CVaR-optimal policy.

Findings

01. ACReL matches a state-of-the-art CVaR RL baseline.

02. The closer the two players are to the game's equilibrium, the closer the learned policy is to the CVaR-optimal one.

03. Empirical results demonstrate effectiveness in safety-critical scenarios.

Abstract

In the classical Reinforcement Learning (RL) setting, one aims to find a policy that maximizes its expected return. This objective may be inappropriate in safety-critical domains such as healthcare or autonomous driving, where intrinsic uncertainties due to stochastic policies and environment variability may lead to catastrophic failures. This can be addressed by using the Conditional Value-at-Risk (CVaR) objective to instill risk-aversion in learned policies. In this paper, we propose Adversarial CVaR Reinforcement Learning (ACReL), a novel adversarial meta-algorithm to optimize the CVaR objective in RL. ACReL is based on a max-min game between a policy player and a learned adversary that perturbs the policy player's state transitions given a finite budget. We prove that the closer the players are to the game's equilibrium point, the closer the learned policy is to the CVaR-optimal one…
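To make the CVaR objective mentioned in the abstract concrete, here is a minimal sketch of a sample-based CVaR estimate over episode returns. It is not ACReL's training procedure; the function name `empirical_cvar` and the convention that `alpha` is the fraction of worst (lowest) returns to average are assumptions made for illustration.

```python
import math


def empirical_cvar(returns, alpha):
    """Estimate CVaR at level alpha as the mean of the worst returns.

    Assumed convention: higher returns are better, so the "tail" is the
    lowest ceil(alpha * n) returns in the sample.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not returns:
        raise ValueError("returns must be non-empty")
    ordered = sorted(returns)  # lowest (worst) returns first
    k = max(1, math.ceil(alpha * len(ordered)))  # size of the tail
    tail = ordered[:k]
    return sum(tail) / k


# Hypothetical episode returns, for illustration only.
sample_returns = [10, -5, 3, 7, -20, 1, 4, 6]
tail_value = empirical_cvar(sample_returns, 0.25)  # mean of the 2 worst
mean_value = empirical_cvar(sample_returns, 1.0)   # plain average
```

With `alpha = 1.0` the estimate reduces to the ordinary expected return, which is the classical RL objective; smaller values of `alpha` weight only the worst outcomes, which is what makes the objective risk-averse.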

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Health Systems, Economic Evaluations, Quality of Life · Explainable Artificial Intelligence (XAI)