ACReL: Adversarial Conditional value-at-risk Reinforcement Learning
M. Godbout, M. Heuillet, S. Chandra, R. Bhati, A. Durand

TL;DR
ACReL is an adversarial reinforcement learning meta-algorithm that optimizes the CVaR objective to learn risk-averse policies for safety-critical domains, with a guarantee that the learned policy approaches the CVaR-optimal one as the players approach the game's equilibrium.
Contribution
The paper proposes a novel adversarial meta-algorithm for CVaR optimization in RL, together with a gradient-based training method and theoretical convergence guarantees.
Findings
ACReL matches state-of-the-art CVaR RL baselines.
The closer the two players are to the game's equilibrium point, the closer the learned policy is provably to the CVaR-optimal one.
Empirical results demonstrate effectiveness in safety-critical scenarios.
Abstract
In the classical Reinforcement Learning (RL) setting, one aims to find a policy that maximizes its expected return. This objective may be inappropriate in safety-critical domains such as healthcare or autonomous driving, where intrinsic uncertainties due to stochastic policies and environment variability may lead to catastrophic failures. This can be addressed by using the Conditional Value-at-Risk (CVaR) objective to instill risk-aversion in learned policies. In this paper, we propose Adversarial CVaR Reinforcement Learning (ACReL), a novel adversarial meta-algorithm to optimize the CVaR objective in RL. ACReL is based on a max-min game between a policy player and a learned adversary that perturbs the policy player's state transitions given a finite budget. We prove that the closer the players are to the game's equilibrium point, the closer the learned policy is to the CVaR-optimal one…
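To make the CVaR objective concrete, here is a minimal sketch in Python. It is not the paper's algorithm. It assumes the common convention that CVaR at level alpha is the mean of the worst alpha-fraction of returns, and it works on a fixed batch of sampled episode returns rather than on state transitions. The function names `empirical_cvar` and `adversarial_cvar` and the sample data are hypothetical. The second function illustrates the standard dual view of CVaR, in which an adversary with a capped reweighting budget shifts probability mass toward the worst outcomes; this is only loosely analogous to the budgeted adversary described in the abstract.

```python
import numpy as np

def empirical_cvar(returns, alpha):
    """Mean of the worst alpha-fraction of returns (lower tail)."""
    ordered = np.sort(np.asarray(returns, dtype=float))  # ascending: worst first
    k = max(1, int(np.ceil(alpha * len(ordered))))       # number of tail samples
    return float(ordered[:k].mean())

def adversarial_cvar(returns, alpha):
    """Dual view: an adversary reweights samples to minimize the mean return.

    Each sample's probability mass is capped at 1 / (alpha * n), and the
    total mass must equal 1. The greedy solution loads the cap onto the
    worst returns first.
    """
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    cap = 1.0 / (alpha * n)        # adversary's per-sample budget
    weights = np.zeros(n)
    remaining = 1.0
    for i in np.argsort(returns):  # visit samples from worst to best
        weights[i] = min(cap, remaining)
        remaining -= weights[i]
        if remaining <= 0:
            break
    return float(weights @ returns)

# Hypothetical episode returns, for illustration only.
sample_returns = [10, -5, 3, 8, -20, 7, 6, 9, 4, 5]
mean_return = float(np.mean(sample_returns))
tail_value = empirical_cvar(sample_returns, alpha=0.2)
dual_value = adversarial_cvar(sample_returns, alpha=0.2)
```

The two computations agree whenever alpha times the number of samples is an integer, as it is here. A risk-neutral learner would score this batch by `mean_return`, whereas a CVaR learner scores it by the much lower tail value, which is why the objective penalizes rare catastrophic episodes.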
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Health Systems, Economic Evaluations, Quality of Life · Explainable Artificial Intelligence (XAI)
