Catastrophic-risk-aware reinforcement learning with extreme-value-theory-based policy gradients

Parisa Davar; Frédéric Godin; Jose Garrido

arXiv:2406.15612·cs.LG·July 1, 2024

TL;DR

This paper introduces POTPG, a policy gradient method leveraging extreme value theory to effectively mitigate low-frequency, high-severity catastrophic risks in sequential decision making, with demonstrated success in financial risk management.

Contribution

The paper presents a novel policy gradient algorithm, POTPG, that incorporates extreme value theory to better handle tail risks in reinforcement learning.

Findings

01 POTPG outperforms standard benchmarks in risk mitigation.

02 The method effectively models tail risks with limited data.

03 Application to financial hedging demonstrates practical utility.

Abstract

This paper tackles the problem of mitigating catastrophic risk (risk with very low frequency but very high severity) in a sequential decision-making process. The problem is particularly challenging because observations are scarce in the far tail of the distribution of cumulative costs (negative rewards). We develop a policy gradient algorithm, called POTPG, that is based on approximations of the tail risk derived from extreme value theory. Numerical experiments show that our method outperforms common benchmarks that rely on the empirical distribution. An application to financial risk management, specifically the dynamic hedging of a financial option, is presented.
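The abstract does not spell out the tail estimator, so the following is only a generic sketch of the peaks-over-threshold idea from extreme value theory, not the POTPG algorithm itself. It fits a generalized Pareto distribution (GPD) to the excesses of simulated cumulative costs over a high threshold and extrapolates VaR and CVaR from the fit, alongside the plain empirical CVaR for comparison. Everything specific here is an assumption made for illustration: the function name `pot_var_cvar`, the method-of-moments fit, the 90% threshold quantile, the 99% risk level, and the synthetic exponential costs.

```python
import math
import random
import statistics


def pot_var_cvar(costs, alpha=0.99, threshold_quantile=0.90):
    """Peaks-over-threshold estimate of VaR and CVaR at level alpha.

    Hypothetical helper (not from the paper): fits a generalized Pareto
    distribution to the excesses over a high empirical threshold using
    method-of-moments, then extrapolates into the tail.
    """
    n = len(costs)
    ordered = sorted(costs)
    u = ordered[int(threshold_quantile * n)]   # high empirical threshold
    excesses = [c - u for c in costs if c > u]
    if len(excesses) < 2:
        raise ValueError("not enough exceedances to fit a GPD")
    zeta = len(excesses) / n                   # empirical P(cost > u)
    if 1.0 - alpha >= zeta:
        raise ValueError("alpha must lie beyond the threshold level")

    m = statistics.fmean(excesses)
    s2 = statistics.variance(excesses)
    if s2 <= 0.0:
        raise ValueError("excesses have zero variance")
    ratio = m * m / s2
    xi = 0.5 * (1.0 - ratio)                   # GPD shape (always < 0.5 here)
    sigma = 0.5 * m * (ratio + 1.0)            # GPD scale

    tail = (1.0 - alpha) / zeta                # conditional tail probability
    if abs(xi) < 1e-8:
        var = u - sigma * math.log(tail)       # exponential-tail limit
    else:
        var = u + sigma / xi * (tail ** (-xi) - 1.0)
    cvar = (var + sigma - xi * u) / (1.0 - xi)
    return u, xi, sigma, var, cvar


# Synthetic cumulative costs (an assumption; any cost sample would do).
random.seed(0)
costs = [random.expovariate(1.0) for _ in range(5000)]

u, xi, sigma, var_pot, cvar_pot = pot_var_cvar(costs)

# Empirical benchmark: average of the worst 1% of observed costs.
worst = sorted(costs)[int(0.99 * len(costs)):]
cvar_emp = statistics.fmean(worst)

print(f"threshold={u:.3f} xi={xi:.3f} sigma={sigma:.3f}")
print(f"VaR(POT)={var_pot:.3f} CVaR(POT)={cvar_pot:.3f} CVaR(empirical)={cvar_emp:.3f}")
```

In this sketch the empirical CVaR averages only the 50 largest observations, while the GPD fit uses every exceedance over the threshold. That difference in how much data reaches the estimate is the kind of scarcity the abstract describes. The method-of-moments fit is chosen only to keep the example dependency-free; maximum likelihood is a common alternative.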

Code & Models

Repositories

parisadavar/EVT-policy-gradient-RL (Official)

Taxonomy

Topics: Reinforcement Learning in Robotics