Reinforcement Learning with Dynamic Convex Risk Measures

Anthony Coache; Sebastian Jaimungal

arXiv:2112.13414·cs.LG·December 1, 2022

TL;DR

This paper introduces a model-free reinforcement learning framework that incorporates dynamic convex risk measures for time-consistent risk-sensitive decision making, demonstrated across finance and robotics applications.

Contribution

The paper combines a time-consistent dynamic programming principle with policy gradient update rules for dynamic convex risk measures, and implements the resulting method as an actor-critic algorithm with neural networks.

Findings

01 Effective in financial trading and hedging tasks

02 Robust obstacle avoidance in robotics

03 Flexible across multiple risk-sensitive applications

Abstract

We develop an approach for solving time-consistent risk-sensitive stochastic optimization problems using model-free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random variables using dynamic convex risk measures. We employ a time-consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules that aid in obtaining optimal policies. We further develop an actor-critic style algorithm using neural networks to optimize over policies. Finally, we demonstrate the performance and flexibility of our approach by applying it to three optimization problems: statistical arbitrage trading strategies, financial hedging, and obstacle avoidance robot control.
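To make the idea of a time-consistent dynamic risk evaluation concrete, here is a minimal sketch of the nested recursion V_t(s) = c(s) + rho(V_{t+1} | s) for a fixed policy. It is not the authors' algorithm: the paper's method is model-free and uses neural networks, whereas this toy assumes a known transition matrix and uses CVaR as one example of a convex risk measure. The names `cvar`, `dynamic_risk`, `cost`, and `trans` are hypothetical and chosen only for illustration.

```python
def cvar(costs, probs, alpha):
    """CVaR at level alpha of a discrete cost distribution (larger cost is worse).

    Uses the Rockafellar-Uryasev representation
        CVaR = min_z  z + E[(X - z)^+] / (1 - alpha),
    whose minimum over z is attained at one of the atoms when X is discrete.
    Requires 0 <= alpha < 1.
    """
    return min(
        z + sum(p * max(c - z, 0.0) for c, p in zip(costs, probs)) / (1.0 - alpha)
        for z in costs
    )


def dynamic_risk(cost, trans, horizon, alpha):
    """Nested (time-consistent) risk of running costs on a finite Markov chain.

    cost[s]     : cost paid each period in state s (assumed, illustrative)
    trans[s][j] : probability of moving from state s to state j (assumed known)
    Returns the list of values V_0(s), one per starting state.
    """
    n = len(cost)
    value = [0.0] * n  # terminal value: nothing left to pay
    for _ in range(horizon):
        # One-step risk measure applied to next-period values, then add current cost.
        value = [cost[s] + cvar(value, trans[s], alpha) for s in range(n)]
    return value


if __name__ == "__main__":
    cost = [0.0, 1.0]
    trans = [[0.5, 0.5], [0.5, 0.5]]
    print(dynamic_risk(cost, trans, horizon=2, alpha=0.0))  # risk-neutral case
    print(dynamic_risk(cost, trans, horizon=2, alpha=0.5))  # risk-averse case
```

Setting `alpha=0.0` reduces CVaR to the expectation, so the recursion becomes ordinary policy evaluation; raising `alpha` weights the worse outcomes more heavily at every step, which is what makes the evaluation nested rather than a single risk measure applied to the total cost.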

Code & Models

Repositories

acoache/rl-dynamicconvexrisk (PyTorch, official)

Taxonomy

Topics: Risk and Portfolio Optimization · Energy, Environment, and Transportation Policies · Stochastic processes and financial applications