Model-Free Risk-Sensitive Reinforcement Learning

Grégoire Delétang; Jordi Grau-Moya; Markus Kunesch; Tim Genewein; Rob Brekelmans; Shane Legg; Pedro A. Ortega

arXiv:2111.02907·cs.LG·November 5, 2021

TL;DR

This paper extends temporal-difference (TD) learning so that it estimates the free energy, a certainty-equivalent that is sensitive to both the mean and the variance, which yields risk-sensitive, model-free reinforcement learning algorithms.

Contribution

The paper derives a stochastic approximation rule that estimates the free energy from i.i.d. samples of a Gaussian distribution with unknown mean and variance, bringing risk sensitivity to model-free reinforcement learning.
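For orientation, the Gaussian free energy has a standard closed form, written out here as a worked equation. The symbols $\beta$ (risk parameter), $\mu$, and $\sigma^2$ are notation assumed for this note rather than taken from this page, and the sign convention for $\beta$ may differ from the paper's.

```latex
F_\beta(X) \;=\; \frac{1}{\beta}\,\log \mathbb{E}\!\left[e^{\beta X}\right]
\;=\; \mu + \frac{\beta}{2}\,\sigma^2,
\qquad X \sim \mathcal{N}(\mu, \sigma^2).
```

Under this convention the free energy reduces to the mean as $\beta \to 0$, falls below the mean for $\beta < 0$ (variance is penalized, i.e. risk-averse), and rises above it for $\beta > 0$ (risk-seeking).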

Findings

01

Provides a new risk-sensitive RL algorithm based on TD learning.

02

Estimates the free energy from i.i.d. Gaussian samples whose mean and variance are unknown.

03

Applicable to risk-sensitive decision-making scenarios.

Abstract

We extend temporal-difference (TD) learning in order to obtain risk-sensitive, model-free reinforcement learning algorithms. This extension can be regarded as a modification of the Rescorla-Wagner rule, where the (sigmoidal) stimulus is taken to be the event of either over- or underestimating the TD target. As a result, one obtains a stochastic approximation rule for estimating the free energy from i.i.d. samples generated by a Gaussian distribution with unknown mean and variance. Since the Gaussian free energy is known to be a certainty-equivalent sensitive to the mean and the variance, the learning rule has applications in risk-sensitive decision-making.
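The idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm as published: the update below is an assumed sigmoid-gated Rescorla-Wagner form, and the names `sigmoid`, `gaussian_free_energy`, `estimate_certainty_equivalent`, `beta`, and `lr` are all hypothetical. With `beta = 0` the gate is constant at 1/2 and the rule reduces to the ordinary running-mean update; for small `beta` its fixed point is close to the Gaussian free energy, but the match is only approximate.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gaussian_free_energy(mu, sigma, beta):
    """Closed form of (1/beta) * log E[exp(beta * X)] for X ~ N(mu, sigma^2)."""
    return mu + 0.5 * beta * sigma ** 2


def estimate_certainty_equivalent(samples, beta, lr=0.005, v0=0.0):
    """Assumed sigmoid-gated update (illustrative, not the paper's exact rule).

    delta > 0 means the current estimate v undershoots the sample.
    For beta > 0 such errors get more weight, pushing v above the mean;
    for beta < 0 overshoots get more weight, pushing v below the mean.
    """
    v = v0
    for x in samples:
        delta = x - v
        v += 2.0 * lr * sigmoid(beta * delta) * delta
    return v


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    mu, sigma = 1.0, 2.0
    samples = [rng.gauss(mu, sigma) for _ in range(50000)]
    for beta in (-0.2, 0.0, 0.2):
        estimate = estimate_certainty_equivalent(samples, beta)
        target = gaussian_free_energy(mu, sigma, beta)
        print(f"beta={beta:+.1f}  estimate={estimate:.3f}  free energy={target:.3f}")
```

Because the step size is constant, the estimate keeps fluctuating around its fixed point; a decaying step size or averaging over the final iterates would reduce that noise at the cost of slower tracking.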

Taxonomy

Topics: Gene Regulatory Network Analysis · Advanced Multi-Objective Optimization Algorithms · Evolutionary Algorithms and Applications