Risk-sensitive control as inference with Rényi divergence

Kaito Ito; Kenji Kashima

arXiv:2411.01827·cs.LG·November 5, 2024

TL;DR

This paper presents RCaI, a unified framework for risk-sensitive control and reinforcement learning using Rényi divergence, connecting it to MaxEnt control and deriving new algorithms such as a risk-sensitive policy gradient and soft actor-critic.

Contribution

It introduces RCaI, extending CaI with Rényi divergence, and derives risk-sensitive RL algorithms, unifying risk-sensitive and risk-neutral control under a common framework.
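As a rough illustration of the divergence the framework builds on, the sketch below evaluates the standard Rényi divergence of order alpha between two discrete distributions and compares it with the KL divergence, which it approaches as alpha tends to 1. The function names, the toy distributions `p` and `q`, and the use of discrete distributions are assumptions made for this example; how the paper ties the order of the divergence to its risk-sensitivity parameter is not reproduced here.

```python
import math


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) for discrete distributions.

    Uses the standard definition
        D_alpha(p || q) = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1),
    valid for alpha > 0 and alpha != 1.
    """
    if alpha <= 0 or alpha == 1:
        raise ValueError("alpha must be positive and different from 1")
    total = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return math.log(total) / (alpha - 1.0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q), the alpha -> 1 limit."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Toy distributions (hypothetical, chosen only for illustration).
p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]

for alpha in (0.5, 0.999, 1.001, 2.0):
    print(f"alpha={alpha}: D_alpha={renyi_divergence(p, q, alpha):.4f}")
print(f"KL={kl_divergence(p, q):.4f}")
```

Orders close to 1 give values close to the KL divergence, which is the sense in which a Rényi-based formulation can contain a KL-based one as a limiting case.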

Findings

01 RCaI is equivalent to log-probability regularized risk-sensitive control.

02 Risk-sensitive optimal policies solve a soft Bellman equation.

03 The framework recovers risk-neutral CaI and RL as special cases.
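The risk-neutral limit in the findings above can be illustrated with the classical exponential-utility (entropic) risk measure, (1/eta) log E[exp(eta C)], which tends to the expected cost E[C] as eta tends to 0. The sketch below is only that classical quantity evaluated on sample costs. It omits the log-probability regularization of the paper's objective, and the function name, the sample costs, and the sign convention (positive `eta` weights high-cost outcomes more heavily) are assumptions that may differ from the paper's.

```python
import math


def entropic_risk(costs, eta):
    """Empirical entropic risk (1/eta) * log(mean(exp(eta * cost))).

    Computed with a log-sum-exp shift for numerical stability.
    eta must be nonzero; the eta -> 0 limit is the plain mean.
    """
    if eta == 0:
        raise ValueError("eta must be nonzero; use the mean for the limit")
    scaled = [eta * c for c in costs]
    shift = max(scaled)
    log_mean_exp = shift + math.log(
        sum(math.exp(s - shift) for s in scaled) / len(scaled)
    )
    return log_mean_exp / eta


# Hypothetical cost samples with one rare, expensive outcome.
costs = [1.0, 2.0, 2.0, 10.0]
mean_cost = sum(costs) / len(costs)

for eta in (-1.0, 1e-6, 1.0, 50.0):
    print(f"eta={eta}: risk={entropic_risk(costs, eta):.4f}")
print(f"mean={mean_cost:.4f}")
```

Under this convention, positive `eta` pushes the value toward the worst observed cost, negative `eta` pushes it below the mean, and values of `eta` near zero reproduce the mean.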

Abstract

This paper introduces the risk-sensitive control as inference (RCaI) that extends CaI by using Rényi divergence variational inference. RCaI is shown to be equivalent to log-probability regularized risk-sensitive control, which is an extension of the maximum entropy (MaxEnt) control. We also prove that the risk-sensitive optimal policy can be obtained by solving a soft Bellman equation, which reveals several equivalences between RCaI, MaxEnt control, the optimal posterior for CaI, and linearly-solvable control. Moreover, based on RCaI, we derive the risk-sensitive reinforcement learning (RL) methods: the policy gradient and the soft actor-critic. As the risk-sensitivity parameter vanishes, we recover the risk-neutral CaI and RL, which means that RCaI is a unifying framework. Furthermore, we give another risk-sensitive generalization of the MaxEnt control using Rényi entropy…
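For readers unfamiliar with soft Bellman equations, the sketch below runs the standard risk-neutral MaxEnt version on a tiny tabular problem: Q(s, a) = r(s, a) + gamma * E[V(s')], V(s) = T * log sum_a exp(Q(s, a) / T), with policy pi(a | s) = exp((Q(s, a) - V(s)) / T). This is not the paper's risk-sensitive equation, whose exact form is not given in this summary. The discounted infinite-horizon setting, the function names, and the two-state toy problem are all assumptions chosen for illustration.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def soft_value_iteration(rewards, transitions, gamma=0.9, temperature=1.0, iters=500):
    """Standard MaxEnt soft value iteration on a tabular MDP.

    rewards[s][a] is the reward, transitions[s][a][s2] the probability
    of moving from state s to state s2 under action a.
    Returns the soft values V and the softmax policy pi.
    """
    n_states = len(rewards)
    v = [0.0] * n_states

    def q_values(values):
        # Soft Q backup: immediate reward plus discounted expected soft value.
        return [
            [
                rewards[s][a]
                + gamma * sum(p * values[s2] for s2, p in enumerate(transitions[s][a]))
                for a in range(len(rewards[s]))
            ]
            for s in range(n_states)
        ]

    for _ in range(iters):
        q = q_values(v)
        # Soft maximum over actions replaces the hard max of ordinary value iteration.
        v = [temperature * logsumexp([qa / temperature for qa in row]) for row in q]

    q = q_values(v)
    policy = [
        [math.exp((qa - vs) / temperature) for qa in row]
        for row, vs in zip(q, v)
    ]
    return v, policy


# Hypothetical two-state, two-action problem: action a moves deterministically to state a.
rewards = [[1.0, 0.0], [0.0, 2.0]]
transitions = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]

values, policy = soft_value_iteration(rewards, transitions)
print("soft values:", [round(x, 3) for x in values])
print("policy:", [[round(x, 3) for x in row] for row in policy])
```

The resulting policy is stochastic: it favors higher-value actions without putting all probability on one of them, which is the behavior that distinguishes the soft backup from the usual hard maximum.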

Code & Models

Repositories

kaito-1111/risk-sensitive-sac (PyTorch, official)

Videos

Risk-sensitive control as inference with Rényi divergence · SlidesLive

Taxonomy

Topics: Fault Detection and Control Systems