Risk-sensitive control as inference with Rényi divergence
Kaito Ito, Kenji Kashima

TL;DR
This paper presents RCaI, a unified framework for risk-sensitive control and reinforcement learning based on Rényi divergence. It connects the framework to MaxEnt control and derives risk-sensitive versions of the policy gradient and soft actor-critic algorithms.
Contribution
It introduces RCaI, which extends control as inference (CaI) with Rényi divergence variational inference, and derives risk-sensitive RL algorithms, unifying risk-sensitive and risk-neutral control under a common framework.
Findings
RCaI is equivalent to log-probability regularized risk-sensitive control.
Risk-sensitive optimal policies solve a soft Bellman equation.
The framework recovers risk-neutral CaI and RL as special cases.
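For orientation, the risk-neutral special case mentioned in the findings can be sketched as tabular soft value iteration. This is the standard MaxEnt soft Bellman backup with unit temperature, not the paper's risk-sensitive equation, whose exact form is not given in this summary. The function name `soft_value_iteration`, the toy transition tensor `P`, the rewards `r`, and the discount `gamma` are all illustrative assumptions.

```python
import numpy as np

def soft_value_iteration(P, r, gamma=0.9, iters=500):
    """Risk-neutral MaxEnt soft Bellman backup (illustrative sketch only).

    P: (S, A, S) transition probabilities, r: (S, A) rewards.
    Iterates Q(s,a) = r(s,a) + gamma * E[V(s')], V(s) = log sum_a exp Q(s,a).
    """
    V = np.zeros(P.shape[0])
    for _ in range(iters):
        Q = r + gamma * (P @ V)                      # shape (S, A)
        m = Q.max(axis=1)                            # for a stable log-sum-exp
        V = m + np.log(np.exp(Q - m[:, None]).sum(axis=1))
    policy = np.exp(Q - V[:, None])                  # softmax policy, rows sum to 1
    return V, Q, policy

# Hypothetical 2-state, 2-action MDP used only to exercise the sketch.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
r = np.array([[1.0, 0.0],
              [0.0, 0.5]])
V, Q, policy = soft_value_iteration(P, r)
```

The soft value `V` upper-bounds the greedy value `Q.max(axis=1)` because log-sum-exp is a smooth maximum; the resulting policy is stochastic rather than greedy.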
Abstract
This paper introduces the risk-sensitive control as inference (RCaI) that extends CaI by using Rényi divergence variational inference. RCaI is shown to be equivalent to log-probability regularized risk-sensitive control, which is an extension of the maximum entropy (MaxEnt) control. We also prove that the risk-sensitive optimal policy can be obtained by solving a soft Bellman equation, which reveals several equivalences between RCaI, MaxEnt control, the optimal posterior for CaI, and linearly-solvable control. Moreover, based on RCaI, we derive the risk-sensitive reinforcement learning (RL) methods: the policy gradient and the soft actor-critic. As the risk-sensitivity parameter vanishes, we recover the risk-neutral CaI and RL, which means that RCaI is a unifying framework. Furthermore, we give another risk-sensitive generalization of the MaxEnt control using Rényi entropy…
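As background for the abstract, the standard Rényi divergence of order α between discrete distributions is D_α(p‖q) = log(Σ p^α q^(1−α)) / (α − 1), and it tends to the KL divergence as α → 1. The sketch below checks that limit numerically. How the paper maps the order α to its risk-sensitivity parameter is not stated in this summary, so no such mapping is assumed here; the function name `renyi_divergence` and the toy distributions `p` and `q` are illustrative, and strictly positive probabilities are assumed.

```python
import numpy as np

def renyi_divergence(p, q, alpha):
    """Renyi divergence of order alpha between discrete distributions.

    Assumes p and q are strictly positive and each sums to 1.
    At alpha == 1 the KL divergence (the limiting case) is returned.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):
        return float(np.sum(p * np.log(p / q)))
    return float(np.log(np.sum(p**alpha * q**(1.0 - alpha))) / (alpha - 1.0))

# Toy distributions, chosen only for illustration.
p = np.array([0.5, 0.3, 0.2])
q = np.array([0.4, 0.4, 0.2])

kl = renyi_divergence(p, q, 1.0)
# Orders approaching 1 from above; the values shrink toward the KL divergence.
near = [renyi_divergence(p, q, a) for a in (1.5, 1.1, 1.01, 1.001)]
```

Because the Rényi divergence is non-decreasing in its order, the entries of `near` decrease toward `kl` as α approaches 1, which mirrors the way the risk-neutral, KL-based formulation is recovered as a limiting case.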
Taxonomy
Topics: Fault Detection and Control Systems
