Risk-Sensitive Reinforcement Learning Applied to Control under Constraints

P. Geibel; F. Wysotzki

arXiv:1109.2147·cs.LG·September 13, 2011

TL;DR

This paper introduces a risk-sensitive reinforcement learning algorithm for Markov Decision Processes with error states, enabling control under safety constraints without relying on strict model assumptions.

Contribution

It formalizes risk as a second criterion in constrained MDPs and presents a model-free heuristic algorithm that balances performance and safety.

Findings

01 Successfully applied to control of a feed tank system

02 Handles relaxed assumptions compared to traditional methods

03 Achieves good performance while respecting risk constraints

Abstract

In this paper, we consider Markov Decision Processes (MDPs) with error states. Error states are those states entering which is undesirable or dangerous. We define the risk with respect to a policy as the probability of entering such a state when the policy is pursued. We consider the problem of finding good policies whose risk is smaller than some user-specified threshold, and formalize it as a constrained MDP with two criteria. The first criterion corresponds to the value function originally given. We will show that the risk can be formulated as a second criterion function based on a cumulative return, whose definition is independent of the original value function. We present a model free, heuristic reinforcement learning algorithm that aims at finding good deterministic policies. It is based on weighting the original value function and the risk. The weight parameter is adapted in…
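To make the two criteria concrete, here is a minimal sketch on a toy problem. Everything in it is an illustrative assumption rather than the authors' setup: the 4-state MDP, the transition probabilities, the discount factor, the reward of 1 for reaching the goal, and the helper names `evaluate`, `best_feasible`, and `best_weighted`. The risk is computed as an undiscounted cumulative return with cost 1 on the transition into the error state, independently of the value function. The constrained problem is then solved by brute force over deterministic policies, which is not the paper's model-free algorithm. The weighted score `xi * V - risk` is one plausible form of the weighting and is assumed here; the weight is held fixed, since the abstract's description of its adaptation is truncated.

```python
import itertools
import numpy as np

# Hypothetical toy MDP (not from the paper): states 0 and 1 are ordinary,
# GOAL is a safe absorbing state, ERROR is the absorbing error state.
N_STATES, GOAL, ERROR = 4, 2, 3
NONTERMINAL = (0, 1)
GAMMA = 0.9  # assumed discount factor for the value criterion only

# P[a, s, s'] = probability of moving from s to s' under action a.
P = np.zeros((2, N_STATES, N_STATES))
P[0, 0] = [0.8, 0.2, 0.0, 0.0]  # action 0: slow, never enters ERROR
P[0, 1] = [0.0, 0.8, 0.2, 0.0]
P[1, 0] = [0.0, 0.8, 0.0, 0.2]  # action 1: fast, may enter ERROR
P[1, 1] = [0.0, 0.0, 0.9, 0.1]

POLICIES = list(itertools.product(range(2), repeat=len(NONTERMINAL)))


def evaluate(policy, n_sweeps=500):
    """Return (V, risk) per state for a deterministic policy.

    V is the discounted return with reward 1 on entering GOAL.
    risk is the probability of ever entering ERROR, written as an
    undiscounted return with cost 1 on the transition into ERROR.
    """
    V = np.zeros(N_STATES)
    risk = np.zeros(N_STATES)
    for _ in range(n_sweeps):
        V_new = np.zeros(N_STATES)
        risk_new = np.zeros(N_STATES)
        for s in NONTERMINAL:
            p = P[policy[s], s]
            V_new[s] = p[GOAL] + GAMMA * (p @ V)
            risk_new[s] = p[ERROR] + p @ risk
        V, risk = V_new, risk_new
    return V, risk


def best_feasible(threshold, start=0):
    """Brute force: best value among policies with risk <= threshold."""
    feasible = [pi for pi in POLICIES if evaluate(pi)[1][start] <= threshold]
    return max(feasible, key=lambda pi: evaluate(pi)[0][start])


def best_weighted(xi, start=0):
    """Policy maximizing the assumed weighted score xi * V - risk."""
    def score(pi):
        V, risk = evaluate(pi)
        return xi * V[start] - risk[start]
    return max(POLICIES, key=score)


if __name__ == "__main__":
    for pi in POLICIES:
        V, risk = evaluate(pi)
        print(pi, "value:", round(V[0], 3), "risk:", round(risk[0], 3))
    print("best policy with risk <= 0.15:", best_feasible(0.15))
    print("weighted choice for xi = 1.0:", best_weighted(1.0))
```

In this toy problem a small weight favors the risk-free policy and a large weight favors the fastest one, so sweeping `xi` traces out the trade-off that the user-specified risk threshold then cuts.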
