Central-limit approach to risk-aware Markov decision processes

Pengqian Yu; Jia Yuan Yu; Huan Xu

arXiv:1512.00583 · math.OC · December 3, 2015 · 2 cites

Open Access

TL;DR

This paper introduces a method, based on a central limit theorem, for evaluating and optimizing risk in Markov decision processes. The method applies whether or not the transition probabilities are known, and it comes with a gradient-based policy improvement algorithm.

Contribution

It presents a risk evaluation framework that uses a central limit theorem for MDPs, together with a gradient-based policy improvement method that converges to a local optimum of the risk objective.

Findings

1. Effective risk evaluation over long horizons
2. Applicable to known and unknown transition probabilities
3. Convergent policy improvement algorithm

Abstract

Whereas classical Markov decision processes maximize the expected reward, we consider minimizing the risk. We propose to evaluate the risk associated with a given policy over a long-enough time horizon with the help of a central limit theorem. The proposed approach works whether the transition probabilities are known or not. We also provide a gradient-based policy improvement algorithm that converges to a local optimum of the risk objective.
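To make the central-limit idea concrete, here is a minimal sketch of how the risk of a fixed policy might be evaluated when the transition probabilities are known. It is not taken from the paper: it assumes the policy induces an ergodic finite Markov chain with transition matrix `P` and per-state reward vector `r`, it uses the standard Markov chain central limit theorem (cumulative reward over horizon T is approximately normal with mean T·mu and variance T·sigma²), and it takes the risk measure to be the probability that cumulative reward falls below a threshold. The function names and the choice of risk measure are illustrative assumptions.

```python
import math
import numpy as np


def stationary_distribution(P):
    """Solve pi P = pi, sum(pi) = 1 for an ergodic chain (assumed)."""
    n = P.shape[0]
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def clt_parameters(P, r):
    """Return (mu, sigma2): long-run mean reward and asymptotic variance."""
    n = P.shape[0]
    pi = stationary_distribution(P)
    mu = float(pi @ r)
    f = r - mu  # centered reward
    # Fundamental matrix Z = (I - P + 1 pi^T)^{-1}; h = Z f solves the
    # Poisson equation (I - P) h = f with pi . h = 0.
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    h = Z @ f
    # sigma^2 = Var_pi(f) + 2 * sum_{k>=1} Cov_pi(f(X_0), f(X_k))
    sigma2 = float(2.0 * pi @ (f * h) - pi @ (f * f))
    return mu, sigma2


def shortfall_probability(P, r, horizon, threshold):
    """Normal approximation of P(cumulative reward <= threshold).

    Hypothetical risk measure chosen for illustration; assumes sigma2 > 0.
    """
    mu, sigma2 = clt_parameters(P, r)
    z = (threshold - horizon * mu) / math.sqrt(horizon * sigma2)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


if __name__ == "__main__":
    # Toy two-state chain under some fixed policy (made-up numbers).
    P = np.array([[0.9, 0.1], [0.5, 0.5]])
    r = np.array([1.0, -2.0])
    mu, sigma2 = clt_parameters(P, r)
    print("mean reward per step:", mu)
    print("asymptotic variance:", sigma2)
    print("P(total reward <= 450 over 1000 steps):",
          shortfall_probability(P, r, 1000, 450.0))
```

When the transition probabilities are unknown, the same normal approximation could in principle be used with `mu` and `sigma2` estimated from observed trajectories rather than computed from `P`; that variant is not shown here.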

Taxonomy

Topics: Simulation Techniques and Applications · Risk and Portfolio Optimization · Advanced Control Systems Optimization