Stochastic dominance-constrained Markov decision processes

William B. Haskell; Rahul Jain

arXiv:1206.4568 · math.OC · June 21, 2012 · SIAM J. Control. Optim.

Open Access

TL;DR

This paper introduces a linear programming approach for solving risk-constrained Markov decision processes in which risk is expressed through stochastic dominance constraints. The approach covers both average and discounted reward settings and is illustrated with a portfolio optimization example.

Contribution

The paper develops a linear programming formulation for stochastic dominance-constrained MDPs, derives dual dynamic programming optimality equations that contain a new pricing term for the dominance constraint, and shows that the formulation extends to other stochastic orders.

Findings

01

Increasing concave stochastic dominance constraints on the empirical distribution of reward reduce to linear constraints on occupation measures.

02

The optimal policy for a dominance-constrained MDP is obtained by solving a linear program that includes the dominance constraints.

03

The approach is demonstrated on a portfolio optimization example.

Abstract

We are interested in risk constraints for infinite horizon discrete time Markov decision processes (MDPs). Starting with average reward MDPs, we show that increasing concave stochastic dominance constraints on the empirical distribution of reward lead to linear constraints on occupation measures. The optimal policy for the resulting class of dominance-constrained MDPs is obtained by solving a linear program. We compute the dual of this linear program to obtain average dynamic programming optimality equations that reflect the dominance constraint. In particular, a new pricing term appears in the optimality equations corresponding to the dominance constraint. We show that many types of stochastic orders can be used in place of the increasing concave stochastic order. We also carry out a parallel development for discounted reward MDPs with stochastic dominance constraints. The paper…
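The linearity claim in the abstract can be illustrated with a minimal sketch. This is not the authors' code: it assumes finite state and action sets, a hypothetical constrained reward `z` on state-action pairs, and a benchmark distribution with finite support. Under the increasing concave order, dominance can be checked through expected shortfalls, E[(eta - Z)+] <= E[(eta - Y)+], and for a finitely supported benchmark it is a standard result that checking eta at the benchmark's support points suffices. All names below are illustrative assumptions.

```python
# Sketch only: names (mu, z, benchmark) are hypothetical, not from the paper.
# mu:        occupation measure, dict mapping (state, action) -> probability
# z:         constrained reward, dict mapping (state, action) -> float
# benchmark: reference distribution, dict mapping value -> probability


def shortfall(mu, z, eta):
    """Expected shortfall of z below eta under mu; linear in the entries of mu."""
    return sum(p * max(eta - z[sa], 0.0) for sa, p in mu.items())


def benchmark_shortfall(benchmark, eta):
    """Expected shortfall of the benchmark distribution below eta."""
    return sum(q * max(eta - y, 0.0) for y, q in benchmark.items())


def satisfies_dominance(mu, z, benchmark, tol=1e-12):
    """Check the shortfall inequality at every support point of the benchmark."""
    return all(
        shortfall(mu, z, y) <= benchmark_shortfall(benchmark, y) + tol
        for y in benchmark
    )


# Toy data: two state-action pairs and a two-point benchmark.
z = {("s0", "a0"): 1.0, ("s1", "a0"): 3.0}
benchmark = {0.0: 0.5, 2.0: 0.5}

mu_good = {("s0", "a0"): 0.5, ("s1", "a0"): 0.5}
z_low = {("s0", "a0"): 0.0, ("s1", "a0"): 0.0}

print(satisfies_dominance(mu_good, z, benchmark))      # feasible occupation measure
print(satisfies_dominance(mu_good, z_low, benchmark))  # reward too low: infeasible
```

Because `shortfall` is a weighted sum of the entries of `mu`, each benchmark support point contributes one linear inequality row when the occupation measure is the decision variable of a linear program.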

Taxonomy

Topics: Risk and Portfolio Optimization · Economic theories and models · Reinforcement Learning in Robotics