First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function

Patrick Kern; Axel Simroth; Henryk Zähle

arXiv:1909.07781·math.OC·September 18, 2019·Math. Methods Oper. Res.

TL;DR

This paper introduces a way to measure the first-order sensitivity of the optimal value of a Markov decision model (MDM) to changes in its transition probabilities, which helps judge which model simplifications are reasonable.

Contribution

The paper proposes a derivative concept for the optimal value as a function of the transition probabilities, establishes this form of differentiability for a fairly broad class of MDMs, and specifies the derivative explicitly.

Findings

1. The derivative is explicitly specified for a broad class of MDMs.

2. Theoretical results are illustrated with inventory control and finance examples.

3. The approach helps assess the reasonableness of model reductions.

Abstract

Markov decision models (MDMs) used in practical applications are most often less complex than the underlying 'true' MDM. The reduction of model complexity is performed for several reasons. However, it is of interest to know which kinds of model reduction are reasonable (with regard to the optimal value) and which are not. In this article we propose a way to address this question. We introduce a kind of derivative of the optimal value as a function of the transition probabilities, which can be used to measure the (first-order) sensitivity of the optimal value with respect to changes in the transition probabilities. 'Differentiability' is obtained for a fairly broad class of MDMs, and the 'derivative' is specified explicitly. Our theoretical findings are illustrated by means of optimization problems in inventory control and mathematical finance.
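To make the idea of first-order sensitivity concrete, the following is a minimal numerical sketch, not the paper's explicit derivative formula. It assumes a toy finite-horizon MDM with finitely many states and actions, and approximates the directional sensitivity of the optimal value by a finite difference along the perturbed transition function (1 - h)P + hQ. The function names, array layout, and step size h are all assumptions made for illustration.

```python
import numpy as np


def optimal_value(P, r, horizon):
    """Optimal value by backward induction in a toy finite-horizon MDM.

    P: array of shape (A, S, S); P[a, s, t] is the probability of moving
       from state s to state t under action a (hypothetical layout).
    r: array of shape (A, S); r[a, s] is the one-stage reward.
    Returns an array of shape (S,) with the optimal value per initial state.
    """
    V = np.zeros(P.shape[1])
    for _ in range(horizon):
        action_values = r + P @ V      # shape (A, S)
        V = action_values.max(axis=0)  # maximize over actions
    return V


def directional_sensitivity(P, Q, r, horizon, h=1e-6):
    """Finite-difference approximation of the sensitivity of the optimal
    value when P is shifted toward Q, i.e. replaced by (1 - h) P + h Q.
    """
    P_h = (1.0 - h) * P + h * Q
    return (optimal_value(P_h, r, horizon) - optimal_value(P, r, horizon)) / h


# Toy example: one action, two states, reward 1 in state 0 only.
P = np.array([[[1.0, 0.0], [0.0, 1.0]]])  # stay in the current state
Q = np.array([[[0.0, 1.0], [1.0, 0.0]]])  # switch states
r = np.array([[1.0, 0.0]])
print(directional_sensitivity(P, Q, r, horizon=2))
```

In this sketch a negative entry indicates that moving the transition probabilities from P toward Q lowers the optimal value for that initial state, and a positive entry indicates that it raises it. A finite difference is used only because it is easy to check; the paper itself gives the derivative in closed form.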
