First-order sensitivity of the optimal value in a Markov decision model with respect to deviations in the transition probability function
Patrick Kern, Axel Simroth, Henryk Zähle

TL;DR
This paper introduces a method to measure the first-order sensitivity of the optimal value in Markov decision models to changes in transition probabilities, aiding in understanding the impact of model simplifications.
Contribution
It proposes a derivative concept for the optimal value with respect to transition probabilities, providing explicit formulas and broad applicability.
Findings
The derivative is explicitly specified for a broad class of MDMs.
Theoretical results are illustrated with inventory control and finance examples.
The approach helps assess the reasonableness of model reductions.
Abstract
Markov decision models (MDMs) used in practical applications are most often less complex than the underlying 'true' MDM. The reduction of model complexity is performed for several reasons. However, it is obviously of interest to know what kind of model reduction is reasonable (with regard to the optimal value) and what kind is not. In this article we propose a way to address this question. We introduce a sort of derivative of the optimal value as a function of the transition probabilities, which can be used to measure the (first-order) sensitivity of the optimal value w.r.t. changes in the transition probabilities. 'Differentiability' is obtained for a fairly broad class of MDMs, and the 'derivative' is specified explicitly. Our theoretical findings are illustrated by means of optimization problems in inventory control and mathematical finance.
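To make the idea of first-order sensitivity concrete, the sketch below approximates a directional derivative of the optimal value numerically on a small finite MDM. It is only an illustration under stated assumptions, not the explicit formula derived in the paper: it assumes finitely many states and actions, a finite horizon with zero terminal reward, and a perturbation that moves the transition function `P` a small step `h` toward an alternative `P_alt`. The names `optimal_value`, `directional_sensitivity`, `P_alt`, and `h` are hypothetical and do not come from the paper.

```python
import numpy as np

def optimal_value(P, r, horizon):
    """Optimal value at time 0 by backward induction (terminal value 0).

    P: array (A, S, S), P[a, s, t] = probability of moving s -> t under action a.
    r: array (A, S), one-step reward for taking action a in state s.
    """
    V = np.zeros(P.shape[1])
    for _ in range(horizon):
        action_values = r + P @ V      # shape (A, S)
        V = action_values.max(axis=0)  # best action per state
    return V

def directional_sensitivity(P, P_alt, r, horizon, h=1e-6):
    """Finite-difference proxy for the derivative of the optimal value
    at P in the direction P_alt - P (assumed setup, not the paper's formula)."""
    P_h = (1.0 - h) * P + h * P_alt    # convex mix keeps rows stochastic
    return (optimal_value(P_h, r, horizon) - optimal_value(P, r, horizon)) / h

# Hypothetical two-state, two-action example.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
P_alt = np.array([[[0.8, 0.2], [0.3, 0.7]],
                  [[0.4, 0.6], [0.5, 0.5]]])
r = np.array([[1.0, 0.0],
              [0.5, 0.3]])

print(directional_sensitivity(P, P_alt, r, horizon=5))
```

A large entry in the returned vector suggests that, starting from that state, replacing `P` by something closer to `P_alt` changes the optimal value noticeably, which is the kind of question about model reduction the abstract raises.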
