Demonic variance and a non-determinism score for Markov decision processes
Jakob Piribauer

TL;DR
This paper introduces the concept of demonic variance to quantify non-determinism in Markov decision processes, providing a new score and analyzing its properties and computational aspects.
Contribution
It proposes the novel notion of demonic variance and a non-determinism score for MDPs, along with properties, bounds, and algorithms for computing these measures.
Findings
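The demonic variance of X is at least as large as, and at most twice, the maximal variance of X that a single scheduler can achieve in the MDP.
The non-determinism score measures how strongly the non-deterministic choices can influence the difference of X between two executions.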
Algorithms for computing maximal and demonic variance are developed for specific random variables.
Abstract
This paper studies the influence of probabilism and non-determinism on some quantitative aspect X of the execution of a system modeled as a Markov decision process (MDP). To this end, the novel notion of demonic variance is introduced: for a random variable X in an MDP M, it is defined as 1/2 times the maximal expected squared distance of the values of X in two independent executions of M in which the non-deterministic choices are also resolved independently by two distinct schedulers. It is shown that the demonic variance is between 1 and 2 times as large as the maximal variance of X in M that can be achieved by a single scheduler. This allows defining a non-determinism score for M and X measuring how strongly the difference of X in two executions of M can be influenced by the non-deterministic choices. Properties of MDPs M with extremal values of the non-determinism score are…
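To make the definition concrete, the following is a minimal sketch on a toy one-step MDP, not the algorithms developed in the paper. It assumes the MDP has a single non-deterministic choice between two actions, each inducing a finite distribution over values of X, so that a scheduler is just a (possibly randomized) choice between the two actions. All function names are hypothetical. The demonic variance is found by checking pairs of deterministic schedulers, which suffices here because the objective is bilinear in the two schedulers' outcome distributions; the maximal variance is approximated by a grid search over randomized schedulers.

```python
from itertools import product

def mean(dist):
    # dist maps each value of X to its probability.
    return sum(p * x for x, p in dist.items())

def variance(dist):
    m = mean(dist)
    return sum(p * (x - m) ** 2 for x, p in dist.items())

def half_expected_sq_distance(d1, d2):
    # 1/2 * E[(X1 - X2)^2] for independent X1 ~ d1 and X2 ~ d2.
    return 0.5 * sum(
        p1 * p2 * (x1 - x2) ** 2
        for (x1, p1), (x2, p2) in product(d1.items(), d2.items())
    )

def mix(d1, d2, w):
    # Outcome distribution of a scheduler picking action 1 with probability w.
    out = {}
    for x, p in d1.items():
        out[x] = out.get(x, 0.0) + w * p
    for x, p in d2.items():
        out[x] = out.get(x, 0.0) + (1 - w) * p
    return out

def demonic_variance(actions):
    # Assumption: toy one-step MDP, so deterministic scheduler pairs suffice.
    return max(half_expected_sq_distance(a, b) for a in actions for b in actions)

def max_variance(a, b, steps=1000):
    # Grid search over randomized schedulers (an approximation, toy setting only).
    return max(variance(mix(a, b, k / steps)) for k in range(steps + 1))

# Action a yields X = 0 for sure, action b yields X = 1 for sure.
a, b = {0: 1.0}, {1: 1.0}
v_dem = demonic_variance([a, b])
v_max = max_variance(a, b)
ratio = v_dem / v_max
```

In this example all variability comes from the non-deterministic choice, and the ratio of demonic to maximal variance reaches the upper bound of 2. If both actions induce the same distribution, the ratio drops to the lower bound of 1, since the schedulers then have no influence on X.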
