Sufficiency of Deterministic Policies for Atomless Discounted and Uniformly Absorbing MDPs with Multiple Criteria
Eugene A. Feinberg, Aleksey B. Piunovskiy

TL;DR
This paper proves that in atomless discounted and uniformly absorbing MDPs with multiple criteria, every policy can be matched by a deterministic (nonrandomized and stationary) policy with the same performance vector, so it is enough to consider deterministic policies in problems with multiple criteria and constraints.
Contribution
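It establishes that, for atomless MDPs with a fixed initial state distribution, each policy has a deterministic counterpart with the same performance vector. The result is proved for discounted MDPs with bounded one-step rewards and for the more general class of uniformly absorbing MDPs with expected total costs.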
Findings
Deterministic policies can match the performance of randomized ones in atomless MDPs.
The result holds for uniformly absorbing MDPs with expected total costs and extends, under certain assumptions, to MDPs with unbounded rewards.
Lyapunov's convexity theorem is a special case of the presented results.
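For reference, the classical theorem mentioned in the last finding can be stated as follows. The notation ($\mu_i$, $(X,\mathcal{F})$, $R$) is standard textbook notation and is not taken from the paper.

```latex
\textbf{Lyapunov's convexity theorem.}
Let $\mu_1,\dots,\mu_n$ be finite atomless measures on a measurable space $(X,\mathcal{F})$.
Then the range
\[
  R = \bigl\{ (\mu_1(A),\dots,\mu_n(A)) : A \in \mathcal{F} \bigr\} \subset \mathbb{R}^n
\]
is convex and compact.
```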
Abstract
This paper studies Markov Decision Processes (MDPs) with atomless initial state distributions and atomless transition probabilities. Such MDPs are called atomless. The initial state distribution is considered to be fixed. We show that for discounted MDPs with bounded one-step reward vector-functions, for each policy there exists a deterministic (that is, nonrandomized and stationary) policy with the same performance vector. This fact is proved in the paper for a more general class of uniformly absorbing MDPs with expected total costs, and then it is extended under certain assumptions to MDPs with unbounded rewards. For problems with multiple criteria and constraints, the results of this paper imply that for atomless MDPs studied in this paper it is sufficient to consider only deterministic policies, while without the atomless assumption it is well-known that randomized policies can…
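The claim that a deterministic policy can reproduce the performance vector of a randomized one can be illustrated with a toy one-step problem. The sketch below is not from the paper: the state space [0, 1] with a uniform (atomless) initial distribution, the two actions, the reward vectors r(x, 0) = (0, 0) and r(x, 1) = (1, x), and the function names are all assumptions chosen for illustration. A randomized policy that picks action 1 with probability 1/2 in every state is compared with a deterministic policy that picks action 1 exactly on [0.25, 0.75].

```python
def performance(action_prob, n=100_000):
    """Expected reward vector for a one-step problem with state x ~ Uniform[0, 1].

    action_prob(x) is the probability of choosing action 1 in state x.
    Illustrative rewards (an assumption, not from the paper):
        r(x, 0) = (0, 0),  r(x, 1) = (1, x).
    The integral over [0, 1] is approximated by a midpoint Riemann sum.
    """
    total1 = 0.0
    total2 = 0.0
    for i in range(n):
        x = (i + 0.5) / n          # midpoint of the i-th cell
        p = action_prob(x)         # probability of action 1 at x
        total1 += p * 1.0          # first criterion: reward 1 under action 1
        total2 += p * x            # second criterion: reward x under action 1
    return total1 / n, total2 / n


def randomized(x):
    # Randomized policy: action 1 with probability 1/2 in every state.
    return 0.5


def deterministic(x):
    # Deterministic policy: action 1 on [0.25, 0.75], action 0 elsewhere.
    return 1.0 if 0.25 <= x <= 0.75 else 0.0


v_rand = performance(randomized)
v_det = performance(deterministic)
print("randomized:   ", v_rand)
print("deterministic:", v_det)
```

Both policies give the performance vector (1/2, 1/4) up to floating-point error, because the interval [0.25, 0.75] has length 1/2 and the integral of x over it equals 1/4. If the initial distribution were instead concentrated at a single state x0, a deterministic policy could only produce (0, 0) or (1, x0), which is why the atomless assumption matters.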
