Measurized Markov Decision Processes
Daniel Adelman, Alba V. Olivares-Nadal

TL;DR
This paper introduces the concept of measurized MDPs, lifting Markov Decision Processes to the space of probability measures, enabling new analysis and applications with simpler assumptions and proofs.
Contribution
The paper shows that measurized MDPs generalize stochastic MDPs, demonstrates how the lifted framework can incorporate constraints and value function approximations, and introduces a novel algebraic lifting procedure.
Findings
Measurized MDPs are a valid generalization of stochastic MDPs.
The framework allows for easier verification of assumptions and simpler proofs.
Lifted MDPs can incorporate constraints and external shocks.
Abstract
In this paper, we explore lifting Markov Decision Processes (MDPs) to the space of probability measures and consider the so-called measurized MDPs: deterministic processes where states are probability measures on the original state space, and actions are stochastic kernels on the original action space. We show that measurized MDPs are a generalization of stochastic MDPs, thus the measurized framework can be deployed without loss of fidelity. Bertsekas and Shreve studied similar deterministic MDPs under the discounted infinite-horizon criterion in the context of universally measurable policies. Here, we also consider the long-run average reward case, but we cast lifted MDPs within the semicontinuous-semicompact framework of Hernández-Lerma and Lasserre. This makes the lifted framework more accessible as it entails (i) optimal Borel-measurable value functions and policies, (ii)…
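To make the lifting described in the abstract concrete, here is a minimal sketch for a finite MDP. It is an illustration under stated assumptions, not the paper's construction: the paper works with general state and action spaces, while this sketch assumes two states and two actions, and the transition table `P`, reward table `r`, and function `lifted_step` are hypothetical names with made-up numbers. The lifted state is a probability vector `nu` over original states, the lifted action is a kernel `phi` giving action probabilities per state, and the lifted transition is deterministic. Taking the lifted reward to be the expected one-step reward is also an assumption of this sketch.

```python
# Hypothetical two-state, two-action MDP; all numbers are illustrative only.
# P[s][a][t] = probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.0, 1.0]],
]
# r[s][a] = one-step reward in the original MDP.
r = [[1.0, 0.0], [0.0, 2.0]]


def lifted_step(nu, phi):
    """One deterministic transition of the lifted process (finite sketch).

    nu[s]     : probability of original state s (the lifted state)
    phi[s][a] : probability of action a in state s (the lifted action)
    Returns (next_nu, expected_reward).
    """
    n_states = len(nu)
    next_nu = [0.0] * n_states
    reward = 0.0
    for s, mass in enumerate(nu):
        for a, p_a in enumerate(phi[s]):
            weight = mass * p_a  # joint probability of (s, a)
            reward += weight * r[s][a]
            for t in range(n_states):
                next_nu[t] += weight * P[s][a][t]
    return next_nu, reward


# Start with all mass on state 0 and randomize evenly over actions there.
nu0 = [1.0, 0.0]
phi = [[0.5, 0.5], [0.0, 1.0]]
nu1, reward0 = lifted_step(nu0, phi)
print(nu1, reward0)
```

Although the original process is stochastic, `lifted_step` involves no sampling: the same `nu` and `phi` always produce the same next measure, which is the sense in which the lifted process is deterministic.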
