The implications of state aggregation in deteriorating Markov Decision Processes with optimal threshold policies
Madeleine Pollack, Lauren N. Steimle

TL;DR
This paper examines how state aggregation in Markov Decision Processes affects the optimality of threshold policies, highlighting trade-offs between model precision and data limitations through theoretical analysis and simulations.
Contribution
It provides conditions under which aggregated MDPs retain threshold policies and analyzes the trade-offs in policy quality with different levels of state aggregation.
Findings
Larger state spaces offer greater potential for policy improvement.
Aggregated MDPs are preferable when data is limited.
Threshold policies can be preserved under certain aggregation conditions.
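To make the term concrete, here is a minimal sketch of a check for whether a policy has threshold form. It assumes states are indexed from best (0) to worst and that there are two actions, 0 for "wait" and 1 for "intervene". The function name `threshold_of` and this action encoding are illustrative assumptions, not taken from the paper.

```python
from typing import Optional, Sequence


def threshold_of(policy: Sequence[int]) -> Optional[int]:
    """Return the threshold index if `policy` has threshold form, else None.

    Assumed encoding (hypothetical): states are ordered best (0) to worst,
    action 0 = wait, action 1 = intervene. A threshold policy waits in every
    state below the threshold and intervenes in every state at or above it.
    A policy that never intervenes gets threshold len(policy).
    """
    if any(a not in (0, 1) for a in policy):
        raise ValueError("policy must contain only actions 0 and 1")
    n = len(policy)
    # Index of the first state in which the policy intervenes (n if none).
    first = next((s for s, a in enumerate(policy) if a == 1), n)
    # Threshold form requires intervening in every state from `first` onward.
    if all(a == 1 for a in policy[first:]):
        return first
    return None
```

For example, `threshold_of([0, 0, 1, 1])` returns 2, while `threshold_of([0, 1, 0, 1])` returns `None` because the policy switches back to waiting in a worse state.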
Abstract
Markov Decision Processes (MDPs) are mathematical models of sequential decision-making under uncertainty that have found applications in healthcare, manufacturing, logistics, and other domains. In these models, a decision-maker observes the state of a stochastic process and determines which action to take with the goal of maximizing the expected total discounted reward. In many applications, the state space of the true system is large, and there may be too few observed transitions out of certain states to estimate the transition probability matrix. To overcome this, modelers aggregate the true states into "superstates", resulting in a smaller state space. This aggregation improves computational tractability and increases the number of observed transitions among superstates. Thus, the modeler's choice of state space leads to a trade-off in transition probability estimates. While coarser…
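The aggregation step described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: it pools observed transition counts according to a state-to-superstate map and normalizes each row, which is one simple way to estimate an aggregated transition matrix. The names `aggregate_counts`, `estimate_transitions`, and `groups` are hypothetical.

```python
from typing import List, Optional, Sequence


def aggregate_counts(counts: Sequence[Sequence[int]],
                     groups: Sequence[int]) -> List[List[int]]:
    """Pool transition counts between true states into superstate counts.

    counts[s][t] is the number of observed transitions from state s to t.
    groups[s] is the superstate index (0..k-1) assigned to true state s;
    this mapping is an assumption chosen by the modeler.
    """
    k = max(groups) + 1
    agg = [[0] * k for _ in range(k)]
    for s, row in enumerate(counts):
        for t, c in enumerate(row):
            agg[groups[s]][groups[t]] += c
    return agg


def estimate_transitions(
    counts: Sequence[Sequence[int]],
) -> List[List[Optional[float]]]:
    """Row-normalize counts into empirical transition probabilities.

    Rows with no observations are filled with None rather than guessed.
    """
    probs: List[List[Optional[float]]] = []
    for row in counts:
        total = sum(row)
        if total > 0:
            probs.append([c / total for c in row])
        else:
            probs.append([None] * len(row))
    return probs


# Four true states merged pairwise into two superstates (made-up counts).
raw = [[5, 1, 0, 0],
       [0, 2, 1, 0],
       [0, 0, 3, 1],
       [0, 0, 0, 4]]
pooled = aggregate_counts(raw, groups=[0, 0, 1, 1])
p_hat = estimate_transitions(pooled)
```

In this made-up example each superstate row is estimated from 9 and 8 observations respectively, compared with 3 to 6 per row in the original four-state model. That illustrates the gain in observations per estimate that the abstract describes, at the cost of a coarser state description.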
Taxonomy
Topics: Economic Policies and Impacts
