Model Reduction Techniques for Computing Approximately Optimal Solutions for Markov Decision Processes
Thomas L. Dean, Robert Givan, Sonia Leach

TL;DR
This paper reduces large implicit (factored) Markov decision processes by partitioning the state space into epsilon-homogeneous blocks. The resulting smaller bounded parameter MDPs are then used to compute approximately optimal policies.
Contribution
It presents the concept of epsilon-homogeneity for state space partitioning and develops algorithms to compute approximate solutions for large factored MDPs via bounded parameter MDPs.
Findings
Epsilon-homogeneous partitions effectively reduce state space size.
Algorithms can find approximately optimal policies in reduced MDPs.
The approach offers a trade-off between solution quality and computational resources.
Abstract
We present a method for solving implicit (factored) Markov decision processes (MDPs) with very large state spaces. We introduce a property of state space partitions which we call epsilon-homogeneity. Intuitively, an epsilon-homogeneous partition groups together states that behave approximately the same under all or some subset of policies. Borrowing from recent work on model minimization in computer-aided software verification, we present an algorithm that takes a factored representation of an MDP and a value 0 <= epsilon <= 1 and computes a factored epsilon-homogeneous partition of the state space. This partition defines a family of related MDPs: those MDPs with state space equal to the blocks of the partition, and transition probabilities "approximately" like those of any (original MDP) state in the source block. To formally study such families of MDPs, we introduce the new notion of a…
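As a rough illustration of the idea in the abstract, the sketch below checks a partition of a small explicit MDP for epsilon-homogeneity and derives interval bounds on block-to-block transition probabilities, in the spirit of a bounded parameter MDP. It is not the paper's algorithm: the paper works on factored representations, while this sketch enumerates states directly. It also assumes one particular formalization (within a block, the probability of reaching any other block differs by at most epsilon across states, for every action) and ignores rewards. The names `block_prob`, `interval_model`, and `is_eps_homogeneous` and the nested-dict layout of `T` are hypothetical.

```python
from itertools import product


def block_prob(T, s, a, block):
    """Probability of moving from state s into any state of `block` under action a."""
    return sum(T[s][a].get(t, 0.0) for t in block)


def interval_model(T, actions, partition):
    """Map (i, a, j) -> (lo, hi): the range of block-transition probabilities
    from block i to block j under action a, taken over the states of block i."""
    bounds = {}
    for (i, src), (j, dst) in product(enumerate(partition), repeat=2):
        for a in actions:
            probs = [block_prob(T, s, a, dst) for s in src]
            bounds[(i, a, j)] = (min(probs), max(probs))
    return bounds


def is_eps_homogeneous(T, actions, partition, eps):
    """Assumed criterion: every interval has width at most eps (rewards ignored)."""
    bounds = interval_model(T, actions, partition)
    return all(hi - lo <= eps + 1e-12 for lo, hi in bounds.values())


# Toy MDP (hypothetical data): T[state][action] = {next_state: probability}
T = {
    0: {"a": {0: 0.5, 2: 0.5}},
    1: {"a": {1: 0.45, 3: 0.55}},
    2: {"a": {2: 1.0}},
    3: {"a": {3: 1.0}},
}
partition = [{0, 1}, {2, 3}]

print(interval_model(T, ["a"], partition)[(0, "a", 1)])
print(is_eps_homogeneous(T, ["a"], partition, 0.1))
print(is_eps_homogeneous(T, ["a"], partition, 0.01))
```

In this toy case, states 0 and 1 disagree by about 0.05 on the probability of reaching the second block, so the partition passes for epsilon = 0.1 and fails for epsilon = 0.01. A larger epsilon admits coarser partitions at the cost of wider intervals in the reduced model.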
Taxonomy
Topics: Formal Methods in Verification · Software Reliability and Analysis Research · Software Testing and Debugging Techniques
