Reductive MDPs: A Perspective Beyond Temporal Horizons
Thomas Spooner, Rui Silva, Joshua Lockhart, Jason Long, Vacslav Glukhov

TL;DR
This paper introduces reductive SSPs, a subclass of stochastic shortest path problems (SSPs), and shows that they can be solved in polynomial time by extending backward induction. This places them between finite-horizon MDPs and general MDPs in computational complexity.
Contribution
It defines reductive SSPs through a drift condition on the dynamics, which generalizes the temporal notion of a horizon, and shows that optimal policies for them can be computed in polynomial time.
Findings
Optimal policies for reductive SSPs can be computed in polynomial time.
The approach generalizes finite-horizon algorithms to broader classes of MDPs.
Numerical experiments on an optimal liquidation problem confirm the effectiveness of the approach.
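For context, the finite-horizon algorithm that the findings refer to is classical backward induction. The sketch below shows only that baseline for a small tabular MDP; it is not the paper's reductive extension, and the function name, array layout, and toy problem are illustrative assumptions. Each of the `horizon` steps costs O(A·S²) for A actions and S states, which is where the polynomial running time of the finite-horizon case comes from.

```python
import numpy as np

def backward_induction(P, R, horizon):
    """Classical finite-horizon backward induction (baseline sketch only).

    P: array of shape (A, S, S), P[a, s, s'] = transition probability.
    R: array of shape (A, S), R[a, s] = expected immediate reward.
    Returns the optimal values at t=0 and a time-indexed greedy policy.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)                      # value after the final step
    policy = np.zeros((horizon, n_states), dtype=int)
    for t in reversed(range(horizon)):
        Q = R + P @ V                           # shape (A, S)
        policy[t] = Q.argmax(axis=0)            # best action per state at time t
        V = Q.max(axis=0)
    return V, policy

# Hypothetical toy problem: state 1 is absorbing with zero reward.
# Action 0 stays in state 0 and earns 1; action 1 jumps to state 1 and earns 0.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [0.0, 0.0]])
V, policy = backward_induction(P, R, horizon=3)
print(V)       # optimal values at t=0
print(policy)  # optimal action for each (t, state)
```

In this toy problem the optimal choice in state 0 is to stay at every step, so the value of state 0 equals the horizon length.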
Abstract
Solving general Markov decision processes (MDPs) is a computationally hard problem. Solving finite-horizon MDPs, on the other hand, is highly tractable with well known polynomial-time algorithms. What drives this extreme disparity, and do problems exist that lie between these diametrically opposed complexities? In this paper we identify and analyse a sub-class of stochastic shortest path problems (SSPs) for general state-action spaces whose dynamics satisfy a particular drift condition. This construction generalises the traditional, temporal notion of a horizon via decreasing reachability: a property called reductivity. It is shown that optimal policies can be recovered in polynomial-time for reductive SSPs -- via an extension of backwards induction -- with an efficient analogue in reductive MDPs. The practical considerations of the proposed approach are discussed, and numerical…
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Machine Learning and Algorithms · Markov Chains and Monte Carlo Methods
