Large Markov Decision Processes and Combinatorial Optimization
Ali Eshragh

TL;DR
This paper reviews methods for solving large Markov decision processes and their application to combinatorial optimization, highlighting recent machine learning approaches and their limitations.
Contribution
It offers a concise overview of existing literature on large MDPs and their use in combinatorial optimization, emphasizing the gap in convergence guarantees.
Findings
Machine learning techniques are used to tackle large MDPs.
These machine learning methods come with no convergence guarantee.
MDPs model applications ranging from supply chains and queuing networks to cognitive science and the control of autonomous vehicles.
Abstract
Markov decision processes continue to gain popularity for modeling a wide range of applications, from the analysis of supply chains and queuing networks to cognitive science and the control of autonomous vehicles. Nonetheless, they quickly become numerically intractable as the size of the model grows. Recent works use machine learning techniques to overcome this crucial issue, but with no convergence guarantee. This note provides a brief overview of the literature on solving large Markov decision processes and on exploiting them to solve important combinatorial optimization problems.
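To make the intractability point concrete, here is a minimal sketch of classical tabular value iteration for a discounted MDP. The abstract does not name a specific algorithm, so the choice of value iteration, the function name `value_iteration`, the discount factor, and the two-state toy model are all assumptions made for illustration. Each sweep touches every (state, action, next state) triple, so the work per iteration grows roughly as S² × A, which is what becomes prohibitive for large models.

```python
def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Tabular value iteration (illustrative sketch, not the paper's method).

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[s][a]:    expected immediate reward for taking action a in state s.
    """
    n_states = len(R)
    n_actions = len(R[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        # Bellman backup: one pass over all (s, a, t) triples.
        Q = [
            [
                R[s][a] + gamma * sum(P[a][s][t] * V[t] for t in range(n_states))
                for a in range(n_actions)
            ]
            for s in range(n_states)
        ]
        V_new = [max(row) for row in Q]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break
    # Greedy policy with respect to the last computed Q-values.
    policy = [row.index(max(row)) for row in Q]
    return V, policy


# Hypothetical toy model: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
# Reward 1 only for staying in state 1.
R = [
    [0.0, 0.0],
    [1.0, 0.0],
]

V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy model the greedy policy switches out of state 0 and then stays in state 1. For a discounted model with a finite state space, the tabular iteration converges because the Bellman backup is a contraction; the learning-based approximations mentioned in the abstract give up that guarantee in exchange for scalability.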
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Data Quality and Management
