Linear programming for finite-horizon vector-valued Markov decision processes
Anas Mifrani, Dominikus Noll

TL;DR
This paper introduces a vector linear programming approach to solve finite-horizon vector-valued Markov decision processes, enabling the characterization and enumeration of Pareto efficient policies.
Contribution
It develops a novel vector linear programming formulation for non-stationary finite-horizon MDPs with vector rewards, and provides an algorithm to enumerate all efficient deterministic policies.
Findings
Pareto efficient policies correspond to efficient solutions of the vector linear program.
The approach fully characterizes deterministic efficient policies.
The enumeration algorithm is tested numerically in an engineering application.
Abstract
We propose a vector linear programming formulation for a non-stationary, finite-horizon Markov decision process with vector-valued rewards. Pareto efficient policies are shown to correspond to efficient solutions of the linear program, and vector linear programming theory allows us to fully characterize deterministic efficient policies. An algorithm for enumerating all efficient deterministic policies is presented and then tested numerically in an engineering application.
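To make the notion of an efficient deterministic policy concrete, the sketch below enumerates every deterministic Markov policy of a tiny two-objective, two-epoch MDP by brute force and keeps those whose value vector is not Pareto dominated. It is not the paper's vector linear programming algorithm, only an illustration of the set that such an algorithm would return. The reward and transition numbers, the fixed initial state, the zero terminal reward, and all names (`REWARD`, `TRANS`, `evaluate`, `efficient_policies`) are assumptions made for this example.

```python
from itertools import product

STATES = (0, 1)
ACTIONS = (0, 1)
HORIZON = 2  # decision epochs t = 0, 1; terminal reward assumed to be zero

# REWARD[t][s][a] -> reward vector with two objectives (toy numbers, not from the paper)
REWARD = [
    [[(1, 0), (0, 1)], [(0, 0), (0, 0)]],
    [[(2, 0), (0, 2)], [(1, 1), (0, 0)]],
]
# TRANS[t][s][a] -> [P(next state = 0), P(next state = 1)], allowed to depend on t
TRANS = [
    [[[1.0, 0.0], [0.5, 0.5]], [[0.0, 1.0], [0.0, 1.0]]],
    [[[1.0, 0.0], [1.0, 0.0]], [[0.0, 1.0], [0.0, 1.0]]],
]


def evaluate(policy, start=0):
    """Backward induction: expected total reward vector of policy[t][s] from `start`."""
    n_obj = len(REWARD[0][0][0])
    value = {s: (0.0,) * n_obj for s in STATES}
    for t in reversed(range(HORIZON)):
        new_value = {}
        for s in STATES:
            a = policy[t][s]
            new_value[s] = tuple(
                REWARD[t][s][a][k]
                + sum(TRANS[t][s][a][s2] * value[s2][k] for s2 in STATES)
                for k in range(n_obj)
            )
        value = new_value
    return value[start]


def dominates(u, v):
    """True if u is at least as good as v in every objective and better in one."""
    return all(x >= y for x, y in zip(u, v)) and any(x > y for x, y in zip(u, v))


def efficient_policies(start=0):
    """Return {policy: value vector} for all non-dominated deterministic policies."""
    rules = list(product(ACTIONS, repeat=len(STATES)))  # one action per state
    policies = list(product(rules, repeat=HORIZON))     # one decision rule per epoch
    values = {pi: evaluate(pi, start) for pi in policies}
    return {
        pi: v
        for pi, v in values.items()
        if not any(dominates(w, v) for w in values.values())
    }


EFFICIENT = efficient_policies()
for pi, v in sorted(EFFICIENT.items()):
    print(pi, v)
```

Brute force scales exponentially in the number of states and epochs, which is the practical motivation for a linear programming characterization such as the one the abstract describes.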
Taxonomy
Topics: Simulation Techniques and Applications · Vehicle Routing Optimization Methods · Supply Chain and Inventory Management
