Reduction of total-cost and average-cost MDPs with weakly continuous transition probabilities to discounted MDPs
Eugene A. Feinberg, Jefferson Huang

TL;DR
This note gives sufficient conditions under which total-cost and average-cost MDPs with general state and action spaces and weakly continuous transition probabilities can be reduced to discounted MDPs, so that results and algorithms for discounted problems carry over to the undiscounted ones.
Contribution
It describes reductions of certain undiscounted MDPs to discounted ones. For the undiscounted problems, these reductions imply that the optimality equations are valid and that stationary optimal policies exist.
Findings
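The reductions hold under sufficient conditions stated for MDPs with general state and action spaces and weakly continuous transition probabilities.
The reductions provide methods for computing optimal policies for the undiscounted problems.
The results are applied to a capacitated inventory control problem with fixed costs and lost sales.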
Abstract
This note describes sufficient conditions under which total-cost and average-cost Markov decision processes (MDPs) with general state and action spaces, and with weakly continuous transition probabilities, can be reduced to discounted MDPs. For undiscounted problems, these reductions imply the validity of optimality equations and the existence of stationary optimal policies. The reductions also provide methods for computing optimal policies. The results are applied to a capacitated inventory control problem with fixed costs and lost sales.
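To illustrate why a reduction to a discounted MDP is useful for computation, here is a minimal sketch of value iteration on a small finite discounted-cost MDP. It is not the reduction described in the note, which concerns general state and action spaces. The function name `value_iteration`, the array layout, and the two-state toy data are all assumptions made for illustration only.

```python
import numpy as np

def value_iteration(P, c, beta, tol=1e-10, max_iter=100_000):
    """Value iteration for a finite discounted-cost MDP (illustrative only).

    P    : array of shape (A, S, S); P[a, s, t] is the probability of
           moving from state s to state t under action a.
    c    : array of shape (S, A); c[s, a] is the one-step cost.
    beta : discount factor in [0, 1).
    """
    v = np.zeros(c.shape[0])
    for _ in range(max_iter):
        # q[s, a] = c[s, a] + beta * sum_t P[a, s, t] * v[t]
        q = c + beta * np.einsum("ast,t->sa", P, v)
        v_new = q.min(axis=1)
        done = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if done:
            break
    # Recover a stationary policy that is greedy with respect to v.
    q = c + beta * np.einsum("ast,t->sa", P, v)
    return v, q.argmin(axis=1)

# Hypothetical toy problem: action 0 stays put, action 1 switches state.
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
])
c = np.array([
    [1.0, 2.0],  # state 0: staying costs 1, switching costs 2
    [0.0, 3.0],  # state 1: staying is free, switching costs 3
])
v, policy = value_iteration(P, c, beta=0.9)
```

In this toy problem the greedy policy switches out of state 0 once and then stays in state 1 forever, which is a stationary policy of the kind whose existence the reductions guarantee for the undiscounted problems.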
Taxonomy
Topics: Supply Chain and Inventory Management · Auction Theory and Applications · Traffic control and management
