Properties of Turnpike Functions for Discounted Finite MDPs

Eugene A. Feinberg; Gaojin He

arXiv:2502.05375·math.OC·July 15, 2025

TL;DR

This paper investigates properties of turnpike functions for discounted finite MDPs and bounds the number of value iterations after which the generated decision rules are optimal for the infinite-horizon problem, which justifies the rolling horizon approach.

Contribution

The paper characterizes properties of turnpike integers in discounted finite MDPs and establishes upper bounds on them, that is, on the number of value iterations after which every generated decision rule is optimal for the infinite-horizon problem.

Findings

01. Turnpike integers are finite and well-defined.

02. Upper bounds for turnpike integers are derived.

03. Results support the effectiveness of the rolling horizon approach.

Abstract

This paper studies discounted Markov Decision Processes (MDPs) with finite sets of states and actions. Value iteration is one of the major methods for finding optimal policies. For each discount factor, starting from a finite number of iterations, which is called the turnpike integer, value iteration algorithms always generate decision rules, which are deterministic optimal policies for the infinite-horizon problems. This fact justifies the rolling horizon approach for computing infinite-horizon optimal policies by conducting a finite number of value iterations. This paper describes properties of turnpike integers and provides their upper bounds.
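
To make the notion of a turnpike integer concrete, the following is a minimal sketch, not the paper's method. It runs value iteration on a hypothetical two-state MDP and reports the first iteration from which every greedy decision rule is optimal for the infinite-horizon problem. The function names (`q_values`, `greedy_sets`, `empirical_turnpike`), the toy transition and reward data, the zero terminal values, the tie tolerance, and the finite `horizon` are all assumptions made for illustration. The result is only an empirical estimate over the iterations checked, not a proven bound.

```python
def q_values(v, P, r, beta):
    """One-step lookahead: Q[s][a] = r[s][a] + beta * sum_j P[s][a][j] * v[j]."""
    return [[r[s][a] + beta * sum(p * v[j] for j, p in enumerate(P[s][a]))
             for a in range(len(P[s]))]
            for s in range(len(P))]


def greedy_sets(q, tol=1e-9):
    """For each state, the set of actions whose Q-value is within tol of the maximum."""
    return [{a for a, x in enumerate(row) if x >= max(row) - tol} for row in q]


def empirical_turnpike(P, r, beta, horizon=200, long_run=2000):
    """Smallest N <= horizon + 1 such that, for every checked n >= N, all greedy
    actions at iteration n are infinite-horizon optimal (an estimate only)."""
    n_states = len(P)

    # Approximate the infinite-horizon values with a long value-iteration run
    # (assumption: long_run is large enough that the remaining error is negligible).
    v = [0.0] * n_states
    for _ in range(long_run):
        v = [max(row) for row in q_values(v, P, r, beta)]
    optimal = greedy_sets(q_values(v, P, r, beta))

    # Rerun value iteration from zero terminal values and record the last
    # iteration at which some greedy action is not infinite-horizon optimal.
    v = [0.0] * n_states
    last_bad = 0
    for n in range(1, horizon + 1):
        q = q_values(v, P, r, beta)
        if any(not g <= o for g, o in zip(greedy_sets(q), optimal)):
            last_bad = n
        v = [max(row) for row in q]
    return last_bad + 1


# Hypothetical toy MDP. State 0 has two actions: "stay" (reward 1, remain in 0)
# and "go" (reward 0, move to state 1). State 1 is absorbing with reward 2.
P = [[[1.0, 0.0], [0.0, 1.0]],   # state 0: stay, go
     [[0.0, 1.0]]]               # state 1: single action
r = [[1.0, 0.0],
     [2.0]]

print(empirical_turnpike(P, r, beta=0.9))  # prints 3
```

In this toy example, "stay" is greedy at iterations 1 and 2 because the short-horizon reward dominates, while "go" is greedy from iteration 3 onward and is also optimal for the infinite horizon, so the estimate is 3. A rolling horizon scheme would therefore need at least three value iterations here before its first decision is guaranteed to be optimal.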

Taxonomy

Topics: Advanced Numerical Methods in Computational Mathematics