Linear Programming Formulations of Deterministic Infinite Horizon Optimal Control Problems in Discrete Time

Vladimir Gaitsgory; Alex Parkinson; I. Shvartsman

arXiv:1702.00857·math.OC·February 6, 2017

TL;DR

This paper explores the connection between infinite horizon optimal control problems in discrete time and infinite-dimensional linear programming, analyzing their relationships and asymptotic behaviors for discounted and average criteria.

Contribution

It establishes a novel link between discrete-time optimal control problems and IDLP formulations, including asymptotic relationships between different criteria.

Findings

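1. Under Assumption A2, the discounted problem and its IDLP counterpart have matching optimal values: $(1-\alpha)V_{\alpha}(y_{0})=g_{\alpha}^{*}(y_{0})$.

2. The limits of $\min_{y\in Y}(1-\alpha)V_{\alpha}(y)$ as $\alpha\uparrow 1$ and of $\min_{y\in Y}\frac{1}{S}V(S,y)$ as $S\to\infty$ exist and both equal $g^{*}$, the optimal value of the long-run average IDLP problem.

3. The closed convex hulls of the sets of discounted and time-averaged occupational measures converge, in the Hausdorff metric, to the feasible set $W$ of the long-run average IDLP problem.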

Abstract

This paper is devoted to a study of infinite horizon optimal control problems with time discounting and time averaging criteria in discrete time. We establish that these problems are related to certain infinite-dimensional linear programming (IDLP) problems. We also establish asymptotic relationships between the optimal values of problems with time discounting and long-run average criteria.

Equations

$$y(t+1)=f(y(t),u(t)),\quad t=0,1,\dots$$

$$y(0)=y_{0},$$

$$y(t)\in Y,$$

$$u(t)\in U(y(t)).$$

$$u(t)\in A(y(t)),$$

$$A(y):=\{u\in U(y)\ |\ f(y,u)\in Y\}\quad\forall y\in Y.$$

$$G:={\rm graph}\,A=\{(y,u)\ |\ y\in Y,\ u\in U(y),\ f(y,u)\in Y\},$$

$$\min_{u(\cdot)\in{\cal U}(y_{0})}\sum_{t=0}^{\infty}\alpha^{t}g(y(t),u(t))=:V_{\alpha}(y_{0}),$$

$$\min_{u(\cdot)\in{\cal U}_{S}(y_{0})}\sum_{t=0}^{S-1}g(y(t),u(t))=:V(S,y_{0}),$$

$$\min_{\gamma\in W_{\alpha}(y_{0})}\int_{G}g(y,u)\gamma(dy,du):=g_{\alpha}^{*}(y_{0})$$

$$\min_{\gamma\in W}\int_{G}g(y,u)\gamma(dy,du):=g^{*},$$

$$W_{\alpha}(y_{0}):=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}\big[\alpha(\varphi(f(y,u))-\varphi(y))+(1-\alpha)(\varphi(y_{0})-\varphi(y))\big]\gamma(dy,du)=0\quad\forall\varphi\in C(Y)\Big\}$$

$$W:=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}(\varphi(f(y,u))-\varphi(y))\gamma(dy,du)=0\quad\forall\varphi\in C(Y)\Big\}.$$

$$(1-\alpha)V_{\alpha}(y_{0})=g_{\alpha}^{*}(y_{0})$$

$$\lim_{\alpha\uparrow 1}\min_{y\in Y}(1-\alpha)V_{\alpha}(y)=\lim_{S\to\infty}\min_{y\in Y}\frac{1}{S}V(S,y)=g^{*}.$$

$$|J_{\alpha}(u_{k},y_{0})-J_{\alpha}(\bar{u},y_{0})|\leq\sum_{t=0}^{N}\alpha^{t}|g(y_{k}(t),u_{k}(t))-g(\bar{y}(t),\bar{u}(t))|+\sum_{t=N+1}^{\infty}\alpha^{t}|g(y_{k}(t),u_{k}(t))-g(\bar{y}(t),\bar{u}(t))|.$$

$$\lim_{k\to\infty}V_{\alpha}(y_{0k})=\lim_{k\to\infty}J_{\alpha}(u_{k},y_{0k})=J_{\alpha}(\bar{u},y_{0})\geq\min_{u(\cdot)\in{\cal U}(y_{0})}J_{\alpha}(u,y_{0})=V_{\alpha}(y_{0}),$$

$$V_{\alpha}(y)=\min_{u\in A(y)}\{g(y,u)+\alpha V_{\alpha}(f(y,u))\}.$$

$$H_{\psi}(y):=\min_{u\in A(y)}\{\alpha(\psi(f(y,u))-\psi(y))+g(y,u)\}.$$

$$H_{V_{\alpha}}(y)-(1-\alpha)V_{\alpha}(y)=0,$$

$$\gamma^{\alpha}_{(y(\cdot),u(\cdot))}(Q)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}1_{Q}(y(t),u(t)),$$

$$\gamma_{(y(\cdot),u(\cdot)),S}(Q)=\frac{1}{S}\sum_{t=0}^{S-1}1_{Q}(y(t),u(t)),$$

$$\int_{G}q(y,u)\gamma^{\alpha}_{(y(\cdot),u(\cdot))}(dy,du)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}q(y(t),u(t))$$

$$\int_{G}q(y,u)\gamma_{(y(\cdot),u(\cdot)),S}(dy,du)=\frac{1}{S}\sum_{t=0}^{S-1}q(y(t),u(t))$$

$$\rho(\gamma^{\prime},\gamma^{\prime\prime}):=\sum_{j=1}^{\infty}\frac{1}{2^{j}}\Big|\int_{G}q_{j}(y,u)\gamma^{\prime}(dy,du)-\int_{G}q_{j}(y,u)\gamma^{\prime\prime}(dy,du)\Big|$$

$$\lim_{k\to\infty}\int_{G}q(y,u)\gamma^{k}(dy,du)=\int_{G}q(y,u)\gamma(dy,du)$$

$$\rho(\gamma,\Gamma):=\inf_{\gamma^{\prime}\in\Gamma}\rho(\gamma,\gamma^{\prime}),\qquad\rho_{H}(\Gamma_{1},\Gamma_{2}):=\max\Big\{\sup_{\gamma\in\Gamma_{1}}\rho(\gamma,\Gamma_{2}),\ \sup_{\gamma\in\Gamma_{2}}\rho(\gamma,\Gamma_{1})\Big\}.$$

$$\Gamma_{\alpha}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}(y_{0})}\{\gamma^{\alpha}_{(y(\cdot),u(\cdot))}\},\qquad\Gamma_{\alpha}:=\bigcup_{y_{0}\in Y}\{\Gamma_{\alpha}(y_{0})\},$$

$$\Gamma_{S}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}_{S}(y_{0})}\{\gamma_{(y(\cdot),u(\cdot)),S}\},\qquad\Gamma_{S}:=\bigcup_{y_{0}\in Y}\{\Gamma_{S}(y_{0})\}.$$

$$\min_{\gamma\in\Gamma_{\alpha}(y_{0})}\int_{G}g(y,u)\gamma(dy,du)=(1-\alpha)V_{\alpha}(y_{0})$$

$$\min_{\gamma\in\Gamma_{S}(y_{0})}\int_{G}g(y,u)\gamma(dy,du)=\frac{1}{S}V(S,y_{0}),$$

Taxonomy

Topics: Optimization and Variational Analysis · Aerospace Engineering and Control Systems · Advanced Optimization Algorithms Research

Full text

Linear Programming Formulations of Deterministic Infinite Horizon Optimal Control Problems in Discrete Time

**V. Gaitsgory$^{a}$, A. Parkinson$^{a}$ and I. Shvartsman$^{b}$**

$^{a}$ *Department of Mathematics, Macquarie University, Eastern Road, Macquarie Park, NSW 2113, Australia*

$^{b}$ *Department of Mathematics and Computer Science, Penn State Harrisburg, Middletown, PA 17057, USA*

Abstract. This paper is devoted to a study of infinite horizon optimal control problems with time discounting and time averaging criteria in discrete time. We establish that these problems are related to certain infinite-dimensional linear programming (IDLP) problems. We also establish asymptotic relationships between the optimal values of problems with time discounting and long-run average criteria.

**Key words:** Optimal control, discrete systems, infinite horizon, long-run average, occupational measures, linear programming, duality

**AMS subject classification:** 49N15, 93C55

1 Introduction

The linear programming (LP) approach to control systems is based on the fact that the occupational measures generated by admissible controls and the corresponding solutions of a dynamical system satisfy certain linear equations that represent the system’s dynamics in an integral form. The idea of such linearization was explored extensively in both deterministic and stochastic settings (see, e.g., [5], [8], [9], [13], [24], [30], [31] and, respectively, [1], [12], [14], [15], [16], [17], [18], [21], [23], [25], [27], [29], [33] as well as references therein). In [15] and [16] in particular, the validity of LP formulations of deterministic infinite time horizon problems of optimal control with time average and time discounting criteria was proved for systems evolving in continuous time (note that other approaches/techniques for dealing with deterministic optimal control problems on the infinite time horizon have been studied, e.g., in [4], [7], [10], [34]; see also references therein). In the present paper, we show that the LP formulations of problems of optimal control with time average and time discounting criteria are valid for systems evolving in discrete time.

Note that some of the results of [15] and [16] were obtained under certain technical assumptions. For example, the statement implying the validity of the LP formulation of the long run average optimal control problem (see Theorem 2.6 in [16]) was proved under the assumption that the dependence of the control set on the state variables is Lipschitz continuous. These assumptions can be significantly relaxed in dealing with the discrete time systems. In particular, the result about the validity of the LP formulation of the long run average optimal control problem in discrete time is established in this paper under the assumption that the dependence of the control set on the state variables is upper semicontinuous. Also, it is worth noting that the results in [16] (see also Remark 4.5 in [15]) are stated with the use of the relaxed controls formalism, the latter playing no role in tackling the discrete time systems.

Everywhere in what follows we will be dealing with the discrete time controlled dynamical system

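$$\begin{array}{l}y(t+1)=f(y(t),u(t)),\quad t=0,1,\dots,\\ y(0)=y_{0},\\ y(t)\in Y,\\ u(t)\in U(y(t)).\end{array}\tag{1}$$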

Here $Y$ is a given nonempty compact subset of $I\!\!R^{m}$ , $\ U(\cdot):\,Y\leadsto U_{0}$ is an upper semicontinuous compact-valued mapping to a given compact metric space $U_{0}$ , $\ f(\cdot,\cdot):\,I\!\!R^{m}\times U_{0}\to I\!\!R^{m}$ is a continuous function.

Note that the last two constraints of (1) can be rewritten as one:

$$u(t)\in A(y(t)),$$

where the map $\ A(\cdot):\,Y\leadsto U_{0}$ is defined by the equation

$$A(y):=\{u\in U(y)\ |\ f(y,u)\in Y\}\quad\forall y\in Y.$$

As can be readily verified, the map $A(\cdot)$ is upper semicontinuous and its graph $G$ ,

$$G:={\rm graph}\,A=\{(y,u)\ |\ y\in Y,\ u\in U(y),\ f(y,u)\in Y\},$$

is a compact subset of $Y\times U_{0}$ .

A control $u(\cdot)$ and the pair $(y(\cdot),u(\cdot))$ will be called an admissible control and, respectively, an admissible process if the relationships (1) are satisfied. The sets of admissible controls will be denoted by ${\cal U}(y_{0})$ or ${\cal U}_{S}(y_{0})$ , depending on whether the problem is considered on the infinite time horizon ( $t\in{\cal T}:=\{0,1,\dots\}$ ) or on a finite time sequence ( $t\in\{0,\dots,S-1\}$ , where $S$ is a positive integer).

Consider the optimal control problem

$$\min_{u(\cdot)\in{\cal U}(y_{0})}\sum_{t=0}^{\infty}\alpha^{t}g(y(t),u(t))=:V_{\alpha}(y_{0}),\tag{2}$$

where $g:\,I\!\!R^{m}\times U_{0}\to I\!\!R$ is a continuous function and $\alpha\in(0,1)$ is a discount factor. Consider also the optimal control problem

$$\min_{u(\cdot)\in{\cal U}_{S}(y_{0})}\sum_{t=0}^{S-1}g(y(t),u(t))=:V(S,y_{0}),\tag{3}$$

Everywhere in the paper, it is assumed that

A1. The set ${\cal U}(y_{0})$ is not empty (that is, there exists at least one admissible control).

As shown below (see Propositions 2.1 and 2.3), the minima in (2) and (3) are achieved if A1 is satisfied. To obtain our main results, we use a stronger assumption:

A2. The set $A(y)$ is not empty for any $y\in Y$ .

This assumption implies non-emptiness of ${\cal U}(y)$ for any $y\in Y$ (systems that satisfy such a property are called viable; see [3]).
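The objects $A(y)$, $G$ and Assumption A2 are easy to inspect on a finite example. The sketch below is illustrative only: the state set, the control set and the dynamics are hypothetical choices, not taken from the paper. It builds the admissible control map, its graph, and checks viability.

```python
# Hypothetical toy system (not from the paper): integer states, controls {-1, 0, +1}.
Y = {0, 1, 2, 3}            # admissible state set (finite, hence compact)

def f(y, u):
    """Dynamics y(t+1) = f(y(t), u(t))."""
    return y + u

def U(y):
    """State-dependent control set; constant here for simplicity."""
    return {-1, 0, 1}

def A(y):
    """A(y) = {u in U(y) : f(y, u) in Y}: controls that keep the state in Y."""
    return {u for u in U(y) if f(y, u) in Y}

# Graph of A: all admissible state-control pairs.
G = {(y, u) for y in Y for u in A(y)}

# Assumption A2 (viability): A(y) is nonempty for every y in Y.
viable = all(len(A(y)) > 0 for y in Y)

print(sorted(A(0)), sorted(A(3)), len(G), viable)
```

At the boundary states the constraint $f(y,u)\in Y$ removes one control each, so $A(\cdot)$ genuinely depends on the state even though $U(\cdot)$ does not.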

Along with optimal control problems (2) and (3), let us consider two infinite-dimensional (ID) linear programming (LP) problems:

$$\min_{\gamma\in W_{\alpha}(y_{0})}\int_{G}g(y,u)\gamma(dy,du):=g_{\alpha}^{*}(y_{0})\tag{4}$$

and

$$\min_{\gamma\in W}\int_{G}g(y,u)\gamma(dy,du):=g^{*},\tag{5}$$

where $W_{\alpha}(y_{0})$ and $W$ are subsets of ${\cal P}(G)$ (here and in what follows ${\cal P}(G)$ stands for the space of probability measures on Borel subsets of $G$ ) defined by the equations:

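$$W_{\alpha}(y_{0}):=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}\big[\alpha(\varphi(f(y,u))-\varphi(y))+(1-\alpha)(\varphi(y_{0})-\varphi(y))\big]\gamma(dy,du)=0\quad\forall\varphi\in C(Y)\Big\}\tag{6}$$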

and

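$$W:=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}(\varphi(f(y,u))-\varphi(y))\gamma(dy,du)=0\quad\forall\varphi\in C(Y)\Big\}.\tag{7}$$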

Note that (4) and (5) are indeed LP problems since both the objective functions and the constraints defining $W_{\alpha}(y_{0})$ and $W$ are linear in the “decision variable” $\gamma$ . Note also that $W$ can be obtained from $W_{\alpha}(y_{0})$ by setting $\alpha=1$ .
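The last remark can be verified directly on the integrand that appears in the definition of $W_{\alpha}(y_{0})$. The display below is a short check using only the symbols of that definition. The first line is an algebraic rearrangement and the second is the substitution $\alpha=1$.

```latex
\begin{aligned}
\alpha\bigl(\varphi(f(y,u))-\varphi(y)\bigr)+(1-\alpha)\bigl(\varphi(y_{0})-\varphi(y)\bigr)
  &= \alpha\,\varphi(f(y,u))-\varphi(y)+(1-\alpha)\,\varphi(y_{0}),\\
\text{and at } \alpha=1:\qquad
\alpha\,\varphi(f(y,u))-\varphi(y)+(1-\alpha)\,\varphi(y_{0})
  &= \varphi(f(y,u))-\varphi(y).
\end{aligned}
```

The dependence on the initial condition $y_{0}$ enters only through the term $(1-\alpha)\varphi(y_{0})$, which vanishes at $\alpha=1$. This is why $W$ does not depend on $y_{0}$.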

In the paper, we prove that, under Assumption A2,

$$(1-\alpha)V_{\alpha}(y_{0})=g_{\alpha}^{*}(y_{0})\tag{8}$$

and the limits $\ \lim_{\alpha\uparrow 1}\min_{y\in Y}(1-\alpha)V_{\alpha}(y)$ and $\ \lim_{S\to\infty}\min_{y\in Y}{1\over S}V(S,y)$ exist and are equal to $g^{*}$ :

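$$\lim_{\alpha\uparrow 1}\min_{y\in Y}(1-\alpha)V_{\alpha}(y)=\lim_{S\to\infty}\min_{y\in Y}\frac{1}{S}V(S,y)=g^{*}.\tag{9}$$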

It is worth mentioning that there exists an extensive literature devoted to the relationship between the limits of the sums $\ \displaystyle(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}b_{t}$ and $\ \displaystyle{1\over S}\sum_{t=0}^{S-1}b_{t}$ as $\alpha\uparrow 1$ and $S\to\infty$ , respectively. There are many examples showing that these limits may not exist (see, e.g., [6], where relationships between the corresponding lower and upper limits were investigated). However, provided that the sequence $\{b_{t}\}$ is bounded, the existence of one of these limits implies the existence of the other and their equality (see, e.g., [32]). In the context of optimal control in discrete time, relationships between the lower and upper limits of $(1-\alpha)V_{\alpha}(y)$ and ${1\over S}V(S,y)$ were studied, e.g., in [26] and [28]. The (full) aforementioned limits may not exist, and, as was shown in [26] (without the assumption about the compactness of the set of admissible states $Y$ ), these limits, even if they exist, may be different. As mentioned above, in this paper we establish that, under the validity of A2, the limits of the minima over the initial conditions of $(1-\alpha)V_{\alpha}(y)$ and ${1\over S}V(S,y)$ exist and are equal to the optimal value of the IDLP problem (5).
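The agreement of the two kinds of means for a bounded sequence can be seen numerically. The sketch below uses a hypothetical periodic sequence $b_{t}$ (equal to 1 at every third step and 0 otherwise), chosen only because both limits equal $1/3$ and the discounted mean has the closed form $1/(1+\alpha+\alpha^{2})$. The infinite sum is truncated, which is an approximation.

```python
def abel_mean(b, alpha, n_terms):
    """Truncated discounted mean (1 - alpha) * sum_{t < n_terms} alpha**t * b(t)."""
    total, weight = 0.0, 1.0
    for t in range(n_terms):
        total += weight * b(t)
        weight *= alpha
    return (1.0 - alpha) * total

def cesaro_mean(b, S):
    """Time average (1 / S) * sum_{t < S} b(t)."""
    return sum(b(t) for t in range(S)) / S

def b(t):
    """Hypothetical bounded sequence: 1 at every third step, 0 otherwise."""
    return 1.0 if t % 3 == 0 else 0.0

for alpha in (0.9, 0.99, 0.999):
    # The truncation error is at most alpha**n_terms, negligible for these choices.
    print("alpha =", alpha, abel_mean(b, alpha, 40000))
for S in (30, 300, 3000):
    print("S =", S, cesaro_mean(b, S))
```

Both families of numbers approach $1/3$ here. The counterexamples cited in the text require sequences whose averages oscillate, which a periodic sequence cannot do.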

The paper is organized as follows. Section 2 contains some preliminary results used in the sequel. In Section 3, we introduce discounted and “non-discounted” occupational measures and we reformulate problems (2) and (3) in terms of minimization over the sets of such measures. In Section 4, we establish that (8) is valid, and in Section 5 we prove the validity of (9). In Section 5, we also establish asymptotic properties of the sets of discounted and non-discounted occupational measures. In Section 6, we prove auxiliary results that are used in Sections 4 and 5.

2 Preliminaries

Everywhere in this and the following sections, it is assumed that A1 is satisfied.

Proposition 2.1

The minimum in (2) is achieved.

Proof. For an admissible process $(y(\cdot),u(\cdot))$ , denote $J_{\alpha}(u,y_{0}):=\sum_{t=0}^{\infty}\alpha^{t}g(y(t),u(t))$ . Let $u_{k}(\cdot)$ , $k=1,2,\dots$ be a minimizing sequence of controls and let $y_{k}(\cdot)$ be the corresponding sequence of trajectories. By using the diagonalization argument and taking into account compactness of $G$ , we can find convergent subsequences (we do not relabel) $u_{k}(t)\to\bar{u}(t)$ and $y_{k}(t)\to\bar{y}(t)$ for all $t$ . By passing to the limit in the relation $y_{k}(t+1)=f(y_{k}(t),u_{k}(t))$ as $k\to\infty$ we conclude that the process $(\bar{y}(\cdot),\bar{u}(\cdot))$ is admissible. For any natural $N$ we have

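$$|J_{\alpha}(u_{k},y_{0})-J_{\alpha}(\bar{u},y_{0})|\leq\sum_{t=0}^{N}\alpha^{t}|g(y_{k}(t),u_{k}(t))-g(\bar{y}(t),\bar{u}(t))|+\sum_{t=N+1}^{\infty}\alpha^{t}|g(y_{k}(t),u_{k}(t))-g(\bar{y}(t),\bar{u}(t))|.$$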

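Take $\varepsilon>0$ and find $N$ large enough so that the second sum does not exceed $\varepsilon/2$ for all $k$ (this is possible because $g$ is bounded on the compact set $G$ , so the second sum is at most $2\max_{G}|g|\,\alpha^{N+1}/(1-\alpha)$ ). With $N$ fixed, the first sum can be made less than $\varepsilon/2$ by taking $k$ sufficiently large, since $g$ is continuous and $(y_{k}(t),u_{k}(t))\to(\bar{y}(t),\bar{u}(t))$ for each $t$ . Therefore, $J_{\alpha}(u_{k},y_{0})\to J_{\alpha}(\bar{u},y_{0})$ as $k\to\infty$ , which, since $u_{k}(\cdot)$ is a minimizing sequence, implies that the process $(\bar{y}(\cdot),\bar{u}(\cdot))$ is optimal. $\Box$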

Proposition 2.2

The optimal value function $V_{\alpha}(\cdot)$ is lower semicontinuous.

Proof. Take a sequence $y_{0k}\to y_{0}$ as $k\to\infty$ such that $V_{\alpha}(y_{0k})<\infty$ . Let $u_{k}(\cdot)$ be the corresponding sequence of minimizing controls, that is, controls such that $V_{\alpha}(y_{0k})=J_{\alpha}(u_{k},y_{0k})$ . We want to show that $\displaystyle\liminf_{k\to\infty}V_{\alpha}(y_{0k})\geq V_{\alpha}(y_{0}).$ Without loss of generality, assume that $\displaystyle\liminf_{k\to\infty}V_{\alpha}(y_{0k})$ is reached on the same sequence $y_{0k}$ . Again, using the diagonalization argument and passing to a subsequence, we can assume that $u_{k}(t)$ converges to an admissible control $\bar{u}(t)$ for all $t$ . Using the same argument as in the proof of Proposition 2.1, we can show that $\lim_{k\to\infty}J_{\alpha}(u_{k},y_{0k})=J_{\alpha}(\bar{u},y_{0})$ . We have

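$$\lim_{k\to\infty}V_{\alpha}(y_{0k})=\lim_{k\to\infty}J_{\alpha}(u_{k},y_{0k})=J_{\alpha}(\bar{u},y_{0})\geq\min_{u(\cdot)\in{\cal U}(y_{0})}J_{\alpha}(u,y_{0})=V_{\alpha}(y_{0}),$$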

which is the required inequality. $\Box$

Proposition 2.3

The minimum in (3) is achieved and the optimal value function $V(S,\cdot)$ is lower semicontinuous.

Proof. The fact that the minimum in (3) is achieved is obvious (since it is a finite-dimensional problem on a compact set), and the fact that $V(S,\cdot)$ is lower semicontinuous is proved similarly to Proposition 2.2. $\Box$

Corollary 2.4

The minima in (9) are achieved.

Proof. The proof follows from the fact that the functions $V_{\alpha}(\cdot)$ and $V(S,\cdot)$ are lower semicontinuous. $\Box$

Proposition 2.5

For any $y\in Y$ such that $V_{\alpha}(y)<\infty$ , the following equation is valid

$$V_{\alpha}(y)=\min_{u\in A(y)}\{g(y,u)+\alpha V_{\alpha}(f(y,u))\}.\tag{10}$$

Proof. The proposition is the well known dynamic programming principle for problem (2). For completeness of the exposition, we reproduce its proof in Section 6. $\Box$
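On a finite system the dynamic programming principle can be solved by fixed-point iteration, because the right-hand side of (10) is a contraction with modulus $\alpha$. The sketch below is an illustration under stated assumptions: the three-state system, the cost and the value $\alpha=0.9$ are hypothetical and serve only to show the mechanics. It is not a method proposed in the paper.

```python
ALPHA = 0.9
Y = [0, 1, 2]                     # hypothetical finite state set

def A(y):
    """Admissible controls: 0 = stay, 1 = step to the next state (mod 3)."""
    return [0, 1]

def f(y, u):
    return (y + u) % 3

def g(y, u):
    """Hypothetical running cost: free at state 0, unit cost elsewhere."""
    return 0.0 if y == 0 else 1.0

def bellman(V):
    """One application of the right-hand side of (10)."""
    return {y: min(g(y, u) + ALPHA * V[f(y, u)] for u in A(y)) for y in Y}

# Value iteration: repeated application converges to V_alpha.
V = {y: 0.0 for y in Y}
for _ in range(500):
    V = bellman(V)

# Residual of the dynamic programming equation at the computed V.
residual = max(abs(V[y] - bellman(V)[y]) for y in Y)
print(V, residual)
```

In this example the optimal policy is to move to state 0 and stay there, which gives $V_{\alpha}(0)=0$, $V_{\alpha}(2)=1$ and $V_{\alpha}(1)=1+\alpha$.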

For a lower semicontinuous function $\psi:\,Y\to I\!\!R$ , let $H_{\psi}(y)$ be defined as follows

$$H_{\psi}(y):=\min_{u\in A(y)}\{\alpha(\psi(f(y,u))-\psi(y))+g(y,u)\}.$$

Then equation (10) can be written as

$$H_{V_{\alpha}}(y)-(1-\alpha)V_{\alpha}(y)=0,$$

which resembles the Hamilton-Jacobi-Bellman equation for continuous time systems; see, e.g., [4].

3 Occupational Measure Formulations

Let $(y(\cdot),u(\cdot))$ be an admissible process. A probability measure $\gamma^{\alpha}_{(y(\cdot),u(\cdot))}$ is called the discounted occupational measure generated by the process $(y(\cdot),u(\cdot))$ if, for any Borel set $Q\subset G$ ,

$$\gamma^{\alpha}_{(y(\cdot),u(\cdot))}(Q)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}1_{Q}(y(t),u(t)),$$

where $1_{Q}(\cdot)$ is the indicator function of $Q$ . A probability measure $\gamma_{(y(\cdot),u(\cdot)),S}$ is called the occupational measure generated by the process $(y(\cdot),u(\cdot))$ over the time sequence $\{0,1,...,S-1\}$ if, for any Borel set $Q\subset G$ ,

$$\gamma_{(y(\cdot),u(\cdot)),S}(Q)=\frac{1}{S}\sum_{t=0}^{S-1}1_{Q}(y(t),u(t)),$$

It can be shown that if $\gamma^{\alpha}_{(y(\cdot),u(\cdot))}$ is the discounted occupational measure generated by the process $(y(\cdot),u(\cdot))$ , then

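$$\int_{G}q(y,u)\gamma^{\alpha}_{(y(\cdot),u(\cdot))}(dy,du)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}q(y(t),u(t))\tag{13}$$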

for any Borel measurable function $q$ on $G$ . Also, it can be shown that if $\gamma_{(y(\cdot),u(\cdot)),S}$ is the occupational measure generated by the process $(y(\cdot),u(\cdot))$ over the time sequence $\{0,1,...,S-1\}$ , then

$$\int_{G}q(y,u)\gamma_{(y(\cdot),u(\cdot)),S}(dy,du)=\frac{1}{S}\sum_{t=0}^{S-1}q(y(t),u(t))\tag{14}$$

for any Borel measurable function $q$ on $G$ .
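For a finite system both occupational measures are finite weighted sums of point masses, so they can be stored as dictionaries keyed by state-control pairs. The sketch below is illustrative: the system, the control and the cost are hypothetical, and the infinite discounted sum is truncated at 400 steps. It builds both measures for one process and compares the two sides of (13) and (14).

```python
from collections import defaultdict

ALPHA = 0.9

def f(y, u):
    return (y + u) % 3            # hypothetical dynamics on states {0, 1, 2}

def g(y, u):
    return 0.0 if y == 0 else 1.0  # hypothetical running cost

def control(t):
    """Hypothetical control: alternate 'step' (1) and 'stay' (0)."""
    return 1 if t % 2 == 0 else 0

def rollout(y0, T):
    """First T state-control pairs of the process started at y0."""
    pairs, y = [], y0
    for t in range(T):
        u = control(t)
        pairs.append((y, u))
        y = f(y, u)
    return pairs

def discounted_measure(pairs, alpha):
    """Mass (1 - alpha) * alpha**t on the pair visited at time t (truncated)."""
    gamma, w = defaultdict(float), 1.0 - alpha
    for p in pairs:
        gamma[p] += w
        w *= alpha
    return dict(gamma)

def averaged_measure(pairs):
    """Mass 1 / S on each of the S visited pairs."""
    gamma = defaultdict(float)
    for p in pairs:
        gamma[p] += 1.0 / len(pairs)
    return dict(gamma)

def integrate(q, gamma):
    return sum(q(y, u) * m for (y, u), m in gamma.items())

pairs = rollout(1, 400)
gamma_alpha = discounted_measure(pairs, ALPHA)
gamma_S = averaged_measure(pairs[:12])

lhs_13 = integrate(g, gamma_alpha)
rhs_13 = (1 - ALPHA) * sum(ALPHA ** t * g(y, u) for t, (y, u) in enumerate(pairs))
lhs_14 = integrate(g, gamma_S)
rhs_14 = sum(g(y, u) for (y, u) in pairs[:12]) / 12
print(lhs_13, rhs_13, lhs_14, rhs_14)
```

Each measure has total mass 1 (up to the truncation error $\alpha^{400}$ in the discounted case), which is what makes it an element of ${\cal P}(G)$.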

To describe convergence properties of occupational measures, we introduce the following metric on ${\cal P}(G)$ :

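$$\rho(\gamma^{\prime},\gamma^{\prime\prime}):=\sum_{j=1}^{\infty}\frac{1}{2^{j}}\Big|\int_{G}q_{j}(y,u)\gamma^{\prime}(dy,du)-\int_{G}q_{j}(y,u)\gamma^{\prime\prime}(dy,du)\Big|$$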

for $\gamma^{\prime},\gamma^{\prime\prime}\in{\cal P}(G)$ , where $q_{j}(\cdot),\,j=1,2,\dots,$ is a sequence of Lipschitz continuous functions dense in the unit ball of the space of continuous functions $C(G)$ from $G$ to $I\!\!R$ . This metric is consistent with the weak$^{*}$ convergence topology on ${\cal P}(G)$ , that is, a sequence $\gamma^{k}\in{\cal P}(G)$ converges to $\gamma\in{\cal P}(G)$ in this metric if and only if

$$\lim_{k\to\infty}\int_{G}q(y,u)\gamma^{k}(dy,du)=\int_{G}q(y,u)\gamma(dy,du)$$

for any $q\in C(G)$ . Note that the sets $W_{\alpha}(y_{0})$ and $W$ are compact in this topology.

Using the metric $\rho$ , we can define the “distance” $\rho(\gamma,\Gamma)$ between $\gamma\in{\cal P}(G)$ and $\Gamma\subset{\cal P}(G)$ and the Hausdorff metric $\rho_{H}(\Gamma_{1},\Gamma_{2})$ between $\Gamma_{1}\subset{\cal P}(G)$ and $\Gamma_{2}\subset{\cal P}(G)$ as follows:

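$$\rho(\gamma,\Gamma):=\inf_{\gamma^{\prime}\in\Gamma}\rho(\gamma,\gamma^{\prime}),\qquad\rho_{H}(\Gamma_{1},\Gamma_{2}):=\max\Big\{\sup_{\gamma\in\Gamma_{1}}\rho(\gamma,\Gamma_{2}),\ \sup_{\gamma\in\Gamma_{2}}\rho(\gamma,\Gamma_{1})\Big\}.$$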

Note that, although, by some abuse of terminology, we refer to $\rho_{H}(\cdot,\cdot)$ as a metric on the set of subsets of ${\cal P}(G)$ , it is, in fact, a semimetric on this set (since $\rho_{H}(\Gamma_{1},\Gamma_{2})=0$ implies $\Gamma_{1}=\Gamma_{2}$ if $\Gamma_{1}$ and $\Gamma_{2}$ are closed, and the equality may not be true if at least one of these sets is not closed).

Introduce the following notation for the sets of occupational measures:

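$$\Gamma_{\alpha}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}(y_{0})}\{\gamma^{\alpha}_{(y(\cdot),u(\cdot))}\},\qquad\Gamma_{\alpha}:=\bigcup_{y_{0}\in Y}\{\Gamma_{\alpha}(y_{0})\},$$

$$\Gamma_{S}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}_{S}(y_{0})}\{\gamma_{(y(\cdot),u(\cdot)),S}\},\qquad\Gamma_{S}:=\bigcup_{y_{0}\in Y}\{\Gamma_{S}(y_{0})\}.$$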

Due to (13) and (14), problems (2) and (3) can be rewritten in the form

$$\min_{\gamma\in\Gamma_{\alpha}(y_{0})}\int_{G}g(y,u)\gamma(dy,du)=(1-\alpha)V_{\alpha}(y_{0})\tag{15}$$

and

$$\min_{\gamma\in\Gamma_{S}(y_{0})}\int_{G}g(y,u)\gamma(dy,du)=\frac{1}{S}V(S,y_{0}),$$

respectively.

4 Validity of (8)

Proposition 4.1

The inclusion $\Gamma_{\alpha}(y_{0})\subset W_{\alpha}(y_{0})$ is true.

Proof. For arbitrary $\varphi\in C(Y)$ and admissible process $(y(\cdot),u(\cdot))$ we have

[TABLE]

Multiplying both sides by $1-\alpha$ and taking into account (13), we obtain

[TABLE]

where $\gamma^{\alpha}_{(y(\cdot),u(\cdot))}\in\Gamma_{\alpha}(y_{0})$ is generated by $(y(\cdot),u(\cdot))$ . The latter is equivalent to

[TABLE]

This implies that $\gamma^{\alpha}_{(y(\cdot),u(\cdot))}\in W_{\alpha}(y_{0})$ , which concludes the proof of the proposition. $\Box$
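The inclusion of Proposition 4.1 can be checked numerically on a finite example by evaluating the linear constraint that defines $W_{\alpha}(y_{0})$ on a discounted occupational measure. The sketch below is an illustration under stated assumptions: the system and the control are hypothetical, the measure is truncated at $T=400$ steps, and the test functions $\varphi$ are the indicators of single states, which span $C(Y)$ when $Y$ is finite.

```python
ALPHA, Y0, T = 0.9, 1, 400

def f(y, u):
    return (y + u) % 3            # hypothetical dynamics on states {0, 1, 2}

# Discounted occupational measure of a hypothetical admissible process:
# alternate "step" (u = 1) and "stay" (u = 0), starting from Y0.
gamma, y, w = {}, Y0, 1.0 - ALPHA
for t in range(T):
    u = 1 if t % 2 == 0 else 0
    gamma[(y, u)] = gamma.get((y, u), 0.0) + w   # mass (1 - alpha) * alpha**t
    y, w = f(y, u), w * ALPHA

def constraint(phi):
    """Integral of alpha*(phi(f(y,u)) - phi(y)) + (1-alpha)*(phi(y0) - phi(y))."""
    return sum(
        (ALPHA * (phi(f(s, c)) - phi(s)) + (1 - ALPHA) * (phi(Y0) - phi(s))) * m
        for (s, c), m in gamma.items()
    )

# One residual per indicator test function phi = 1_{state}.
residuals = [constraint(lambda s, k=k: 1.0 if s == k else 0.0) for k in range(3)]
print(residuals)
```

The residuals are zero up to rounding and the truncation term of order $\alpha^{T}$, because the discounted sums of $\varphi(y(t+1))$ and $\varphi(y(t))$ telescope.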

Remark 4.2

Due to the assumed validity of A1, $\Gamma_{\alpha}(y_{0})\neq\emptyset$ and, hence, $W_{\alpha}(y_{0})\neq\emptyset$ .

Note that from Proposition 4.1 it follows that

$$g_{\alpha}^{*}(y_{0})\leq(1-\alpha)V_{\alpha}(y_{0}).\tag{16}$$

Let $LS$ be the class of bounded lower semicontinuous functions from $Y$ to $I\!\!R$ . Note that $V_{\alpha}(\cdot)\in LS$ if Assumption A2 is satisfied. In fact, in this case

[TABLE]

From this point on, it is everywhere assumed that Assumption A2 is indeed satisfied.

Consider the max-min problem

[TABLE]

We say that $\tilde{\psi}$ is a solution of (18) if

[TABLE]

Our first main result is the following theorem.

Theorem 4.3

The optimal values in problems (4) and (18) coincide and are equal to the optimal value of (2) multiplied by $(1-\alpha)$ , that is,

[TABLE]

Moreover, the supremum in (18) is reached at $\psi=V_{\alpha}$ .

Proof. From Proposition 2.5 we have

[TABLE]

which implies that

[TABLE]

Therefore,

[TABLE]

Taking into account (16), we get

[TABLE]

Let us show the opposite inequality. For $\psi\in LS$ denote

[TABLE]

so that $\displaystyle\mu^{*}_{\alpha}(y_{0})=\sup_{\psi\in LS}\mu_{\alpha}(\psi,y_{0})$ . Take $\gamma\in W_{\alpha}(y_{0})$ , arbitrary $\psi\in LS$ and let $\{\psi_{n}\}_{n=1}^{\infty}$ be a bounded sequence of continuous functions such that $\psi_{n}(y)\to\psi(y)$ point-wise on $Y$ as $n\to\infty$ (due to (17), such a sequence exists; see, e.g., Theorem A6.6 in [2]). From (22), from the Lebesgue dominated convergence theorem and from the definition of $W_{\alpha}(y_{0})$ it follows that

[TABLE]

Taking the supremum with respect to $\psi\in LS$ and the minimum with respect to $\gamma\in W_{\alpha}(y_{0})$ leads to $\mu^{*}_{\alpha}(y_{0})\leq g^{*}_{\alpha}(y_{0})$ which, together with (21), implies (19). It also follows from (20) that

[TABLE]

which implies the second part of the theorem. $\Box$

Corollary 4.4

The following equality is valid

[TABLE]

where $\bar{\rm co}\,$ stands for the closure of the convex hull of the corresponding set.

Proof. Due to (4) and (15), the equality (8) can be rewritten in the form

[TABLE]

which implies that

[TABLE]

Since the latter is valid for any continuous $g$ , it proves the validity of (23). $\Box$

Remark 4.5

Note that problem (18) can be shown to be equivalent to the problem dual to the IDLP problem (4) (see Appendix of [15]), with the equality of the optimal values being a part of the duality relationships between these two problems.

5 Validity of (9)

Let us introduce the following notation:

[TABLE]

where the minimization is over admissible controls and over the initial conditions in $Y$ .

The main results of this section are Theorems 5.1 and 5.7 below. In Theorem 5.1 we, in particular, establish existence and equality of the limits in (9). Theorem 5.7 deals with a limiting property of the sets of occupational measures and is closely related to Theorem 5.1. Continuous-time analogs of Theorems 5.1 and 5.7 are proved in [15], Chapter 6. However, in continuous time, as opposed to discrete time, a few strong assumptions are needed for the validity of the corresponding results (e.g., Lipschitz continuity of the value function).

Let

[TABLE]

Theorem 5.1

The limits $\displaystyle\lim_{\alpha\uparrow 1}\min_{y\in Y}(1-\alpha)V_{\alpha}(y)$ and $\displaystyle\lim_{S\to\infty}G_{S}$ exist and

[TABLE]

The proof is broken down into a series of propositions and lemmas.

Proposition 5.2

The equality $g^{*}=\mu^{*}$ holds true.

Proof. Take any $\psi\in LS$ . Integrating the inequality

[TABLE]

with respect to arbitrary $\gamma\in W$ we obtain

[TABLE]

Taking minimum with respect to $\gamma\in W$ and supremum with respect to $\psi\in LS$ , we conclude that

[TABLE]

Let us show the opposite inequality. Define

[TABLE]

that is, compared to (25), supremum in the formula above is taken with respect to continuous, rather than lower semicontinuous bounded functions. It is clear that

[TABLE]

therefore $\mu_{C}^{*}<\infty$ .

Let $\{\phi_{i}\}_{i=1}^{\infty}$ be a sequence of functions in $C(Y)$ with the following properties: (i) any finite collection of functions from this sequence is linearly independent on $Y$ , (ii) for any $\psi\in C(Y)$ and any $\delta>0$ there exist $N$ and scalars $\lambda_{i}^{N}$ , $i=1,\dots,N$ such that $\displaystyle\sup_{y\in Y}|\psi(y)-\sum_{i=1}^{N}\lambda_{i}^{N}\phi_{i}(y)|\leq\delta$ . (An example of such a sequence is the sequence of monomials $y_{1}^{i_{1}}\dots y_{m}^{i_{m}},\,i_{1},\dots,i_{m}=0,1,\dots$ , where $y_{j}$ stands for the $j$ th component of $y$ .)

Let us notice first that for any $\psi\in C(Y)$ we have

[TABLE]

Indeed, if this was not the case, then, for $\psi_{m}:=m\psi$ with positive integer $m$ we would get

[TABLE]

which contradicts boundedness of $\mu_{C}^{*}$ .

Assume that functions $\{\phi_{i}\}$ are normalized so that $\max_{y\in Y}|\phi_{i}(y)|<1/2^{i}$ . Define $\hat{Q}\subset I\!\!R\times l^{1}$ by

[TABLE]

It is easy to see that the set $\hat{Q}$ is compact and for any $j=1,2,\dots$ the point $(g^{*}-\frac{1}{j},0)$ does not belong to $\hat{Q},$ where 0 is the zero element of $l^{1}$ (otherwise, $g^{*}$ is not the minimum in (5)). Due to the Hahn-Banach separation theorem (see, e.g., [11], Section V.2) there exists a sequence $(\kappa^{j},\lambda^{j})\in I\!\!R\times l^{\infty}$ (where $\lambda^{j}=(\lambda_{1}^{j},\lambda_{2}^{j},\dots)$ ) such that

[TABLE]

where $\delta^{j}>0$ for all $j$ and $\psi_{\lambda^{j}}:=\sum_{i=1}^{\infty}\lambda_{i}^{j}\phi_{i}$ . From the last formula it is easy to see that $\kappa^{j}\geq 0$ . Let us show that, in fact, $\kappa^{j}>0$ . Indeed, if it was not the case and $\kappa^{j}=0$ , then we would have

[TABLE]

which is a contradiction to (29). Thus, $\kappa^{j}>0$ . Dividing (30) through by $\kappa^{j}$ we obtain

[TABLE]

Therefore, $g^{*}\leq\mu^{*}_{C}$ . Taking into account inequalities (26) and (28) we conclude that $g^{*}=\mu^{*}$ . $\Box$

Proposition 5.3

The limit $\displaystyle\lim_{\alpha\uparrow 1}\min_{y\in Y}(1-\alpha)V_{\alpha}(y)$ exists and is equal to $g^{*}$ .

Proof. Let us show that

[TABLE]

Indeed, let $\alpha_{i}\uparrow 1$ , $y_{i}\in Y$ and $\gamma_{i}\in W_{\alpha_{i}}(y_{i})$ be such that $\gamma_{i}\to\gamma$ . We have

[TABLE]

Passing to the limit as $i\to\infty$ in this equality we obtain $\displaystyle\int_{G}(\varphi(f(y,u))-\varphi(y))\gamma(dy,du)=0$ , therefore, $\gamma\in W$ , i.e., (31) holds. It follows from (31) and (19) that

[TABLE]

From (10) it follows that for any $\alpha\in(0,1)$ we have

[TABLE]

Therefore,

[TABLE]

Consequently,

[TABLE]

and

[TABLE]

Along with Proposition 5.2, the latter implies

[TABLE]

The assertion of the proposition follows from this relation and (32). $\Box$

The following two lemmas, proved in the Appendix, are discrete-time analogs of [19], Lemma 3.5 (ii) and [20], Lemma 3.8. For $v\in I\!\!R$ the notation $[v]$ stands for the integer part of $v$ .

Lemma 5.4

Let $g:\,{\cal T}\to I\!\!R$ be a function such that $|g(t)|\leq M$ for all $t$ . Let $\alpha\in(0,1)$ and

[TABLE]

Then for any $\varepsilon>0$ there exists a positive integer $\displaystyle T\geq\left[{\varepsilon\over(4M+4|\sigma|+\varepsilon)(-\ln\alpha)}\right]$ satisfying

[TABLE]

Lemma 5.5

Let $g:\,{\cal T}\to I\!\!R$ be a function such that $|g(t)|\leq M$ for all $t$ . Let $t$ be an arbitrary positive integer and

[TABLE]

For any $\varepsilon>0$ there exists $t^{*}\in\{0,\dots,t-1\}$ such that

[TABLE]

Moreover,

[TABLE]

Proposition 5.6

The limit $\displaystyle\lim_{S\to\infty}G_{S}$ exists and is equal to $g^{*}$ .

Proof. Let us show first that

[TABLE]

Take a sequence $S_{i}\to\infty$ as $i\to\infty$ and let $\gamma_{i}\in\Gamma_{S_{i}}$ be such that $\gamma_{i}\to\gamma$ . Since $\gamma_{i}\in\Gamma_{S_{i}}$ , there exists an initial condition $y_{0i}$ and a control $u_{i}(\cdot)\in{\cal U}_{S_{i}}(y_{0i})$ such that for the corresponding trajectory $y_{i}(\cdot)$ and any $\varphi\in C(Y)$ we have

[TABLE]

Therefore,

[TABLE]

due to boundedness of $Y$ . Thus, $\gamma\in W$ , i.e., inclusion (38) holds, which implies that

[TABLE]

Take a sequence $\alpha_{i}\uparrow 1$ . Due to Proposition 5.3 there exists a sequence of initial conditions $y_{0i}$ , controls $u_{i}(\cdot)\in{\cal U}(y_{0i})$ and the corresponding trajectories $y_{i}(\cdot)$ such that

[TABLE]

where $\lim_{i\to\infty}\xi_{i}=0$ . Applying Lemma 5.4 with $\sigma=g^{*}+\xi_{i}$ and $\varepsilon=\sqrt{-\ln\alpha_{i}}$ we conclude that there exists a sequence $S_{i}$ , such that $S_{i}\geq K/\sqrt{-\ln\alpha_{i}}$ ( $K$ is a constant independent of $i$ ) and

[TABLE]

therefore, $\displaystyle\liminf_{S\to\infty}G_{S}\leq g^{*}$ . Together with (39) this implies that

[TABLE]

The latter means that

[TABLE]

where $\lim_{i\to\infty}\eta_{i}=0$ . Let us apply Lemma 5.5 in which $S_{i}$ plays the role of $t$ and $\sigma=g^{*}+\eta_{i}$ . Set $\varepsilon={1/{S_{i}}}$ , denote the value corresponding to $t^{*}$ by $t_{i}$ and $l(S_{i}):=S_{i}-t_{i}$ . We conclude that $l(S_{i})\to\infty$ as $i\to\infty$ and

[TABLE]

Let $\tilde{u}_{i}(\cdot)=u_{i}(t_{i}+\cdot)$ , $\tilde{y}_{i}(\cdot)=y_{i}(t_{i}+\cdot)$ . Note that $(\tilde{u}_{i},\tilde{y}_{i})$ is an admissible process. It follows from (42) that

[TABLE]

hence,

[TABLE]

which, along with (41), completes the proof of the proposition. $\Box$

Combining the assertions of Propositions 5.2, 5.3, and 5.6, we complete the proof of Theorem 5.1.
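The two limits in Theorem 5.1 can be observed on a small finite system. The sketch below is an illustration under stated assumptions: the three-state system and its cost are hypothetical, and the identification of $g^{*}$ with the smallest mean cost over cycles (here the cycle $0\to 1\to 2\to 0$ with mean $0.2$) is an assumption about this finite example rather than a claim taken from the paper. The finite-horizon values are computed by backward induction and the discounted values by value iteration.

```python
Y = [0, 1, 2]
CONTROLS = [0, 1]                       # 0: stay, 1: step to the next state

def f(y, u):
    return (y + u) % 3

def g(y, u):
    if u == 0:
        return 1.0                      # staying put costs 1
    return 0.6 if y == 2 else 0.0       # stepping is free except out of state 2

def finite_horizon_min_average(S):
    """min over y of V(S, y) / S, with V(S, .) computed by backward induction."""
    V = {y: 0.0 for y in Y}
    for _ in range(S):
        V = {y: min(g(y, u) + V[f(y, u)] for u in CONTROLS) for y in Y}
    return min(V.values()) / S

def discounted_min_average(alpha, iterations=30000):
    """min over y of (1 - alpha) * V_alpha(y), with V_alpha from value iteration."""
    V = {y: 0.0 for y in Y}
    for _ in range(iterations):
        V = {y: min(g(y, u) + alpha * V[f(y, u)] for u in CONTROLS) for y in Y}
    return (1.0 - alpha) * min(V.values())

for S in (10, 100, 1000):
    print("S =", S, finite_horizon_min_average(S))
for alpha in (0.9, 0.99, 0.999):
    print("alpha =", alpha, discounted_min_average(alpha))
```

In this example the discounted quantity has the closed form $0.6\,\alpha^{2}/(1+\alpha+\alpha^{2})$, attained from the initial state 0, so it increases to $0.2$ as $\alpha\uparrow 1$. The minimization over initial conditions matters: starting from state 2 gives a strictly larger discounted value for every $\alpha<1$.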

The theorem below asserts convergence of the sets of occupational measures $\Gamma_{\alpha}$ and $\Gamma_{S}$ defined in Section 3 to $W$ given by (43).

Theorem 5.7

The following holds:

[TABLE]

Proof. The assertion of Proposition 5.3 in terms of occupational measures can be written as

[TABLE]

which, due to linearity of the integral with respect to $\gamma$ , implies that

[TABLE]

Since $g$ in the equality above can be any continuous function, we can write

[TABLE]

Denote

[TABLE]

Due to (31) we have

[TABLE]

which, due to convexity of $W$ , implies that

[TABLE]

that is,

[TABLE]

From the inclusion

[TABLE]

proved in Proposition 4.1, by taking the union with respect to $y_{0}\in Y$ and, then, closure of the convex hull, we conclude that

[TABLE]

Therefore, from (45) we get

[TABLE]

To complete the proof of the equality

[TABLE]

it remains to show that

[TABLE]

The proof of this relation is based on formula (43) and the weak$^{*}$ separation theorem. It follows the same steps as the proof of Proposition 6.1 in [15], starting with formula (6.6). The only difference is that the parameter $C$ , approaching 0 in [15], should be replaced with $\alpha$ , approaching 1. We do not reproduce this proof here.

The proof of the second equality of the theorem $\lim_{S\to\infty}\rho_{H}(\bar{\rm co}\,\,\Gamma_{S},W)=0$ is very similar to the proof of (46). Namely, Proposition 5.6 can be written in terms of occupational measures as

[TABLE]

which implies that

[TABLE]

Further, from (38) we derive that (cf. (44)-(45))

[TABLE]

The rest of the proof follows from (47) and (48) by the weak$^{*}$ separation theorem, along the lines of [15], as described above.

$\Box$

6 Appendix

Proof of Proposition 2.5. We have

[TABLE]

The second minimum is equal to $V(y(1))=V(f(y(0),u(0)))$ , therefore,

[TABLE]

Replacing now $u(0)$ and $y(0)$ with $u$ and $y$ , respectively, we obtain relation (10). $\Box$
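The dynamic programming argument above can be checked numerically on a small example. The sketch below is only an illustration under stated assumptions: relation (10) is not reproduced in this text, so the code assumes the standard discounted form $V(y)=\min_{u\in A(y)}\{g(y,u)+\alpha V(f(y,u))\}$, and the three-state system, the cost `g` and the helper names are hypothetical choices, not taken from the paper.

```python
# Toy check of the discounted dynamic programming relation
#   V(y) = min_{u in A(y)} { g(y, u) + alpha * V(f(y, u)) }   (assumed form of (10)).
# The system below (Y = {0, 1, 2}, f(y, u) = y + u) is a hypothetical example.

STATES = [0, 1, 2]
ALPHA = 0.9


def f(y, u):
    """Dynamics y(t+1) = f(y(t), u(t))."""
    return y + u


def g(y, u):
    """Running cost: distance from state 1 plus a small control penalty."""
    return (y - 1) ** 2 + 0.1 * abs(u)


def admissible(y):
    """A(y): controls from U = {-1, 0, 1} that keep the next state inside Y."""
    return [u for u in (-1, 0, 1) if f(y, u) in STATES]


def bellman_rhs(V, y):
    """Right-hand side of the assumed relation (10) at state y."""
    return min(g(y, u) + ALPHA * V[f(y, u)] for u in admissible(y))


def value_iteration(iters=500):
    """Iterate the Bellman operator starting from V = 0."""
    V = {y: 0.0 for y in STATES}
    for _ in range(iters):
        V = {y: bellman_rhs(V, y) for y in STATES}
    return V


def bellman_residual(V):
    """Largest violation of V(y) = min_u {g + alpha * V(f)} over all states."""
    return max(abs(V[y] - bellman_rhs(V, y)) for y in STATES)


V = value_iteration()
```

In this toy system the cheapest strategy is to move to state 1 and stay there, so the computed `V` vanishes at state 1 and `bellman_residual(V)` is zero up to rounding.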

Lemma 6.1

([19], Lemma 3.5 (ii)) Let $q:\,[0,\infty)\to I\!\!R$ be a measurable function such that $|q(\tau)|\leq M$ for a.a. $\tau\in I\!\!R$. Let $\delta>0$ be arbitrary and

[TABLE]

Then for any $\varepsilon>0$ there exists $\displaystyle\tilde{T}\geq{\varepsilon\over(4M+4|\tilde{\sigma}|+\varepsilon)\delta}$ satisfying

[TABLE]

Proof of Lemma 5.4. Lemma 5.4 is a discrete-time analog of Lemma 6.1.

Define the piecewise constant function $q:\,[0,\infty)\to I\!\!R$ by

[TABLE]

and apply Lemma 6.1 with $\delta=-\ln\alpha$ . Let us first evaluate $\tilde{\sigma}$ given by (49). For $t\in{\cal T}$ we have

[TABLE]

therefore,

[TABLE]

By Lemma 6.1, there exists $\tilde{T}\geq\varepsilon/\big((4M+4|\sigma|+\varepsilon)(-\ln\alpha)\big)$ such that

[TABLE]

If $0<\tilde{T}<1$, then ${1\over\tilde{T}}\int_{0}^{\tilde{T}}q(\tau)\,d\tau=g(0)$ and inequality (35) holds in the form

[TABLE]

with $T=1$ . Assume, therefore, that $\tilde{T}\geq 1$ .

Let $T:=[\tilde{T}]\geq 1$ and denote $\Delta T:=\tilde{T}-T\in[0,1)$ . We have

[TABLE]

For the second integral we have

[TABLE]

Taking into account that $1/(1+x)\geq 1-x$ for $x>-1$ we have

[TABLE]

therefore, if $\int_{0}^{T}q(\tau)\,d\tau\geq 0$, for the first integral on the right-hand side of (52) we have

[TABLE]

If $\int_{0}^{T}q(\tau)\,d\tau<0$, then $\displaystyle{1\over T+\Delta T}\int_{0}^{T}q(\tau)\,d\tau\geq{1\over T}\int_{0}^{T}q(\tau)\,d\tau$ and the inequality above still holds. Thus, from (52)-(54) we obtain that

[TABLE]

and (35) follows from (51) and (55). $\Box$
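The passage from discrete to continuous time in the proof of Lemma 5.4 rests on an elementary identity: if $q(\tau)=g(t)$ for $\tau\in[t,t+1)$ and $\delta=-\ln\alpha$, then $\delta\int_{t}^{t+1}e^{-\delta\tau}\,d\tau=(1-\alpha)\alpha^{t}$, so the normalized discounted integral of $q$ equals the normalized discounted sum of $g$. The sketch below checks this numerically over a finite horizon. It is an illustration only: the exact form of the displays (49)-(50) is not reproduced in this text, so the normalization by $\delta$ and by $1-\alpha$ is an assumption, and the function names are hypothetical.

```python
import math


def discounted_integral(g_values, alpha, n_sub=2000):
    """Midpoint-rule approximation of
        delta * int_0^N exp(-delta * tau) * q(tau) d tau,
    where q(tau) = g_values[floor(tau)] is piecewise constant,
    delta = -ln(alpha) and N = len(g_values)."""
    delta = -math.log(alpha)
    h = 1.0 / n_sub
    total = 0.0
    for t, g_t in enumerate(g_values):
        for k in range(n_sub):
            tau = t + (k + 0.5) * h  # midpoint of the k-th subinterval of [t, t+1)
            total += math.exp(-delta * tau) * g_t * h
    return delta * total


def discounted_sum(g_values, alpha):
    """(1 - alpha) * sum_{t=0}^{N-1} alpha^t * g(t)."""
    return (1.0 - alpha) * sum(alpha ** t * g_t for t, g_t in enumerate(g_values))


g_values = [1.0, -2.0, 0.5, 3.0]
alpha = 0.9
gap = abs(discounted_integral(g_values, alpha) - discounted_sum(g_values, alpha))
```

For these inputs `gap` is at the level of the quadrature error, which is consistent with the two quantities being equal.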

Proof of Lemma 5.5. Let $\displaystyle\beta:=\max_{1\leq s\leq t}{1\over s}\sum_{\tau=0}^{s-1}q(\tau).$ If $\beta\leq\sigma+\varepsilon$ then the statement of the lemma holds with $t^{*}=0$ . Assume, therefore, that $\beta>\sigma+\varepsilon$ and set

[TABLE]

Let us show that this $t^{*}$ satisfies the required properties. Indeed, $t^{*}\neq t$ due to the definition of $\sigma$ , hence, $0\leq t^{*}\leq t-1$ . Let us show that (36) is satisfied. Assume the contrary, that is, there exists $1\leq s_{1}\leq t-t^{*}$ such that $\displaystyle\sigma+\varepsilon<{1\over s_{1}}\sum_{\tau=0}^{s_{1}-1}q(t^{*}+\tau)={1\over s_{1}}\sum_{\tau=t^{*}}^{t^{*}+s_{1}-1}q(\tau)$ . This implies that

[TABLE]

which contradicts the definition of $t^{*}$ .

Let us show now that $l(t):=t-t^{*}\to\infty$ as $t\to\infty$ . We have

[TABLE]

This can be equivalently written as

[TABLE]

or,

[TABLE]

which implies that $l\to\infty$ as $t\to\infty$ , that is, (37) holds. $\Box$
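The construction of $t^{*}$ in the proof of Lemma 5.5 can be tried out on a finite sequence. The sketch below is written under assumptions, since the display defining $t^{*}$ is not reproduced in this text: it takes $\sigma$ to be the average of $q$ over the first $t$ steps and $t^{*}$ to be the largest $s\in\{1,\dots,t\}$ whose prefix average exceeds $\sigma+\varepsilon$ (and $t^{*}=0$ if there is none), which is what the contradiction argument above suggests. The function names are hypothetical, and exact rational arithmetic is used so that the strict inequalities are not blurred by rounding.

```python
from fractions import Fraction
from itertools import accumulate


def find_t_star(q, eps):
    """Return (t_star, sigma, prefix) for a finite sequence q(0), ..., q(t-1).

    Assumed definitions (see the lead-in):
      sigma  = (1/t) * sum_{tau<t} q(tau)
      t_star = largest s in {1,...,t} with (1/s) * sum_{tau<s} q(tau) > sigma + eps,
               or 0 if no such s exists.
    """
    q = [Fraction(x) for x in q]
    eps = Fraction(eps)
    t = len(q)
    sigma = sum(q) / t
    # prefix[s] = q(0) + ... + q(s-1)
    prefix = [Fraction(0)] + list(accumulate(q))
    t_star = 0
    for s in range(1, t + 1):
        if prefix[s] > s * (sigma + eps):
            t_star = s
    return t_star, sigma, prefix


def window_averages_ok(q, eps):
    """Check the analogue of (36): every average of q over a window
    starting at t_star, of length s in {1,...,t - t_star}, is <= sigma + eps."""
    t_star, sigma, prefix = find_t_star(q, eps)
    bound = sigma + Fraction(eps)
    return all(
        prefix[t_star + s] - prefix[t_star] <= s * bound
        for s in range(1, len(q) - t_star + 1)
    )


q_example = [5, 5, 5, 0, 0, 0, 0, 0, 0, 0]
t_star_example, sigma_example, _ = find_t_star(q_example, Fraction(1, 2))
```

Here the prefix averages stay above $\sigma+\varepsilon=2$ up to $s=7$ and fall below it afterwards, so the sketch selects `t_star_example = 7`, and all window averages from that point on satisfy the bound.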

Bibliography


[1] D. Adelman and D. Klabjan, Duality and existence of optimal policies in generalized joint replenishment, Mathematics of Operations Research, 30(1) (2005), 28-50.
[2] R. Ash, “Measure, Integration and Functional Analysis”, Academic Press, 2014.
[3] J.-P. Aubin, “Viability Theory”, Birkhäuser, 1991.
[4] M. Bardi and I. Capuzzo-Dolcetta, “Optimal control and viscosity solutions of Hamilton-Jacobi-Bellman equations”, Systems and Control: Foundations and Applications, Birkhäuser, Boston, 1997.
[5] A.G. Bhatt and V.S. Borkar, Occupation measures for controlled Markov processes: characterization and optimality, Annals of Probability, 24 (1996), 1531-1562.
[6] C.J. Bishop, E.A. Feinberg and J. Zhang, Examples concerning Abel and Cesàro limits, Journal of Mathematical Analysis and Applications, 420 (2014), 1654-1661.
[7] J. Blot, A Pontryagin principle for infinite-horizon problems under constraints, Dynamics of Continuous, Discrete and Impulsive Systems Series B: Applications and Algorithms, 19 (2012), 267-275.
[8] V.S. Borkar, A convex analytic approach to Markov decision processes, Probability Theory and Related Fields, 78 (1988), 583-602.