LP Formulations of Discrete Time Long-Run Average Optimal Control Problems: The Non-Ergodic Case

Vivek S. Borkar; Vladimir Gaitsgory; Ilya Shvartsman

arXiv:1812.04790·math.OC·May 29, 2019·SIAM J. Control. Optim.

TL;DR

This paper develops an infinite dimensional LP framework, together with its dual, for deterministic discrete-time long-run average optimal control problems, focusing on the non-ergodic case in which the optimal value depends on the initial condition.

Contribution

It introduces a novel LP formulation and duality approach for non-ergodic long-run average control problems with initial condition dependence.

Findings

01. LP formulation characterizes the optimal value

02. Dual problem provides optimality conditions

03. Addresses non-ergodic cases with initial dependence

Abstract

We formulate and study the infinite dimensional linear programming (LP) problem associated with the deterministic discrete time long-run average criterion optimal control problem. Along with its dual, this LP problem allows one to characterize the optimal value of the optimal control problem. The novelty of our approach is that we focus on the general case wherein the optimal value may depend on the initial condition of the system.

Equations

$$ y(t+1)=f(y(t),u(t)),\quad t=0,1,\dots $$

$$ y(0)=y_{0}, $$

$$ y(t)\in Y, $$

$$ u(t)\in U(y(t)). $$

$$ u(t)\in A(y(t)), $$

$$ A(y):=\{u\in U(y)\mid f(y,u)\in Y\}\quad\forall y\in Y. $$

$$ G:={\rm graph}\,A=\{(y,u)\mid y\in Y,\ u\in U(y),\ f(y,u)\in Y\}, $$

$$ \frac{1}{T}\min_{u(\cdot)\in{\cal U}_{T}(y_{0})}\sum_{t=0}^{T-1}k(y(t),u(t))=:V_{T}(y_{0}), $$

$$ (1-\alpha)\min_{u(\cdot)\in{\cal U}(y_{0})}\sum_{t=0}^{\infty}\alpha^{t}k(y(t),u(t))=:h_{\alpha}(y_{0}), $$

$$ \gamma_{(y(\cdot),u(\cdot)),S}(Q)=\frac{1}{S}\sum_{t=0}^{S-1}1_{Q}(y(t),u(t)). $$

$$ \gamma^{\alpha}_{(y(\cdot),u(\cdot))}(Q)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}1_{Q}(y(t),u(t)), $$

$$ \int_{G}q(y,u)\gamma_{(y(\cdot),u(\cdot)),S}(dy,du)=\frac{1}{S}\sum_{t=0}^{S-1}q(y(t),u(t)) $$

$$ \int_{G}q(y,u)\gamma^{\alpha}_{(y(\cdot),u(\cdot))}(dy,du)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}q(y(t),u(t)) $$

$$ \Gamma_{T}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}_{T}(y_{0})}\{\gamma_{(y(\cdot),u(\cdot)),T}\},\qquad \Gamma_{T}:=\bigcup_{y_{0}\in Y}\{\Gamma_{T}(y_{0})\}, $$

$$ \Theta_{\alpha}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}(y_{0})}\{\gamma^{\alpha}_{(y(\cdot),u(\cdot))}\},\qquad \Theta_{\alpha}:=\bigcup_{y_{0}\in Y}\{\Theta_{\alpha}(y_{0})\}. $$

$$ \min_{\gamma\in\Gamma_{T}(y_{0})}\int_{G}k(y,u)\gamma(dy,du)=V_{T}(y_{0}) $$

$$ \min_{\gamma\in\Theta_{\alpha}(y_{0})}\int_{G}k(y,u)\gamma(dy,du)=h_{\alpha}(y_{0}), $$

$$ \rho(\gamma^{\prime},\gamma^{\prime\prime}):=\sum_{j=1}^{\infty}\frac{1}{2^{j}}\left|\int_{G}q_{j}(y,u)\gamma^{\prime}(dy,du)-\int_{G}q_{j}(y,u)\gamma^{\prime\prime}(dy,du)\right| $$

$$ \lim_{k\to\infty}\int_{G}q(y,u)\gamma^{k}(dy,du)=\int_{G}q(y,u)\gamma(dy,du) $$

$$ \rho(\gamma,\Gamma):=\inf_{\gamma^{\prime}\in\Gamma}\rho(\gamma,\gamma^{\prime}),\qquad \rho_{H}(\Gamma_{1},\Gamma_{2}):=\max\Big\{\sup_{\gamma\in\Gamma_{1}}\rho(\gamma,\Gamma_{2}),\ \sup_{\gamma\in\Gamma_{2}}\rho(\gamma,\Gamma_{1})\Big\}. $$

$$ W:=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}(\varphi(f(y,u))-\varphi(y))\gamma(dy,du)=0\ \ \text{for all}\ \varphi\in C(Y)\Big\}, $$

$$ W(\alpha,y_{0}):=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}\big(\alpha(\varphi(f(y,u))-\varphi(y))+(1-\alpha)(\varphi(y_{0})-\varphi(y))\big)\gamma(dy,du)=0\ \ \text{for all}\ \varphi\in C(Y)\Big\}. $$

$$ \lim_{T\to\infty}\rho_{H}(\bar{\rm co}\,\Gamma_{T},W)=\lim_{\alpha\uparrow 1}\rho_{H}(\bar{\rm co}\,\Theta_{\alpha},W)=0. $$

$$ \bar{\rm co}\,\Theta_{\alpha}(y_{0})=W(\alpha,y_{0})\quad\forall\alpha\in(0,1). $$

$$ \inf_{(\gamma,\xi)\in\Omega(y_{0})}\int_{G}k(y,u)\gamma(dy,du)=:k^{*}(y_{0}), $$

$$ \Omega(y_{0}):=\Big\{(\gamma,\xi)\in{\cal P}(G)\times\mathcal{M}_{+}(G)\ \Big|\ \gamma\in W,\ \int_{G}(\varphi(y_{0})-\varphi(y))\gamma(dy,du)+\int_{G}(\varphi(f(y,u))-\varphi(y))\xi(dy,du)=0\ \ \text{for all}\ \varphi\in C(Y)\Big\}, $$

$$ \sup_{(\mu,\psi,\eta)\in{\cal D}(y_{0})}\mu=:d^{*}(y_{0}), $$

$$ k(y,u)+(\psi(y_{0})-\psi(y))+\eta(f(y,u))-\eta(y)-\mu\geq 0, $$

$$ \psi(f(y,u))-\psi(y)\geq 0. $$

$$ d^{*}(y_{0})=\sup_{\psi,\eta}\min_{(y,u)\in G}\{k(y,u)+(\psi(y_{0})-\psi(y))+\eta(f(y,u))-\eta(y)\}, $$

$$ d^{*}(y_{0})\leq k^{*}(y_{0}) $$

Full text

Vivek S. Borkar (Department of Electrical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai 400076, India; the work of this author was supported by a J. C. Bose Fellowship from the Government of India), Vladimir Gaitsgory (Department of Mathematics and Statistics, Macquarie University, Sydney, NSW 2109, Australia; the work of this author was supported by the Australian Research Council Discovery Grants DP130104432) and Ilya Shvartsman (Department of Mathematics and Computer Science, Penn State Harrisburg, Middletown, PA 17057, USA)

1 Introduction and Preliminaries

In this paper, we formulate and study the infinite dimensional (ID) linear programming (LP) problem associated with the deterministic discrete time optimal control problem with long-run average cost, in which the optimal value may depend on the initial condition of the system. The paper continues the line of research started in [10], where similar issues were dealt with in the context of systems evolving in continuous time. Note that, although the ideas behind the treatment of the continuous and discrete time cases are similar, the results in the discrete time case are stronger and are obtained under weaker assumptions than their continuous time counterparts presented in [10] (we discuss the relationships between the two groups of results in detail in the conclusions section at the end of the paper). (Footnote 1: An updated and extended version of this paper has been published in SIAM Journal on Control and Optimization, Vol. 57, No. 3, pp. 1783-1817, DOI 10.1137/18M1229432.)

Because they allow one to use convex duality theory and linear programming based numerical techniques, LP formulations of various classes of optimal control problems have been studied extensively in the literature. For example, LP formulations of problems of optimal control of stochastic systems evolving in continuous time have been considered in [5, 8, 11, 16, 29, 37]. Various aspects of the LP approach to problems of optimization of discrete time stochastic systems (controlled Markov chains) have been discussed in [9, 25, 26, 27]. In the deterministic setting, the LP approach has been developed and applied in [21, 24, 30, 35, 38] for systems evolving in continuous time considered on a finite time interval. The applicability of the LP approach to deterministic continuous and discrete time systems considered on the infinite time horizon has been explored in [17, 18, 19, 20, 34]. (Footnote 2: Infinite time horizon optimal control problems have been traditionally studied with the help of other (not LP related) techniques; see, e.g., [7, 13, 14, 15, 22, 23, 39, 40] and references therein.) Note that the list of references mentioned above represents only a sample of the available literature and is not even close to being exhaustive.

Note that, while the form and the properties of the IDLP problem related to the ergodic case (that is, the case when the optimal value is independent of the initial conditions) have been well understood, the linear programming formulation of the long-run average optimal control problem in the non-ergodic case has not been discussed much in the literature. In fact, a justification of counterparts of LP formulations for reducible finite state Markov chains, as in, e.g., [26] and [27], presents a significant mathematical challenge. First steps to address this challenge have been made in [10], and (as mentioned above) the present paper is a continuation of this work.

Everywhere in what follows, we will be dealing with the discrete time controlled dynamical system

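$$ y(t+1)=f(y(t),u(t)),\quad t=0,1,\dots,\qquad y(0)=y_{0},\qquad y(t)\in Y,\qquad u(t)\in U(y(t)). \tag{1.1} $$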

Here $Y$ is a given nonempty compact subset of $I\!\!R^{m}$, $U(\cdot):\,Y\leadsto U_{0}$ is an upper semicontinuous compact-valued mapping to a given compact metric space $U_{0}$, and $f(\cdot,\cdot):\,I\!\!R^{m}\times U_{0}\to I\!\!R^{m}$ is a continuous function.

It can be observed that the last two constraints of (1.1) can be rewritten as one:

$$ u(t)\in A(y(t)), $$

where the map $\ A(\cdot):\,Y\leadsto U_{0}$ is defined by the equation

$$ A(y):=\{u\in U(y)\mid f(y,u)\in Y\}\quad\forall y\in Y. $$

The map $A(\cdot)$ is upper semicontinuous and its graph $G$ ,

$$ G:={\rm graph}\,A=\{(y,u)\mid y\in Y,\ u\in U(y),\ f(y,u)\in Y\}, $$

is a compact subset of $Y\times U_{0}$ .

A control $u(\cdot)$ and the pair $(y(\cdot),u(\cdot))$ will be called an admissible control and an admissible process, respectively, if the relationships (1.1) are satisfied. The set of admissible controls will be denoted ${\cal U}(y_{0})$ or ${\cal U}_{T}(y_{0})$ , depending on whether the problem is considered on the infinite time horizon or on a finite time sequence $t\in\{0,\dots,T-1\}$ .

Everywhere in the paper, it is assumed that

A1. The set $A(y)$ is not empty for any $y\in Y$.

This assumption implies that the sets ${\cal U}_{T}(y_{0})$ (with $T$ being an arbitrary positive integer) and the set ${\cal U}(y_{0})$ are not empty for any $y_{0}\in Y$. That is, there exists at least one admissible control for any initial condition (systems that satisfy such a property are called viable; see [4]).
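The objects introduced so far can be made concrete on a small finite system. The following Python sketch is only an illustration: the set `Y`, the control space `U0` and the dynamics `f` are assumptions chosen for the example and do not come from the paper. It builds the admissible control map $A(\cdot)$, checks Assumption A1, and generates a trajectory of (1.1) while rejecting non-admissible controls.

```python
# Hypothetical toy data, chosen only for illustration.
Y = frozenset({0, 1, 2})   # compact state constraint set
U0 = (-1, 0, 1)            # compact control space


def f(y, u):
    """Dynamics y(t+1) = f(y(t), u(t))."""
    return y + u


def U(y):
    """State-dependent control set U(y); here it is all of U0."""
    return U0


def A(y):
    """Admissible controls A(y) = {u in U(y) : f(y, u) in Y}."""
    return [u for u in U(y) if f(y, u) in Y]


def is_viable():
    """Assumption A1: A(y) is nonempty for every y in Y."""
    return all(len(A(y)) > 0 for y in Y)


def trajectory(y0, controls):
    """Return [y(0), ..., y(T)]; raise if some control is not admissible."""
    ys = [y0]
    for t, u in enumerate(controls):
        if u not in A(ys[-1]):
            raise ValueError(f"control {u} is not admissible at t={t}, y={ys[-1]}")
        ys.append(f(ys[-1], u))
    return ys
```

In this example the control $u=-1$ is excluded at $y=0$ and $u=1$ is excluded at $y=2$, since either would take the state out of $Y$; every state still has at least one admissible control, so the toy system is viable.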

On the trajectories of (1.1), we consider the following optimal control problems:

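$$ \frac{1}{T}\min_{u(\cdot)\in{\cal U}_{T}(y_{0})}\sum_{t=0}^{T-1}k(y(t),u(t))=:V_{T}(y_{0}), \tag{1.2} $$

$$ (1-\alpha)\min_{u(\cdot)\in{\cal U}(y_{0})}\sum_{t=0}^{\infty}\alpha^{t}k(y(t),u(t))=:h_{\alpha}(y_{0}), \tag{1.3} $$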

where $k:\,I\!\!R^{m}\times U_{0}\to I\!\!R$ is a continuous function and $\alpha\in(0,1)$ is a discount factor. Note that, under Assumption A1, the minima in (1.2) and (1.3) are achieved and the optimal value functions $V_{T}(\cdot)$, $h_{\alpha}(\cdot)$ are lower semicontinuous (see, e.g., Propositions 1-3 and Corollary 1 in [19]).
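To see how the optimal values can depend on the initial condition, the following Python sketch evaluates $V_{T}$ and $h_{\alpha}$ on a hypothetical three-state system. The tables `NEXT` (playing the role of $f$) and `COST` (playing the role of $k$) are assumptions made for the example: states 0 and 2 are absorbing with running costs 1 and 0, and from state 1 one can either move to 0 at no cost or pay 3 to move to 2. The finite horizon value is computed by backward recursion on the total cost, and the discounted value by fixed-point iteration of $h(y)=\min_{u}\{(1-\alpha)k(y,u)+\alpha h(f(y,u))\}$.

```python
from fractions import Fraction

# Hypothetical toy data, chosen only for illustration.
STATES = (0, 1, 2)
CONTROLS = (0, 1)
NEXT = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 2, (2, 0): 2, (2, 1): 2}
COST = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 3, (2, 0): 0, (2, 1): 0}


def finite_horizon_value(T):
    """V_T(y) = (1/T) * minimal total cost over T steps, computed exactly."""
    total = {y: Fraction(0) for y in STATES}
    for _ in range(T):
        total = {
            y: min(COST[y, u] + total[NEXT[y, u]] for u in CONTROLS)
            for y in STATES
        }
    return {y: total[y] / T for y in STATES}


def discounted_value(alpha, iterations=2000):
    """h_alpha(y), normalized by (1 - alpha), via fixed-point iteration."""
    h = {y: 0.0 for y in STATES}
    for _ in range(iterations):
        h = {
            y: min((1 - alpha) * COST[y, u] + alpha * h[NEXT[y, u]]
                   for u in CONTROLS)
            for y in STATES
        }
    return h
```

For this example $V_{T}(0)=1$ for every $T$, while $V_{T}(1)=\min\{(T-1)/T,\,3/T\}$ tends to 0, so the limit value differs between initial conditions. This is the non-ergodic situation the paper addresses.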

An extensive literature is devoted to matters related to the existence and equality of the limits $\lim_{T\rightarrow\infty}V_{T}(y_{0})$ and $\lim_{\alpha\uparrow 1}h_{\alpha}(y_{0})$. The ergodic case, when these limits are constants (that is, when they do not depend on the initial condition $y_{0}$), was studied, for example, in [3, 5, 7, 17] (see also references therein). Results for the non-ergodic case were obtained in [12, 22, 23, 28, 31, 32, 33]. In particular, the results of [12] were instrumental in obtaining the IDLP representation for the aforementioned limits for systems evolving in continuous time in [10]. Some ideas from [12] are used in this paper too.

The paper is organized as follows. In the remainder of this introductory section, we give some definitions and state some earlier results that are used further in the text. In Section 2, we introduce an IDLP problem and its dual, the optimal value of the latter giving a lower bound for $\liminf_{T\rightarrow\infty}V_{T}(y_{0})$ and $\liminf_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ (see Proposition 2.3). In Section 3, we establish (see Theorem 3.1) that $\limsup_{T\rightarrow\infty}V_{T}(y_{0})$ and $\limsup_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ are bounded from above by the optimal value of the IDLP problem introduced in Section 2, provided that the value functions $V_{T}(\cdot)$, $h_{\alpha}(\cdot)$ are continuous. The proof of Theorem 3.1 is based on a lemma that extends some results of [12] to the discrete time case (see Lemma 3.2). A direct corollary of these results is Proposition 4.1 of Section 4, which states that the limits $\lim_{T\rightarrow\infty}V_{T}(y_{0})$ and $\lim_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ exist and are equal to the optimal value of the IDLP problem if there is no duality gap. The main result of Section 4 is Theorem 4.2, which establishes that, if the pointwise limits $\lim_{T\rightarrow\infty}V_{T}(y_{0})$ and $\lim_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ exist and are continuous, then they are equal to the optimal value of the dual problem. Also in this section, we use the optimal solution of the dual IDLP problem to state sufficient and necessary optimality conditions for the long-run average optimal control problem (see Propositions 4.5 and 4.6); these optimality conditions are illustrated with an elementary “toy example”. In Section 5, we establish some auxiliary results used in the proofs of the previous sections, and in Section 6, we present some conclusions summarizing the results obtained and comparing them with the results of [10].

We conclude this section with the introduction of notations and results that are used in the sequel. Let $(y(\cdot),u(\cdot))$ be an admissible process. A probability measure $\gamma_{(y(\cdot),u(\cdot)),S}$ is called the occupational measure generated by the process $(y(\cdot),u(\cdot))$ over the time sequence $\{0,1,...,S-1\}$ if, for any Borel set $Q\subset G$ ,

$$ \gamma_{(y(\cdot),u(\cdot)),S}(Q)=\frac{1}{S}\sum_{t=0}^{S-1}1_{Q}(y(t),u(t)). $$

A probability measure $\gamma^{\alpha}_{(y(\cdot),u(\cdot))}$ is called the discounted occupational measure generated by the process $(y(\cdot),u(\cdot))$ if, for any Borel set $Q\subset G$ ,

$$ \gamma^{\alpha}_{(y(\cdot),u(\cdot))}(Q)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}1_{Q}(y(t),u(t)), $$

where $1_{Q}(\cdot)$ is the indicator function of $Q$ .

It can be shown that, if $\gamma_{(y(\cdot),u(\cdot)),S}$ is the occupational measure generated by the process $(y(\cdot),u(\cdot))$ over the time sequence $\{0,1,...,S-1\}$ , then

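$$ \int_{G}q(y,u)\gamma_{(y(\cdot),u(\cdot)),S}(dy,du)=\frac{1}{S}\sum_{t=0}^{S-1}q(y(t),u(t)) \tag{1.5} $$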

for any Borel measurable function $q$ on $G$ . Also, it can be shown that if $\gamma^{\alpha}_{(y(\cdot),u(\cdot))}$ is the discounted occupational measure generated by the process $(y(\cdot),u(\cdot))$ , then

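$$ \int_{G}q(y,u)\gamma^{\alpha}_{(y(\cdot),u(\cdot))}(dy,du)=(1-\alpha)\sum_{t=0}^{\infty}\alpha^{t}q(y(t),u(t)) \tag{1.6} $$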

for any Borel measurable function $q$ on $G$ .
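For a process that takes finitely many values, both kinds of occupational measure are finitely supported and can be computed exactly. The Python sketch below is an illustration under assumptions of our own: a process is represented as a list of pairs $(y(t),u(t))$, the discounted measure is computed for a process that follows a finite `prefix` and then repeats a single pair `tail_pair` forever, and the cost `k` is a hypothetical example. It lets one check the identities (1.5) and (1.6) by direct summation.

```python
from collections import defaultdict
from fractions import Fraction


def occupational_measure(pairs):
    """gamma_S: each visited pair (y(t), u(t)), t = 0..S-1, gets mass 1/S."""
    S = len(pairs)
    gamma = defaultdict(Fraction)
    for pair in pairs:
        gamma[pair] += Fraction(1, S)
    return dict(gamma)


def discounted_occupational_measure(prefix, tail_pair, alpha):
    """gamma^alpha for the process prefix[0], ..., prefix[N-1], tail_pair, tail_pair, ...

    The pair visited at time t gets mass (1 - alpha) * alpha**t; the constant
    tail from time N on carries the remaining mass alpha**N.
    """
    gamma = defaultdict(Fraction)
    for t, pair in enumerate(prefix):
        gamma[pair] += (1 - alpha) * alpha ** t
    gamma[tail_pair] += alpha ** len(prefix)
    return dict(gamma)


def integrate(q, gamma):
    """Integral of q(y, u) with respect to a finitely supported measure."""
    return sum(q(y, u) * mass for (y, u), mass in gamma.items())


def k(y, u):
    """Hypothetical running cost used only in this example."""
    return 3 if (y, u) == (1, 1) else (1 if y == 0 else 0)
```

For instance, for the four-step process `[(1, 1), (2, 0), (2, 0), (2, 0)]` the measure $\gamma_{S}$ puts mass $1/4$ on $(1,1)$ and $3/4$ on $(2,0)$, and `integrate(k, ...)` coincides with the time average of `k` along the process, as (1.5) states.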

Let us introduce the following notations for the sets of occupational measures:

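$$ \Gamma_{T}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}_{T}(y_{0})}\{\gamma_{(y(\cdot),u(\cdot)),T}\},\qquad \Gamma_{T}:=\bigcup_{y_{0}\in Y}\{\Gamma_{T}(y_{0})\}, \tag{1.7} $$

$$ \Theta_{\alpha}(y_{0}):=\bigcup_{u(\cdot)\in{\cal U}(y_{0})}\{\gamma^{\alpha}_{(y(\cdot),u(\cdot))}\},\qquad \Theta_{\alpha}:=\bigcup_{y_{0}\in Y}\{\Theta_{\alpha}(y_{0})\}. \tag{1.8} $$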

Note that, due to (1.5) and (1.6), problems (1.2) and (1.3) can be rewritten in the form

$$ \min_{\gamma\in\Gamma_{T}(y_{0})}\int_{G}k(y,u)\gamma(dy,du)=V_{T}(y_{0}) \tag{1.9} $$

and

$$ \min_{\gamma\in\Theta_{\alpha}(y_{0})}\int_{G}k(y,u)\gamma(dy,du)=h_{\alpha}(y_{0}), \tag{1.10} $$

respectively.

To describe convergence properties of occupational measures, we introduce the following metric on ${\cal P}(G)$ (the space of probability measures defined on Borel subsets of $G$ ):

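$$ \rho(\gamma^{\prime},\gamma^{\prime\prime}):=\sum_{j=1}^{\infty}\frac{1}{2^{j}}\left|\int_{G}q_{j}(y,u)\gamma^{\prime}(dy,du)-\int_{G}q_{j}(y,u)\gamma^{\prime\prime}(dy,du)\right| $$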

for $\gamma^{\prime},\gamma^{\prime\prime}\in{\cal P}(G)$ , where $q_{j}(\cdot),\,j=1,2,\dots,$ is a sequence of Lipschitz continuous functions dense in the unit ball of the space of continuous functions $C(G)$ from $G$ to $I\!\!R$ . This metric is consistent with the weak∗ convergence topology on ${\cal P}(G)$ , that is, a sequence $\gamma^{k}\in{\cal P}(G)$ converges to $\gamma\in{\cal P}(G)$ in this metric if and only if

$$ \lim_{k\to\infty}\int_{G}q(y,u)\gamma^{k}(dy,du)=\int_{G}q(y,u)\gamma(dy,du) $$

for any $q\in C(G)$ . Using the metric $\rho$ , we can define the “distance” $\rho(\gamma,\Gamma)$ between $\gamma\in{\cal P}(G)$ and $\Gamma\subset{\cal P}(G)$ and the Hausdorff metric $\rho_{H}(\Gamma_{1},\Gamma_{2})$ between $\Gamma_{1}\subset{\cal P}(G)$ and $\Gamma_{2}\subset{\cal P}(G)$ as follows:

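$$ \rho(\gamma,\Gamma):=\inf_{\gamma^{\prime}\in\Gamma}\rho(\gamma,\gamma^{\prime}),\qquad \rho_{H}(\Gamma_{1},\Gamma_{2}):=\max\Big\{\sup_{\gamma\in\Gamma_{1}}\rho(\gamma,\Gamma_{2}),\ \sup_{\gamma\in\Gamma_{2}}\rho(\gamma,\Gamma_{1})\Big\}. $$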

Note that, although, by some abuse of terminology, we refer to $\rho_{H}(\cdot,\cdot)$ as a metric on the set of subsets of ${\mathcal{P}}(G)$ , it is, in fact, a semi metric on this set (since $\rho_{H}(\Gamma_{1},\Gamma_{2})=0$ implies $\Gamma_{1}=\Gamma_{2}$ if $\Gamma_{1}$ and $\Gamma_{2}$ are closed, but the equality may not be true if at least one of these sets is not closed).
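On a finite set $G$ the metric $\rho$ and the Hausdorff semimetric $\rho_{H}$ can be evaluated directly. The Python sketch below is a simplification made for illustration: the dense sequence $q_{j}$ is replaced by a hypothetical finite family `TEST_FUNCTIONS` (the indicators of the points of a six-point set `G`, which already separate measures on a finite set), and the infimum and suprema are taken over finite lists of measures, so they become `min` and `max`.

```python
from fractions import Fraction

# Hypothetical finite graph G and test functions, chosen only for illustration.
G = [(y, u) for y in (0, 1, 2) for u in (0, 1)]
TEST_FUNCTIONS = [lambda y, u, g=g: 1 if (y, u) == g else 0 for g in G]


def integrate(q, gamma):
    """Integral of q(y, u) with respect to a finitely supported measure."""
    return sum(q(y, u) * mass for (y, u), mass in gamma.items())


def rho(gamma1, gamma2):
    """Weighted sum of |int q_j d gamma1 - int q_j d gamma2| with weights 2**(-j)."""
    return sum(
        Fraction(1, 2 ** j) * abs(integrate(q, gamma1) - integrate(q, gamma2))
        for j, q in enumerate(TEST_FUNCTIONS, start=1)
    )


def dist_to_set(gamma, measures):
    """rho(gamma, Gamma) for a finite list Gamma of measures."""
    return min(rho(gamma, other) for other in measures)


def hausdorff(measures1, measures2):
    """rho_H(Gamma_1, Gamma_2) for two finite lists of measures."""
    return max(
        max(dist_to_set(g, measures2) for g in measures1),
        max(dist_to_set(g, measures1) for g in measures2),
    )
```

With this truncated family the value of $\rho$ depends on the chosen ordering of the test functions, but the induced notion of convergence on a finite $G$ does not.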

Let us define the sets $W$ and $W(\alpha,y_{0})$ by the equations:

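$$ W:=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}(\varphi(f(y,u))-\varphi(y))\gamma(dy,du)=0\ \ \text{for all}\ \varphi\in C(Y)\Big\}, $$

$$ W(\alpha,y_{0}):=\Big\{\gamma\in{\cal P}(G)\ \Big|\ \int_{G}\big(\alpha(\varphi(f(y,u))-\varphi(y))+(1-\alpha)(\varphi(y_{0})-\varphi(y))\big)\gamma(dy,du)=0\ \ \text{for all}\ \varphi\in C(Y)\Big\}. $$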

Note that the sets $W$ and $W(\alpha,y_{0})$ are convex and compact in the topology specified above. The following equalities establish relationships between these sets and the occupational measures sets introduced earlier (see Theorem 5.4 in [19]):

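$$ \lim_{T\to\infty}\rho_{H}(\bar{\rm co}\,\Gamma_{T},W)=\lim_{\alpha\uparrow 1}\rho_{H}(\bar{\rm co}\,\Theta_{\alpha},W)=0. \tag{1.11} $$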

Also (see Corollary 2 in [19]),

$$ \bar{\rm co}\,\Theta_{\alpha}(y_{0})=W(\alpha,y_{0})\quad\forall\alpha\in(0,1). \tag{1.12} $$

Here and in what follows, $\bar{\rm co}$ stands for the closed convex hull of the corresponding set.
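When $Y$ is finite, membership in $W$ and $W(\alpha,y_{0})$ can be tested with the indicator functions of single states in place of all $\varphi\in C(Y)$. The Python sketch below does this for a hypothetical three-state system (the table `NEXT`, standing for $f$, is an assumption made for the example). It returns, for each state $z$, the value of $\int_{G}(\varphi(f(y,u))-\varphi(y))\gamma(dy,du)$ and of $\int_{G}\big(\alpha(\varphi(f(y,u))-\varphi(y))+(1-\alpha)(\varphi(y_{0})-\varphi(y))\big)\gamma(dy,du)$ with $\varphi=1_{\{z\}}$; the measure belongs to the corresponding set exactly when all these values vanish.

```python
from fractions import Fraction

# Hypothetical toy dynamics, chosen only for illustration.
STATES = (0, 1, 2)
NEXT = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 2, (2, 0): 2, (2, 1): 2}


def w_defect(gamma):
    """For each state z: integral of (phi(f(y,u)) - phi(y)) d gamma, phi = 1_{z}."""
    return {
        z: sum(m * ((NEXT[y, u] == z) - (y == z)) for (y, u), m in gamma.items())
        for z in STATES
    }


def w_alpha_defect(gamma, alpha, y0):
    """Same test for W(alpha, y0), again with phi = indicator of z."""
    return {
        z: sum(
            m * (alpha * ((NEXT[y, u] == z) - (y == z))
                 + (1 - alpha) * ((y0 == z) - (y == z)))
            for (y, u), m in gamma.items()
        )
        for z in STATES
    }
```

As a usage example, the occupational measure of the four-step process $(1,1),(2,0),(2,0),(2,0)$ has defect $\pm 1/4$ at the states 1 and 2, in line with the identity $\int_{G}(\varphi(f(y,u))-\varphi(y))\gamma_{S}(dy,du)=(\varphi(y(S))-\varphi(y(0)))/S$, which vanishes only in the limit $S\to\infty$. The discounted occupational measure of the same process continued at $(2,0)$ forever, with $\alpha=1/2$ and $y_{0}=1$, has zero defect for $W(\alpha,y_{0})$.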

2 Estimates of the Limit Optimal Value Functions from Below

Consider the IDLP problem

$$ \inf_{(\gamma,\xi)\in\Omega(y_{0})}\int_{G}k(y,u)\gamma(dy,du)=:k^{*}(y_{0}), \tag{2.1} $$

where

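$$ \Omega(y_{0}):=\Big\{(\gamma,\xi)\in{\cal P}(G)\times\mathcal{M}_{+}(G)\ \Big|\ \gamma\in W,\ \int_{G}(\varphi(y_{0})-\varphi(y))\gamma(dy,du)+\int_{G}(\varphi(f(y,u))-\varphi(y))\xi(dy,du)=0\ \ \text{for all}\ \varphi\in C(Y)\Big\}, \tag{2.2} $$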

with $\mathcal{M}_{+}(G)$ standing for the space of nonnegative measures defined on Borel subsets of $G$ . Also consider the problem

$$ \sup_{(\mu,\psi,\eta)\in{\cal D}(y_{0})}\mu=:d^{*}(y_{0}), \tag{2.3} $$

where ${\cal D}(y_{0})$ is the set of triplets $(\mu,\psi(\cdot),\eta(\cdot))\in I\!\!R\times C(Y)\times C(Y)$ that for all $(y,u)\in G$ satisfy the inequalities

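$$ \begin{aligned} &k(y,u)+(\psi(y_{0})-\psi(y))+\eta(f(y,u))-\eta(y)-\mu\geq 0,\\ &\psi(f(y,u))-\psi(y)\geq 0. \end{aligned} \tag{2.4} $$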

Note that the optimal value of problem (2.3) can be equivalently represented as

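$$ d^{*}(y_{0})=\sup_{\psi,\eta}\min_{(y,u)\in G}\{k(y,u)+(\psi(y_{0})-\psi(y))+\eta(f(y,u))-\eta(y)\}, \tag{2.5} $$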

where $\psi$ and $\eta$ are continuous functions, and $\psi$ satisfies the second inequality in (2.4). The optimal values of (2.3) and (2.1) are related by the inequality

$$ d^{*}(y_{0})\leq k^{*}(y_{0}) \tag{2.6} $$

(see Lemma 5.3 in Section 5.2). Problem (2.3) is, in fact, dual with respect to (2.1), with (2.6) being a part of the duality relationships (see more details in Section 5.2).
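The primal and dual constraints can be checked by hand on a finite example, which also shows why the function $\psi$ is needed in the non-ergodic case. The Python sketch below is an illustration only: the three-state system (`NEXT` for $f$, `COST` for $k$) is hypothetical, the constraint $\gamma\in W$ and the linking constraint of $\Omega(y_{0})$, namely $\int_{G}(\varphi(y_{0})-\varphi(y))\gamma(dy,du)+\int_{G}(\varphi(f(y,u))-\varphi(y))\xi(dy,du)=0$, are tested with indicator functions $\varphi=1_{\{z\}}$, and the dual constraints are $k(y,u)+(\psi(y_{0})-\psi(y))+\eta(f(y,u))-\eta(y)-\mu\geq 0$ and $\psi(f(y,u))-\psi(y)\geq 0$ on $G$.

```python
from fractions import Fraction

# Hypothetical toy data, chosen only for illustration.
STATES = (0, 1, 2)
CONTROLS = (0, 1)
NEXT = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 2, (2, 0): 2, (2, 1): 2}
COST = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 3, (2, 0): 0, (2, 1): 0}
G = [(y, u) for y in STATES for u in CONTROLS]


def integrate(q, measure):
    """Integral of q(y, u) with respect to a finitely supported measure."""
    return sum(q(y, u) * m for (y, u), m in measure.items())


def in_Omega(gamma, xi, y0):
    """Primal feasibility of (gamma, xi), tested with indicators of states."""
    if sum(gamma.values()) != 1 or any(m < 0 for m in gamma.values()):
        return False
    if any(m < 0 for m in xi.values()):
        return False
    for z in STATES:
        phi = lambda y, z=z: 1 if y == z else 0
        in_W = integrate(lambda y, u: phi(NEXT[y, u]) - phi(y), gamma)
        link = (integrate(lambda y, u: phi(y0) - phi(y), gamma)
                + integrate(lambda y, u: phi(NEXT[y, u]) - phi(y), xi))
        if in_W != 0 or link != 0:
            return False
    return True


def in_D(mu, psi, eta, y0):
    """Dual feasibility of (mu, psi, eta): both inequalities on all of G."""
    return all(
        COST[y, u] + (psi[y0] - psi[y]) + eta[NEXT[y, u]] - eta[y] - mu >= 0
        and psi[NEXT[y, u]] - psi[y] >= 0
        for (y, u) in G
    )


def primal_cost(gamma):
    """Objective of the primal problem: integral of k with respect to gamma."""
    return integrate(lambda y, u: COST[y, u], gamma)
```

For $y_{0}=0$ the pair $\gamma=\delta_{(0,0)}$, $\xi=0$ is primal feasible with cost 1, and the triplet $\mu=1$, $\psi=(1,0,0)$, $\eta=0$ is dual feasible, so by the weak duality inequality both are optimal in this example and the common value is 1. With $\psi=0$ and $\eta=0$ the value $\mu=1$ is not dual feasible, because the constraint fails at the zero-cost state 2, which cannot be reached from $y_{0}=0$.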

As can be readily seen, problem (2.1) can be equivalently written as

[TABLE]

where

[TABLE]

Along with (2.7), consider the problem

[TABLE]

where

[TABLE]

It is easy to see that both sets $W_{1}(y_{0})$ and $W_{2}(y_{0})$ are convex, set $W_{2}(y_{0})$ is closed (and, therefore, compact), and

[TABLE]

Lemma 2.1

The following inclusions are true:

[TABLE]

This implies, in particular, that the set $W_{2}(y_{0})$ is not empty.

Proof. Note first that since the sets $\Gamma_{T}(y_{0})$ and $\Theta_{\alpha}(y_{0})$ are not empty for all admissible $T$ and $\alpha$, so are the sets $\displaystyle\limsup_{T\to\infty}\Gamma_{T}(y_{0})$ and $\displaystyle\limsup_{\alpha\uparrow 1}\Theta_{\alpha}(y_{0})$. Note also that from (1.11) it follows that

[TABLE]

Let $\displaystyle\gamma\in\limsup_{T\to\infty}\Gamma_{T}(y_{0})$ . Then there exist sequences $T_{i}\to\infty$ and $\gamma_{i}\in\Gamma_{T_{i}}(y_{0})$ such that $\gamma_{i}\to\gamma$ as $i\to\infty$ . Let $u_{i}(\cdot)\in{\cal U}_{T_{i}}(y_{0})$ be the control generating $\gamma_{i}$ and $y_{i}(\cdot)$ be the corresponding trajectory. For any $\varphi\in C(Y)$ we have

[TABLE]

Define the functional $\zeta_{i}\in C^{*}(G)$ (here and in what follows, $C^{*}(G)$ stands for the space of continuous linear functionals on $C(G)$ ) by the equation

[TABLE]

By the Riesz representation theorem (see, e.g., Theorem 4.3.9, p. 181 in [6]), there exists $\xi_{i}\in{\cal M_{+}}(G)$ such that

[TABLE]

Then (2.11) can be written as

[TABLE]

Passing to the limit, we obtain

[TABLE]

Since $\gamma\in W$ (due to (2.10)), the latter equality implies that $\gamma\in W_{2}(y_{0}).$ Thus, the first inclusion in (2.9) is proved.

Let us prove the second inclusion. By (1.12), to prove the second inclusion in (2.9), it is sufficient to prove that

[TABLE]

Note that from (1.11) and (1.12) it follows that

[TABLE]

Take $\gamma\in\limsup_{\alpha\uparrow 1}W(\alpha,y_{0})$ . There exist sequences $\alpha_{i}\uparrow 1$ and $\gamma_{i}\in W(\alpha_{i},y_{0})$ such that $\gamma_{i}\to\gamma$ as $i\to\infty$ . Since $\gamma_{i}\in W(\alpha_{i},y_{0})$ , we have

[TABLE]

where $\xi_{i}=\gamma_{i}/(1-\alpha_{i})$ . Passing to the limit as $i\to\infty$ we obtain

[TABLE]

Since $\gamma\in W$ , the second inclusion in (2.9) is proved. $\Box$

The next lemma establishes a relation between the optimal values in problems (2.3) and (2.8).

Lemma 2.2

The optimal values of problems (2.3) and (2.8) are equal, that is,

[TABLE]

Proof. The proof of the lemma is given in Section 5.2. $\Box$

Proposition 2.3

The lower limits of the optimal value functions in problems (1.2) and (1.3) are bounded from below by the optimal value of (2.3), that is,

[TABLE]

Proof. This proposition follows from Lemmas 2.1 and 2.2, and from the fact that the equalities

[TABLE]

are valid. $\Box$

Let $\mathcal{T}$ be a positive integer and let $(y_{\mathcal{T}}(\cdot),u_{\mathcal{T}}(\cdot))$ be a $\mathcal{T}$ -periodic admissible process. This process will be referred to as finite time (FT) reachable from $y_{0}$ if there exist an integer $\bar{t}\geq 0$ and a control $u(\cdot)\in{\cal U}_{\bar{t}}(y_{0})$ such that the solution $y(t)=y(t,y_{0},u)$ of (1.1) obtained with this control satisfies the equality $y(\bar{t})=y_{\mathcal{T}}(0)$ .
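For a finite system, finite time reachability reduces to a graph search. The Python sketch below is an illustration on a hypothetical three-state system (the tables `NEXT` and `COST` are assumptions made for the example): it computes the set of states reachable from $y_{0}$ by breadth-first search, checks that a list of pairs $(y_{\mathcal{T}}(t),u_{\mathcal{T}}(t))$, $t=0,\dots,\mathcal{T}-1$, closes into a periodic process, and evaluates its average cost over one period.

```python
from collections import deque
from fractions import Fraction

# Hypothetical toy data, chosen only for illustration.
CONTROLS = (0, 1)
NEXT = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 2, (2, 0): 2, (2, 1): 2}
COST = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 3, (2, 0): 0, (2, 1): 0}


def reachable_states(y0):
    """States y(t) attainable from y0 in finitely many steps (t = 0 included)."""
    seen = {y0}
    queue = deque([y0])
    while queue:
        y = queue.popleft()
        for u in CONTROLS:
            z = NEXT[y, u]
            if z not in seen:
                seen.add(z)
                queue.append(z)
    return seen


def is_periodic_process(pairs):
    """True if following the listed controls returns to pairs[0][0] after len(pairs) steps."""
    T = len(pairs)
    return all(NEXT[pairs[t]] == pairs[(t + 1) % T][0] for t in range(T))


def is_ft_reachable(pairs, y0):
    """Periodic process whose initial state y_T(0) can be reached from y0."""
    return is_periodic_process(pairs) and pairs[0][0] in reachable_states(y0)


def periodic_average_cost(pairs):
    """Average of the running cost over one period."""
    return Fraction(sum(COST[p] for p in pairs), len(pairs))
```

In this example the zero-cost stationary process at state 2 is FT reachable from $y_{0}=1$ but not from $y_{0}=0$, so the best periodic regime available depends on the initial condition.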

Consider the optimal control problem

[TABLE]

where the infimum is taken over all integers $\mathcal{T}>0$ and over all $\mathcal{T}$-periodic pairs $(y_{\mathcal{T}}(\cdot),u_{\mathcal{T}}(\cdot))$ that are FT reachable from $y_{0}$. Similarly to (1.9), this problem can be reformulated in terms of occupational measures

[TABLE]

where $\Gamma_{per}(y_{0})$ is the set of occupational measures generated by all admissible periodic pairs that are FT reachable from $y_{0}$. Note that

[TABLE]

and, therefore,

[TABLE]

Proposition 2.4

The following relationships are valid:

[TABLE]

Proof. Due to (2.7) and (2.15), it is sufficient to prove only the first relationship. Note that from (2.10) and (2.16) it follows that

[TABLE]

Take now an arbitrary $\gamma\in\Gamma_{per}(y_{0})$ . By definition, it means that $\gamma$ is generated by a $\mathcal{T}$ -periodic pair $(y_{\mathcal{T}}(\cdot),u_{\mathcal{T}}(\cdot))$ that is FT reachable from $y_{0}$ . That is, for any continuous function $q(y,u)$ ,

[TABLE]

Consequently, for any $\phi\in C(Y)$ ,

[TABLE]

where $y(t)=y(t,y_{0},u)$ is a solution of (1.1) that satisfies the equality $y(\bar{t})=y_{\mathcal{T}}(0)$ (the existence of $\bar{t}\geq 0$ and the existence of a control $u(\cdot)\in{\cal U}_{\bar{t}}(y_{0})$ that ensure the validity of this equality follows from the fact that $(y_{\mathcal{T}}(\cdot),u_{\mathcal{T}}(\cdot))$ is FT reachable from $y_{0}$ ). Since $y_{\mathcal{T}}(s+1)=f(y_{\mathcal{T}}(s),u_{\mathcal{T}}(s))$ and $y(s+1)=f(y(s),u(s))$ , from (2.20) it follows that

[TABLE]

Define $\zeta\in C^{*}(G)$ by the equation

[TABLE]

By the Riesz representation theorem, there exists $\xi\in{\cal M_{+}}(G)$ such that

[TABLE]

Therefore, (2.21) can be rewritten as

[TABLE]

Since $\gamma\in W$ (by (2.19)), the latter implies that $\gamma\in W_{1}(y_{0})$ . Thus, the first relationship in (2.18) is established. $\Box$

Corollary 2.5

If

[TABLE]

then

[TABLE]

3 Estimates of the Limit Optimal Value Functions from Above

Theorem 3.1

(a) Let $V_{T}(\cdot)$ be continuous on $Y$ for all natural $T$. Then

[TABLE]

(b) Let $h_{\alpha}(\cdot)$ be continuous on $Y$ for all $\alpha\in(0,1)$. Then

[TABLE]

The proof of the theorem is based on the following lemma.

Lemma 3.2

For any natural $T$ ,

[TABLE]

Also, for any $\alpha\in(0,1)$ ,

[TABLE]

The proof of the lemma is given at the end of the section.

Proof of Theorem 3.1.

Proof of (a). Let us fix an arbitrary natural $T$ and let us consider the following IDLP problem

[TABLE]

where $Q(T)$ is the set of pairs $(\psi(\cdot),\eta(\cdot))\in C(Y)\times C(Y)$ that satisfy the inequalities

[TABLE]

with

[TABLE]

Let us show that, for an arbitrarily small $\varepsilon>0$, there exists a function $\eta_{T,\varepsilon}(\cdot)\in C(Y)$ such that

[TABLE]

Note that, if the inclusion above is established, it would imply that

[TABLE]

Let us first verify that there exists $\eta_{T,\varepsilon}(\cdot)\in C(Y)$ such that the pair $(\psi_{T,\varepsilon}(\cdot),\eta_{T,\varepsilon}(\cdot))$ satisfies the first inequality in (3.6). To this end, note that the inequality (3.3) is equivalent to the inequality

[TABLE]

which, in turn, is equivalent to

[TABLE]

The problem on the left hand side of (3.10), i.e.,

[TABLE]

is an IDLP problem, its dual being

[TABLE]

The optimal values of (3.11) and (3.12) are equal (see Proposition 6 in [19]). Therefore, (3.10) is equivalent to

[TABLE]

From (3.13) it follows that, for any $\varepsilon>0$ , there exists a function $\eta_{T,\varepsilon}(\cdot)\in C(Y)$ such that

[TABLE]

The latter implies that the pair $(\psi_{T,\varepsilon}(\cdot),\eta_{T,\varepsilon}(\cdot))$, where $\psi_{T,\varepsilon}(\cdot):=V_{T}(\cdot)-\varepsilon$, satisfies the first inequality in (3.6).

Let us now verify that the function $\psi_{T,\varepsilon}(\cdot)=V_{T}(\cdot)-\varepsilon$ satisfies the second inequality in (3.6). From the dynamic programming principle applied to problem (1.2), it follows that, for any $T\geq 1$ ,

[TABLE]

Also, as can be readily seen,

[TABLE]

By (3.15) and (3.16),

[TABLE]

Consequently,

[TABLE]

Thus, $\psi_{T,\varepsilon}(\cdot)=V_{T}(\cdot)-\varepsilon$ satisfies the second inequality in (3.6). Hence, (3.8) is valid and, consequently, (3.9) is valid too.
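The use of the dynamic programming principle in this step can be checked numerically on a small example. The Python sketch below is an illustration under stated assumptions, not a reproduction of (3.6): it takes a hypothetical three-state system (`NEXT` and `COST` are invented for the example), computes $V_{T}$ from the recursion $TV_{T}(y)=\min_{u\in A(y)}\{k(y,u)+(T-1)V_{T-1}(f(y,u))\}$, and tests the relaxed monotonicity bound $V_{T}(f(y,u))-V_{T}(y)\geq-2M/T$ with $M=\max_{(y,u)\in G}|k(y,u)|$. That the second inequality in (3.6) has this form is our assumption; the bound itself follows from the recursion together with $|TV_{T}(y)-(T-1)V_{T-1}(y)|\leq M$.

```python
from fractions import Fraction

# Hypothetical toy data, chosen only for illustration.
STATES = (0, 1, 2)
CONTROLS = (0, 1)
NEXT = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 2, (2, 0): 2, (2, 1): 2}
COST = {(0, 0): 1, (0, 1): 1, (1, 0): 0, (1, 1): 3, (2, 0): 0, (2, 1): 0}
M = max(abs(c) for c in COST.values())


def total_cost(T):
    """Minimal total cost over T steps, i.e. T * V_T(y), by backward recursion."""
    total = {y: 0 for y in STATES}
    for _ in range(T):
        total = {
            y: min(COST[y, u] + total[NEXT[y, u]] for u in CONTROLS)
            for y in STATES
        }
    return total


def V(T):
    """Finite horizon average value V_T as exact fractions."""
    total = total_cost(T)
    return {y: Fraction(total[y], T) for y in STATES}


def satisfies_relaxed_monotonicity(T):
    """Check V_T(f(y,u)) - V_T(y) >= -2M/T for all admissible pairs (y, u)."""
    v = V(T)
    return all(
        v[NEXT[y, u]] - v[y] >= Fraction(-2 * M, T)
        for y in STATES for u in CONTROLS
    )
```

In this example the exact inequality $V_{T}(f(y,u))-V_{T}(y)\geq 0$ fails for $T=4$ at the pair $(1,1)$, where the difference is $-3/4$, while the relaxed bound holds for every $T$; the slack $2M/T$ disappears as $T\to\infty$.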

By Lemma 5.3 of Section 5,

[TABLE]

where

[TABLE]

(Note that, to adjust the notations used above and those used in Lemma 5.3, one should write $d^{*}(T,y_{0})$ and $k^{*}(T,y_{0})$ as $d^{*}(\theta_{T},y_{0})$ and $k^{*}(\theta_{T},y_{0})$ , where $\theta_{T}=\frac{2M}{T}$ .)

From (3.9) and (3.17) it follows that $\ V_{T}(y_{0})-\varepsilon\leq k^{*}(T,y_{0}),$ which implies that

[TABLE]

since $\varepsilon>0$ is arbitrarily small. Due to (3.19), to prove (3.1), it is sufficient to establish that

[TABLE]

One can readily see that $k^{*}(T,y_{0})$ is a decreasing function of $T$ and that $\ k^{*}(T,y_{0})\geq k^{*}(y_{0})$ for any $T\geq 1$ . Hence,

[TABLE]

Let us now show that the opposite inequality is also valid. Let $\delta>0$ be arbitrarily small and let $(\gamma^{\prime},\xi^{\prime})\in\Omega(y_{0})$ be $\delta$-optimal for (2.1). That is,

[TABLE]

Then

[TABLE]

($\delta>0$ can be arbitrarily small). Thus (3.20) is established and statement (a) is proved.

Proof of (b). The proof of (b) is very similar to that of (a). We fix an arbitrary $\alpha\in(0,1)$ and consider the IDLP problem

[TABLE]

where $Q(\alpha)$ is the set of pairs $(\psi(\cdot),\eta(\cdot))\in C(Y)\times C(Y)$ that satisfy the inequalities

[TABLE]

We then show that, for an arbitrarily small $\varepsilon>0$, there exists a function $\eta_{\alpha,\varepsilon}(\cdot)\in C(Y)$ such that

[TABLE]

with the inclusion above implying that

[TABLE]

To verify (3.23), we first show that there exists $\eta_{\alpha,\varepsilon}(\cdot)\in C(Y)$ such that the pair $(\psi_{\alpha,\varepsilon}(\cdot),\eta_{\alpha,\varepsilon}(\cdot))$ satisfies the first inequality in (3.22). As in the proof of (a), we rewrite the inequality (3.4) in the form

[TABLE]

which is equivalent to

[TABLE]

The problem on the left hand side of (3.25), i.e.,

[TABLE]

is an IDLP problem, the dual of which is

[TABLE]

The optimal values of (3.26) and (3.27) are equal (Proposition 6 in [19]). Therefore, (3.25) is equivalent to

[TABLE]

From (3.28) it follows that, for any $\varepsilon>0$ , there exists a function $\eta_{\alpha,\varepsilon}(\cdot)\in C(Y)$ such that

[TABLE]

The latter implies that the pair $(\psi_{\alpha,\varepsilon}(\cdot),\eta_{\alpha,\varepsilon}(\cdot))$ , where $\psi_{\alpha,\varepsilon}(\cdot):=h_{\alpha}(\cdot)-\varepsilon$ , satisfies the first inequality in (3.22).

To verify that the function $\psi_{\alpha,\varepsilon}(\cdot)=h_{\alpha}(\cdot)-\varepsilon$ satisfies the second inequality in (3.22), note that from the dynamic programming principle applied to problem (1.3), it follows that

[TABLE]

(see, e.g., Proposition 4 in [19]). The latter implies that

[TABLE]

which, in turn, implies that

[TABLE]

(since, as can be readily seen, $\ \max_{y\in Y}|h_{\alpha}(y)|\leq M$ ). Thus, $\psi_{\alpha,\varepsilon}(\cdot)=h_{\alpha}(\cdot)-\varepsilon$ satisfies the second inequality in (3.22), and, therefore, (3.24) is valid too. Starting from this point, the proof of (b) follows exactly the same steps as that of (a). $\Box$

Proof of Lemma 3.2. Let us prove (3.3). To this end, let us show first that, for any natural $T$ and $T^{\prime}$ ,

[TABLE]

where $M$ is as in (3.7). Take $y_{0}\in Y$, $\gamma^{\prime}\in\Gamma_{T^{\prime}}(y_{0})$, and let $u(\cdot)\in{\cal U}_{T^{\prime}}(y_{0})$ be a control that generates $\gamma^{\prime}$ on $\{0,\dots,T^{\prime}-1\}$. Extend $u$ from the time sequence $\{0,\dots,T^{\prime}-1\}$ to the time sequence $\{0,\dots,T^{\prime}+T-1\}$ so that $u\in{\cal U}_{T^{\prime}+T}(y_{0})$. Such an extension is possible due to the viability of $Y$. Let $y(\cdot)$ be the corresponding trajectory. Taking into account that $\displaystyle V_{T}(y(s))\leq{1\over T}\sum_{r=0}^{T-1}k(y(r+s),u(r+s))$ for all $s\in\{0,\dots,T^{\prime}-1\}$, we obtain

[TABLE]

Thus the inequality (3.32) is established. From this inequality it follows that

[TABLE]

where $\Gamma_{T^{\prime}}$ is the union of $\Gamma_{T^{\prime}}(y_{0})$ over $y_{0}\in Y$ (see (1.7)). Take an arbitrary $\gamma\in W$ . From (1.11) it follows that there exist sequences $T^{\prime}_{l}>0,\ \gamma^{\prime}_{l}\in\Gamma_{T^{\prime}_{l}}$ , $\ l=1,2,...,$ such that $T^{\prime}_{l}\rightarrow\infty$ and $\gamma^{\prime}_{l}\rightarrow\gamma$ . Passing to the limit along these sequences in (3.33) and having in mind that

[TABLE]

(since $V_{T}(\cdot)$ is lower semicontinuous for any $T>0$ ; see, e.g., Theorem 3.1.5 in [36]), one arrives at inequality (3.3).

Let us now prove (3.4). To this end, let us show first that, for any $\alpha\in(0,1)$ and any $\alpha^{\prime}\in(\alpha,1)$ ,

[TABLE]

Take $y_{0}\in Y$ , $\gamma^{\prime}\in\Theta_{\alpha^{\prime}}(y_{0})$ , and let $u(\cdot)\in{\cal U}(y_{0})$ be a control that generates $\gamma^{\prime}$ . Let also $y(\cdot)$ be the trajectory corresponding to $u(\cdot)$ . We have

[TABLE]

From (3.34) it follows that

[TABLE]

where $\Theta_{\alpha^{\prime}}$ is the union of $\Theta_{\alpha^{\prime}}(y_{0})$ over $y_{0}\in Y$ (see (1.8)). Take an arbitrary $\gamma\in W$. From (1.11) it follows that there exist sequences $\alpha^{\prime}_{l}\in(0,1),\ \gamma^{\prime}_{l}\in\Theta_{\alpha^{\prime}_{l}}$, $l=1,2,...,$ such that $\alpha^{\prime}_{l}\uparrow 1$ and $\gamma^{\prime}_{l}\rightarrow\gamma$. Passing to the limit along these sequences in (3.35) and keeping in mind that

[TABLE]

(since $h_{\alpha}(\cdot)$ is lower semicontinuous for any $\alpha\in(0,1)$ ; see also Theorem 3.1.5 in [36]), one arrives at inequality (3.4). $\Box$

4 LP Representation for the Optimal Value and Related Sufficient/Necessary Optimality Conditions

The following statement is a direct corollary of Theorem 3.1 and Proposition 2.3.

Proposition 4.1

If

[TABLE]

then, provided that $V_{T}(\cdot)$ is continuous for any $T>1$ , there exists the pointwise limit

[TABLE]

Also, provided that $h_{\alpha}(\cdot)$ is continuous for any $\alpha\in(0,1)$ , there exists the pointwise limit

[TABLE]

Note that a statement about the LP representation of the pointwise limits (4.2) and (4.3) can be established without the strong duality assumption (4.1). Namely, the following result is valid.

Theorem 4.2

(a) Let the pointwise limit

[TABLE]

exist and let the function $V(\cdot)$ be continuous. Then

[TABLE]

(b) Let the pointwise limit

[TABLE]

exist and the function $h(\cdot)$ be continuous. Then

[TABLE]

Proof. The proof of the theorem is given at the end of this section. $\Box$

Remark 4.3

If (4.4) and (4.5) are valid, then the strong duality equality (4.1) is true provided that condition (2.22) of Corollary 2.5 is satisfied.

In the rest of this section, we assume that the pointwise limit $\lim_{T\rightarrow\infty}V_{T}(\cdot)=V(\cdot)$ exists and is continuous and, therefore, that $V(y_{0})$ is equal to the optimal value $d^{*}(y_{0})$ of the dual problem (2.3) (by Theorem 4.2). That is, (4.4) and (4.5) are valid.

Consider the optimal control problem

[TABLE]

Note that, due to (4.4), the optimal value of (4.8) is equal to $V(y_{0})$ (see Proposition 5.4 in Section 5). Below, we discuss sufficient and necessary optimality conditions for problem (4.8) stated in terms of an optimal solution of problem (2.3).

DEFINITION. A pair $(\bar{\psi}(\cdot),\bar{\eta}(\cdot))\in C(Y)\times C(Y)$ will be called an optimal solution of (2.3) if it satisfies the inequalities (compare with (2.4))

[TABLE]

Proposition 4.4

(a) A pair $(\bar{\psi}(\cdot),\bar{\eta}(\cdot))$ is an optimal solution of (2.3) if and only if $\bar{\psi}(\cdot)$ satisfies the second inequality in (4.9) and

[TABLE]

(b) If $\bar{\eta}(\cdot)$ is such that

[TABLE]

then the pair $(\bar{\psi}(\cdot),\bar{\eta}(\cdot))$ , where $\bar{\psi}(\cdot)=V(\cdot)$ , is an optimal solution of problem (2.3).

Proof. By (2.5), the first inequality in (4.9) is equivalent to the equality

[TABLE]

Also, (4.12) is equivalent to (4.10) (due to (4.5)). Thus (a) is proved.

If $\bar{\eta}(\cdot)$ is such that (4.11) is satisfied, then the pair $(\bar{\psi}(\cdot),\bar{\eta}(\cdot))$ , where $\bar{\psi}(\cdot)=V(\cdot)$ , satisfies (4.10). Therefore, by (a), this pair is an optimal solution of (2.3). This proves (b). $\Box$

Proposition 4.5

Let an optimal solution $(\bar{\psi}(\cdot),\bar{\eta}(\cdot))$ of (2.3) exist. Then, for an admissible process $(y(\cdot),u(\cdot))$ to be optimal in (4.8) it is sufficient that the equalities

[TABLE]

are satisfied for all $t=0,1,\dots$.

Proof. From (4.13) and (4.14) it follows that

[TABLE]

for all $t=0,1,\dots$. Therefore, for any $T\geq 1$,

[TABLE]

Taking into account that

[TABLE]

we obtain

[TABLE]

That is, the process $(y(\cdot),u(\cdot))$ is optimal in (4.8). $\Box$

We will now establish that the fulfillment of (4.13)-(4.14) is also a necessary condition of optimality of an admissible process $(y(\cdot),u(\cdot))$ provided that the latter is periodic, that is, there exists a positive integer $T_{0}$ such that, for any $t=0,1,...$ ,

[TABLE]

Proposition 4.6

Let an optimal solution $(\bar{\psi}(\cdot),\bar{\eta}(\cdot))$ of (2.3) exist. Then, for an admissible process $(y(\cdot),u(\cdot))$ satisfying the periodicity conditions (4.16) to be optimal in (4.8), it is necessary that the equalities (4.13)-(4.14) are satisfied for all $t=0,1,...$ .

Proof. Note that the fact that the periodic admissible process is optimal in (4.8) means that

[TABLE]

Note also that from Proposition 4.4 it follows that, for any $t=0,1,...,T_{0}-1$ ,

[TABLE]

From (4.17) and (4.18) it follows that

[TABLE]

which implies that

[TABLE]

due to the fact that

[TABLE]

(by (4.16)). The inequalities (4.19) and (4.20) establish the validity of (4.14). In view of (4.14), the inequality (4.18) is equivalent to

[TABLE]

for all $t=0,1,...,T_{0}-1$ . If the above inequality was strict for at least one $t$ , then one would obtain

[TABLE]

which, by (4.21), would lead to

[TABLE]

The latter contradicts (4.17). Hence, (4.22) is satisfied as an equality for all $t=0,1,\dots,T_{0}-1$. This proves (4.13). $\Box$
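
The proof above uses the fact that, for a periodic process, the long-run average of the running cost coincides with its average over one period. A brief justification may be sketched as follows, under the assumption that (4.16) means that both $y(\cdot)$ and $u(\cdot)$ repeat with period $T_{0}$. The shorthand $k_{t}:=k(y(t),u(t))$ and the constant $K:=\max_{0\leq t<T_{0}}|k_{t}|$ are notation introduced here for illustration only. Write $T=nT_{0}+r$ with $0\leq r<T_{0}$. Then, by periodicity,

$$\frac{1}{T}\sum_{t=0}^{T-1}k_{t}=\frac{n\sum_{t=0}^{T_{0}-1}k_{t}+\sum_{t=0}^{r-1}k_{t}}{nT_{0}+r},\qquad\left|\sum_{t=0}^{r-1}k_{t}\right|\leq T_{0}K,$$

and, since $n\to\infty$ as $T\to\infty$,

$$\lim_{T\to\infty}\frac{1}{T}\sum_{t=0}^{T-1}k_{t}=\frac{1}{T_{0}}\sum_{t=0}^{T_{0}-1}k_{t}.$$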

Remark 4.7

As established by Proposition 4.5, an admissible process $(y(\cdot),u(\cdot))$ is optimal if it satisfies the equalities (4.13), (4.14). Assuming that these are valid, one may conclude (due to (4.10)) that the equality (4.13) is equivalent to

[TABLE]

which leads to

[TABLE]

The latter implies that the feedback control

[TABLE]

is optimal in the sense that, being used in (1.1), it allows one to obtain the optimal “open loop” admissible process $(y(\cdot),u(\cdot))$.

Let us illustrate the optimality conditions discussed above with the following “toy example”.

Example. Let the dynamics be one-dimensional and be described by the equation (compare with (1.1))

[TABLE]

with $\ Y=[-1,1]$ and with $U(y)=\{-1,1\}$ (that is, the control can be either equal to $1$ or to $-1$ ). Consider problem (1.2) with $\ k(y,u)=y$ . As can be readily understood, the optimal admissible processes in this example are as follows. If $y_{0}\in(0,1]$ , then

[TABLE]

If $y_{0}\in[-1,0)$ , then

[TABLE]

Also, if $y_{0}=0$ , then the system is uncontrollable, and the only admissible trajectory is $\ y(t)=0\ \forall\ t\geq 0$ . The admissible processes described above are optimal on any time horizon (both finite and infinite), with the optimal value function being defined by the equation

[TABLE]

Thus, $V(y)=-|y|$ . Note that condition (2.22) of Corollary 2.5 is satisfied and, therefore, the strong duality equality (4.1) is valid in the given example (see Remark 4.3).

Define the function $\bar{\eta}(\cdot)$ by the equation

[TABLE]

One can readily verify that

[TABLE]

the latter implying that

[TABLE]

That is, $\bar{\eta}(\cdot)$ satisfies (4.11). Therefore, the pair $(\bar{\psi}(\cdot),\bar{\eta}(\cdot))$, where $\bar{\psi}(y)=-|y|$, is an optimal solution of (2.3). The $\arg\min$ feedback control defined in (4.23) takes in this case the form

[TABLE]

This feedback control is optimal and it is consistent with the optimal open loop solution shown above.
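
The finite-horizon side of this example can be checked numerically by brute force. The sketch below is only an illustration: the dynamics $y(t+1)=u(t)\,y(t)$ used in `step` is an assumption made here (it is consistent with $Y=[-1,1]$, $U(y)=\{-1,1\}$ and with the system being uncontrollable at $y_{0}=0$, but the displayed equation is not reproduced in this text), and the function names `step` and `finite_horizon_value` are hypothetical. The running cost $k(y,u)=y$ is taken from the example.

```python
from itertools import product


def step(y, u):
    # ASSUMED dynamics (hypothetical, not taken from the text): y(t+1) = u(t) * y(t).
    return u * y


def finite_horizon_value(y0, T):
    """Brute-force V_T(y0): minimum over all control sequences in {-1, 1}^T
    of the average running cost (1/T) * sum_{t<T} k(y(t), u(t)), with k(y, u) = y."""
    best = None
    for controls in product((-1, 1), repeat=T):
        y, total = y0, 0.0
        for u in controls:
            total += y          # running cost k(y, u) = y at time t
            y = step(y, u)      # move to y(t+1) under the assumed dynamics
        average = total / T
        if best is None or average < best:
            best = average
    return best


if __name__ == "__main__":
    for y0 in (0.5, -0.5, 0.0):
        values = [finite_horizon_value(y0, T) for T in (1, 2, 4, 8)]
        print(f"y0 = {y0:+.1f}: V_T for T = 1, 2, 4, 8 ->", values)
```

Under the assumed dynamics, the enumerated values approach $-|y_{0}|$ as $T$ grows, which agrees with $V(y)=-|y|$. The enumeration is exponential in $T$, so it is meant only for small horizons.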

Remark 4.8

If (4.13), (4.14) are valid, then the relationships (4.15) are valid, the latter implying that

[TABLE]

This provides an interpretation of $\bar{\eta}(\cdot)$ as a function that defines the difference between the running cost $\ \displaystyle{1\over T}\sum_{t=0}^{T-1}k(y(t),u(t))$ and the optimal value $V(y_{0})$ along the optimal trajectory. Note that, if

[TABLE]

that is, the process $(y(\cdot),u(\cdot))$ is optimal on any finite time horizon as well, then (4.26) can be rewritten as follows

[TABLE]

That was the case in the example considered above, in which the optimal trajectory $y(\cdot)$ satisfies the equalities $\ y(T)=-y_{0}\ \forall y_{0}\in(0,1]$ and $\ y(T)=y_{0}\ \forall y_{0}\in[-1,0]$ for all $T\geq 1$. This leads to $\ \bar{\eta}(y(T))=0$ (see (4.25)) and, consequently, to

[TABLE]

Thus, the relationships in (4.24) are consistent with (4.27).

Proof of Theorem 4.2. If the pointwise limit (4.4) exists, then, by Proposition 2.3, the limit function $V(\cdot)$ satisfies the inequality

[TABLE]

Therefore, to prove the statement (a), one needs to show that

[TABLE]

Similarly, if the pointwise limit (4.6) exists, then, by Proposition 2.3, the limit function $h(\cdot)$ satisfies the inequality

[TABLE]

Therefore, to prove the statement (b), one needs to show that

[TABLE]

Proof of (4.28). Firstly, note that, by dividing (3.15) by $T$ and passing to the limit as $T\to\infty$ , one obtains

[TABLE]

Also, by passing to the limit as $T\to\infty$ in (3.3), one obtains

[TABLE]

Inequality (4.31) can be rewritten in the form

[TABLE]

which is equivalent to

[TABLE]

The problem in the left hand side of the above inequality,

[TABLE]

is an IDLP problem, whose dual is

[TABLE]

Through equality of the optimal values of (4.33) and (4.34) (see Proposition 6 in [19]), we conclude that (4.32) is equivalent to

[TABLE]

From (4.35) it follows that, for any $\varepsilon>0$ , there exists a function $\eta_{\varepsilon}(\cdot)\in C(Y)$ such that

[TABLE]

Consider the problem

[TABLE]

where $Q$ is the set of pairs $(\psi,\eta)\in C(Y)\times C(Y)$ that satisfy inequalities

[TABLE]

Note that the optimal value of problem (4.37) is the same as that of (2.3) (see (5.15) in the proof of Lemma 5.3 taken with $\theta=0$ ). Due to (4.30) and (4.36), the pair $(\psi_{\varepsilon}(\cdot),\eta_{\varepsilon}(\cdot))$ , where $\psi_{\varepsilon}(\cdot):=V(\cdot)-\varepsilon$ , satisfies the inequalities (4.38). Consequently,

[TABLE]

This proves (4.28) since $\varepsilon>0$ is arbitrarily small.

Proof of (4.29). By passing to the limit as $\alpha\uparrow 1$ in (3.30), we conclude that $h(\cdot)$ satisfies the inequality

[TABLE]

Also, by passing to the limit as $\alpha\uparrow 1$ in (3.4) we establish that

[TABLE]

Proceeding from this point in exactly the same way as above, one establishes the validity of (4.29). $\Box$

5 Appendix

5.1 Another representation for the limit optimal values

Let ${\cal K}$ be the set of continuous functions that satisfy the following relationships:

[TABLE]

and

[TABLE]

In these notations, the relationships (4.30), (4.31) and (4.39), (4.40) are equivalent to the inclusions

[TABLE]

and

[TABLE]

respectively.

Proposition 5.1

(a) Let the pointwise limit (4.4) exist and the function $V(\cdot)$ be continuous. Then

[TABLE]

(b) Let the pointwise limit (4.6) exist and the function $h(\cdot)$ be continuous. Then

[TABLE]

Proof. Note that, due to (5.3) and (5.4),

[TABLE]

Therefore, to prove the proposition, it is sufficient to establish that the inequalities opposite to (5.7) are valid. For a natural $T$ , let $u_{T}(\cdot)$ be an optimal control in (1.2), $\gamma_{T}\in\Gamma_{T}(y_{0})$ be the occupational measure generated by this control, and $y_{T}(\cdot)$ be the corresponding trajectory. Then

[TABLE]

Let $\gamma_{T}(dy,du)$ converge to $\gamma$ in the weak$^{*}$ topology as $T\to\infty$ along a subsequence (we do not relabel). Note that $\gamma\in W$ (due to (1.11)). From the equality above, by passing to the limit as $T\to\infty$, we obtain

[TABLE]

For $w\in{\cal K}$ , taking into account the monotonicity property (5.1), we have

[TABLE]

Since $w$ is continuous, we can pass to the limit as $T\to\infty$ and obtain

[TABLE]

Combining this with (5.2) and (5.8) we obtain

[TABLE]

The latter implies that the inequality opposite to the first inequality in (5.7) is valid. This proves part (a) of the proposition.

The proof of the inequality opposite to the second inequality in (5.7) is similar. For $\alpha\in(0,1),$ let $u_{\alpha}(\cdot)$ be an optimal control in (1.3), $\gamma_{\alpha}\in\Theta_{\alpha}(y_{0})$ be the occupational measure generated by this control, and $y_{\alpha}(\cdot)$ be the corresponding trajectory. Then

[TABLE]

Let $\gamma_{\alpha}(dy,du)$ converge to $\gamma$ in the weak$^{*}$ topology as $\alpha\to 1$ along a subsequence (we do not relabel). Note that $\gamma\in W$ (due to (1.11)). From the equality above, by passing to the limit as $\alpha\to 1$, we obtain

[TABLE]

Combining this with (5.2) and (5.9) we obtain

[TABLE]

The latter implies that the inequality opposite to the second inequality in (5.7) is valid, and, thus, proves part (b) of the proposition. $\Box$

Remark 5.2

It can be verified directly that the optimal value of the problem in the right hand side of (5.5) and (5.6) is equal to $d^{*}(y_{0})$ (the optimal value of the dual problem (2.3)). Results establishing the validity of representations similar to (5.5) and (5.6) in the continuous time setting were obtained in [12].

5.2 Results referred to in Sections 3 and 4

Consider a perturbed version of the IDLP problem (2.1)

[TABLE]

and the corresponding perturbed version of the dual problem (2.3)

[TABLE]

where ${\cal D}(\theta,y_{0})$ is the set of triplets $(\mu,\psi(\cdot),\eta(\cdot))\in I\!\!R\times C(Y)\times C(Y)$ that satisfy the inequalities

[TABLE]

Note that $\theta\geq 0$ is a perturbation parameter and note that (5.10) and (5.11) become (2.1) and (2.3) with $\theta=0$ . Consider also the problem

[TABLE]

where $Q(\theta)$ is the set of pairs $(\psi(\cdot),\eta(\cdot))\in C(Y)\times C(Y)$ that satisfy the inequalities

[TABLE]

Lemma 5.3

The following relationships are valid:

[TABLE]

Proof. Let us prove, first, that

[TABLE]

In fact, the inequality $\bar{d}^{*}(\theta,y_{0})\leq d^{*}(\theta,y_{0})$ is true (since, for any pair $\ (\psi(\cdot),\eta(\cdot))\in Q(\theta)$, the triplet $\ (\mu,\psi(\cdot),\eta(\cdot))$ with $\mu=\psi(y_{0})$ belongs to ${\cal D}(\theta,y_{0})$). Let us prove the opposite inequality. Let a triplet $\ (\mu^{\prime},\psi^{\prime}(\cdot),\eta^{\prime}(\cdot))\in{\cal D}(\theta,y_{0})$ be such that $\mu^{\prime}\geq d^{*}(\theta,y_{0})-\delta$, with $\delta>0$ being arbitrarily small. Then the pair $\ (\tilde{\psi}^{\prime}(\cdot),\eta^{\prime}(\cdot))$, where $\tilde{\psi}^{\prime}(y)=\psi^{\prime}(y)-\psi^{\prime}(y_{0})+\mu^{\prime}$, belongs to $Q(\theta)$. Since $\tilde{\psi}^{\prime}(y_{0})=\mu^{\prime}$, this leads to the inequality $\bar{d}^{*}(\theta,y_{0})\geq d^{*}(\theta,y_{0})-\delta$ and, consequently, to the inequality $\bar{d}^{*}(\theta,y_{0})\geq d^{*}(\theta,y_{0})$ since $\delta>0$ is arbitrarily small. Thus, (5.16) is proved.

Let us now prove the inequality

[TABLE]

Take any $(\gamma,\xi)\in\Omega(y_{0})$ and $(\mu,\psi,\eta)\in{\cal D}(\theta,y_{0})$ . Integrating the first inequality in (5.12) with respect to $\gamma$ and taking into account that $\gamma\in W$ we conclude that

[TABLE]

Taking into account that $(\gamma,\xi)\in\Omega(y_{0})$ and the second inequality in (5.12), we obtain

[TABLE]

Therefore,

[TABLE]

This proves (5.17). $\Box$

Let $C^{*}(Y)$ stand for the space of continuous linear functionals on $C(Y)$ and let $\mathcal{M}(G)$ stand for the space of measures defined on Borel subsets of $G$ . Define a linear operator $\mathcal{A}(\cdot):\mathcal{M}(G)\times\mathcal{M}(G)\mapsto I\!\!R^{1}\times C^{*}(Y)\times C^{*}(Y)$ as follows: for any $(\gamma,\xi)\in\mathcal{M}(G)\times\mathcal{M}(G)$ ,

[TABLE]

where $a_{(\gamma,\xi)},\ b_{\gamma}\in C^{*}(Y)$ are defined by the equation: $\ \forall\ \phi(\cdot)\in C(Y)$ ,

[TABLE]

In this notation, the set $\Omega(y_{0})$ defined in (2.2) can be rewritten as follows

[TABLE]

where ${\bf 0}$ stands for the zero element of $C^{*}(Y)$ . Also, problem (2.1) takes the form

[TABLE]

where $\langle\cdot,\gamma\rangle$ (also, $\langle\cdot,\xi\rangle$ in the sequel) denotes the integral of the corresponding function over $\gamma$ (respectively, over $\xi$). Note that, for any $(\mu,\psi(\cdot),\eta(\cdot))\in I\!\!R^{1}\times C(Y)\times C(Y)$,

[TABLE]

Define now the linear operator

$\mathcal{A}^{*}(\cdot):I\!\!R^{1}\times C(Y)\times C(Y)\mapsto C(G)\times C(G)\subset\mathcal{M}^{*}(G)\times\mathcal{M}^{*}(G)$ in such a way that, for any $(\mu,\psi(\cdot),\eta(\cdot))\in I\!\!R^{1}\times C(Y)\times C(Y)$ ,

[TABLE]

Thus,

[TABLE]

That is, the operator $\mathcal{A}^{*}(\cdot)$ is the adjoint of $\mathcal{A}(\cdot)$ . The problem dual to (5.19) is of the form (see [1] and [2])

[TABLE]

the latter being equivalent to (2.3).

Proof of Lemma 2.2. Let

[TABLE]

and let $\bar{H}$ stand for the closure of $H$ in the weak$^{*}$ topology of $I\!\!R^{1}\times C^{*}(Y)\times C^{*}(Y)\times I\!\!R^{1}$. Consider the problem

[TABLE]

Its optimal value $k_{sub}^{*}(y_{0})$ is called the subvalue of the IDLP problem (5.19). Let us show that the optimal value of (2.8) is equal to the subvalue. In fact, as can be readily seen, $\left(1,{\bf 0},{\bf 0},\int_{G}k(y,u)\gamma(dy,du)\right)\in\bar{H}$ if $\gamma\in W_{2}(y_{0})$ . Consequently,

[TABLE]

From the fact that $k_{sub}^{*}(y_{0})$ is defined as the optimal value in (5.20) it follows that there exists a sequence $(\gamma_{l},\xi_{l})\in\mathcal{M}_{+}(G)\times\mathcal{M}_{+}(G)$ such that $\mathcal{A}(\gamma_{l},\xi_{l})$ converges (in the weak$^{*}$ topology) to $(1,{\bf 0},{\bf 0})$, with $\int_{G}k(y,u)\gamma_{l}(dy,du)$ converging to $k_{sub}^{*}(y_{0})$ as $l$ tends to infinity. That is (see (5.18)),

[TABLE]

Without loss of generality, one may assume that $\gamma_{l}$ converges in the weak$^{*}$ topology to a measure $\gamma$ that satisfies the relationships

[TABLE]

Also, $\ a_{(\gamma,\xi_{l})}\rightarrow{\bf 0}$ and $\int_{G}k(y,u)\gamma(dy,du)=k_{sub}^{*}(y_{0})$ . That is, $\gamma\in W_{2}(y_{0})$ and therefore,

[TABLE]

Thus, the optimal value of (2.8) is equal to the subvalue. To complete the proof, it is sufficient to note that the subvalue of an IDLP problem is equal to the optimal value of its dual provided that the former is bounded (see, e.g., Theorem 3 in [1]). That is, $k_{sub}^{*}(y_{0})=d^{*}(y_{0})$ . $\Box$

Let us conclude this section by proving the following proposition.

Proposition 5.4

The optimal value of the problem in the left hand side of (4.8) is equal to $\liminf_{T\to\infty}V_{T}(y_{0})$ . That is,

[TABLE]

Proof. Let $u(\cdot)\in{\cal U}(y_{0})$ and let $y(\cdot)$ be the corresponding trajectory. Then

[TABLE]

Therefore,

[TABLE]

and, hence,

[TABLE]

Let us prove the opposite inequality. For any $\varepsilon>0$ and $u(\cdot)\in{\cal U}(y_{0})$ , and for sufficiently large $T$ ,

[TABLE]

where $y(\cdot)=y(t,y_{0},u)$ . Therefore,

[TABLE]

and, consequently,

[TABLE]

Hence,

[TABLE]

The proposition is proved. $\Box$

6 Conclusions

We have introduced the IDLP problem, the optimal value of which gives an upper bound for $\ \limsup_{T\rightarrow\infty}V_{T}(y_{0})$ and $\ \limsup_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ , with the optimal value of the corresponding dual problem providing a lower bound for $\ \liminf_{T\rightarrow\infty}V_{T}(y_{0})$ and $\ \liminf_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ . While the result establishing the validity of the lower bound (Proposition 2.3) is very similar to the corresponding result in [10], the statement about the validity of the upper bound (Theorem 3.1) is much stronger than its continuous time counterpart in [10], where it was assumed that the uniform limits $\ \lim_{T\rightarrow\infty}V_{T}(y_{0})$ and $\ \lim_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ exist and are Lipschitz continuous. Note also that, in contrast to the result of [10], we did not assume that the set $Y$ is invariant (only that it is viable). We believe that establishing the validity of the upper bound for systems evolving in continuous time under assumptions similar to those of Theorem 3.1 is possible, and it can be a subject for future research.

We have also established that, if the pointwise limits $\ \lim_{T\rightarrow\infty}V_{T}(y_{0})$ and $\ \lim_{\alpha\uparrow 1}h_{\alpha}(y_{0})$ exist and are continuous, then they are equal to the optimal value of the dual problem (Theorem 4.2). A similar statement in the continuous time setting can be established using a similar argument if the limits of the optimal value functions exist and are continuously differentiable. This assumption is, however, too strong, and finding less restrictive conditions, under which a statement similar to Theorem 4.2 for systems in continuous time is valid, can also be a subject for future research.

Finally, we have stated sufficient and necessary optimality conditions for the long-run average optimal control problem using an optimal solution of the dual problem (Propositions 4.5 and 4.6). Similar results can be readily obtained in the continuous time case too.

Acknowledgment. We would like to express our gratitude to D. Khlopin and to M. Quincampoix for useful discussions and for sharing with us some insightful examples.

Bibliography

[1] E.J. Anderson, A Review of Duality Theory for Linear Programming over Topological Vector Spaces, J. of Math. Analysis and App., 97:2 (1983), pp. 380-392.
[2] E.J. Anderson and P. Nash, Linear Programming in Infinite-Dimensional Spaces, Wiley, Chichester, 1987.
[3] M. Arisawa and P.-L. Lions, On Ergodic Stochastic Control, Commun. in Partial Differential Equations, 23:11 (1998), pp. 2187-2217.
[4] J.-P. Aubin, Viability Theory, Birkhauser, Basel, 1991.
[5] A. Arapostathis, V.S. Borkar and M.K. Ghosh, Ergodic Control of Diffusion Processes, Cambridge Uni. Press, Cambridge, UK, 2012.
[6] R. Ash, Measure, Integration and Functional Analysis, Academic Press, 1972.
[7] M. Bardi and I. Capuzzo-Dolcetta, Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations, Birkhauser, Boston, 1997.
[8] A.G. Bhatt and V.S. Borkar, Occupation measures for controlled Markov processes: characterization and optimality, The Annals of Probability, 24:3 (1996), pp. 1531-1562.