This paper introduces the concept of a local infimum in optimal control, extending existing definitions, and establishes existence and necessary conditions that strengthen classical maximum principles.
Contribution
It defines a new concept of local infimum in optimal control and derives strengthened necessary conditions akin to maximum principles.
Findings
01 Existence theorem for a local infimum
02 Necessary conditions resembling maximum principles
03 Examples demonstrating the strength of the new conditions
Abstract
The concept of a local infimum for an optimal control problem is introduced. This definition extends that of an optimal process. For a local infimum we prove an existence theorem and derive necessary conditions that resemble some family of "maximum principles". Examples are given to demonstrate the meaningfulness of the necessary conditions obtained in the present paper, which extend and strengthen the classical results in this field.
Taxonomy
Topics: Optimization and Variational Analysis · Aerospace Engineering and Control Systems · Spacecraft Dynamics and Control
Full text
Local infimum in optimal control
E. R. Avakov, G. G. Magaril-Il’yaev
Introduction
By the Pontryagin maximum principle for an optimal control problem, one means, as is well known
(see [1]), necessary conditions for optimality of a process — this being a pair
consisting of an optimal control and the corresponding optimal (phase)
trajectory. In the present paper, we introduce the concept of a local infimum, which extends
that of an optimal trajectory. For a local infimum, necessary conditions
which resemble some family of ‘‘maximum principles’’ are proved. If a local infimum is, in particular,
an optimal trajectory, then this family contains the classical Pontryagin maximum principle,
as well as some other relations, which in general are capable of providing additional information
and thereby, as is shown by examples, strengthen the Pontryagin maximum principle.
If a local infimum is not an optimal trajectory, then these necessary conditions provide a tool for
finding trajectories ‘‘suspicious’’ for a local
infimum. The use of this machinery is to a large extent the same as that of the Pontryagin maximum principle for
finding processes which are suspicious for optimality.
In the present paper, we employ the idea of ‘‘convexification’’ of the original control problem. This
idea can be found in the book of Gamkrelidze [2]. We also use this idea, but in
a less general setting, which, however, is quite sufficient for our purposes.
We also put forward an existence theorem for a local infimum in an
optimal control problem. This result does not involve any convexity-type assumptions.
So, the class of problems in which there exists a local infimum turns out to be considerably wider than the
class of problems in which one can guarantee the existence of an optimal process, because the existence
of the latter depends on fairly inconvenient assumptions on the convexity of some family of
sets associated with the problem.
The paper consists of three sections. In the first section we formulate and prove the main
results. In the second section we give examples illustrating the results obtained.
In the third section (the Appendix) we prove a generalized
implicit function theorem and derive four lemmas: the inverse function lemma, the lemma on the equation in variations,
and two approximation lemmas.
All these results, which in our opinion are of independent interest, are chief ingredients in the
proofs of the main results of the paper.
The authors are deeply grateful to Revaz Valer’yanovich Gamkrelidze for
his kind attention and useful discussions.
Main results and proofs
Let U be a nonempty subset of Rr. Assume that we are given
a mapping φ:R×Rn×Rr→Rn of variables
t, x and u, a function f0:Rn×Rn→R, and mappings
f:Rn×Rn→Rm1 and g:Rn×Rn→Rm2 of variables ζi∈Rn, i=1,2.
Consider the following optimal control problem
[TABLE]
In what follows it will be assumed that the mapping φ is continuous together with its
derivative with respect to x on R×Rn×Rr and the mappings
f0, f and g are continuously differentiable on Rn×Rn.
In connection with the definition of a local infimum,
it will be convenient to rephrase the standard definitions of an admissible process and
an optimal process with emphasis on the concept of a trajectory,
with the control playing a secondary role.
**Definition**
A trajectory x(⋅)∈AC([t0,t1],Rn) is called admissible for the control system
(2), (3) if there exists a control u(⋅)∈L∞([t0,t1],Rr) satisfying conditions (2), (3).
The set of admissible trajectories for the control system (2), (3) (in what follows,
for brevity, we shall drop the word ‘‘control’’) will be always considered as
a subset of C([t0,t1],Rn).
**Definition**
An admissible trajectory x(⋅) is called an optimal trajectory for problem
(1)–(3) if it delivers a local minimum to the functional f0 on
the set of admissible trajectories.
The following definition is the main definition in the present paper.
**Definition**
A function x(⋅)∈C([t0,t1],Rn) is called a local infimum for problem
(1)–(3) if it delivers a local minimum for the functional f0 on the closure of the set
of admissible trajectories.
If the minimum is global, then we speak of a global infimum.
It is clear that the value of f0 on a global infimum coincides with the infimum of f0 over
all admissible trajectories.
It is easily seen that if x(⋅) is an optimal trajectory for problem (1)–(3),
then x(⋅) is a local infimum for this problem. On the other hand, if a function x(⋅)
is a local infimum for problem (1)–(3) and is admissible, then x(⋅) is an
optimal trajectory for this problem.
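The gap between the two notions can be seen in a standard chattering example. It is an illustration added here, not taken from the paper, and is written on the assumption that it fits the format of problem (1)–(3) with φ(t,x,u)=(u,x1²) and f0 equal to the second component of the right endpoint. In it the zero function is a global infimum but not an optimal trajectory.

```latex
% Illustrative example (an assumption of this note, not from the paper).
\dot x_1 = u,\qquad \dot x_2 = x_1^2,\qquad u(t)\in U=\{-1,1\},\qquad
x_1(0)=x_2(0)=0,\qquad x_2(1)\to\min .

% Every admissible trajectory has |\dot x_1| = 1 a.e., so x_1 is not identically zero and
x_2(1)=\int_0^1 x_1^2(t)\,dt>0 .

% Sawtooth trajectories with slopes \pm 1:
x_1^N(t)=\operatorname{dist}\Bigl(t,\tfrac{2}{N}\mathbb{Z}\Bigr),\qquad
0\le x_1^N(t)\le \tfrac{1}{N},\qquad
x_2^N(t)=\int_0^t \bigl(x_1^N(\tau)\bigr)^2\,d\tau\le \tfrac{1}{N^2}.

% Hence (x_1^N,x_2^N)\to(0,0) uniformly on [0,1]: the zero function lies in the
% closure of the set of admissible trajectories and gives f_0 the value 0, the
% infimum over admissible trajectories, although it is not admissible itself.
```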
The above definition of local infimum is not only formally more general than that
of an optimal trajectory, but it also has an advantage over the definition of an optimal trajectory
in that the class of optimal control problems in which the existence of a local infimum can be guaranteed
is considerably larger than the class of problems in which one secures the existence of an optimal
trajectory, because in the former case it is not required to satisfy the cumbersome condition that the
set
[TABLE]
be convex for all t∈[t0,t1] and x∈Rn.
The existence theorem will be proved for the following particular case of problem
(1)–(3):
[TABLE]
here x0∈Rn, f0:Rn→R and g:Rn→Rm.
Theorem 1
Assume that in problem (5)–(7) the set U is compact,
the set of admissible trajectories is nonempty, and there exists a constant K>0 such that
[TABLE]
for all t∈[t0,t1], x∈Rn and u∈U. Then problem (5)–(7)
has a global infimum.
Note that if the conditions of this theorem are augmented with the additional condition
that the set
(4) be convex for all t∈[t0,t1] and x∈Rn, then we get by Filippov’s theorem [3]
the conditions for the existence of an optimal trajectory for problem (5)–(7).
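**Proof**
We set $\Sigma^{n+1}=\{\alpha=(\alpha_1,\dots,\alpha_{n+1})\in\mathbb{R}^{n+1}_{+}:\sum_{i=1}^{n+1}\alpha_i=1\}$ and define $f(t,x,u)=\sum_{i=1}^{n+1}\alpha_i\varphi(t,x,u_i)$, where $u=(u_1,\dots,u_{n+1},\alpha_1,\dots,\alpha_{n+1})\in Q=U^{n+1}\times\Sigma^{n+1}$. Consider the problem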
[TABLE]
It is clear that the set Q is compact and that any admissible trajectory for system
(6), (7) is admissible also for system (10), (11).
From (8) and from the form of the function f one easily finds that
[TABLE]
for all t∈[t0,t1], x∈Rn and u∈Q.
We next note that for each t∈[t0,t1] and x∈Rn the set
[TABLE]
is convex, because, on the one hand, R(t,x) is clearly contained in the convex hull of φ(t,x,U),
and on the other hand, by Carathéodory’s theorem, any element from this convex hull
can be represented as a convex combination of at most n+1 elements from φ(t,x,U), and
hence, R(t,x) coincides with the convex hull of φ(t,x,U).
So, all the hypotheses of A. F. Filippov’s theorem are satisfied. This result states in essence that
the set of admissible trajectories for system (10), (11) is compact. We let γ
denote the infimum of the numbers f0(x(t1)) over all
admissible trajectories x(⋅) for system (6), (7). These trajectories
are clearly also admissible for system (10), (11), and hence γ is finite.
There exists a sequence xN(⋅) of these trajectories such that the sequence
f0(xN(t1)) converges to γ. Hence by compactness it can be assumed that this sequence
converges to some function x(⋅)∈C([t0,t1],Rn), and therefore, x(⋅)
lies in the closure of the set of all admissible trajectories for system
(6), (7). Next, since f0 is continuous, we have
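f0(x(t1))=limN→∞f0(xN(t1))=γ⩽f0(y(t1)) for any admissible
trajectory y(⋅) for system (6), (7) and, by the continuity of f0, for any function y(⋅) in the closure of the set of these trajectories. Therefore, x(⋅) is a global
infimum for problem (5)–(7).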
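The Carathéodory step used in the proof can be illustrated in the scalar case n=1, where any point of the convex hull of a set of velocities is a convex combination of at most n+1=2 of them. The following is a minimal sketch under that assumption. The helper name `two_point_weights` and the sample velocity set {−1, 1} are hypothetical and do not come from the paper.

```python
from typing import Optional, Sequence, Tuple


def two_point_weights(velocities: Sequence[float], target: float) -> Tuple[float, float, float]:
    """Return (v_lo, v_hi, alpha) with alpha * v_lo + (1 - alpha) * v_hi == target.

    Scalar (n = 1) instance of Caratheodory's theorem: two sample velocities
    suffice to represent any point of their convex hull.
    """
    below = [v for v in velocities if v <= target]
    above = [v for v in velocities if v >= target]
    if not below or not above:
        raise ValueError("target lies outside the convex hull of the samples")
    v_lo: Optional[float] = max(below)   # nearest sample from below
    v_hi: Optional[float] = min(above)   # nearest sample from above
    if v_lo == v_hi:                     # target is itself a sample velocity
        return v_lo, v_hi, 1.0
    alpha = (v_hi - target) / (v_hi - v_lo)
    return v_lo, v_hi, alpha


# Hypothetical velocity set phi(t, x, U) = {-1, 1}; target velocity 0.5.
v_lo, v_hi, alpha = two_point_weights([-1.0, 1.0], 0.5)
combo = alpha * v_lo + (1.0 - alpha) * v_hi
```

Here the target velocity 0.5 is not attainable by any single control value, but it is recovered with weight `alpha` on −1 and `1 - alpha` on 1.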
For any k∈N, we set
[TABLE]
and associate with system (2), (3) the control system
[TABLE]
where u(⋅)=(u1(⋅),…,uk(⋅)) and α(⋅)=(α1(⋅),…,αk(⋅)).
This system will be called a convex extension (relaxation) of system
(2), (3) (or simply a convex system).
As before, a trajectory x(⋅)∈AC([t0,t1],Rn) is called admissible for a convex system (12), (13) if there exist u(⋅)∈(L∞([t0,t1],Rr))k and α(⋅)∈(L∞([t0,t1]))k
satisfying conditions (12), (13).
A triple (x(⋅),u(⋅),α(⋅)) will also be called admissible for
the convex system (12), (13).
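To make the convex extension concrete, here is a small sketch that assumes the convex system has the form ẋ=∑αiφ(t,x,ui) with ui(t)∈U and nonnegative weights αi(t) summing to one, in line with the function f used in the proof of Theorem 1. The function names, the scalar state, and the explicit Euler scheme are illustrative choices, not part of the paper.

```python
from typing import Callable, Sequence

Phi = Callable[[float, float, float], float]


def relaxed_rhs(phi: Phi, t: float, x: float,
                controls: Sequence[float], weights: Sequence[float]) -> float:
    """Right-hand side sum_i alpha_i * phi(t, x, u_i) of the assumed convex system."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to 1")
    return sum(w * phi(t, x, u) for w, u in zip(weights, controls))


def euler_relaxed(phi: Phi, x0: float, t0: float, t1: float,
                  controls: Sequence[float], weights: Sequence[float],
                  steps: int = 1000) -> float:
    """Integrate the relaxed system with constant controls and weights (explicit Euler)."""
    h = (t1 - t0) / steps
    x, t = x0, t0
    for _ in range(steps):
        x += h * relaxed_rhs(phi, t, x, controls, weights)
        t += h
    return x


# Hypothetical data: phi(t, x, u) = u, U = {-1, 1}, k = 2.
phi = lambda t, x, u: u
x_balanced = euler_relaxed(phi, 0.0, 0.0, 1.0, controls=(1.0, -1.0), weights=(0.5, 0.5))
x_biased = euler_relaxed(phi, 0.0, 0.0, 1.0, controls=(1.0, -1.0), weights=(0.75, 0.25))
```

With equal weights the relaxed velocity is zero, a value that no single control from {−1, 1} produces. With weights (0.75, 0.25) the relaxed velocity is 0.5.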
We need some more notation.
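We let $\langle\lambda,x\rangle=\sum_{i=1}^{n}\lambda_i x_i$
denote the value of a linear functional
$\lambda=(\lambda_1,\dots,\lambda_n)\in(\mathbb{R}^n)^*$ at a point
$x=(x_1,\dots,x_n)^T\in\mathbb{R}^n$ (T is the transpose).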
The Euclidean norm of an element x∈Rn is denoted by ∣x∣.
By (Rn)+∗ we denote the set of functionals on Rn which assume nonnegative values on nonnegative vectors.
The adjoint operator to a linear operator
Λ:Rn→Rm will be denoted by Λ∗.
If x(⋅) is a fixed function, then, to alleviate the notation, the partial derivatives
of the mappings f0, f and g with respect to ζ1 and ζ2 at a point (x(t0),x(t1))
will be written as f0ζi, fζi and gζi,
i=1,2.
Theorem 2
If a function x(⋅)∈AC([t0,t1],Rn) is a local infimum for problem
(1)–(3), then, for any k∈N, u(⋅)=(u1(⋅),…,uk(⋅))
and α(⋅)=(α1(⋅),…,αk(⋅)), for which the triple
(x(⋅),u(⋅),α(⋅)) is admissible for the convex system (12), (13),
there exist a nonzero tuple (λ0,λf,λg)∈R+×(Rm1)+∗×(Rm2)∗ and a vector function p(⋅)∈AC([t0,t1],(Rn)∗) such that the following conditions hold:
1)
the stationarity condition with respect to x(⋅)
[TABLE]
2)
the transversality condition
[TABLE]
3)
the complementary slackness condition
[TABLE]
4)
the maximum condition for almost all t∈[t0,t1]
[TABLE]
If U is a compact set, then a local infimum for problem (1)–(3) is an
admissible trajectory for the convex system (12), (13) with k=n+1.
It can be seen that necessary optimality conditions form a family of relations which is parameterized by triples
(x(⋅),u(⋅),α(⋅)), each relation having the form of a maximum principle. Moreover,
if x(⋅) is an optimal trajectory for problem
(1)–(3), then these relations contain the classical Pontryagin maximum principle
(k=1, u1(⋅)=u(⋅), α1(⋅)=1), as well as some other relations, which, in general,
can provide additional information about the optimal trajectory (see Example 2 in the section ‘‘Examples’’).
So, the above theorem strengthens the Pontryagin maximum principle.
If a local infimum is not an admissible trajectory for system
(2), (3), then this theorem provides a tool (similar to a large extent to that based
on the Pontryagin maximum principle) for finding the trajectories ‘‘suspicious’’ for a local infimum.
Moreover, if such a trajectory is found from conditions 1)–4), which are satisfied only for
λ0=0, then by Theorem 3 (to be proved below), this trajectory lies in the closure of the set
of admissible trajectories for system (2), (3).
All this will be illustrated in Example 1 in the section ‘‘Examples’’.
Example 3 in the section ‘‘Examples’’ shows that the assumption in the last assertion of the theorem
that the set U be compact is essential.
Proof of Theorem 2. Below, to alleviate the notation, we frequently write x, u, α,
etc., in place of x(⋅), u(⋅), α(⋅), etc.
A neighborhood of a point x from a normed space will be denoted by O(x).
We introduce the following notation for controls in the convex system (12), (13).
We set
[TABLE]
and define
[TABLE]
where, we recall, $\Sigma^{k}=\{\alpha=(\alpha_1,\dots,\alpha_k)\in\mathbb{R}^{k}_{+}:\sum_{i=1}^{k}\alpha_i=1\}$.
Let (x,u,α) be a triple from the statement of the theorem
which is admissible for the convex system (12), (13), N>k, α′=(α,0,…,0)∈AN,
v=(v1,…,vN−k)∈UN−k and u′=(u1,…,uk,v1,…,vN−k).
By the condition, x is a solution to the differential equation
[TABLE]
on [t0,t1]. By Lemma 2 there exist neighborhoods O(x(t0))
and O(α′) such that, for all ξ∈O(x(t0)) and
α=(α1,…,αN)∈O(α′), there exists a unique solution
x(⋅,ξ,α;u′) to the equation
[TABLE]
on [t0,t1], where the controls are the components of u′, that is, u1,…,uk followed by v1,…,vN−k. Moreover, the mapping (ξ,α)↦x(⋅,ξ,α;u′),
qua a mapping into C([t0,t1],Rn), is continuously differentiable.
Let us define the mapping Φ, which associates with a quadruple (ξ,α,ν0,ν)
from O(x(t0))×O(α′)×R×Rm1 a vector from
R1+m1+m2 by the rule
[TABLE]
The dependence of this mapping on a fixed tuple u′ will not be indicated.
The mapping Φ is clearly well defined and is continuously differentiable
with respect to (ξ,α,ν0,ν) by the properties of the mapping (ξ,α)↦x(⋅,ξ,α;u′) and the mappings f0, f and g.
The scheme of the proof of the necessary conditions in Theorem 2 is as follows. We show that the
inclusion
[TABLE]
where w=(x(t0),α′,0,−f(x(t0),x(t1))), contradicts the fact that x is a local infimum
for problem (1)–(3). Then, separating from zero the convex set on the right of (17)
for each N>k and
v=(v1,…,vN−k)∈UN−k, we get all the necessary conditions formulated in the theorem.
Let us use Lemma 1 in the case X=Rn×(L∞([t0,t1]))N×R×Rm1, K=Rn×AN×R+×R+m1, and
w=(x(t0),α′,0,−f(x(t0),x(t1))).
Let O0(x(t0)) and O0(α′) be neighborhoods from Lemma 4. Reducing these neighborhoods
and considering bounded neighborhoods O0(0) (of the origin in R) and
O0(−f(x(t0),x(t1))), one can assume that the mapping Φ is bounded on
V=O0(x(t0))×O0(α′)×O0(0)×O0(−f(x(t0),x(t1))).
The inclusion (17) is equivalent to the inclusion 0∈intΦ′(w)(K−w). So,
all the hypotheses of Lemma 1 are satisfied.
By Lemma 4, for sufficiently large s∈N, the continuous
mappings (ξ,α)↦xs(⋅,ξ,α;u′) from M=O0(x(t0))×(O0(α′)∩AN) into C([t0,t1],Rn) are defined. Hence, for such s, the
continuous mappings
[TABLE]
are defined on M×R×Rm1.
Since the mappings (ξ,α)↦xs(⋅,ξ,α;u′)
lie in the space C(M,C([t0,t1],Rn)) and converge in this space
to the mapping (ξ,α)↦x(⋅,ξ,α;u′) as
s→∞, and since the mappings
f0, f and g are continuously differentiable, it easily follows that the mappings (ξ,α,ν0,ν)↦Φs(ξ,α,ν0,ν) lie in the space C(V∩K,R1+m1+m2) (with reduced neighborhood V, if necessary) and converge in this space
to the mapping (ξ,α,ν0,ν)↦Φ(ξ,α,ν0,ν) as s→∞.
Let ε>0. There exists s0∈N such that
∥xs(⋅,ξ,α;u′)−x(⋅)∥C([t0,t1],Rn)<ε/2 for all s⩾s0 and
(ξ,α)∈M. Next, since the mapping
(ξ,α)↦x(⋅,ξ,α;u′) is continuous at
(x(t0),α′), there exists δ0>0 such that
∥x(⋅,ξ,α;u′)−x(⋅)∥C([t0,t1],Rn)<ε/2 if
∣ξ−x(t0)∣+∥α−α′∥(L∞([t0,t1]))N<δ0.
As a result, we see that if s⩾s0 and if a pair (ξ,α)∈M is such that
∣ξ−x(t0)∣+∥α−α′∥(L∞([t0,t1]))N<δ0, then
[TABLE]
Let a neighborhood V0⊂V of the point w and constants r0 and γ be from Lemma 1.
We choose r∈(0,r0] so as to have γr⩽δ0 and let s⩾s0
be such that Φs∈UC(V∩K,R1+m1+m2)(Φ,r).
We have Φ(w)=0, and hence the pairs (w,y), where y∈R1+m1+m2 and
∣y∣⩽r, satisfy relation (63) of Lemma 1.
Let z=(w,y) be such a pair and let y be of the form y=(y1,0), where y1<0.
If gΦs is the mapping from this lemma, then for this pair the lemma asserts that
(here we denote gΦs(z)=wz=(ξz,αz,ν0z,νz))
[TABLE]
and
[TABLE]
By Lemma 4 the function xs(⋅,ξz,αz;u′) is a solution
of the equation
[TABLE]
From the definition of us(αz;u′) (see Lemma 3) it follows that
us(αz;u′)(t)∈U for almost all t∈[t0,t1]. It is also clear that
ξz=xs(t0,ξz,αz;u′).
Now from the second and third relations in (19) and using the inequality
νz⩾0, it follows that the function xs(⋅,ξz,αz;u′) is admissible
for system (2), (3). Moreover, from the first relation it follows that on this function the value of the functional f0 is smaller
than on x (since ν0z⩾0 and y1<0).
Finally, by (20) and the choice of r (γr⩽δ0), the pair (ξz,αz) lies within δ0 of (x(t0),α′), and hence by (18) we have ∥xs(⋅,ξz,αz;u′)−x(⋅)∥C([t0,t1],Rn)<ε.
So, in any neighborhood of the point x there exists a function which is
admissible for system (2), (3) and on which the value of the functional to be minimized
is smaller than on x. This contradicts the fact that x is a local infimum for problem (1)–(3).
So, inclusion (17) does not hold for any N>k and any tuple v=(v1,…,vN−k)∈UN−k.
Therefore, for any such N and v, it follows from the separation theorem that there exists
a nonzero vector λ(v)∈(R1+m1+m2)∗ for which
[TABLE]
for all (ξ,α,ν0,ν)∈Rn×AN×R+×R+m1.
Let λ(v)=(λ0(v),λ1(v),λ2(v))∈R×(Rm1)∗×(Rm2)∗. By the chain rule, inequality (21) can be written as
[TABLE]
for any (ξ,α,ν0,ν)∈Rn×AN×R+×R+m1, where
xξ=xξ(v) and xα=xα(v)
are the partial derivatives of the mapping
(ξ,α)↦x(⋅,ξ,α;u′) at the point (x(t0),α′)
with respect to ξ and α, respectively.
Let us show that there exists a tuple
λ=(λ0,λf,λg)∈R×(Rm1)∗×(Rm2)∗, ∣λ∣=1, such that
(22) holds with this λ for any
N>k and any tuple v=(v1,…,vN−k)∈UN−k.
For a given tuple v=(v1,…,vN−k)∈UN−k
we denote by ΛN(v) the set of all such vectors
λ(v), ∣λ(v)∣=1, that satisfy
(22). It is clear that ΛN(v) is a closed
subset of the (compact) unit sphere of (R1+m1+m2)∗. Let us check that the family
A of all such subsets (over all N>k and tuples v=(v1,…,vN−k)∈UN−k) has the finite intersection property.
Let ΛNj(vj), j=1,…,s, be an arbitrary family of sets
from A (Nj−k is the length of the vector vj). Let us show that ∩j=1sΛNj(vj)≠∅.
Indeed, we set v=(v1,…,vs) (the length of v
is denoted by N). Let 1⩽j⩽s.
Consider the family vj and define αj=(α1,…,αk,αk+1,…,αNj)∈ANj.
We augment the vector αj by zero functions to the vector α of length N. It is clear that α∈AN.
Setting αNj′=(α,0)∈ANj, it is easily seen that
[TABLE]
Hence, using (22), we get the inclusion λ(v)∈ΛNj(vj), thereby showing that λ(v)∈∩j=1sΛNj(vj).
So, the system of sets A has the finite intersection property, and hence,
there exist λ0∈R, λf∈(Rm1)∗ and λg∈(Rm2)∗ such that (22)
holds for any tuple v. In particular,
this relation holds for the tuples consisting of a single element v; that is,
[TABLE]
for all (ξ,α,ν0,ν)∈Rn×Ak+1×R+×R+m1 and v∈U.
Now we employ this inequality to derive the necessary conditions from the theorem.
Setting ξ=0, α=α′ and
ν=−f(x(t0),x(t1)) in (23), we see that λ0ν0⩾0 for any ν0⩾0,
and hence, λ0⩾0.
If ξ=0, α=α′, ν0=0 and ν=ν′−f(x(t0),x(t1)), where ν′∈R+m1, then
from (23) it follows that ⟨λf,ν′⟩⩾0 for any ν′∈R+m1; that is, λf∈(Rm1)+∗.
Let ξ=0, α=α′, ν0=0 and ν=0. Now from inequality (23)
it follows that ⟨λf,f(x(t0),x(t1))⟩⩾0. But λf∈(Rm1)+∗, f(x(t0),x(t1))⩽0, and therefore,
⟨λf,f(x(t0),x(t1))⟩⩽0; that is,
⟨λf,f(x(t0),x(t1))⟩=0, which proves the complementary slackness condition.
Let p be the solution of the equation
[TABLE]
In (23) we put α=α′, ν0=0 and ν=−f(x(t0),x(t1)).
Since ξ∈Rn, we have
[TABLE]
From (74) of Lemma 2 it follows that the derivative xξ(v) (which is identified
with the corresponding matrix function) satisfies the equation
For any 1⩽i⩽k, we set
(α1,…,αi−1,(1/2)αi,αi+1,…,αk,(1/2)αi)∈Ak+1. Substituting this tuple into the last inequality,
we see that
[TABLE]
for all v(⋅)∈U.
We let T0 denote the set of Lebesgue points of the functions αi(⋅) and ⟨p(⋅),φ(⋅,x(⋅),ui)⟩, i=1,…,k, on (t0,t1). Since these
functions are essentially bounded, it can be easily checked that T0 is the set of Lebesgue points
also for the functions αi(⋅)⟨p(⋅),φ(⋅,x(⋅),ui)⟩, i=1,…,k.
Let τ∈T0. We fix 1⩽i⩽k. For any h>0 such that [τ−h,τ+h]⊂(t0,t1), we set
vh(t)=v if t∈[τ−h,τ+h] and vh(t)=ui(t) if t∈[t0,t1]∖[τ−h,τ+h]. It is clear that vh(⋅)∈U and so, using (29),
[TABLE]
Since the function φ(⋅,x,v) is continuous, τ is its Lebesgue point, and moreover, by the above,
τ is also a Lebesgue point for the function αi(⋅)⟨p(⋅),φ(⋅,x,v)⟩. Making h→0 in the last inequality, we find that
[TABLE]
for each i=1,…,k.
Adding these inequalities and taking into account equation (14), which
becomes a sharp equality
at the Lebesgue points of the function on the right, we get the relation
[TABLE]
Since v∈U is arbitrary and since T0 is a set of full measure, this relation
is equivalent to condition 4) of the theorem. So, all the necessary conditions in the theorem are proved.
Let us prove the last assertion of the theorem. By definition of local infimum, there exists
a sequence of admissible trajectories for system (2), (3), which converges to x(⋅).
It is clear that these trajectories are also admissible for the convex system
(12), (13) for any k. To prove that the function x(⋅)
is also admissible for this system with k=n+1 we employ Filippov’s theorem from
([3]), in which it is shown, in particular (in our setting) that
if Q is a compact set and the set
[TABLE]
is convex for any t∈[t0,t1] and x∈Rn, then the limit of a converging
sequence of admissible trajectories is also an admissible trajectory.
Since in our setting the set Q is clearly
compact, and since the convexity of the set R(t,x) for any
t∈[t0,t1] and x∈Rn is secured by Theorem 1, this proves the
last assertion of Theorem 2.
To formulate the next result we introduce the concept of regularity of the convex system
(12), (13).
Let k∈N and let (x(⋅),u(⋅),α(⋅)) be an admissible triple for the convex system
(12), (13). By Λ(x(⋅),u(⋅),α(⋅)) we denote the set
of tuples (λf,λg,p(⋅))∈(Rm1)∗×(Rm2)∗×AC([t0,t1],(Rn)∗), where λf and λg are not simultaneously zero, satisfying
the relations
[TABLE]
The condition Λ(x(⋅),u(⋅),α(⋅))≠∅ means that
the necessary conditions of geometric optimality in the form of a maximum principle are satisfied for the
convex system
(12), (13) at the point (x(⋅),u(⋅),α(⋅)). Therefore, the negation of this condition
(that is, the case when Λ(x(⋅),u(⋅),α(⋅))=∅) can be looked upon as a regularity
condition for the convex system (12), (13) at the point
(x(⋅),u(⋅),α(⋅)). Taking this into account, we say that the convex system (12), (13) is regular at a point (x(⋅),u(⋅),α(⋅)) if Λ(x(⋅),u(⋅),α(⋅))=∅.
Theorem 3
If a convex system (12), (13) is regular at a point
(x(⋅),u(⋅),α(⋅)), then x(⋅) lies in the closure of the set of admissible trajectories
for system (2), (3).
**Proof**
At the beginning of the proof of the previous theorem, we introduced the mapping Φ.
Let us consider here its ‘‘truncated’’ variant
[TABLE]
which differs from the mapping Φ by the absence of the component
f0(ξ,x(t1,ξ,α;u′))−f0(x(t0),x(t1))+ν0.
We claim that if a convex system (12), (13) is regular at a point
(x(⋅),u(⋅),α(⋅)), then, for some N>k and a tuple v=(v1,…,vN−k)∈UN−k,
[TABLE]
where w=(x(t0),α′,−f(x(t0),x(t1))) (inclusion (31) is similar to
inclusion (17)).
Indeed, if inclusion (31) is not satisfied for any N>k and any tuple v=(v1,…,vN−k)∈UN−k, then arguing as in Theorem 2, we conclude that
conditions (30) (which coincide with the necessary conditions in Theorem 2 with λ0=0)
hold with some nonzero tuple (λf,λg), contradicting the assumption.
But if inclusion (31) holds, then arguing again as in the proof of Theorem 2
in the part pertaining to the inverse function lemma we conclude that, for each
ε>0, there exists an admissible function for system (2), (3) which differs
by less than ε from x(⋅) in the metric of C([t0,t1],Rn), which proves the theorem.
Let us derive two corollaries from this theorem. Let us consider the problem of minimization of the functional f0
(see (1)) on trajectories of the convex system (12), (13). This problem will be referred to as
the convex problem (1), (12), (13). The concept of an
optimal trajectory for this problem is defined in a natural way. Moreover, when speaking
about an admissible trajectory for problem (1), (12), (13) we imply that this trajectory
is admissible for system (12), (13), which imposes constraints in this problem.
Corollary 1
If a convex system (12), (13) is regular at a point
(x(⋅),u(⋅),α(⋅)) and if x(⋅) is an optimal trajectory in the convex problem
(1), (12), (13), then x(⋅) is a local infimum for problem
(1)–(3).
**Proof**
By the hypothesis, there exists a neighborhood V of the point x(⋅) such that f0(y(t0),y(t1))⩾f0(x(t0),x(t1)) for any
trajectory y(⋅)∈V admissible for system (12), (13). In particular, this is true if y(⋅) is an admissible trajectory for
system (2), (3). Since by Theorem 3 the function x(⋅)
lies in the closure of the set of admissible trajectories for system (2), (3),
it follows that x(⋅) is a local infimum for problem (1)–(3).
We now give the definition of a sliding regime.
**Definition**
By a sliding regime for system (2), (3) we mean a function lying in the closure
of the set of admissible trajectories for this system but which does not lie in this set.
It is clear that if under the hypotheses of Corollary 1 an optimal trajectory x(⋅) in the convex
problem (1), (12), (13) is not an admissible trajectory for system
(2), (3), then x(⋅) is a sliding regime for this system.
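A sliding regime can be visualized numerically with a hypothetical scalar system ẋ=u, u∈{−1, 1}, which is not an example from the paper. A control that switches rapidly between +1 and −1, spending a fraction α of each period at +1, produces admissible trajectories that approach the averaged trajectory x(t)=(2α−1)t, which itself is not admissible when 0<α<1. The sketch below measures the uniform distance between the two. The function name and parameters are illustrative.

```python
def chattering_gap(alpha: float, periods: int, steps_per_period: int = 100) -> float:
    """Max distance on [0, 1] between a chattering trajectory and the averaged one.

    The chattering control equals +1 on the first alpha-fraction of each period
    and -1 on the rest; the averaged trajectory is x(t) = (2 * alpha - 1) * t.
    """
    plus_steps = round(alpha * steps_per_period)   # steps with u = +1 in each period
    total_steps = periods * steps_per_period
    h = 1.0 / total_steps
    x = 0.0
    gap = 0.0
    for k in range(total_steps):
        u = 1.0 if k % steps_per_period < plus_steps else -1.0
        x += h * u                                  # exact for piecewise-constant u
        averaged = (2.0 * alpha - 1.0) * (k + 1) * h
        gap = max(gap, abs(x - averaged))
    return gap


gap_coarse = chattering_gap(alpha=0.25, periods=10)
gap_fine = chattering_gap(alpha=0.25, periods=100)
```

Increasing the number of switching periods shrinks the gap in proportion to the period length, so the averaged trajectory lies in the closure of the set of admissible trajectories without belonging to it.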
Corollary 2
If a convex system (12), (13) is regular at a point
(x(⋅),u(⋅),α(⋅)) and if x(⋅) is not an admissible trajectory for system
(2), (3), then x(⋅) is a sliding regime for this system. On the other hand,
if x(⋅) is a sliding regime for system (2), (3) and if the set
U is compact, then x(⋅) is an admissible trajectory for the convex system
(12), (13) for k=n+1.
**Proof**
The first assertion is a clear corollary to Theorem 3. The second assertion follows from the proof of
the last assertion of Theorem 2, because it involves only the existence of a sequence
of admissible trajectories converging to x(⋅).
Examples
In this section we give examples illustrating the results of the first section.
Example 1
Consider the following optimal control problem
[TABLE]
where a function f:[0,1]→R is absolutely continuous, f(0)=0, ∣f˙(t)∣⩽1
and ∣f˙(t)∣≠1 for almost all t∈[0,1], and U=(−∞,−1]∪[1,+∞).
To deal with this problem, we first follow the standard approach, namely, we try to find an optimal
trajectory with the help of the Pontryagin maximum principle. By examining the
conditions of this principle, we show that there is no optimal trajectory in this problem.
Next, using Theorem 2, we find a function ‘‘suspected’’ for a local infimum. And finally,
by an appeal to Theorem 3 we show that the function thus obtained is a local infimum in problem (32).
So, let us assume that x(⋅)=(x1(⋅),x2(⋅)) is an optimal trajectory in problem
(32); that is, there exists u(⋅)∈L∞([0,1]) such that the pair
(x(⋅),u(⋅)) is admissible for this problem and delivers a strong minimum in it. Then by the
Pontryagin maximum principle, there exist a nonzero absolutely continuous vector function
p(⋅)=(p1(⋅),p2(⋅)) and a number λ0⩾0 such that
[TABLE]
and
[TABLE]
for almost all t∈[0,1].
Let us show that this implies the equality u(⋅)=f˙(⋅), which is contradictory, because by the condition
∣u(t)∣⩾1 for almost all t∈[0,1], and ∣f˙(t)∣<1 on a set of positive measure. This will prove that problem (32)
has no optimal trajectory.
From (33) it follows that the function p2(⋅) is constant. This constant is nonzero, because
if p2=0, then from (33) it follows that p1(⋅) is a nonzero constant, and in this case
equality (34) is clearly impossible. We set p2=−1.
From (34) with u=1 we see that p1(t)−1⩽p1(t)u(t)−u2(t) for almost all
t∈[0,1], or what is the same
In turn, from (35) and (36) we find that, for almost all t∈[0,1],
[TABLE]
Indeed, if on some set of positive measure the function p1(⋅)
is nonnegative, then in view of the properties of f(⋅) inequality (37) readily follows from
(35), and if it is nonpositive, then (37) follows from (36).
From (37), (33) and from the boundary conditions in problem (32) it follows that
(u(⋅)=x˙1(⋅))
[TABLE]
that is, x1(t)=f(t) for all t∈[t0,t1], and therefore, u(t)=x˙1(t)=f˙(t)
for almost all t∈[0,1]. But, as was already noted, this is impossible, and hence problem
(32) has no optimal trajectory.
So, the Pontryagin maximum principle gives nothing for the problem under consideration.
Let us employ Theorem 2 to find a function delivering a local
infimum in problem (32). Applying this theorem with k=2, we conclude that if
x(⋅)=(x1(⋅),x2(⋅)) is a local infimum, then, for any measurable function α(⋅), 0⩽α(t)⩽1,
for almost all t∈[t0,t1] and any functions ui(⋅)∈L∞([0,1]), ui(t)∈U
for almost all t∈[t0,t1], i=1,2, such that
[TABLE]
x1(0)=x2(0)=0 and x1(1)=f(1), there exist a nonzero absolutely continuous
vector function p(⋅) and a number λ0⩾0 such that
[TABLE]
and moreover, for almost all t∈[0,1],
[TABLE]
It is seen that relations (39) coincide with (33) and are independent of
u1(⋅), u2(⋅) and α(⋅).
In this case, our problem becomes simpler: one needs to find functions x1(⋅) and x2(⋅)
satisfying the convex system (38) (at least for one tuple u1(⋅), u2(⋅) and
α(⋅)) and such that there exist a nonzero absolutely continuous vector function
p(⋅)=(p1(⋅),p2(⋅)) and a number λ0⩾0 satisfying
(39) and (40).
From the above relations, repeating in essence the previous arguments in the proof of
inequalities (35) and (36), we get the inequalities
[TABLE]
and
[TABLE]
for almost all t∈[0,1]. Using these inequalities we find, as before, that
[TABLE]
for almost all t∈[0,1].
Further, repeating now verbatim the above arguments, we see that x1(t)=f(t) for all
t∈[t0,t1]. Now from (39) it follows that the function p1(⋅) is constant.
Since ∣f˙(t)∣≠1 for almost all t∈[0,1], the equalities f˙(t)=1 and f˙(t)=−1 are impossible for almost all t∈[0,1], and hence from inequalities (41) and
(42) we get p1=0.
We have x1(t)=f(t), t∈[0,1], and hence from (40) for u=1 we get the inequality
x˙2(t)⩽1 for almost all t∈[0,1]. On the other hand, by the second of
(38) and the definition of the set U, we conclude that x˙2(t)⩾1 for almost all
t∈[0,1]; that is, x˙2(t)=1 a.e. on [0,1], and so, x2(t)=t.
So, for any functions x(⋅), u1(⋅), u2(⋅)
and α(⋅) admissible for the convex system, we get a unique trajectory x(t)=(f(t),t),
t∈[0,1], which is suspicious for a local infimum in problem (32) and which is
not admissible for this problem (for otherwise there should exist a control
u(⋅) such that ∣u(t)∣⩾1 and x˙1(t)=f˙(t)=u(t) for almost all
t∈[0,1], but this is impossible, as was already pointed out).
Moreover, as u1(⋅),
u2(⋅) and α(⋅) one can take u1(t)=1, u2(t)=−1 and
α(t)=(1−f˙(t))/2 for almost all t∈[0,1].
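The choice u1=1, u2=−1 can be cross-checked numerically. With the normalization p2=−1 used above, the expression maximized over U is p1u−u², and for p1=0 its maximum over U=(−∞,−1]∪[1,+∞) is attained exactly at u=±1. The following sketch samples a bounded part of U on a grid. The truncation bound `u_max` and the grid size are assumptions made only for the illustration.

```python
from typing import List, Tuple


def hamiltonian(p1: float, u: float) -> float:
    """Expression p1 * u - u**2 maximized in Example 1 (normalization p2 = -1)."""
    return p1 * u - u * u


def argmax_over_U(p1: float, u_max: float = 3.0, n: int = 2001) -> Tuple[List[float], float]:
    """Maximize over a grid on [-u_max, -1] U [1, u_max], a truncated sample of U."""
    positive = [1.0 + (u_max - 1.0) * i / (n - 1) for i in range(n)]
    samples = [-u for u in positive] + positive
    best = max(hamiltonian(p1, u) for u in samples)
    maximizers = sorted(u for u in samples if abs(hamiltonian(p1, u) - best) < 1e-12)
    return maximizers, best


maximizers_zero, best_zero = argmax_over_U(0.0)   # the case p1 = 0 found in the text
maximizers_half, best_half = argmax_over_U(0.5)   # a positive p1 singles out u = 1
```

For p1=0 both boundary points of U maximize the expression, which is why a convex combination of the two controls can satisfy the maximum condition along the same trajectory.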
Let us show that the trajectory thus found is a global infimum in problem (32).
To this end, we employ Theorem 3. The regularity of the convex system (38) at
a point (x(⋅), u1(⋅), u2(⋅), α(⋅)) means that the relations
(39) and (40) with λ0=0 are satisfied only by the zero
vector function p(⋅)=(p1(⋅),p2(⋅)). But this is indeed so: it is clear that p2(⋅)=0,
and moreover, that p1(⋅)=0 was already proved above.
Therefore, by Theorem 3 the trajectory x(⋅) lies in the closure of the admissible trajectories for problem (32).
Next, for any admissible trajectory x(⋅)=(x1(⋅),x2(⋅)) we have
[TABLE]
and hence x(⋅) is a global infimum for problem (32).
Note that the trajectory x(⋅) is a sliding regime for the system specifying the constraints in problem (32).
One can easily construct a sequence of admissible trajectories
xn(⋅)=(x1n(⋅),x2n(⋅)) for problem (32) such that x2n(1)→1 as n→∞. Let n∈N.
We split the interval [0,1] into n intervals:
[s/n,(s+1)/n], s=0,…,n−1. We set bn(s)=f(s/n)−(s/n) and
cn(s)=f((s+1)/n)+(s+1)/n, s=0,…,n−1. It can be easily checked that
((cn(s)−bn(s))/2)∈[s/n,(s+1)/n], s=0,…,n−1.
Consider the sequence x1n(⋅) defined by
[TABLE]
s=0,…,n−1.
Each x1n(⋅) is a broken line whose segments have slopes ±1 and which interpolates f(⋅) at the points
s/n, s=0,…,n; the sequence x1n(⋅) converges uniformly to f(⋅). We set un(⋅)=x˙1n(⋅) and
[TABLE]
Since ∣un(t)∣=1 for almost all t∈[0,1], the pairs (x1n(⋅),x2n(⋅)),
n∈N, are admissible for problem (32). Moreover, it is clear that x2n(1)→1 as n→∞.
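The construction of x1n(⋅) can be illustrated numerically. The sketch below is not part of the original argument: the function f(t)=0.3·sin(3t) is a hypothetical stand-in with ∣f˙∣<1 (the function f of problem (32) is not reproduced here), and the piecewise rule (slope +1 up to the point (cn(s)−bn(s))/2, slope −1 after it) is inferred from the definitions of bn(s) and cn(s), because the displayed formula is not visible in this text.

```python
import math

def f(t):
    # Hypothetical stand-in for the function f of the example; |f'(t)| <= 0.9 < 1.
    return 0.3 * math.sin(3.0 * t)

def zigzag(n):
    """Broken line x_1n on [0, 1] with slopes +1/-1 that matches f at the nodes s/n."""
    def x1n(t):
        s = min(int(t * n), n - 1)          # subinterval [s/n, (s+1)/n] containing t
        b = f(s / n) - s / n                # b_n(s): the rising piece is t + b
        c = f((s + 1) / n) + (s + 1) / n    # c_n(s): the falling piece is -t + c
        kink = (c - b) / 2.0                # abscissa where the two pieces meet
        return t + b if t <= kink else -t + c
    return x1n

def sup_error(n, samples=2001):
    """Maximum of |x_1n - f| over a uniform grid on [0, 1]."""
    x1n = zigzag(n)
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(x1n(t) - f(t)) for t in grid)

for n in (10, 100):
    print(n, sup_error(n))   # the error stays below 1/n
```

Since both x1n(⋅) and f(⋅) are 1-Lipschitz and coincide at the nodes s/n, the deviation cannot exceed 1/n, which is what the printed values reflect.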
Example 2
Here we give an example in which Theorem 2 yields more information about
an optimal process than the Pontryagin maximum principle does.
Let g:R→R and U⊂R. Consider the problem
[TABLE]
Assume that the function g is continuous, g(0)=0, and 0∈intU.
In an equivalent form, this problem reads as
[TABLE]
Let us show that the equilibrium point x1(⋅)=x2(⋅)=0, u(⋅)=0 satisfies
the Pontryagin maximum principle for problem (44); hence this point is ‘‘suspicious’’ from the
viewpoint of this principle for a strong minimum in this problem. Indeed, the adjoint equation and
the maximum condition at this point are equivalent to the relations
[TABLE]
and
[TABLE]
It is clear that p1(⋅) is a constant, and moreover, since 0∈intU, this constant is zero.
The function p2(⋅) is also constant. Setting, for example, p2=−λ0=−1, we conclude that the
point x1(⋅)=x2(⋅)=0, u(⋅)=0 satisfies the Pontryagin maximum principle.
We now employ Theorem 2 to show that if the point x1(⋅)=x2(⋅)=0, u(⋅)=0
is a point of strong minimum for problem (43), then certain additional meaningful conditions must be satisfied.
Namely, the following result holds.
Proposition 1
If the point x1(⋅)=x2(⋅)=0, u(⋅)=0 delivers a strong minimum for problem
(43), then the function u↦g(u) is linear on some interval centered at the origin.
Proof
We apply Theorem 2 at the point x1(⋅)=x2(⋅)=0. This means, in particular, that
for any ui∈U, i=1,2, and α∈[0,1] such that
[TABLE]
there exist a vector function p(⋅)=(p1(⋅),p2(⋅)) and a number λ0⩾0
satisfying the relations
[TABLE]
and
[TABLE]
We have 0∈intU, and hence, as before, p1(⋅) is
identically zero. As a result, p2≠0, for otherwise all the Lagrange multipliers would be zero.
Let ε>0 be such that [−ε,ε]⊂U. In this case it is clear
that any u1∈[−ε,0), u2∈(0,ε] and
α=u1/(u1−u2)∈(0,1) would satisfy the equations from (45). Now from the first equality in (46) we get
the relation
Let u∈[−ε,ε]. If u<0, then from (47) with u1=u and
u2=ε we get
[TABLE]
If u>0, then by another appeal to (47) with u2=u and u1=−ε and taking into account (48)
we obtain
[TABLE]
If u=0, then by the hypothesis g(0)=0, and hence
[TABLE]
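The choice of α used in the proof can be verified by a direct computation. The sketch below assumes that the averaging relation in (45) reads (1−α)u1+αu2=0, with the weight 1−α attached to u1; this relation is not visible in the present text, but with the opposite weighting the same α would give u1+u2 rather than zero.

```latex
\[
u_1<0<u_2:\qquad
\alpha=\frac{u_1}{u_1-u_2}=\frac{|u_1|}{|u_1|+u_2}\in(0,1),
\qquad
1-\alpha=\frac{-u_2}{u_1-u_2},
\]
\[
(1-\alpha)u_1+\alpha u_2=\frac{-u_1u_2+u_1u_2}{u_1-u_2}=0,
\qquad
\alpha u_1+(1-\alpha)u_2=\frac{u_1^2-u_2^2}{u_1-u_2}=u_1+u_2 .
\]
```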
So, in general, Theorem 2 strengthens the Pontryagin maximum principle.
Example 3
This example shows that the condition that the set U
be compact in the last assertion of Theorem 2 and in the second part of Corollary 2
is essential.
Consider the control system
[TABLE]
and construct the following sequence of functions
[TABLE]
It is clear that this is a sequence of absolutely continuous functions which are admissible for system
(49) and which converge uniformly on [0,1] to the function t↦√t,
whose derivative, clearly, does not lie in L∞([0,1]).
It follows that the equality
[TABLE]
where x(t)=√t, t∈[0,1], cannot be satisfied for almost all t∈[0,1]
for any measurable function α(⋅) for which
0⩽α(t)⩽1 for almost all t∈[0,1] and for any functions ui(⋅)∈L∞([0,1]), i=1,2.
This means that the
trajectory x(⋅), which lies in the closure of the admissible trajectories for system (49),
is not admissible for any convex extension of this system for k=2.
Example 4
Here our aim is to show that the regularity condition
in Corollary 1 and in the first part of Corollary 2 is essential.
Consider the problem
[TABLE]
If a trajectory x(⋅)=(x1(⋅),x2(⋅),x3(⋅)) is admissible for the system specifying
the constraints in this problem, then from the third differential equation and the boundary
conditions it follows that x1(⋅)=x2(⋅). Now from the first and second differential equations
we conclude that, for all t∈[0,1], the equality
∫₀ᵗu(τ)dτ=∫₀ᵗ(4u²(τ)−3u³(τ))dτ holds for some u(⋅)∈L∞([0,1]) for which u(t)∈U for almost all t∈[0,1]. It follows that
u(t)=4u²(t)−3u³(t) for almost all t∈[0,1]. The control u(⋅) cannot assume
values not lying in U on a set of positive measure. Hence from the last equality it follows that u(t)
is either 1/3 or 1 for almost all t∈[0,1]. This implies, in particular, that
x1(1)=∫₀¹u(t)dt⩾1/3 for any admissible trajectory.
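The pointwise relation between u(t) and its square and cube can be checked in exact arithmetic. The following sketch only lists the real roots of 4u²−3u³−u=0; the set U itself is not reproduced in this text, and the conclusion that u(t) is 1/3 or 1 presupposes that the root 0 does not belong to U.

```python
from fractions import Fraction

def defect(u):
    """Left-hand side of 4u^2 - 3u^3 - u = 0, the pointwise constraint on u(t)."""
    return 4 * u**2 - 3 * u**3 - u

# 4u^2 - 3u^3 - u = -u (3u - 1)(u - 1), so the only real roots are 0, 1/3 and 1.
roots = [Fraction(0), Fraction(1, 3), Fraction(1)]
residuals = [defect(u) for u in roots]
print(residuals)

# Spot-check the factorisation at a few rational points.
samples = [Fraction(k, 7) for k in range(-14, 15)]
factorisation_ok = all(defect(u) == -u * (3 * u - 1) * (u - 1) for u in samples)
print(factorisation_ok)
```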
Let us show that the zero trajectory x(⋅)=(0,0,0) delivers a global minimum for the convex extension of
problem (51) with k=3. Indeed, a direct verification shows that the triple (x(⋅),u(⋅),α(⋅)), where u1(⋅)=−1, u2(⋅)=3, u3(⋅)=1/3, α1(⋅)=3/8,
α2(⋅)=1/16, and α3(⋅)=9/16, is admissible for this extension and that the zero
trajectory delivers a global minimum of f0. But this trajectory is neither a local infimum nor a sliding regime for problem (51),
because by the above estimate x1(1)⩾1/3 this trajectory does not lie in the closure
of the set of admissible trajectories for this problem.
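The ‘‘direct verification’’ for the first two coordinates can be reproduced in exact arithmetic. The sketch below assumes that the first two equations of the convex extension have right-hand sides ∑αiui and ∑αi(4ui²−3ui³), as suggested by the integral identity derived above; the third equation is not reproduced in this text and is not checked here.

```python
from fractions import Fraction

controls = [Fraction(-1), Fraction(3), Fraction(1, 3)]        # u1, u2, u3
weights = [Fraction(3, 8), Fraction(1, 16), Fraction(9, 16)]  # alpha1, alpha2, alpha3

# The weights must sum to 1, and both averaged velocities must vanish
# for the zero trajectory to satisfy the first two equations.
total_weight = sum(weights)
x1_velocity = sum(a * u for a, u in zip(weights, controls))
x2_velocity = sum(a * (4 * u**2 - 3 * u**3) for a, u in zip(weights, controls))
print(total_weight, x1_velocity, x2_velocity)   # prints: 1 0 0
```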
So, the assertions of Corollary 1 and the first part of Corollary 2
are not true for the case under consideration. This can be explained by the fact that the regularity
condition is violated — namely, the convex extension of the system specifying the constraints in problem
(51) is not regular at the point x(⋅). Indeed, by definition, the regularity
is equivalent to saying that only the zero vector function
p(⋅)=(p1(⋅),p2(⋅),p3(⋅)) can satisfy the relations
[TABLE]
and
[TABLE]
But these relations only imply that p1(⋅) and p2(⋅) are zero constants, whereas as p3(⋅)
one can take any nonzero constant.
Note that in this case there exists an optimal trajectory — namely, putting
u(⋅)=1/3 we see that x1(1)=1/3.
Appendix
In this section we prove the generalized implicit function theorem and establish four lemmas:
the inverse function lemma, the lemma on equation in variations, and two approximation lemmas.
We introduce the following definition. Let X and Y be normed spaces, Σ
be a topological space, and let M be a nonempty subset of X. We denote by
Cx1(M×Σ,Y) the restriction to M×Σ of the set
of mappings F:X×Σ→Y which are continuous together with their derivatives with respect to x
and for which the norm
[TABLE]
is finite.
Theorem 4 (the generalized implicit function theorem)
Let X and Y be Banach spaces, Σ be a topological space,
σ∈Σ, V be a neighborhood of a point x∈X, Q be a convex closed
subset of X, F∈Cx1((V∩Q)×Σ,Y), F(x,σ)=0, and let
the operator Fx(x,σ) be invertible.
Then there exist neighborhoods V0′⊂V0⊂V of the point x, a neighborhood U0
of σ, and a neighborhood W0 of the mapping F such that, for F∈W0 for which
x−Fx−1(x,σ)F(x,σ)∈Q for all (x,σ)∈(V0′∩Q)×U0, there exists a continuous mapping gF:U0→V0∩Q such that
[TABLE]
for all (x,σ)∈(V0′∩Q)×U0. Moreover, the equality F(x,σ)=0 on
(V0∩Q)×U0 is possible only if x=gF(σ).
Proof
For brevity, we set Λ=Fx(x,σ) and write Cx1 in place of
Cx1((V∩Q)×Σ,Y). The mapping (x,σ)↦Fx(x,σ)
is continuous at the point (x,σ), and hence there exist 0<δ⩽1 such that
UX(x,δ)⊂V (here UX(x,δ) denotes the open ball in a normed space X
with center at x and of radius δ) and a neighborhood U
of the point σ for which ∥Fx(x,σ)−Λ∥⩽1/(8∥Λ⁻¹∥) for all
(x,σ)∈UX(x,δ)×U.
We set V0=UX(x,δ), and choose neighborhoods V0′, U0 and W0 so that
V0′⊂UX(x,δ/2), U0⊂U, and moreover,
∥F(x,σ)∥Y<δ/(8∥Λ⁻¹∥) if (x,σ)∈V0′×U0,
W0=UCx1(F,δ/(8∥Λ⁻¹∥)).
Let F∈W0 and x−Fx−1(x,σ)F(x,σ)∈Q for all (x,σ)∈(V0′∩Q)×U0. We claim that, for any x,x′∈V0∩Q and σ∈U0,
[TABLE]
Indeed, first, since δ⩽1, we have
[TABLE]
The sets V0 and Q are convex, and hence if x,x′∈V0∩Q, then
xθ=(1−θ)x+θx′∈V0∩Q for θ∈[0,1]. By the mean value theorem,
as applied to the mapping x→F(x,σ)−Fx(x,σ)x, where σ∈U0,
we get, by (54) and in view of the choice of F, that
Let (x,σ)∈(V0′∩Q)×U0. Considering the sequence (the modified Newton method)
[TABLE]
we claim that this sequence lies in UX(x,δ)∩Q and is a Cauchy sequence.
The first claim is proved by induction. It is clear that x0∈UX(x,δ)∩Q. Let xk∈UX(x,δ)∩Q, 1⩽k⩽n. We need to show that xn+1∈UX(x,δ)∩Q.
Applying the operator Λ to both sides of (55), we find that
[TABLE]
Using in succession (55), (56), (53) and then iterating, we find that
[TABLE]
Next, employing the triangle inequality, using (57), (55), and
taking into account the formula for the sum of a geometric progression, we have, since F∈W0,
[TABLE]
that is, xn+1∈UX(x,δ).
By the induction hypothesis, xn∈UX(x,δ)∩Q, and so
xn+1=xn−Λ−1F(xn,σ)∈Q by the choice of the mapping F.
Therefore, the entire sequence {xn} lies in UX(x,δ)∩Q.
Next, the sequence {xn} is a Cauchy sequence. Indeed, using (57) and
arguing as in the previous inequality, we have, for all n,m∈N,
[TABLE]
which proves that {xn} is a Cauchy sequence.
The functions xn are defined on (V0′∩Q)×U0. Let (x,σ)∈(V0′∩Q)×U0. We set gF(x,σ)=limn→∞xn. From (58)
it follows that gF(x,σ)∈UX(x,δ)=V0. Since the set Q
is closed, we have gF(x,σ)∈Q, and thus the mapping
gF:(V0′∩Q)×U0→(V0∩Q) is defined.
Making n→∞ in (56) and taking into account that F is continuous, we get the relation
F(gF(x,σ),σ)=0.
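Let us show that the limit gF(x,σ) does not depend on the starting point x: for any point
(x,σ)∈(V0′∩Q)×U0 it coincides with the limit obtained by starting the iteration from the base point of the theorem. Indeed, by (53) we have
[TABLE]
that is, the two limits coincide.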
We set gF(σ)=gF(x,σ). This is a mapping from U0 into V0∩Q
and by the above F(gF(σ),σ)=0 for all σ∈U0.
From (58) it follows that ∥xn−x∥X⩽2∥Λ−1∥∥F(x,σ)∥Y. Making
n→∞, we get the inequality ∥gF(σ)−x∥X⩽2∥Λ−1∥∥F(x,σ)∥Y.
Since F is continuous we derive from (55) that the functions xn, qua
functions of σ, are continuous on U0. Making m→∞ in (59), we conclude that
the mapping σ↦gF(σ) is the uniform limit of continuous functions and hence is continuous.
That the equality F(x,σ)=0 on (V0∩Q)×U0 is possible only when
x=gF(σ) is proved by the same arguments as in (60).
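The modified Newton method used in the proof keeps the operator Λ⁻¹ fixed at the base point instead of recomputing the derivative at each step. The following scalar sketch is only an illustration under stated assumptions: the model F(x,σ)=x²−σ with base point x=1, σ=1 is hypothetical, and the constraint set Q is ignored.

```python
def modified_newton(F, inv_lambda, x_start, sigma, steps=60):
    """Iterate x_{n+1} = x_n - Lambda^{-1} F(x_n, sigma) with a FIXED Lambda^{-1}."""
    x = x_start
    for _ in range(steps):
        x = x - inv_lambda * F(x, sigma)
    return x

def F(x, sigma):
    # Hypothetical model mapping; F(1, 1) = 0 and F_x(1, 1) = 2 is invertible.
    return x * x - sigma

inv_lambda = 1.0 / 2.0   # inverse of Lambda = F_x(1, 1)

for sigma in (0.9, 1.0, 1.1):
    for x_start in (0.9, 1.0, 1.1):
        root = modified_newton(F, inv_lambda, x_start, sigma)
        print(sigma, x_start, root)   # the limit depends on sigma only
```

In this model the iteration map x↦x−(x²−σ)/2 has derivative 1−x, so it is a contraction near x=1, and the limit is the square root of σ regardless of the starting point, in line with the independence of the limit from x established in the proof.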
To derive another corollary to this theorem we first need one definition.
Let M be a topological space, Z be a normed space.
We denote by C(M,Z) the space of continuous bounded mappings
F:M→Z equipped with the norm
[TABLE]
Corollary 3
Let F̂ denote the mapping for which the hypotheses of the theorem are stated, and let F̂, together with a mapping F∈W0, be such that
x−Λ⁻¹F̂(x,σ)∈Q and x−Λ⁻¹F(x,σ)∈Q for all (x,σ)∈(V0′∩Q)×U0, where Λ is the invertible derivative operator from the theorem.
Then there exist continuous mappings gF̂:U0→V0∩Q and gF:U0→V0∩Q such that F̂(gF̂(σ),σ)=0 and
F(gF(σ),σ)=0 for all σ∈U0 and there exists a neighborhood
U0′⊂U0 of the point σ such that
[TABLE]
Moreover, the equalities F̂(x,σ)=0 and F(x,σ)=0 on (V0∩Q)×U0
are possible, respectively, only when x=gF̂(σ) and x=gF(σ).
Proof
All the assertions of the corollary, except for inequality (61), are direct
consequences of the theorem. Let us prove inequality (61).
Since the mapping gF is continuous at σ (from (52) for F it follows that
gF(σ)=x), there exists a neighborhood U0′⊂U0 of this point such that
gF(σ)∈V0′∩Q for all σ∈U0′. For such σ,
substituting gF(σ)
in estimate (52) for F in place of x and subtracting the zero element
F(gF(σ),σ) on the right under the norm sign, we get the inequality
[TABLE]
Taking the supremum over σ∈U0′ on the left, we get (61).
Lemma 1 (the inverse function lemma)
Let X be a Banach space, K be a convex closed subset of X, V be
a neighborhood of a point w∈K and let Φ:V→Rm. Assume that the following
conditions are satisfied:
1) Φ∈C(V∩K,Rm),
2) Φ is continuously differentiable at the point w,
3) 0∈intΦ′(w)(K−w).
Then there exist a neighborhood V0⊂V of the point w and constants r0>0 and γ>0
such that, for any r∈(0,r0] and any mapping Φ∈UC(V∩K,Rm)(Φ,r), there exists a mapping gΦ(w,y):(V0∩K)×Rm→V∩K satisfying
[TABLE]
for all (w,y)∈(V0∩K)×Rm, for which
[TABLE]
Before proceeding with the proof of the lemma, we prove one result, which is a direct corollary
to the implicit function theorem for inclusions (see the paper [4] by the authors of the present paper).
Proposition 2
Let X and Y be Banach spaces, Λ:X→Y be a linear
continuous operator, C be a convex closed subset of X, x0∈C, y0=Λx0 and
[TABLE]
Then there exist neighborhoods V1, U1, U2, respectively, of the points x0, y0, 0X,
a constant a>0, and a continuous mapping R:V1×U1×U2→X such that,
for any σ=(σ1,σ2)∈U1×U2 and ξ∈V1,
[TABLE]
and
[TABLE]
where dist(ξ−σ2,C) is the distance from ξ−σ2 to the set C.
Proof
Let a mapping F:X×(Y×X)→Y×X act by the rule F(x,σ)=(Λx−σ1,x−σ2), where σ=(σ1,σ2). We set A=(0Y,C) and σ0=(y0,0X). Then
F(x0,σ0)=(Λx0−y0,x0)=(0Y,x0)∈A.
By (64), there exists δ0>0 such that UY(y0,δ0)⊂ΛC. Setting δ=δ0/(∥Λ∥+1), we show that UY×X(0,δ)⊂int(Im(Λ,Id)+(0Y,x0−C)).
Let σ=(σ1,σ2)∈UY×X(0,δ). From the choice of δ
it follows that σ1−Λσ2∈UY(0,δ0), and hence
y0+σ1−Λσ2∈UY(y0,δ0)⊂ΛC. Therefore, there exists
an element x1(σ)∈C such that y0+σ1−Λσ2=Λx1(σ).
We set x(σ)=σ2+x1(σ)−x0. Hence, using the previous equality, we find that
σ1=Λσ2+Λx1(σ)−Λx0=Λ(σ2+x1(σ)−x0)=Λx(σ) and
σ2=x(σ)+x0−x1(σ)∈x(σ)+x0−C;
that is, σ=(σ1,σ2)∈Im(Λ,Id)+(0Y,x0−C). Therefore,
UY×X(0,δ)⊂Im(Λ,Id)+(0Y,x0−C), proving inclusion (67).
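The role of the choice δ=δ0/(∥Λ∥+1) can be seen from a one-line estimate. The sketch below assumes that the norm on Y×X dominates the norm of each component (as the maximum norm or the sum norm does); the text does not specify which product norm is used.

```latex
\[
\|\sigma_1-\Lambda\sigma_2\|_Y
\le \|\sigma_1\|_Y+\|\Lambda\|\,\|\sigma_2\|_X
< \delta+\|\Lambda\|\,\delta
=(\|\Lambda\|+1)\,\frac{\delta_0}{\|\Lambda\|+1}
=\delta_0 .
\]
```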
Now all the hypotheses of the implicit function theorem from [4] are clearly satisfied,
where Σ=Y×X and V=X (with x, y and σ in place of
x0, y0 and σ0, respectively). Relations (65) and (66)
are immediate consequences of this theorem.
Proof of the inverse function lemma
We set C=K−w, x0=0, and put Λ=Φ′(w). Then y0=Λx0=0∈intΛC by condition 3) of the lemma, and hence the hypotheses of
Proposition 2 with these data are satisfied. Let V1, U1, U2 be the corresponding
neighborhoods (of the origins in X and Rm), and let the constant a>0 and a continuous mapping
R:V1×U1×U2→X be from this proposition.
Let δ1>0 be such that UX(0,δ1)⊂V1∩U2. Then if w∈UX(w,δ1) and z∈U1, then (w−w,(z,w−w))∈V1×U1×U2.
Therefore, the mapping φ:UX(w,δ1)×U1→X is defined by the formula
φ(w,z)=R(w−w,(z,w−w)). Moreover, by (65) and (66),
we have
[TABLE]
and
[TABLE]
because dist(0,K−w)=0.
From condition 2) of the lemma it follows that there exists δ2>0 such that
UX(w,δ2)⊂V and moreover, for all w,w′∈UX(w,δ2),
[TABLE]
We set V0=UX(w,δ), where δ=min(δ1,δ2), and choose
ρ>0 so that BRm(0,ρ)⊂U1. Let
r0=min(ρ/3,δ/(a(∥Λ∥+3))).
Let r∈(0,r0], Φ∈UC(V∩K,Rm)(Φ,r), and let a pair
(w,y)∈(V0∩K)×Rm satisfy inequality (63). Consider the
mapping G:BRm(Φ(w),3r)→Rm defined by
[TABLE]
This definition is correct, because if z∈BRm(Φ(w),3r), then
∣z−Φ(w)∣⩽3r⩽3r0⩽ρ; that is, z−Φ(w)∈U1. Next, from
(69), (63) and by the choice of r0 we have
[TABLE]
The right-hand side is smaller than δ, and hence
w+φ(w,z−Φ(w))∈V. Besides, from
(68) we get w+φ(w,z−Φ(w))∈K.
Let us show that the range of G lies in the ball BRm(Φ(w),3r).
Indeed, using the equality Λφ(w,z−Φ(w))=z−Φ(w), which holds by (68), and employing
condition (63), relation (70), the condition Φ∈UC(V∩K,Rm)(Φ,r),
inequality (71), and again condition (63), we get
[TABLE]
(here we set, for brevity ε=1/(a(∥Λ∥+3)+1)).
The mapping G is continuous as a composition of continuous mappings, and hence by Brouwer’s
fixed point theorem, there exists a point z∗=z∗(w,y,Φ) such that
G(z∗)=z∗ or, what is the same, Φ(w+φ(w,z∗−Φ(w)))=y. We set
gΦ(w,y)=w+φ(w,z∗−Φ(w)) (outside the set defined by inequality (63)
we define gΦ to be zero, for example). Hence
Φ(gΦ(w,y))=y. By the above, gΦ(w,y)∈V, and from (68) we get
gΦ(w,y)∈K. Moreover, ∥gΦ(w,y)−w∥X=∥φ(w,z∗−Φ(w))∥X⩽(a(∥Λ∥+3)+1)r=γr, where γ=a(∥Λ∥+3)+1.
We now formulate some assumptions and notation to be used in all the lemmas that follow.
We shall assume that the mapping φ:R×Rn×Rr→Rn (of the variables t∈R, x∈Rn and u∈Rr) is continuous together with its derivative with respect to x.
Recall that the set
[TABLE]
where Σk={α=(α1,…,αk)∈R+k:∑i=1kαi=1}
was defined above for each k∈N (see the beginning of the proof of Theorem 2).
Let a tuple α=(α1,…,αk)∈Ak
and a tuple u=(u1,…,uk)∈(L∞([t0,t1],Rr))k be fixed. Further,
let N>k, α′=(α,0)∈(L∞([t0,t1]))N, v=(v1,…,vN−k)∈(L∞([t0,t1],Rr))N−k and u′=(u1,…,uk,v1,…,vN−k).
The elements of u′ will be denoted by ui, i=1,…,N; that is, ui=ui,
i=1,…,k, uk+i=vi, i=1,…,N−k.
Lemma 2 (the lemma on equation in variations)
Let x be the solution of the differential equation
[TABLE]
on [t0,t1]. Then there exist neighborhoods O(x(t0)) and O(α′)
such that, for all ξ∈O(x(t0)) and α=(α1,…,αN)∈O(α′), there exists a unique solution x(⋅,ξ,α;u′) of the Cauchy problem
[TABLE]
on [t0,t1].
The mapping (ξ,α)↦x(⋅,ξ,α;u′) lies in
C(O(x(t0))×O(α′),C([t0,t1],Rn)) and is continuously
differentiable.
If x′ is the derivative of this mapping at a point (x(t0),α′), then, for
any ξ∈Rn and
α=(α1,…,αN)∈(L∞([t0,t1]))N, the function
h=x′[ξ,α] is the solution of the equation in variations
[TABLE]
Proof
Consider the mapping F:C([t0,t1],Rn)×Rn×(L∞([t0,t1]))N→C([t0,t1],Rn), which is defined for all
t∈[t0,t1] by the formula
[TABLE]
The dependence of F on the fixed tuple u′ will not be indicated.
It is easily checked that at any point (x,ξ,α)∈C([t0,t1],Rn)×Rn×(L∞([t0,t1]))N the mapping F has the continuous
partial derivative with respect to x, which acts by the rule
[TABLE]
for all h∈C([t0,t1],Rn) and t∈[t0,t1] (for details, see,
for example, [5] and [6]).
The existence and continuity of the partial derivative of F with respect to the variable (ξ,α),
which enters linearly, can be easily checked. Moreover, at each point
(x,ξ,α) this derivative acts by the rule
[TABLE]
for all (ξ,α)∈Rn×(L∞([t0,t1]))N and t∈[t0,t1].
Therefore, the mapping F is continuously differentiable on C([t0,t1],Rn)×Rn×(L∞([t0,t1]))N.
Since x is the solution of equation (72), we have
F(x,x(t0),α′)(t)=0, t∈[t0,t1]. Finally,
that the partial derivative of F with respect to x is invertible at the point (x,x(t0),α′)
follows from the solvability of the Cauchy problem for the corresponding linear equation for any
initial conditions.
We can employ the classical implicit function theorem (see, for example, [7]).
According to this theorem, there exist neighborhoods O(x), O(x(t0)) and
O(α′) and a continuously differentiable mapping (ξ,α)↦x(⋅,ξ,α;u′) from O(x(t0))×O(α′) into O(x)
such that F(x(t,ξ,α;u′),ξ,α)(t)=0 for all (ξ,α)∈O(x(t0))×O(α′) and t∈[t0,t1]. This is equivalent to saying that
x(⋅,ξ,α;u′) is a (unique) solution to equation (73).
By the formula for the derivative of an implicit function,
the derivative x′ of the mapping (ξ,α)↦x(⋅,ξ,α,u′) satisfies
Fx(x,x(t0),α′)x′=−F(ξ,α)(x,x(t0),α′) at the
point (x(t0),α′).
Substituting here the expressions from (75) and (76) for the derivatives at the point
(x,x(t0),α′), we see that, for all (ξ,α)∈Rn×(L∞([t0,t1]))N (α=(α1,…,αN)) and
t∈[t0,t1], the following equality holds:
[TABLE]
If we denote h=x′[ξ,α], then this equality is equivalent to equation (74).
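The statement of the lemma can be tested numerically on a scalar model. The sketch below is hedged in several respects: the right-hand side φ(t,x,u)=sin x+u, the constant controls u1=1, u2=−1, and the constant weights are hypothetical; the equation in variations is taken in the form h˙=∑αi′φx(t,x,ui)h+∑αiφ(t,x,ui), h(t0)=ξ, which is what the formula for the derivative of an implicit function yields for F(x,ξ,α)(t)=x(t)−ξ−∫∑αiφ dτ.

```python
import math

U = (1.0, -1.0)                      # constant controls u_1, u_2 (hypothetical)

def phi(x, u):
    return math.sin(x) + u           # hypothetical right-hand side

def phi_x(x, u):
    return math.cos(x)               # its derivative with respect to x

def rk4(rhs, y0, steps=400, t1=1.0):
    """Classical Runge-Kutta scheme for an autonomous system y' = rhs(y) on [0, t1]."""
    y, dt = list(y0), t1 / steps
    for _ in range(steps):
        k1 = rhs(y)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(y, k1)])
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(y, k2)])
        k4 = rhs([a + dt * b for a, b in zip(y, k3)])
        y = [a + dt * (p + 2 * q + 2 * r + s) / 6
             for a, p, q, r, s in zip(y, k1, k2, k3, k4)]
    return y

def endpoint(xi, alpha):
    """x(1; xi, alpha) for x' = sum_i alpha_i * phi(x, u_i), x(0) = xi."""
    return rk4(lambda y: [sum(a * phi(y[0], u) for a, u in zip(alpha, U))], [xi])[0]

def variation_endpoint(xi0, alpha0, d_xi, d_alpha):
    """h(1) for the equation in variations along the base solution, h(0) = d_xi."""
    def rhs(y):
        x, h = y
        dx = sum(a * phi(x, u) for a, u in zip(alpha0, U))
        dh = (sum(a * phi_x(x, u) for a, u in zip(alpha0, U)) * h
              + sum(d * phi(x, u) for d, u in zip(d_alpha, U)))
        return [dx, dh]
    return rk4(rhs, [xi0, d_xi])[1]

xi0, alpha0 = 0.2, (0.5, 0.5)        # base point (x(t0), alpha')
d_xi, d_alpha = 1.0, (0.3, -0.1)     # direction (xi, alpha)
eps = 1e-6
plus = endpoint(xi0 + eps * d_xi, [a + eps * d for a, d in zip(alpha0, d_alpha)])
minus = endpoint(xi0 - eps * d_xi, [a - eps * d for a, d in zip(alpha0, d_alpha)])
finite_difference = (plus - minus) / (2 * eps)
h1 = variation_endpoint(xi0, alpha0, d_xi, d_alpha)
print(finite_difference, h1)         # the two numbers agree to high accuracy
```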
We recall that the space Cx1(M×Σ,Y) was defined before the statement
of the generalized implicit function theorem. The set Ak, for any k∈N,
and the tuple of controls u′=(u1,…,uN) are defined before the formulation
of the lemma on equation in variations.
Let L>0. We denote by QL=QL([t0,t1],Rn) the class of Lipschitz
vector functions on [t0,t1] with values in Rn and with Lipschitz constant L.
As in the previous lemma, the dependence of mappings F and Fs on the fixed
tuple u′ is not indicated.
Lemma 3 (the first approximation lemma)
Let M be a bounded set in C([t0,t1],Rn), Ω be
a bounded set in Rn, and let L>0. Then the mapping F:C([t0,t1],Rn)×Rn×(L∞([t0,t1]))N→C([t0,t1],Rn), as defined for all t∈[t0,t1] by the formula
[TABLE]
where α=(α1,…,αN), lies in the space Cx1=Cx1((M∩QL)×Ω×AN,C([t0,t1],Rn)). Moreover, for any
α∈AN, there exists a sequence of controls
us(α,u′)∈L∞([t0,t1],Rr), s∈N, such that
the mappings Fs:C([t0,t1],Rn)×Rn×AN→C([t0,t1],Rn), as defined for all t∈[t0,t1] by the rule
[TABLE]
also lie in Cx1, and besides, the sequence Fs converges to F in the metric of Cx1 as s→∞.
Proof
Let us show that F∈Cx1. By the previous lemma, the mapping F is continuous together with its
partial derivative on C([t0,t1],Rn)×Rn×(L∞([t0,t1]))N.
Let us now check that the mapping F and its partial derivative with respect to x are bounded on the set
(M∩QL)×Ω×AN.
Indeed, let δ>0 be such that M⊂BC([t0,t1],Rn)(0,δ), Ω⊂BRn(0,δ) and γ=max1⩽i⩽N∥ui∥L∞([t0,t1],Rr). The mappings φ and φx
are continuous on the compact set K=[t0,t1]×BRn(0,δ)×BRr(0,γ). We set C=max{∣φ(t,x,u)∣:(t,x,u)∈K} and
C0=max{∥φx(t,x,u)∥:(t,x,u)∈K}. Then, for any
(x,ξ,α)∈(M∩QL)×Ω×AN, h∈C([t0,t1],Rn) and t∈[t0,t1], it can be easily shown that
∣F(x,ξ,α)(t)∣⩽2δ+C and ∣Fx(x,ξ,α)[h](t)∣⩽(1+C0)∥h∥C([t0,t1],Rn) (see formula (75)).
So, F∈Cx1.
For each s∈N we split the interval [t0,t1] into s subintervals
Δj(s)=[t0+j(t1−t0)/s,t0+(j+1)(t1−t0)/s] of length
∣Δj(s)∣=(t1−t0)/s, j=0,…,s−1.
We set
[TABLE]
It is clear that αij⩾0 and ∑i=1Nαij=1, j=0,…,s−1.
We split each subinterval Δj(s) into N successive subintervals
Δji(s,α) of length
∣Δji(s,α)∣=αij∣Δj(s)∣=αij(t1−t0)/s,
i=1,…,N.
Define the function us(α;u′) on [t0,t1] by the rule: us(α;u′)(t)=ui(t) if t∈Δji(s,α), 1⩽i⩽N, j=0,1,…,s−1
(on the end-points of the subintervals the values of the functions ui and, respectively, of the function
us(α;u) can be taken arbitrarily). It is clear that us(α;u′)∈L∞([t0,t1],Rr) and ∥us(α;u′)∥L∞([t0,t1],Rr)⩽γ for all s∈N.
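The construction of the controls us(α;u′) can be sketched in code. The example below makes simplifying assumptions that are not in the text: t0=0, t1=1, N=2, the weights are α=(t,1−t), the controls u1=1, u2=−1 are constant, and φ(t,x,u)=u, so that only the primitives of us and of ∑αiui are compared; the numbers αij are computed as mean values of αi over Δj(s), in line with the definition of βj used later in the proof.

```python
def averaged_weights(alphas, s, m=64):
    """alpha_ij: mean value of alpha_i over the j-th subinterval of [0, 1] (midpoint rule)."""
    width = 1.0 / s
    table = []
    for j in range(s):
        nodes = [(j + (q + 0.5) / m) * width for q in range(m)]
        table.append([sum(a(t) for t in nodes) / m for a in alphas])
    return table

def chattering_primitive(t, table, values):
    """Integral over [0, t] of u_s, where u_s = values[i] on the i-th piece of each subinterval."""
    s = len(table)
    width = 1.0 / s
    total = 0.0
    for j in range(s):
        left = j * width
        if t <= left:
            break
        remaining = min(t, left + width) - left     # part of the subinterval inside [0, t]
        for weight, u in zip(table[j], values):
            piece = min(remaining, weight * width)  # the i-th piece has length alpha_ij / s
            total += u * piece
            remaining -= piece
    return total

# Hypothetical data: alpha = (t, 1 - t), constant controls u_1 = 1, u_2 = -1.
alphas = (lambda t: t, lambda t: 1.0 - t)
values = (1.0, -1.0)

def relaxed_primitive(t):
    # Integral over [0, t] of alpha_1 u_1 + alpha_2 u_2 = 2 * tau - 1.
    return t * t - t

def deviation(s, samples=1001):
    """Maximum over a grid of the gap between the chattering and the relaxed primitives."""
    table = averaged_weights(alphas, s)
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(chattering_primitive(t, table, values) - relaxed_primitive(t))
               for t in grid)

for s in (10, 100):
    print(s, deviation(s))   # bounded by 2/s in this model
```

In this model the two primitives coincide at the end-points of every Δj(s), and inside a subinterval the integrands differ by at most 2, which gives the bound 2/s.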
We claim that Fs∈Cx1 for any s∈N. We first show that, for any
s∈N, the mapping Fs is continuous.
To begin with, we note that the mappings α↦us(α;u′), qua
mappings from (L∞([t0,t1]))N into L1([t0,t1],Rr), are continuous on
AN uniformly with respect to s∈N. For simplicity of calculations, we shall check it
in the case N=2 and t0=0, t1=1.
Let α0=(α10,α20)∈A2 be a fixed pair.
Setting α0=α10, we have 1−α0=α20. Next, let
βj0=∣Δj(s)∣−1∫Δj(s)α0(t)dt, j=0,…,s−1.
The subinterval Δj(s) of length 1/s is split into two successive subintervals
Δj1(s,α0) and Δj2(s,α0) of length, respectively,
βj0/s and (1−βj0)/s.
Further, let α=(α,1−α) be a different pair from A2 and let
βj=∣Δj(s)∣−1∫Δj(s)α(t)dt, j=0,…,s−1. Then on
each subinterval Δj(s) we have
[TABLE]
(γ=max(∥u1∥L∞([t0,t1],Rr),∥u2∥L∞([t0,t1],Rr))).
Summing these inequalities over j=0,…,s−1, we find that
[TABLE]
whence the required result follows.
After making this remark, we proceed with the proof of the continuity of the mappings Fs.
Let (x0,ξ0,α0)∈C([t0,t1],Rn)×Rn×AN and ε>0. We set K1={(t,x)∈Rn+1:∣x−x0(t)∣⩽δ1,t∈[t0,t1]}×BRr(0,γ). The mapping
φ is continuous on the compact set K1. Let C1=max{∣φ(t,x,u)∣:(t,x,u)∈K1}. Since φ is uniformly continuous on this compact set, there exists
0<δ2⩽min(δ1,ε) such that
∣φ(t,x1,u1)−φ(t,x2,u2)∣<ε for all (t,xi,ui)∈K1,
i=1,2, for which ∣x1−x2∣<δ2 and ∣u1−u2∣<δ2.
By the above, there exists a neighborhood O(α0) such that if α∈O(α0)∩AN, then us(α;u′)∈UL1([t0,t1],Rr)(us(α0;u′),εδ2) for all s∈N. For each
such α and s, we set Eδ2(α,s)={t∈[t0,t1]:∣us(α;u′)(t)−us(α0;u′)(t)∣⩾δ2}. Then
[TABLE]
and hence, mesEδ2(α,s)<ε.
Now let x∈UC([t0,t1],Rn)(x0,δ2), ξ∈URn(ξ0,ε) and α∈O(α0)∩AN. For any
t∈[t0,t1], we have
[TABLE]
that is, the mappings Fs are continuous at the point (x0,ξ0,α0) uniformly with respect to s, and
therefore, this is true for any point from C([t0,t1],Rn)×Rn×AN.
For any s, the mapping Fs has the partial derivative with respect to x at any point
(x,ξ,α)∈C([t0,t1],Rn)×Rn×AN,
which acts by the rule
[TABLE]
for all h∈C([t0,t1],Rn) and t∈[t0,t1]. This is proved along the same lines
as the existence of the partial derivative with respect to x of the mapping F
in the lemma on equation in variations.
Let us show that this derivative is continuous on C([t0,t1],Rn)×Rn×AN. In other words, we need to show that if (x0,ξ0,α0)∈C([t0,t1],Rn)×Rn×AN, then for any
ε>0 there exist neighborhoods O1(x0), O1(ξ0) and
O1(α0) such that, for all
(x,ξ,α)∈O1(x0)×O1(ξ0)×(O1(α0)∩AN), all h∈C([t0,t1],Rn), ∥h∥C([t0,t1],Rn)⩽1 and
all t∈[t0,t1],
[TABLE]
However, as is easy to check, to prove this inequality one needs in essence to repeat the above arguments
related to the continuity of the mapping F.
The boundedness of the mappings Fs and their partial derivatives with respect to x, s∈N, on
(M∩QL)×Ω×AN can be proved as for the mapping F. This implies that
Fs∈Cx1 for all s∈N.
Let us now show that the sequence Fs converges to F in the metric of Cx1 as
s→∞.
We first show that the sequence Fs converges to F in C((M∩QL)×Ω×AN,C([t0,t1],Rn)) as s→∞. In other words,
we need to show that, for any ε>0, there exists s0=s0(ε)
such that, for all s⩾s0, all (x,ξ,α)∈(M∩QL)×Ω×AN, and all t∈[t0,t1], the inequality holds
[TABLE]
Let ε>0. It can be assumed that 0<ε<t1−t0. From the
Luzin C-property
and since the Lebesgue measure is regular, it follows that there exists a closed set
A=A(ε)⊂[t0,t1] such that mesA>(t1−t0)−ε and
that on A
the functions ui, i=1,…,N, are continuous. Moreover, there exist continuous functions
vi on [t0,t1] such that vi=ui on A and ∥vi∥C([t0,t1],Rr)⩽∥ui∥L∞([t0,t1],Rr), i=1,…,N.
Let (x,ξ,α)∈(M∩QL)×Ω×AN. On each subinterval
Δj(s), 0⩽j⩽s−1, it is easily checked that
[TABLE]
(for brevity we replace Δj(s) and Δji(s,α), respectively,
by Δj and Δji)
Let us now estimate the expressions on the right. It is easily checked that the sum of the norms
of the first two terms on the right in (79) is at most
[TABLE]
where the number C was defined at the beginning of the proof.
Now let us estimate the difference of the two last integrals in (79). We first proceed with
each component of this difference.
Let
φ(⋅,x,vi)=(φ1(⋅,x,vi),…,φn(⋅,x,vi))T,
i=1,…,N. We fix 1⩽l⩽n. By the mean value theorem for the integrals,
[TABLE]
where ξi,ζi∈Δj, 1⩽i⩽N.
Let us now estimate the absolute value of the difference on the right.
The mapping φ is uniformly continuous on the compact set K (which was defined
at the beginning of the proof). Hence, there exists δ0>0 such that
∣φl(t′,x′,u′)−φl(t′′,x′′,u′′)∣<ε for all (t′,x′,u′) and
(t′′,x′′,u′′) from K for which ∣t′−t′′∣<δ0, ∣x′−x′′∣<δ0 and
∣u′−u′′∣<δ0.
Let s0=s0(ε) be so large that
∣Δj(s0)∣<min(ε,δ0,δ0/L) and ∣vi(t′)−vi(t′′)∣<δ0,
i=1,…,N, for t′,t′′∈Δj(s0). Hence, if ξi,ζi∈Δj(s0), then ∣ξi−ζi∣<δ0, ∣x(ξi)−x(ζi)∣⩽L∣ξi−ζi∣<δ0 and ∣vi(ξi)−vi(ζi)∣<δ0, i=1,…,N.
Therefore, the expression on the right in (81) is majorized by
∣Δj(s)∣ε for s⩾s0. This estimate implies that the norm of the difference of the
last two integrals in (79) is estimated from above by
∣Δj(s)∣nε.
Let us now prove inequality (78). Assume that the interval [t0,t]
contains a noninteger number of subintervals Δj(s). If t<t0+(t1−t0)/s, then we see that the
norm of the difference on the left in (78) is at most 2C(t−t0)<2C∣Δ0(s)∣<2Cε.
Let t∈(t0,t1] be such that the interval [t0,t] contains an integer number of
subintervals Δj(s). Summing (79) with respect to all such subintervals, we get the
expression under the norm sign in (78).
In view of estimate (80), the sum of the norms of the first two integrals on the right (after addition)
is at most 4Cmes([t0,t]∖A)⩽4Cmes([t0,t1]∖A)<4Cε.
The norm of the difference of the two last integrals in (79) (after addition) is at most
(t−t0)nε⩽(t1−t0)nε.
So, for all (x,ξ,α)∈(M∩QL)×Ω×AN and s⩾s0(ε), the expression on the left in (78) for the t under consideration is
at most (4C+(t1−t0)n)ε.
The case when the interval [t0,t] is composed of an integer number of
subintervals Δj(s) and an additional interval of length ε can clearly be reduced to the above cases.
So, we have proved inequality (78), but with cε in place of ε,
which, however, is immaterial, because c does not depend on x, on α, and on t, and hence
the sequence Fs converges to F in C((M∩QL)×Ω×AN,C([t0,t1],Rn)) as s→∞.
It remains to show that the sequence Fsx converges to Fx in C((M∩QL)×Ω×AN,C([t0,t1],Rn)) as s→∞. This means that,
for any ε>0, there exists s0=s0(ε) such that
[TABLE]
for all s⩾s0, all h∈C([t0,t1],Rn), ∥h∥C([t0,t1],Rn)⩽1, all (x,ξ,α)∈(M∩QL)×Ω×AN and all
t∈[t0,t1].
But it is easily seen that this inequality can be proved by the same line of arguments
as in the proof of (78). This proves the first approximation lemma.
We recall that the functions us(α;u′) are defined in Lemma 3,
and the space
C(M,Z) is defined before Corollary 3.
Lemma 4 (the second approximation lemma)
Let x and x(⋅,ξ,α;u′), where (ξ,α)∈O(x(t0))×O(α′), be solutions to, respectively, equations (72) and
(73) from the lemma on equation in variations. Then there exist neighborhoods
O0(x(t0))⊂O(x(t0)) and O0(α′)⊂O(α′) such that,
for all (ξ,α)∈M=O0(x(t0))×(O0(α′)∩AN) and sufficiently large
s∈N,
there exists a unique solution xs(⋅,ξ,α;u′)
to the equation
[TABLE]
on [t0,t1]. Moreover, the mapping (ξ,α)↦xs(⋅,ξ,α;u′) lies in the space C(M, C([t0,t1],Rn)) and converges in
this space to the mapping (ξ,α)↦x(⋅,ξ,α;u′)
as s→∞.
Proof
Here we employ Corollary 3 to the above generalized implicit function theorem.
We first require some preliminary considerations.
Let δ>0, γ=max1⩽i⩽N∥ui∥L∞([t0,t1],Rr) and
K0={(t,x)∈R×Rn:∣x−x(t)∣⩽δ,t∈[t0,t1]}×BRr(0,γ). We set
C0=max{∣φ(t,x,u)∣:(t,x,u)∈K0} and C1=max{∥φx(t,x,u)∥:(t,x,u)∈K0}.
Let F and Fs, s∈N, be the mappings from Lemma 3. We set
Λ=Fx(x,x(t0),α′). The operator Λ is invertible (see Lemma 2).
Let x∈UC([t0,t1],Rn)(x,δ), ξ∈URn(x(t0),δ), α∈AN and s∈N. Then, for all
such x, ξ, α, s and t∈[t0,t1], we have
[TABLE]
We denote by D the constant on the right and define L=C1(D+δ)+C0.
Recall that QL is the class of Lipschitz vector functions on [t0,t1] with values in Rn
and with Lipschitz constant L. It is easily checked that QL is a convex
closed set in C([t0,t1],Rn).
By Lemma 3 the mappings F and Fs, s∈N, are contained in the space Cx1=Cx1((UC([t0,t1],Rn)(x,δ)∩QL)×URn(x(t0),δ)×AN,C([t0,t1],Rn)) and Fs converges to F in this space as s→∞.
Now we can apply Corollary 3 to the generalized implicit function theorem, in which
X=Y=C([t0,t1],Rn), Σ=URn(x(t0),δ)×AN, σ=(x(t0),α′), x=x(⋅), V=UC([t0,t1],Rn)(x,δ), Q=QL and F=F.
From Lemma 2 it follows that F(x,x(t0),α′)=0, and moreover, the
operator Λ=Fx(x,x(t0),α′) is invertible, as was noted above.
Let neighborhoods V0′⊂V0⊂V of the point x, a neighborhood U0⊂URn(x(t0),δ)×AN of the point (x(t0),α′) and
a neighborhood W0 of the mapping F be from the conclusion of the theorem.
Since the mappings Fs converge to F in the space Cx1 as
s→∞, there exists s0 such that Fs∈W0 for all s⩾s0.
Let us check that x−Λ−1Fs(x,ξ,α)∈QL for all
(x,ξ,α)∈(V0′∩QL)×U0 and s∈N. Indeed, if
y=x−Λ−1Fs(x,ξ,α), then Λy=Λx−Fs(x,ξ,α),
or (by the definition of Λ and Fs(x,ξ,α))
[TABLE]
for all t∈[t0,t1].
Therefore, since ∥y∥C([t0,t1],Rn)⩽D (see (83)),
∥x∥C([t0,t1],Rn)⩽δ, ∑i=1Nαi(τ)=1
for almost all τ∈[t0,t1] and ∥us(α;u′)∥L∞([t0,t1],Rr)⩽γ for all s∈N (see Lemma 3), we see that, for any
t′,t′′∈[t0,t1],
[TABLE]
and so, x−Λ−1Fs(x,ξ,α)∈QL for all
(x,ξ,α)∈(V0′∩QL)×U0 and s∈N.
The same argument shows that x−Λ−1F(x,ξ,α)∈QL for all
(x,ξ,α)∈(V0′∩QL)×U0 (instead of the last integral on the right
in (84), we have the integral
∫t0t(∑i=1Nαi(τ)φ(τ,x(τ),ui(τ)))dτ).
Hence by Corollary 3, for all s⩾s0, there exist
continuous mappings gFs:U0→V0∩QL and gF:U0→V0∩QL such that Fs(gFs(ξ,α),ξ,α)(t)=0 and
F(gF(ξ,α),ξ,α)(t)=0 for all (ξ,α)∈U0 and
t∈[t0,t1].
This is equivalent to saying that, for all (ξ,α)∈U0, the function
gFs(ξ,α) is a unique solution xs(⋅,ξ,α;u)
to equation (82) and the function gF(ξ,α) is a unique solution
x(⋅,ξ,α;u) to equation (73), whose properties are described in Lemma 2.
Moreover, by Corollary 3 there exists a neighborhood U0′⊂U0 of the point
(x(t0),α′) (it can be assumed that this neighborhood has the form M=O0(x(t0))×(O0(α′)∩AN), where
O0(x(t0))⊂O(x(t0)), O0(α′)⊂O(α′), and
O(x(t0)) and O(α′) are the neighborhoods from Lemma 2) such that
[TABLE]
where xs and x are, respectively, the continuous mappings (ξ,α)↦xs(⋅,ξ,α;u′) and (ξ,α)↦x(⋅,ξ,α;u′),
and Λ=Fx(x,x(t0),α′).
The quantity on the right tends to zero as s→∞, and hence, xs→x as
s→∞ in the metric of the space C(M,C([t0,t1],Rn)).
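The convergence xs→x can be observed on a model equation. The sketch below is an illustration only, with hypothetical data: φ(t,x,u)=−x+u on [0,1], controls u1=1, u2=−1, and constant weights α=(1/2,1/2), so that us equals +1 on the first half and −1 on the second half of each subinterval of length 1/s, while the relaxed equation is x˙=−x with solution x(t)=x(0)e^{−t}.

```python
import math

def chattering_solution(t, s, x0=1.0):
    """x_s(t) for x' = -x + u_s, u_s = +1 / -1 on the two halves of each subinterval."""
    half = 0.5 / s
    x = x0
    for k in range(2 * s):
        left = k * half
        if t <= left:
            break
        step = min(t, left + half) - left      # time spent on the k-th half-interval
        u = 1.0 if k % 2 == 0 else -1.0
        x = u + (x - u) * math.exp(-step)      # exact flow of x' = -x + u over `step`
    return x

def deviation(s, samples=1001):
    """Maximum over a grid of |x_s(t) - exp(-t)|, the gap to the relaxed solution."""
    grid = [i / (samples - 1) for i in range(samples)]
    return max(abs(chattering_solution(t, s) - math.exp(-t)) for t in grid)

for s in (20, 200):
    print(s, deviation(s))   # the gap shrinks roughly like 1/(2s)
```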
Bibliography
[1] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The mathematical theory of optimal processes, Interscience Publishers, John Wiley & Sons, Inc., New York–London 1962.
[2] R. V. Gamkrelidze, Principles of optimal control theory, Tbilisi University Publishing House, Tbilisi 1977; English transl., rev. ed., Math. Concepts Methods Sci. Eng., vol. 7, Plenum Press, New York–London 1978.
[3] A. F. Filippov, ‘‘Some questions of optimal control theory’’, Vest. Moskov. Univ. Ser. Mat. Mekh. Astron. Fiz. Khim., 1959, no. 2, 25–32.
[4] E. R. Avakov, G. G. Magaril-Il’yaev, ‘‘An implicit-function theorem for inclusions’’, Mat. Zametki, 91:6 (2012), 813–818; Math. Notes, 91:6 (2012), 764–769.
[5] V. M. Alekseev, V. M. Tikhomirov, and S. V. Fomin, Optimal control, Contemp. Soviet Math., Consultants Bureau, New York, 1987.
[6] E. R. Avakov, G. G. Magaril-Il’yaev, and V. M. Tikhomirov, ‘‘Lagrange’s principle in extremum problems with constraints’’, Uspekhi Mat. Nauk, 68:3(411) (2013), 5–38; Russian Math. Surveys, 68:3 (2013), 401–433.
[7] V. A. Zorich, Mathematical analysis, vol. II, Nauka, Moscow 1984; English transl., Universitext, Springer-Verlag, Berlin 2004.