On the proof of Michel of the Maximum Pontryagin principle
Joël Blot & Hasan Yilmaz
Joël Blot: Laboratoire SAMM EA 4543,
Université Paris 1 Panthéon-Sorbonne, centre P.M.F.,
90 rue de Tolbiac, 75634 Paris cedex 13,
France.
Hasan Yilmaz: Laboratoire LPSM UMR 8001
Université Paris-Diderot, Sorbonne-Paris-Cité
bâtiment Sophie Germain, 8 place Aurélie Nemours,
75013 Paris, France.
(Date: April 1, 2019)
Abstract.
We provide an improvement of the Pontryagin maximum principle for optimal control problems governed by an ordinary differential equation with final constraints, in the setting of piecewise differentiable state functions (valued in a Banach space) and piecewise continuous control functions (valued in a metric space). Like Michel, we use needlelike variations, but we introduce tools of functional analysis and a recent multiplier rule of static optimization to carry out our proofs.
Mathematics Subject Classification 2010: 49K15, 47H10
Key Words: Pontryagin maximum principle, piecewise continuous functions, fixed point theorem
1. Introduction
The paper deals with the maximum principle of Pontryagin for a problem of Bolza in the following form.
[TABLE]
In the special case where f0 is equal to zero, the problem is called a problem of Mayer and it is denoted by (M). T∈(0,+∞) is fixed. E denotes a real Banach space, Ω is a nonempty open subset of E, U denotes a nonempty metric space, and ξ0∈Ω is fixed; we use the mappings f0:[0,T]×Ω×U→R and f:[0,T]×Ω×U→E. The real valued functions gα and hβ are defined on Ω, and m and q are fixed integer numbers.
PC0([0,T],U) denotes the space of the piecewise continuous functions from [0,T] into U, and PC1([0,T],Ω) denotes the space of the piecewise differentiable functions from [0,T] into Ω. The precise definitions of these notions are given in Section 2.
When (x,u) is an admissible process for (B) or (M), we consider the following qualification condition (QC, i), where i∈{0,1}. The condition (QC, 0) is due to Michel [9].
[TABLE]
The main theorems of the paper are the following ones.
Theorem 1.1.
Let $(x_0,u_0)$ be a solution of problem (B). We assume that the following assumptions are fulfilled.
(A1) For all $\alpha\in\{0,\dots,m\}$, $g^\alpha$ is Fréchet differentiable at $x_0(T)$.
(A2) For all $\beta\in\{1,\dots,q\}$, $h^\beta$ is continuous on a neighborhood of $x_0(T)$ and is Fréchet differentiable at $x_0(T)$.
(A3) $f^0$ is continuous on $[0,T]\times\Omega\times U$, the partial differential with respect to the second (vector) variable $D_2f^0(t,\xi,\zeta)$ exists for all $(t,\xi,\zeta)\in[0,T]\times\Omega\times U$, and $D_2f^0$ is continuous on $[0,T]\times\Omega\times U$.
(A4) $f$ is continuous on $[0,T]\times\Omega\times U$, the partial differential with respect to the second (vector) variable $D_2f(t,\xi,\zeta)$ exists for all $(t,\xi,\zeta)\in[0,T]\times\Omega\times U$, and $D_2f$ is continuous on $[0,T]\times\Omega\times U$.
Then there exist $(\lambda_\alpha)_{0\le\alpha\le m}\in\mathbb{R}^{1+m}$, $(\mu_\beta)_{1\le\beta\le q}\in\mathbb{R}^{q}$ and $p\in PC^1([0,T],E^*)$ which satisfy the following conditions.
Part (I)
(NN) $(\lambda_\alpha)_{0\le\alpha\le m}$ and $(\mu_\beta)_{1\le\beta\le q}$ are not simultaneously equal to zero.
(Si) For all $\alpha\in\{0,\dots,m\}$, $\lambda_\alpha\ge 0$.
(Sℓ) For all $\alpha\in\{1,\dots,m\}$, $\lambda_\alpha g^\alpha(x_0(T))=0$.
(TC) $\sum_{\alpha=0}^{m}\lambda_\alpha Dg^\alpha(x_0(T))+\sum_{\beta=1}^{q}\mu_\beta Dh^\beta(x_0(T))=p(T)$.
(AE.B) $p'(t)=-D_2H_B(t,x_0(t),u_0(t),p(t),\lambda_0)$ for all $t\in[0,T]$ except at most when $t$ is a discontinuity point of $u_0$.
(MP.B) For all $t\in[0,T]$, for all $\zeta\in U$, $H_B(t,x_0(t),u_0(t),p(t),\lambda_0)\ge H_B(t,x_0(t),\zeta,p(t),\lambda_0)$.
(CH.B) $\bar H_B:=[t\mapsto H_B(t,x_0(t),u_0(t),p(t),\lambda_0)]\in C^0([0,T],\mathbb{R})$.
Part (II) If in addition we assume that, for all $(t,\xi,\zeta)\in[0,T]\times\Omega\times U$, the partial derivatives with respect to the first variable $\partial_1f^0(t,\xi,\zeta)$ and $\partial_1f(t,\xi,\zeta)$ exist and $\partial_1f^0$ and $\partial_1f$ are continuous on $[0,T]\times\Omega\times U$, then $\bar H_B\in PC^1([0,T],\mathbb{R})$ and, for all $t\in[0,T]$ which is a continuity point of $u_0$, $\bar H_B'(t)=\partial_1H_B(t,x_0(t),u_0(t),p(t),\lambda_0)$.
Part (III) If we assume that (QC, 1) is fulfilled for $(x,u)=(x_0,u_0)$ then, for all $t\in[0,T]$, $(\lambda_0,p(t))\neq(0,0)$.
In this statement, $E^*$ denotes the topological dual space of $E$, (NN) is a non-nullity condition, (Si) is a sign condition, (Sℓ) is a slackness condition, (TC) is the transversality condition, and (AE.B) is the adjoint equation, where the Hamiltonian of the problem of Bolza is defined as $H_B(t,x,u,p,\lambda):=\lambda f^0(t,x,u)+p\cdot f(t,x,u)$. (MP.B) is the maximum principle and (CH.B) is a continuity condition on the Hamiltonian.
Theorem 1.2.
Let $(x_0,u_0)$ be a solution of (M). Under (A1), (A2), and (A4) there exist $(\lambda_\alpha)_{0\le\alpha\le m}\in\mathbb{R}^{1+m}$, $(\mu_\beta)_{1\le\beta\le q}\in\mathbb{R}^{q}$ and $p\in PC^1([0,T],E^*)$ such that the following conditions hold.
Part (I)
(NN) $(\lambda_\alpha)_{0\le\alpha\le m}$ and $(\mu_\beta)_{1\le\beta\le q}$ are not simultaneously equal to zero.
(Si) For all $\alpha\in\{0,\dots,m\}$, $\lambda_\alpha\ge 0$.
(Sℓ) For all $\alpha\in\{1,\dots,m\}$, $\lambda_\alpha g^\alpha(x_0(T))=0$.
(TC) $\sum_{\alpha=0}^{m}\lambda_\alpha Dg^\alpha(x_0(T))+\sum_{\beta=1}^{q}\mu_\beta Dh^\beta(x_0(T))=p(T)$.
(AE.M) $p'(t)=-D_2H_M(t,x_0(t),u_0(t),p(t))$ for all $t\in[0,T]$ except at most when $t$ is a discontinuity point of $u_0$.
(MP.M) For all $t\in[0,T]$, for all $\zeta\in U$, $H_M(t,x_0(t),u_0(t),p(t))\ge H_M(t,x_0(t),\zeta,p(t))$.
(CH.M) $\bar H_M:=[t\mapsto H_M(t,x_0(t),u_0(t),p(t))]\in C^0([0,T],\mathbb{R})$.
Part (II) If we assume that, for all $(t,\xi,\zeta)\in[0,T]\times\Omega\times U$, the partial derivative with respect to the first variable $\partial_1f(t,\xi,\zeta)$ exists and $\partial_1f$ is continuous on $[0,T]\times\Omega\times U$, then we have $\bar H_M\in PC^1([0,T],\mathbb{R})$ and, for all $t\in[0,T]$ which is a continuity point of $u_0$, $\bar H_M'(t)=\partial_1H_M(t,x_0(t),u_0(t),p(t))$.
Part (III) If in addition to (A1), (A2), (A4), we assume that (QC, 0) is fulfilled when $(x,u)=(x_0,u_0)$, then $p(t)\neq 0$ for all $t\in[0,T]$.
In this statement the Hamiltonian of the problem of Mayer is defined as
$H_M(t,x,u,p):=p\cdot f(t,x,u)$.
To prove these statements, we build a variation of the proof of Michel [9] (on the problem of Mayer) by introducing functional analytic arguments. Notably we consider special function spaces of piecewise continuous functions, operators on these function spaces and fixed point theorems. We also use a recent result of multiplier rule on static optimization problems. The main contributions of the paper are the following ones.
Our assumptions on the gα are only their Fréchet differentiability, and on the hβ their continuity and their Fréchet differentiability, not their continuous differentiability as in [9], [1] (p. 321) and [7] (p. 132).
In [1] (p. 321) and [7] (p. 132) the first conclusion of the theorem of Pontryagin is that (λα)0≤α≤m, (μβ)1≤β≤q, and p are not simultaneously equal to zero. In our Theorem 1.1, the first conclusion is that (λα)0≤α≤m and (μβ)1≤β≤q are not simultaneously equal to zero; this is an improvement.
As in [9] we do not demand the finiteness of the dimension of the space E; in [1] and in [7], E is finite-dimensional. Moreover we use an open subset of E instead of E; this is another difference with [9]. Still regarding [9], we prove a condition of continuity of the needlelike variations with respect to the thickness of the needles, which is useful but omitted in [9].
Note that there exist statements of the theorem of Pontryagin without assumptions of continuous differentiability, by using locally Lipschitzean mappings and generalized differential calculus on these mappings, e.g. in [5]. A mapping which is Fréchet differentiable at a point is not necessarily locally Lipschitzean, and conversely a mapping which is locally Lipschitzean is not necessarily Fréchet differentiable at a given point; hence our result is not comparable with the statements of the locally Lipschitzean setting.
2. Function spaces
When X and Y are metric spaces, C0(X,Y) denotes the space of the continuous mappings from X into Y. When X is an open subset of a real normed vector space or an interval of R, C1(X,Y) denotes the space of the continuously Fréchet differentiable mappings from X into Y. When X and Y are real normed vector spaces, L(X,Y) denotes the space of the bounded linear mappings from X into Y, and Isom(X,X) denotes the space of the topological isomorphisms from X onto X. When X is a metric space, x∈X and r∈R+∗:=(0,+∞), the closed ball (respectively open ball) centered at x with radius r is denoted by $\overline{B}(x,r)$ (respectively $B(x,r)$).
2.1. Piecewise continuous functions.
Let Y be a metric space. A function u:[0,T]→Y is called piecewise continuous when u∈C0([0,T],Y) or when there exists a subdivision 0=τ0<τ1<...<τk<τk+1=T such that
For all i∈{0,...,k}, u is continuous on (τi,τi+1).
For all i∈{0,...,k}, the right-hand limit u(τi+) exists in Y.
For all i∈{1,...,k+1}, the left-hand limit u(τi−) exists in Y.
In other words, such a function is a regulated function (cf. [4], chapter 2) which possesses at most a finite number of discontinuity points. The space of such functions is denoted by PC0([0,T],Y). PC0([0,T],Y,(τi)0≤i≤k+1) denotes the space of the u∈PC0([0,T],Y) such that the set of the discontinuity points of u is included in {τi:i∈{0,...,k+1}}. When A is a subset of Y, PC0([0,T],A) (respectively PC0([0,T],A,(τi)0≤i≤k+1)) denotes the space of the u∈PC0([0,T],Y) (respectively u∈PC0([0,T],Y,(τi)0≤i≤k+1)) such that the closure of u([0,T]) is included in A.
Definition 2.1.
A function u∈PC0([0,T],A) is called a normalized piecewise continuous function when moreover u is right continuous on [0,T) and when u(T−)=u(T).
The space of such functions is denoted by NPC0([0,T],A). When (τi)0≤i≤k+1 is a subdivision of [0,T], we set
[TABLE]
2.2. Piecewise continuously differentiable functions.
When E is a real Banach space, a function x:[0,T]→E is called piecewise continuously differentiable when x∈C0([0,T],E) and when either x∈C1([0,T],E) or there exists a subdivision (τi)0≤i≤k+1 of [0,T] such that the following conditions are fulfilled.
For all i∈{0,...,k}, x is C1 on (τi,τi+1).
For all i∈{0,...,k}, x′(τi+) exists in E.
For all i∈{1,...,k+1}, x′(τi−) exists in E.
The τi are the corners of the function x.
We denote by PC1([0,T],E) the space of such functions; this space is denoted by KC1([0,T],E) in [1] (p. 66, Section 1.4). When Ω is an open subset of E, PC1([0,T],Ω) is the set of the x∈PC1([0,T],E) such that x([0,T])⊂Ω. When (τi)0≤i≤k+1 is a subdivision of [0,T], we denote by PC1([0,T],E,(τi)0≤i≤k+1) the set of the x∈PC1([0,T],E) such that the set of the corners of x is included in {τi:i∈{0,...,k+1}}.
When x∈PC1([0,T],E,(τi)0≤i≤k+1), we define the function dx:[0,T]→E by setting
[TABLE]
Note that dx∈NPC0([0,T],E,(τi)0≤i≤k+1).
2.3. Rewording of the problems
We consider the following problem.
[TABLE]
We denote by (M′) the special case of (B′) where f0=0.
We denote by Adm(B) (respectively Adm(B′)) the set of the admissible processes of (B) (respectively (B′)). When (x,u)∈Adm(B), and when the discontinuity points of u belong to the subdivision (τi)0≤i≤k+1 of [0,T], we introduce the function
[TABLE]
The function defined in (2.2) is the normalized version of u, and it belongs to NPC0([0,T],U).
Note that the values of f(t,x(t),⋅) at u(t) and at the normalized control can differ only when t∈{τi:0≤i≤k+1}, and so dx(t) is equal to f evaluated at t, x(t) and the normalized control, for all t∈[0,T]. Also note that
the values of f0(t,x(t),⋅) at u(t) and at the normalized control can differ only when t∈{τi:0≤i≤k+1}, and so the integral ∫0Tf0(t,x(t),u(t))dt is unchanged when u is replaced by its normalized version. Consequently we obtain J(Adm(B))=J(Adm(B′)). When (x0,u0) is a solution of (B′) then it is also a solution of (B). Conversely, when (x0,u0) is a solution of (B), building the normalized version of u0 by using (2.2), we obtain that x0 together with this normalized control is a solution of (B′). This is why we can say that the problems (B) and (B′) are equivalent. A similar reasoning shows that the problems (M) and (M′) are equivalent.
3. The needlelike variations
3.1. Two results of the metric spaces theory
The first result is a generalization of the theorem of Heine on the uniform continuity of a continuous mapping on a compact metric space; it is useful to avoid an assumption of local compactness and, especially in normed vector spaces, to avoid an assumption of finite dimensionality.
Theorem 3.1.
([12] p. 355, note (**))
Let X and Y be two metric spaces, ϕ∈C0(X,Y), and K⊂X be a compact subset. Then we have
[TABLE]
The following result is a theorem of fixed points in presence of parameters.
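Theorem 3.2.
([12] p. 103, Theorem 46-bis)
Let X be a complete metric space, Λ be a metric space, and ϕ:X×Λ→X be a mapping. We assume that the following conditions are fulfilled.
$\forall x\in X$, $\phi(x,\cdot)\in C^0(\Lambda,X)$.
$\exists k\in[0,1)$, $\forall\lambda\in\Lambda$, $\forall x,z\in X$, $d(\phi(x,\lambda),\phi(z,\lambda))\le k\,d(x,z)$.
Then we have
$\forall\lambda\in\Lambda$, $\exists!\,x_\lambda\in X$, $\phi(x_\lambda,\lambda)=x_\lambda$.
$[\lambda\mapsto x_\lambda]\in C^0(\Lambda,X)$.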
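To make the role of Theorem 3.2 concrete, here is a minimal numerical sketch on the real line. The map `phi`, the contraction constant `K`, and the parameter values are illustrative assumptions and do not come from the paper; the sketch only checks the standard estimate |x_λ − x_λ'| ≤ |λ − λ'|/(1 − k), which holds for this particular `phi` because it is 1-Lipschitz in the parameter.

```python
import math

K = 0.5  # contraction constant of phi in its first argument (assumed toy value)

def phi(x, lam):
    """Toy map on R: a K-contraction in x, continuous (1-Lipschitz) in lam."""
    return K * math.cos(x) + lam

def fixed_point(lam, x_start=0.0, tol=1e-13, max_iter=200):
    """Banach-Picard iteration x <- phi(x, lam), stopped when the step is below tol."""
    x = x_start
    for _ in range(max_iter):
        x_next = phi(x, lam)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("iteration did not converge")

lam, h = 0.3, 1e-3
x_lam = fixed_point(lam)
x_lam_h = fixed_point(lam + h)

residual = abs(phi(x_lam, lam) - x_lam)   # how well x_lam solves phi(x, lam) = x
gap = abs(x_lam_h - x_lam)                # variation of the fixed point with lam
bound = h / (1.0 - K)                     # a priori bound on that variation
```

In Section 3.3 the same mechanism is applied with the parameter a (the thicknesses of the needles) in place of λ and the operator Φa in place of ϕ.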
3.2. Definitions of the needlelike variations
We follow the definition of Michel of the needlelike variations which is given in [9]; Michel himself refers to [11] for this approach. Let (x0,u0) be a solution of (M′).
When N∈N∗:=N∖{0}, we consider S:=((ti,vi))1≤i≤N where the ti∈[0,T] satisfy 0<t1≤t2≤...≤tN<T, and where vi∈U. We denote by $\mathcal{S}$ the set of such S.
When S∈S and a=(a1,...,aN)∈R+N, we define the following objects
[TABLE]
[TABLE]
[TABLE]
[TABLE]
When a is small enough, we have Ii(a)⊂[0,T] and Ii(a)∩Ij(a)=∅ when i≠j.
We will prove the existence of a solution, denoted by xa (which depends on S and a) of the following Cauchy problem on [0,T]:
[TABLE]
In the rest of this section, we arbitrarily fix an S=((ti,vi))1≤i≤N in S.
3.3. Properties of continuity
In this subsection, we establish the existence of xa on the whole of [0,T] and the continuity of the mapping [a↦xa]. To do that we introduce an appropriate function space and a nonlinear operator of which xa is a fixed point. The continuity of [a↦xa] will be a consequence of the fixed point theorem with parameters.
Lemma 3.3.
([9], Proposition 2) There exists k∈R+∗:=R+∖{0}, there exists ρ∈R+∗ such that, for all a∈R+N satisfying ∥a∥≤ρ, we have
[TABLE]
We consider the subdivision (τi)0≤i≤k+1 of [0,T] where the τi are the discontinuity points of u0. For all i∈{0,...k} we consider the function u0i:[τi,τi+1]→U defined by
[TABLE]
Hence u0i∈C0([τi,τi+1],U), and consequently u0i([τi,τi+1]) is compact. We set
[TABLE]
M is compact as a finite union of compacts. We set
[TABLE]
Since x0∈C0([0,T],Ω), Γ is compact.
Lemma 3.4.
There exist $L\in\mathbb{R}_+^*$ and $r\in\mathbb{R}_+^*$ such that, for all $t\in[0,T]$, for all $\xi,\xi_1\in B(x_0(t),r)$ and for all $\zeta\in M$, we have
$\|f(t,\xi,\zeta)-f(t,\xi_1,\zeta)\|\le L\|\xi-\xi_1\|$.
Proof.
Since Ω is open in E, since x0([0,T]) is compact and included in Ω, there exists γ>0 such that {ξ∈E:d(ξ,x0([0,T]))<γ}⊂Ω, where d(ξ,x0([0,T])):=inf0≤t≤T∥ξ−x0(t)∥.
We set K:=Γ×M; K is compact as a product of compacts. Using Theorem 3.1 and (A4), we have
[TABLE]
Arbitrarily fix an ϵ>0. Let t∈[0,T], ζ∈M and ξ∈B(x0(t),δϵ). From (3.6) we obtain
[TABLE]
L:=sup{∥D2f(t,ξ1,ζ1)∥:t∈[0,T],ξ1∈B(x0(t),δϵ),ζ1∈M}
≤sup(t1,ξ1,ζ1)∈K∥D2f(t1,ξ1,ζ1)∥+ϵ<+∞.
We set r:=δϵ. If t∈[0,T], ξ,ξ1∈B(x0(t),r) and ζ∈M, using the Mean Value Inequality of the differential calculus theory we obtain
∥f(t,ξ,ζ)−f(t,ξ1,ζ)∥≤L∥ξ−ξ1∥.
∎
When φ∈C0([0,T],E), we set ∥φ∥b:=supt∈[0,T](e−Lt∥φ(t)∥). ∥⋅∥b is called the norm of Bielecki ([6] p. 25-27) and (C0([0,T],E),∥⋅∥b) is a complete normed vector space. We define
[TABLE]
Note that (X,∥⋅∥b) is a complete metric space. Note that when x∈B(x0,r1) we have, for all t∈[0,T], e−L⋅T∥x(t)−x0(t)∥≤e−L⋅t∥x(t)−x0(t)∥≤e−L⋅Tr which implies ∥x(t)−x0(t)∥≤r<γ, and so x(t)∈Ω. For all a∈B(0,ρ)∩R+N, we consider the operator
[TABLE]
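The completeness claimed for the norm of Bielecki follows from its equivalence with the usual supremum norm. A short verification, where the symbol $\|\varphi\|_\infty:=\sup_{t\in[0,T]}\|\varphi(t)\|$ is introduced here only for this remark and is not a notation of the paper:

```latex
% For every \varphi \in C^0([0,T],E) and every L > 0:
e^{-LT}\,\|\varphi\|_{\infty} \;\le\; \|\varphi\|_{b} \;\le\; \|\varphi\|_{\infty},
\qquad\text{since}\qquad
e^{-LT} \le e^{-Lt} \le 1 \quad\text{for all } t\in[0,T].
```

Hence a sequence is Cauchy (respectively convergent) for one norm if and only if it is so for the other, and a closed ball for $\|\cdot\|_b$ is a closed subset of a complete space.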
Lemma 3.5.
The constants k and ρ come from Lemma 3.3; the constant L comes from Lemma 3.4 and the constant r1 comes from (3.7).
We set $r_2:=\min\{\rho,\ e^{-LT}\,r_1\,k^{-1}\}$.
When $a\in\mathbb{R}_+^N$, if $\|a\|\le r_2$ then $\Phi_a(X)\subset X$.
Proof.
Note that, for all t∈[0,T], we have x0(t)=ξ0+∫0tf(s,x0(s),u0(s))ds; consequently we have
∥Φa(x0)(t)−x0(t)∥=∥∫0t(f(s,x0(s),ua(s))−f(s,x0(s),u0(s)))ds∥
≤∫0t∥f(s,x0(s),ua(s))−f(s,x0(s),u0(s))∥ds⟹
e−L⋅t∥Φa(x0)(t)−x0(t)∥≤e−L⋅t∫0t∥f(s,x0(s),ua(s))−f(s,x0(s),u0(s))∥ds
≤e−L⋅t∫0T∥f(s,x0(s),ua(s))−f(s,x0(s),u0(s))∥ds≤e−L⋅tk∥a∥ using Lemma 3.3. Hence taking the sup on the t∈[0,T], we obtain
∥Φa(x0)−x0∥b≤supt∈[0,T]e−L⋅tk∥a∥≤k∥a∥, and so we have, for a∈R+N,
[TABLE]
Let x∈X; then for all t∈[0,T], we have e−L⋅t∥x(t)−x0(t)∥≤r1 which implies ∥x(t)−x0(t)∥≤r, and we can use Lemma 3.4 to assert that we have
[TABLE]
Now for all t∈[0,T] we have
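\[
\begin{aligned}
\|(\Phi_a(x)-\Phi_a(x_0))(t)\| &= \Bigl\|\int_0^t \bigl(f(s,x(s),u_a(s))-f(s,x_0(s),u_a(s))\bigr)\,ds\Bigr\| \\
&\le \int_0^t \|f(s,x(s),u_a(s))-f(s,x_0(s),u_a(s))\|\,ds ,
\end{aligned}
\]
which implies
\[
\begin{aligned}
e^{-Lt}\|(\Phi_a(x)-\Phi_a(x_0))(t)\| &\le e^{-Lt}\int_0^t \|f(s,x(s),u_a(s))-f(s,x_0(s),u_a(s))\|\,ds \\
&\le e^{-Lt}\int_0^t L\,\|x(s)-x_0(s)\|\,ds \quad\text{(after (3.10))} \\
&= L e^{-Lt}\int_0^t e^{Ls}e^{-Ls}\|x(s)-x_0(s)\|\,ds \le L e^{-Lt}\int_0^t e^{Ls}\,\|x-x_0\|_b\,ds \\
&= L e^{-Lt}\,\frac{e^{Lt}-1}{L}\,\|x-x_0\|_b = (1-e^{-Lt})\,\|x-x_0\|_b \le (1-e^{-LT})\,r_1 .
\end{aligned}
\]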
Taking the sup on the t∈[0,T], we have proven
[TABLE]
Using (3.9) and the triangle inequality, we obtain $\|\Phi_a(x)-x_0\|_b\le\|\Phi_a(x)-\Phi_a(x_0)\|_b+\|\Phi_a(x_0)-x_0\|_b\le(1-e^{-LT})\,r_1+e^{-LT}\,r_1=r_1$, hence $\Phi_a(x)\in X$.
∎
Lemma 3.6.
The constant r2 comes from Lemma 3.5.
Let a∈R+N. If ∥a∥≤r2, then, for all x,z∈X, we have
∥Φa(x)−Φa(z)∥b≤(1−e−L⋅T)∥x−z∥b.
Proof.
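Let $x,z\in X$. Since, for all $t\in[0,T]$, we have $e^{-Lt}\|x(t)-x_0(t)\|\le r_1$ and $e^{-Lt}\|z(t)-x_0(t)\|\le r_1$, we obtain $\|x(t)-x_0(t)\|\le r$ and $\|z(t)-x_0(t)\|\le r$, and using Lemma 3.4, we have
\[
\begin{aligned}
e^{-Lt}\|(\Phi_a(x)-\Phi_a(z))(t)\| &\le e^{-Lt}\int_0^t \|f(s,x(s),u_a(s))-f(s,z(s),u_a(s))\|\,ds \\
&\le e^{-Lt}\int_0^t L\,\|x(s)-z(s)\|\,ds = L e^{-Lt}\int_0^t e^{Ls}e^{-Ls}\|x(s)-z(s)\|\,ds \\
&\le L e^{-Lt}\int_0^t e^{Ls}\,\|x-z\|_b\,ds = L e^{-Lt}\,\frac{e^{Lt}-1}{L}\,\|x-z\|_b \le (1-e^{-LT})\,\|x-z\|_b .
\end{aligned}
\]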
∎
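The contraction estimate of Lemma 3.6 can be observed numerically. The following is a minimal sketch under stated assumptions: a scalar state, a toy right-hand side `f` that is `L_CONST`-Lipschitz in the state (the control is frozen), and a left Riemann sum in place of the integral. None of these choices comes from the paper. With left Riemann sums on a uniform grid, the discrete operator satisfies the same bound as in the lemma, because the left sum of the increasing function s ↦ e^{Ls} does not exceed its integral.

```python
import math

L_CONST = 2.0            # Lipschitz constant of f in the state (assumed toy value)
T = 1.0
N_GRID = 400
H = T / N_GRID
GRID = [j * H for j in range(N_GRID + 1)]

def f(s, x):
    """Toy right-hand side, L_CONST-Lipschitz with respect to x."""
    return L_CONST * math.sin(x) + math.cos(s)

def picard(x, xi0=0.0):
    """Discrete version of Phi: xi0 plus the left Riemann sum of f(s, x(s))."""
    out = [xi0]
    acc = 0.0
    for j in range(N_GRID):
        acc += H * f(GRID[j], x[j])
        out.append(xi0 + acc)
    return out

def bielecki_norm(x):
    """Grid version of sup_t exp(-L t) |x(t)|."""
    return max(math.exp(-L_CONST * t) * abs(v) for t, v in zip(GRID, x))

# Two arbitrary continuous functions sampled on the grid.
x = [math.sin(3.0 * t) for t in GRID]
z = [t * t - 0.5 for t in GRID]

lhs = bielecki_norm([p - q for p, q in zip(picard(x), picard(z))])
rhs = (1.0 - math.exp(-L_CONST * T)) * bielecki_norm([p - q for p, q in zip(x, z)])
```

Here L·T = 2, so the elementary estimate in the supremum norm only yields the factor L·T > 1, whereas the factor 1 − e^{−L·T} obtained with the norm of Bielecki is always smaller than 1.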
Lemma 3.7.
For all x∈X, the mapping [a↦Φa(x)] is continuous from B(0,r2)∩R+N into X.
Proof.
Lemma 3.3 ensures the continuity of this mapping at a=0. Now we fix a^≠0. Let (an)n∈N be a sequence in B(0,r2)∩R+N which converges to a^. Note that we have
[TABLE]
We denote by μ the positive measure of Borel-Lebesgue of [0,T]. We have
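$\lim_{n\to+\infty}1_{[t_i+b_i(a^n),\,t_i+b_i(a^n)+a_i^n)}(t)=1_{[t_i+b_i(\hat a),\,t_i+b_i(\hat a)+\hat a_i)}(t)$ for $\mu$-a.e. $t\in[0,T]$, since the pointwise convergence is clear when $t\in(t_i+b_i(\hat a),\,t_i+b_i(\hat a)+\hat a_i)$ and when $t\in[0,T]\setminus[t_i+b_i(\hat a),\,t_i+b_i(\hat a)+\hat a_i]$, and a finite set is a $\mu$-null set. Similarly we obtain
$\lim_{n\to+\infty}1_{[t_i+b_i(a^n)+a_i^n,\,t_{i+1}+b_{i+1}(a^n))}(t)=1_{[t_i+b_i(\hat a)+\hat a_i,\,t_{i+1}+b_{i+1}(\hat a))}(t)$ for $\mu$-a.e. $t\in[0,T]$, and $\lim_{n\to+\infty}1_{[t_N+b_N(a^n)+a_N^n,\,T]}(t)=1_{[t_N+b_N(\hat a)+\hat a_N,\,T]}(t)$ for $\mu$-a.e. $t\in[0,T]$. Since a finite union of $\mu$-null sets is a $\mu$-null set, using (3.12) we obtain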
[TABLE]
Let (τj)0≤j≤k+1 be a subdivision of [0,T] such that the discontinuity points of u0 belong to {τj:0≤j≤k}. When j∈{0,...,k} we use the function u0j defined in (3.3) and then {f(t,x(t),u0j(t)):t∈[τj,τj+1]} is compact as the image of a compact set by a continuous function. Since {f(t,x(t),u0(t)):t∈[0,T]} is included in the finite union of compact sets ⋃0≤j≤k{f(t,x(t),u0j(t)):t∈[τj,τj+1]}, it is bounded.
For all i∈{1,...,N}, the set {f(t,x(t),vi):t∈[0,T]} is compact under (A4). Note that {f(t,x(t),ua(t)):t∈[0,T],a∈B(0,r2)∩R+N} is included in {f(t,x(t),u0(t)):t∈[0,T]}∪(∪1≤i≤N{f(t,x(t),vi):t∈[0,T]}). This last set is bounded as a finite union of bounded sets, hence there exists σ∈R+∗ such that, for all t∈[0,T] and for all a∈B(0,r2)∩R+N, ∥f(t,x(t),ua(t))∥≤σ/2. Hence we have
[TABLE]
Note that the constant σ is μ-integrable on [0,T], and that the function [t↦∥f(t,x(t),uan(t))−f(t,x(t),ua^(t))∥] is a Borel function on [0,T] as a composition of Borel functions. Hence, using (3.13) and (3.14), we can use the dominated convergence theorem of Lebesgue and assert that we have
[TABLE]
For all n∈N, for all t∈[0,T], we have
e−L⋅t∥(Φan(x)−Φa^(x))(t)∥≤e−L⋅t∫0t∥f(s,x(s),uan(s))−f(s,x(s),ua^(s))∥ds
≤∫0T∥f(t,x(t),uan(t))−f(t,x(t),ua^(t))∥dt, then taking the sup on the t∈[0,T], and using (3.15), we obtain limn→+∞∥Φan(x)−Φa^(x)∥b=0.
∎
Proposition 3.8.
The following assertions hold.
(i) For all a∈B(0,r2)∩R+N, there exists a solution xa of the Cauchy problem (3.2) which is defined on the whole of [0,T].
(ii) The mapping [a↦xa], from B(0,r2)∩R+N into X, is continuous.
Proof.
From Lemma 3.5, Lemma 3.6 and Lemma 3.7 we can use Theorem 3.2 and assert that, for each a∈B(0,r2)∩R+N, there exists a unique fixed point xa of Φa in X, and moreover we know that the mapping [a↦xa] is continuous. From the definition (3.5), we have xa(t)=ξ0+∫0tf(s,xa(s),ua(s))ds for all t∈[0,T]. From (A4), we can see that the function [s↦f(s,xa(s),ua(s))] belongs to NPC0([0,T],E), and consequently the function [t↦∫0tf(s,xa(s),ua(s))ds] belongs to PC1([0,T],E), and using a classical result on the differentiation of the primitives functions ([4], chapter 2, Corollary 1, FVR. II6), we obtain that dxa is well defined on [0,T] and we have dxa(t)=f(t,xa(t),ua(t)) on [0,T]. We also have xa(0)=ξ0, and so xa is a solution of the Cauchy problem (3.2). Hence the assertion (i) is proven, and the assertion (ii) results from the continuity of the fixed point with respect to a.
∎
3.4. Properties of differentiability
In this subsection we establish the Fréchet differentiability of the mapping [a↦xa(T)] at the origin.
First we recall some properties of the resolvents.
We consider the linear ODE dy(t)=D2f(t,x0(t),u0(t))y(t) when t∈[0,T]. Following the indications which are given in [10] (Chapter 18) we can assert that, denoting by R(t,s) the resolvent of this linear equation, we have R(t3,t1)=R(t3,t2)R(t2,t1), R(s,s)=idE, R(s,t)=R(t,s)−1, R(⋅,s)∈PC1([0,T],L(E,E)). We define d1R(t,s):=dR(⋅,s)(t) and we have, for all t∈[0,T], d1R(t,s)=D2f(t,x0(t),u0(t))R(t,s), and from R(t,s)=R(s,t)−1, we obtain that R(t,⋅)∈PC1([0,T],L(E,E)). We set d2R(t,s):=dR(t,⋅)(s).
The second step is the following fundamental result due to Michel.
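Lemma 3.9.
([9] Lemma 1) There exist $r_3\in(0,r_2)$, $\Lambda\in\mathcal{L}(\mathbb{R}^N,E)$ and a mapping $\varrho:B(0,r_3)\cap\mathbb{R}_+^N\to E$ such that $\lim_{a\to0}\varrho(a)=0$, and such that, for all $a\in B(0,r_3)\cap\mathbb{R}_+^N$, we have
$x_a(T)=x_0(T)+\Lambda a+\|a\|\,\varrho(a)$.
More precisely, $\Lambda a=\sum_{i=1}^{N}a_i\,R(T,t_i)\,[f(t_i,x_0(t_i),v_i)-f(t_i,x_0(t_i),u_0(t_i))]$.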
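The first-order expansion of Lemma 3.9 can be checked by hand on a scalar linear example. The sketch below is only an illustration under stated assumptions: the dynamics f(t,x,u) = −x + u, the reference control u0 ≡ 0, a single needle (N = 1) with value `V`, and a needle placed on [T1, T1 + a). This placement is a simplification chosen here, since the intervals Ii(a) of the paper are defined in a display that is not reproduced above. For this system the resolvent is R(T,t) = e^{−(T−t)}.

```python
import math

T = 1.0
XI0 = 1.0    # initial state (assumed toy value)
T1 = 0.4     # location of the single needle (assumed toy value)
V = 2.0      # control value used on the needle (assumed toy value)

def flow(x, u, dt):
    """Exact flow of x' = -x + u over a duration dt with a constant control u."""
    return u + (x - u) * math.exp(-dt)

def x_at_T(a):
    """Final state when the control equals V on [T1, T1 + a) and 0 elsewhere."""
    x = flow(XI0, 0.0, T1)
    x = flow(x, V, a)
    return flow(x, 0.0, T - T1 - a)

def linear_term(a):
    """Lambda a = a * R(T, T1) * [f(T1, x0(T1), V) - f(T1, x0(T1), 0)] = a * exp(-(T - T1)) * V."""
    return a * math.exp(-(T - T1)) * V

def remainder_ratio(a):
    """|x_a(T) - x_0(T) - Lambda a| / a, expected to tend to 0 with a."""
    return abs(x_at_T(a) - x_at_T(0.0) - linear_term(a)) / a

ratios = [remainder_ratio(10.0 ** (-k)) for k in range(1, 5)]
```

In this example the remainder ratio equals V e^{−(T−T1)} (e^a − 1 − a)/a, which behaves like a constant times a, in agreement with the term ∥a∥ϱ(a) of the lemma.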
The following result proves that the mapping [a↦xa(T)] is the restriction of a mapping (defined on a neighborhood of the origin in RN) which is Fréchet differentiable at the origin.
Proposition 3.10.
The constant r3 and the linear mapping Λ are provided by Lemma 3.9. There exist r4∈(0,r3] and a mapping κ∈C0(B(0,r4),Ω) which is Fréchet differentiable at a=0 and which satisfies, for all a∈B(0,r3)∩R+N, κ(a)=xa(T), and Dκ(0)=Λ.
Proof.
As a norm on RN we choose the norm associated to the usual inner product. We denote by π the best approximation projector from RN on the closed convex cone R+N, [2] (p. 18, Theorem 1). We know that π is 1-Lipschitzean. It is easy to verify that π(B(0,r3))⊂(B(0,r3)∩R+N). Using Proposition 3.8 note that the mapping ϱ is continuous on B(0,r3)∩R+N since we have
[TABLE]
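We set $\tilde\varrho:=\varrho\circ\pi\in C^0(B(0,r_3),E)$. We define $\kappa:B(0,r_3)\to E$ by setting $\kappa(a):=x_0(T)+\Lambda a+\|a\|\,\tilde\varrho(a)$. Then $\kappa$ is continuous since $\Lambda$ and $\tilde\varrho$ are continuous. We also have $\lim_{a\to0}\tilde\varrho(a)=\tilde\varrho(0)=0$, which implies that $\kappa$ is Fréchet differentiable at $0$ and that $D\kappa(0)=\Lambda$.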
Since x0(T)∈Ω with Ω open, since lima→0(Λa+∥a∥ϱ(a))=0, reducing r3 to r4∈(0,r3] we can assert that κ(B(0,r4))⊂Ω.
∎
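The two properties of the projector π used in the proof above (π is 1-Lipschitzean and does not increase the Euclidean norm) can be sanity-checked on random samples. This is a minimal sketch: the dimension, the sample size and the seed are arbitrary choices, and a random test illustrates the properties without proving them. For the Euclidean norm, the best approximation projector on the cone of vectors with nonnegative coordinates replaces each negative coordinate by 0.

```python
import math
import random

def project(a):
    """Euclidean projection of a onto the closed convex cone of nonnegative vectors."""
    return [max(c, 0.0) for c in a]

def norm(a):
    """Euclidean norm associated with the usual inner product."""
    return math.sqrt(sum(c * c for c in a))

def check(n_dim=4, n_samples=1000, seed=0):
    """Return the largest observed ratios |pi(a)-pi(b)|/|a-b| and |pi(a)|/|a|."""
    rng = random.Random(seed)
    worst_lip, worst_norm = 0.0, 0.0
    for _ in range(n_samples):
        a = [rng.uniform(-1.0, 1.0) for _ in range(n_dim)]
        b = [rng.uniform(-1.0, 1.0) for _ in range(n_dim)]
        diff = norm([p - q for p, q in zip(a, b)])
        pdiff = norm([p - q for p, q in zip(project(a), project(b))])
        if diff > 0.0:
            worst_lip = max(worst_lip, pdiff / diff)
        if norm(a) > 0.0:
            worst_norm = max(worst_norm, norm(project(a)) / norm(a))
    return worst_lip, worst_norm

worst_lip, worst_norm = check()
```

Both ratios stay below 1, which corresponds to the inclusion of π(B(0,r3)) in B(0,r3)∩R+N and to the continuity of the composition of ϱ with π.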
4. Proof of the principle for the problem of Mayer
We describe the general method. When we fix S=((ti,vi))1≤i≤N∈S, we reduce the initial dynamic problem of Mayer to a finite-dimensional static optimization problem where the unknown is the vector a of the thicknesses of the needles. Using a multiplier rule on this static problem we obtain a list of multipliers which depends on S. This is the matter of the first subsection.
In the second subsection we prove that we can choose such a list of multipliers which is independent of S∈S, and from this particular list we build the multipliers and the adjoint function of Theorem 1.2.
4.1. Reduction to the finite dimension
We arbitrarily fix S∈S. Since (x0,u0) is optimal for (M′), a=0 is a solution of the following finite-dimensional optimization problem
[TABLE]
Using the mapping κ of Proposition 3.10 and (bi∗)1≤i≤N, the dual basis of the canonical basis of RN, a=0 is also a solution of the following finite-dimensional optimization problem
[TABLE]
since, when a∈B(0,r4) is admissible for (FS1) then necessarily we have a∈B(0,r4)∩R+N. The interest to introduce (FS1) is that this problem enters into the setting of the multiplier rule of [3] while it is not the case for (FS).
Note that Michel in [9] works on (FS), not on (FS1). To do that, he uses a multiplier rule given in [8], which concerns problems on a convex cone.
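Lemma 4.1.
Let $S=((t_i,v_i))_{1\le i\le N}\in\mathcal{S}$. There exist $(\lambda_\alpha)_{0\le\alpha\le m}\in\mathbb{R}^{1+m}$ and $(\mu_\beta)_{1\le\beta\le q}\in\mathbb{R}^{q}$ which satisfy the following conditions.
(a) $(\lambda_\alpha)_{0\le\alpha\le m}$ and $(\mu_\beta)_{1\le\beta\le q}$ are not simultaneously equal to zero.
(b) $\forall\alpha=0,\dots,m$, $\lambda_\alpha\ge0$.
(c) $\forall\alpha=1,\dots,m$, $\lambda_\alpha g^\alpha(x_0(T))=0$.
(d) $\forall i=1,\dots,N$, $p(t_i)\,[f(t_i,x_0(t_i),v_i)-f(t_i,x_0(t_i),u_0(t_i))]\le0$, where
$p(t):=\bigl(\sum_{\alpha=0}^{m}\lambda_\alpha Dg^\alpha(x_0(T))+\sum_{\beta=1}^{q}\mu_\beta Dh^\beta(x_0(T))\bigr)R(T,t)$, $R(t,s)$ being the resolvent defined just before Lemma 3.9.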
Proof.
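Using Proposition 3.10, (A1) and (A2), the assumptions of Theorem 3.2 in [3] are fulfilled, and so we know that there exist $(\lambda_\alpha)_{0\le\alpha\le m}\in\mathbb{R}^{1+m}$, $(\mu_\beta)_{1\le\beta\le q}\in\mathbb{R}^{q}$, and $(\nu_i)_{1\le i\le N}\in\mathbb{R}^{N}$ such that the following conditions are fulfilled.
(i) $(\lambda_\alpha)_{0\le\alpha\le m}$, $(\mu_\beta)_{1\le\beta\le q}$ and $(\nu_i)_{1\le i\le N}$ are not simultaneously equal to zero.
(ii) $\forall\alpha=0,\dots,m$, $\lambda_\alpha\ge0$.
(iii) $\forall i=1,\dots,N$, $\nu_i\ge0$.
(iv) $\forall\alpha=1,\dots,m$, $\lambda_\alpha g^\alpha(x_0(T))=0$.
(v) $\forall i=1,\dots,N$, $\nu_i\,b_i^*(0)=0$.
(vi) $\sum_{\alpha=0}^{m}\lambda_\alpha Dg^\alpha(x_0(T))D\kappa(0)+\sum_{\beta=1}^{q}\mu_\beta Dh^\beta(x_0(T))D\kappa(0)+\sum_{i=1}^{N}\nu_i b_i^*=0$.
To prove (a), we proceed by contradiction: we assume that $(\lambda_\alpha)_{0\le\alpha\le m}$ and $(\mu_\beta)_{1\le\beta\le q}$ are equal to zero. Hence, using (i), $(\nu_i)_{1\le i\le N}$ is different from zero. Using (vi) we obtain $\sum_{i=1}^{N}\nu_i b_i^*=0$, and since the $b_i^*$ are linearly independent we obtain that $(\nu_i)_{1\le i\le N}$ is equal to zero: this is a contradiction. Consequently (a) is proven. Assertion (b) comes from (ii) and (c) comes from (iv). When $a\in\mathbb{R}_+^N$, using (iii), we have $\nu_i a_i\ge0$, and from (vi) we obtain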
[TABLE]
which implies the following relation, for all a∈R+N,
[TABLE]
Since $D\kappa(0)a=\sum_{i=1}^{N}a_i\,R(T,t_i)\,[f(t_i,x_0(t_i),v_i)-f(t_i,x_0(t_i),u_0(t_i))]$, the relation (4.1) is equivalent to
[TABLE]
which is equivalent to the conclusion (d).
∎
4.2. End of the proof of Part (I)
In this subsection we follow [9]. Since the set of the lists of multipliers is a cone, we can normalize them by adding the condition $\sum_{\alpha=0}^{m}|\lambda_\alpha|+\sum_{\beta=1}^{q}|\mu_\beta|=1$. When $S\in\mathcal{S}$, we define $K(S)$ as the set of the $((\lambda_\alpha)_{0\le\alpha\le m},(\mu_\beta)_{1\le\beta\le q})$ which verify the conclusions (a, b, c, d) of Lemma 4.1 and the additional condition $\sum_{\alpha=0}^{m}|\lambda_\alpha|+\sum_{\beta=1}^{q}|\mu_\beta|=1$. Denoting by $\Sigma(0,1)$ the unit sphere of $\mathbb{R}^{1+m+q}$, we have $K(S)\subset\Sigma(0,1)$, and $K(S)$ is closed since it is defined by non-strict inequalities and equalities. If $(S^\ell)_{1\le\ell\le n}=((t_i^\ell,v_i^\ell)_{1\le i\le N_\ell})_{1\le\ell\le n}$ is a finite family of elements of $\mathcal{S}$, then setting $N:=\sum_{\ell=1}^{n}N_\ell$, we can build $0<s_1\le s_2\le\dots\le s_N<T$ and $w_1,w_2,\dots,w_N\in U$ such that $\bar S=(s_j,w_j)_{1\le j\le N}\in\mathcal{S}$ and such that, for all $\ell\in\{1,\dots,n\}$, for all $i\in\{1,\dots,N_\ell\}$, there exists a unique $j\in\{1,\dots,N\}$ verifying $t_i^\ell=s_j$; and then we take $w_j:=v_i^\ell$. Note that, for all $\ell\in\{1,\dots,n\}$, the values of $S^\ell$ belong to the values of $\bar S$. If $((\lambda_\alpha)_{0\le\alpha\le m},(\mu_\beta)_{1\le\beta\le q})\in K(\bar S)$, the conclusions
(a, b, c, d) of Lemma 4.1 are satisfied for the values of $\bar S$, hence they are also satisfied for the values of $S^\ell$ for all $\ell\in\{1,\dots,n\}$, which implies that $((\lambda_\alpha)_{0\le\alpha\le m},(\mu_\beta)_{1\le\beta\le q})\in\bigcap_{1\le\ell\le n}K(S^\ell)$. Hence, this last finite intersection is nonempty.
Since $\Sigma(0,1)$ is compact, the finite intersection property of the closed subsets of $\Sigma(0,1)$ implies that $\bigcap_{S\in\mathcal{S}}K(S)\neq\emptyset$, [6] (p. 154, Appendix). Now we choose an element $((\lambda_\alpha)_{0\le\alpha\le m},(\mu_\beta)_{1\le\beta\le q})$ in $\bigcap_{S\in\mathcal{S}}K(S)$, and we consider $p$ defined in the conclusion (d) of Lemma 4.1 for this chosen element. By the construction of the $K(S)$, we see that the conclusions (NN), (Si) and (Sℓ) are proven.
We take $t\in(0,T)$ and $v\in U$, and then we have $((t,v))\in\mathcal{S}$. Then the conclusion (d) of Lemma 4.1 implies $p(t)\,[f(t,x_0(t),v)-f(t,x_0(t),u_0(t))]\le0$. Letting $t\to0+$ and $t\to T-$, we obtain the inequality for all $t\in[0,T]$. Hence the conclusion (MP.M) is proven.
Now we want to prove that p is a solution of the adjoint equation. Using the differentiability of R(⋅,s) outside of a finite set, R(t,s)=R(s,t)−1, the Fréchet differentiability of the inversion operator I:Isom(E,E)→Isom(E,E), I(L):=L−1, and the chain rule we obtain the following formula.
[TABLE]
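Differentiating $p(t)=\bigl(\sum_{\alpha=0}^{m}\lambda_\alpha Dg^\alpha(x_0(T))+\sum_{\beta=1}^{q}\mu_\beta Dh^\beta(x_0(T))\bigr)R(T,t)$ with respect to $t$, we obtain
$dp(t)=\bigl(\sum_{\alpha=0}^{m}\lambda_\alpha Dg^\alpha(x_0(T))+\sum_{\beta=1}^{q}\mu_\beta Dh^\beta(x_0(T))\bigr)\,d_2R(T,t)$, and using (4.2), we obtain
$dp(t)=\bigl(\sum_{\alpha=0}^{m}\lambda_\alpha Dg^\alpha(x_0(T))+\sum_{\beta=1}^{q}\mu_\beta Dh^\beta(x_0(T))\bigr)\bigl(-R(T,t)\,D_2f(t,x_0(t),u_0(t))\bigr)$
$=-p(t)\,D_2f(t,x_0(t),u_0(t))=-D_2H_M(t,x_0(t),u_0(t),p(t))$, and so $p$ satisfies (AE.M).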
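The computation above can be checked numerically in the scalar case, where the resolvent is explicit. This is a minimal sketch under stated assumptions: the function `A` is an arbitrary stand-in for D2f(t,x0(t),u0(t)), the constant `C` stands for the terminal value p(T), and the derivative is approximated by a central difference. None of these choices comes from the paper.

```python
import math

T = 1.0
C = 1.5  # stands for the terminal value p(T) (assumed toy value)

def A(t):
    """Scalar stand-in for D_2 f(t, x_0(t), u_0(t))."""
    return math.cos(t)

def resolvent(t, s):
    """R(t, s) = exp(integral of A from s to t) for the scalar equation y' = A(t) y."""
    return math.exp(math.sin(t) - math.sin(s))

def p(t):
    """Adjoint candidate p(t) = p(T) R(T, t)."""
    return C * resolvent(T, t)

def defect(t, h=1e-5):
    """|p'(t) + p(t) A(t)|, with p'(t) approximated by a central difference."""
    dp = (p(t + h) - p(t - h)) / (2.0 * h)
    return abs(dp + p(t) * A(t))

max_defect = max(defect(0.1 * j) for j in range(1, 10))
terminal_value = p(T)
```

The defect stays at the level of the discretization error, and the terminal value is exactly `C` because R(T,T) is the identity, which is the scalar counterpart of the transversality condition.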
From the equality R(T,T)=idE and from the formula which defines p we see that the conclusion (TC) holds. To prove (CH.M) we need the following result.
Lemma 4.2.
Let ϕ∈C0([0,T]×U,R) and u∈NPC0([0,T],U) such that ϕ(t,u(t))=maxζ∈Uϕ(t,ζ) for all t∈[0,T]. Then ϕˉ:=[t↦ϕ(t,u(t))]∈C0([0,T],R).
Proof.
Since u is right continuous on [0,T) and ϕ is continuous, ϕˉ is right continuous on [0,T). Since u is left continuous at T and ϕ is continuous, ϕˉ is left continuous at T. Now we have to prove that ϕˉ is left continuous on (0,T). Let t∈(0,T); for all h∈(−t,0), we have ϕ(t,u(t+h))≤ϕ(t,u(t)) and ϕ(t+h,u(t))≤ϕ(t+h,u(t+h)), and letting
h→0−, we obtain ϕ(t,u(t−))≤ϕ(t,u(t)) and ϕ(t,u(t))≤ϕ(t,u(t−)). Hence we have ϕ(t,u(t−))=ϕ(t,u(t)), i.e. ϕˉ(t−)=ϕˉ(t).
∎
If we set ϕ(t,ζ):=HM(t,x0(t),ζ,p(t)), from (MP.M) we have ϕˉ=HˉM and the conclusion (CH.M) is proven.
Hence Part (I) of Theorem 1.2 is completely proven for the problem of Mayer.
4.3. Proof of Part (II)
We need the following result.
Lemma 4.3.
Let ϕ∈C0([0,T]×U,R) such that, for all (t,ζ)∈[0,T]×U, the partial derivative with respect to the first variable ∂1ϕ(t,ζ) exists, and ∂1ϕ is continuous on [0,T]×U. Let u∈NPC0([0,T],U) such that ϕˉ(t):=ϕ(t,u(t))=maxζ∈Uϕ(t,ζ). Then the two following assertions hold.
When t is a continuity point of u, then ϕˉ is differentiable at t and we have ϕˉ′(t)=∂1ϕ(t,u(t)).
ϕˉ∈PC1([0,T],R).
Proof.
From Lemma 4.2 we know that $\bar\phi\in C^0([0,T],\mathbb{R})$. Let $t$ be a continuity point of $u$. For all $h>0$ small enough, we set $\Delta(h):=\bar\phi(t+h)-\bar\phi(t)$. We have $\phi(t+h,u(t))-\phi(t,u(t))\le\phi(t+h,u(t+h))-\phi(t,u(t))=\Delta(h)$ and $\phi(t+h,u(t+h))-\phi(t,u(t+h))\ge\phi(t+h,u(t+h))-\phi(t,u(t))=\Delta(h)$. Using the mean value theorem of Lagrange for functions of one real variable ([1], p. 142), we know that there exist $\theta_1^h$ and $\theta_2^h$ in $(0,1)$ such that $\partial_1\phi(t+\theta_1^hh,u(t))\,h\le\Delta(h)\le\partial_1\phi(t+\theta_2^hh,u(t+h))\,h$, which implies $\partial_1\phi(t+\theta_1^hh,u(t))\le\frac{1}{h}\Delta(h)\le\partial_1\phi(t+\theta_2^hh,u(t+h))$. Letting $h\to0+$ and using the continuity of $\partial_1\phi$ and the continuity of $u$ at $t$, we obtain $\lim_{h\to0+}\frac{\Delta(h)}{h}=\partial_1\phi(t,u(t))$. Hence the right derivative $\bar\phi_R'(t)$ exists and is equal to $\partial_1\phi(t,u(t))$. By a similar reasoning, we obtain that the left derivative $\bar\phi_L'(t)$ exists and is equal to $\partial_1\phi(t,u(t))$. Hence assertion (i) is proven.
Assertion (ii) is a consequence of assertion (i) using the continuity of ∂1ϕ and the normalized piecewise continuity of u.
∎
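Lemma 4.2 and Lemma 4.3 can be illustrated on a simple example where the maximizer jumps while the maximal value remains continuous. The data below (the horizon, the two-point set U and the function ϕ) are hypothetical and chosen only for this illustration.

```latex
% Hypothetical data: T = 1, U = \{-1, 1\} with the discrete metric.
\phi(t,\zeta) := \zeta\,\bigl(t - \tfrac12\bigr), \qquad
u(t) := \begin{cases} -1 & \text{if } 0 \le t < \tfrac12,\\ \phantom{-}1 & \text{if } \tfrac12 \le t \le 1, \end{cases}
\qquad
\bar\phi(t) = \phi(t,u(t)) = \max_{\zeta\in U}\phi(t,\zeta) = \bigl|t - \tfrac12\bigr| .
% u is right continuous on [0,1) and u(1-) = u(1), so u is normalized.
% \bar\phi is continuous on [0,1] although u jumps at t = 1/2 (Lemma 4.2).
% At every t \neq 1/2: \bar\phi'(t) = u(t) = \partial_1\phi(t,u(t)) (Lemma 4.3, first assertion).
% \bar\phi has a corner at t = 1/2: it is piecewise C^1 but not C^1 (Lemma 4.3, second assertion).
```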
Setting ϕ(t,ζ):=HM(t,x0(t),ζ,p(t)), we have ϕˉ=HˉM and Part (II) is a corollary of Lemma 4.3.
4.4. Proof of Part (III)
We proceed by contradiction; if there exists t0∈[0,T] such that p(t0)=0, since (AE.M) is linear, by using the uniqueness of the solution of the Cauchy problem ((AE.M), p(t0)=0), we obtain that p(t)=0 for all t∈[0,T], notably p(T)=0. Hence using (TC), (Si) and (Sℓ), (QC, 0) implies that (∀α=0,...,m,λα=0) and (∀β=1,...,q,μβ=0) which is a contradiction with (NN). Hence Part (III) is proven.
5. Proof of the principle for the problem of Bolza
It is well known that we can transform a problem of Bolza into a problem of Mayer [10] (p. 393, Chapter 18). We carry out such a transformation to deduce Theorem 1.1 from Theorem 1.2. We introduce an additional state variable denoted by σ. We set X:=(σ,x)∈R×Ω as a new state variable; we set F(t,(σ,x),u):=(f0(t,x,u),f(t,x,u)) as the new vector field; we set G0(σ,x):=σ+g0(x), Gα(σ,x):=gα(x) when α=1,...,m, and we set Hβ(σ,x):=hβ(x) when β=1,...,q. We formulate the following new problem of Mayer:
[TABLE]
5.1. Proof of Part (I)
We denote by ϖ1:R×E→R and by ϖ2:R×E→E the two projections.
When $(x,u)$ is an admissible process for (B), setting $\sigma(t):=\int_0^t f^0(s,x(s),u(s))\,ds$, we see that $((\sigma,x),u)$ is an admissible process for (MB) and we have $G^0((\sigma,x)(T))=\int_0^T f^0(t,x(t),u(t))\,dt+g^0(x(T))$. Conversely when $(X,u)$ is an admissible process for (MB), setting $x:=\varpi_2\circ X$, we see that $(x,u)$ is an admissible process for (B), and setting $\sigma:=\varpi_1\circ X$, we have $\int_0^T f^0(t,x(t),u(t))\,dt+g^0(x(T))=\sigma(T)+g^0(x(T))=G^0(X(T))$. Hence since $(x_0,u_0)$ is optimal for (B), we obtain that $(X_0,u_0)=((\sigma_0,x_0),u_0)$ is optimal for (MB). The assumptions of Theorem 1.1 imply that the assumptions of Theorem 1.2 are fulfilled, where (M) is replaced by (MB). Hence there exist $(\Lambda_\alpha)_{0\le\alpha\le m}\in\mathbb{R}^{1+m}$, $(M_\beta)_{1\le\beta\le q}\in\mathbb{R}^{q}$ and $P\in PC^1([0,T],(\mathbb{R}\times E)^*)$ such that the conclusions of Theorem 1.2 hold.
When $P\in(\mathbb{R}\times E)^*$, we define $p_0\in\mathbb{R}$ and $p\in E^*$ by setting $p_0:=P(1,0)$ and $p\,\xi:=P(0,\xi)$ for all $\xi\in E$, and so we have $P(r,\xi)=p_0r+p\,\xi$ for all $(r,\xi)\in\mathbb{R}\times E$. The Hamiltonian of (MB) is $H_M(t,(\sigma,x),u,(p_0,p)):=(p_0,p)\,F(t,(\sigma,x),u)=p_0f^0(t,x,u)+p\,f(t,x,u)$. The conclusions of Theorem 1.2 provide the following conditions.
(i) $(\Lambda_\alpha)_{0\le\alpha\le m}$ and $(M_\beta)_{1\le\beta\le q}$ are not simultaneously equal to zero.
(ii) $\forall\alpha=0,\dots,m$, $\Lambda_\alpha\ge0$.
(iii) $\forall\alpha=1,\dots,m$, $\Lambda_\alpha G^\alpha(X_0(T))=0$.
(iv) $P(T)=\sum_{\alpha=0}^{m}\Lambda_\alpha DG^\alpha(X_0(T))+\sum_{\beta=1}^{q}M_\beta DH^\beta(X_0(T))$.
(v) $dP(t)=-D_2H_M(t,X_0(t),u_0(t),P(t))$ for all $t\in[0,T]$.
(vi) $H_M(t,X_0(t),u_0(t),P(t))\ge H_M(t,X_0(t),\zeta,P(t))$ for all $t\in[0,T]$ and for all $\zeta\in U$.
(vii) $[t\mapsto H_M(t,X_0(t),u_0(t),P(t))]\in C^0([0,T],\mathbb{R})$.
We set λα:=Λα for all α=0,...,m, and μβ:=Mβ for all β=1,...,q.
Hence (i) and (ii) imply that (NN) and (Si) of Theorem 1.1 hold. From (iii) we obtain λαgα(x0(T))=0 for all α=1,...,m, and so (Sℓ) of Theorem 1.1 holds.
About the partial differentials, note that we have, for the partial differentials with respect to the first variable: D1G0(σ,x0(T))=idR, D1Gα(σ,x0(T))=0 when α=1,...,m, D1Hβ(σ,x0(T))=0 when β=1,...,q, and for the partial differentials with respect to the second variable: D2G0(σ,x0(T))=Dg0(x0(T)), D2Gα(σ,x0(T))=Dgα(x0(T)) when α=1,...,m, and D2Hβ(σ,x0(T))=Dhβ(x0(T)) when β=1,...,q. Hence from (iv) we deduce the two following relations.
[TABLE]
[TABLE]
This last equality is just the conclusion (TC) of Theorem 1.1.
From (v) we obtain that dp0(t)=0 for all t∈[0,T], and then using (5.1) we have the following relation.
[TABLE]
From (v) we also deduce that, for all t∈[0,T], we have
$dp(t)=-\lambda_0D_2f^0(t,x_0(t),u_0(t))-p(t)D_2f(t,x_0(t),u_0(t))=-D_2H_B(t,x_0(t),u_0(t),p(t),\lambda_0)$, which is (AE.B) of Theorem 1.1.
From (vi) we deduce that, for all t∈[0,T] and for all ζ∈U, we have
λ0f0(t,x0(t),u0(t))+p(t)f(t,x0(t),u0(t))≥λ0f0(t,x0(t),ζ)+p(t)f(t,x0(t),ζ) which is the conclusion (MP.B) of Theorem 1.1.
From (vii), since HM(t,X0(t),u0(t),P(t))=HB(t,x0(t),u0(t),p(t),λ0) we obtain (CH.B).
Hence Part (I) of Theorem 1.1 is completely proven.
5.2. Proof of Part (II)
Using Part (II) of Theorem 1.2 on (MB), the existence and the continuity of ∂1f0 and of ∂1f imply the existence and the continuity of ∂1F. We obtain that [t↦HB(t,x0(t),u0(t),p(t),λ0)=HM(t,X0(t),u0(t),P(t))]∈PC1([0,T],R), and when t is a continuity point of u0, we have HˉB′(t)=HˉM′(t)=λ0∂1f0(t,x0(t),u0(t))+p(t)∂1f(t,x0(t),u0(t)). Hence Part (II) is proven.
5.3. Proof of Part (III)
We proceed by contradiction, assuming that there exists t∗∈[0,T] such that (λ0,p(t∗))=(0,0). Since λ0=0, (AE.B) becomes a homogeneous linear equation, and using the uniqueness of the solution of the Cauchy problem ((AE.B), p(t∗)=0), we obtain that p is equal to zero on [0,T]; notably we have p(T)=0. Hence using (TC), (Si), (Sℓ), (QC, 1) implies that (∀α=1,...,m,λα=0) and (∀β=1,...,q,μβ=0). Since λ0=0, we have (∀α=0,...,m,λα=0) and (∀β=1,...,q,μβ=0) which is a contradiction with (NN).