This paper develops optimal control conditions for reaction-diffusion systems with hysteresis, addressing challenges posed by non-locality and rate-independent hysteresis, and establishing regularity and uniqueness results.
Contribution
It introduces first order optimality conditions for hysteresis-including reaction-diffusion systems, with improved conditions and uniqueness results for distributed controls.
Findings
01
Derived first order necessary optimality conditions.
02
Proved regularity and uniqueness of the adjoint system.
03
Analyzed the value function's regularity under control restrictions.
Optimal control of reaction-diffusion systems with hysteresis
Christian Münch
Department of Mathematics - M6, Technical University of Munich, Boltzmannstr. 3, 85747 Garching, Germany.
Abstract
This paper is concerned with the optimal control of hysteresis-reaction-diffusion systems.
We study a control problem with two sorts of controls, namely distributed control functions, or controls which act on a part of the boundary of the domain.
The state equation is given by a reaction-diffusion system with the additional challenge that the reaction term includes a scalar stop operator.
We choose a variational inequality to represent the hysteresis.
In this paper, we prove first order necessary optimality conditions. In particular, under certain regularity assumptions, we derive results about the continuity properties of the adjoint system. For the case of distributed controls, we improve the optimality conditions and show uniqueness of the adjoint variables. We employ the optimality system to prove higher regularity of the optimal solutions of our problem. Finally, we derive regularity properties of the value function of a perturbed control problem when the set of controls is restricted.
The specific feature of rate-independent hysteresis
in the state equation leads to difficulties concerning the analysis of the solution operator.
Non-locality in time of the Hadamard derivative of the control-to-state operator complicates the derivation of an adjoint system.
In this paper, we derive an adjoint system for the optimal control problem
[TABLE]
subject to
[TABLE]
where WΓD−1,p(Ω) is a product of dual spaces, see e.g. [Mün16, (16)-(18)] for the existence theory of problem (1)-(3) and related references therein.
We consider either spatially
distributed controls in the space
U1:=L2((0,T);[L2(Ω)]m), or controls which act on given Neumann boundary parts ΓNj, j∈{1,…m}, of the state space, i.e. controls in
U2:=L2((0,T);∏j=1mL2(ΓNj,Hd−1)).
The operators
B1:[L2(Ω)]m→WΓD−1,p(Ω) and
B2:∏j=1mL2(ΓNj,Hd−1)→WΓD−1,p(Ω) are continuous and Ap is an unbounded diffusion operator on the space WΓD−1,p(Ω). With i∈{1,2}, we identify Bi with the corresponding continuous operators from Ui into L2((0,T);WΓD−1,p(Ω)) which act pointwise in time, i.e. we write (Biu)(t)=Bi(u(t)) for all t∈(0,T). In the same way we identify (Apy)(t) with Ap(y(t)) for functions y:(0,T)→WΓD−1,p(Ω).
Moreover, S projects y to a scalar valued function. In particular,
W is a scalar stop operator and it is well-known (see e.g. [Vis13], [BK13]) that
W is represented by the solution operator z=W[v] of the variational inequality
[TABLE]
For i∈{1,2}, we denote by G the operator, which maps Biu to the unique solution y of (2)-(3), see [Mün16, Theorem 3.1].
Note that y=G(Biu) is a function of time with values in a product of dual spaces.
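To make the stop operator concrete, the following minimal sketch evaluates a time-discrete scalar stop. It assumes that the variational inequality (4)-(5) has the standard form z(t)∈[a,b], (ż(t)−v̇(t))(z(t)−ξ)≤0 for all ξ∈[a,b], z(0)=z0, whose implicit Euler discretization is a projection onto [a,b]. The function name `stop_operator` and the sample data are hypothetical and not taken from the paper.

```python
def stop_operator(v, z0, a, b):
    """Time-discrete scalar stop: z_k = proj_[a,b](z_{k-1} + v_k - v_{k-1}).

    v  : list of input samples v(t_0), ..., v(t_N)
    z0 : initial value (projected onto [a, b] for safety)
    """
    z = [min(max(z0, a), b)]
    for k in range(1, len(v)):
        # follow the input increment, then project back onto [a, b]
        z.append(min(max(z[-1] + v[k] - v[k - 1], a), b))
    return z


# The input rises from 0 to 2 and then falls back to 0.
v = [0.0, 0.5, 1.0, 1.5, 2.0, 1.5, 1.0, 0.5, 0.0]
z = stop_operator(v, z0=0.0, a=-1.0, b=1.0)
print(z)  # [0.0, 0.5, 1.0, 1.0, 1.0, 0.5, 0.0, -0.5, -1.0]
```

The output follows the input increments as long as it stays inside [a,b] and saturates at the threshold otherwise.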
Optimal control of (systems of) partial differential equations has extensively been analyzed in the literature before.
In particular, optimal control problems with state equations of semilinear parabolic type are part of the well-known monograph [Trö10] and the early work [BC85]. Further studies in this direction are the subject of [RZ98] and [Cas97]. We also refer to [HKR13] for a control problem with parabolic state equation and rough boundary conditions like in our setting.
Early studies in the field of optimal control of reaction-diffusion systems and in particular in the direction of parameter sensitivity analysis have been performed in [Gri03] and were further established in [GV06] and several more papers. Optimality conditions for a similar problem were also derived in [BJT10].
The non-linearities in all the works mentioned so far are mostly smooth enough to obtain a (twice) continuously differentiable control-to-state operator, so that first and many times also second order optimality conditions could be derived.
In the literature, there are only few results available concerning optimal control of infinite-dimensional rate-independent processes.
For a class of energetically driven processes, existence of optimal controls for problems of this type has first been
studied in [Rin08] and [Rin09].
Subsequently, the results were applied to (thermal) control problems in the field of shape memory
materials in [ELS13] and [EL14]. No optimality conditions are given in these works.
Optimal control of a problem of static plasticity in the infinite-dimensional setting is the subject of [HMW12] and [HMW13]. The results were used in [HMW14] to numerically solve a quasi-static control problem by time-discretization.
Optimality conditions for time-continuous, infinite-dimensional, rate-independent control problems of quasi-static plasticity type could be derived in [Wac12], [Wac15], [Wac16] by means of time-discretization.
Another time-continuous, infinite-dimensional optimal control problem of a rate-independent system, which is represented by its energetic formulation, is addressed in [SWW16]. With help of viscous regularization, a necessary optimality condition is derived.
To our knowledge, the first results for optimal control of hysteresis have been achieved in [Bro87, Bro88, Bro91].
Necessary optimality conditions for the optimal control of an ODE-system with hysteresis were established.
An adjoint system was derived by a time discretization approach.
Optimal control of sweeping processes has been studied in [CMF14], [Col+12] and [Col+16].
In [BK13], first order optimality conditions for a control problem of an ODE-system with hysteresis of (vectorial) stop type were derived. The stop operator is represented in form of a variational inequality.
The main challenge with the stop operator (as with all hysteresis operators) is that hysteresis acts non-locally in time, so that the state y(t) at each time t∈(0,T] depends on the whole history on (0,t). Moreover, the stop operator is not differentiable in the classical sense, so the control-to-state operator cannot be expected to be differentiable either.
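The dependence on the input history can be seen in a small experiment. The sketch below uses a time-discrete projection scheme for the scalar stop, which is an illustrative assumption and not the construction used in the analysis. It feeds in two inputs that end at the same value; the names `v_loop` and `v_rest` are hypothetical.

```python
def stop_operator(v, z0, a, b):
    """Time-discrete scalar stop via projection of the incremented state."""
    z = [min(max(z0, a), b)]
    for k in range(1, len(v)):
        z.append(min(max(z[-1] + v[k] - v[k - 1], a), b))
    return z


# Both inputs start and end at 0, but only the first one makes an excursion.
v_loop = [0.0, 1.0, 2.0, 1.0, 0.0]
v_rest = [0.0, 0.0, 0.0, 0.0, 0.0]

z_loop = stop_operator(v_loop, 0.0, -1.0, 1.0)
z_rest = stop_operator(v_rest, 0.0, -1.0, 1.0)
print(z_loop[-1], z_rest[-1])  # -1.0 0.0
```

Although both inputs coincide at the final time, the outputs differ, so the output at time t is not a function of the input value at time t alone.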
Regularization techniques were used in order to derive an optimality system.
Several of the ideas of this approach are useful also for us.
To handle a reaction-diffusion system requires additional work though. Firstly, the state vector y:[0,T]→WΓD−1,p(Ω) in (2) is a function with values in an infinite-dimensional space and secondly, the non-linearity f in our case is not necessarily continuously differentiable but only locally Lipschitz continuous and directionally differentiable. Therefore, techniques as in [MS15] are required. Particularly, since the domain Ω has a rough boundary, we have to consider a product of dual spaces for the domain of the diffusion operator Ap.
The existing literature provides only few rigorous results in the field of control of hysteresis-reaction-diffusion systems, especially when it comes to optimal control of such systems.
In [CC02], automatic control problems governed by reaction-diffusion systems with feedback control of relay switch and Preisach type have been studied. Global existence and uniqueness of solutions were proven.
Closed-loop control of a reaction-diffusion system coupled with ordinary differential inclusions has been considered in [DN11]. A feedback law for the case with a finite number of control devices was derived.
Necessary conditions for the optimal control of (general) non-smooth semilinear parabolic equations have been established in [MS15]. In particular, the non-linearity is merely locally Lipschitz continuous and directionally differentiable so that the control-to-state operator is not differentiable in the classical sense. Regularization techniques have been used to derive an adjoint system. No hysteresis is considered in this paper.
Nevertheless, a modification of the approach in [MS15] is applicable for the problem at hand.
In particular, we include ideas from [BK13] and adapt the proof to apply to non-localities in time such as hysteresis.
We refer to the references in [MS15] for a good overview of further contributions dealing with optimal control of non-smooth parabolic equations.
In this paper, we are interested in the optimal control of non-smooth reaction-diffusion systems with hysteresis. In particular, a scalar stop operator enters the non-linearity f.
The function f is assumed to be locally Lipschitz continuous and directionally differentiable. Additionally, the domain Ω satisfies minimal smoothness assumptions.
The outline of the paper is as follows:
In Section 2, we introduce the framework for the rest of the work and collect results from the literature.
Subsection 2.3 contains the main assumption and notation.
Our first main interest is to derive an adjoint system and first order necessary optimality conditions for problem (1)-(3).
In Section 3, we introduce a family of regularized control problems with ε-dependent state equations and derive adjoint systems as well as optimality conditions for those.
In particular, we regularize f and the stop operator W in dependence of the parameter ε>0 and replace the original control problem by a regularized one.
The corresponding control-to-state operator u↦Gε(Biu), i∈{1,2}, and the regularization y↦Zε(Sy) of W[S⋅] are Gâteaux-differentiable and we obtain optimal solutions uε, yε=Gε(Biuε) and zε=Zε(Syε) of the regularized problems.
We investigate the limit ε→0 and use standard arguments to derive
a solution (u,y,z) of the original problem.
It still remains difficult to derive adjoint systems (pε,qε) already for the regularized problems.
The main result of Section 3 is Theorem 3.13 which contains the evolution equations of pε and qε and the adjoint equation which provides a relation between (pε,qε) and uε and u.
In Section 4, we perform the key step towards an optimality system of (1)-(3) by driving the regularization parameter to zero. We exploit the adjoint systems (pε,qε) to derive necessary optimality conditions for problem (1)-(3).
While the evolution equation for p
follows in a rather straightforward way, the adjoint variable q,
which belongs to z, has lower regularity, similarly to optimal control problems with implicit state constraints in the form of variational inequalities.
The function q is contained in the space BV(0,T) of functions with bounded total variation in [0,T], and instead of a time derivative we obtain a measure dq∈C([0,T])∗.
In order to complete our knowledge about the optimality system, we study q and dq in detail. Indeed, we reveal many properties of q and the corresponding measure dq. There remains an abstract measure dμ∈C([0,T])∗ on which dq depends and which we cannot fully characterize. Moreover, dμ appears in the optimality conditions for problem (1)-(3). Still, we are able to prove that dμ has its support only in a part of [0,T].
With an additional regularity assumption on Sy, we can characterize the measure dμ also in most of the parts where it does not vanish.
The first main results of Section 4 are Theorem 4.13 and Corollary 4.14, which contain the existence of an adjoint system and optimality conditions for problem (1)-(3) for i∈{1,2}.
After having established the optimality system for the general problem (1)-(3), i∈{1,2}, we continue to improve the optimality conditions for the particular case of distributed control functions, i.e. for i=1, see Corollary 4.15 below.
Moreover, in Corollary 4.16 we show uniqueness of p, q and dμ for i=1. In the proof we make explicit use of the surjectivity of B1 which implies that the operator B1∗ in the adjoint equation is one-to-one.
These together are the second main result of Section 4.
In Section 5,
we prove higher regularity of the optimal control u and the optimal state y by means of the adjoint equation and the continuity properties of the adjoint variables, see Theorem 5.2 below. An example for a case in which Theorem 5.2 can be applied is given in Remark 5.3.
Finally, in Section 6 we study a perturbed problem similar to (1)-(3). In particular, in Theorem 6.1 we prove regularity results for the corresponding value function.
All our results are applicable for more general spaces of control functions U=L2((0,T);Ũ), as long as there exists a continuous operator B:Ũ→WΓD−1,p(Ω). Also J(y,u) can be replaced by a general differentiable functional J(y,u,z) if the corresponding reduced cost function remains coercive in u∈U.
Moreover, Ap can be replaced by a semi-linear parabolic operator which satisfies maximal parabolic regularity on the space WΓD−1,p(Ω).
We focus on the two particular control problems for U1 and U2 and on the operator Ap in order to give an illustration.
Notation:
We write L(X,Y) for the space of linear operators between spaces X and Y and L(X) for the space of linear operators on X.
We also abbreviate the duality in X by
⟨x,y⟩X∗,X=⟨x,y⟩X. By c>0 we denote a generic constant which is adapted in the course of the paper.
In Banach space valued evolution equations like (2) we sometimes omit the range space if the latter is clear from the context, i.e. we only write “for t∈(0,T)”.
2 Preliminaries and assumptions
We introduce the setting for the rest of the work, collect results from the literature and state the main assumption.
2.1 Sobolev spaces including homogeneous Dirichlet boundary conditions
Definition 2.1 (I-sets).
[Mün16, Definition 2.1]
For 0<I≤d and a closed set M⊂Rd let ρ denote the restriction of the I-dimensional Hausdorff measure HI to M. Then we call M an I-set if there are constants c1,c2>0 such that
\[ c_1 r^{I} \le \rho\bigl(B(x,r)\bigr) \le c_2 r^{I} \]
for all x in M and r∈]0,1[, where B(x,r) denotes the ball in Rd with center x and radius r.
Assumption 2.2 (Domain).
[Hal+15, Assumption 2.3 and Assumption 4.11] or [Mün16, Assumption 2.2 and Assumption 2.6]
For some given d≥2, the domain Ω⊂Rd is bounded and Ω is a d-set.
For j∈{1,…,m} the Neumann boundary part ΓNj⊂∂Ω is open and ΓDj=∂Ω\ΓNj is a (d−1)-set.
For any x∈ΓNj there is an open neighborhood Ux of x and a bi-Lipschitz mapping ϕx from Ux onto a cube in Rd such that ϕx(Ω∩Ux) equals the lower half of the cube and such that ∂Ω∩Ux is mapped onto the top surface of the lower half cube.
We only consider real valued functions.
For each component j∈{1,…,m} of the space of vector valued functions, see Definition 2.3, we decompose the boundary ∂Ω into the corresponding Dirichlet part ΓDj and the Neumann boundary ΓNj:=∂Ω\ΓDj, see Assumption 2.2.
The cases ΓDj=∅ and ΓDj=∂Ω are not excluded.
We define Sobolev spaces which include the Dirichlet boundary conditions for our state equation.
Definition 2.3 (Sobolev spaces).
[Hal+15, Definition 2.4] or [Mün16, Definition 2.4]
For Ω from Assumption 2.2 and p∈[1,∞) we denote by
W1,p(Ω)
the usual Sobolev space on Ω.
If M is a closed subset of Ω we define
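\[ W^{1,p}_{M}(\Omega) := \overline{\bigl\{\psi|_{\Omega} : \psi \in C_0^{\infty}(\mathbb{R}^d),\ \operatorname{supp}(\psi)\cap M = \emptyset\bigr\}}, \]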
where the closure is taken in the space W1,p(Ω).
In the case p∈(1,∞) we denote by p′ the Hölder conjugate of p.
Moreover, we write
\[ W^{-1,p}_{M}(\Omega) := \bigl[W^{1,p'}_{M}(\Omega)\bigr]^{*} \]
for the dual space of WM1,p′(Ω).
In the vectorial setting we introduce the product space
\[ W^{1,p}_{\Gamma_D}(\Omega) := \prod_{j=1}^{m} W^{1,p}_{\Gamma_{D_j}}(\Omega), \]
and for p∈(1,∞) we denote by WΓD−1,p(Ω) the (componentwise) dual space of
WΓD1,p′(Ω).
2.2 Operators and their properties
In this section we precisely define the operators Ap in equation (2), see Definition 2.4. We apply results from the literature to assure that Ap satisfies the properties which we need for the analysis of (2)-(3) for particular values of p to be chosen, see [Hal+15, Section 6] or [Mün16, Subsection 2.2].
Definition 2.4 (Diffusion operator).
For p∈(1,∞) we define the continuous operators
[TABLE]
and
[TABLE]
With given diffusion coefficients d1,…,dm>0 we define the corresponding diffusion matrix in Rmd×md by
D=diag(d1,…,d1,…,dm,…,dm).
For p∈(1,∞) we set
[TABLE]
and define the unbounded operator
[TABLE]
The set ran(Ip) stands for the range of Ip. The domain dom(Ap) is equipped with the graph norm.
We introduce the notion of maximal parabolic regularity as in [Mün16, Definition 2.12].
Definition 2.5.
For p,q∈(1,∞) and (t0,T)⊂R we say that Ap satisfies maximal parabolic Lq((t0,T);WΓD−1,p(Ω))-regularity if for all
g∈Lq((t0,T);WΓD−1,p(Ω)) there is a unique solution y∈W1,q((t0,T);WΓD−1,p(Ω))∩Lq((t0,T);dom(Ap)) of the equation
\[ \dot{y} + A_p y = g, \qquad y(t_0) = 0. \]
The time derivative is taken in the sense of distributions [Aus+14, Definition 11.2].
For t∈[0,T] we abbreviate
Yq:=W1,q((0,T);WΓD−1,p(Ω))∩Lq((0,T);dom(Ap)),
Yq,t:={y∈Yq:y(t)=0}
and
Remark 2.6.
1. If Definition 2.5 applies for Ap with some p∈(1,∞), then the property of maximal parabolic regularity is independent of q∈(1,∞) and of the interval (t0,T), so we just say that Ap satisfies maximal parabolic regularity on WΓD−1,p(Ω) in this case.
2. If Ap satisfies maximal parabolic regularity on WΓD−1,p(Ω) for some p∈(1,∞), then (d/dt+Ap)^{−1} is bounded from Lq((0,T);WΓD−1,p(Ω)) to Yq,0 for any q∈(1,∞).
3. In the setting of Assumption 2.2 there is an open interval J containing 2 such that for p∈J the operator Ap+Ip is a topological isomorphism and such that −Ap generates an analytic semigroup of operators on WΓD−1,p(Ω) [Mün16, Theorem 2.10] or [Hal+15, Theorem 5.6 and Theorem 5.12].
4. If p∈J and if θ≥0 is given, then for Ap+1:=Ap+Id the fractional power spaces Xθ:=dom([Ap+1]θ)⊂WΓD−1,p(Ω) and the unbounded operators [Ap+1]θ in the sense of [H81, Chapter 1] are well-defined with X0=WΓD−1,p(Ω). Xθ is equipped with the norm ∥x∥Xθ=∥(Ap+1)θx∥WΓD−1,p(Ω) [cf. Mün16, Remark 2.11]. Note that we can identify X1 with the space dom(Ap) endowed with the graph norm.
5. If p∈J∩[2,∞), then Ap satisfies maximal parabolic Sobolev regularity on WΓD−1,p(Ω) and we have the topological equivalences [WΓD−1,p(Ω),WΓD1,p(Ω)]θ≃[WΓD−1,p(Ω),dom(Ap)]θ≃Xθ for all θ∈(0,1) [CA01, Theorem 11.6.1]. By [⋅,⋅]θ we mean complex interpolation.
for every 0<θ<η<1−1/q and 0≤β<1−1/q−η. (⋅,⋅)η,1 or (⋅,⋅)η,q respectively means real interpolation. The first embeddings are compact because dom(Ap) is compactly embedded into WΓD−1,p(Ω).
With p∈J, the following estimate for the fractional powers of Ap+1 and the analytic semigroup exp(−Apt) is crucial:
Remark 2.8.
[Mün16, Remark 2.15]
Let p∈J with J from Remark 2.6. For t>0
and arbitrary γ∈(0,1) and θ≥0
there exists some
Cθ∈(0,∞) such that
[TABLE]
The stop operator has the following regularity properties.
Lemma 2.9 (Stop operator).
With T>0 the stop operator W, which is represented by (4)-(5), is Lipschitz continuous as a mapping on C[0,T] and
[TABLE]
for all v,v1,v2∈C[0,T] and t∈[0,T]. Note that we have to add ∣z0∣ in (7) because, by (5), W[v](0)=z0 for any v∈C[0,T].
For q∈[1,∞) it is also bounded and weakly continuous on W1,q(0,T). W:C[0,T]→Lq(0,T) is Hadamard directionally differentiable, see Definition 2.11 below. The same regularity properties hold for the operator P=Id−W. P is a scalar play operator. More precisely, for r=(b−a)/2 let Pr:C[0,T]×R→C[0,T] denote a symmetrical scalar play operator (as in [BK15]).
Consider the affine linear transformation
T:[−r,r]→[a,b], T:x↦x−(b+a)/2.
Then for v∈C[0,T] there holds
[TABLE]
Proof.
Follows from [Mün16, Subsection 2.4 and Subsection 4.2], see also [Vis13, Part 1, Chapter III] and [BK15].
∎
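The Lipschitz continuity on C[0,T] and the relation P=Id−W can be checked numerically on sampled inputs. The sketch below is an illustration under assumptions: it uses a time-discrete projection scheme for the stop and tests the classical bound |W[v1](t)−W[v2](t)|≤2 max|v1−v2| for equal initial values, which need not coincide with the constants in (7). All function names and the random test data are hypothetical.

```python
import random


def stop_operator(v, z0, a, b):
    """Time-discrete scalar stop via projection of the incremented state."""
    z = [min(max(z0, a), b)]
    for k in range(1, len(v)):
        z.append(min(max(z[-1] + v[k] - v[k - 1], a), b))
    return z


def play_operator(v, z0, a, b):
    """Complementary play P = Id - W, evaluated on the same time grid."""
    return [vk - zk for vk, zk in zip(v, stop_operator(v, z0, a, b))]


def sup_dist(x, y):
    """Discrete sup-norm distance of two sampled functions."""
    return max(abs(xk - yk) for xk, yk in zip(x, y))


random.seed(0)
a, b, z0 = -1.0, 1.0, 0.0
worst_ratio = 0.0
for _ in range(200):
    # random-walk input and a slightly perturbed copy
    v1 = [0.0]
    for _ in range(100):
        v1.append(v1[-1] + random.uniform(-0.5, 0.5))
    v2 = [x + random.uniform(-0.1, 0.1) for x in v1]
    dv = sup_dist(v1, v2)
    dz = sup_dist(stop_operator(v1, z0, a, b), stop_operator(v2, z0, a, b))
    worst_ratio = max(worst_ratio, dz / dv)

print(worst_ratio)  # observed Lipschitz ratio, bounded by 2
```

The factor 2 arises because the discrete play update is non-expansive in the sup norm and W=Id−P.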
2.3 Assumptions and notation
Our main assumption is the following:
Assumption 2.10 (Main assumption).
[Mün16, Assumptions 2.16, 4.6 and 5.1]
We always suppose that Assumption 2.2 holds.
Moreover we assume:
(A1)
Dimension and Sobolev exponent: d≥2 and with J from Remark 2.6 there holds p∈J∩[2,∞) and 2≥p(1−1/d).
(A2)
Nonlinearity locally Lipschitz + Hadamard: We will need a fractional power space Xα=dom([Ap+1]α) with exponent strictly smaller than one half. This fact is highlighted by a new parameter α which we use instead of θ∈[0,∞). For some α∈(0,1/2) suppose that the function
f:Xα×R→WΓD−1,p(Ω) is locally Lipschitz continuous with respect to the Xα-norm.
This means that given any y0∈Xα there is a constant L(y0) and a neighbourhood
V(y0)={y∈Xα:∥y−y0∥Xα≤δ∈(0,∞)} of y0 such that
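\[ \|f(y_1,x_1) - f(y_2,x_2)\|_{W^{-1,p}_{\Gamma_D}(\Omega)} \le L(y_0)\bigl(\|y_1-y_2\|_{X^{\alpha}} + |x_1-x_2|\bigr) \]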
for every y1,y2∈V(y0) and all x1,x2∈R.
f is assumed to be directionally differentiable and therefore Hadamard directionally differentiable, see Definition 2.11 below. Furthermore, the linear growth condition
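\[ \|f(y,x)\|_{W^{-1,p}_{\Gamma_D}(\Omega)} \le M\bigl(1 + \|y\|_{X^{\alpha}} + |x|\bigr) \qquad \text{for all } y\in X^{\alpha},\ x\in\mathbb{R} \]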
holds for some M>0.
(A3)
Scalar projection: For some w∈WΓD1,p′(Ω)\{0} the operator S∈[WΓD−1,p(Ω)]∗ in equation (3) is given by
Sy=⟨y,w⟩WΓD1,p′(Ω)∀y∈WΓD−1,p(Ω).
We assume that w is even contained in the space dom([(1+Ap)1−α]∗). Note that S belongs to [Xθ]∗ for all θ≥0 because of the embedding
Xθ↪WΓD−1,p(Ω).
(A4)
Desired state: The desired state yd in (1) is in L2((0,T);[L2(Ω)]m).
We introduce some more notation for the rest of the work:
(N1)
For the particular p from (A1) in Assumption 2.10 we set
X:=WΓD−1,p(Ω)
with WΓD−1,p(Ω) from Definition 2.3.
We sometimes identify elements v∈X∗ with their Riesz representation in WΓD1,p′(Ω), i.e.
⟨v,y⟩X=⟨y,v⟩WΓD1,p′(Ω),∀y∈X.
(N2)
The operators Ap and the spaces Xθ=dom([Ap+1]θ) are defined as in Definition 2.4 and Remark 2.6.
(N3)
The spaces Yq, Yq,t and Yq,t∗ are defined as in Definition 2.5.
(N4)
B1 is defined by
B1:[L2(Ω)]m→X, ⟨B1u,v⟩WΓD1,p′(Ω):=∫Ω u⋅v dx ∀v∈WΓD1,p′(Ω).
(N5)
Since 2≥p(1−1/d), the embeddings
L2(ΓNj,Hd−1)↪WΓDj−1,p(Ω) are continuous for j∈{1,…,m} [Hal+15, Remark 5.11].
Therefore also the operator
B2:∏j=1mL2(ΓNj,Hd−1)→X,⟨B2y,v⟩W1,p′(Ω):=∑j=1m∫ΓNjyjvjdHd−1∀v∈WΓD1,p′(Ω)
is continuous.
(N6)
We write JT=(0,T), U1=L2(JT;[L2(Ω)]m)
and
U2=L2(JT;∏j=1mL2(ΓNj,Hd−1)).
2.4 Solution operator and optimal control
As in [Mün16, Equation (1)] we denote F[y](t):=f(y(t),W[Sy](t)) and introduce the more general abstract evolution equation
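\[ \dot{y}(t) + (A_p y)(t) = (F[y])(t) + u(t) \quad \text{in } X \text{ for } t\in J_T, \qquad y(0) = 0. \tag{8} \]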
Note that F[y] is non-local in time.
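As a rough illustration of how the non-local term F[y] enters a time-stepping scheme, the following sketch simulates a scalar toy analogue of the state equation on the interval (0,1). Everything specific here is an assumption made for illustration only: the explicit finite-difference discretization, the reaction term f(y,z)=−y+z, the choice of S as the spatial mean, the control, and all parameter values. It is not the setting of Assumption 2.10.

```python
import math


def simulate(n=20, steps=2000, T=1.0, d=0.1, a=-0.5, b=0.5, z0=0.0):
    """Explicit Euler / finite differences for the toy problem
        y_t - d * y_xx = f(y, z) + u,   z = stop[S y],   y(0) = 0,
    on (0, 1) with homogeneous Dirichlet boundary values."""
    h = 1.0 / (n + 1)
    dt = T / steps
    assert dt <= h * h / (2.0 * d), "explicit scheme needs dt <= h^2 / (2 d)"

    def f(y, z):
        return -y + z          # hypothetical globally Lipschitz reaction term

    def S(y):
        return h * sum(y)      # hypothetical scalar projection: spatial mean

    y = [0.0] * n
    z = min(max(z0, a), b)
    s_old = S(y)
    z_hist = [z]
    for k in range(steps):
        u = 5.0 * math.sin(2.0 * math.pi * k * dt)  # control, constant in space
        y_new = []
        for i in range(n):
            left = y[i - 1] if i > 0 else 0.0
            right = y[i + 1] if i < n - 1 else 0.0
            lap = (left - 2.0 * y[i] + right) / (h * h)
            y_new.append(y[i] + dt * (d * lap + f(y[i], z) + u))
        y = y_new
        s_new = S(y)
        # the stop is driven by the increments of S y, so z carries memory
        z = min(max(z + s_new - s_old, a), b)
        s_old = s_new
        z_hist.append(z)
    return y, z_hist


y_final, z_hist = simulate()
print(min(z_hist), max(z_hist))
```

The hysteresis variable z is updated from the increments of Sy and then fed back into the reaction term, which is the discrete counterpart of the dependence of F[y](t) on the history of y.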
In order to obtain some kind of differentiability of the reduced cost function, the solution operator of the state equation has to be differentiable in a sense which allows for the chain rule.
We cannot expect a Fréchet derivative because of the non-smooth hysteresis operator, see [BK15].
But the chain rule can also be applied within the weaker concept of Hadamard directional differentiability.
Definition 2.11 (Hadamard directional differentiability).
Let X,Y be normed vector spaces and let U⊂X be open. If g:U→Y is directionally differentiable at x∈U and if in addition for all functions r:[0,λ0)→X with lim_{λ→0} r(λ)/λ=0 it holds
\[ g'[x;h] = \lim_{\lambda\downarrow 0} \frac{g\bigl(x+\lambda h + r(\lambda)\bigr) - g(x)}{\lambda} \]
for all directions h∈X, we call g′[x;h] the Hadamard directional derivative of g at x in the direction h.
Note that g(x+λh+r(λ)) is only well defined if λ is already small enough so that x+λh+r(λ)∈U. The chain rule applies for Hadamard directionally differentiable functions [Mün16, Lemma 4.3].
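A one-dimensional example may help to fix ideas. The sketch below evaluates the difference quotients from Definition 2.11 for g(x)=|x| at x=0 with the perturbation r(λ)=λ²; both choices, as well as the helper name `hadamard_quotients`, are assumptions made for illustration.

```python
def hadamard_quotients(g, x, h, r, exponents=range(1, 7)):
    """Difference quotients (g(x + lam*h + r(lam)) - g(x)) / lam for lam = 10^-k."""
    out = []
    for k in exponents:
        lam = 10.0 ** (-k)
        out.append((g(x + lam * h + r(lam)) - g(x)) / lam)
    return out


def r(lam):
    """Perturbation with r(lam) / lam -> 0 as lam -> 0."""
    return lam ** 2


q_plus = hadamard_quotients(abs, 0.0, 1.0, r)    # tends to |1| = 1
q_minus = hadamard_quotients(abs, 0.0, -2.0, r)  # tends to |-2| = 2
print(q_plus[-1], q_minus[-1])
```

The quotients approach |h|, so g′[0;h]=|h| exists for every direction but is not linear in h. The absolute value is therefore Hadamard directionally differentiable at 0 without being Gâteaux differentiable there.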
Hadamard directional differentiability of the solution operator G is shown in [Mün16].
By [Mün16, Theorem 3.1 and Theorem 4.7] we have:
Theorem 2.12 (Solution operator for the state equation).
Let Assumption 2.10 hold. Then for the fixed value α∈(0,1/2) and for all u∈Lq(JT;X) with q∈(1/(1−α),∞] problem (8) has a unique mild solution
y=y(u)=:yu in C(JT;Xα).
In particular, this means that (F[y])+u is contained in L1(JT;X) and that y solves the integral equation
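\[ y(t) = \int_0^t \exp\bigl(-A_p(t-s)\bigr)\bigl[(F[y])(s) + u(s)\bigr]\,ds \qquad \text{for all } t\in[0,T]. \]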
The solution mapping
G:u↦y(u),Lq(JT;X)→C(JT;Xα)
is locally Lipschitz continuous.
G is linearly bounded with values in C(JT;Xα).
All statements remain valid if C(JT;Xα) is replaced by
Ys,0
where s=q if q<∞ and with s∈(1,∞) arbitrary if q=∞.
G is Hadamard directionally differentiable as a mapping into C(JT;Xα) as well as into Yq,0 for any q∈(1/(1−α),∞).
Its derivative yu,h:=G′[u;h] at u∈Lq(JT;X) in direction h∈Lq(JT;X) is given by the unique solution ζ of
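\[ \dot{\zeta}(t) + (A_p\zeta)(t) = F'[y;\zeta](t) + h(t) \quad \text{in } X \text{ for } t\in J_T, \qquad \zeta(0) = 0, \]
where F′[y;ζ](t)=f′[(y(t),W[Sy](t));(ζ(t),W′[Sy;Sζ](t))] and y=G(u).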
The mapping h↦G′[u;h] is Lipschitz continuous from Lq(JT;X) to C(JT;Xα) and to Yq,0 with a modulus C=C(G(u),T)>0.
Existence of an optimal control for problem (1)-(3) is shown in [Mün16, Theorem 5.4]:
Theorem 2.13 (Existence of optimal control).
Let Assumption 2.10 hold.
Then for i∈{1,2}, there exists an optimal control u∈Ui for the optimal control problem (1)-(3). This means that u, together with the optimal state y=G(u), which solves (2), are a solution of the minimization problem (1). The solution of (3) is given by z=W[Sy].
In order to derive an adjoint system for problem (1)-(3) we introduce a sequence of control problems with regularized ε-dependent state equations, for which we can derive adjoint systems.
To this aim we regularize the variational inequality which defines W and the non-linearity f, which yields a regularization of the solution operator of (8).
The regularization of W follows the techniques in [BK13, Section 3] and the approach for the regularization of semilinear parabolic equations relies on [MS15, Section 4].
In the end of Subsection 3.1, we estimate the norms of the solutions of the regularized state equations against the forcing term u, independently of ε.
The dynamics of the regularized state equations in dependence of ε is analyzed in Subsection 3.2: The estimates from Subsection 3.1 together with a weak compactness argument imply weak compactness of the regularized solution operators for fixed ε>0. This yields weakly converging subsequences yεk and zεk for any weakly converging sequence uε, ε→0.
In Subsection 3.3, we apply the convergence result from Subsection 3.2 to deduce convergence of the solutions of regularized control problems, which are introduced in Subsection 3.3, to an optimal solution of problem (1)-(3) as ε→0, see Theorem 3.9.
The adjoint equations for the solutions of the regularized control problems with ε>0 fixed are derived in Subsection 3.5, see Theorem 3.13 below.
In Subsection 3.6, we derive uniform-in-ε bounds for the norms of the adjoint variables pε,qε from Theorem 3.13.
The norm bounds on pε,qε from Subsection 3.6 give rise to weakly converging subsequences pεk and qεk.
Taking the limit k→∞ then yields an adjoint system for (1)-(3). This step is carried out in Section 4.
We begin with several assumptions on the functions which will enter the regularized problems.
Assumption 3.1 (Regularization).
For ε∗>0 and ε∈(0,ε∗] we assume that:
(A1)ε
fε:Xα×R→X is Gâteaux differentiable.
(A2)ε
sup(y,z)∈Xα×R∥fε(y,z)−f(y,z)∥X→0 as ε→0.
(A3)ε
fε is locally Lipschitz continuous with respect to the Xα-norm and all the neighbourhoods and Lipschitz constants are equal to the ones of f in (A2) in Assumption 2.10, independently of ε.
The growth condition
∥fε(y,x)∥X≤M(1+∥y∥Xα+∣x∣)
holds for all y∈Xα and x∈R, with M from (A2) in Assumption 2.10.
(A4)ε
Following the ideas of [BK13], we introduce a convex function Ψ:R→R with Ψ(x)≡0 for x∈[a,b] and Ψ(x)>0 for x∈R\[a,b]. We assume that Ψ is twice continuously differentiable and Ψ′(x)≤m1∣x−a∣ for some m1>0 and all x∈R. Moreover, Ψ′′(x)≤m2 for some m2>0 and all x∈R and Ψ′′ is assumed to be locally Lipschitz continuous.
Remark 3.2.
A function Ψ as in Assumption 3.1 can be constructed as a piecewise defined mapping
Ψ=χ(−∞,a1]Ψ−2+χ(a1,a]Ψ−1+χ(b,b1]Ψ1+χ(b1,∞)Ψ2,
where a1<a<b<b1. χ denotes the characteristic function.
Ψ−2 and Ψ2 are affine linear and Ψ−1 and Ψ1 are polynomials of degree four with roots at a and b, which are at the same time saddle points, and with inflection points at a1 and b1.
For example we can choose b1:=b+2,
Ψ1(x):=(x−b)3(4+b−x)
and
Ψ2(x):=16(x−1−b)
and define Ψ−1,Ψ−2 in a similar way, cf. Figure 1.
Local Lipschitz continuity of Ψ′′ also in the points where the functions Ψ−2,…,Ψ2 are glued together is not hard to see. It follows that Ψ′′ is Lipschitz continuous.
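The gluing conditions in this construction can be verified numerically. The sketch below implements Ψ with Ψ1 and Ψ2 as given above and, as an assumption, mirrored branches Ψ−1(x)=(a−x)³(4+x−a) and Ψ−2(x)=16(a−1−x) with a1=a−2. It returns Ψ, Ψ′ and Ψ′′ so that continuity across the junctions can be checked. The function name `make_psi` is hypothetical.

```python
def make_psi(a, b):
    """Piecewise penalty from Remark 3.2 with b1 = b + 2 and, by assumption,
    the mirrored left branches with a1 = a - 2. Returns (Psi, Psi', Psi'')."""

    def branch(t):
        # value, first and second derivative of t^3 (4 - t) on (0, 2],
        # continued affinely by 16 (t - 1) for t > 2
        if t > 2.0:
            return 16.0 * (t - 1.0), 16.0, 0.0
        return t ** 3 * (4.0 - t), 12.0 * t ** 2 - 4.0 * t ** 3, 12.0 * t * (2.0 - t)

    def psi(x):
        if x > b:
            return branch(x - b)[0]
        if x < a:
            return branch(a - x)[0]
        return 0.0

    def dpsi(x):
        if x > b:
            return branch(x - b)[1]
        if x < a:
            return -branch(a - x)[1]  # chain rule for the mirrored branch
        return 0.0

    def d2psi(x):
        if x > b:
            return branch(x - b)[2]
        if x < a:
            return branch(a - x)[2]
        return 0.0

    return psi, dpsi, d2psi


psi, dpsi, d2psi = make_psi(-1.0, 1.0)
for x0 in (-3.0, -1.0, 1.0, 3.0):
    jumps = [abs(f(x0 - 1e-7) - f(x0 + 1e-7)) for f in (psi, dpsi, d2psi)]
    print(x0, max(jumps))  # all jumps are of the order of the step 1e-7
```

On the quartic pieces Ψ′′ equals 12t(2−t) in the shifted variable t, so 0≤Ψ′′≤12. This confirms convexity and gives m2=12 for this particular choice.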
We introduce the following regularized state equations for i∈{1,2} and ε>0:
[TABLE]
3.1 Regularization of (8) and uniform-in-ε estimates
In this subsection, we introduce a regularization of (8), similar to the regularized state equations (10)-(11) but for source terms u∈Lq(JT;X). We show well-posedness and estimate the norms of the solutions in u, independently of ε. The ideas for many of the steps in this subsection go back to [BK13, Subsection 3.1].
Definition 3.3 (Regularized stop).
For ε∈(0,ε∗] we denote by Zε:v↦Zε(v) the solution operator of
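\[ \dot{z}(t) = \dot{v}(t) - \frac{1}{\varepsilon}\Psi'\bigl(z(t)\bigr) \quad \text{for } t\in J_T, \qquad z(0) = z_0, \]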
or of the corresponding integral equation. The input v is a function defined on JT.
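The effect of the penalization can be observed numerically. The sketch below integrates ż=v̇−(1/ε)Ψ′(z), z(0)=z0, which is the form of the regularized equation used in the proof of Lemma 3.6, by the explicit Euler method. The penalty Ψ from Remark 3.2 with mirrored left branches, the input v(t)=1.5 sin t, the step size and all function names are assumptions chosen for illustration.

```python
import math


def dpsi(x, a, b):
    """Psi' for the penalty of Remark 3.2 (mirrored left branch assumed)."""
    if x > b:
        t = min(x - b, 2.0)  # beyond b + 2 the derivative is constant (= 16)
        return 12.0 * t ** 2 - 4.0 * t ** 3
    if x < a:
        t = min(a - x, 2.0)
        return -(12.0 * t ** 2 - 4.0 * t ** 3)
    return 0.0


def regularized_stop(v, dt, eps, z0, a, b):
    """Explicit Euler for z' = v' - Psi'(z) / eps, z(0) = z0 (needs dt << eps)."""
    z = [z0]
    for k in range(1, len(v)):
        z.append(z[-1] + (v[k] - v[k - 1]) - dt / eps * dpsi(z[-1], a, b))
    return z


def overshoot(z, a, b):
    """Largest distance of the trajectory from the interval [a, b]."""
    return max(max(zk - b, a - zk, 0.0) for zk in z)


dt = 1.0e-3
n = int(2.0 * math.pi / dt)
v = [1.5 * math.sin(k * dt) for k in range(n + 1)]

results = {}
for eps in (0.08, 0.02):
    z = regularized_stop(v, dt, eps, 0.0, -1.0, 1.0)
    results[eps] = overshoot(z, -1.0, 1.0)
    print(eps, results[eps])
```

In contrast to the stop operator, the regularized output may leave [a,b], but the violation shrinks as ε decreases. This matches the role of Zε as a smooth approximation of W.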
Remark 3.4.
By standard techniques it follows that Zε is continuously differentiable on C(JT). Its derivative at v in direction h is given by the unique solution Zε′[v;h]=z of the integral equation
\[ z(t) = h(t) - \int_0^t \frac{1}{\varepsilon}\Psi''\bigl(Z_\varepsilon(v)(s)\bigr)\, z(s)\,ds. \]
Zε is bounded on W1,q(JT) for all q∈(1,∞).
Similar to the definition of F in Subsection 2.4 we denote (Fε(y))(t):=fε(y(t),Zε(Sy)(t)).
Consider the abstract evolution equation
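\[ \dot{y}(t) + (A_p y)(t) = (F_\varepsilon(y))(t) + u(t) \quad \text{in } X \text{ for } t\in J_T, \qquad y(0) = 0. \tag{12} \]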
Corollary 3.5 (Existence of regularized problem).
Let Assumption 2.10 and Assumption 3.1 hold and let ε∈(0,ε∗] be arbitrary. Furthermore, assume q∈(1/(1−α),∞] and set s=q if q<∞ or s∈(1,∞) arbitrary if q=∞.
Then for all u∈Lq(JT;X) problem (12) has a unique solution yε(u) in Ys,0.
The solution mapping
Gε:u↦yε(u)=:yεu
is locally Lipschitz continuous from Lq(JT;X) to C(JT;Xα) and to Ys,0. We denote zεu:=zε(u):=Zε(Syεu).
Proof.
Unique solvability of (12) and local Lipschitz continuity of the solution mapping follow because Zε satisfies the properties of W in Theorem 2.12.
∎
In the next step we estimate the norms of the solutions of (12) independently of ε by the norm of the source function u∈Lq(JT;X). This yields analogous estimates also for the solutions of (10)-(11) if u is replaced by Biu.
Lemma 3.6 (Uniform bounds).
Adopt the assumptions and the notation from Corollary 3.5.
There exists a constant c>0 which is independent of ε and u such that the following holds true. For all q∈(1/(1−α),∞] and ε∈(0,ε∗] we have
[TABLE]
with s=q if q<∞ and for all s∈(1,∞) if q=∞.
Moreover, there holds
[TABLE]
Proof.
Note first that for v∈W1,s(JT) and for t∈JT we have
[TABLE]
Note that Ψ′(x)(x−z0)≥0 for all x∈R because Ψ is convex and since Ψ′(z0)=0.
We insert Zε(v)(0)=z0 and (d/ds)Zε(v)=v˙−(1/ε)Ψ′(Zε(v)) according to Definition 3.3.
The triangle inequality and rearranging yields
[TABLE]
Hence, with zεu=Zε(Syεu) and v=Syεu there follows
[TABLE]
Because the Riesz representation w of S is contained in dom([(Ap+1)1−α]∗) by (A3) in Assumption 2.10, we can estimate for all y∈dom(Ap):
[TABLE]
For y=Syεu(t), this together with (12) and the triangle inequality implies that for a.e. t∈JT
[TABLE]
Consequently, by the linear growth condition on fε in (A3)ε of Assumption 3.1 we further estimate (15) by
[TABLE]
Remember that yεu(0)=0 for any ε∈(0,ε∗].
Since yεu is the mild solution of (12), we can use (6) for arbitrary γ∈(0,1) and again the linear growth condition on fε to obtain
[TABLE]
Note that
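\[ \Bigl(\int_0^t (t-s)^{-\alpha q'}\,ds\Bigr)^{1/q'} = \Bigl(\frac{t^{1-\alpha q'}}{1-\alpha q'}\Bigr)^{1/q'} = \frac{t^{1/q'-\alpha}}{(1-\alpha q')^{1/q'}}, \]
where 1−αq′>0 since q>1/(1−α) ⇔ 1/q′−α>0.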
We sum up the estimates for ∣zεu(t)∣ and ∥yεu(t)∥Xα and apply Gronwall’s Lemma to arrive at
[TABLE]
for all q∈(1/(1−α),∞] and a constant c3>0 which depends on T, q′ and α but not on ε and u.
By maximal parabolic regularity of Ap, see Remark 2.6, one obtains
[TABLE]
for s=q if q∈(1/(1−α),∞) and for all s∈(1,∞) if q=∞, again for some c4>0 which is independent of ε and u. This shows (13).
We are left to prove (14).
Note that 2>1/(1−α) by (A2) in Assumption 2.10.
Because S∈X∗, (13) yields
∥Sy˙εu∥L2(JT)≤c5(1+∥u∥L2(JT;X))
for c5=c4∥S∥X∗.
We test the equation for z˙εu in Definition 3.3 with z˙εu, integrate from zero to t and use Young’s inequality to compute for t∈JT:
[TABLE]
Since Ψ(zεu(0))=0 and because Ψ≥0 it follows
[TABLE]
∎
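The displayed estimates of the proof are not reproduced in this text. As a hedged sketch of the last step, assume that Definition 3.3 gives ż = v̇ − (1/ε)Ψ′(z) with z=zεu and v=Syεu, as inserted earlier in the proof. Multiplying by ż, integrating and applying Young's inequality then gives an estimate of the following type; the constants are illustrative and need not coincide with those in (14).

```latex
\begin{align*}
\int_0^t |\dot z(s)|^2\,ds+\frac{1}{\varepsilon}\bigl(\Psi(z(t))-\Psi(z(0))\bigr)
 &=\int_0^t \dot v(s)\,\dot z(s)\,ds
 \le \frac12\int_0^t |\dot v(s)|^2\,ds+\frac12\int_0^t |\dot z(s)|^2\,ds,\\
\text{hence}\qquad
\frac12\int_0^t |\dot z(s)|^2\,ds+\frac{1}{\varepsilon}\Psi(z(t))
 &\le \frac12\,\|S\dot y_\varepsilon^u\|_{L^2(J_T)}^2
 \le \frac{c_5^2}{2}\bigl(1+\|u\|_{L^2(J_T;X)}\bigr)^2 ,
\end{align*}
```

where the second line uses Ψ(z(0))=0 and the bound on Sy˙εu in L2(JT) derived above.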
The estimates which we derived in this subsection are crucial for Subsection 3.2.
3.2 Dynamics of the regularized states
This subsection contains ideas from [MS15, Section 4] and [BK13, Section 3.1].
We prove weak continuity of the solution operator of (12) for fixed ε∈(0,ε∗].
This yields weakly converging subsequences yεk and zεk for any weakly converging sequence uε, ε→0.
All results then also hold for the regularized state equations (10)-(11).
Using this, we are able to prove convergence of the solutions of the regularized control problems, as defined in Subsection 3.3, to an optimal solution of problem (1)-(3) with ε→0.
The following lemma is proved as [Mün16, Lemma 5.3].
Lemma 3.7.
Let Assumption 2.10 and Assumption 3.1 hold and consider the notation from Lemma 3.5. Suppose that un⇀u in L2(JT;X) with n→∞ for some sequence {un}⊂L2(JT;X).
For ε∈(0,ε∗] fixed consider the solutions yεun and yεu of (12), together with zεun and zεu. Then yεun→yεu with n→∞ weakly in Y2,0 and strongly in C(JT;Xα) and zεun→zεu with n→∞ weakly in H1(JT) and strongly in C(JT).
If the convergence of {un} is strong then the convergence of {yεun} in Y2,0
is also strong.
The same holds if L2(JT;X) is replaced by Ui for i∈{1,2} and if un and u are replaced by Biun and Biu. In this case, (yεBiun,zεBiun) and (yεBiu,zεBiu) are the solutions of (10)-(11).
Furthermore, we have the following convergence result:
Lemma 3.8.
Let Assumption 2.10 and Assumption 3.1 hold and consider the notation from Lemma 3.5. Suppose that uε⇀u in L2(JT;X) as ε→0.
Consider the solutions yεuε and yεu of (12), together with zεuε and zεu.
Then yεuε→yu with ε→0 weakly in Y2,0 and strongly in C(JT;Xα) and zεuε→W[Syu] with ε→0 weakly in H1(JT) and strongly in C(JT).
If the convergence of {uε} is strong then also the convergence of {yεuε} in Y2,0 is strong.
The same holds if L2(JT;X) is replaced by Ui for i∈{1,2} and if uε and u are replaced by Biuε and Biu.
In this case, (yεBiuε,zεBiuε) and (yεBiu,zεBiu) are the solutions of (10)-(11).
Proof.
The proof combines the proofs of [BK13, Lemma 3.2] and [Mün16, Lemma 5.3].
By Lemma 3.6 we obtain a bound for yεuε in Y2,0 and for zεuε in H1(JT) which is independent of ε∈(0,ε∗]. Hence, there exists a subsequence {εk} of the sequence {ε} and functions y~∈Y2,0 and z~∈H1(JT) to which yεk(uεk) and zεk(uεk) converge weakly in Y2,0 and H1(JT) and strongly in C(JT;Xα) and C(JT) with k→∞.
We abbreviate yεk:=yεk(uεk) and zεk:=zεk(uεk).
(14) implies that Ψ(zεk(t))→0 with k→∞ for t∈JT. By (A4)ε in Assumption 3.1 this yields z~(t)∈[a,b] for t∈JT. For any x∈R and ξ∈[a,b] there holds Ψ′(x)(x−ξ)≥0 because Ψ is convex and since Ψ′(ξ)=0 for ξ∈[a,b]. For any ξ∈[a,b] we therefore have
[TABLE]
Taking the limit k→∞ yields z~=W[Sy~] since z~ solves (4)-(5) with v=Sy~.
Weak continuity of d/dt and Ap implies
(d/dt)yεk+Apyεk⇀(d/dt)y~+Apy~ in L2(JT;X) with k→∞.
For εk small enough we estimate with (A3)ε in Assumption 3.1:
[TABLE]
Because the right side converges to zero, we conclude that Fεk[yεk] converges to F[y~] in C(JT;X) with k→∞.
This together with z~=W[Sy~] yields y~=G(u). Uniqueness of the limit implies convergence of the whole sequence.
The statement about strong convergence follows essentially the same way as in [Mün16, Lemma 5.3].
∎
3.3 The regularized optimal control problem
In this subsection, we introduce regularized optimal control problems. It still requires work to get adjoint systems for those problems.
Nevertheless, we can exploit linearity of the derivatives of the solution operators of (10)-(11) to derive adjoint systems by a direct approach. This will be done in Subsection 3.5 below.
We follow the ideas in [BK13, Section 3.2] and [MS15, Section 4] in this subsection.
For i∈{1,2} consider an optimal control u∈Ui of problem (1)-(3) together with the state y=G(Biu) and z=W[Sy]. Existence of u follows from Theorem 2.13.
For ε∈(0,ε∗] we introduce the regularized optimal control problem
Theorem 3.9 (Convergence of optimal solutions).
Let Assumption 2.10 and Assumption 3.1 hold. For i∈{1,2} suppose that u∈Ui is an optimal control for problem (1)-(3).
Then for all ε∈(0,ε∗] problem (10),(11),(16) has an optimal control uε∈Ui.
This means that uε, together with yε=Gε(Biuε) and zε=Zε(Syε) (see Definition 3.3), are a solution of the minimization problem (16). Furthermore, uε→u in Ui, yε→y=G(Biu) in Y2,0 and in C(JT;Xα) and zε→z=W[Sy] weakly in H1(JT) and strongly in C(JT) with ε→0.
Proof.
First of all note that the embedding Y2,0↪U1 is continuous, because dom(Ap)≃WΓD1,p(Ω)↪L2(Ω).
Note also that u exists by Theorem 2.13. Existence of optimal controls uε for (10),(11),(16) follows essentially the same way as for problem (1)-(3) by using Lemma 3.7, see also Theorem 2.13.
For all ε∈(0,ε∗], we deduce from optimality of (yε,zε,uε) for problem (10),(11),(16) and of (y,z,u) for problem (1)-(3) that
[TABLE]
Moreover, by (13) in Lemma 3.6, Gε(Biu)∈Y2,0 is uniformly bounded for ε∈(0,ε∗] so that J(Gε(Biu),u)≤c for some constant c>0.
Hence,
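$c\ge J_{\mathrm{reg}}(y_\varepsilon,u_\varepsilon;u)=\frac12\|y_\varepsilon-y_d\|_{U_1}^2+\frac{\kappa}{2}\|u_\varepsilon\|_{U_i}^2+\frac12\|u_\varepsilon-u\|_{U_i}^2$ and the norms of uε in Ui are bounded from above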
independently of ε∈(0,ε∗].
Consequently, we can extract a subsequence {uεk} which converges weakly in Ui to some u~ with k→∞.
By Lemma 3.8, yεk→G(u~) with k→∞ weakly in Y2,0 and then also in U1.
Also by Lemma 3.8, Gε(Biu)→y with ε→0 strongly in Y2,0 and then in U1. Jreg is weakly lower semi-continuous. Hence, with (17) we obtain
[TABLE]
But this implies u~=u and that the convergence of {uεk} in Ui is strong.
Since the limit is uniquely determined by u, the whole sequence {uε} converges to u in Ui with ε→0.
All results then follow by applying the statement about strong convergence in Lemma 3.8.
∎
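The chain of inequalities behind the last steps is not reproduced in this text. Under the assumption that (17) is the optimality comparison Jreg(yε,uε;u)≤Jreg(Gε(Biu),u;u)=J(Gε(Biu),u), the argument can be sketched as follows, where ũ denotes the weak limit of {uεk}.

```latex
\begin{align*}
J(G(B_i\tilde u),\tilde u)+\tfrac12\|\tilde u-u\|_{U_i}^2
 &\le \liminf_{k\to\infty} J_{\mathrm{reg}}(y_{\varepsilon_k},u_{\varepsilon_k};u)
 \le \limsup_{k\to\infty} J_{\mathrm{reg}}(y_{\varepsilon_k},u_{\varepsilon_k};u)\\
 &\le \lim_{k\to\infty} J(G_{\varepsilon_k}(B_i u),u)
 = J(y,u)\le J(G(B_i\tilde u),\tilde u).
\end{align*}
```

The last inequality is the optimality of u for problem (1)-(3). All inequalities are therefore equalities, which forces ũ=u. Since the tracking term converges and the term with κ is weakly lower semi-continuous, the remaining term ½∥uεk−u∥² must tend to zero, which is the strong convergence in Ui.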
3.4 Gâteaux differentiability of the solution operator of the regularized state equation
In this subsection, we show that Gε is Gâteaux differentiable for all ε∈(0,ε∗].
Lemma 3.10 (Gâteaux differentiability of Gε).
Let Assumption 2.10 and Assumption 3.1 hold and take the notation from Lemma 3.5.
Then for any ε∈(0,ε∗] and q∈(1/(1−α),∞) the solution operator Gε:Lq(JT;X)→Yq,0 of problem (12) is Gâteaux differentiable.
The derivative Gε′[u;h] at u∈Lq(JT;X) in direction h∈Lq(JT;X) is given by yεu,h, where yεu,h together with z=zεu,h=Zε′[Syεu;Syεu,h]∈W1,q(JT) are the unique solution of
[TABLE]
For i∈{1,2} and u,h∈Ui the derivative of the solution mapping u↦Gε(Biu) at u in direction h is given by yεBiu,Bih, i.e. by the unique solution of (18) with h replaced by Bih and z=zεBiu,Bih=Zε′[SyεBiu;SyεBiu,Bih].
Proof.
Gε is Hadamard directionally differentiable because Zε satisfies the properties of W in Theorem 2.12.
Gâteaux differentiability then follows from linearity of all the derivatives. To see that zεu,h∈W1,q(JT), insert Syu,h for h in Remark 3.4 and note that the right side is contained in W1,q(JT).
∎
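On the scalar level, the Gâteaux derivative of a regularized stop operator can be checked numerically. The sketch below assumes that z_h=Zε′[v;h] solves the linearized equation ż_h = ḣ − (1/ε)Ψ″(z)z_h with z_h(0)=0, and it uses a hypothetical C² penalty Ψ(x)=((x−b)₊³+(a−x)₊³)/3 together with an explicit Euler scheme. None of these choices is taken from the paper; the names and parameter values are illustrative.

```python
import numpy as np

A, B, EPS = -1.0, 1.0, 0.1   # hypothetical interval [a, b] and regularization parameter

def dpsi(x):
    """Psi'(x) for the assumed penalty Psi(x) = ((x-b)_+^3 + (a-x)_+^3) / 3."""
    return max(x - B, 0.0) ** 2 - max(A - x, 0.0) ** 2

def d2psi(x):
    """Psi''(x) for the same penalty; nonnegative because Psi is convex."""
    return 2.0 * max(x - B, 0.0) + 2.0 * max(A - x, 0.0)

def solve_state(v, dt):
    """Explicit Euler for z' = v' - (1/eps) * Psi'(z), z(0) = 0."""
    z = np.zeros_like(v)
    for k in range(len(v) - 1):
        z[k + 1] = z[k] + (v[k + 1] - v[k]) - dt / EPS * dpsi(z[k])
    return z

def solve_linearized(z, h, dt):
    """Explicit Euler for z_h' = h' - (1/eps) * Psi''(z) * z_h, z_h(0) = 0."""
    zh = np.zeros_like(h)
    for k in range(len(h) - 1):
        zh[k + 1] = zh[k] + (h[k + 1] - h[k]) - dt / EPS * d2psi(z[k]) * zh[k]
    return zh

t = np.linspace(0.0, 1.0, 1001)
dt = t[1] - t[0]
v = 2.0 * np.sin(2.0 * np.pi * t)    # input
h = t.copy()                         # direction of differentiation
z = solve_state(v, dt)
zh = solve_linearized(z, h, dt)

delta = 1e-6
fd = (solve_state(v + delta * h, dt) - z) / delta   # one-sided difference quotient
fd_error = np.max(np.abs(fd - zh))
```

The linearized scheme is the exact derivative of the discrete state map, so the difference quotient and the linearized solution agree up to terms of order delta. The linearity of h↦z_h is what turns directional differentiability into Gâteaux differentiability in the proof above.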
3.5 Adjoint system for the regularized problem
In this section, we derive adjoint systems for the regularized problems (10),(11),(16) with ε∈(0,ε∗], see Theorem 3.13 below. We proceed in a similar way as in [BK13, Sections 3.3 and 3.5] and [MS15, Section 4].
The following estimates are needed.
Lemma 3.11.
Let Assumption 2.10 and Assumption 3.1 hold.
With a little abuse of notation we use the same symbol for the Nemitskii operator of fε, i.e. we write
fε:(y,z)↦fε(y(⋅),z(⋅)).
Then
fε is locally Lipschitz continuous and Gâteaux differentiable from C(JT;Xα)×Lq(JT) to Lq(JT;X) for all ε∈(0,ε∗] and q∈(1/(1−α),∞).
Moreover, the derivative fε′[(y,z);(⋅,⋅)] at (y,z)∈C(JT;Xα)×Lq(JT) is Lipschitz continuous with a modulus of the form K(y)=L(y)(1+T^{1/q}), where L(y)>0 only depends on y∈C(JT;Xα).
K(y) and L(y) are independent of ε and remain the same in a sufficiently small neighbourhood of y.
For (v,h)∈C(JT;Xα)×Lq(JT) we can estimate
[TABLE]
For a.e. t∈JT, there also holds the pointwise estimate
[TABLE]
Furthermore, (∂/∂y)fε(y,z)=(∂/∂y)fε(y(⋅),z(⋅)) is bounded by K(y) in L∞(JT;L(Xα,X)).
Moreover, (∂/∂z)fε(y,z)=(∂/∂z)fε(y(⋅),z(⋅)) is bounded by K(y) in L∞(JT;X).
Proof.
First of all, fε is locally Lipschitz continuous and Gâteaux differentiable from the space C(JT;Xα)×Lq(JT) to Lq(JT;X) for all ε∈(0,ε∗] and q∈(1/(1−α),∞).
This follows from Step 3 in the proof of [Mün16, Theorem 3.1] and Step 1 in the proof of [Mün16, Theorem 4.7]. We give a sketch of the proof:
One first makes use of (A3)ε in Assumption 3.1 to show that (y(⋅),v)↦fε(y(⋅),v) is locally Lipschitz continuous from C(JT;Xα)×R to C(JT;X) with respect to the C(JT;Xα)-norm.
The proof contains a pointwise estimate of the following form:
For y∈C(JT;Xα) and some neighbourhood BC(JT;Xα)(y,δ) of y there holds
[TABLE]
for all y1,y2∈BC(JT;Xα)(y,δ), z1,z2∈R and t∈JT and for some L(y)>0.
This local estimate leads to a pointwise estimate of the form
[TABLE]
for a.e. s∈JT, for any y1,y2∈BC(JT;Xα)(y,δ) and z1,z2∈Lq(JT).
By Minkowski’s inequality, fε is locally Lipschitz continuous from C(JT;Xα)×Lq(JT) to Lq(JT;X) with Lipschitz constants of the form K(y)=L(y)(1+T^{1/q}).
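The pointwise estimate itself is not reproduced in this text. Assuming it has the form ∥fε(y1(s),z1(s))−fε(y2(s),z2(s))∥X ≤ L(y)(∥y1−y2∥C(JT;Xα)+∣z1(s)−z2(s)∣) for a.e. s∈JT, the origin of the factor 1+T^{1/q} can be sketched as follows.

```latex
\begin{align*}
\|f_\varepsilon(y_1,z_1)-f_\varepsilon(y_2,z_2)\|_{L^q(J_T;X)}
 &\le L(y)\,\Bigl\|\,\|y_1-y_2\|_{C(J_T;X^\alpha)}+|z_1(\cdot)-z_2(\cdot)|\,\Bigr\|_{L^q(J_T)}\\
 &\le L(y)\Bigl(T^{1/q}\,\|y_1-y_2\|_{C(J_T;X^\alpha)}+\|z_1-z_2\|_{L^q(J_T)}\Bigr)\\
 &\le L(y)\bigl(1+T^{1/q}\bigr)\Bigl(\|y_1-y_2\|_{C(J_T;X^\alpha)}+\|z_1-z_2\|_{L^q(J_T)}\Bigr).
\end{align*}
```

The factor T^{1/q} is the Lq(JT)-norm of the constant function 1, which appears because the first summand does not depend on s.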
In a second step one shows that fε is directionally differentiable.
Convergence of the difference quotients
[TABLE]
for a.e. s∈JT and (y,z),(v,h)∈C(JT;Xα)×Lq(JT) follows from (A3)ε in Assumption 3.1.
Lebesgue’s dominated convergence theorem yields directional differentiability of fε from the space C(JT;Xα)×Lq(JT) to Lq(JT;X) and the bounds (20) and (21) for fε′[(y,z);(⋅,⋅)].
Linearity of the derivative and local Lipschitz continuity then already imply Gâteaux differentiability of fε.
Now for arbitrary y∈Xα with ∥y∥Xα=1, we choose the constant function v∈C(JT;Xα), v(t)=y for t∈JT and set h=0∈Lq(JT) in (21).
This implies that (∂/∂y)fε(y,z)=(∂/∂y)fε(y(⋅),z(⋅)) is bounded by K(y) in L∞(JT;L(Xα,X)).
Then we choose v=0∈C(JT;Xα), h∈Lq(JT), h(t)=c>0 for t∈JT in (21) and divide by c on both sides to prove that (∂/∂z)fε(y,z)=(∂/∂z)fε(y(⋅),z(⋅)) is bounded by K(y) in L∞(JT;X).
∎
The following lemma provides the main tool to derive adjoint systems for the regularized problems (10),(11),(16).
The hardest part in the proof is to find an explicit expression of the adjoint operator [Gε′[u;⋅]]∗:Yq,0∗→Lq′(JT;X∗) of Gε′[u;⋅] from Lemma 3.10. This comes from the fact that Gε′[u;⋅] is defined as the mapping which assigns to each h∈Lq(JT;X) the solution yεu,h∈Yq,0 of (18), which contains the solution zεu,h of (19) only implicitly.
Lemma 3.12.
Let Assumption 2.10 and Assumption 3.1 hold and adopt the notation from Lemma 3.10. For ε∈(0,ε∗] and any q∈(1/(1−α),∞), h∈Lq(JT;X) and ν∈Lq′(JT;[dom(Ap)]∗) there holds
[TABLE]
where
pεν∈Yq′,T∗
and qεν∈Lq′(JT) are the unique solution of
[TABLE]
and where yεu,h∈Yq,0 and zεu,h∈W1,q(JT) are
the unique solution of (18)-(19).
Moreover,
[TABLE]
for some constant C(yεu)>0. C(yεu) remains the same in a sufficiently small neighbourhood of yεu.
Proof.
Let q∈(1/(1−α),∞) be arbitrary. Consider the solution operator of
[TABLE]
which maps any v∈Lq(JT) to z∈W1,q(JT).
We denote by Tz,εu:Lq(JT)→Lq(JT),v↦Tz,εuv the corresponding operator
on Lq(JT).
Consider then the operator
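$T_{y,\varepsilon}^{u}:=A_p-\frac{\partial}{\partial y}f_\varepsilon(y_\varepsilon^u,z_\varepsilon^u)-\frac{\partial}{\partial z}f_\varepsilon(y_\varepsilon^u,z_\varepsilon^u)\,T_{z,\varepsilon}^{u}\,S\Bigl(-A_p+\frac{\partial}{\partial y}f_\varepsilon(y_\varepsilon^u,z_\varepsilon^u)\Bigr)$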
from Yq,0 to Lq(JT;X).
It follows as for the system (18)-(19) that for each h∈Lq(JT;X) there exists a unique couple of solutions (y~εu,h,z~εu,h) in Yq,0×Lq(JT) of the system
[TABLE]
This implies that (d/dt+Ty,εu)^{−1} is bijective from Lq(JT;X) to Yq,0. Note the difference between (23)-(24) and (18)-(19).
We identify z~εu,h∈Lq(JT) with the corresponding function in W1,q(JT) and estimate the norms of (y~εu,h,z~εu,h).
For t∈JT we have
[TABLE]
With (21) in Lemma 3.11 and (A3) in Assumption 2.10 it follows
[TABLE]
for a constant c>0 which is independent of ε.
Note that Ψ′′(zεu(s))≥0 because Ψ is convex.
Moreover, with (6) and again (21) in Lemma 3.11 we obtain
[TABLE]
Gronwall’s Lemma yields a constant C1(yεu)>0 which depends only on yεu∈C(JT;Xα) such that
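$\|\tilde y_\varepsilon^{u,h}\|_{C(J_T;X^\alpha)}\le C_1(y_\varepsilon^u)\,\|h\|_{L^q(J_T;X)}$ and $\|\tilde z_\varepsilon^{u,h}\|_{C(J_T)}\le C_1(y_\varepsilon^u)\,\|h\|_{L^q(J_T;X)}$
for q∈(1/(1−α),∞).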
Moreover, there holds C1(yεu)=C1(y) for ε small enough if {yεu} converges to y with ε→0. This is the case for the states yε in Theorem 3.9.
As several times before we use maximal parabolic regularity of Ap to obtain
∥y~εu,h∥Yq,0≤C2(yεu)∥h∥Lq(JT;X)
where C2(yεu)>0 has the same dependence on yεu as C1(yεu).
The inequalities in (22) are shown analogously to the estimates which we derived for (y~εu,h,z~εu,h).
We also conclude that there exists a constant C(yεu)>0 with
$\bigl\|(\tfrac{d}{dt}+T_{y,\varepsilon}^{u})^{-1}\bigr\|_{\mathcal{L}(L^q(J_T;X),\,Y_{q,0})}\le C(y_\varepsilon^u).$
This proves maximal parabolic Lq(JT;X)-regularity of Ty,εu for q∈(1/(1−α),∞).
For ε small enough, the values C(yεu) can also be chosen independently of ε if {yεu} converges to some y with ε→0, as is the case for the sequence {yε} in Theorem 3.9.
Maximal parabolic Lq(JT;X)-regularity of Ty,εu for q∈(1/(1−α),∞) implies maximal parabolic Lq′(JT;[dom(Ap)]∗)-regularity of [Ty,εu]∗
[MS15, Lemma 4.10]. To derive a representation of [Ty,εu]∗, we collect some information about the adjoint mappings of the single components which define Ty,εu.
Lemma 3.11 yields that multiplication with (∂/∂z)fε(yεu,zεu) is well-defined as a mapping from Lq(JT) into Lq(JT;X) and
[(∂/∂z)fε(yεu,zεu)]∗=⟨⋅,(∂/∂z)fε(yεu,zεu)⟩X.
Similarly, (∂/∂y)fε(yεu,zεu) is a linear continuous mapping from Lq(JT;Xα) into Lq(JT;X).
Moreover,
[S(∂/∂y)fε(yεu,zεu)]∗ is given by multiplication with S(∂/∂y)fε(yεu,zεu).
The adjoint of Tz,εu maps any v∈Lq′(JT) to the function q∈Lq′(JT) which may be identified with the unique solution of
[TABLE]
S∗ and [SAp]∗ are given
by multiplication with S and SAp.
Furthermore, SAp∈[Xα]∗ by the assumptions on w in (A3) in Assumption 2.10.
All bounds are independent of ε if yε and zε in Theorem 3.9 are considered and if ε is small enough.
We obtain
[TABLE]
Maximal parabolic Lq′(JT;[dom(Ap)]∗)-regularity of [Ty,εu]∗
implies that for each
ν∈Lq′(JT;[dom(Ap)]∗) there exists a unique pεν∈Yq′,T∗ with
(−d/dt+[Ty,εu]∗)p=ν.
For given ν∈Lq′(JT;[dom(Ap)]∗) let qεν be the representative in Lq′(JT) of the solution of
[TABLE]
Let also (yεu,h,zεu,h) be the solutions of (18)-(19) for some given h∈Lq(JT;X).
Then we obtain with (18) and partial integration:
[TABLE]
By definition of qεν the last term on the right side is equal to
[TABLE]
Another partial integration together with (19) and canceling out some terms yields
[TABLE]
By definition of pεν we finally arrive at
[TABLE]
∎
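Lemma 3.12 has the structure of a duality identity: the pairing of ν with the linearized state equals a pairing of adjoint variables with the source h. As a purely finite-dimensional analogue, and not the system of the paper, the following sketch discretizes a hypothetical coupled linear pair (y_h,z_h) by implicit Euler and solves the transposed system for (p,q). The coefficients a, c, s, g and all names are assumptions made for this illustration.

```python
import numpy as np

def build_system(n, dt, a, c, s, g):
    """Implicit Euler matrix M of the toy coupled linear system
         y' + a*y - c*z = h,   z' - s*y' + g(t)*z = 0,   y(0) = z(0) = 0,
    acting on the stacked unknown x = (y_1..y_n, z_1..z_n)."""
    D = (np.eye(n) - np.eye(n, k=-1)) / dt           # backward difference with zero initial value
    M = np.zeros((2 * n, 2 * n))
    M[:n, :n] = D + a * np.eye(n)
    M[:n, n:] = -c * np.eye(n)
    M[n:, :n] = -s * D
    M[n:, n:] = D + np.diag(g)
    return M

rng = np.random.default_rng(0)
n, dt = 50, 0.02
g = rng.uniform(0.0, 5.0, size=n)                    # nonnegative coefficient, mimicking Psi''/eps >= 0
M = build_system(n, dt, a=1.0, c=0.7, s=0.5, g=g)

h = rng.standard_normal(n)                           # source of the linearized state system
nu = rng.standard_normal(n)                          # right-hand side of the adjoint system

x = np.linalg.solve(M, np.concatenate([h, np.zeros(n)]))       # forward solve: (y_h, z_h)
pq = np.linalg.solve(M.T, np.concatenate([nu, np.zeros(n)]))   # transposed solve: (p, q)
y_h, p = x[:n], pq[:n]

lhs = dt * nu @ y_h      # discrete version of <nu, y_h>
rhs = dt * p @ h         # discrete version of <p, h>
```

The two numbers agree up to rounding because (ν,0)·x = (Mᵀ(p,q))·x = (p,q)·(h,0). In this toy model the source enters only the first equation, so the identity involves p alone, whereas the optimality relation in Theorem 3.13 involves the combination pε+Sqε.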
We can directly write down an adjoint system for a solution uε of problem (10),(11),(16).
Theorem 3.13 (Adjoint system regularized problem).
Adopt the assumptions of Theorem 3.9 and the notation from Lemma 3.12. For i∈{1,2} and ε∈(0,ε∗] let uε∈Ui be an optimal control for problem (10),(11),(16). Then the adjoint variables for yε∈Y2,0 and zε∈H1(JT) are given by
pε:=pεyε−yd∈Y2,T∗ and
qε:=qεyε−yd∈H1(JT).
There holds
Bi∗(pε+Sqε)=−(κ+1)uε+u in L2(JT;Ui)
and the following system of evolution equations is satisfied by pε and qε:
[TABLE]
Proof.
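Note first that we can choose q=q′=2 in Lemma 3.12 since 2>1/(1−α)⇔α<1/2 which is the case by (A2) in Assumption 2.10.
Moreover, the expression $\langle y_\varepsilon-y_d,\,y_\varepsilon^{B_iu_\varepsilon,B_ih}\rangle_{L^2(J_T;\mathrm{dom}(\mathcal{A}_p))}=\int_0^T\int_\Omega (y_\varepsilon-y_d)\cdot y_\varepsilon^{B_iu_\varepsilon,B_ih}\,dx\,dt$
is well-defined:
With $I_p$ as in Definition 2.4,
$y_\varepsilon^{B_iu_\varepsilon,B_ih}\in\mathrm{dom}(\mathcal{A}_p)=\mathrm{ran}(I_p)$ may be identified with the embedding of $I_p^{-1}y_\varepsilon^{B_iu_\varepsilon,B_ih}$ from $\mathbb{W}_{\Gamma_D}^{1,p}(\Omega)$ into $\mathbb{W}_{\Gamma_D}^{1,p'}(\Omega)\simeq X^*$. Note that $p'\le 2\le p$.
Since $\left(\mathcal{A}_{p}+I_{p}\right)^{-1}\in\mathcal{L}\bigl(X,\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\bigr)$, see Remark 2.6, we can first estimate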
[TABLE]
for a.e. t∈JT and then with the identification of WΓD1,p′(Ω) with X∗
[TABLE]
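The Gâteaux-derivative of $J_{\mathrm{reg}}(u):=J_{\mathrm{reg}}(G_\varepsilon(B_iu),u;\overline{u})=J(G_\varepsilon(B_iu),u)+\frac12\|u-\overline{u}\|_{U_i}^2$, where $\overline{u}$ denotes the fixed optimal control of problem (1)-(3),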
with respect to u has to be zero at uε by optimality.
Applying Lemma 3.12 we compute for h∈Ui:
[TABLE]
∎
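The computation closing the proof is not reproduced in this text. Assuming that Lemma 3.12 with ν=yε−yd yields ⟨yε−yd,yεBiuε,Bih⟩=⟨pε+Sqε,Bih⟩, which is consistent with the stated identity for Bi∗(pε+Sqε), the first order condition can be sketched as follows.

```latex
\begin{align*}
0 &= J_{\mathrm{reg}}'(u_\varepsilon)h
   = \langle y_\varepsilon-y_d,\;y_\varepsilon^{B_iu_\varepsilon,B_ih}\rangle
     +\kappa\,\langle u_\varepsilon,h\rangle_{U_i}
     +\langle u_\varepsilon-u,h\rangle_{U_i}\\
  &= \langle B_i^*(p_\varepsilon+Sq_\varepsilon)+(\kappa+1)u_\varepsilon-u,\;h\rangle_{U_i}
  \qquad\text{for all } h\in U_i .
\end{align*}
```

Since h∈Ui is arbitrary, this gives Bi∗(pε+Sqε)=−(κ+1)uε+u.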
3.6 Estimates for the adjoints of the regularized problem
Similar to [BK13, Section 3.5] and [MS15, Lemma 4.14] we estimate the norms of the adjoint states pε and qε from Theorem 3.13 independently of ε and of the norms of the optimal controls uε. In Section 4, we take a sequence {ε} with ε→0 and use those bounds to extract (weakly) converging subsequences of pε and qε. Those finally yield an adjoint system for problem (1)-(3), see Theorem 4.13 below.
Lemma 3.14 (Uniform bounds).
Adopt the assumptions and the notation of Theorem 3.13.
There exists a constant c>0 which is independent of ε and some ε0∈(0,ε∗] such that the following holds true. If ε∈(0,ε0), then
[TABLE]
Proof.
Firstly, Theorem 3.9 yields uε→u in Ui, yε→y in Y2,0 and in C(JT;Xα) and zε→z weakly in H1(JT) and strongly in C(JT).
As in the proof of Theorem 3.13 we obtain that yε−yd is bounded in L2(JT;[dom(Ap)]∗) by
$\|\overline{y}_{\varepsilon}-B_{1}y_{d}\|_{\mathrm{L}^{2}(J_{T};X)}\left\|\left(\mathcal{A}_{p}+I_{p}\right)^{-1}\right\|_{\mathcal{L}\bigl(X;\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\bigr)}=:c_{0}.$
This constant can be estimated independently of ε because {yε} is uniformly bounded in C(JT;X).
For any ξ∈L2(JT;X), Lemma 3.12 yields
[TABLE]
Because yε→y in C(JT;Xα) we can find some ε0>0 such that C(yε)=C(y) for all ε∈(0,ε0).
From reflexivity of L2(JT;X) we conclude
[TABLE]
for all ε∈(0,ε0).
We continue with estimates for qε.
We test (26) with qε/∣qε∣, integrate from any t∈JT to T and apply (20) from Lemma 3.11 and (33) to get
[TABLE]
W.l.o.g. for the same ε0 as before there holds c1K(yε)=c1K(y)=:c2 for all ε∈(0,ε0).
Note that Ψ′′(zε)≥0 by convexity of Ψ.
This yields
[TABLE]
for all ε∈(0,ε0).
We conclude Sqε∈L2(JT;X∗) and then by (33) also pε∈L2(JT;X∗), both with a norm which is independent of ε∈(0,ε0).
We continue by estimating
[TABLE]
Because of (34) the right side is bounded by 2c2 so that
$\int_0^T|\dot q_\varepsilon(s)|\,ds\le 2c_2=:c_3$
for ε∈(0,ε0).
To proceed, we
use maximal parabolic L2(JT;[dom(Ap)]∗)-regularity of Ap∗ and (25) to obtain
[TABLE]
(20) from Lemma 3.11, (33), (35) and the bound ∥yε−yd∥L2(JT;[dom(Ap)]∗)≤c0 yield
[TABLE]
for ε∈(0,ε0).
In a similar way one obtains (30)-(32) from the estimates
[TABLE]
∎
4 Adjoint system and optimality conditions for the optimal control problem
As in [BK13, Section 4] and [MS15, Theorem 4.15] we are interested in taking the limit ε→0 in Theorem 3.13 to obtain an adjoint system for problem (1)-(3).
In Subsection 4.1–Subsection 4.3 we study the general case with spatially distributed or boundary controls, i.e. i∈{1,2}.
Particularly, in Subsection 4.1 we derive an adjoint system (p,q) for problem (1)-(3) for the optimal control u from Theorem 3.9, see Lemma 4.1.
Moreover, we gather information about the continuity properties of q.
Subsection 4.2 contains the optimality conditions for problem (1)-(3) for the optimal control u in terms of the pair p and q, see Lemma 4.12 below.
In Subsection 4.3 we summarize the results from Subsection 4.1–Subsection 4.2 in Theorem 4.13.
Afterwards, we consider the particular case when f is continuously differentiable. In Corollary 4.14 we improve the optimality condition (42) from Theorem 4.13 for this instance.
Both optimality conditions (42) and (48) are restricted to test functions SyBiu,Bih with h∈Ui, i∈{1,2}.
In Subsection 4.4 we focus on the setting when the controls act inside of Ω, i.e. on i=1.
In Corollary 4.15 we improve the optimality conditions from
Theorem 4.13 as well as those from Corollary 4.14 by extending inequalities (42) and (48) to any test function of the form (Sv)φ with v∈dom(Ap),Sv>0 and φ∈C0∞(JT). Dividing the corresponding inequality by Sv yields, at least in (48), an optimality condition with arbitrary test functions φ∈C0∞(JT).
For i=1 we also prove uniqueness of p and q if f is continuously differentiable, see Corollary 4.16.
4.1 Adjoint system for distributed or boundary controls
In this subsection, we derive an adjoint system (p,q) for problem (1)-(3) and collect regularity properties of p and q.
The evolution equation of p can be derived in a straightforward way as the limit equation of (25) for ε→0, see Lemma 4.1 below. This is not possible for q. The reason is that in Lemma 3.14 we could bound the norm of q˙ε independently of ε only in L1(JT). As a remedy we split the interval JT into the set I0 of times t where the limit z(t) is contained in the open interval (a,b) and the rest I∂ where z(t)∈{a,b}. It turns out that the evolution of q in I0 can be described by an evolution equation, see Lemma 4.3 below.
As for I∂, we have to pass to weak-∗ convergence of qε and consider the limit dμ of (1/ε)Ψ′′(zε)qε in C(JT)∗. Driving ε to zero then yields an equality for dq in the sense of measures on I∂, see Lemma 4.5. The abstract measure dμ, having support in I∂, remains part of this evolution equation. It also appears in the optimality conditions for problem (1)-(3) in (42).
In order to complete the description of q by analyzing the measure dμ, we will introduce a regularity Assumption 4.7 on Sy(t) for t∈I∂. With this assumption, we can characterize dμ in a subset of I∂. This allows us to characterize q in open subintervals of I∂ and we can prove continuity of q at so-called (0,∂)-switching times, see Lemma 4.10.
In Remark 4.11 we generalize Lemma 4.10 for when Assumption 4.7 is not satisfied.
Lemma 4.1 (Adjoint system in the limit).
Adopt the assumptions and the notation of Theorem 3.13.
For i∈{1,2} let u∈Ui, y=G(u) and z=W[Sy] be defined as in Theorem 3.9. Then
every sequence {ε} with ε→0 has a subsequence {εk} such that the following holds true.
There exist functions p∈Y2,T∗ and λ1,λ2∈L2(JT;[Xα]∗) such that as k→∞,
pεk⇀p in Y2,T∗ and
[TABLE]
Moreover, there exists a function q which has bounded variation, i.e. q∈BV(JT), such that qεk converges pointwise to q with k→∞. There holds Var(q)≤liminfεk→0Var(qεk).
Alternatively,
q˙εk→dq weak-* in C(JT)∗ with k→∞
for some signed regular Borel measure dq∈C(JT)∗.
The relation between q and dq is given by
q(t−)−q(s+)=dq((s,t)) and
q(t+)−q(s−)=dq([s,t])
for [s,t]⊂JT.
The function p solves the evolution equation
[TABLE]
If f is continuously differentiable from Xα×R into X then
λ1=[(∂/∂y)f(y,z)]∗p and λ2=S(∂/∂y)f(y,z)q.
Furthermore,
[TABLE]
Proof.
Theorem 3.9 implies uε→u in Ui, yε→y in Y2,0 and in C(JT;Xα) and zε→z uniformly and weakly in H1(JT) with ε→0.
By (29), (30) and (31) in Lemma 3.14,
reflexivity of all spaces yields
a subsequence {εk} and some functions p, λ1 and λ2 such that
pεk⇀p in Y2,T∗,
[(∂/∂y)fεk(yεk,zεk)]∗pεk⇀λ1 in L2(JT;[Xα]∗) and
S(∂/∂y)fεk(yεk,zεk)qεk⇀λ2 in L2(JT;[Xα]∗)
with k→∞.
The condition p(T)=0 is included in the definition of the space Y2,T∗.
From (28) we conclude that qε has bounded variation, i.e. qε∈BV(JT), with a norm which is bounded independently of ε. This implies that (w.l.o.g. the same) subsequence
qεk converges pointwise to some q∈BV(JT) with k→∞ and Var(q)≤liminfεk→0Var(qεk).
Alternatively, by Alaoglu’s compactness theorem,
q˙εk→dq weak-* in C(JT)∗ with k→∞
for some signed regular Borel measure dq∈C(JT)∗ and the relation between q and dq is given by
q(t−)−q(s+)=dq((s,t)) and
q(t+)−q(s−)=dq([s,t])
for [s,t]⊂JT [BK13, Section 4].
We exploit weak continuity of −dtd+Ap∗ from Y2,T∗ to L2(JT;[dom(Ap)]∗)
to see that
[TABLE]
in L2(JT;[dom(Ap)]∗) with k→∞.
Consequently, p∈Y2,T∗ solves equation (36).
Note that we can set fε≡f if f is continuously differentiable from Xα×R into X and in this case
λ1=[(∂/∂y)f(y,z)]∗p and λ2=S(∂/∂y)f(y,z)q.
Moreover,
[TABLE]
in Ui with k→∞ since Bi∗ is weakly continuous.
This implies (37).
∎
To gather information about q from Lemma 4.1 we proceed similarly to [BK13, Section 4].
Definition 4.2 (Partition of JT).
Let z be as in Theorem 3.9.
We split JT into
I0:={t∈JT:z(t)∈(a,b)}
and
I∂:=JT\I0={t∈JT:z(t)∈{a,b}}.
We further introduce
I∂a:={t∈JT:z(t)=a}
and
I∂b:={t∈JT:z(t)=b}.
Note that I0 is open because z is continuous.
Lemma 4.3 (q in I0).
Adopt the assumptions and the notation of Lemma 4.1 and consider the subdivision of JT from Definition 4.2.
For any interval (c,d)⊂I0 the limit q in Lemma 4.1 belongs to H1(c,d) and there exist ν1,ν2∈L2(JT) such that
−q˙=ν1+ν2
in L2(c,d).
If f is continuously differentiable from Xα×R into X then
ν1=⟨p,(∂/∂z)f(y,z)⟩X
and
ν2=⟨Sq,(∂/∂z)f(y,z)⟩X.
Proof.
By Theorem 3.9, zε→z uniformly in JT. Let (c,d)⊂I0 and [s,t]⊂(c,d) be arbitrary. (A4)ε in Assumption 3.1 implies that (w.l.o.g. for ε0>0 from Lemma 3.14) Ψ′′(zε)≡0 on [s,t] for all ε∈(0,ε0).
For ε∈(0,ε0) we integrate from s to t in (26) in Theorem 3.13 and obtain
[TABLE]
Consider {εk} from Lemma 4.1.
Lemma 3.14 together with Lemma 3.11 implies uniform boundedness of ⟨pε,(∂/∂z)fε(yε,zε)⟩X and ⟨Sqε,(∂/∂z)fε(yε,zε)⟩X in L2(JT) if ε∈(0,ε0). Hence, we obtain a subsequence of {εk} (still denoted by {εk}) and functions ν1,ν2∈L2(JT), such that
⟨pεk,(∂/∂z)fεk(yεk,zεk)⟩X⇀ν1
and
⟨Sqεk,(∂/∂z)fεk(yεk,zεk)⟩X⇀ν2
in L2(JT) with k→∞.
If f is continuously differentiable from Xα×R into X we can set fε≡f and get
ν1=⟨p,(∂/∂z)f(y,z)⟩X
and
ν2=⟨Sq,(∂/∂z)f(y,z)⟩X.
In the general case we obtain
[TABLE]
with k→∞.
So the weak derivative of q exists in L2(c,d) and is given by −ν1−ν2.
∎
Our next goal is to understand the behaviour of q in I∂.
Lemma 4.4 (q in I∂: Relation to P(Sy)).
Adopt the assumptions and the notation of Lemma 4.1 and consider the subdivision of JT from Definition 4.2. With P=Id−W, cf. Lemma 2.9, there holds
[(d/dt)P[Sy](t)]q(t)=0 for a.e. t∈I∂.
Proof.
Consider the concrete choice for Ψ from Remark 3.2 and c and ε0 from Lemma 3.14.
By Theorem 3.9, zε→z uniformly so that
zε(t)→b for t∈I∂b and zε(t)→a for t∈I∂a
with ε→0.
Hence,
there exists some ε1∈(0,ε0] such that
[TABLE]
for all ε∈(0,ε1).
Remember that Ψ1(x)=(x−b)^3(4+b−x) and Ψ≡0 on [a,b]. For ε∈(0,ε1) and t∈I∂b we obtain
[TABLE]
We apply estimate (27) from Lemma 3.14 together with (38) and (40) to see that
[TABLE]
for all ε∈(0,ε1).
We apply the convergence results from Theorem 3.9 in (11) and use the representation W+P=Id from Lemma 2.9 to obtain the weak convergence
[TABLE]
in L2(JT) with ε→0.
Furthermore, by Lemma 4.1, ∣qεk∣→∣q∣ strongly in L2(JT) with k→∞ and
dtdP[Sy]=dtdP[Sy] a.e. in I∂b by definition of I∂b.
This together with
(39) and (41) yields
[TABLE]
Similar estimates for I∂a and the fact that I∂=I∂a∪I∂b prove the statement.
∎
Next, we pass to the limit in (26) to get the following result:
Lemma 4.5 (q in I∂: Relation to dμ).
Adopt the assumptions and the notation of Lemma 4.1 and let ν1 and ν2 be as in Lemma 4.3. Consider the subdivision of JT from Definition 4.2.
We denote dμε:=(1/ε)Ψ′′(zε)qε.
There exists a measure dμ∈C(JT)∗, such that a subsequence {dμεk} (w.l.o.g we may consider {εk} from Lemma 4.1) converges weak-* to dμ in C(JT)∗ with k→∞.
The support of dμ is contained in I∂.
For any φ∈C(JT) there holds
[TABLE]
This implies
dμ=dq+(ν1+ν2)dt as measures on I∂.
Proof.
By (27) in Lemma 3.14 the functions
dμε
are bounded in L1(JT) independently of ε for all ε∈(0,ε0).
Consequently, a subsequence of {dμε} converges weak-* in C(JT)∗ to some measure dμ.
By (A4)ε in Assumption 3.1 and the uniform convergence of zε to z there holds φ(1/ε)Ψ′′(zε)qε≡0 as soon as ε is small enough, if φ∈C(JT) has compact support in I0. Therefore, the support of dμ is contained in I∂ [BK13, p.343].
The other statements are shown similarly to [BK13, Lemma 4.6] and [BK13, Lemma 4.7].
∎
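Equation (26) is not reproduced in this text. Assuming that it reads −q˙ε+(1/ε)Ψ′′(zε)qε=⟨pε,(∂/∂z)fε(yε,zε)⟩X+⟨Sqε,(∂/∂z)fε(yε,zε)⟩X, which is consistent with Lemma 4.3 and with the identity dμ=dq+(ν1+ν2)dt, the limit passage can be sketched by testing with an arbitrary φ∈C(JT).

```latex
\begin{align*}
\int_0^T \varphi\,d\mu_{\varepsilon_k}
 &= \int_0^T \varphi\,\dot q_{\varepsilon_k}\,dt
  +\int_0^T \varphi\,\Bigl(\bigl\langle p_{\varepsilon_k},\tfrac{\partial}{\partial z}f_{\varepsilon_k}(y_{\varepsilon_k},z_{\varepsilon_k})\bigr\rangle_X
  +\bigl\langle Sq_{\varepsilon_k},\tfrac{\partial}{\partial z}f_{\varepsilon_k}(y_{\varepsilon_k},z_{\varepsilon_k})\bigr\rangle_X\Bigr)\,dt\\
 &\longrightarrow \int_0^T \varphi\,dq+\int_0^T \varphi\,(\nu_1+\nu_2)\,dt
 \qquad (k\to\infty).
\end{align*}
```

The first term converges by the weak-∗ convergence of q˙εk to dq in C(JT)∗ from Lemma 4.1, the second by the weak convergence to ν1 and ν2 in L2(JT) from Lemma 4.3.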
It also follows:
Lemma 4.6 (Discontinuity properties of q).
Adopt the assumptions and notation of Lemma 4.1.
The absolute value of q can only jump downwards in reverse time.
Consequently, for any t∈JT there holds ∣q(t−)∣≤∣q(t+)∣ and q(T−)=q(T)=0.
Moreover, q is right continuous in [0,T) and left continuous at T.
Proof.
From Lemma 4.1 we conclude that
qεk converges to q in L1(JT) and that dqεk=q˙εkdt converges to dq weak-* in C(JT)∗.
From [visintin2013differential, Chapter XII.7] it follows that q has bounded variation and that the limit is right continuous in [0,T) and left continuous at T.
The rest of the statements are shown just as [BK13, Lemma 4.4].
∎
The unknown measure dμ has support in I∂ so that we only know the behaviour of the sum −dq+dμ in C(JT)∗ but not that of dq alone.
In order to analyze q also in I∂ we make the following regularity assumption, cf. [BK13, p.344]:
Assumption 4.7 (Regularity assumption).
Let y be as in Theorem 3.9 and consider the subdivision of JT from Definition 4.2. We suppose that the function P[Sy] satisfies
(d/dt)P[Sy]≠0 a.e. in I∂.
Equivalently,
Sy˙>0 a.e. in I∂b and
Sy˙<0 a.e. in I∂a.
Remark 4.8.
This assumption is reasonable if Sy is the quantity of interest.
Consider for example the case when w in (A3) in Assumption 2.10 has the form $w=\frac{1}{m|\Omega|}\varphi$ for some $\varphi\in\prod_{j=1}^{m}C^{\infty}_{\Gamma_{D_j}}(\Omega)$, where the components φj, j∈{1,…,m}, are constantly equal to 1 within most of Ω and vanish only in a small neighbourhood of ΓDj.
If we identify ran(Ip) with WΓD1,p(Ω), then S acts on y∈dom(Ap) as
$Sy=\frac{1}{m|\Omega|}\sum_{j=1}^{m}\int_\Omega y_j\varphi_j\,dx.$
This means that Sy is approximately the mean value of y in Ω.
If this is the value of interest then nothing changes in the system if
Sy˙=0 in a subset of I∂ with positive measure.
In order to analyze the behaviour of q and dq in $\overline{I_0}\cap I_\partial$ we introduce the following categories of times as in [BK13]:
Definition 4.9 (Switching times).
Consider the subdivision of JT from Definition 4.2.
We call a time t a (0,∂)-switching time if $t\in\overline{I_0}\cap I_\partial$ and if there is some ε>0 such that (t−ε,t)⊂I0 and [t,t+ε)⊂I∂.
We say that t is a (∂,0)-switching time if $t\in\overline{I_0}\cap I_\partial$ and if for some ε>0 we have (t−ε,t]⊂I∂ and (t,t+ε)⊂I0.
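The partition of JT and the two kinds of switching times can be made concrete on a time grid. The following sketch classifies the samples of a hypothetical continuous function z with values in [a,b]=[−1,1] and detects where z enters or leaves the boundary {a,b}. The sampled z, the tolerance and the function names are assumptions for illustration; on a grid, the condition "for some ε>0" can only be checked up to the grid resolution.

```python
import numpy as np

def classify_times(z, a, b, tol=1e-9):
    """Boolean mask that is True where z lies on the boundary {a, b} (grid version of the boundary set)."""
    return (z <= a + tol) | (z >= b - tol)

def switching_indices(on_boundary):
    """Grid indices of (0,boundary)- and (boundary,0)-switching times.

    enter: sample k-1 is interior and sample k is on the boundary.
    leave: sample k is on the boundary and sample k+1 is interior."""
    enter = np.where(~on_boundary[:-1] & on_boundary[1:])[0] + 1
    leave = np.where(on_boundary[:-1] & ~on_boundary[1:])[0]
    return enter, leave

t = np.linspace(0.0, 1.0, 1201)
z = np.clip(2.0 * np.sin(2.0 * np.pi * t), -1.0, 1.0)   # hypothetical sampled z with values in [a, b]
on_boundary = classify_times(z, -1.0, 1.0)
enter, leave = switching_indices(on_boundary)
t_enter, t_leave = t[enter], t[leave]                    # approximate switching times
```

For this sample, z reaches the boundary near t=1/12 and t=7/12 and returns to the interior near t=5/12 and t=11/12, so the first pair plays the role of (0,∂)-switching times and the second pair that of (∂,0)-switching times.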
Lemma 4.10 (q at switching times).
Adopt the assumptions and the notation of Lemma 4.1.
If t is a (0,∂)-switching time in the sense of Definition 4.9 and if Assumption 4.7 holds then there exists some ε>0 such that
q≡0
on [t,t+ε). Moreover, q is continuous at t with q(t)=0.
Furthermore, for every open interval (c,d)⊂I∂ there holds that q≡0 in [c,d).
Proof.
Let (c,d)⊂I∂ be arbitrary and suppose that Assumption 4.7 holds.
Then Lemma 4.4 implies q(t)=0 for a.e. t∈(c,d).
By Lemma 4.6, q is right continuous in [0,T) so that q≡0 in [c,d).
Consequently, for every subinterval [β,γ]⊂(c,d) we have
0=q(γ−)−q(β+)=dq((β,γ))
so that dq=0 as a measure on (c,d).
Again by Lemma 4.6 the absolute value of q can only jump downwards in reverse time. By Lemma 4.3, q∈H1(e,c) for any interval (e,c)⊂I0. Consequently, whenever an interval (e,c)⊂I0 is followed by an interval [c,d]⊂I∂, then q is absolutely continuous on [e,d).
Now let t be a (0,∂)-switching time and consider ε>0 such that
(t−ε,t)⊂I0 and [t,t+ε)⊂I∂.
Then setting e=t−ε, c=t and d=t+ε proves the rest of the lemma.
∎
Remark 4.11.
In the setting of Lemma 4.10 one can prove even more about the continuity properties of q if f is continuously differentiable, even in the absence of Assumption 4.7:
•
Note first that if t∈I∂ is a (∂,0)-switching time, then q might jump at t regardless of whether Assumption 4.7 holds. If q does not jump at t, then under Assumption 4.7 necessarily q(t)=0.
It is also possible to prove that q may only jump up at t if $\int_t^{t_-}\langle p+Sq,\tfrac{\partial}{\partial z}f(y,z)\rangle_X\,ds>0$, where either $t_-=t_-(t)\in(t,T]\cap I_\partial^a$ is (essentially) the first time in (t,T) for which there exists some ε>0 such that Sy˙<0 a.e. in $(t_-,t_-+\varepsilon)$, or $t_-=T$.
It can further be shown that the height of the jump is bounded by $\int_t^{t_-}\langle p+Sq,\tfrac{\partial}{\partial z}f(y,z)\rangle_X\,ds$.
Analogously, one can prove that q may only jump down at t if $\int_t^{t_+}\langle p+Sq,\tfrac{\partial}{\partial z}f(y,z)\rangle_X\,ds<0$, where either $t_+=t_+(t)\in(t,T]\cap I_\partial^b$ is (essentially) the first time in (t,T) for which there exists some ε>0 such that Sy˙>0 a.e. in $(t_+,t_++\varepsilon)$, or $t_+=T$. In this case the height of the jump is bounded by $-\int_t^{t_+}\langle p+Sq,\tfrac{\partial}{\partial z}f(y,z)\rangle_X\,ds$.
•
Other categories of times can be considered. Those include isolated times in I0 or subintervals of I∂ in which Sy˙=0 a.e.
The latter can only occur if Assumption 4.7 does not hold true.
Also for those categories one can show sign conditions for dq and dμ and upper bounds for jumps.
The proof of these continuity properties is very technical and exceeds the scope of this work.
The results will be published in the dissertation of the project in which this paper originated.
4.2 Optimality conditions for distributed or boundary controls
We derive optimality conditions for problem (1)-(3) for the optimal control u from Theorem 3.9 in terms of the pair p and q from Lemma 4.1.
We cannot expect a pointwise condition as in [MS15, Section 5] since the hysteresis and its derivative, and hence also F′[y;⋅] in Theorem 2.12, act non-locally in time. This implies that if for some direction ζ∈C(JT;Xα) and some set I⊂JT of positive measure the derivative F′[y;ζ](τ)=f′[(y(τ),W[Sy](τ));(y(τ),W′[Sy;Sζ](τ))] is not zero for τ∈I, then the values of the derivative in I might influence its value at any t with sup I<t≤T. That is, we can only expect an optimality condition for problem (1)-(3) which includes integration over at least a part of the time interval JT.
Nevertheless, we follow the steps in [MS15, Section 5] as long as possible.
The optimality condition for i∈{1,2} is derived in Lemma 4.12 and improved in Corollary 4.14 for the case when f is continuously differentiable.
We can improve this condition even further when the controls act inside Ω, i.e. for i=1. In this case, too, we cannot expect to obtain an inequality without integration in time. But since the range of B1 is dense in X, we are able to derive a condition without variation in space. The results can be found in Corollary 4.15 in Subsection 4.4.1. For i=1 we are also able to prove uniqueness of p, q and dμ if f is continuously differentiable, see Corollary 4.16 in Subsection 4.4.2.
Because the range of B2 is not dense in X, we treat the general case i∈{1,2} first.
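The non-locality in time described above can be seen in a small numerical experiment. The following is a minimal sketch, assuming the standard time-discrete form of a scalar stop operator with thresholds a=-1 and b=1; the function name `stop_operator`, the thresholds, the initial value and the sample inputs are illustrative assumptions and are not taken from the paper. Two inputs that differ at a single time step produce outputs that differ at all later steps.

```python
def stop_operator(v, a=-1.0, b=1.0, z0=0.0):
    """Time-discrete scalar stop (assumed discretization):
    z_k = clamp(z_{k-1} + (v_k - v_{k-1}), a, b)."""
    z = [min(max(z0, a), b)]
    for k in range(1, len(v)):
        increment = v[k] - v[k - 1]
        z.append(min(max(z[-1] + increment, a), b))
    return z


# Two inputs that coincide except at time index 1.
v_ref = [0.0, 0.5, 0.5, 0.5, 0.5]
v_pert = [0.0, 2.0, 0.5, 0.5, 0.5]

z_ref = stop_operator(v_ref)
z_pert = stop_operator(v_pert)

# The inputs agree again from index 2 on, but the outputs do not:
# the saturation at index 1 is remembered by the operator.
later_diff = [abs(x - y) for x, y in zip(z_ref, z_pert)][2:]
print(z_ref)
print(z_pert)
print(later_diff)
```

In this toy setting the perturbed input saturates the output at the upper threshold, and the resulting offset persists after the inputs coincide again. This is the discrete analogue of the reason why only a time-integrated optimality condition can be expected.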
Lemma 4.12 (Optimality condition).
Adopt the assumptions and the notation of Lemma 4.1 and let ν1 and ν2 be as in Lemma 4.3.
For any h∈Ui, yBiu,Bih=G′[Biu;Bih] and
F′[y;yBiu,Bih](t)=f′[(y(t),W[Sy](t));(y(t),W′[Sy;SyBiu,Bih](t))] (see Theorem 2.12), there holds the optimality condition
[TABLE]
Proof.
Since u is an optimal control, the directional derivative of the reduced cost functional J has to be greater than or equal to zero in each direction.
With yBiu,Bih=G′[Biu;Bih] this means that for any h∈Ui there holds
[TABLE]
The function yBiu,Bih solves the evolution equation
(9) in Theorem 2.12 with y taken to be the optimal state and h replaced by Bih. We test this equation with p+Sq, integrate over time and apply (37) to compute
[TABLE]
We integrate the first term on the left side of (44) by parts, insert (36) from Lemma 4.1 and use the representation of dq from Lemma 4.5 to observe
4.3 Summary: Adjoint system and optimality conditions for distributed or boundary controls
We summarize our results for the general control problem with i∈{1,2}.
Theorem 4.13 (Adjoint system and optimality condition).
Let Assumption 2.10 and Assumption 3.1 hold. For i∈{1,2} suppose that u∈Ui is an optimal control for problem (1)-(3) together with the optimal state y∈Y2,0 and z=W[Sy]∈H1(JT). Consider the subdivision of JT from Definition 4.2.
Then there exist adjoint states p∈Y2,T∗ and q∈BV(JT) of the following kind:
There holds
Bi∗(p+Sq)=−κu in Ui.
For some functions λ1,λ2∈L2(JT;[Xα]∗) we have
[TABLE]
q is left continuous in JT, right continuous at T and absolutely continuous in I0. There exist ν1,ν2∈L2(JT) such that q solves
$-\dot q=\nu_1+\nu_2$
in every open subinterval of I0.
$\frac{d}{dt}P[Sy](t)\,q(t)=0$ for a.e. t∈I∂
and there is a measure dμ∈C(JT)∗ with support in I∂ such that
dμ=dq+(ν1+ν2)dt as measures on I∂.
For all h∈Ui and with yBiu,Bih=G′[Biu;Bih] (see Theorem 2.12) there holds the optimality condition
[TABLE]
where F′[y;yBiu,Bih](t)=f′[(y(t),W[Sy](t));(y(t),W′[Sy;SyBiu,Bih](t))].
The absolute value of q can only jump downwards in reverse time so that q(T−)=q(T)=0 and ∣q(t−)∣≤∣q(t+)∣ for all t∈JT.
If the regularity Assumption 4.7 is valid then
q is continuous at every (0,∂)-switching time t (see Definition 4.9) with q(t)=0.
In this case, for every open interval (c,d)⊂I∂ it follows q≡0 on [c,d).
We can improve the results of Theorem 4.13 if f is continuously differentiable:
Corollary 4.14 (Adjoint system and optimality condition for regular f).
Let Assumption 2.10 and Assumption 3.1 hold. Moreover, suppose that f is continuously differentiable from Xα×R into X. For i∈{1,2} assume that u∈Ui is an optimal control for problem (1)-(3) together with the optimal state y∈Y2,0 and z=W[Sy]∈H1(JT). Consider the subdivision of JT from Definition 4.2.
Then there exist adjoint states p∈Y2,T∗ and q∈BV(JT) of the following kind:
There holds
Bi∗(p+Sq)=−κu in Ui.
We have
[TABLE]
q is left continuous in JT, right continuous at T and absolutely continuous in I0. q solves the evolution equation
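$-\dot q=\langle p+Sq,\frac{\partial}{\partial z}f(y,z)\rangle_X$
in every open subinterval of I0.
$\frac{d}{dt}P[Sy](t)\,q(t)=0$ for a.e. t∈I∂
and there is a measure dμ∈C(JT)∗ with support in I∂ such that
$d\mu=dq+\langle p+Sq,\frac{\partial}{\partial z}f(y,z)\rangle_X\,dt$ as measures on I∂.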
For all h∈Ui and with yBiu,Bih=G′[Biu;Bih] (see Theorem 2.12) and P=Id−W (see Lemma 2.9) there holds the optimality condition
[TABLE]
The absolute value of q can only jump downwards in reverse time so that q(T−)=q(T)=0 and ∣q(t−)∣≤∣q(t+)∣ for all t∈JT.
If the regularity Assumption 4.7 is valid then
q is continuous at every (0,∂)-switching time t (see Definition 4.9) with q(t)=0.
In this case, for every open interval (c,d)⊂I∂ it follows q≡0 on [c,d).
Proof.
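If f is continuously differentiable then by Lemma 4.1 and Lemma 4.3 we can take $\lambda_1=[\frac{\partial}{\partial y}f(y,z)]^*p$,
$\lambda_2=S\frac{\partial}{\partial y}f(y,z)\,q$, $\nu_1=\langle p,\frac{\partial}{\partial z}f(y,z)\rangle_X$ and
$\nu_2=\langle Sq,\frac{\partial}{\partial z}f(y,z)\rangle_X$ in Theorem 4.13.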
This yields all statements except for the optimality condition.
(42) takes the form
[TABLE]
Because P=Id−W (see Lemma 2.9)
we have
SyBiu,Bih−W′[Sy;SyBiu,Bih]=P′[Sy;SyBiu,Bih].
4.4 Improved optimality conditions and uniqueness for distributed controls
We want to replace yBiu,Bih in (46) and (48) by an arbitrary function of an appropriate space. This would certainly improve the optimality conditions in Theorem 4.13 and Corollary 4.14.
It is not possible in the general case i∈{1,2} without density of the ranges of Bi. Therefore, we restrict ourselves to problem (1)-(3) with distributed controls u∈U1 in this subsection.
Suppose that p in (A1) in Assumption 2.10 is chosen close to two such that $1-\frac{1}{p}-\frac{1}{d}<\frac{1}{2}$. Then $2<\frac{dp'}{d-p'}$ and by [Mün16, Remark 2.7] we have the compact embedding
$W^{1,p'}_{\Gamma_D}(\Omega)\hookrightarrow[L^2(\Omega)]^m$, which is also one-to-one.
That is, in this case B1 has dense range.
In Corollary 4.15 in Subsection 4.4.1 we improve the optimality conditions from Theorem 4.13 and Corollary 4.14 for this case.
For i=1 we also prove uniqueness of p,q and dμ if f is continuously differentiable, see Corollary 4.16 in Subsection 4.4.2.
4.4.1 Improved optimality conditions
We improve the optimality conditions (46) and (48).
Corollary 4.15 (Optimality condition for distributed controls).
Let Assumption 2.10 and Assumption 3.1 hold and let $1-\frac{1}{p}-\frac{1}{d}<\frac{1}{2}$. Assume that u∈U1 is a solution of problem (1)-(3) with i=1, together with the state y∈Y2,0 and z=W[Sy]∈H1(JT). Let v∈dom(Ap) with Sv>0 and φ∈C0∞(JT) be arbitrary.
Then in addition to (46) in Theorem 4.13 there holds
[TABLE]
If f is continuously differentiable then in addition to (48) in Corollary 4.14 there holds
[TABLE]
Proof.
Since B1 has dense range one proves just as in [MS15, Lemma 5.2] that the set
{yB1u,B1h:h∈U1}
is dense in Y2,0.
Unfortunately, we cannot continue as in [MS15, Theorem 5.3] to derive a pointwise optimality condition.
The reason is that for any ζ∈C(JT;Xα) the function W′[Sy;Sζ] is non-local in time.
Nevertheless, we can still make use of the fact that W′[Sy;⋅] and f′ are positively homogeneous.
First of all, as in [MS15, Theorem 5.3], for arbitrary given η∈Y2,0, we choose a sequence in {yB1u,B1h:h∈U1} which converges to η. We pass to the limit in (46) and obtain
[TABLE]
where F′[y;η](t)=f′[(y(t),W[Sy](t));(y(t),W′[Sy;Sη](t))].
Let v∈dom(Ap) with Sv>0 be given.
Furthermore, let φ∈C0∞(JT) be arbitrary.
Then vφ∈Y2,0 and
W′[Sy;S(vφ)]=SvW′[Sy;φ].
Setting η=φv and rearranging yields
[TABLE]
Dividing both sides by Sv proves the first statement. The second inequality is shown analogously.
∎
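The positive homogeneity used in the proof above, and the fact that it cannot be upgraded to linearity, can be checked on a discretized example. The sketch below is an illustration under stated assumptions: it uses an assumed time-discrete stop operator with thresholds -1 and 1, and approximates the directional derivative by a one-sided difference quotient with a small step h. The names `stop_operator` and `difference_quotient` and all sample data are hypothetical and do not appear in the paper.

```python
def stop_operator(v, a=-1.0, b=1.0, z0=0.0):
    """Time-discrete scalar stop (assumed discretization):
    z_k = clamp(z_{k-1} + (v_k - v_{k-1}), a, b)."""
    z = [min(max(z0, a), b)]
    for k in range(1, len(v)):
        z.append(min(max(z[-1] + (v[k] - v[k - 1]), a), b))
    return z


def difference_quotient(v, phi, h=1e-6):
    """One-sided difference quotient (W[v + h*phi] - W[v]) / h."""
    base = stop_operator(v)
    shifted = stop_operator([vk + h * pk for vk, pk in zip(v, phi)])
    return [(s - b0) / h for s, b0 in zip(shifted, base)]


# The input touches the upper threshold exactly at index 1, so the
# output map has a kink in the direction phi chosen below.
v = [0.0, 1.0, 1.0, 0.5]
phi = [0.0, 1.0, 0.0, 0.0]

d_phi = difference_quotient(v, phi)
d_2phi = difference_quotient(v, [2.0 * p for p in phi])
d_neg = difference_quotient(v, [-p for p in phi])

print(d_phi)   # approximately [0, 0, -1, -1]
print(d_2phi)  # approximately 2 * d_phi
print(d_neg)   # approximately [0, -1, 0, 0], which is not -d_phi
```

Scaling the direction by a positive factor scales the quotient by the same factor, which mirrors the identity W′[Sy;S(vφ)]=SvW′[Sy;φ] for Sv>0. Reversing the sign of the direction does not simply reverse the sign of the result, which is why the argument only works for positive multiples.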
4.4.2 Uniqueness of the adjoint variables
If f is continuously differentiable we can also show uniqueness of the adjoint couple.
Corollary 4.16 (Unique adjoint system for distributed controls).
Let Assumption 2.10 and Assumption 3.1 hold and let $1-\frac{1}{p}-\frac{1}{d}<\frac{1}{2}$. Moreover, suppose that f is continuously differentiable from Xα×R into X. Assume that u∈U1 is a solution of problem (1)-(3) with i=1, together with the state y∈Y2,0 and z=W[Sy]∈H1(JT). Then in the setting of Corollary 4.14 the adjoint couple p∈Y2,T∗ and q∈BV(JT) together with the measure dμ in C(JT)∗ is unique.
Proof.
Because B1 has dense range we have
ker(B1∗)=ran(B1)⊥={0}.
Therefore by Corollary 4.14 we obtain
[TABLE]
cf. [MS15, Theorem 4.15].
Suppose there are two adjoint couples (p1,q1),(p2,q2) which satisfy the conditions of Corollary 4.14.
Let ζ∈L2(JT;dom(Ap)) be arbitrary.
Then by (47) and (49) there holds
[TABLE]
This implies p˙2=p˙1 in L2(JT;[dom(Ap)]∗).
Together with
p1(T)=p2(T)=0∈[dom(Ap)]∗ we obtain p1=p2 in L2(JT;[dom(Ap)]∗).
Since the embedding dom(Ap)↪X is dense, the embedding of X∗ into [dom(Ap)]∗ is one-to-one and p1=p2 also in L2(JT;X∗) and then in Y2,T∗.
Let v∈dom(Ap) be given with Sv>0.
We already know p1=p2 so that
S(q1−q2)=0 in X∗ a.e. in JT
because of (49).
But then
[TABLE]
so that q1=q2 in L1(JT).
This way we obtain
[TABLE]
which implies
dq1−dq2=0
as measures on JT according to [Vis13, XII.7]. This yields q1=q2∈BV(0,T).
From Corollary 4.14 we conclude
dμ1=dμ2
and the proof is complete.
∎
5 Higher regularity of the solutions of the optimal control problem
In this section we improve the regularity of the optimal control u∈Ui, i∈{1,2}, and then also of the optimal state y=G(Biu) and z=W[Sy].
We denote
$\tilde U_1:=[L^2(\Omega)]^m$ and $\tilde U_2:=\prod_{j=1}^m L^2(\Gamma_{N_j},\mathcal H_{d-1})$.
We want to exploit the equation
$B_i^*(p+Sq)=-\kappa u$ in $[\tilde U_i]^*$ a.e. in $J_T$
which follows from Theorem 4.13.
In order to make use of the time-regularity of p+Sq we need to strengthen the conditions on Bi.
Assumption 5.1.
For i∈{1,2}, the operator Bi:U~i→X in N(5)
is also continuous as a mapping into Xγ for some γ∈(0,1].
We denote by I(γ) the canonical embedding from Xγ into X.
Then the assumption is equivalent to the fact that
Bi=I(γ)B~i
for a linear and continuous function B~i:U~i→Xγ.
Theorem 5.2 (Higher regularity).
In the setting of Theorem 4.13 let Assumption 5.1 hold for some γ∈(0,1].
If $\gamma>\frac{1}{2}$, then $u\in L^\infty(J_T;\tilde U_i)$, $y\in Y_{s,0}$ and $z\in W^{1,s}(J_T)$ for arbitrary $s\in(1,\infty)$.
If $\frac{1}{2}\left(1+\frac{d}{p}\right)<1$, which is the case when d=2 and p>2 in (A1) in Assumption 2.10, this implies $y\in C(J_T;[L^\infty(\Omega)]^m)$.
If in addition Ω is a Lipschitz domain then y is Hölder continuous in time and space.
If $\gamma\le\frac{1}{2}$, then $u\in L^{\frac{2}{1-2s}}(J_T;\tilde U_i)$, $y\in Y_{\frac{2}{1-2s},0}$ and $z\in W^{1,\frac{2}{1-2s}}(J_T)$ for arbitrary $s\in(0,\gamma)$.
This implies $y\in C(J_T;X^\theta)$ for any $\theta\in(0,\frac{1}{2}+\gamma)$.
If $\gamma>\frac{d}{2p}$, with d and p in (A1) in Assumption 2.10, this implies $y\in C(J_T;[L^\infty(\Omega)]^m)$.
If in addition Ω is a Lipschitz domain then y is Hölder continuous in time and space.
Proof.
First note that for 0≤β≤γ≤1 we have the compact and dense embeddings $X^\gamma\hookrightarrow X^\beta\hookrightarrow X$ [H81, Theorem 1.4.8]. This implies $X^*\hookrightarrow[X^\gamma]^*\hookrightarrow[X^\beta]^*$.
By the properties of complex interpolation and with Remark 2.6 there holds
[TABLE]
•
We prove the case when Assumption 5.1 is fulfilled with $\gamma>\frac{1}{2}$:
Since $1-\gamma<\frac{1}{2}$ we obtain (as in Remark 2.7) an embedding
[TABLE]
Therefore, by (50) the regularity
p∈Y2,T∗ in Theorem 4.13 implies that we can identify the function p∈L2(JT;X∗) and the representative p~ of p in C(JT;[Xγ]∗).
This allows us to identify the function Bi∗p∈L2(JT;[U~i]∗) and B~i∗p~∈C(JT;[U~i]∗).
We also have Sq∈L∞(JT;X∗) since by Theorem 4.13 there holds q∈BV(JT) and because S∈X∗ by (A3) in Assumption 2.10.
That is, Bi∗Sq∈L∞(JT;[U~i]∗).
Again with Theorem 4.13 and the identification of Bi∗p and B~i∗p~ we arrive at
[TABLE]
The functions on the left side are contained in L∞(JT;[U~i]∗). We identify [U~i]∗ with U~i, so that u∈L∞(JT;U~i).
Now we use the higher regularity of u to prove a better regularity also for y.
Since u∈L∞(JT;U~i), Theorem 2.12 yields y∈Ys,0 for arbitrary s∈(1,∞).
From Remark 2.6 and Remark 2.7 we obtain y∈C(JT;Xθ) for arbitrary θ∈[0,1).
In [TR12, Theorem 3.3] it is shown that $X^\theta$ is a subset of $[L^\infty(\Omega)]^m$ if $\theta>\frac{1}{2}\left(1+\frac{d}{p}\right)$. By Remark 2.6 we are guaranteed that we can choose p>2. So at least if d=2 there is some θ∈(0,1) with $\theta>\frac{1}{2}\left(1+\frac{d}{p}\right)$ and therefore $y\in C(J_T;[L^\infty(\Omega)]^m)$.
If d=2 and p>2 and if Ω is regular enough, for example a Lipschitz domain, then by [DER15, Theorem 4.5] the state y is even Hölder continuous in time and space.
•
We prove the statement for the case when Assumption 5.1 is fulfilled with $\gamma\le\frac{1}{2}$:
for arbitrary s∈(0,γ).
So the regularity
p∈Y2,T∗ in Theorem 4.13 together with (50) implies that we can identify $p\in L^2(J_T;X^*)$ and the representative $\tilde p$ of p in $L^{\frac{2}{1-2s}}(J_T;[X^\gamma]^*)$ and then $B_i^*p\in L^2(J_T;[\tilde U_i]^*)$ and $\tilde B_i^*\tilde p\in L^{\frac{2}{1-2s}}(J_T;[\tilde U_i]^*)$.
We proceed as for the case $\gamma>\frac{1}{2}$ to prove $u\in L^{\frac{2}{1-2s}}(J_T;\tilde U_i)$ for arbitrary $s\in(0,\gamma)$. Theorem 2.12 yields $y\in Y_{\frac{2}{1-2s},0}$ for arbitrary $s\in(0,\gamma)$ and from Remark 2.6 and Remark 2.7 it follows $y\in C(J_T;X^\theta)$ for arbitrary $\theta\in[0,1-(\frac{2}{1-2s})^{-1})=[0,\frac{1}{2}+s)$. Because $s\in(0,\gamma)$ is arbitrary, this holds for all $\theta\in[0,\frac{1}{2}+\gamma)$. The remaining statements are shown just as for $\gamma>\frac{1}{2}$.
∎
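The exponent bookkeeping in the case γ≤1/2 is easy to get wrong, so the following sketch recomputes it with exact rational arithmetic. It only encodes the relations stated in Theorem 5.2 and its proof: the integrability exponent 2/(1−2s), the resulting bound θ<1/2+s, and the threshold θ>(1/2)(1+d/p) for the embedding into bounded functions. The function names and the sample values of s, d and p are illustrative assumptions.

```python
from fractions import Fraction


def regularity_exponents(s):
    """Return (r, theta_sup) with r = 2/(1-2s) and theta_sup = 1 - 1/r."""
    s = Fraction(s)
    if not (0 < s < Fraction(1, 2)):
        raise ValueError("s must lie in (0, 1/2)")
    r = 2 / (1 - 2 * s)       # time-integrability exponent of u and y
    theta_sup = 1 - 1 / r     # y is continuous into X^theta for theta < theta_sup
    return r, theta_sup


def linf_threshold(d, p):
    """Lower bound on theta for X^theta to embed into bounded functions."""
    return Fraction(1, 2) * (1 + Fraction(d) / Fraction(p))


# Sample values (assumptions): s = 1/4 and s = 3/8, d = 2, p = 4.
r_quarter, theta_quarter = regularity_exponents(Fraction(1, 4))
r_big, theta_big = regularity_exponents(Fraction(3, 8))
threshold = linf_threshold(2, 4)

print(r_quarter, theta_quarter)   # 4 and 3/4, i.e. 1/2 + s
print(r_big, theta_big)           # 8 and 7/8
print(threshold)                  # 3/4, so gamma > d/(2p) = 1/4 is needed
```

With d=2 and p=4 the threshold is 3/4, so a value of s slightly above 1/4 already suffices, in line with the condition γ>d/(2p) in the theorem.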
Remark 5.3.
For example, take d=2 and p>2 in (A1) in Assumption 2.10 and adopt the assumptions and the notation in Theorem 4.13. By [Mün16, Remark 2.7] we have the compact embedding
[TABLE]
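because p′>1 and then $2<\frac{dp'}{d-p'}=\frac{2p'}{2-p'}$.
Therefore we can embed functions $u\in\tilde U_1$ into $W^{-1,p}_{\Gamma_D}(\Omega)$ by the assignment
$\int_\Omega u\cdot v\,dx,\quad\forall v\in W^{1,p'}_{\Gamma_D}(\Omega)$.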
We slightly reinforce Assumption 2.10.
Suppose that [Gri+02, Assumption 2.2] holds for Ω and for all ΓDj, j∈{1,…m}. This essentially means that Assumption 2.2 holds for all x∈∂Ω and that the functional determinant of each bi-Lipschitz transformation ϕx is constant a.e.
For example, this is the case if Ω is a Lipschitz domain [Gri+02, Remark 2.3]. The rest in Assumption 2.10 remains the same.
With this assumption one has
[TABLE]
for θ∈(0,1) [Gri+02, Theorem 3.1].
This way we obtain an embedding $\tilde U_1\hookrightarrow W^{-\theta,p}_{\Gamma_D}(\Omega)$.
Furthermore, we have
[TABLE]
for $-\theta=-1+2\gamma$ by [Gri+02, Theorem 3.5] and Remark 2.6.
For any $\gamma\in(0,\frac{1}{2})$ there holds θ∈(0,1) for $\theta=1-2\gamma$ so that we obtain an embedding $\tilde U_1\hookrightarrow X^\gamma$.
Therefore, Assumption 5.1 is fulfilled for B1 with any $\gamma\in(0,\frac{1}{2})$.
By Theorem 5.2 it follows $u\in L^{\frac{2}{1-2s}}(J_T;\tilde U_1)$, $y\in Y_{\frac{2}{1-2s},0}$ and $z\in W^{1,\frac{2}{1-2s}}(J_T)$ for arbitrary $s\in(0,\gamma)$.
Since d=2 and p>2 we can choose $\gamma\in(0,\frac{1}{2})$ such that $\gamma>\frac{d}{2p}$. Theorem 5.2 yields $y\in C(J_T;[L^\infty(\Omega)]^m)$. If Ω is a Lipschitz domain, then y is Hölder continuous in time and space.
6 The value function of a perturbed control problem
In this section we analyze stability properties of the minimal value function of a perturbed problem which is similar to (1)-(3).
This analysis is only relevant if the set of controls is restricted.
That is, for i∈{1,2} we consider a convex closed subset C⊂Ui as our set of feasible controls and minimize the cost function over this set.
For given r∈Ui we analyze the perturbed problem
[TABLE]
We define the corresponding minimal value function
[TABLE]
and the multifunction
[TABLE]
We analyze the continuity properties of v and V. The proof is quite similar to the one of [BS00, Proposition 4.4].
Theorem 6.1 (Value function).
Let Assumption 2.10 hold. For i∈{1,2}, let C⊂Ui be convex and closed. Consider the optimal control problem (51) for r∈Ui together with the corresponding minimal value function v, defined by (52), and the multifunction V from (53).
Then v is weakly lower semicontinuous.
If C is compact in Ui then v is also upper semicontinuous and therefore continuous.
In this case, also the multifunction V is upper semicontinuous, i.e. for each r0∈Ui and for any neighborhood UV(r0) of V(r0) there exists a neighborhood
Ur0 of r0 such that V(r)⊂UV(r0) for all r∈Ur0, cf. [BS00, Chapter 4.1].
Proof.
Note first that problem (51) is well-posed. This follows essentially as Theorem 2.13 for the unperturbed problem (1)-(3).
We claim that v is weakly lower semicontinuous (and then also lower semicontinuous). Let r0∈Ui be given.
We have to prove that for any sequence {rn} with rn⇀r0, n→∞, it holds
$v(r_0)\le\liminf_{n\to\infty}v(r_n)$.
Let {rn} be such a sequence. Then {rn} is bounded in Ui.
Let ε>0 be arbitrary.
We show that for n0 large enough
v(r0)−ε≤v(rn)
for all n≥n0.
Since {rn} is bounded, by definition of J we can find some R>0 with
$\bigcup_{n\in\mathbb N}V(r_n)\subset B_{U_i}(0,R)$.
Suppose there exists a subsequence {rnk} of {rn} such that for each nk there is some unk∈V(rnk) with v(r0)−ε>J(G(Bi(unk+rnk)),unk+rnk).
Note that {unk} is a bounded subset of C and that Ui is reflexive.
Being convex and closed, C is weakly sequentially closed.
Hence, there is another subsequence (w.l.o.g. we consider the whole sequence {unk}) and some u∈C such that unk⇀u with k→∞.
J(G(Bi(⋅+r0)),⋅) is weakly lower semicontinuous.
This follows by weak lower semicontinuity of the norm in Ui and of the solution mapping G
[Mün16, Lemma 5.3].
This implies
[TABLE]
which is a contradiction.
Therefore,
$v(r_0)\le\liminf_{n\to\infty}v(r_n)$. Now suppose that C is compact.
We have to show that for any ε>0 there is a neighbourhood Ur0 of r0 such that
v(r)≤v(r0)+ε
for all r∈Ur0. We prove that we can choose neighbourhoods UV(r0) of V(r0) and Ur0 of r0 such that
J(G(Bi(u+r)),u+r)≤v(r0)+ε
for all (u,r)∈UV(r0)×Ur0.
Suppose that such neighbourhoods do not exist.
Then there is a sequence {rn} with rn→r0, n→∞, and a sequence {un}⊂V(r0)⊂C such that
J(G(Bi(un+rn)),un+rn)>v(r0)+ε
for all n>0.
Because J is continuous the set V(r0) is closed and therefore compact as a closed subset of a compact set.
Hence, there exists a subsequence {unk} and some u∈V(r0) with unk→u as k→∞.
This yields
[TABLE]
which is a contradiction.
So the neighbourhoods UV(r0) and Ur0 do exist and for any r∈Ur0 we obtain
$v(r)\le\inf_{u\in U_{V(r_0)}}J(G(B_i(u+r)),u+r)\le v(r_0)+\varepsilon$
which implies that v is upper semicontinuous. The last statement follows just as in [BS00, Proposition 4.4].
∎
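To make the statement of Theorem 6.1 more concrete, the following toy computation evaluates a value function of the same shape, v(r)=min over u in C of J(G(u+r),u+r), in one dimension. Everything specific here is an assumption made for illustration: the solution operator G is replaced by the identity, the cost is a quadratic tracking term plus a κ-weighted control cost, and C is the compact interval [0,1]. None of these choices is taken from the paper, and the sketch says nothing about the hysteresis setting itself.

```python
def reduced_cost(w, kappa=1.0, y_d=1.0):
    """Toy cost J(G(w), w) with G = identity (assumption)."""
    y = w
    return 0.5 * (y - y_d) ** 2 + 0.5 * kappa * w ** 2


def value_function(r, kappa=1.0, y_d=1.0):
    """v(r) = min over u in C = [0, 1] of reduced_cost(u + r).

    The cost is strictly convex in w = u + r, so the constrained minimizer
    is the projection of the free minimizer onto the interval [r, r + 1].
    """
    w_free = y_d / (1.0 + kappa)
    w_opt = min(max(w_free, r), r + 1.0)
    return reduced_cost(w_opt, kappa, y_d)


samples = [-1.0, 0.0, 0.5, 1.0]
values = [value_function(r) for r in samples]
print(values)

# Small perturbations of r change v(r) only slightly, as expected
# for compact C.
jump = abs(value_function(0.5 + 1e-9) - value_function(0.5))
print(jump)
```

For perturbations r between -0.5 and 0.5 the free minimizer stays feasible and the value is constant. Outside this range the constraint becomes active and the value grows continuously, which is the behaviour the theorem guarantees for compact C.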
Acknowledgement
The author is supported by the DFG through the International Research Training Group IGDK 1754 “Optimization and Numerical Analysis for Partial Differential Equations with Nonsmooth Structures”.
The author would like to thank Prof. Brokate from the Technical University of Munich and Prof. Fellner from the Karl-Franzens University of Graz for thoroughly proofreading the manuscript.
Bibliography
[Ama05] H. Amann “Nonautonomous parabolic equations involving measures” In Journal of Mathematical Sciences 130.4 Springer, 2005, pp. 4780–4802
[Aus+14] P. Auscher, N. Badr, R. Haller-Dintelmann and J. Rehberg “The square root problem for second-order, divergence form operators with mixed boundary conditions on $L^p$” In Journal of Evolution Equations 15.1 Springer, 2014, pp. 165–208
[BJT10] W. Barthel, C. John and F. Tröltzsch “Optimal boundary control of a system of reaction diffusion equations” In ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik und Mechanik 90.12 Wiley Online Library, 2010, pp. 966–982
[BC85] J.F. Bonnans and E. Casas “On the choice of the function spaces for some state-constrained control problems” In Numerical Functional Analysis and Optimization 7.4, 1985, pp. 333–348 DOI: 10.1080/01630568508816197
[BS00] J.F. Bonnans and A. Shapiro “Perturbation Analysis of Optimization Problems” New York: Springer, 2000
[Bro87] M. Brokate “Optimale Steuerung von gewöhnlichen Differentialgleichungen mit Nichtlinearitäten vom Hysteresis-Typ”, Methoden und Verfahren der mathematischen Physik P. Lang, 1987
[Bro88] M. Brokate “Optimal control of ODE systems with hysteresis nonlinearities” In Trends in Mathematical Optimization Springer, 1988, pp. 25–41
[Bro91] M. Brokate “Optimal control of systems described by ordinary differential equations with nonlinear characteristics of the hysteresis type.” In Autom. Remote Control 52, 1991, pp. 1639–1681