Optimal control of reaction-diffusion systems with hysteresis

Christian Münch

arXiv:1705.11031·math.OC·June 1, 2017

TL;DR

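This paper derives first order necessary optimality conditions for the optimal control of reaction-diffusion systems with hysteresis. It addresses the difficulties caused by rate-independent hysteresis and by the non-locality in time of the derivative of the control-to-state operator, and it establishes regularity and uniqueness results for the adjoint system.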

Contribution

It derives first order necessary optimality conditions for reaction-diffusion systems whose reaction term contains a scalar stop operator. For distributed controls, it sharpens these conditions and proves uniqueness of the adjoint variables.

Findings

01

Derived first order necessary optimality conditions.

02

Proved regularity and uniqueness of the adjoint system.

03

Analyzed the value function's regularity under control restrictions.

Equations

$$\min_{u\in U_{i}}J(y,u):=\frac{1}{2}\|y-y_{d}\|_{U_{1}}^{2}+\frac{\kappa}{2}\|u\|_{U_{i}}^{2}$$

$$\dot{y}(t)+(A_{p}y)(t)$$

$$y(0)$$

$$z$$

$$(\dot{z}(t)-\dot{v}(t))(z(t)-\xi)\leq 0$$

$$z(t)\in[a,b]$$

$$c_{1}r^{I}\leq\rho\left(B_{\mathbb{R}^{d}}(x,r)\cap M\right)\leq c_{2}r^{I}$$

$$\mathrm{W}_{M}^{1,p}(\Omega):=\overline{\left\{\psi|_{\Omega}:\ \psi\in\mathrm{C}_{0}^{\infty}(\mathbb{R}^{d}),\ \mathrm{supp}(\psi)\cap M=\emptyset\right\}},$$

$$\mathrm{W}_{M}^{-1,p}(\Omega):=[\mathrm{W}_{M}^{1,p^{\prime}}(\Omega)]^{*}$$

$$\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega):=\prod_{j=1}^{m}\mathrm{W}_{\Gamma_{D_{j}}}^{1,p}(\Omega)$$

$$L_{p}:\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)=\prod_{j=1}^{m}\mathrm{W}_{\Gamma_{D_{j}}}^{1,p}(\Omega)\to\mathrm{L}^{p}(\Omega,\mathbb{R}^{md}),$$

$$I_{p}:\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\to\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),$$

$$\mathcal{A}_{p}:\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\to\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),$$

$$A_{p}:\mathrm{dom}(A_{p})=\mathrm{ran}(I_{p})\subset\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)\to\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),$$

$$\dot{y}+A_{p}y=g,\qquad y(t_{0})=0.$$

$$Y_{q}$$

$$\|(A_{p}+1)^{\theta}\exp(-A_{p}t)\|_{\mathcal{L}\left(\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)\right)}\leq C_{\theta}t^{-\theta}\exp((1-\gamma)t).$$

$$|\mathcal{W}[v_{1}](t)-\mathcal{W}[v_{2}](t)|\leq 2\sup_{0\leq\tau\leq t}|v_{1}(\tau)-v_{2}(\tau)|$$

$$\mathcal{P}[v]=\mathcal{P}_{r}[\mathcal{T}(v),v(0)-z_{0}]\in\mathrm{C}[0,T].$$

$$\|f(y_{1},x_{1})-f(y_{2},x_{2})\|_{\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)}\leq L(y_{0})\left(\|y_{1}-y_{2}\|_{\alpha}+|x_{1}-x_{2}|\right)$$

$$\|f(y,x)\|_{\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)}\leq M\left(1+\|y\|_{\alpha}+|x|\right)$$

$$\begin{aligned}\dot{y}(t)+(A_{p}y)(t)&=(F[y])(t)+u(t)&&\text{in }X&&\text{for }t>0,\\ y(0)&=0\in X.\end{aligned}$$

$$y(t)=\int_{0}^{t}\exp(-A_{p}(t-s))\left[(F[y])(s)+u(s)\right]ds,\qquad t\in J_{T}.$$

$$\dot{\zeta}(t)+(A_{p}\zeta)(t)=F^{\prime}[y;\zeta](t)+h(t)\ \text{ for }t\in J_{T},\qquad\zeta(0)=0$$

$$\dot{y}(t)+(A_{p}y)(t)$$

$$\dot{z}(t)-S\dot{y}(t)$$

$$\dot{z}(t)-\dot{v}(t)=-\frac{1}{\varepsilon}\Psi^{\prime}(z(t))\ \text{ for }t\in J_{T},\qquad z(0)=z_{0},$$

$$\dot{y}(t)+(A_{p}y)(t)$$

$$y(0)$$

$$\|y_{\varepsilon}^{u}\|_{Y_{s,0}}\leq c\left(1+\|u\|_{\mathrm{L}^{q}(J_{T};X)}\right)$$

$$0\leq\int_{0}^{T}|\dot{z}_{\varepsilon}^{u}(s)|^{2}\,ds+\sup_{t\in\overline{J_{T}}}\frac{1}{\varepsilon}\Psi(z_{\varepsilon}^{u}(t))\leq c\left(1+\|u\|_{\mathrm{L}^{2}(J_{T};X)}\right)^{2}.$$

$$|Z_{\varepsilon}(v)(t)-z_{0}|-|Z_{\varepsilon}(v)(0)-z_{0}|=\int_{0}^{t}\frac{d}{ds}|Z_{\varepsilon}(v)-z_{0}|\,ds=\int_{0}^{t}\frac{\frac{d}{ds}(Z_{\varepsilon}(v))\,(Z_{\varepsilon}(v)-z_{0})}{|Z_{\varepsilon}(v)-z_{0}|}\,ds.$$

Full text

Optimal control of reaction-diffusion systems with hysteresis

Christian Münch (Department of Mathematics - M6, Technical University of Munich, Boltzmannstr. 3, 85747 Garching, Germany. [email protected])

Abstract

This paper is concerned with the optimal control of hysteresis-reaction-diffusion systems. We study a control problem with two sorts of controls, namely distributed control functions, or controls which act on a part of the boundary of the domain. The state equation is given by a reaction-diffusion system with the additional challenge that the reaction term includes a scalar stop operator. We choose a variational inequality to represent the hysteresis. In this paper, we prove first order necessary optimality conditions. In particular, under certain regularity assumptions, we derive results about the continuity properties of the adjoint system. For the case of distributed controls, we improve the optimality conditions and show uniqueness of the adjoint variables. We employ the optimality system to prove higher regularity of the optimal solutions of our problem. Finally, we derive regularity properties of the value function of a perturbed control problem when the set of controls is restricted. The specific feature of rate-independent hysteresis in the state equation leads to difficulties concerning the analysis of the solution operator. Non-locality in time of the Hadamard derivative of the control-to-state operator complicates the derivation of an adjoint system.

Keywords: Optimal control, reaction-diffusion, semilinear parabolic evolution problem, hysteresis operator, stop operator, global existence, solution operator, Hadamard differentiability, optimality conditions, adjoint system.

MSC subject class: 49J20, 47J40, 35K51

1 Introduction

In this paper, we derive an adjoint system for the optimal control problem

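$$\min_{u\in U_{i}}J(y,u):=\frac{1}{2}\|y-y_{d}\|_{U_{1}}^{2}+\frac{\kappa}{2}\|u\|_{U_{i}}^{2}\qquad\text{(1)}$$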

subject to

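$$\begin{aligned}\dot{y}(t)+(A_{p}y)(t)&=f(y(t),z(t))+(B_{i}u)(t)&&\text{in }\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)\text{ for }t\in(0,T),&&\text{(2)}\\ y(0)&=0&&\text{in }\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),\\ z&=\mathcal{W}[Sy],&&&&\text{(3)}\end{aligned}$$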

where $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ is a product of dual spaces, see e.g. [Mün16, (16)-(18)] for the existence theory of problem (1)-(3) and related references therein. We consider either spatially distributed controls in the space $U_{1}:=\mathrm{L}^{2}\left((0,T);[\mathrm{L}^{2}(\Omega)]^{m}\right)$ , or controls which act on given Neumann boundary parts $\Gamma_{N_{j}}$ , $j\in\{1,\ldots m\}$ , of the state space, i.e. controls in $U_{2}:=\mathrm{L}^{2}\left((0,T);\prod_{j=1}^{m}\mathrm{L}^{2}(\Gamma_{N_{j}},\mathcal{H}_{d-1})\right).$ The operators $B_{1}:[\mathrm{L}^{2}(\Omega)]^{m}\rightarrow\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ and $B_{2}:\prod_{j=1}^{m}\mathrm{L}^{2}(\Gamma_{N_{j}},\mathcal{H}_{d-1})\rightarrow\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ are continuous and $A_{p}$ is an unbounded diffusion operator on the space $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ . With $i\in\{1,2\}$ , we identify $B_{i}$ with the corresponding continuous operators from $U_{i}$ into $\mathrm{L}^{2}\left((0,T);\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)\right)$ which act pointwise in time, i.e. we write $(B_{i}u)(t)=B_{i}(u(t))$ for all $t\in(0,T)$ . In the same way we identify $(A_{p}y)(t)$ with $A_{p}(y(t))$ for functions $y:(0,T)\rightarrow\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ . Moreover, $S$ projects $y$ to a scalar valued function. In particular, $\mathcal{W}$ is a scalar stop operator and it is well-known (see e.g. [Vis13], [BK13]) that $\mathcal{W}$ is represented by the solution operator $z=\mathcal{W}[v]$ of the variational inequality

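$$\begin{aligned}(\dot{z}(t)-\dot{v}(t))(z(t)-\xi)&\leq 0&&\text{for all }\xi\in[a,b]\text{ and }t\in(0,T),&&\text{(4)}\\ z(t)&\in[a,b]&&\text{for }t\in[0,T],\quad z(0)=z_{0}.&&\text{(5)}\end{aligned}$$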

For $i\in\{1,2\}$ , we denote by $G$ the operator, which maps $B_{i}u$ to the unique solution $y$ of (2)-(3), see [Mün16, Theorem 3.1]. Note that $y=G(B_{i}u)$ is a function of time with values in a product of dual spaces.
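To make the structure of the control-to-state map more tangible, the following is a minimal numerical sketch, not the method of this paper. It assumes one space dimension, a single component ($m=1$), homogeneous Neumann conditions, an explicit Euler finite-difference scheme, and a distributed control. The names `simulate`, `reaction` and `scalar_projection`, as well as the choices $f(y,z)=z-y$ and $Sy=$ spatial mean, are hypothetical and serve only as an illustration.

```python
def clamp(x, a, b):
    """Project x onto the interval [a, b]."""
    return min(max(x, a), b)


def reaction(y, z):
    """Hypothetical Lipschitz reaction term f(y, z); an assumption, not from the paper."""
    return z - y


def scalar_projection(y):
    """Hypothetical scalar projection S: the spatial mean of the grid values."""
    return sum(y) / len(y)


def simulate(u, n=20, steps=2000, T=1.0, diff=0.1, a=-0.5, b=0.5, z0=0.0):
    """Explicit Euler for y_t - diff * y_xx = f(y, z) + u with z = W[Sy], y(0) = 0.

    u is a callable u(t, x). Returns the final state y and the list of
    hysteresis values z at all time levels.
    """
    dx = 1.0 / n
    dt = T / steps
    if diff * dt / dx**2 > 0.5:
        raise ValueError("time step violates the explicit stability bound")
    y = [0.0] * n
    z = z0
    v_old = scalar_projection(y)
    zs = [z]
    for k in range(steps):
        t = k * dt
        new = []
        for i in range(n):
            left = y[i - 1] if i > 0 else y[i]        # reflecting (Neumann) ends
            right = y[i + 1] if i < n - 1 else y[i]
            lap = (left - 2.0 * y[i] + right) / dx**2
            rhs = diff * lap + reaction(y[i], z) + u(t, (i + 0.5) * dx)
            new.append(y[i] + dt * rhs)
        y = new
        v_new = scalar_projection(y)
        z = clamp(z + v_new - v_old, a, b)            # discrete stop update
        v_old = v_new
        zs.append(z)
    return y, zs


if __name__ == "__main__":
    y, zs = simulate(lambda t, x: 2.0)
    print(min(y), zs[-1])
```

The hysteresis variable is updated from the increment of $Sy$ only, so its value at any time depends on the whole history of the state, which is the non-locality in time discussed in this introduction.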

Optimal control of (systems of) partial differential equations has been analyzed extensively in the literature.

In particular, optimal control problems with state equations of semilinear parabolic type are part of the well-known monograph [Trö10] and the early work [BC85]. Further studies in this direction are the subject of [RZ98] and [Cas97]. We also refer to [HKR13] for a control problem with parabolic state equation and rough boundary conditions like in our setting.

Early studies in the field of optimal control of reaction-diffusion systems and in particular in the direction of parameter sensitivity analysis have been performed in [Gri03] and were further established in [GV06] and several more papers. Optimality conditions for a similar problem were also derived in [BJT10].

The non-linearities in all the works mentioned so far are mostly smooth enough to obtain a (twice) continuously differentiable control-to-state operator, so that first and often also second order optimality conditions could be derived.

In the literature, there are only a few results available concerning optimal control of infinite-dimensional rate-independent processes. For a class of energetically driven processes, existence of optimal controls for problems of this type was first studied in [Rin08] and [Rin09]. Subsequently, the results were applied to (thermal) control problems in the field of shape memory materials in [ELS13] and [EL14]. No optimality conditions are given in these works. Optimal control of a problem of static plasticity in the infinite-dimensional setting is the subject of [HMW12] and [HMW13]. The results were used in [HMW14] to numerically solve a quasi-static control problem by time-discretization. Optimality conditions for time-continuous, infinite-dimensional, rate-independent control problems of quasi-static plasticity type were derived in [Wac12], [Wac15], [Wac16] by means of time-discretization. Another time-continuous, infinite-dimensional optimal control problem of a rate-independent system, which is represented by its energetic formulation, is addressed in [SWW16]. With the help of viscous regularization, a necessary optimality condition is derived there.

To our knowledge, the first results for optimal control of hysteresis were achieved in [Bro87, Bro88, Bro91]. There, necessary optimality conditions for the optimal control of an ODE-system with hysteresis were established and an adjoint system was derived by a time discretization approach. Optimal control of sweeping processes has been studied in [CMF14], [Col+12] and [Col+16]. In [BK13], first order optimality conditions for a control problem of an ODE-system with hysteresis of (vectorial) stop type were derived. The stop operator is represented in the form of a variational inequality. The main challenge with the stop operator (as with all hysteresis operators) is the fact that hysteresis acts non-locally in time, so that the state $y(t)$ at each time $t\in(0,T]$ depends on the whole history on $(0,t)$. Moreover, the stop operator is not differentiable in the classical sense, and so the control-to-state operator cannot be expected to be differentiable either. Regularization techniques were used in order to derive an optimality system. Several ideas of this approach are also useful for us. Handling a reaction-diffusion system requires additional work, though. Firstly, the state vector $y:[0,T]\rightarrow\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ in (2) is a function with values in an infinite-dimensional space and secondly, the non-linearity $f$ in our case is not necessarily continuously differentiable but only locally Lipschitz continuous and directionally differentiable. Therefore, techniques as in [MS15] are required. In particular, since the domain $\Omega$ has a rough boundary, we have to consider a product of dual spaces for the domain of the diffusion operator $A_{p}$.

The existing literature provides only a few rigorous results in the field of control of hysteresis-reaction-diffusion systems, especially when it comes to optimal control of such systems. In [CC02], automatic control problems governed by reaction-diffusion systems with feedback control of relay switch and Preisach type have been studied. Global existence and uniqueness of solutions were proven. Closed-loop control of a reaction-diffusion system coupled with ordinary differential inclusions has been considered in [DN11]. A feedback law for the case with a finite number of control devices was derived.

Necessary conditions for the optimal control of (general) non-smooth semilinear parabolic equations have been established in [MS15]. In particular, the non-linearity is merely locally Lipschitz continuous and directionally differentiable so that the control-to-state operator is not differentiable in the classical sense. Regularization techniques have been used to derive an adjoint system. No hysteresis is considered in this paper. Nevertheless, a modification of the approach in [MS15] is applicable for the problem at hand. In particular, we include ideas from [BK13] and adapt the proof to apply to non-localities in time such as hysteresis. We refer to the references in [MS15] for a good overview of further contributions dealing with optimal control of non-smooth parabolic equations.

In this paper, we are interested in the optimal control of non-smooth reaction-diffusion systems with hysteresis. In particular, a scalar stop operator enters the non-linearity $f$ . The function $f$ is assumed to be locally Lipschitz continuous and directionally differentiable. Additionally, the domain $\Omega$ satisfies minimal smoothness assumptions.

The outline of the paper is as follows:

In Section 2, we introduce the framework for the rest of the work and collect results from the literature. Subsection 2.3 contains the main assumption and notation.

Our first main interest is to derive an adjoint system and first order necessary optimality conditions for problem (1)-(3).

In Section 3, we introduce a family of regularized control problems with $\varepsilon$-dependent state equations and derive adjoint systems as well as optimality conditions for those. In particular, we regularize $f$ and the stop operator $\mathcal{W}$ in dependence of the parameter $\varepsilon>0$ and replace the original control problem by a regularized one. The corresponding control-to-state operator $u\mapsto G_{\varepsilon}(B_{i}u)$, $i\in\{1,2\}$, and the regularization $y\mapsto Z_{\varepsilon}(Sy)$ of $\mathcal{W}[S\cdot]$ are Gâteaux-differentiable and we obtain optimal solutions $\overline{u}_{\varepsilon}$, $\overline{y}_{\varepsilon}=G_{\varepsilon}(B_{i}\overline{u}_{\varepsilon})$ and $\overline{z}_{\varepsilon}=Z_{\varepsilon}(S\overline{y}_{\varepsilon})$ of the regularized problems. We investigate the limit $\varepsilon\rightarrow 0$ and use standard arguments to derive a solution $(\overline{u},\overline{y},\overline{z})$ of the original problem. Deriving adjoint systems $(p_{\varepsilon},q_{\varepsilon})$ is difficult already for the regularized problems. The main result of Section 3 is Theorem 3.13, which contains the evolution equations of $p_{\varepsilon}$ and $q_{\varepsilon}$ and the adjoint equation, which provides a relation between $(p_{\varepsilon},q_{\varepsilon})$, $\overline{u}_{\varepsilon}$ and $\overline{u}$.
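As an illustration of what a regularization of the stop operator can look like, the following sketch discretizes a penalty equation of the form $\dot{z}-\dot{v}=-\frac{1}{\varepsilon}\Psi^{\prime}(z)$ by the implicit Euler method. The quadratic penalty $\Psi(z)=\frac{1}{2}\operatorname{dist}(z,[a,b])^{2}$ is an assumption made here for concreteness, and the function names are hypothetical. The regularization actually used in the paper may differ.

```python
import math


def clamp(x, a, b):
    """Project x onto the interval [a, b]."""
    return min(max(x, a), b)


def discrete_stop(v, a, b, z0):
    """Discrete stop operator: add the input increment, then project onto [a, b]."""
    z = [z0]
    for k in range(1, len(v)):
        z.append(clamp(z[-1] + v[k] - v[k - 1], a, b))
    return z


def regularized_stop(v, a, b, z0, eps, h):
    """Implicit Euler for z' - v' = -(1/eps) * Psi'(z), z(0) = z0.

    Assumes Psi(z) = 0.5 * dist(z, [a, b])**2, so Psi'(z) = z - clamp(z, a, b).
    The implicit step z_new + (h/eps) * Psi'(z_new) = w has a closed-form solution.
    """
    lam = h / eps
    z = [z0]
    for k in range(1, len(v)):
        w = z[-1] + v[k] - v[k - 1]
        if w > b:
            z_new = (w + lam * b) / (1.0 + lam)
        elif w < a:
            z_new = (w + lam * a) / (1.0 + lam)
        else:
            z_new = w
        z.append(z_new)
    return z


if __name__ == "__main__":
    h = 0.01
    v = [math.sin(h * k) for k in range(601)]
    exact = discrete_stop(v, -0.5, 0.5, 0.0)
    for eps in (1.0, 1e-2, 1e-4, 1e-8):
        approx = regularized_stop(v, -0.5, 0.5, 0.0, eps, h)
        print(eps, max(abs(p - q) for p, q in zip(exact, approx)))
```

For fixed step size the regularized output approaches the projected (stop) output as $\varepsilon$ decreases, while for large $\varepsilon$ it may leave $[a,b]$. The regularized map is smooth enough to differentiate, which is the reason for introducing it.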

In Section 4, we perform the key step towards an optimality system of (1)-(3) by driving the regularization parameter to zero. We exploit the adjoint systems $(p_{\varepsilon},q_{\varepsilon})$ to derive necessary optimality conditions for problem (1)-(3). While the evolution equation for $p$ follows in a rather straightforward way, the adjoint variable $q$, which belongs to $\overline{z}$, has lower regularity, similarly to optimal control problems with implicit state constraints in the form of variational inequalities. The function $q$ is contained in the space $\mathrm{BV}(0,T)$ of functions with bounded total variation in $[0,T]$, and instead of a time derivative we obtain a measure $dq\in\mathrm{C}([0,T])^{*}$. In order to complete our knowledge about the optimality system, we study $q$ and $dq$ in detail. Indeed, we reveal many properties of $q$ and the corresponding measure $dq$. There remains an abstract measure $d\mu\in\mathrm{C}([0,T])^{*}$ on which $dq$ depends and which we cannot fully characterize. Moreover, $d\mu$ appears in the optimality conditions for problem (1)-(3). Still, we are able to prove that $d\mu$ has its support only in a part of $[0,T]$. With an additional regularity assumption on $S\overline{y}$, we can characterize the measure $d\mu$ also in most of the parts where it does not vanish. The first main results of Section 4 are Theorem 4.13 and Corollary 4.14, which contain the existence of an adjoint system and optimality conditions for problem (1)-(3) for $i\in\{1,2\}$. After having established the optimality system for the general problem (1)-(3), $i\in\{1,2\}$, we continue to improve the optimality conditions for the particular case of distributed control functions, i.e. for $i=1$, see Corollary 4.15 below. Moreover, in Corollary 4.16 we show uniqueness of $p$, $q$ and $d\mu$ for $i=1$. Here, we make explicit use of the surjectivity of $B_{1}$, which implies that the operator $B_{1}^{*}$ in the adjoint equation is one-to-one. Together, these results form the second main result of Section 4.

In Section 5, we prove higher regularity of the optimal control $\overline{u}$ and the optimal state $\overline{y}$ by means of the adjoint equation and the continuity properties of the adjoint variables, see Theorem 5.2 below. An example for a case in which Theorem 5.2 can be applied is given in Remark 5.3.

Finally, in Section 6 we study a perturbed problem similar to (1)-(3). In particular, in Theorem 6.1 we prove regularity results for the corresponding value function.

All our results are applicable to more general spaces of control functions $U=\mathrm{L}^{2}\bigl((0,T);\tilde{U}\bigr)$, as long as there exists a continuous operator $B:\tilde{U}\rightarrow\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$. Also, $J(y,u)$ can be replaced by a general differentiable functional $J(y,u,z)$ if the corresponding reduced cost function remains coercive in $u\in U$. Moreover, $A_{p}$ can be replaced by a semi-linear parabolic operator which satisfies maximal parabolic regularity on the space $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$. We focus on the two particular control problems for $U_{1}$ and $U_{2}$ and on the operator $A_{p}$ in order to give an illustration.

Notation:

We write $\mathcal{L}(X,Y)$ for the space of linear operators between spaces $X$ and $Y$ and $\mathcal{L}(X)$ for the space of linear operators on $X$. We also abbreviate the duality in $X$ by $\langle x,y\rangle_{X^{*},X}=\langle x,y\rangle_{X}.$ By $c>0$ we denote a generic constant which is adapted in the course of the paper. In Banach space valued evolution equations like (2) we sometimes omit the range space if the latter is clear from the context, i.e. we only write "for $t\in(0,T)$".

2 Preliminaries and assumptions

We introduce the setting for the rest of the work, collect results from the literature and state the main assumption.

2.1 Sobolev spaces including homogeneous Dirichlet boundary conditions

Definition 2.1 ($I$-sets).

[Mün16, Definition 2.1] For $0<I\leq d$ and a closed set $M\subset\mathbb{R}^{d}$ let $\rho$ denote the restriction of the $I$-dimensional Hausdorff measure $\mathcal{H}_{I}$ to $M$. Then we call $M$ an $I$-set if there are constants $c_{1},c_{2}>0$ such that

$$c_{1}r^{I}\leq\rho\left(B_{\mathbb{R}^{d}}(x,r)\cap M\right)\leq c_{2}r^{I}$$

for all $x$ in $M$ and $r\in]0,1[$ .

Assumption 2.2 (Domain).

[Hal+15, Assumption 2.3 and Assumption 4.11] or [Mün16, Assumption 2.2 and Assumption 2.6] For some given $d\geq 2$ , the domain $\Omega\subset\mathbb{R}^{d}$ is bounded and $\overline{\Omega}$ is a $d$ -set. For $j\in\{1,\ldots,m\}$ the Neumann boundary part $\Gamma_{N_{j}}\subset\partial\Omega$ is open and $\Gamma_{D_{j}}=\partial\Omega\backslash\Gamma_{N_{j}}$ is a $(d-1)$ -set. For any $x\in\overline{\Gamma_{N_{j}}}$ there is an open neighborhood $U_{x}$ of $x$ and a bi-Lipschitz mapping $\phi_{x}$ from $U_{x}$ onto a cube in $\mathbb{R}^{d}$ such that $\phi_{x}(\Omega\cap U_{x})$ equals the lower half of the cube and such that $\partial\Omega\cap U_{x}$ is mapped onto the top surface of the lower half cube.

We only consider real valued functions. For each component $j\in\{1,\ldots,m\}$ of the space of vector valued functions, see Definition 2.3, we decompose the boundary $\partial\Omega$ into the corresponding Dirichlet part $\Gamma_{D_{j}}$ and the Neumann boundary $\Gamma_{N_{j}}:=\partial\Omega\backslash\Gamma_{D_{j}}$ , see Assumption 2.2. The cases $\Gamma_{D_{j}}=\emptyset$ and $\Gamma_{D_{j}}=\partial\Omega$ are not excluded.

We define Sobolev spaces which include the Dirichlet boundary conditions for our state equation.

Definition 2.3 (Sobolev spaces).

[Hal+15, Definition 2.4] or [Mün16, Definition 2.4] For $\Omega$ from Assumption 2.2 and $p\in[1,\infty)$ we denote by $\mathrm{W}^{1,p}(\Omega)$ the usual Sobolev space on $\Omega$ . If $M$ is a closed subset of $\overline{\Omega}$ we define

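$$\mathrm{W}_{M}^{1,p}(\Omega):=\overline{\left\{\psi|_{\Omega}:\ \psi\in\mathrm{C}_{0}^{\infty}(\mathbb{R}^{d}),\ \mathrm{supp}(\psi)\cap M=\emptyset\right\}},$$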

where the closure is taken in the space $\mathrm{W}^{1,p}(\Omega)$ . In the case $p\in(1,\infty)$ we denote by $p^{\prime}$ the Hölder conjugate of $p$ . Moreover, we write

$$\mathrm{W}_{M}^{-1,p}(\Omega):=[\mathrm{W}_{M}^{1,p^{\prime}}(\Omega)]^{*}$$

for the dual space of $\mathrm{W}_{M}^{1,p^{\prime}}(\Omega)$. In the vectorial setting we introduce the product space

$$\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega):=\prod_{j=1}^{m}\mathrm{W}_{\Gamma_{D_{j}}}^{1,p}(\Omega)$$

and for $p\in(1,\infty)$ we denote by $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ the (componentwise) dual space of $\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)$ .

2.2 Operators and their properties

In this section we precisely define the operators $A_{p}$ in equation (2), see Definition 2.4. We apply results from the literature to assure that $A_{p}$ satisfies the properties which we need for the analysis of (2)-(3) for particular values of $p$ to be chosen, see [Hal+15, Section 6] or [Mün16, Subsection 2.2].

Definition 2.4 (Diffusion operator).

For $p\in(1,\infty)$ we define the continuous operators

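$$L_{p}:\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)=\prod_{j=1}^{m}\mathrm{W}_{\Gamma_{D_{j}}}^{1,p}(\Omega)\to\mathrm{L}^{p}(\Omega,\mathbb{R}^{md}),$$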

and

$$I_{p}:\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\to\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),$$

With given diffusion coefficients $d_{1},\ldots,d_{m}>0$ we define the corresponding diffusion matrix in $\mathbb{R}^{md\times md}$ by $D=\mathrm{diag}(d_{1},\ldots,d_{1},\ldots,d_{m},\ldots,d_{m})$ .

For $p\in(1,\infty)$ we set

$$\mathcal{A}_{p}:\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\to\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),$$

and define the unbounded operator

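$$A_{p}:\mathrm{dom}(A_{p})=\mathrm{ran}(I_{p})\subset\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)\to\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),$$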

The set $\mathrm{ran}\left(I_{p}\right)$ stands for the range of $I_{p}$ . The domain $\mathrm{dom}(A_{p})$ is equipped with the graph norm.
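The block structure of the diffusion matrix $D$ in Definition 2.4 can be made explicit with a few lines of code. The sketch below assumes the reading that each coefficient $d_{j}$ is repeated $d$ times, once for every component of the gradient of the $j$-th species. The function name `diffusion_matrix` is hypothetical.

```python
def diffusion_matrix(coeffs, d):
    """Return D = diag(d_1,...,d_1,...,d_m,...,d_m) as nested lists of size (m*d) x (m*d).

    coeffs holds the m positive diffusion coefficients; d is the space dimension.
    """
    if d < 1:
        raise ValueError("the space dimension must be positive")
    if any(c <= 0 for c in coeffs):
        raise ValueError("diffusion coefficients must be positive")
    diagonal = [c for c in coeffs for _ in range(d)]  # repeat each d_j exactly d times
    n = len(diagonal)
    return [[diagonal[i] if i == j else 0.0 for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for row in diffusion_matrix([1.0, 2.5], 2):
        print(row)
```

For two species in two space dimensions this gives a $4\times 4$ diagonal matrix with diagonal entries $1.0, 1.0, 2.5, 2.5$.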

We introduce the notion of maximal parabolic regularity as in [Mün16, Definition 2.12].

Definition 2.5 (Maximal parabolic regularity).

For $p,q\in(1,\infty)$ and $(t_{0},T)\subset\mathbb{R}$ we say that $A_{p}$ satisfies maximal parabolic $\mathrm{L}^{q}((t_{0},T);\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega))$ -regularity if for all $g\in\mathrm{L}^{q}\left((t_{0},T);\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)\right)$ there is a unique solution $y\in\mathrm{W}^{1,q}((t_{0},T);\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega))\cap\mathrm{L}^{q}((t_{0},T);\mathrm{dom}(A_{p}))$ of the equation

$$\dot{y}+A_{p}y=g,\qquad y(t_{0})=0.$$

The time derivative is taken in the sense of distributions [Aus+14, Definition 11.2].

For $t\in[0,T]$ we abbreviate

$Y_{q}:=\mathrm{W}^{1,q}((0,T);\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega))\cap\mathrm{L}^{q}((0,T);\mathrm{dom}(A_{p}))$ , $Y_{q,t}:=\{y\in Y_{q}:\ y(t)=0\}$ and

$Y^{*}_{q,t}:=\{y\in\mathrm{W}^{1,q}(0,T;[\mathrm{dom}(A_{p})]^{*})\cap\mathrm{L}^{q}((0,T);\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)):\ y(t)=0\}$ .

As in [Mün16, Remark 2.13] note the following:

Remark 2.6 (Properties of $A_{p}$ ).

1. If Definition 2.5 applies for $A_{p}$ with some $p\in(1,\infty)$, then the property of maximal parabolic regularity is independent of $q\in(1,\infty)$ and of the interval $(t_{0},T)$, so we just say that $A_{p}$ satisfies maximal parabolic regularity on $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ in this case.

2. If $A_{p}$ satisfies maximal parabolic regularity on $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ for some $p\in(1,\infty)$, then $(\frac{d}{dt}+A_{p})^{-1}$ is bounded from $\mathrm{L}^{q}((0,T);\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega))$ to $Y_{q,0}$ for any $q\in(1,\infty)$.

3. In the setting of Assumption 2.2 there is an open interval $\mathrm{J}$ containing $2$ such that for $p\in\mathrm{J}$ the operator $\mathcal{A}_{p}+I_{p}$ is a topological isomorphism and such that $-A_{p}$ generates an analytic semigroup of operators on $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ [Mün16, Theorem 2.10] or [Hal+15, Theorem 5.6 and Theorem 5.12].

4. If $p\in\mathrm{J}$ and if $\theta\geq 0$ is given, then for $A_{p}+1:=A_{p}+\mathrm{Id}$ the fractional power spaces $X^{\theta}:=\mathrm{dom}([A_{p}+1]^{\theta})\subset\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ and the unbounded operators $[A_{p}+1]^{\theta}$ in the sense of [H81, Chapter 1] are well-defined with $X^{0}=\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$. $X^{\theta}$ is equipped with the norm $\|x\|_{X^{\theta}}=\|(A_{p}+1)^{\theta}x\|_{\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)}$ [cf. Mün16, Remark 2.11]. Note that we can identify $X^{1}$ with the space $\mathrm{dom}(A_{p})$ endowed with the graph norm.

5. If $p\in\mathrm{J}\cap[2,\infty)$, then $A_{p}$ satisfies maximal parabolic Sobolev regularity on $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ and we have the topological equivalences $[\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)]_{\theta}\simeq[\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega),\mathrm{dom}(A_{p})]_{\theta}\simeq X^{\theta}$ for all $\theta\in(0,1)$ [CA01, Theorem 11.6.1]. By $[\cdot,\cdot]_{\theta}$ we mean complex interpolation.

We will make use of the following embeddings:

Remark 2.7 (Embeddings).

[Mün16, Remark 2.14] With $q\in(1,\infty)$ one has

[TABLE]

for every $0<\theta<\eta<1-1/q$ and $0\leq\beta<1-1/q-\eta$ . $(\cdot,\cdot)_{\eta,1}$ or $(\cdot,\cdot)_{\eta,q}$ respectively means real interpolation. The first embeddings are compact because $\mathrm{dom}(A_{p})$ is compactly embedded into $\mathbb{W}_{\mathrm{\Gamma_{D}}}^{-1,p}(\Omega)$ .

With $p\in\mathrm{J}$ , the following estimate for the fractional powers of $A_{p}+1$ and the analytic semigroup $\exp(-A_{p}t)$ is crucial:

Remark 2.8.

[Mün16, Remark 2.15] Let $p\in\mathrm{J}$ with $\mathrm{J}$ from Remark 2.6. For $t>0$ and arbitrary $\gamma\in(0,1)$ and $\theta\geq 0$ there exists some $C_{\theta}\in(0,\infty)$ such that

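$$\|(A_{p}+1)^{\theta}\exp(-A_{p}t)\|_{\mathcal{L}\left(\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)\right)}\leq C_{\theta}t^{-\theta}\exp((1-\gamma)t).$$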
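The smoothing estimate of Remark 2.8 can be checked by hand in a scalar toy model. Assume, purely for illustration, that $A_{p}$ is replaced by a single eigenvalue $\lambda\geq 0$. Then the claim reads $(\lambda+1)^{\theta}e^{-\lambda t}\leq C_{\theta}t^{-\theta}e^{(1-\gamma)t}$, and maximizing $\mu^{\theta}e^{-(1-\gamma)\mu t}$ over $\mu=\lambda+1$ suggests the constant $C_{\theta}=\bigl(\theta/(e(1-\gamma))\bigr)^{\theta}$ for $\theta>0$. This constant and the helper names below are assumptions of the sketch, not statements from the cited references.

```python
import math


def scalar_constant(theta, gamma):
    """Constant C_theta that works in the scalar model (assumed, derived by calculus)."""
    if theta == 0:
        return 1.0
    return (theta / (math.e * (1.0 - gamma))) ** theta


def estimate_holds(lam, t, theta, gamma):
    """Check (lam + 1)**theta * exp(-lam * t) <= C * t**(-theta) * exp((1 - gamma) * t)."""
    lhs = (lam + 1.0) ** theta * math.exp(-lam * t)
    rhs = scalar_constant(theta, gamma) * t ** (-theta) * math.exp((1.0 - gamma) * t)
    return lhs <= rhs * (1.0 + 1e-12)


if __name__ == "__main__":
    print(estimate_holds(lam=10.0, t=0.1, theta=0.5, gamma=0.5))
```

The factor $t^{-\theta}$ reflects the singularity at $t=0$: applying a fractional power of the generator to the semigroup is bounded for every $t>0$, but the bound blows up as $t\to 0$.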

The stop operator has the following regularity properties.

Lemma 2.9 (Stop operator).

With $T>0$ the stop operator $\mathcal{W}$ , which is represented by (4)-(5), is Lipschitz continuous as a mapping on $\mathrm{C}[0,T]$ and

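$$\begin{aligned}|\mathcal{W}[v_{1}](t)-\mathcal{W}[v_{2}](t)|&\leq 2\sup_{0\leq\tau\leq t}|v_{1}(\tau)-v_{2}(\tau)|,\\ |\mathcal{W}[v](t)|&\leq 2\sup_{0\leq\tau\leq t}|v(\tau)|+|z_{0}|&&\text{(7)}\end{aligned}$$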

for all $v,v_{1},v_{2}\in\mathrm{C}[0,T]$ and $t\in[0,T]$. Note that we have to add $|z_{0}|$ in (7) because, by (5), $\mathcal{W}[v](0)=z_{0}$ for any $v\in\mathrm{C}[0,T]$. For $q\in[1,\infty)$ the operator $\mathcal{W}$ is also bounded and weakly continuous on $\mathrm{W}^{1,q}(0,T)$. Moreover, $\mathcal{W}:\mathrm{C}[0,T]\rightarrow\mathrm{L}^{q}(0,T)$ is Hadamard directionally differentiable, see Definition 2.11 below. The same regularity properties hold for the scalar play operator $\mathcal{P}=\mathrm{Id}-\mathcal{W}$. More precisely, for $r=\frac{b-a}{2}$ let $\mathcal{P}_{r}:\mathrm{C}[0,T]\times\mathbb{R}\rightarrow\mathrm{C}[0,T]$ denote a symmetric scalar play operator (as in [BK15]). Consider the affine linear transformation $\mathcal{T}:[a,b]\rightarrow[-r,r],\ \mathcal{T}:x\mapsto x-\frac{b+a}{2}.$ Then for $v\in\mathrm{C}[0,T]$ there holds

$$\mathcal{P}[v]=\mathcal{P}_{r}[\mathcal{T}(v),v(0)-z_{0}]\in\mathrm{C}[0,T].$$

Proof.

This follows from [Mün16, Subsection 2.4 and Subsection 4.2], see also [Vis13, Part 1, Chapter III] and [BK15]. ∎
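The properties in Lemma 2.9 are easy to observe on sampled inputs. The following sketch assumes a discrete-time version of the scalar stop operator: at every step the input increment is added to $z$ and the result is projected back onto $[a,b]$, which reproduces the stop operator at the nodes of a piecewise linear input. The function names are hypothetical.

```python
import math


def stop_operator(v, a, b, z0):
    """Discrete scalar stop: z_k = clamp(z_{k-1} + v_k - v_{k-1}, a, b), z_0 = z0."""
    if not a <= z0 <= b:
        raise ValueError("z0 must lie in [a, b]")
    z = [z0]
    for k in range(1, len(v)):
        trial = z[-1] + (v[k] - v[k - 1])
        z.append(min(max(trial, a), b))
    return z


def play_operator(v, a, b, z0):
    """Discrete scalar play, defined through P = Id - W."""
    return [vk - zk for vk, zk in zip(v, stop_operator(v, a, b, z0))]


if __name__ == "__main__":
    ts = [0.01 * k for k in range(601)]
    v1 = [math.sin(t) for t in ts]
    v2 = [math.sin(t) + 0.1 * math.cos(3.0 * t) for t in ts]
    z1 = stop_operator(v1, -0.5, 0.5, 0.0)
    z2 = stop_operator(v2, -0.5, 0.5, 0.0)
    gap = max(abs(p - q) for p, q in zip(z1, z2))
    sup = max(abs(p - q) for p, q in zip(v1, v2))
    print(gap, 2.0 * sup)  # the first number should not exceed the second
```

Both outputs start from the same $z_{0}$, as required for the Lipschitz estimate with constant $2$. The output at step $k$ depends on all earlier increments of the input, which illustrates why the operator is non-local in time.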

2.3 Assumptions and notation

Our main assumption is the following:

Assumption 2.10 (Main assumption).

[Mün16, Assumptions 2.16, 4.6 and 5.1] We always suppose that Assumption 2.2 holds. Moreover we assume:

(A1)

Dimension and Sobolev exponent: $d\geq 2$ and with $\mathrm{J}$ from Remark 2.6 there holds $p\in\mathrm{J}\cap[2,\infty)$ and $2\geq p\left(1-\frac{1}{d}\right)$ .

(A2)

Nonlinearity locally Lipschitz + Hadamard: We will need a fractional power space $X^{\alpha}=\mathrm{dom}([A_{p}+1]^{\alpha})$ with exponent strictly smaller than one half. This fact is highlighted by a new parameter $\alpha$ which we use instead of $\theta\in[0,\infty)$ . For some $\alpha\in(0,\frac{1}{2})$ suppose that the function $f:X^{\alpha}\times\mathbb{R}\rightarrow\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ is locally Lipschitz continuous with respect to the $X^{\alpha}$ -norm. This means that given any $y_{0}\in X^{\alpha}$ there is a constant $L(y_{0})$ and a neighbourhood $V(y_{0})=\left\{y\in X^{\alpha}:\|y-y_{0}\|_{X^{\alpha}}\leq\delta\in(0,\infty)\right\}$ of $y_{0}$ such that

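$$\|f(y_{1},x_{1})-f(y_{2},x_{2})\|_{\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)}\leq L(y_{0})\left(\|y_{1}-y_{2}\|_{X^{\alpha}}+|x_{1}-x_{2}|\right)$$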

for every $y_{1},y_{2}\in V(y_{0})$ and all $x_{1},x_{2}\in\mathbb{R}$ . $f$ is assumed to be directionally differentiable and therefore Hadamard directionally differentiable, see Definition 2.11 below. Furthermore, the linear growth condition

$$\|f(y,x)\|_{\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)}\leq M\left(1+\|y\|_{X^{\alpha}}+|x|\right)$$

holds for some $M>0$ .

(A3)

Scalar projection: For some $w\in\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)\backslash\{0\}$ the operator $S\in[\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)]^{*}$ in equation (3) is given by $Sy=\langle y,w\rangle_{\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)}\ \forall y\in\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega).$ We assume that $w$ is even contained in the space $\mathrm{dom}([(1+A_{p})^{1-\alpha}]^{*})$ . Note that $S$ belongs to $[X^{\theta}]^{*}$ for all $\theta\geq 0$ because of the embedding $X^{\theta}\hookrightarrow\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ .

(A4)

Desired state: The desired state $y_{d}$ in (1) is in $\mathrm{L}^{2}\left((0,T);[\mathrm{L}^{2}(\Omega)]^{m}\right)$ .

We introduce some more notation for the rest of the work:

(N1)

For the particular $p$ from (A1) in Assumption 2.10 we set $X:=\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ with $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ from Definition 2.3. We sometimes identify elements $v\in X^{*}$ with their Riesz representation in $\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)$ , i.e. $\langle v,y\rangle_{X}=\langle y,v\rangle_{\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)},\ \forall y\in X.$

(N2)

The operators $A_{p}$ and the spaces $X^{\theta}=\mathrm{dom}([A_{p}+1]^{\theta})$ are defined as in Definition 2.4 and Remark 2.6.

(N3)

The spaces $Y_{q}$ , $Y_{q,t}$ and $Y^{*}_{q,t}$ are defined as in Definition 2.5.

(N4)

$\mathcal{W}$ is the scalar stop operator from Lemma 2.9.

(N5)

$B_{1}$ is defined by $B_{1}:[\mathrm{L}^{2}(\Omega)]^{m}\rightarrow X,\ \langle B_{1}u,v\rangle_{\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)}:=\int_{\Omega}u\cdot v\,dx\ \forall v\in\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega).$

Since $2\geq p\left(1-\frac{1}{d}\right)$ , the embeddings $\mathrm{L}^{2}(\Gamma_{N_{j}},\mathcal{H}_{d-1})\hookrightarrow\mathrm{W}_{\mathrm{\Gamma_{D}}_{j}}^{-1,p}(\Omega)$ are continuous for $j\in\{1,\ldots,m\}$ [Hal+15, Remark 5.11]. Therefore also the operator

$B_{2}:\prod_{j=1}^{m}\mathrm{L}^{2}(\Gamma_{N_{j}},\mathcal{H}_{d-1})\rightarrow X,\ \langle B_{2}y,v\rangle_{\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)}:=\sum_{j=1}^{m}\int_{\Gamma_{N_{j}}}y_{j}v_{j}\,d\mathcal{H}_{d-1}\ \forall v\in\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)$ is continuous.

(N6)

We write $J_{T}=(0,T)$ , $U_{1}=\mathrm{L}^{2}\left(J_{T};[\mathrm{L}^{2}(\Omega)]^{m}\right)$ and $U_{2}=\mathrm{L}^{2}\left(J_{T};\prod_{j=1}^{m}\mathrm{L}^{2}(\Gamma_{N_{j}},\mathcal{H}_{d-1})\right).$
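The exponent condition $2\geq p\left(1-\frac{1}{d}\right)$ from (A1), which is what makes $B_{2}$ continuous, is equivalent to $p\leq\frac{2d}{d-1}$. The following minimal sketch only evaluates this arithmetic bound; the function names are hypothetical, and membership of $p$ in the set $\mathrm{J}$ from Remark 2.6 is not checked.

```python
from fractions import Fraction


def max_sobolev_exponent(d: int) -> Fraction:
    """Largest p with p * (1 - 1/d) <= 2, i.e. p <= 2d / (d - 1)."""
    if d < 2:
        raise ValueError("(A1) requires d >= 2")
    return Fraction(2 * d, d - 1)


def satisfies_exponent_condition(p, d: int) -> bool:
    """Check p >= 2 and 2 >= p * (1 - 1/d); membership in J is NOT checked."""
    p = Fraction(p)
    return p >= 2 and p * (1 - Fraction(1, d)) <= 2


if __name__ == "__main__":
    for d in (2, 3, 4):
        print(d, max_sobolev_exponent(d))
```

In particular, the admissible range shrinks towards $p=2$ as the dimension grows.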

2.4 Solution operator and optimal control

As in [Mün16, Equation (1)] we denote $F[y](t):=f(y(t),\mathcal{W}[Sy](t))$ and introduce the more general abstract evolution equation

$\dot{y}(t)+(A_{p}y)(t)=(F[y])(t)+u(t)\ \text{in }X\text{ for }t>0,\quad y(0)=0.\qquad(8)$

Note that $F[y]$ is non-local in time. In order to obtain some kind of differentiability of the reduced cost function, the solution operator of the state equation has to be differentiable in a sense which allows for the chain rule. We cannot expect a Fréchet derivative because of the non-smooth hysteresis operator, see [BK15]. But the chain rule can also be applied within the weaker concept of Hadamard directional differentiability.

Definition 2.11.

[Hadamard directional differentiability] Let $X,Y$ be normed vector spaces and let $U\subset X$ be open. If $g:U\rightarrow Y$ is directionally differentiable at $x\in U$ and if in addition for all functions $r:[0,\lambda_{0})\rightarrow X$ with $\lim\limits_{\lambda\rightarrow 0}\frac{r(\lambda)}{\lambda}=0$ it holds $g^{\prime}[x;h]=\lim\limits_{\lambda\downarrow 0}\frac{g(x+\lambda h+r(\lambda))-g(x)}{\lambda}$ for all directions $h\in X$ , we call $g^{\prime}[x;h]$ the Hadamard directional derivative of $g$ at $x$ in the direction $h$ . Note that $g(x+\lambda h+r(\lambda))$ is only well defined if $\lambda$ is already small enough so that $x+\lambda h+r(\lambda)\in U$ . The chain rule applies for Hadamard directionally differentiable functions [Mün16, Lemma 4.3].
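To illustrate Definition 2.11 in the simplest scalar setting, the following sketch evaluates the perturbed difference quotient for $g(x)=|x|$ at $x=0$. The choice of $g$ and of the perturbation $r(\lambda)=\lambda^{2}$ is an assumption made for illustration and is not taken from [Mün16]. The quotient tends to $|h|$, which is positively homogeneous but not linear in $h$, so $g$ is Hadamard directionally differentiable at $0$ without being Gâteaux differentiable there.

```python
def hadamard_quotient(g, x, h, lam, r):
    """Difference quotient (g(x + lam*h + r(lam)) - g(x)) / lam from Definition 2.11."""
    return (g(x + lam * h + r(lam)) - g(x)) / lam


def r(lam):
    """Illustrative perturbation with r(lam) / lam -> 0 as lam -> 0."""
    return lam ** 2


if __name__ == "__main__":
    g = abs  # non-smooth but Lipschitz continuous
    for h in (1.0, -2.0):
        quotients = [hadamard_quotient(g, 0.0, h, 10.0 ** -k, r) for k in range(1, 8)]
        # The quotients approach |h| as lam decreases.
        print(h, quotients[-1])
```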

Hadamard directional differentiability of the solution operator $G$ is shown in [Mün16]. By [Mün16, Theorem 3.1 and Theorem 4.7] we have:

Theorem 2.12 (Solution operator for the state equation).

Let Assumption 2.10 hold. Then for the fixed value $\alpha\in\left(0,\frac{1}{2}\right)$ and for all $u\in\mathrm{L}^{q}(J_{T};X)$ with $q\in(\frac{1}{1-\alpha},\infty]$ problem (8) has a unique mild solution $y=y(u)=:y^{u}$ in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ . In particular, this means that $(F[y])+u$ is contained in $\mathrm{L}^{1}(J_{T};X)$ and that $y$ solves the integral equation

[TABLE]

The solution mapping $G:u\mapsto y(u),\ \mathrm{L}^{q}(J_{T};X)\rightarrow\mathrm{C}(\overline{J_{T}};X^{\alpha})$ is locally Lipschitz continuous. $G$ is linearly bounded with values in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ . All statements remain valid if $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ is replaced by $Y_{s,0}$ where $s=q$ if $q<\infty$ and with $s\in(1,\infty)$ arbitrary if $q=\infty$ . $G$ is Hadamard directionally differentiable as a mapping into $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ as well as into $Y_{q,0}$ for any $q\in(\frac{1}{1-\alpha},\infty)$ . Its derivative $y^{u,h}:=G^{\prime}[u;h]$ at $u\in\mathrm{L}^{q}(J_{T};X)$ in direction $h\in\mathrm{L}^{q}(J_{T};X)$ is given by the unique solution $\zeta$ of

[TABLE]

where $F^{\prime}[y;\zeta](t)=f^{\prime}[(y(t),\mathcal{W}[Sy](t));(\zeta(t),\mathcal{W}^{\prime}[Sy;S\zeta](t))]$ and $y=G(u)$ . The mapping $h\mapsto G^{\prime}[u;h]$ is Lipschitz continuous from $\mathrm{L}^{q}(J_{T};X)$ to $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and to $Y_{q,0}$ with a modulus $C=C(G(u),T)>0$ .

Proof.

See [Mün16, Theorem 3.1 and Theorem 4.7]. ∎

Existence of an optimal control for problem (1)-(3) is shown in [Mün16, Theorem 5.4]:

Theorem 2.13 (Existence of optimal control).

Let Assumption 2.10 hold. Then for $i\in\{1,2\}$ , there exists an optimal control $\overline{u}\in U_{i}$ for the optimal control problem (1)-(3). This means that $\overline{u}$ , together with the optimal state $\overline{y}=G(\overline{u})$ , which solves (2), are a solution of the minimization problem (1). The solution of (3) is given by $\overline{z}=\mathcal{W}[S\overline{y}]$ .

Proof.

See [Mün16, Theorem 5.4]. ∎

3 Regularized control problem

In order to derive an adjoint system for problem (1)-(3) we introduce a sequence of control problems with regularized $\varepsilon$ -dependent state equations, for which we can derive adjoint systems. To this end we regularize the variational inequality which defines $\mathcal{W}$ and the non-linearity $f$ , which yields a regularization of the solution operator of (8). The regularization of $\mathcal{W}$ follows the techniques in [BK13, Section 3] and the approach for the regularization of semilinear parabolic equations relies on [MS15, Section 4].

At the end of Subsection 3.1, we estimate the norms of the solutions of the regularized state equations against the forcing term $u$ , independently of $\varepsilon$ .

The dependence of the regularized state equations on $\varepsilon$ is analyzed in Subsection 3.2: The estimates from Subsection 3.1 together with a weak compactness argument imply weak continuity of the regularized solution operators for fixed $\varepsilon>0$ . This yields weakly converging subsequences $y_{\varepsilon_{k}}$ and $z_{\varepsilon_{k}}$ for any weakly converging sequence $u_{\varepsilon}$ , $\varepsilon\rightarrow 0$ .

In Subsection 3.3, we apply the convergence result from Subsection 3.2 to deduce convergence of the solutions of regularized control problems, which are introduced in Subsection 3.3, to an optimal solution of problem (1)-(3) as $\varepsilon\rightarrow 0$ , see Theorem 3.9.

The adjoint equations for the solutions of the regularized control problems with $\varepsilon>0$ fixed are derived in Subsection 3.5, see Theorem 3.13 below.

In Subsection 3.6, we derive uniform-in- $\varepsilon$ bounds for the norms of the adjoint variables $p_{\varepsilon},q_{\varepsilon}$ from Theorem 3.13.

The norm bounds on $p_{\varepsilon},q_{\varepsilon}$ from Subsection 3.6 give rise to weakly converging subsequences $p_{\varepsilon_{k}}$ and $q_{\varepsilon_{k}}$ . Taking the limit $k\rightarrow\infty$ then yields an adjoint system for (1)-(3). This step is carried out in Section 4.

We begin with several assumptions on the functions which will enter the regularized problems.

Assumption 3.1 (Regularization).

For $\varepsilon_{*}>0$ and $\varepsilon\in(0,\varepsilon_{*}]$ we assume that:

$(A1)_{\varepsilon}$

$f_{\varepsilon}:X^{\alpha}\times\mathbb{R}\rightarrow X$ is Gâteaux differentiable.

$(A2)_{\varepsilon}$

$\sup_{(y,z)\in X^{\alpha}\times\mathbb{R}}\|f_{\varepsilon}(y,z)-f(y,z)\|_{X}\rightarrow 0$ as $\varepsilon\rightarrow 0$ .

$(A3)_{\varepsilon}$

$f_{\varepsilon}$ is locally Lipschitz continuous with respect to the $X^{\alpha}$ -norm and all the neighbourhoods and Lipschitz constants are equal to the ones of $f$ in (A2) in Assumption 2.10, independently of $\varepsilon$ . The growth condition $\|f_{\varepsilon}(y,x)\|_{X}\leq M\left(1+\|y\|_{X^{\alpha}}+|x|\right)$ holds for all $y\in X^{\alpha}$ and $x\in\mathbb{R}$ , with $M$ from (A2) in Assumption 2.10.

$(A4)_{\varepsilon}$

Following the ideas of [BK13], we introduce a convex function $\Psi:\mathbb{R}\rightarrow\mathbb{R}$ with $\Psi(x)\equiv 0$ for $x\in[a,b]$ and $\Psi(x)>0$ for $x\in\mathbb{R}\backslash[a,b]$ . We assume that $\Psi$ is twice continuously differentiable and $\Psi^{\prime}(x)\leq m_{1}|x-a|$ for some $m_{1}>0$ and all $x\in\mathbb{R}$ . Moreover, $\Psi^{\prime\prime}(x)\leq m_{2}$ for some $m_{2}>0$ and all $x\in\mathbb{R}$ and $\Psi^{\prime\prime}$ is assumed to be locally Lipschitz continuous.

Remark 3.2.

A function $\Psi$ as in Assumption 3.1 can be constructed as a piecewise defined mapping $\Psi=\chi_{(-\infty,a_{1}]}\Psi_{-2}+\chi_{(a_{1},a]}\Psi_{-1}+\chi_{(b,b_{1}]}\Psi_{1}+\chi_{(b_{1},\infty)}\Psi_{2},$ where $a_{1}<a<b<b_{1}$ . $\chi$ denotes the characteristic function. $\Psi_{-2}$ and $\Psi_{2}$ are affine linear and $\Psi_{-1}$ and $\Psi_{1}$ are polynomials of order four with roots in $a$ and $b$ which are at the same time saddle points and with turning points in $a_{1}$ and $b_{1}$ .

For example we can choose $b_{1}:=b+2$ , $\Psi_{1}(x):=(x-b)^{3}(4+b-x)$ and $\Psi_{2}(x):=16(x-1-b)$ and define $\Psi_{-1},\Psi_{-2}$ in a similar way, cf. Figure 1. Local Lipschitz continuity of $\Psi^{\prime\prime}$ also in the points where the functions $\Psi_{-2},\ldots,\Psi_{2}$ are glued together is not hard to see. It follows that $\Psi^{\prime\prime}$ is Lipschitz continuous.
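The gluing conditions for this example can be checked by hand or numerically. The sketch below evaluates $\Psi_{1}$, $\Psi_{2}$ and the first two derivatives of $\Psi_{1}$ at $x=b$ and $x=b_{1}=b+2$; the derivative formulas are computed here from $\Psi_{1}(x)=(x-b)^{3}(4+b-x)$, and the value $b=0.5$ is an arbitrary illustrative choice.

```python
def psi1(x, b):
    """Psi_1(x) = (x - b)^3 (4 + b - x), used on (b, b + 2]."""
    return (x - b) ** 3 * (4 + b - x)


def dpsi1(x, b):
    """First derivative of Psi_1: 12 s^2 - 4 s^3 with s = x - b."""
    s = x - b
    return 12 * s ** 2 - 4 * s ** 3


def d2psi1(x, b):
    """Second derivative of Psi_1: 24 s - 12 s^2 with s = x - b."""
    s = x - b
    return 24 * s - 12 * s ** 2


def psi2(x, b):
    """Affine continuation Psi_2(x) = 16 (x - 1 - b), used on (b + 2, infinity)."""
    return 16 * (x - 1 - b)


if __name__ == "__main__":
    b = 0.5          # illustrative value
    b1 = b + 2
    # Value, slope and curvature of Psi_1 and Psi_2 agree at b1.
    print(psi1(b1, b), psi2(b1, b))
    print(dpsi1(b1, b), 16)
    print(d2psi1(b1, b), 0)
    # At x = b everything up to second order vanishes, matching Psi = 0 on [a, b].
    print(psi1(b, b), dpsi1(b, b), d2psi1(b, b))
```

Since $\Psi_{1}^{\prime\prime}(x)=12(x-b)(2+b-x)\geq 0$ on $[b,b+2]$ with maximum $12$ at $x=b+1$, this piece is convex and one may take $m_{2}=12$ on it.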

We introduce the following regularized state equations for $i\in\{1,2\}$ and $\varepsilon>0$ :

[TABLE]

3.1 Regularization of (8) and uniform-in- $\varepsilon$ estimates

In this subsection, we introduce a regularization of (8), similar to the regularized state equations (10)-(11) but for source terms $u\in\mathrm{L}^{q}(J_{T};X)$ . We show well-posedness and estimate the norms of the solutions in $u$ , independently of $\varepsilon$ . The ideas for many of the steps in this subsection go back to [BK13, Subsection 3.1].

Definition 3.3 (Regularized stop).

For $\varepsilon\in(0,\varepsilon_{*}]$ we denote by $Z_{\varepsilon}:v\mapsto Z_{\varepsilon}(v)$ the solution operator of

$\dot{z}(t)-\dot{v}(t)=-\frac{1}{\varepsilon}\Psi^{\prime}(z(t))\ \text{for }t\in J_{T},\quad z(0)=z_{0},$

or of the corresponding integral equation. The input $v$ is a function defined on $J_{T}$ .
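The behaviour of $Z_{\varepsilon}$ for small $\varepsilon$ can be explored numerically. The sketch below assumes the evolution law $\dot{z}=\dot{v}-\frac{1}{\varepsilon}\Psi^{\prime}(z)$, $z(0)=z_{0}$, which is the form used in the proof of Lemma 3.6, together with the example for $\Psi$ from Remark 3.2 (mirrored below $a$). The interval $[a,b]=[-0.5,0.5]$, the input $v(t)=\sin(2\pi t)$, the explicit Euler scheme and the increment-and-project discretization of the stop operator are illustrative assumptions, not part of the analysis.

```python
import math

A, B = -0.5, 0.5  # hypothetical interval [a, b]


def dpsi(x):
    """Psi' for the example of Remark 3.2, mirrored construction below a."""
    if x > B:
        s = x - B
        return 12 * s ** 2 - 4 * s ** 3 if s <= 2 else 16.0
    if x < A:
        s = A - x
        return -(12 * s ** 2 - 4 * s ** 3) if s <= 2 else -16.0
    return 0.0


def regularized_stop(v, dt, eps, z0=0.0):
    """Explicit Euler for z' = v' - Psi'(z) / eps with z(0) = z0."""
    z = [z0]
    for n in range(len(v) - 1):
        z.append(z[-1] + (v[n + 1] - v[n]) - dt / eps * dpsi(z[-1]))
    return z


def stop(v, z0=0.0):
    """Time-discrete stop: add the input increment, then project onto [a, b]."""
    z = [z0]
    for n in range(len(v) - 1):
        z.append(min(B, max(A, z[-1] + v[n + 1] - v[n])))
    return z


dt, T = 1e-3, 2.0
t = [n * dt for n in range(int(round(T / dt)) + 1)]
v = [math.sin(2 * math.pi * tn) for tn in t]
z_stop = stop(v)

# Maximal distance between regularized and unregularized output per eps.
gaps = {}
for eps in (0.1, 0.01):
    z_eps = regularized_stop(v, dt, eps)
    gaps[eps] = max(abs(p - q) for p, q in zip(z_eps, z_stop))
    print(eps, gaps[eps])
```

The explicit scheme is only stable for step sizes well below $\varepsilon/m_{2}$; for much smaller $\varepsilon$ an implicit discretization would be the safer choice.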

Remark 3.4.

By standard techniques it follows that $Z_{\varepsilon}$ is continuously differentiable on $\mathrm{C}(\overline{J_{T}})$ . Its derivative at $v$ in direction $h$ is given by the unique solution $Z^{\prime}_{\varepsilon}[v;h]=z$ of the integral equation $z(t)=h(t)-\int_{0}^{t}\frac{1}{\varepsilon}\Psi^{\prime\prime}(Z_{\varepsilon}(v)(s))z(s)ds.$ $Z_{\varepsilon}$ is bounded on $\mathrm{W}^{1,q}(J_{T})$ for all $q\in(1,\infty)$ .
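The linearized integral equation of Remark 3.4 can be compared with a finite difference quotient of $Z_{\varepsilon}$. The sketch below does this on the discrete level, again under illustrative assumptions: the example for $\Psi$ from Remark 3.2 with the hypothetical interval $[a,b]=[-0.5,0.5]$, the evolution law $\dot{z}=\dot{v}-\frac{1}{\varepsilon}\Psi^{\prime}(z)$ as used in the proof of Lemma 3.6, an explicit Euler scheme, and a direction $h$ with $h(0)=0$.

```python
import math

A, B = -0.5, 0.5  # hypothetical interval [a, b]


def dpsi(x):
    """Psi' for the example of Remark 3.2, mirrored below a."""
    s = x - B if x > B else (A - x if x < A else 0.0)
    val = 12 * s ** 2 - 4 * s ** 3 if s <= 2 else 16.0
    return val if x > B else (-val if x < A else 0.0)


def d2psi(x):
    """Psi'' for the same example; zero on [a, b] and on the affine parts."""
    s = x - B if x > B else (A - x if x < A else 0.0)
    return 24 * s - 12 * s ** 2 if s <= 2 else 0.0


def solve_state(v, dt, eps, z0=0.0):
    """Explicit Euler for z' = v' - Psi'(z) / eps."""
    z = [z0]
    for n in range(len(v) - 1):
        z.append(z[-1] + (v[n + 1] - v[n]) - dt / eps * dpsi(z[-1]))
    return z


def solve_linearized(z, h, dt, eps):
    """Explicit Euler for w(t) = h(t) - int_0^t Psi''(z(s)) w(s) / eps ds, h(0) = 0."""
    w = [0.0]
    for n in range(len(h) - 1):
        w.append(w[-1] + (h[n + 1] - h[n]) - dt / eps * d2psi(z[n]) * w[-1])
    return w


dt, T, eps, lam = 1e-3, 2.0, 0.05, 1e-6
t = [n * dt for n in range(int(round(T / dt)) + 1)]
v = [math.sin(2 * math.pi * tn) for tn in t]
h = list(t)  # direction h(t) = t, so h(0) = 0

z = solve_state(v, dt, eps)
w = solve_linearized(z, h, dt, eps)
z_pert = solve_state([vn + lam * hn for vn, hn in zip(v, h)], dt, eps)
fd = [(zp - zn) / lam for zp, zn in zip(z_pert, z)]
err = max(abs(a_ - b_) for a_, b_ in zip(fd, w))
print(err)
```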

Similar to the definition of $F$ in Subsection 2.4 we denote $(F_{\varepsilon}(y))(t):=f_{\varepsilon}(y(t),Z_{\varepsilon}(Sy)(t))$ . Consider the abstract evolution equation

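$\dot{y}(t)+(A_{p}y)(t)=(F_{\varepsilon}(y))(t)+u(t)\ \text{in }X\text{ for }t\in J_{T},\quad y(0)=0.\qquad(12)$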

Corollary 3.5 (Existence of regularized problem).

Let Assumption 2.10 and Assumption 3.1 hold and let $\varepsilon\in(0,\varepsilon_{*}]$ be arbitrary. Furthermore, assume $q\in(\frac{1}{1-\alpha},\infty]$ and set $s=q$ if $q<\infty$ or $s\in(1,\infty)$ arbitrary if $q=\infty$ . Then for all $u\in\mathrm{L}^{q}(J_{T};X)$ problem (12) has a unique solution $y_{\varepsilon}(u)$ in $Y_{s,0}$ . The solution mapping $G_{\varepsilon}:u\mapsto y_{\varepsilon}(u)=:y_{\varepsilon}^{u}$ is locally Lipschitz continuous from $\mathrm{L}^{q}(J_{T};X)$ to $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and to $Y_{s,0}$ . We denote $z_{\varepsilon}^{u}:=z_{\varepsilon}(u):=Z_{\varepsilon}(Sy_{\varepsilon}^{u})$ .

Proof.

Unique solvability of (12) and local Lipschitz continuity of the solution mapping follow because $Z_{\varepsilon}$ satisfies the properties of $\mathcal{W}$ in Theorem 2.12. ∎

In the next step we estimate the norms of the solutions of (12) independently of $\varepsilon$ by the norm of the source function $u\in\mathrm{L}^{q}(J_{T};X)$ . This yields analogous estimates also for the solutions of (10)-(11) if $u$ is replaced by $B_{i}u$ .

Lemma 3.6 (Uniform bounds).

Adopt the assumptions and the notation from Corollary 3.5. There exists a constant $c>0$ which is independent of $\varepsilon$ and $u$ such that the following holds true. For all $q\in(\frac{1}{1-\alpha},\infty]$ and $\varepsilon\in(0,\varepsilon_{*}]$ we have

[TABLE]

with $s=q$ if $q<\infty$ and for all $s\in(1,\infty)$ if $q=\infty$ . Moreover, there holds

[TABLE]

Proof.

Note first that for $v\in\mathrm{W}^{1,s}(J_{T})$ and for $t\in J_{T}$ we have

[TABLE]

Note that $\Psi^{\prime}(x)(x-z_{0})\geq 0$ for all $x\in\mathbb{R}$ because $\Psi$ is convex and since $\Psi^{\prime}(z_{0})=0$ . We insert $Z_{\varepsilon}(v)(0)=z_{0}$ and $\frac{d}{ds}(Z_{\varepsilon}(v))=\dot{v}-\frac{1}{\varepsilon}\Psi^{\prime}(Z_{\varepsilon}(v))$ according to Definition 3.3. The triangle inequality and rearranging yields

[TABLE]

Hence, with $z_{\varepsilon}^{u}=Z_{\varepsilon}(Sy_{\varepsilon}^{u})$ and $v=Sy_{\varepsilon}^{u}$ there follows

[TABLE]

Because the Riesz representation $w$ of $S$ is contained in $\mathrm{dom}([(A_{p}+1)^{1-\alpha}]^{*})$ by (A3) in Assumption 2.10, we can estimate for all $y\in\mathrm{dom}(A_{p})$ :

[TABLE]

For $y=y_{\varepsilon}^{u}(t)$ , this together with (12) and the triangle inequality implies that for a.e. $t\in J_{T}$

[TABLE]

Consequently, by the linear growth condition on $f_{\varepsilon}$ in $(A3)_{\varepsilon}$ of Assumption 3.1 we further estimate (15) by

[TABLE]

Remember that $y_{\varepsilon}^{u}(0)=0$ for any $\varepsilon\in(0,\varepsilon_{*}]$ . Since $y_{\varepsilon}^{u}$ is the mild solution of (12), we can use (6) for arbitrary $\gamma\in(0,1)$ and again the linear growth condition on $f_{\varepsilon}$ to obtain

[TABLE]

Note that $\left(\int_{0}^{t}(t-s)^{-\alpha q^{\prime}}\,ds\right)^{1/q^{\prime}}=\left(\frac{t^{1-\alpha q^{\prime}}}{1-\alpha q^{\prime}}\right)^{1/q^{\prime}}=\frac{t^{1/q^{\prime}-\alpha}}{(1-\alpha q^{\prime})^{1/q^{\prime}}}$ since $q>\frac{1}{1-\alpha}\Leftrightarrow\frac{1}{q^{\prime}}-\alpha>0$ . We sum up the estimates for $|z_{\varepsilon}^{u}(t)|$ and $\|y_{\varepsilon}^{u}(t)\|_{X^{\alpha}}$ and apply Gronwall’s Lemma to arrive at
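For the reader's convenience, the integrability condition behind this step can be spelled out as follows. This is a routine computation, with $q^{\prime}$ denoting the conjugate exponent of $q$ (so $q^{\prime}=1$ for $q=\infty$):

```latex
\begin{aligned}
\int_{0}^{t}(t-s)^{-\alpha q'}\,ds
  &= \left[-\frac{(t-s)^{1-\alpha q'}}{1-\alpha q'}\right]_{s=0}^{s=t}
   = \frac{t^{1-\alpha q'}}{1-\alpha q'}
   \qquad\text{provided } \alpha q' < 1, \\
\alpha q' < 1
  &\iff \frac{1}{q'} > \alpha
   \iff 1-\frac{1}{q} > \alpha
   \iff q > \frac{1}{1-\alpha}.
\end{aligned}
```

Taking the power $1/q^{\prime}$ gives the factor $t^{1/q^{\prime}-\alpha}$, which stays bounded on $\overline{J_{T}}$ precisely because $\frac{1}{q^{\prime}}-\alpha>0$.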

[TABLE]

for all $q\in(\frac{1}{1-\alpha},\infty]$ and a constant $c_{3}>0$ which depends on $T$ , $q^{\prime}$ and $\alpha$ but not on $\varepsilon$ and $u$ . By maximal parabolic regularity of $A_{p}$ , see Remark 2.6, one obtains

[TABLE]

for $s=q$ if $q\in(\frac{1}{1-\alpha},\infty)$ and for all $s\in(1,\infty)$ if $q=\infty$ , again for some $c_{4}>0$ which is independent of $\varepsilon$ and $u$ . This shows (13). We are left to prove (14). Note that $2>\frac{1}{1-\alpha}$ by (A2) in Assumption 2.10. Because $S\in X^{*}$ , (13) yields $\|S\dot{y}_{\varepsilon}^{u}\|_{\mathrm{L}^{2}(J_{T})}\leq c_{5}(1+\|u\|_{\mathrm{L}^{2}(J_{T};X)})$ for $c_{5}=c_{4}\|S\|_{X^{*}}$ . We test $\dot{z}_{\varepsilon}^{u}$ in Definition 3.3 by $\dot{z}_{\varepsilon}^{u}$ , integrate from zero to $t$ and use Young’s inequality to compute for $t\in\overline{J_{T}}$ :

[TABLE]

Since $\Psi(z_{\varepsilon}^{u}(0))=0$ and because $\Psi\geq 0$ it follows

[TABLE]

∎

The estimates which we derived in this subsection are crucial for Subsection 3.2.

3.2 Dynamics of the regularized states

This subsection contains ideas from [MS15, Section 4] and [BK13, Section 3.1]. We prove weak continuity of the solution operator of (12) for fixed $\varepsilon\in(0,\varepsilon_{*}]$ . This yields weakly converging subsequences $y_{\varepsilon_{k}}$ and $z_{\varepsilon_{k}}$ for any weakly converging sequence $u_{\varepsilon}$ , $\varepsilon\rightarrow 0$ . All results then also hold for the regularized state equations (10)-(11).

Using this, we are able to prove convergence of the solutions of the regularized control problems, as defined in Subsection 3.3, to an optimal solution of problem (1)-(3) with $\varepsilon\rightarrow 0$ .

The following lemma is proved as [Mün16, Lemma 5.3].

Lemma 3.7.

Let Assumption 2.10 and Assumption 3.1 hold and consider the notation from Corollary 3.5. Suppose that $u_{n}\rightharpoonup u$ in $\mathrm{L}^{2}(J_{T};X)$ with $n\rightarrow\infty$ for some sequence $\{u_{n}\}\subset\mathrm{L}^{2}(J_{T};X)$ . For $\varepsilon\in(0,\varepsilon_{*}]$ fixed consider the solutions $y_{\varepsilon}^{u_{n}}$ and $y_{\varepsilon}^{u}$ of (12), together with $z_{\varepsilon}^{u_{n}}$ and $z_{\varepsilon}^{u}$ . Then $y_{\varepsilon}^{u_{n}}\rightarrow y_{\varepsilon}^{u}$ with $n\rightarrow\infty$ weakly in $Y_{2,0}$ and strongly in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and $z_{\varepsilon}^{u_{n}}\rightarrow z_{\varepsilon}^{u}$ with $n\rightarrow\infty$ weakly in $\mathrm{H}^{1}(J_{T})$ and strongly in $\mathrm{C}(\overline{J_{T}})$ . If the convergence of $\{u_{n}\}$ is strong then the convergence of $\{y_{\varepsilon}^{u_{n}}\}$ in $Y_{2,0}$ is also strong. The same holds if $\mathrm{L}^{2}(J_{T};X)$ is replaced by $U_{i}$ for $i\in\{1,2\}$ and if $u_{n}$ and $u$ are replaced by $B_{i}u_{n}$ and $B_{i}u$ . In this case, $(y_{\varepsilon}^{B_{i}u_{n}},z_{\varepsilon}^{B_{i}u_{n}})$ and $(y_{\varepsilon}^{B_{i}u},z_{\varepsilon}^{B_{i}u})$ are the solutions of (10)-(11).

Furthermore, we have the following convergence result:

Lemma 3.8.

Let Assumption 2.10 and Assumption 3.1 hold and consider the notation from Corollary 3.5. Suppose that $u_{\varepsilon}\rightharpoonup u$ in $\mathrm{L}^{2}(J_{T};X)$ as $\varepsilon\rightarrow 0$ . Consider the solutions $y_{\varepsilon}^{u_{\varepsilon}}$ and $y_{\varepsilon}^{u}$ of (12), together with $z_{\varepsilon}^{u_{\varepsilon}}$ and $z_{\varepsilon}^{u}$ . Then $y_{\varepsilon}^{u_{\varepsilon}}\rightarrow y^{u}$ with $\varepsilon\rightarrow 0$ weakly in $Y_{2,0}$ and strongly in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and $z_{\varepsilon}^{u_{\varepsilon}}\rightarrow\mathcal{W}[Sy^{u}]$ with $\varepsilon\rightarrow 0$ weakly in $\mathrm{H}^{1}(J_{T})$ and strongly in $\mathrm{C}(\overline{J_{T}})$ . If the convergence of $\{u_{\varepsilon}\}$ is strong then also the convergence of $\{y_{\varepsilon}^{u_{\varepsilon}}\}$ in $Y_{2,0}$ is strong. The same holds if $\mathrm{L}^{2}(J_{T};X)$ is replaced by $U_{i}$ for $i\in\{1,2\}$ and if $u_{\varepsilon}$ and $u$ are replaced by $B_{i}u_{\varepsilon}$ and $B_{i}u$ . In this case, $(y_{\varepsilon}^{B_{i}u_{\varepsilon}},z_{\varepsilon}^{B_{i}u_{\varepsilon}})$ and $(y_{\varepsilon}^{B_{i}u},z_{\varepsilon}^{B_{i}u})$ are the solutions of (10)-(11).

Proof.

The proof combines the proofs of [BK13, Lemma 3.2] and [Mün16, Lemma 5.3]. By Lemma 3.6 we obtain a bound for $y_{\varepsilon}^{u_{\varepsilon}}$ in $Y_{2,0}$ and for $z_{\varepsilon}^{u_{\varepsilon}}$ in $\mathrm{H}^{1}(J_{T})$ which is independent of $\varepsilon\in(0,\varepsilon_{*}]$ . Hence, there exists a subsequence $\{\varepsilon_{k}\}$ of the sequence $\{\varepsilon\}$ and functions $\tilde{y}\in Y_{2,0}$ and $\tilde{z}\in\mathrm{H}^{1}(J_{T})$ to which $y_{\varepsilon_{k}}(u_{\varepsilon_{k}})$ and $z_{\varepsilon_{k}}(u_{\varepsilon_{k}})$ converge weakly in $Y_{2,0}$ and $\mathrm{H}^{1}(J_{T})$ and strongly in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and $\mathrm{C}(\overline{J_{T}})$ with $k\rightarrow\infty$ . We abbreviate $y_{\varepsilon_{k}}:=y_{\varepsilon_{k}}(u_{\varepsilon_{k}})$ and $z_{\varepsilon_{k}}:=z_{\varepsilon_{k}}(u_{\varepsilon_{k}})$ . (14) implies that $\Psi(z_{\varepsilon_{k}}(t))\rightarrow 0$ with $k\rightarrow\infty$ for $t\in\overline{J_{T}}$ . By $(A4)_{\varepsilon}$ in Assumption 3.1 this yields $\tilde{z}(t)\in[a,b]$ for $t\in\overline{J_{T}}$ . For any $x\in\mathbb{R}$ and $\xi\in[a,b]$ there holds $\Psi^{\prime}(x)(x-\xi)\geq 0$ because $\Psi$ is convex and since $\Psi^{\prime}(\xi)=0$ for $\xi\in[a,b]$ . For any $\xi\in[a,b]$ we therefore have

[TABLE]

Taking the limit $k\rightarrow\infty$ yields $\tilde{z}=\mathcal{W}[S\tilde{y}]$ since $\tilde{z}$ solves (4)-(5) with $v=S\tilde{y}$ . Weak continuity of $\frac{d}{dt}$ and $A_{p}$ implies $\frac{d}{dt}y_{\varepsilon_{k}}+A_{p}y_{\varepsilon_{k}}\rightharpoonup\frac{d}{dt}\tilde{y}+A_{p}\tilde{y}$ in $\mathrm{L}^{2}(J_{T};X)$ with $k\rightarrow\infty$ . For $\varepsilon_{k}$ small enough we estimate with $(A3)_{\varepsilon}$ in Assumption 3.1:

[TABLE]

Because the right side converges to zero, we conclude that $F_{\varepsilon_{k}}[y_{\varepsilon_{k}}]$ converges to $F[\tilde{y}]$ in $\mathrm{C}(\overline{J_{T}};X)$ with $k\rightarrow\infty$ . This together with $\tilde{z}=\mathcal{W}[S\tilde{y}]$ yields $\tilde{y}=G(u)$ . Uniqueness of the limit implies convergence of the whole sequence. The statement about strong convergence follows essentially the same way as in [Mün16, Lemma 5.3]. ∎

3.3 The regularized optimal control problem

In this subsection, we introduce regularized optimal control problems. It still requires work to get adjoint systems for those problems. Nevertheless, we can exploit linearity of the derivatives of the solution operators of (10)-(11) to derive adjoint systems by a direct approach. This will be done in Subsection 3.5 below.

We follow the ideas in [BK13, Section 3.2] and [MS15, Section 4] in this subsection. For $i\in\{1,2\}$ consider an optimal control $\overline{u}\in U_{i}$ of problem (1)-(3) together with the state $\overline{y}=G(B_{i}\overline{u})$ and $\overline{z}=\mathcal{W}[S\overline{y}]$ . Existence of $\overline{u}$ follows from Theorem 2.13. For $\varepsilon\in(0,\varepsilon_{*}]$ we introduce the regularized optimal control problem

[TABLE]

subject to (10)-(11).

Theorem 3.9 (Convergence of optimal solutions).

Let Assumption 2.10 and Assumption 3.1 hold. For $i\in\{1,2\}$ suppose that $\overline{u}\in U_{i}$ is an optimal control for problem (1)-(3). Then for all $\varepsilon\in(0,\varepsilon_{*}]$ problem (10),(11),(16) has an optimal control $\overline{u}_{\varepsilon}\in U_{i}$ . This means that $\overline{u}_{\varepsilon}$ , together with $\overline{y}_{\varepsilon}=G_{\varepsilon}(B_{i}\overline{u}_{\varepsilon})$ and $\overline{z}_{\varepsilon}=Z_{\varepsilon}(S\overline{y}_{\varepsilon})$ (see Definition 3.3), are a solution of the minimization problem (16). Furthermore, $\overline{u}_{\varepsilon}\rightarrow\overline{u}$ in $U_{i}$ , $\overline{y}_{\varepsilon}\rightarrow\overline{y}=G(B_{i}\overline{u})$ in $Y_{2,0}$ and in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and $\overline{z}_{\varepsilon}\rightarrow\overline{z}=\mathcal{W}[S\overline{y}]$ weakly in $\mathrm{H}^{1}(J_{T})$ and strongly in $\mathrm{C}(\overline{J_{T}})$ with $\varepsilon\rightarrow 0$ .

Proof.

First of all note that the embedding $Y_{2,0}\hookrightarrow U_{1}$ is continuous, because $\mathrm{dom}(A_{p})\simeq\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\hookrightarrow\mathrm{L}^{2}(\Omega)$ . Note also that $\overline{u}$ exists by Theorem 2.13. Existence of optimal controls $\overline{u}_{\varepsilon}$ for (10),(11),(16) follows essentially the same way as for problem (1)-(3) by using Lemma 3.7, see also Theorem 2.13. For all $\varepsilon\in(0,\varepsilon_{*}]$ , we deduce from optimality of $(\overline{y}_{\varepsilon},\overline{z}_{\varepsilon},\overline{u}_{\varepsilon})$ for problem (10),(11),(16) and of $(\overline{y},\overline{z},\overline{u})$ for problem (1)-(3) that

[TABLE]

Moreover, by (13) in Lemma 3.6, $G_{\varepsilon}(B_{i}\overline{u})\in Y_{2,0}$ is uniformly bounded for $\varepsilon\in(0,\varepsilon_{*}]$ so that $J(G_{\varepsilon}(B_{i}\overline{u}),\overline{u})\leq c$ for some constant $c>0$ . Hence, $c\geq J_{\mathrm{reg}}(\overline{y}_{\varepsilon},\overline{u}_{\varepsilon};\overline{u})=\frac{1}{2}\|\overline{y}_{\varepsilon}-y_{d}\|_{U_{1}}^{2}+\frac{\kappa}{2}\|\overline{u}_{\varepsilon}\|_{U_{i}}^{2}+\frac{1}{2}\|\overline{u}_{\varepsilon}-\overline{u}\|_{U_{i}}^{2}$ and the norms of $\overline{u}_{\varepsilon}$ in $U_{i}$ are bounded from above independently of $\varepsilon\in(0,\varepsilon_{*}]$ . Consequently, we can extract a subsequence $\{u_{\varepsilon_{k}}\}$ which converges weakly in $U_{i}$ to some $\tilde{u}$ with $k\rightarrow\infty$ . By Lemma 3.8, $y_{\varepsilon_{k}}\rightarrow G(B_{i}\tilde{u})$ with $k\rightarrow\infty$ weakly in $Y_{2,0}$ and then also in $U_{1}$ . Also by Lemma 3.8, $G_{\varepsilon}(B_{i}\overline{u})\rightarrow\overline{y}$ with $\varepsilon\rightarrow 0$ strongly in $Y_{2,0}$ and then in $U_{1}$ . $J_{\mathrm{reg}}$ is weakly lower semi-continuous. Hence, with (17) we obtain

[TABLE]

But this implies $\tilde{u}=\overline{u}$ and that the convergence of $\{u_{\varepsilon_{k}}\}$ in $U_{i}$ is strong. Since the limit is uniquely determined by $\overline{u}$ , the whole sequence $\{u_{\varepsilon}\}$ converges to $\overline{u}$ in $U_{i}$ with $\varepsilon\rightarrow 0$ . All results then follow by applying the statement about strong convergence in Lemma 3.8. ∎

3.4 Gâteaux differentiability of the solution operator of the regularized state equation

In this subsection, we show that $G_{\varepsilon}$ is Gâteaux differentiable for all $\varepsilon\in(0,\varepsilon_{*}]$ .

Lemma 3.10 (Gâteaux differentiability of $G_{\varepsilon}$ ).

Let Assumption 2.10 and Assumption 3.1 hold and take the notation from Corollary 3.5. Then for any $\varepsilon\in(0,\varepsilon_{*}]$ and $q\in(\frac{1}{1-\alpha},\infty)$ the solution operator $G_{\varepsilon}:\mathrm{L}^{q}(J_{T};X)\rightarrow Y_{q,0}$ of problem (12) is Gâteaux differentiable. The derivative $G_{\varepsilon}^{\prime}[u;h]$ at $u\in\mathrm{L}^{q}(J_{T};X)$ in direction $h\in\mathrm{L}^{q}(J_{T};X)$ is given by $y^{u,h}_{\varepsilon}$ , where $y^{u,h}_{\varepsilon}$ together with $z=z^{u,h}_{\varepsilon}=Z^{\prime}_{\varepsilon}[Sy_{\varepsilon}^{u};Sy^{u,h}_{\varepsilon}]\in\mathrm{W}^{1,q}(J_{T})$ are the unique solution of

[TABLE]

For $i\in\{1,2\}$ and $u,h\in U_{i}$ the derivative of the solution mapping $u\mapsto G_{\varepsilon}(B_{i}u)$ at $u$ in direction $h$ is given by $y^{B_{i}u,B_{i}h}_{\varepsilon}$ , i.e. by the unique solution of (18) with $h$ replaced by $B_{i}h$ and $z=z^{B_{i}u,B_{i}h}_{\varepsilon}=Z^{\prime}_{\varepsilon}[Sy_{\varepsilon}^{B_{i}u};Sy^{B_{i}u,B_{i}h}_{\varepsilon}]$ .

Proof.

$G_{\varepsilon}$ is Hadamard directionally differentiable because $Z_{\varepsilon}$ satisfies the properties of $\mathcal{W}$ in Theorem 2.12. Gâteaux differentiability then follows from linearity of all the derivatives. To see that $z^{u,h}_{\varepsilon}\in\mathrm{W}^{1,q}(J_{T})$ , insert $Sy^{u,h}_{\varepsilon}$ for $h$ in Remark 3.4 and note that the right side is contained in $\mathrm{W}^{1,q}(J_{T})$ . ∎

3.5 Adjoint system for the regularized problem

In this subsection, we derive adjoint systems for the regularized problems (10),(11),(16) with $\varepsilon\in(0,\varepsilon_{*}]$ , see Theorem 3.13 below. We proceed in a similar way as in [BK13, Sections 3.3 and 3.5] and [MS15, Section 4]. The following estimates are needed.

Lemma 3.11.

Let Assumption 2.10 and Assumption 3.1 hold. With a little abuse of notation we use the same symbol for the Nemitskii operator of $f_{\varepsilon}$ , i.e. we write $f_{\varepsilon}:(y,z)\mapsto f_{\varepsilon}(y(\cdot),z(\cdot)).$ Then $f_{\varepsilon}$ is locally Lipschitz continuous and Gâteaux differentiable from $\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathrm{L}^{q}(J_{T})$ to $\mathrm{L}^{q}(J_{T};X)$ for all $\varepsilon\in(0,\varepsilon_{*}]$ and $q\in(\frac{1}{1-\alpha},\infty)$ .

Moreover, the derivative $f_{\varepsilon}^{\prime}[(y,z);(\cdot,\cdot)]$ at $(y,z)\in\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathrm{L}^{q}(J_{T})$ is Lipschitz continuous with a modulus of the form $K(y)=L(y)(1+T^{1/q})$ , where $L(y)>0$ only depends on $y\in\mathrm{C}(\overline{J_{T}};X^{\alpha})$ . $K(y)$ and $L(y)$ are independent of $\varepsilon$ and remain the same in a sufficiently small neighbourhood of $y$ . For $(v,h)\in\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathrm{L}^{q}(J_{T})$ we can estimate

[TABLE]

For a.e. $t\in J_{T}$ , there also holds the pointwise estimate

[TABLE]

Furthermore, $\frac{\partial}{\partial y}f_{\varepsilon}(y,z)=\frac{\partial}{\partial y}f_{\varepsilon}(y(\cdot),z(\cdot))$ is bounded by $K(y)$ in $\mathrm{L}^{\infty}(J_{T};\mathcal{L}(X^{\alpha},X))$ . Moreover, $\frac{\partial}{\partial z}f_{\varepsilon}(y,z)=\frac{\partial}{\partial z}f_{\varepsilon}(y(\cdot),z(\cdot))$ is bounded by $K(y)$ in $\mathrm{L}^{\infty}(J_{T};X)$ .

Proof.

First of all, $f_{\varepsilon}$ is locally Lipschitz continuous and Gâteaux differentiable from the space $\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathrm{L}^{q}(J_{T})$ to $\mathrm{L}^{q}(J_{T};X)$ for all $\varepsilon\in(0,\varepsilon_{*}]$ and $q\in(\frac{1}{1-\alpha},\infty)$ . This follows from Step 3 in the proof of [Mün16, Theorem 3.1] and Step 1 in the proof of [Mün16, Theorem 4.7]. We give a sketch of the proof:

One first makes use of $(A3)_{\varepsilon}$ in Assumption 3.1 to show that $(y(\cdot),v)\mapsto f_{\varepsilon}(y(\cdot),v)$ is locally Lipschitz continuous from $\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathbb{R}$ to $\mathrm{C}(\overline{J_{T}};X)$ with respect to the $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ -norm. The proof contains a pointwise estimate of the following form: For $y\in\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and some neighbourhood $\overline{B_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}(y,\delta)}$ of $y$ there holds

[TABLE]

for all $y_{1},y_{2}\in\overline{B_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}(y,\delta)}$ , $z_{1},z_{2}\in\mathbb{R}$ and $t\in\overline{J_{T}}$ and for some $L(y)>0$ . This local estimate leads to a pointwise estimate of the form

[TABLE]

for a.e. $s\in J_{T}$ , for any $y_{1},y_{2}\in\overline{B_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}(y,\delta)}$ and $z_{1},z_{2}\in\mathrm{L}^{q}(J_{T})$ . By Minkowski’s inequality, $f_{\varepsilon}$ is locally Lipschitz continuous from $\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathrm{L}^{q}(J_{T})$ to $\mathrm{L}^{q}(J_{T};X)$ with Lipschitz constants of the form $K(y)=L(y)(1+T^{1/q})$ .
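The form of the constant $K(y)$ can be traced as follows. The sketch assumes that the pointwise estimate reads $\|f_{\varepsilon}(y_{1}(s),z_{1}(s))-f_{\varepsilon}(y_{2}(s),z_{2}(s))\|_{X}\leq L(y)\left(\|y_{1}-y_{2}\|_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}+|z_{1}(s)-z_{2}(s)|\right)$ for a.e. $s\in J_{T}$, which is an assumption on the exact shape of the displayed bound, and uses that the constant function $1$ has norm $T^{1/q}$ in $\mathrm{L}^{q}(J_{T})$:

```latex
\begin{aligned}
\|f_{\varepsilon}(y_{1},z_{1})-f_{\varepsilon}(y_{2},z_{2})\|_{\mathrm{L}^{q}(J_{T};X)}
 &\leq L(y)\,\Big\|\,\|y_{1}-y_{2}\|_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}
      +|z_{1}(\cdot)-z_{2}(\cdot)|\,\Big\|_{\mathrm{L}^{q}(J_{T})} \\
 &\leq L(y)\left(T^{1/q}\,\|y_{1}-y_{2}\|_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}
      +\|z_{1}-z_{2}\|_{\mathrm{L}^{q}(J_{T})}\right) \\
 &\leq L(y)\left(1+T^{1/q}\right)
      \left(\|y_{1}-y_{2}\|_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}
      +\|z_{1}-z_{2}\|_{\mathrm{L}^{q}(J_{T})}\right).
\end{aligned}
```

The second inequality is Minkowski's inequality in $\mathrm{L}^{q}(J_{T})$.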

In a second step one shows that $f_{\varepsilon}$ is directionally differentiable. Convergence of the difference quotients

[TABLE]

for a.e. $s\in J_{T}$ and $(y,z),(v,h)\in\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathrm{L}^{q}(J_{T})$ follows from $(A3)_{\varepsilon}$ in Assumption 3.1. Lebesgue’s dominated convergence theorem yields directional differentiability of $f_{\varepsilon}$ from the space $\mathrm{C}(\overline{J_{T}};X^{\alpha})\times\mathrm{L}^{q}(J_{T})$ to $\mathrm{L}^{q}(J_{T};X)$ and the bounds (20) and (21) for $f_{\varepsilon}^{\prime}[(y,z);(\cdot,\cdot)]$ . Linearity of the derivative and local Lipschitz continuity then already imply Gâteaux differentiability of $f_{\varepsilon}$ .

Now for arbitrary $y\in X^{\alpha}$ with $\|y\|_{X^{\alpha}}=1$ , we choose the constant function $v\in\mathrm{C}(\overline{J}_{T};X^{\alpha})$ , $v(t)=y$ for $t\in\overline{J_{T}}$ and set $h=0\in\mathrm{L}^{q}(J_{T})$ in (21). This implies that $\frac{\partial}{\partial y}f_{\varepsilon}(y,z)=\frac{\partial}{\partial y}f_{\varepsilon}(y(\cdot),z(\cdot))$ is bounded by $K(y)$ in $\mathrm{L}^{\infty}(J_{T};\mathcal{L}(X^{\alpha},X))$ . Then we choose $v=0\in\mathrm{C}(\overline{J}_{T};X^{\alpha})$ , $h\in\mathrm{L}^{q}(J_{T})$ , $h(t)=c>0$ for $t\in\overline{J_{T}}$ in (21) and divide by $c$ on both sides to prove that $\frac{\partial}{\partial z}f_{\varepsilon}(y,z)=\frac{\partial}{\partial z}f_{\varepsilon}(y(\cdot),z(\cdot))$ is bounded by $K(y)$ in $\mathrm{L}^{\infty}(J_{T};X)$ . ∎

The following lemma provides the main tool to derive adjoint systems for the regularized problems (10),(11),(16). The hardest part in the proof is to find an explicit expression of the adjoint operator $[G_{\varepsilon}^{\prime}[u;\cdot]]^{*}:Y_{q,0}^{*}\rightarrow\mathrm{L}^{q^{\prime}}(J_{T};X^{*})$ of $G_{\varepsilon}^{\prime}[u;\cdot]$ from Lemma 3.10. This comes from the fact that $G_{\varepsilon}^{\prime}[u;\cdot]$ is defined as the mapping which assigns to each $h\in\mathrm{L}^{q}(J_{T};X)$ the solution $y_{\varepsilon}^{u,h}\in Y_{q,0}$ of (18), which contains the solution $z_{\varepsilon}^{u,h}$ of (19) only implicitly.

Lemma 3.12.

Let Assumption 2.10 and Assumption 3.1 hold and adopt the notation from Lemma 3.10. For $\varepsilon\in(0,\varepsilon_{*}]$ and any $q\in(\frac{1}{1-\alpha},\infty)$ , $h\in\mathrm{L}^{q}(J_{T};X)$ and $\nu\in\mathrm{L}^{q^{\prime}}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ there holds

[TABLE]

where $p^{\nu}_{\varepsilon}\in Y^{*}_{q^{\prime},T}$ and $q^{\nu}_{\varepsilon}\in\mathrm{L}^{q^{\prime}}(J_{T})$ are the unique solution of

[TABLE]

and where $y_{\varepsilon}^{u,h}\in Y_{q,0}$ and $z_{\varepsilon}^{u,h}\in\mathrm{W}^{1,q}(J_{T})$ are the unique solution of (18)-(19). Moreover,

[TABLE]

for some constant $C(y_{\varepsilon}^{u})>0$ . $C(y_{\varepsilon}^{u})$ remains the same in a sufficiently small neighbourhood of $y_{\varepsilon}^{u}$ .

Proof.

Let $q\in(\frac{1}{1-\alpha},\infty)$ be arbitrary. Consider the solution operator of

[TABLE]

which maps any $v\in\mathrm{L}^{q}(J_{T})$ to $z\in\mathrm{W}^{1,q}(J_{T})$ . We denote by $T_{z,\varepsilon}^{u}:\mathrm{L}^{q}(J_{T})\rightarrow\mathrm{L}^{q}(J_{T}),\ v\mapsto T_{z,\varepsilon}^{u}v$ the corresponding operator on $\mathrm{L}^{q}(J_{T})$ .

Consider then the operator $T_{y,\varepsilon}^{u}:=A_{p}-\frac{\partial}{\partial y}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})-\frac{\partial}{\partial z}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})T_{z,\varepsilon}^{u}S\left(-A_{p}+\frac{\partial}{\partial y}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})\right)$ from $Y_{q,0}$ to $\mathrm{L}^{q}(J_{T};X)$ . It follows as for the system (18)-(19) that for each $h\in\mathrm{L}^{q}(J_{T};X)$ there exists a unique couple of solutions $(\tilde{y}^{u,h}_{\varepsilon},\tilde{z}^{u,h}_{\varepsilon})$ in $Y_{q,0}\times\mathrm{L}^{q}(J_{T})$ of the system

[TABLE]

This implies that $\left(\frac{d}{dt}+T_{y,\varepsilon}^{u}\right)^{-1}$ is bijective from $\mathrm{L}^{q}(J_{T};X)$ to $Y_{q,0}$ . Note the difference between (23)-(24) and (18)-(19). We identify $\tilde{z}^{u,h}_{\varepsilon}\in\mathrm{L}^{q}(J_{T})$ with the corresponding function in $\mathrm{W}^{1,q}(J_{T})$ and estimate the norms of $(\tilde{y}^{u,h}_{\varepsilon},\tilde{z}^{u,h}_{\varepsilon})$ . For $t\in\overline{J_{T}}$ we have

[TABLE]

With (21) in Lemma 3.11 and (A3) in Assumption 2.10 it follows

[TABLE]

for a constant $c>0$ which is independent of $\varepsilon$ . Note that $\Psi^{\prime\prime}(z_{\varepsilon}^{u}(s))\geq 0$ because $\Psi$ is convex. Moreover, with (6) and again (21) in Lemma 3.11 we obtain

[TABLE]

Gronwall’s Lemma yields a constant $C_{1}(y_{\varepsilon}^{u})>0$ which depends only on $y_{\varepsilon}^{u}\in\mathrm{C}(\overline{J_{T}};X^{\alpha})$ such that $\|\tilde{y}^{u,h}_{\varepsilon}\|_{\mathrm{C}(\overline{J_{T}};X^{\alpha})}\leq C_{1}(y_{\varepsilon}^{u})\|h\|_{\mathrm{L}^{q}(J_{T};X)}$ and $\|\tilde{z}^{u,h}_{\varepsilon}\|_{\mathrm{C}(\overline{J_{T}})}\leq C_{1}(y_{\varepsilon}^{u})\|h\|_{\mathrm{L}^{q}(J_{T};X)}$ for $q\in(\frac{1}{1-\alpha},\infty)$ . Moreover, there holds $C_{1}(y_{\varepsilon}^{u})=C_{1}(y)$ for $\varepsilon$ small enough if $\{y_{\varepsilon}^{u}\}$ converges to $y$ with $\varepsilon\rightarrow 0$ . This is the case for the states $\overline{y}_{\varepsilon}$ in Theorem 3.9. As several times before we use maximal parabolic regularity of $A_{p}$ to obtain $\|\tilde{y}^{u,h}_{\varepsilon}\|_{Y_{q,0}}\leq C_{2}(y_{\varepsilon}^{u})\|h\|_{\mathrm{L}^{q}(J_{T};X)}$ where $C_{2}(y_{\varepsilon}^{u})>0$ has the same dependence on $y_{\varepsilon}^{u}$ as $C_{1}(y_{\varepsilon}^{u})$ . The inequalities in (22) are shown analogously to the estimates which we derived for $(\tilde{y}^{u,h}_{\varepsilon},\tilde{z}^{u,h}_{\varepsilon})$ . We also conclude that there exists a constant $C(y_{\varepsilon}^{u})>0$ with $\left\|\left(\frac{d}{dt}+T_{y,\varepsilon}^{u}\right)^{-1}\right\|_{\mathcal{L}(\mathrm{L}^{q}(J_{T};X),Y_{q,0})}\leq C(y_{\varepsilon}^{u}).$ This proves maximal parabolic $\mathrm{L}^{q}(J_{T};X)$ -regularity of $T_{y,\varepsilon}^{u}$ for $q\in(\frac{1}{1-\alpha},\infty)$ . For $\varepsilon$ small enough, also the values $C(y_{\varepsilon}^{u})$ can be chosen independently of $\varepsilon$ if $\{y_{\varepsilon}^{u}\}$ converges to some $y$ with $\varepsilon\rightarrow 0$ as it is the case for the sequence $\{\overline{y}_{\varepsilon}\}$ in Theorem 3.9. 
Maximal parabolic $\mathrm{L}^{q}(J_{T};X)$ -regularity of $T_{y,\varepsilon}^{u}$ for $q\in(\frac{1}{1-\alpha},\infty)$ implies maximal parabolic $\mathrm{L}^{q^{\prime}}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ -regularity of $[T_{y,\varepsilon}^{u}]^{*}$ [MS15, Lemma 4.10]. To derive a representation of $[T_{y,\varepsilon}^{u}]^{*}$ , we collect some information about the adjoint mappings of the single components which define $T_{y,\varepsilon}^{u}$ . Lemma 3.11 yields that multiplication with $\frac{\partial}{\partial z}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})$ is well-defined as a mapping from $\mathrm{L}^{q}(J_{T})$ into $\mathrm{L}^{q}(J_{T};X)$ and $\left[\frac{\partial}{\partial z}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})\right]^{*}=\langle\cdot,\frac{\partial}{\partial z}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})\rangle_{X}.$ Similarly, $\frac{\partial}{\partial y}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})$ is a linear continuous mapping from $\mathrm{L}^{q}(J_{T};X^{\alpha})$ into $\mathrm{L}^{q}(J_{T};X)$ . Moreover, $\left[S\frac{\partial}{\partial y}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})\right]^{*}$ is given by multiplication with $S\frac{\partial}{\partial y}f_{\varepsilon}(y_{\varepsilon}^{u},z_{\varepsilon}^{u})$ . The adjoint of $T_{z,\varepsilon}^{u}$ maps any $v\in\mathrm{L}^{q^{\prime}}(J_{T})$ to the function $q\in\mathrm{L}^{q^{\prime}}(J_{T})$ which may be identified with the unique solution of

[TABLE]

$S^{*}$ and $[SA_{p}]^{*}$ are given by multiplication with $S$ and $SA_{p}$ . Furthermore, $SA_{p}\in[X^{\alpha}]^{*}$ by the assumptions on $w$ in (A3) in Assumption 2.10. All bounds are independent of $\varepsilon$ if $\overline{y}_{\varepsilon}$ and $\overline{z}_{\varepsilon}$ in Theorem 3.9 are considered and if $\varepsilon$ is small enough. We obtain

[TABLE]

Maximal parabolic $\mathrm{L}^{q^{\prime}}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ -regularity of $[T_{y,\varepsilon}^{u}]^{*}$ implies that for each $\nu\in\mathrm{L}^{q^{\prime}}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ there exists a unique $p^{\nu}_{\varepsilon}\in Y_{q^{\prime},T}^{*}$ with $\left(-\frac{d}{dt}+[T_{y,\varepsilon}^{u}]^{*}\right)p^{\nu}_{\varepsilon}=\nu.$

For given $\nu\in\mathrm{L}^{q^{\prime}}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ let $q^{\nu}_{\varepsilon}$ be the representative in $\mathrm{L}^{q^{\prime}}(J_{T})$ of the solution of

[TABLE]

Let also $(y_{\varepsilon}^{u,h},z_{\varepsilon}^{u,h})$ be the solutions of (18)-(19) for some given $h\in\mathrm{L}^{q}(J_{T};X)$ . Then we obtain with (18) and partial integration:

[TABLE]

By definition of $q^{\nu}_{\varepsilon}$ the last term on the right side is equal to

[TABLE]

Another partial integration together with (19) and canceling out some terms yields

[TABLE]

By definition of $p^{\nu}_{\varepsilon}$ we finally arrive at

[TABLE]

∎

We can directly write down an adjoint system for a solution $\overline{u}_{\varepsilon}$ of problem (10),(11),(16).

Theorem 3.13 (Adjoint system regularized problem).

Adopt the assumptions of Theorem 3.9 and the notation from Lemma 3.12. For $i\in\{1,2\}$ and $\varepsilon\in(0,\varepsilon_{*}]$ let $\overline{u}_{\varepsilon}\in U_{i}$ be an optimal control for problem (10),(11),(16). Then the adjoint variables for $\overline{y}_{\varepsilon}\in Y_{2,0}$ and $\overline{z}_{\varepsilon}\in\mathrm{H}^{1}(J_{T})$ are given by $p_{\varepsilon}:=p_{\varepsilon}^{\overline{y}_{\varepsilon}-y_{d}}\in Y^{*}_{2,T}$ and $q_{\varepsilon}:=q_{\varepsilon}^{\overline{y}_{\varepsilon}-y_{d}}\in\mathrm{H}^{1}(J_{T}).$ There holds $B_{i}^{*}(p_{\varepsilon}+Sq_{\varepsilon})=-(\kappa+1)\overline{u}_{\varepsilon}+\overline{u}$ in $\mathrm{L}^{2}(J_{T};U_{i})$ and the following system of evolution equations is satisfied by $p_{\varepsilon}$ and $q_{\varepsilon}$ :

[TABLE]

Proof.

Note first that we can choose $q=q^{\prime}=2$ in Lemma 3.12 since $2>\frac{1}{1-\alpha}\Leftrightarrow\alpha<\frac{1}{2}$ which is the case by (A2) in Assumption 2.10. Moreover, the expression $\langle\overline{y}_{\varepsilon}-y_{d},y_{\varepsilon}^{B_{i}\overline{u}_{\varepsilon},B_{i}h}\rangle_{\mathrm{L}^{2}(J_{T};\mathrm{dom}(A_{p}))}=\int_{0}^{T}\int_{\Omega}(\overline{y}_{\varepsilon}-y_{d})\cdot y_{\varepsilon}^{B_{i}\overline{u}_{\varepsilon},B_{i}h}dxdt$ is well-defined: With $I_{p}$ as in Definition 2.4, $y_{\varepsilon}^{B_{i}\overline{u}_{\varepsilon},B_{i}h}\in\mathrm{dom}(A_{p})=\mathrm{ran}(I_{p})$ may be identified with the embedding of $I_{p}^{-1}y_{\varepsilon}^{B_{i}\overline{u}_{\varepsilon},B_{i}h}$ from $\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)$ into $\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)\simeq X^{*}$ . Note that $p^{\prime}\leq 2\leq p$ . Since $\left(\mathcal{A}_{p}+I_{p}\right)^{-1}\in\mathcal{L}\bigl{(}X,\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\bigr{)}$ , see Remark 2.6, we can first estimate

[TABLE]

for a.e. $t\in J_{T}$ and then with the identification of $\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)$ with $X^{*}$

[TABLE]

The Gâteaux-derivative of $\mathcal{J}_{\mathrm{reg}}(u):=J_{\mathrm{reg}}(G_{\varepsilon}(B_{i}u),u;\overline{u})=J(G_{\varepsilon}(B_{i}u),u)+\frac{1}{2}\|u-\overline{u}\|_{U_{i}}^{2}$ with respect to $u$ has to be zero at $\overline{u}_{\varepsilon}$ by optimality. Applying Lemma 3.12 we compute for $h\in U_{i}$ :

[TABLE]

∎
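As a consistency check, the stationarity computation in the proof above can be sketched as follows. The sketch assumes that the identity of Lemma 3.12 reads $\langle\nu,y_{\varepsilon}^{u,h}\rangle=\langle p_{\varepsilon}^{\nu}+Sq_{\varepsilon}^{\nu},h\rangle$ in the respective dual pairings (the displayed formula is not reproduced here), and it writes $\langle\cdot,\cdot\rangle$ for these pairings without tracking the spaces. With $\nu=\overline{y}_{\varepsilon}-y_{d}$ and $h\in U_{i}$ one obtains

$$\begin{aligned}0=\mathcal{J}_{\mathrm{reg}}^{\prime}(\overline{u}_{\varepsilon})h&=\bigl\langle\overline{y}_{\varepsilon}-y_{d},\,y_{\varepsilon}^{B_{i}\overline{u}_{\varepsilon},B_{i}h}\bigr\rangle+\kappa\langle\overline{u}_{\varepsilon},h\rangle_{U_{i}}+\langle\overline{u}_{\varepsilon}-\overline{u},h\rangle_{U_{i}}\\&=\bigl\langle p_{\varepsilon}+Sq_{\varepsilon},\,B_{i}h\bigr\rangle+\bigl\langle(\kappa+1)\overline{u}_{\varepsilon}-\overline{u},\,h\bigr\rangle_{U_{i}}\\&=\bigl\langle B_{i}^{*}(p_{\varepsilon}+Sq_{\varepsilon})+(\kappa+1)\overline{u}_{\varepsilon}-\overline{u},\,h\bigr\rangle_{U_{i}}.\end{aligned}$$

Since $h\in U_{i}$ is arbitrary, this reproduces the relation $B_{i}^{*}(p_{\varepsilon}+Sq_{\varepsilon})=-(\kappa+1)\overline{u}_{\varepsilon}+\overline{u}$ stated in Theorem 3.13.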

3.6 Estimates for the adjoints of the regularized problem

Similar to [BK13, Section 3.5] and [MS15, Lemma 4.14] we estimate the norms of the adjoint states $p_{\varepsilon}$ and $q_{\varepsilon}$ from Theorem 3.13 independently of $\varepsilon$ and of the norms of the optimal controls $\overline{u}_{\varepsilon}$ . In Section 4, we take a sequence $\{\varepsilon\}$ with $\varepsilon\rightarrow 0$ and use those bounds to extract (weakly) converging subsequences of $p_{\varepsilon}$ and $q_{\varepsilon}$ . Those finally yield an adjoint system for problem (1)-(3), see Theorem 4.13 below.

Lemma 3.14 (Uniform bounds).

Adopt the assumptions and the notation of Theorem 3.13. There exists a constant $c>0$ which is independent of $\varepsilon$ and some $\varepsilon_{0}\in(0,\varepsilon_{*}]$ such that the following holds true. If $\varepsilon\in(0,\varepsilon_{0})$ , then

[TABLE]

Proof.

Firstly, Theorem 3.9 yields $\overline{u}_{\varepsilon}\rightarrow\overline{u}$ in $U_{i}$ , $\overline{y}_{\varepsilon}\rightarrow\overline{y}$ in $Y_{2,0}$ and in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and $\overline{z}_{\varepsilon}\rightarrow\overline{z}$ weakly in $\mathrm{H}^{1}(J_{T})$ and strongly in $\mathrm{C}(\overline{J_{T}})$ . As in the proof of Theorem 3.13 we obtain that $\overline{y}_{\varepsilon}-y_{d}$ is bounded in $\mathrm{L}^{2}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ by $\|\overline{y}_{\varepsilon}-B_{1}y_{d}\|_{\mathrm{L}^{2}(J_{T};X)}\left\|\left(\mathcal{A}_{p}+I_{p}\right)^{-1}\right\|_{\mathcal{L}\bigl{(}X;\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)\bigr{)}}=:c_{0}$ . This constant can be estimated independently of $\varepsilon$ because $\{\overline{y}_{\varepsilon}\}$ is uniformly bounded in $\mathrm{C}(\overline{J_{T}};X)$ . For any $\xi\in\mathrm{L}^{2}(J_{T};X)$ , Lemma 3.12 yields

[TABLE]

Because $\overline{y}_{\varepsilon}\rightarrow\overline{y}$ in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ we can find some $\varepsilon_{0}>0$ such that $C(\overline{y}_{\varepsilon})=C(\overline{y})$ for all $\varepsilon\in(0,\varepsilon_{0})$ . From reflexivity of $\mathrm{L}^{2}(J_{T};X)$ we conclude

[TABLE]

for all $\varepsilon\in(0,\varepsilon_{0})$ . We continue with estimates for $q_{\varepsilon}$ . We test (26) with $q_{\varepsilon}/|q_{\varepsilon}|$ , integrate from any $t\in J_{T}$ to $T$ and apply (20) from Lemma 3.11 and (33) to get

[TABLE]

W.l.o.g. for the same $\varepsilon_{0}$ as before there holds $c_{1}K(\overline{y}_{\varepsilon})=c_{1}K(\overline{y})=:c_{2}$ for all $\varepsilon\in(0,\varepsilon_{0})$ . Note that $\Psi^{\prime\prime}(\overline{z}_{\varepsilon})\geq 0$ by convexity of $\Psi$ . This yields

[TABLE]

for all $\varepsilon\in(0,\varepsilon_{0})$ . We conclude $Sq_{\varepsilon}\in\mathrm{L}^{2}(J_{T};X^{*})$ and then by (33) also $p_{\varepsilon}\in\mathrm{L}^{2}(J_{T};X^{*})$ , both with a norm which is independent of $\varepsilon\in(0,\varepsilon_{0})$ . We continue by estimating

[TABLE]

Because of (34) the right side is bounded by $2c_{2}$ so that $\int_{0}^{T}|\dot{q}_{\varepsilon}(s)|ds\leq 2c_{2}=:c_{3}$ for $\varepsilon\in(0,\varepsilon_{0})$ .

To proceed, we use maximal parabolic $\mathrm{L}^{2}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ -regularity of $A_{p}^{*}$ and (25) to obtain

[TABLE]

(20) from Lemma 3.11, (33), (35) and the bound $\left\|\overline{y}_{\varepsilon}-y_{d}\right\|_{\mathrm{L}^{2}(J_{T};[\mathrm{dom}(A_{p})]^{*})}\leq c_{0}$ yield

[TABLE]

for $\varepsilon\in(0,\varepsilon_{0})$ . In a similar way one obtains (30)-(32) from the estimates

[TABLE]

∎

4 Adjoint system and optimality conditions for the optimal control problem

As in [BK13, Section 4] and [MS15, Theorem 4.15] we are interested in taking the limit $\varepsilon\rightarrow 0$ in Theorem 3.13 to obtain an adjoint system for problem (1)-(3). In Subsection 4.1–Subsection 4.3 we study the general case with spatially distributed or boundary controls, i.e. $i\in\{1,2\}$ . Particularly, in Subsection 4.1 we derive an adjoint system $(p,q)$ for problem (1)-(3) for the optimal control $\overline{u}$ from Theorem 3.9, see Lemma 4.1. Moreover, we gather information about the continuity properties of $q$ . Subsection 4.2 contains the optimality conditions for problem (1)-(3) for the optimal control $\overline{u}$ in terms of the pair $p$ and $q$ , see Lemma 4.12 below. In Subsection 4.3 we summarize the results from Subsection 4.1–Subsection 4.2 in Theorem 4.13. Afterwards, we consider the particular case when $f$ is continuously differentiable. In Corollary 4.14 we improve the optimality condition (42) from Theorem 4.13 for this instance. Both optimality conditions (42) and (48) are restricted to test functions $Sy^{B_{i}\overline{u},B_{i}h}$ with $h\in U_{i}$ , $i\in\{1,2\}$ .

In Subsection 4.4 we focus on the setting when the controls act inside of $\Omega$ , i.e. on $i=1$ . In Corollary 4.15 we improve the optimality conditions from Theorem 4.13 as well as those from Corollary 4.14 by extending inequalities (42) and (48) to any test function of the form $(Sv)\varphi$ with $v\in\mathrm{dom}(A_{p}),$ $Sv>0$ and $\varphi\in\mathrm{C}_{0}^{\infty}(J_{T})$ . Dividing the corresponding inequality by $Sv$ yields, at least in (48), an optimality condition with arbitrary test functions $\varphi\in\mathrm{C}_{0}^{\infty}(J_{T})$ . For $i=1$ we also prove uniqueness of $p$ and $q$ if $f$ is continuously differentiable, see Corollary 4.16.

4.1 Adjoint system for distributed or boundary controls

In this subsection, we derive an adjoint system $(p,q)$ for problem (1)-(3) and collect regularity properties of $p$ and $q$ . The evolution equation of $p$ can be derived in a straightforward way as the limit equation of (25) for $\varepsilon\rightarrow 0$ , see Lemma 4.1 below. This is not possible for $q$ . The reason is that in Lemma 3.14 we could bound the norm of $\dot{q}_{\varepsilon}$ independently of $\varepsilon$ only in $\mathrm{L}^{1}(J_{T})$ . As a remedy we split the interval $J_{T}$ into the set $I_{0}$ of times $t$ where the limit $\overline{z}(t)$ is contained in the open interval $(a,b)$ and the rest $I_{\partial}$ where $\overline{z}(t)\in\{a,b\}$ . It turns out that the evolution of $q$ in $I_{0}$ can be described in the form of an evolution equation, see Lemma 4.3 below. As for $I_{\partial}$ , we have to pass to weak- $*$ convergence of $q_{\varepsilon}$ and consider the limit $d\mu$ of $\frac{1}{\varepsilon}\Psi^{\prime\prime}(\overline{z}_{\varepsilon})q_{\varepsilon}$ in $\mathrm{C}(\overline{J_{T}})^{*}$ . Driving $\varepsilon$ to zero then yields an equality for $dq$ in the sense of measures on $I_{\partial}$ , see Lemma 4.5. The abstract measure $d\mu$ , having support in $I_{\partial}$ , remains part of this evolution equation. It also appears in the optimality conditions for problem (1)-(3) in (42). In order to complete the description of $q$ by analyzing the measure $d\mu$ , we will introduce a regularity Assumption 4.7 on $S\overline{y}(t)$ for $t\in I_{\partial}$ . With this assumption, we can characterize $d\mu$ in a subset of $I_{\partial}$ . This allows us to characterize $q$ in open subintervals of $I_{\partial}$ and we can prove continuity of $q$ at so-called $(0,\partial)$ -switching times, see Lemma 4.10. In Remark 4.11 we generalize Lemma 4.10 to the case where Assumption 4.7 is not satisfied.

Lemma 4.1 (Adjoint system in the limit).

Adopt the assumptions and the notation of Theorem 3.13. For $i\in\{1,2\}$ let $\overline{u}\in U_{i},$ $\overline{y}=G(\overline{u})$ and $\overline{z}=\mathcal{W}[S\overline{y}]$ be defined as in Theorem 3.9. Then every sequence $\{\varepsilon\}$ with $\varepsilon\rightarrow 0$ has a subsequence $\{\varepsilon_{k}\}$ such that the following holds true. There exist functions $p\in Y^{*}_{2,T}$ and $\lambda_{1},\lambda_{2}\in\mathrm{L}^{2}(J_{T};[X^{\alpha}]^{*})$ such that as $k\rightarrow\infty$ , $p_{\varepsilon_{k}}\rightharpoonup p\text{ in }Y^{*}_{2,T}$ and

[TABLE]

Moreover, there exists a function $q$ which has bounded variation, i.e. $q\in\mathrm{BV}(J_{T})$ , such that $q_{\varepsilon_{k}}$ converges pointwise to $q$ with $k\rightarrow\infty$ . There holds $\mathrm{Var}(q)\leq\liminf_{\varepsilon_{k}\rightarrow 0}\mathrm{Var}(q_{\varepsilon_{k}})$ . Alternatively, $\dot{q}_{\varepsilon_{k}}\rightarrow dq$ weak-* in $\mathrm{C}(\overline{J_{T}})^{*}$ with $k\rightarrow\infty$ for some signed regular Borel measure $dq\in\mathrm{C}(\overline{J_{T}})^{*}$ . The relation between $q$ and $dq$ is given by $q(t-)-q(s+)=dq((s,t))$ and $q(t+)-q(s-)=dq([s,t])$ for $[s,t]\subset\overline{J_{T}}$ .

The function $p$ solves the evolution equation

[TABLE]

If $f$ is continuously differentiable from $X^{\alpha}\times\mathbb{R}$ into $X$ then $\lambda_{1}=\left[\frac{\partial}{\partial y}f(\overline{y},\overline{z})\right]^{*}p\text{ and }\lambda_{2}=S\frac{\partial}{\partial y}f(\overline{y},\overline{z})q.$ Furthermore,

[TABLE]

Proof.

Theorem 3.9 implies $\overline{u}_{\varepsilon}\rightarrow\overline{u}$ in $U_{i}$ , $\overline{y}_{\varepsilon}\rightarrow\overline{y}$ in $Y_{2,0}$ and in $\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and $\overline{z}_{\varepsilon}\rightarrow\overline{z}$ uniformly and weakly in $\mathrm{H}^{1}(J_{T})$ with $\varepsilon\rightarrow 0$ . By (29), (30) and (31) in Lemma 3.14, reflexivity of all spaces yields a subsequence $\{\varepsilon_{k}\}$ and some functions $p$ , $\lambda_{1}$ and $\lambda_{2}$ such that $p_{\varepsilon_{k}}\rightharpoonup p\text{ in }Y^{*}_{2,T}$ , $\left[\frac{\partial}{\partial y}f_{\varepsilon_{k}}(\overline{y}_{\varepsilon_{k}},\overline{z}_{\varepsilon_{k}})\right]^{*}p_{\varepsilon_{k}}\rightharpoonup\lambda_{1}\text{ in }\mathrm{L}^{2}(J_{T};[X^{\alpha}]^{*})$ and $S\frac{\partial}{\partial y}f_{\varepsilon_{k}}(\overline{y}_{\varepsilon_{k}},\overline{z}_{\varepsilon_{k}})q_{\varepsilon_{k}}\rightharpoonup\lambda_{2}\text{ in }\mathrm{L}^{2}(J_{T};[X^{\alpha}]^{*})$ with $k\rightarrow\infty$ . The condition $p(T)=0$ is included in the definition of the space $Y^{*}_{2,T}$ . From (28) we conclude that $q_{\varepsilon}$ has bounded variation, i.e. $q_{\varepsilon}\in\mathrm{BV}(J_{T})$ , with a norm which is bounded independently of $\varepsilon$ . This implies that (w.l.o.g. the same) subsequence $q_{\varepsilon_{k}}$ converges pointwise to some $q\in\mathrm{BV}(J_{T})$ with $k\rightarrow\infty$ and $\mathrm{Var}(q)\leq\liminf_{\varepsilon_{k}\rightarrow 0}\mathrm{Var}(q_{\varepsilon_{k}})$ . Alternatively, by Alaoglu’s compactness theorem, $\dot{q}_{\varepsilon_{k}}\rightarrow dq$ weak-* in $\mathrm{C}(\overline{J_{T}})^{*}$ with $k\rightarrow\infty$ for some signed regular Borel measure $dq\in\mathrm{C}(\overline{J_{T}})^{*}$ and the relation between $q$ and $dq$ is given by $q(t-)-q(s+)=dq((s,t))$ and $q(t+)-q(s-)=dq([s,t])$ for $[s,t]\subset\overline{J_{T}}$ [BK13, Section 4].
We exploit weak continuity of $-\frac{d}{dt}+A_{p}^{*}$ from $Y^{*}_{2,T}$ to $\mathrm{L}^{2}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ to see that

[TABLE]

in $\mathrm{L}^{2}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ with $k\rightarrow\infty$ . Consequently, $p\in Y^{*}_{2,T}$ solves equation (36). Note that we can set $f_{\varepsilon}\equiv f$ if $f$ is continuously differentiable from $X^{\alpha}\times\mathbb{R}$ into $X$ and in this case $\lambda_{1}=\left[\frac{\partial}{\partial y}f(\overline{y},\overline{z})\right]^{*}p\text{ and }\lambda_{2}=S\frac{\partial}{\partial y}f(\overline{y},\overline{z})q.$ Moreover,

[TABLE]

in $U_{i}$ with $k\rightarrow\infty$ since $B_{i}^{*}$ is weakly continuous. This implies (37). ∎
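To make the relation between $q$ and $dq$ in Lemma 4.1 concrete, consider the following purely illustrative example, which is not taken from the control problem. Fix $\tau\in(0,T)$ and set $q(t)=0$ for $t\in[0,\tau)$ and $q(t)=T-t$ for $t\in[\tau,T]$ . Then $q\in\mathrm{BV}(J_{T})$ is right continuous with $q(T)=0$ , and

$$dq=(T-\tau)\,\delta_{\tau}-\mathbf{1}_{(\tau,T)}\,dt,\qquad\mathrm{Var}(q)=2(T-\tau).$$

For $0\leq s<\tau<t<T$ one checks directly that

$$dq((s,t))=(T-\tau)-(t-\tau)=T-t=q(t-)-q(s+),$$

and the same value is obtained for $dq([s,t])=q(t+)-q(s-)$ , since $q$ is continuous at $s$ and $t$ . The point mass at $\tau$ carries the jump of $q$ , while the absolutely continuous part carries $\dot{q}=-1$ on $(\tau,T)$ .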

To gather information about $q$ from Lemma 4.1 we proceed similarly to [BK13, Section 4].

Definition 4.2 (Partition of $J_{T}$ ).

Let $\overline{z}$ be as in Theorem 3.9. We split $\overline{J_{T}}$ into $I_{0}:=\{t\in\overline{J_{T}}:\overline{z}(t)\in(a,b)\}$ and $I_{\partial}:=\overline{J_{T}}\backslash I_{0}=\{t\in\overline{J_{T}}:\overline{z}(t)\in\{a,b\}\}.$ We further introduce $I_{\partial}^{a}:=\{t\in\overline{J_{T}}:\overline{z}(t)=a\}$ and $I_{\partial}^{b}:=\{t\in\overline{J_{T}}:\overline{z}(t)=b\}.$

Note that $I_{0}$ is open because $\overline{z}$ is continuous.
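The partition of Definition 4.2 can be illustrated numerically. The following is a minimal sketch, assuming that $\mathcal{W}$ is the standard scalar stop operator with characteristic interval $[a,b]$ and using its usual implicit time discretization (project the previous value plus the input increment onto $[a,b]$ ). The function names `stop_operator` and `partition`, the tolerance `tol` and the sample data are hypothetical and not part of the paper.

```python
def stop_operator(v, z0, a, b):
    """Time-discrete scalar stop: z_k = proj_[a,b](z_{k-1} + v_k - v_{k-1}).

    v  : list of sampled input values (playing the role of S y at grid times)
    z0 : initial value, projected onto [a, b]
    """
    z = [min(max(z0, a), b)]
    for k in range(1, len(v)):
        # add the input increment, then project back onto [a, b]
        z.append(min(max(z[-1] + v[k] - v[k - 1], a), b))
    return z


def partition(z, a, b, tol=1e-12):
    """Split grid indices into I_0 (z in (a,b)), I_a (z = a) and I_b (z = b)."""
    i_zero, i_a, i_b = [], [], []
    for k, zk in enumerate(z):
        if zk >= b - tol:
            i_b.append(k)
        elif zk <= a + tol:
            i_a.append(k)
        else:
            i_zero.append(k)
    return i_zero, i_a, i_b


if __name__ == "__main__":
    v = [0.0, 0.5, 1.5, 2.0, 1.0, -1.5]   # hypothetical input samples
    z = stop_operator(v, 0.0, -1.0, 1.0)
    print(z, partition(z, -1.0, 1.0))
```

In this discrete picture, $I_{\partial}=I_{\partial}^{a}\cup I_{\partial}^{b}$ corresponds to the union of the last two index lists. The tolerance only guards against rounding and has no counterpart in the continuous definition.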

Lemma 4.3 ( $q$ in $I_{0}$ ).

Adopt the assumptions and the notation of Lemma 4.1 and consider the subdivision of $\overline{J_{T}}$ from Definition 4.2. For any interval $(c,d)\subset I_{0}$ the limit $q$ in Lemma 4.1 belongs to $\mathrm{H}^{1}(c,d)$ and there exist $\nu_{1},\nu_{2}\in\mathrm{L}^{2}(J_{T})$ such that $-\dot{q}=\nu_{1}+\nu_{2}$ in $\mathrm{L}^{2}(c,d)$ . If $f$ is continuously differentiable from $X^{\alpha}\times\mathbb{R}$ into $X$ then $\nu_{1}=\langle p,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}$ and $\nu_{2}=\langle Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}.$

Proof.

By Theorem 3.9, $\overline{z}_{\varepsilon}\rightarrow\overline{z}$ uniformly in $\overline{J_{T}}$ . Let $(c,d)\subset I_{0}$ and $[s,t]\subset(c,d)$ be arbitrary. $(A4)_{\varepsilon}$ in Assumption 3.1 implies that (w.l.o.g. for $\varepsilon_{0}>0$ from Lemma 3.14) $\Psi^{\prime\prime}(\overline{z}_{\varepsilon})\equiv 0$ on $[s,t]$ for all $\varepsilon\in(0,\varepsilon_{0})$ . For $\varepsilon\in(0,\varepsilon_{0})$ we integrate from $s$ to $t$ in (26) in Theorem 3.13 and obtain

[TABLE]

Consider $\{\varepsilon_{k}\}$ from Lemma 4.1. Lemma 3.14 together with Lemma 3.11 implies uniform boundedness of $\langle p_{\varepsilon},\frac{\partial}{\partial z}f_{\varepsilon}(\overline{y}_{\varepsilon},\overline{z}_{\varepsilon})\rangle_{X}$ and $\langle Sq_{\varepsilon},\frac{\partial}{\partial z}f_{\varepsilon}(\overline{y}_{\varepsilon},\overline{z}_{\varepsilon})\rangle_{X}$ in $\mathrm{L}^{2}(J_{T})$ if $\varepsilon\in(0,\varepsilon_{0})$ . Hence, we obtain a subsequence of $\{\varepsilon_{k}\}$ (still denoted by $\{\varepsilon_{k}\}$ ) and functions $\nu_{1},\nu_{2}\in\mathrm{L}^{2}(J_{T})$ , such that $\langle p_{\varepsilon_{k}},\frac{\partial}{\partial z}f_{\varepsilon_{k}}(\overline{y}_{\varepsilon_{k}},\overline{z}_{\varepsilon_{k}})\rangle_{X}\rightharpoonup\nu_{1}$ and $\langle Sq_{\varepsilon_{k}},\frac{\partial}{\partial z}f_{\varepsilon_{k}}(\overline{y}_{\varepsilon_{k}},\overline{z}_{\varepsilon_{k}})\rangle_{X}\rightharpoonup\nu_{2}$ in $\mathrm{L}^{2}(J_{T})$ with $k\rightarrow\infty$ . If $f$ is continuously differentiable from $X^{\alpha}\times\mathbb{R}$ into $X$ we can set $f_{\varepsilon}\equiv f$ and get $\nu_{1}=\langle p,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}$ and $\nu_{2}=\langle Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}.$ In the general case we obtain

[TABLE]

with $k\rightarrow\infty$ . So the weak derivative of $q$ exists in $\mathrm{L}^{2}(c,d)$ and is given by $-\nu_{1}-\nu_{2}$ . ∎

Our next goal is to understand the behaviour of $q$ in $I_{\partial}$ .

Lemma 4.4 ( $q$ in $I_{\partial}$ : Relation to $\mathcal{P}(S\overline{y})$ ).

Adopt the assumptions and the notation of Lemma 4.1 and consider the subdivision of $\overline{J_{T}}$ from Definition 4.2. With $\mathcal{P}=\mathrm{Id}-\mathcal{W}$ , cf. Lemma 2.9, there holds $\left[\frac{d}{dt}\mathcal{P}[S\overline{y}](t)\right]q(t)=0\text{ for a.e. }t\in I_{\partial}.$

Proof.

Consider the concrete choice for $\Psi$ from Remark 3.2 and $c$ and $\varepsilon_{0}$ from Lemma 3.14. By Theorem 3.9, $\overline{z}_{\varepsilon}\rightarrow\overline{z}$ uniformly so that $\overline{z}_{\varepsilon}(t)\rightarrow b\text{ for }t\in I_{\partial}^{b}\text{ and }\overline{z}_{\varepsilon}(t)\rightarrow a\text{ for }t\in I_{\partial}^{a}$ with $\varepsilon\rightarrow 0$ . Hence, there exists some $\varepsilon_{1}\in(0,\varepsilon_{0}]$ such that

[TABLE]

for all $\varepsilon\in(0,\varepsilon_{1})$ . Remember that $\Psi_{1}(x)=(x-b)^{3}(4+b-x)$ and $\Psi\equiv 0$ on $[a,b]$ . For $\varepsilon\in(0,\varepsilon_{1})$ and $t\in I_{\partial}^{b}$ we obtain

[TABLE]

We apply estimate (27) from Lemma 3.14 together with (38) and (40) to see that

[TABLE]

for all $\varepsilon\in(0,\varepsilon_{1})$ . We apply the convergence results from Theorem 3.9 in (11) and use the representation $\mathcal{W}+\mathcal{P}=\mathrm{Id}$ from Lemma 2.9 to obtain the weak convergence

[TABLE]

in $\mathrm{L}^{2}(J_{T})$ with $\varepsilon\rightarrow 0$ . Furthermore, by Lemma 4.1, $|q_{\varepsilon_{k}}|\rightarrow|q|$ strongly in $\mathrm{L}^{2}(J_{T})$ with $k\rightarrow\infty$ and $\frac{d}{dt}\mathcal{P}[S\overline{y}]=\left|\frac{d}{dt}\mathcal{P}[S\overline{y}]\right|$ a.e. in $I_{\partial}^{b}$ by definition of $I_{\partial}^{b}$ . This together with (39) and (41) yields

[TABLE]

Similar estimates for $I_{\partial}^{a}$ and the fact that $I_{\partial}=I_{\partial}^{a}\cup I_{\partial}^{b}$ prove the statement. ∎
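For reference, the derivatives of the function $\Psi_{1}(x)=(x-b)^{3}(4+b-x)$ used in the proof above can be computed explicitly. Writing $\sigma:=x-b$ (a symbol introduced here only for this computation), so that $\Psi_{1}=4\sigma^{3}-\sigma^{4}$ , one finds

$$\Psi_{1}^{\prime}(x)=12\sigma^{2}-4\sigma^{3}=4(x-b)^{2}(3+b-x),\qquad\Psi_{1}^{\prime\prime}(x)=24\sigma-12\sigma^{2}=12(x-b)(2+b-x).$$

In particular $\Psi_{1}(b)=\Psi_{1}^{\prime}(b)=\Psi_{1}^{\prime\prime}(b)=0$ , and $\Psi_{1}^{\prime\prime}(x)\geq 0$ for $b\leq x\leq b+2$ . How $\Psi$ is defined for larger arguments is specified in Remark 3.2 and is not reproduced here.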

Next, we pass to the limit in (26) to get the following result:

Lemma 4.5 ( $q$ in $I_{\partial}$ : Relation to $d\mu$ ).

Adopt the assumptions and the notation of Lemma 4.1 and let $\nu_{1}$ and $\nu_{2}$ be as in Lemma 4.3. Consider the subdivision of $\overline{J_{T}}$ from Definition 4.2. We denote $d\mu_{\varepsilon}:=\frac{1}{\varepsilon}\Psi^{\prime\prime}(\overline{z}_{\varepsilon})q_{\varepsilon}$ . There exists a measure $d\mu\in\mathrm{C}(\overline{J_{T}})^{*}$ , such that a subsequence $\{d\mu_{\varepsilon_{k}}\}$ (w.l.o.g we may consider $\{\varepsilon_{k}\}$ from Lemma 4.1) converges weak-* to $d\mu$ in $\mathrm{C}(\overline{J_{T}})^{*}$ with $k\rightarrow\infty$ . The support of $d\mu$ is contained in $I_{\partial}$ . For any $\varphi\in\mathrm{C}(\overline{J_{T}})$ there holds

[TABLE]

This implies $d\mu=dq+(\nu_{1}+\nu_{2})dt$ as measures on $I_{\partial}$ .

Proof.

By (27) in Lemma 3.14 the functions $d\mu_{\varepsilon}$ are bounded in $\mathrm{L}^{1}(J_{T})$ independently of $\varepsilon$ for all $\varepsilon\in(0,\varepsilon_{0})$ . Consequently, a subsequence of $\{d\mu_{\varepsilon}\}$ converges weak-* in $\mathrm{C}(\overline{J_{T}})^{*}$ to some measure $d\mu$ . By $(A4)_{\varepsilon}$ in Assumption 3.1 and the uniform convergence of $\overline{z}_{\varepsilon}$ to $\overline{z}$ there holds $\varphi\,\frac{1}{\varepsilon}\Psi^{\prime\prime}(\overline{z}_{\varepsilon})q_{\varepsilon}\equiv 0$ as soon as $\varepsilon$ is small enough, if $\varphi\in\mathrm{C}(\overline{J_{T}})$ has compact support in $I_{0}$ . Therefore, the support of $d\mu$ is contained in $I_{\partial}$ [BK13, p.343]. The other statements are shown similarly to [BK13, Lemma 4.6] and [BK13, Lemma 4.7]. ∎

It also follows:

Lemma 4.6 (Discontinuity properties of $q$ ).

Adopt the assumptions and notation of Lemma 4.1. The absolute value of $q$ can only jump downwards in reverse time. Consequently, for any $t\in\overline{J_{T}}$ there holds $|q(t-)|\leq|q(t+)|$ and $q(T-)=q(T)=0$ . Moreover, $q$ is right continuous in $[0,T)$ and left continuous at $T$ .

Proof.

From Lemma 4.1 we conclude that $q_{\varepsilon_{k}}$ converges to $q$ in $\mathrm{L}^{1}(J_{T})$ and that $dq_{\varepsilon_{k}}=\dot{q}_{\varepsilon_{k}}dt$ converges to $dq$ weak-* in $\mathrm{C}(\overline{J_{T}})^{*}$ . From [visintin2013differential, Chapter XII.7] it follows that $q$ has bounded variation and that the limit is right continuous in $[0,T)$ and left continuous at $T$ . The rest of the statements are shown just as in [BK13, Lemma 4.4]. ∎

The unknown measure $d\mu$ has support in $I_{\partial}$ so that we only know the behaviour of the sum $-dq+d\mu$ in $\mathrm{C}(\overline{J_{T}})^{*}$ but not that of $dq$ alone. In order to analyze $q$ also in $I_{\partial}$ we make the following regularity assumption, cf. [BK13, p.344]:

Assumption 4.7 (Regularity assumption).

Let $\overline{y}$ be as in Theorem 3.9 and consider the subdivision of $\overline{J_{T}}$ from Definition 4.2. We suppose that the function $\mathcal{P}[S\overline{y}]$ satisfies $\frac{d}{dt}\mathcal{P}[S\overline{y}]\neq 0\ \text{ a.e. in }I_{\partial}.$ Equivalently, $S\dot{\overline{y}}>0\ \text{ a.e. in }I_{\partial}^{b}$ and $S\dot{\overline{y}}<0\ \text{ a.e. in }I_{\partial}^{a}.$

Remark 4.8.

This assumption is reasonable if $S\overline{y}$ is the quantity of interest. Consider for example the case when $w$ in (A3) in Assumption 2.10 has the form $w=\frac{1}{m|\Omega|}\varphi$ for some $\varphi\in\prod_{j=1}^{m}\mathrm{C}_{\Gamma_{D_{j}}}^{\infty}(\Omega)$ , where the components $\varphi_{j}$ , $j\in\{1,\ldots,m\}$ , are constantly equal to $1$ within most of $\Omega$ and vanish only in a small neighbourhood of $\Gamma_{D_{j}}$ . If we identify $\mathrm{ran}\left(I_{p}\right)$ with $\mathbb{W}_{\Gamma_{D}}^{1,p}(\Omega)$ , then $S$ acts on $y\in\mathrm{dom}(A_{p})$ as $Sy=\frac{1}{m|\Omega|}\sum_{j=1}^{m}\int_{\Omega}y_{j}\varphi_{j}dx.$ This means that $Sy$ is approximately the mean value of $y$ in $\Omega$ . If this is the value of interest then nothing changes in the system if $S\dot{\overline{y}}=0$ in a subset of $I_{\partial}$ with positive measure.

In order to analyze the behaviour of $q$ and $dq$ in $\overline{I_{0}}\cap I_{\partial}$ we introduce the following categories of times as in [BK13]:

Definition 4.9 (Switching times).

Consider the subdivision of $\overline{J_{T}}$ from Definition 4.2. We call a time $t$ a $(0,\partial)$ -switching time if $t\in\overline{I_{0}}\cap I_{\partial}$ and if there is some $\varepsilon>0$ such that $(t-\varepsilon,t)\subset I_{0}$ and $[t,t+\varepsilon)\subset I_{\partial}$ . We say that $t$ is a $(\partial,0)$ -switching time if $t\in\overline{I_{0}}\cap I_{\partial}$ and if for some $\varepsilon>0$ we have $(t-\varepsilon,t]\subset I_{\partial}$ and $(t,t+\varepsilon)\subset I_{0}$ .
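On a time grid, the two kinds of switching times in Definition 4.9 can be located with a simple discrete proxy. The sketch below is only an illustration under stated assumptions: the function name `switching_indices`, the tolerance `tol` and the rule of requiring a neighbouring boundary sample (as a stand-in for the half-open intervals $[t,t+\varepsilon)$ and $(t-\varepsilon,t]$ ) are hypothetical choices, not part of the paper.

```python
def switching_indices(z, a, b, tol=1e-12):
    """Discrete proxy for (0,d)- and (d,0)-switching times of a sampled z.

    An index k counts as a (0,d)-switching index if sample k-1 lies in I_0
    while samples k and k+1 lie in I_d (the boundary set {a, b}).
    It counts as a (d,0)-switching index if samples k-1 and k lie in I_d
    while sample k+1 lies in I_0.
    An isolated boundary sample is reported as neither.
    """
    on_boundary = [zk <= a + tol or zk >= b - tol for zk in z]
    zero_to_boundary, boundary_to_zero = [], []
    for k in range(1, len(z) - 1):
        if not on_boundary[k]:
            continue
        if not on_boundary[k - 1] and on_boundary[k + 1]:
            zero_to_boundary.append(k)
        if on_boundary[k - 1] and not on_boundary[k + 1]:
            boundary_to_zero.append(k)
    return zero_to_boundary, boundary_to_zero


if __name__ == "__main__":
    z = [0.0, 0.5, 1.0, 1.0, 0.0, -1.0]   # hypothetical samples of z
    print(switching_indices(z, -1.0, 1.0))
```

A single sample at which $z$ merely touches $a$ or $b$ is deliberately excluded, because in the continuous definition such a time satisfies neither $[t,t+\varepsilon)\subset I_{\partial}$ nor $(t-\varepsilon,t]\subset I_{\partial}$ .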

Lemma 4.10 ( $q$ at switching times).

Adopt the assumptions and the notation of Lemma 4.1. If $t$ is a $(0,\partial)$ -switching time in the sense of Definition 4.9 and if Assumption 4.7 holds then there exists some $\varepsilon>0$ such that $q\equiv 0$ on $[t,t+\varepsilon)$ . Moreover, $q$ is continuous at $t$ with $q(t)=0$ . Furthermore, for every open interval $(c,d)\subset I_{\partial}$ there holds that $q\equiv 0$ in $[c,d)$ .

Proof.

Let $(c,d)\subset I_{\partial}$ be arbitrary and suppose that Assumption 4.7 holds. Then Lemma 4.4 implies $q(t)=0$ for a.e. $t\in(c,d)$ . By Lemma 4.6, $q$ is right continuous in $[0,T)$ so that $q\equiv 0$ in $[c,d)$ . Consequently, for every subinterval $[\beta,\gamma]\subset(c,d)$ we have $0=q(\gamma-)-q(\beta+)=dq((\beta,\gamma))$ so that $dq=0$ as a measure on $(c,d)$ . Again by Lemma 4.6 the absolute value of $q$ can only jump downwards in reverse time. By Lemma 4.3, $q\in\mathrm{H}^{1}(e,c)$ for any interval $(e,c)\subset I_{0}$ . Consequently, whenever an interval $(e,c)\subset I_{0}$ is followed by an interval $[c,d]\subset I_{\partial}$ , then $q$ is absolutely continuous on $[e,d)$ .

Now let $t$ be a $(0,\partial)$ -switching time and consider $\varepsilon>0$ such that $(t-\varepsilon,t)\subset I_{0}$ and $[t,t+\varepsilon)\subset I_{\partial}$ . Then setting $e=t-\varepsilon$ , $c=t$ and $d=t+\varepsilon$ proves the rest of the lemma.

∎

Remark 4.11.

In the setting of Lemma 4.10 one can prove even more about the continuity properties of $q$ if $f$ is continuously differentiable, even in the absence of Assumption 4.7:

•

Note first that when $t\in I_{\partial}$ is a $(\partial,0)$ -switching time then $q$ might jump at $t$ regardless of whether Assumption 4.7 holds. If it does not jump, then under Assumption 4.7 necessarily $q(t)=0$ . It is also possible to prove that $q$ may only jump up at $t$ if $\int_{t}^{t^{-}}\langle p+Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}\,ds>0$ , where either $t^{-}=t^{-}(t)\in(t,T]\cap I_{\partial}^{a}$ is (essentially) the first time in $(t,T)$ for which there exists some $\varepsilon>0$ such that $S\dot{\overline{y}}<0$ a.e. in $(t^{-},t^{-}+\varepsilon)$ , or $t^{-}=T$ . It can further be shown that the height of the jump is bounded by $\int_{t}^{t^{-}}\langle p+Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}\,ds$ . Analogously, one can prove that $q$ may only jump down at $t$ if $\int_{t}^{t^{+}}\langle p+Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}\,ds<0$ , where either $t^{+}=t^{+}(t)\in(t,T]\cap I_{\partial}^{b}$ is (essentially) the first time in $(t,T)$ for which there exists some $\varepsilon>0$ such that $S\dot{\overline{y}}>0$ a.e. in $(t^{+},t^{+}+\varepsilon)$ , or $t^{+}=T$ . In this case the height of the jump is bounded by $-\int_{t}^{t^{+}}\langle p+Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}\,ds$ .

•

Other categories of times can be considered. Those include isolated times in $I_{0}$ or subintervals of $I_{\partial}$ in which $S\dot{\overline{y}}=0$ a.e. The latter can only occur if Assumption 4.7 does not hold true. Also for those categories one can show sign conditions for $dq$ and $d\mu$ and upper bounds for jumps.

The proof of these continuity properties is very technical and exceeds the scope of this work. The results will be published in the dissertation of the project in which this paper originated.

4.2 Optimality conditions for distributed or boundary controls

We derive optimality conditions for problem (1)-(3) for the optimal control $\overline{u}$ from Theorem 3.9 in terms of the pair $p$ and $q$ from Lemma 4.1. We cannot expect a pointwise condition as in [MS15, Section 5] since the hysteresis and its derivative, and then also $F^{\prime}[\overline{y},\cdot]$ in Theorem 2.12, act non-locally in time. This implies that if for some direction $\zeta\in\mathrm{C}(\overline{J_{T}};X^{\alpha})$ and some set $I\subset J_{T}$ of positive measure the derivative $F^{\prime}[y;\zeta](\tau)=f^{\prime}[(y(\tau),\mathcal{W}[Sy](\tau));(y(\tau),\mathcal{W}^{\prime}[Sy;S\zeta](\tau))]$ is not zero for $\tau\in I$ , then the values of the derivative in $I$ might have an influence on its value at any $t$ with $\max\{\tau\in I\}<t\leq T$ . That is, we can only expect an optimality condition for problem (1)-(3) which includes integration at least over a part of the time interval $J_{T}$ . Nevertheless, we follow the steps in [MS15, Section 5] as far as possible. The optimality condition for $i\in\{1,2\}$ is derived in Lemma 4.12 and improved in Corollary 4.14 for the case when $f$ is continuously differentiable. We can even further improve this condition for the case when the controls act inside of $\Omega$ , i.e. for $i=1$ . Also in this case we cannot expect to obtain an inequality without integration in time. But since the range of $B_{1}$ is dense in $X$ , we are able to derive a condition without variation in space. The results can be found in Corollary 4.15 in Subsection 4.4.1. For $i=1$ we are also able to prove uniqueness of $p,q$ and $d\mu$ if $f$ is continuously differentiable, see Corollary 4.16 in Subsection 4.4.2.

Because the range of $B_{2}$ is not dense in $X$ , we treat the general case $i\in\{1,2\}$ first.

Lemma 4.12 (Optimality condition).

Adopt the assumptions and the notation of Lemma 4.1 and let $\nu_{1}$ and $\nu_{2}$ be as in Lemma 4.3. For any $h\in U_{i}$ , $y^{B_{i}\overline{u},B_{i}h}=G^{\prime}[B_{i}\overline{u};B_{i}h]$ and

$F^{\prime}[\overline{y};y^{B_{i}\overline{u},B_{i}h}](t)=f^{\prime}[(\overline{y}(t),\mathcal{W}[S\overline{y}](t));(\overline{y}(t),\mathcal{W}^{\prime}[S\overline{y};Sy^{B_{i}\overline{u},B_{i}h}](t))]$ (see Theorem 2.12), there holds the optimality condition

[TABLE]

Proof.

Since $\overline{u}$ is an optimal control, the directional derivative of the reduced cost functional $\mathcal{J}$ has to be greater than or equal to zero in each direction. With $y^{B_{i}\overline{u},B_{i}h}=G^{\prime}[B_{i}\overline{u};B_{i}h]$ this means that for any $h\in U_{i}$ there holds

[TABLE]

The function $y^{B_{i}\overline{u},B_{i}h}$ solves the evolution equation (9) in Theorem 2.12 with $y$ replaced by $\overline{y}$ and $h$ replaced by $B_{i}h$ . We test this equation with $p+Sq$ , integrate over time and apply (37) to compute

[TABLE]

We integrate the first term on the left side of (44) by parts, insert (36) from Lemma 4.1 and use the representation of $dq$ from Lemma 4.5 to observe

[TABLE]

We insert (44) into (43) and use (45) to obtain

[TABLE]

∎

4.3 Summary: Adjoint system and optimality conditions for distributed or boundary controls

We summarize our results for the general control problem with $i\in\{1,2\}$ .

Theorem 4.13 (Adjoint system and optimality condition).

Let Assumption 2.10 and Assumption 3.1 hold. For $i\in\{1,2\}$ suppose that $\overline{u}\in U_{i}$ is an optimal control for problem (1)-(3) together with the optimal state $\overline{y}\in Y_{2,0}$ and $\overline{z}=\mathcal{W}[S\overline{y}]\in\mathrm{H}^{1}(J_{T})$ . Consider the subdivision of $\overline{J_{T}}$ from Definition 4.2. Then there exist adjoint states $p\in Y_{2,T}^{*}$ and $q\in\mathrm{BV}(J_{T})$ of the following kind: There holds $B_{i}^{*}(p+Sq)=-\kappa\overline{u}\text{ in }U_{i}.$ For some functions $\lambda_{1},\lambda_{2}\in\mathrm{L}^{2}(J_{T};[X^{\alpha}]^{*})$ we have

[TABLE]

$q$ is left continuous in $J_{T}$ , right continuous at $T$ and absolutely continuous in $I_{0}$ . There exist $\nu_{1},\nu_{2}\in\mathrm{L}^{2}(J_{T})$ such that $q$ solves $-\dot{q}=\nu_{1}+\nu_{2}$ in every open subinterval of $I_{0}$ . $\frac{d}{dt}\mathcal{P}[S\overline{y}](t)q(t)=0\text{ for a.e. }t\in I_{\partial}$ and there is a measure $d\mu\in\mathrm{C}(\overline{J_{T}})^{*}$ with support in $I_{\partial}$ such that $d\mu=dq+(\nu_{1}+\nu_{2})dt$ as measures on $I_{\partial}$ . For all $h\in U_{i}$ and with $y^{B_{i}\overline{u},B_{i}h}=G^{\prime}[B_{i}\overline{u};B_{i}h]$ (see Theorem 2.12) there holds the optimality condition

[TABLE]

where $F^{\prime}[\overline{y};y^{B_{i}\overline{u},B_{i}h}](t)=f^{\prime}[(\overline{y}(t),\mathcal{W}[S\overline{y}](t));(\overline{y}(t),\mathcal{W}^{\prime}[S\overline{y};Sy^{B_{i}\overline{u},B_{i}h}](t))]$ . The absolute value of $q$ can only jump downwards in reverse time so that $q(T-)=q(T)=0$ and $|q(t-)|\leq|q(t+)|$ for all $t\in\overline{J_{T}}$ . If the regularity Assumption 4.7 is valid then $q$ is continuous at every $(0,\partial)$ -switching time $t$ (see Definition 4.9) with $q(t)=0$ . In this case, for every open interval $(c,d)\subset I_{\partial}$ it follows $q\equiv 0$ on $[c,d)$ .

We can improve the results of Theorem 4.13 if $f$ is continuously differentiable:

Corollary 4.14 (Adjoint system and optimality condition for regular $f$ ).

Let Assumption 2.10 and Assumption 3.1 hold. Moreover, suppose that $f$ is continuously differentiable from $X^{\alpha}\times\mathbb{R}$ into $X$ . For $i\in\{1,2\}$ assume that $\overline{u}\in U_{i}$ is an optimal control for problem (1)-(3) together with the optimal state $\overline{y}\in Y_{2,0}$ and $\overline{z}=\mathcal{W}[S\overline{y}]\in\mathrm{H}^{1}(J_{T})$ . Consider the subdivision of $\overline{J_{T}}$ from Definition 4.2. Then there exist adjoint states $p\in Y_{2,T}^{*}$ and $q\in\mathrm{BV}(J_{T})$ of the following kind: There holds $B_{i}^{*}(p+Sq)=-\kappa\overline{u}\text{ in }U_{i}.$ We have

[TABLE]

$q$ is left continuous in $J_{T}$ , right continuous at $T$ and absolutely continuous in $I_{0}$ . $q$ solves the evolution equation $-\dot{q}=\langle p+Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}$ in every open subinterval of $I_{0}$ . $\frac{d}{dt}\mathcal{P}[S\overline{y}](t)q(t)=0\text{ for a.e. }t\in I_{\partial}$ and there is a measure $d\mu\in\mathrm{C}(\overline{J_{T}})^{*}$ with support in $I_{\partial}$ such that $d\mu=dq+\langle p+Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}dt$ as measures on $I_{\partial}$ . For all $h\in U_{i}$ and with $y^{B_{i}\overline{u},B_{i}h}=G^{\prime}[B_{i}\overline{u};B_{i}h]$ (see Theorem 2.12) and $\mathcal{P}=\mathrm{Id}-\mathcal{W}$ (see Lemma 2.9) there holds the optimality condition

[TABLE]

The absolute value of $q$ can only jump downwards in reverse time so that $q(T-)=q(T)=0$ and $|q(t-)|\leq|q(t+)|$ for all $t\in\overline{J_{T}}$ . If the regularity Assumption 4.7 is valid then $q$ is continuous at every $(0,\partial)$ -switching time $t$ (see Definition 4.9) with $q(t)=0$ . In this case, for every open interval $(c,d)\subset I_{\partial}$ it follows $q\equiv 0$ on $[c,d)$ .

Proof.

If $f$ is continuously differentiable then by Lemma 4.1 and Lemma 4.3 we can replace $\lambda_{1}=\left[\frac{\partial}{\partial y}f(\overline{y},\overline{z})\right]^{*}p$ , $\lambda_{2}=S\frac{\partial}{\partial y}f(\overline{y},\overline{z})q,\ \nu_{1}=\langle p,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}$ , $\nu_{2}=\langle Sq,\frac{\partial}{\partial z}f(\overline{y},\overline{z})\rangle_{X}$ in Theorem 4.13. This yields all statements except for the optimality condition. (42) takes the form

[TABLE]

Because $\mathcal{P}=\mathrm{Id}-\mathcal{W}$ (see Lemma 2.9) we have $Sy^{B_{i}\overline{u},B_{i}h}-\mathcal{W}^{\prime}[S\overline{y};Sy^{B_{i}\overline{u},B_{i}h}]=\mathcal{P}^{\prime}[S\overline{y};Sy^{B_{i}\overline{u},B_{i}h}].$

This yields the optimality condition (48). ∎

4.4 Improved optimality conditions and uniqueness for distributed controls

We want to replace $y^{B_{i}\overline{u},B_{i}h}$ in (46) and (48) by an arbitrary function of an appropriate space. This would certainly improve the optimality conditions in Theorem 4.13 and Corollary 4.14. This is not possible in the general case $i\in\{1,2\}$ without density of the ranges of $B_{i}$ . Therefore, we restrict ourselves to problem (1)-(3) with distributed controls $u\in U_{1}$ in this subsection. Suppose that $p$ in (A1) in Assumption 2.10 is chosen close to two such that $\frac{1}{2}>1-\frac{1}{p}-\frac{1}{d}$ . Then $2<\frac{dp^{\prime}}{d-p^{\prime}}$ and by [Mün16, Remark 2.7] we have the compact embedding $\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)\lhook\mkern-3.0mu\relbar\mkern-12.0mu\hookrightarrow[\mathrm{L}^{2}(\Omega)]^{m}$ which is also one-to-one. That is, in this case $B_{1}$ has dense range. In Corollary 4.15 in Subsection 4.4.1 we improve the optimality conditions from Theorem 4.13 and Corollary 4.14 for this case. For $i=1$ we also prove uniqueness of $p,q$ and $d\mu$ if $f$ is continuously differentiable, see Corollary 4.16 in Subsection 4.4.2.

4.4.1 Improved optimality conditions

We improve the optimality conditions (46) and (48).

Corollary 4.15 (Optimality condition for distributed controls).

Let Assumption 2.10 and Assumption 3.1 hold and let $\frac{1}{2}>1-\frac{1}{p}-\frac{1}{d}$ . Assume that $\overline{u}\in U_{1}$ is a solution of problem (1)-(3) with $i=1$ , together with the state $\overline{y}\in Y_{2,0}$ and $\overline{z}=\mathcal{W}[S\overline{y}]\in\mathrm{H}^{1}(J_{T})$ . Let $v\in\mathrm{dom}(A_{p})$ with $Sv>0$ and $\varphi\in\mathrm{C}^{\infty}_{0}(J_{T})$ be arbitrary. Then in addition to (46) in Theorem 4.13 there holds

[TABLE]

If $f$ is continuously differentiable then in addition to (48) in Corollary 4.14 there holds

[TABLE]

Proof.

Since $B_{1}$ has dense range one proves just as in [MS15, Lemma 5.2] that the set $\{y^{B_{1}\overline{u},B_{1}h}:h\in U_{1}\}$ is dense in $Y_{2,0}$ . Unfortunately, we cannot continue as in [MS15, Theorem 5.3] to derive a pointwise optimality condition. The reason is that for any $\zeta\in\mathrm{C}(\overline{J_{T}};X^{\alpha})$ the function $\mathcal{W}^{\prime}[S\overline{y};S\zeta]$ is non-local in time. Nevertheless, we can still make use of the fact that $\mathcal{W}^{\prime}[S\overline{y};\cdot]$ and $f^{\prime}$ are positively homogeneous. First of all, as in [MS15, Theorem 5.3], for arbitrary given $\eta\in Y_{2,0}$ , we choose a sequence in $\{y^{B_{1}\overline{u},B_{1}h}:h\in U_{1}\}$ which converges to $\eta$ . We pass to the limit in (46) and obtain

[TABLE]

where $F^{\prime}[\overline{y};\eta](t)=f^{\prime}[(\overline{y}(t),\mathcal{W}[S\overline{y}](t));(\overline{y}(t),\mathcal{W}^{\prime}[S\overline{y};S\eta](t))]$ . Let $v\in\mathrm{dom}(A_{p})$ with $Sv>0$ be given. Furthermore, let $\varphi\in\mathrm{C}^{\infty}_{0}(J_{T})$ be arbitrary. Then $v\varphi\in Y_{2,0}$ and $\mathcal{W}^{\prime}[S\overline{y};S(v\varphi)]=Sv\mathcal{W}^{\prime}[S\overline{y};\varphi].$ Setting $\eta=\varphi v$ and rearranging yields

[TABLE]

Dividing both sides by $Sv$ proves the first statement. The second inequality is shown analogously. ∎
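The two properties of $\mathcal{W}^{\prime}$ used in this proof, non-locality in time and positive homogeneity without linearity, can be observed on a time-discrete stop operator. The following sketch is an illustration under assumptions: the clipping scheme, the thresholds $\pm 1$, the difference-quotient helper and the sample data are hypothetical and only mimic the directional derivative from Theorem 2.12.

```python
from fractions import Fraction


def stop_operator(v, a, b, z0):
    """Time-discrete scalar stop: z_k = clip(z_{k-1} + v_k - v_{k-1}, a, b)."""
    z = [min(max(z0, a), b)]
    for k in range(1, len(v)):
        z.append(min(max(z[-1] + v[k] - v[k - 1], a), b))
    return z


def one_sided_derivative(v, h, a, b, z0, t=Fraction(1, 1000)):
    """Difference quotient (W[v + t*h] - W[v]) / t for a small t > 0.
    The discrete stop is piecewise linear in v, so for small enough t the
    quotient coincides with the one-sided directional derivative."""
    base = stop_operator(v, a, b, z0)
    shifted = stop_operator([vk + t * hk for vk, hk in zip(v, h)], a, b, z0)
    return [(s - z) / t for s, z in zip(shifted, base)]


v = [Fraction(0), Fraction(1), Fraction(1)]  # output reaches the threshold b = 1
h = [0, 1, 0]                                # perturb the middle sample only
d_plus = one_sided_derivative(v, h, -1, 1, 0)
d_minus = one_sided_derivative(v, [-x for x in h], -1, 1, 0)
d_double = one_sided_derivative(v, [2 * x for x in h], -1, 1, 0)
print(d_plus, d_minus, d_double)
```

Here the direction `h` is nonzero only at the middle sample, yet `d_plus` is nonzero at the last sample, which reflects the memory of the operator. Doubling the direction doubles the derivative, whereas reversing its sign does not simply reverse the sign of the derivative.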

4.4.2 Uniqueness of the adjoint variables

If $f$ is continuously differentiable we can also show uniqueness of the adjoint couple.

Corollary 4.16 (Unique adjoint system for distributed controls).

Let Assumption 2.10 and Assumption 3.1 hold and let $\frac{1}{2}>1-\frac{1}{p}-\frac{1}{d}$ . Moreover, suppose that $f$ is continuously differentiable from $X^{\alpha}\times\mathbb{R}$ into $X$ . Assume that $\overline{u}\in U_{1}$ is a solution of problem (1)-(3) with $i=1$ , together with the state $\overline{y}\in Y_{2,0}$ and $\overline{z}=\mathcal{W}[S\overline{y}]\in\mathrm{H}^{1}(J_{T})$ . Then in the setting of Corollary 4.14 the adjoint couple $p\in Y_{2,T}^{*}$ and $q\in\mathrm{BV}(J_{T})$ together with the measure $d\mu$ in $\mathrm{C}(\overline{J_{T}})^{*}$ is unique.

Proof.

Because $B_{1}$ has dense range we have $\mathrm{ker}(B_{1}^{*})=\overline{\mathrm{ran}(B_{1})}^{\perp}=\{0\}.$ Therefore by Corollary 4.14 we obtain

[TABLE]

cf. [MS15, Theorem 4.15]. Suppose there are two adjoint couples $(p_{1},q_{1}),(p_{2},q_{2})$ which satisfy the conditions of Corollary 4.14. Let $\zeta\in\mathrm{L}^{2}(J_{T};\mathrm{dom}(A_{p}))$ be arbitrary. Then by (47) and (49) there holds

[TABLE]

This implies $\dot{p}_{2}=\dot{p}_{1}$ in $\mathrm{L}^{2}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ . Together with $p_{1}(T)=p_{2}(T)=0\in[\mathrm{dom}(A_{p})]^{*}$ we obtain $p_{1}=p_{2}$ in $\mathrm{L}^{2}(J_{T};[\mathrm{dom}(A_{p})]^{*})$ . Since the embedding $\mathrm{dom}(A_{p})\hookrightarrow X$ is dense, the embedding of $X^{*}$ into $[\mathrm{dom}(A_{p})]^{*}$ is one-to-one and $p_{1}=p_{2}$ also in $\mathrm{L}^{2}(J_{T};X^{*})$ and then in $Y_{2,T}^{*}$ . Let $v\in\mathrm{dom}(A_{p})$ be given with $Sv>0$ . We already know $p_{1}=p_{2}$ so that $S(q_{1}-q_{2})=0\text{ in }X^{*}\text{ a.e. in }J_{T}$ because of (49). But then

[TABLE]

so that $q_{1}=q_{2}$ in $\mathrm{L}^{1}(J_{T})$ . This way we obtain

[TABLE]

which implies $dq_{1}-dq_{2}=0$ as measures on $\overline{J_{T}}$ according to [Vis13, XII.7]. This yields $q_{1}=q_{2}\in\mathrm{BV}(0,T)$ . From Corollary 4.14 we conclude $d\mu_{1}=d\mu_{2}$ and the proof is complete. ∎

5 Higher regularity of the solutions of the optimal control problem

In this section we improve the regularity of the optimal control $\overline{u}\in U_{i}$ , $i\in\{1,2\}$ , and then also of the optimal state $\overline{y}=G(B_{i}\overline{u})$ and $\overline{z}=\mathcal{W}[S\overline{y}]$ . We denote $\tilde{U}_{1}:=[\mathrm{L}^{2}(\Omega)]^{m}$ and $\tilde{U}_{2}:=\prod_{j=1}^{m}\mathrm{L}^{2}(\Gamma_{N_{j}},\mathcal{H}_{d-1})$ . We want to exploit the equation $B_{i}^{*}(p+Sq)=-\kappa\overline{u}\text{ in }[\tilde{U}_{i}]^{*}\text{ a.e. in }J_{T}$ which follows from Theorem 4.13. In order to make use of the time-regularity of $p+Sq$ we need to strengthen the conditions on $B_{i}$ .

Assumption 5.1.

For $i\in\{1,2\}$ , the operator $B_{i}:\tilde{U}_{i}\rightarrow X$ in N(5) is also continuous as a mapping into $X^{\gamma}$ for some $\gamma\in(0,1]$ . We denote by $I_{(\gamma)}$ the canonical embedding from $X^{\gamma}$ into $X$ . Then the assumption is equivalent to the fact that $B_{i}=I_{(\gamma)}\tilde{B}_{i}$ for a linear and continuous function $\tilde{B}_{i}:\tilde{U}_{i}\rightarrow X^{\gamma}$ .

Theorem 5.2 (Higher regularity).

In the setting of Theorem 4.13 let Assumption 5.1 hold for some $\gamma\in(0,1]$ .

If $\gamma>\frac{1}{2}$ , then $\overline{u}\in\mathrm{L}^{\infty}(J_{T};\tilde{U}_{i})$ , $\overline{y}\in Y_{s,0}$ and $\overline{z}\in\mathrm{W}^{1,s}(J_{T})$ for arbitrary $s\in(1,\infty)$ . If $\frac{1}{2}(1+\frac{d}{p})<1$ , which is the case when $d=2$ and $p>2$ in (A1) in Assumption 2.10, this implies $\overline{y}\in\mathrm{C}(\overline{J_{T}};[\mathrm{L}^{\infty}(\Omega)]^{m})$ . If in addition $\Omega$ is a Lipschitz domain then $\overline{y}$ is Hölder continuous in time and space.

If $\gamma\leq\frac{1}{2}$ , then $\overline{u}\in\mathrm{L}^{\frac{2}{1-2s}}(J_{T};\tilde{U}_{i})$ , $\overline{y}\in Y_{2/(1-2s),0}$ and $\overline{z}\in\mathrm{W}^{1,\frac{2}{1-2s}}(J_{T})$ for arbitrary $s\in(0,\gamma)$ . This implies $\overline{y}\in\mathrm{C}(\overline{J_{T}};X^{\theta})$ for any $\theta\in(0,\frac{1}{2}+\gamma)$ . If $\gamma>\frac{d}{2p}$ , with $d$ and $p$ in (A1) in Assumption 2.10, this implies $\overline{y}\in\mathrm{C}(\overline{J_{T}};[\mathrm{L}^{\infty}(\Omega)]^{m})$ . If in addition $\Omega$ is a Lipschitz domain then $\overline{y}$ is Hölder continuous in time and space.

Proof.

First note that for $0\leq\beta\leq\gamma\leq 1$ we have the compact and dense embeddings $X^{\gamma}\lhook\mkern-3.0mu\relbar\mkern-12.0mu\hookrightarrow X^{\beta}\lhook\mkern-3.0mu\relbar\mkern-12.0mu\hookrightarrow X$ [H81, Theorem 1.4.8]. This implies $X^{*}\hookrightarrow[X^{\gamma}]^{*}\hookrightarrow[X^{\beta}]^{*}$ . By the properties of complex interpolation and with Remark 2.6 there holds

[TABLE]

•

We prove the case when Assumption 5.1 is fulfilled with $\gamma>\frac{1}{2}$ :

Since $1-\gamma<\frac{1}{2}$ we obtain (as in Remark 2.7) an embedding

[TABLE]

Therefore, by (50) the regularity $p\in Y_{2,T}^{*}$ in Theorem 4.13 implies that we can identify the function $p\in\mathrm{L}^{2}(J_{T};X^{*})$ and the representative $\tilde{p}$ of $p$ in $\mathrm{C}(\overline{J_{T}};[X^{\gamma}]^{*})$ . This allows us to identify the function $B_{i}^{*}p\in\mathrm{L}^{2}(J_{T};[\tilde{U}_{i}]^{*})$ and $\tilde{B}_{i}^{*}\tilde{p}\in\mathrm{C}(\overline{J_{T}};[\tilde{U}_{i}]^{*})$ . We also have $Sq\in\mathrm{L}^{\infty}(J_{T};X^{*})$ since by Theorem 4.13 there holds $q\in\mathrm{BV}(J_{T})$ and because $S\in X^{*}$ by (A3) in Assumption 2.10. That is, $B_{i}^{*}Sq\in\mathrm{L}^{\infty}(J_{T};[\tilde{U}_{i}]^{*})$ . Again with Theorem 4.13 and the identification of $B_{i}^{*}p$ and $\tilde{B}_{i}^{*}\tilde{p}$ we arrive at

[TABLE]

The functions on the left side are contained in $\mathrm{L}^{\infty}(J_{T};[\tilde{U}_{i}]^{*})$ . We identify $[\tilde{U}_{i}]^{*}$ with $\tilde{U}_{i}$ , so that $\overline{u}\in\mathrm{L}^{\infty}(J_{T};\tilde{U}_{i})$ . Now we use the higher regularity of $\overline{u}$ to prove a better regularity also for $\overline{y}$ . Since $\overline{u}\in\mathrm{L}^{\infty}(J_{T};\tilde{U}_{i})$ , Theorem 2.12 yields $\overline{y}\in Y_{s,0}$ for arbitrary $s\in(1,\infty)$ . From Remark 2.6 and Remark 2.7 we obtain $\overline{y}\in\mathrm{C}(\overline{J_{T}};X^{\theta})$ for arbitrary $\theta\in[0,1)$ . In [TR12, Theorem 3.3] it is shown that $X^{\theta}$ is a subset of $[\mathrm{L}^{\infty}(\Omega)]^{m}$ if $\theta>\frac{1}{2}(1+\frac{d}{p})$ . By Remark 2.6 we are guaranteed that we can choose $p>2$ . So at least if $d=2$ there is some $\theta\in(0,1)$ with $\theta>\frac{1}{2}(1+\frac{d}{p})$ and therefore $\overline{y}\in\mathrm{C}(\overline{J_{T}};[\mathrm{L}^{\infty}(\Omega)]^{m})$ . If $d=2$ and $p>2$ and if $\Omega$ is regular enough, for example a Lipschitz domain, then by [DER15, Theorem 4.5] the state $\overline{y}$ is even Hölder continuous in time and space.

•

We prove the statement for the case when Assumption 5.1 is fulfilled with $\gamma\leq\frac{1}{2}$ :

From [Ama05, Theorem 3 and (22)] it follows

[TABLE]

for arbitrary $s\in(0,\gamma)$ . So the regularity $p\in Y_{2,T}^{*}$ in Theorem 4.13 together with (50) implies that we can identify $p\in\mathrm{L}^{2}(J_{T};X^{*})$ and the representative $\tilde{p}$ of $p$ in $\mathrm{L}^{\frac{2}{1-2s}}(J_{T};[X^{\gamma}]^{*})$ and then $B_{i}^{*}p\in\mathrm{L}^{2}(J_{T};[\tilde{U}_{i}]^{*})$ and $\tilde{B}_{i}^{*}\tilde{p}\in\mathrm{L}^{\frac{2}{1-2s}}(J_{T};[\tilde{U}_{i}]^{*})$ . We proceed as for the case $\gamma>\frac{1}{2}$ to prove $\overline{u}\in\mathrm{L}^{\frac{2}{1-2s}}(J_{T};\tilde{U}_{i})$ for arbitrary $s\in(0,\gamma)$ . Theorem 2.12 yields $\overline{y}\in Y_{2/(1-2s),0}$ for arbitrary $s\in(0,\gamma)$ and from Remark 2.6 and Remark 2.7 it follows $\overline{y}\in\mathrm{C}(\overline{J_{T}};X^{\theta})$ for arbitrary $\theta\in[0,1-\left(\frac{2}{1-2s}\right)^{-1})=[0,\frac{1}{2}+s)$ . Because $s\in(0,\gamma)$ is arbitrary, this holds for all $\theta\in[0,\frac{1}{2}+\gamma)$ . The remaining statements are shown just as for $\gamma>\frac{1}{2}$ .

∎
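The exponent bookkeeping in the case $\gamma\leq\frac{1}{2}$ can be checked with exact rational arithmetic. The sketch below uses hypothetical helper names and only reproduces the relations stated in the theorem and its proof: the time exponent $\frac{2}{1-2s}$, the resulting bound $\theta<\frac{1}{2}+s$, and the threshold $\frac{d}{2p}$ for $\gamma$.

```python
from fractions import Fraction


def time_exponent(s):
    """Integrability exponent 2 / (1 - 2s), defined for 0 < s < 1/2."""
    return Fraction(2) / (1 - 2 * Fraction(s))


def theta_upper_bound(s):
    """Upper limit 1 - 1/r for theta, with r = time_exponent(s)."""
    return 1 - 1 / time_exponent(s)


def linf_threshold(d, p):
    """Value d/(2p) that gamma has to exceed; d and p are taken as
    integers here purely for the sake of exact arithmetic."""
    return Fraction(d, 2 * p)


s = Fraction(1, 4)
print(time_exponent(s), theta_upper_bound(s), Fraction(1, 2) + s)
print(linf_threshold(2, 4))
```

For $d=2$ and any $p>2$ the threshold $\frac{d}{2p}$ lies below $\frac{1}{2}$, so some $\gamma\leq\frac{1}{2}$ with $\gamma>\frac{d}{2p}$ exists.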

Remark 5.3.

For example, take $d=2$ and $p>2$ in (A1) in Assumption 2.10 and adopt the assumptions and the notation in Theorem 4.13. By [Mün16, Remark 2.7] we have the compact embedding

[TABLE]

because $p^{\prime}>1$ and then $2<\frac{dp^{\prime}}{d-p^{\prime}}=\frac{2p^{\prime}}{2-p^{\prime}}$ . Therefore we can embed functions $u\in\tilde{U}_{1}$ into $\mathbb{W}_{\Gamma_{D}}^{-1,p}(\Omega)$ by the assignment $v\mapsto\int_{\Omega}u\cdot v\,dx$ for $v\in\mathbb{W}_{\Gamma_{D}}^{1,p^{\prime}}(\Omega)$ . We slightly reinforce Assumption 2.10. Suppose that [Gri+02, Assumption 2.2] holds for $\Omega$ and for all $\Gamma_{D_{j}}$ , $j\in\{1,\ldots,m\}$ . This essentially means that Assumption 2.2 holds for all $x\in\partial\Omega$ and that the functional determinant of each bi-Lipschitz transformation $\phi_{x}$ is constant a.e. For example, this is the case if $\Omega$ is a Lipschitz domain [Gri+02, Remark 2.3]. The rest in Assumption 2.10 remains the same. With this assumption one has

[TABLE]

for $\theta\in(0,1)$ [Gri+02, Theorem 3.1]. This way we obtain an embedding $\tilde{U}_{1}\hookrightarrow\mathbb{W}_{\Gamma_{D}}^{-\theta,p}(\Omega)$ . Furthermore, we have

[TABLE]

for $-\theta=-1+2\gamma$ by [Gri+02, Theorem 3.5] and Remark 2.6. For any $\gamma\in(0,\frac{1}{2})$ there holds $\theta\in(0,1)$ for $\theta=1-2\gamma$ so that we obtain an embedding $\tilde{U}_{1}\hookrightarrow X^{\gamma}$ . Therefore, Assumption 5.1 is fulfilled for $B_{1}$ with any $\gamma\in(0,\frac{1}{2})$ . By Theorem 5.2 it follows $\overline{u}\in\mathrm{L}^{\frac{2}{1-2s}}(J_{T};\tilde{U}_{1})$ , $\overline{y}\in Y_{2/(1-2s),0}$ and $\overline{z}\in\mathrm{W}^{1,\frac{2}{1-2s}}(J_{T})$ for arbitrary $s\in(0,\gamma)$ . Since $d=2$ and $p>2$ we can choose $\gamma\in(0,\frac{1}{2})$ such that $\gamma>\frac{d}{2p}$ . Theorem 5.2 yields $\overline{y}\in\mathrm{C}(\overline{J_{T}};[\mathrm{L}^{\infty}(\Omega)]^{m})$ . If $\Omega$ is a Lipschitz domain, then $\overline{y}$ is Hölder continuous in time and space.

6 The value function of a perturbed control problem

In this section we analyze stability properties of the minimal value function of a perturbed problem which is similar to (1)-(3). This analysis is only relevant if the set of controls is restricted. That is, for $i\in\{1,2\}$ we consider a convex closed subset $C\subset U_{i}$ as our set of feasible controls and minimize the cost function over this set. For given $r\in U_{i}$ we analyze the perturbed problem

[TABLE]

We define the corresponding minimal value function

[TABLE]

and the multifunction

[TABLE]

We analyze the continuity properties of $v$ and $V$ . The proof is quite similar to the one of [BS00, Proposition 4.4].

Theorem 6.1 (Value function).

Let Assumption 2.10 hold. For $i\in\{1,2\}$ , let $C\subset U_{i}$ be convex and closed. Consider the optimal control problem (51) for $r\in U_{i}$ together with the corresponding minimal value function $v$ , defined by (52), and the multifunction $V$ from (53). Then $v$ is weakly lower semicontinuous. If $C$ is compact in $U_{i}$ then $v$ is also upper semicontinuous and therefore continuous. In this case, also the multifunction $V$ is upper semicontinuous, i.e. for each $r_{0}\in U_{i}$ and for any neighborhood $U_{V(r_{0})}$ of $V(r_{0})$ there exists a neighborhood $U_{r_{0}}$ of $r_{0}$ such that $V(r)\subset U_{V(r_{0})}$ for all $r\in U_{r_{0}}$ , cf. [BS00, Chapter 4.1].

Proof.

Note first that problem (51) is well-posed. This follows essentially as Theorem 2.13 for the unperturbed problem (1)-(3). We claim that $v$ is weakly lower semicontinuous (and then also lower semicontinuous). Let $r_{0}\in U_{i}$ be given. We have to prove that for any sequence $\{r_{n}\}$ with $r_{n}\rightharpoonup r_{0}$ , $n\rightarrow\infty$ , it holds $v(r_{0})\leq\liminf_{n\rightarrow\infty}v(r_{n}).$ Let $\{r_{n}\}$ be such a sequence. Then $\{r_{n}\}$ is bounded in $U_{i}$ . Let $\varepsilon>0$ be arbitrary. We show that for $n_{0}$ large enough $v(r_{0})-\varepsilon\leq v(r_{n})$ for all $n\geq n_{0}$ . Since $\{r_{n}\}$ is bounded, by definition of $J$ we can find some $R>0$ with $\mathop{\cup}_{n\in\mathbb{N}}V(r_{n})\subset\mathrm{B}_{U_{i}}(0,R).$ Suppose there exists a subsequence $\{r_{n_{k}}\}$ of $\{r_{n}\}$ such that for each $n_{k}$ there is some $u_{n_{k}}\in V(r_{n_{k}})$ with $v(r_{0})-\varepsilon>J(G(B_{i}(u_{n_{k}}+r_{n_{k}})),u_{n_{k}}+r_{n_{k}}).$ Note that $\{u_{n_{k}}\}$ is a bounded subset of $C$ and that $U_{i}$ is reflexive. Being convex and closed, $C$ is weakly closed, so bounded sequences in $C$ have weakly convergent subsequences with limit in $C$ . Hence, there is another subsequence (w.l.o.g. we consider the whole sequence $\{u_{n_{k}}\}$ ) and some $\overline{u}\in C$ such that $u_{n_{k}}\rightharpoonup\overline{u}$ as $k\rightarrow\infty$ . $J(G(B_{i}(\cdot+r_{0})),\cdot)$ is weakly lower semicontinuous. This follows by weak lower semicontinuity of the norm in $U_{i}$ and of the solution mapping $G$ [Mün16, Lemma 5.3]. This implies

[TABLE]

which is a contradiction. Therefore, $v(r_{0})\leq\liminf_{n\rightarrow\infty}v(r_{n}).$ Now suppose that $C$ is compact. We have to show that for any $\varepsilon>0$ there is a neighbourhood $U_{r_{0}}$ of $r_{0}$ such that $v(r)\leq v(r_{0})+\varepsilon$ for all $r\in U_{r_{0}}$ . We prove that we can choose neighbourhoods $U_{V(r_{0})}$ of $V(r_{0})$ and $U_{r_{0}}$ of $r_{0}$ such that $J(G(B_{i}(u+r)),u+r)\leq v(r_{0})+\varepsilon$ for all $(u,r)\in U_{V(r_{0})}\times U_{r_{0}}$ . Suppose that such neighbourhoods do not exist. Then there is a sequence $\{r_{n}\}$ with $r_{n}\rightarrow r_{0}$ , $n\rightarrow\infty$ , and a sequence $\{u_{n}\}\subset V(r_{0})\subset C$ such that $J(G(B_{i}(u_{n}+r_{n})),u_{n}+r_{n})>v(r_{0})+\varepsilon$ for all $n>0$ . Because $J$ is continuous the set $V(r_{0})$ is closed and therefore compact as a closed subset of a compact set. Hence, there exists a subsequence $\{u_{n_{k}}\}$ and some $\overline{u}\in V(r_{0})$ with $u_{n_{k}}\rightarrow\overline{u}$ as $k\rightarrow\infty$ . This yields

[TABLE]

which is a contradiction. So the neighbourhoods $U_{V(r_{0})}$ and $U_{r_{0}}$ do exist and for any $r\in U_{r_{0}}$ we obtain $v(r)\leq\inf_{u\in U_{V(r_{0})}}J(G(B_{i}(u+r)),u+r)\leq v(r_{0})+\varepsilon$ which implies that $v$ is upper semicontinuous. The last statement follows just as in [BS00, Proposition 4.4]. ∎

Acknowledgement

The author is supported by the DFG through the International Research Training Group IGDK 1754 „Optimization and Numerical Analysis for Partial Differential Equations with Nonsmooth Structures”. The author would like to thank Prof. Brokate from the Technical University of Munich and Prof. Fellner from the Karl-Franzens University of Graz for thoroughly proofreading the manuscript.

Bibliography

[Ama05] H. Amann “Nonautonomous parabolic equations involving measures” In Journal of Mathematical Sciences 130.4 Springer, 2005, pp. 4780–4802
[Aus+14] P. Auscher, N. Badr, R. Haller-Dintelmann and J. Rehberg “The square root problem for second-order, divergence form operators with mixed boundary conditions on $\mathrm{L}^{p}$” In Journal of Evolution Equations 15.1 Springer, 2014, pp. 165–208
[BJT10] W. Barthel, C. John and F. Tröltzsch “Optimal boundary control of a system of reaction diffusion equations” In ZAMM-Journal of Applied Mathematics and Mechanics/Zeitschrift für Angewandte Mathematik und Mechanik 90.12 Wiley Online Library, 2010, pp. 966–982
[BC85] J.F. Bonnans and E. Casas “On the choice of the function spaces for some state-constrained control problems” In Numerical Functional Analysis and Optimization 7.4, 1985, pp. 333–348 DOI: 10.1080/01630568508816197
[BS00] J.F. Bonnans and A. Shapiro “Perturbation Analysis of Optimization Problems” New York: Springer, 2000
[Bro87] M. Brokate “Optimale Steuerung von gewöhnlichen Differentialgleichungen mit Nichtlinearitäten vom Hysteresis-Typ”, Methoden und Verfahren der mathematischen Physik P. Lang, 1987
[Bro88] M. Brokate “Optimal control of ODE systems with hysteresis nonlinearities” In Trends in Mathematical Optimization Springer, 1988, pp. 25–41
[Bro91] M. Brokate “Optimal control of systems described by ordinary differential equations with nonlinear characteristics of the hysteresis type.” In Autom. Remote Control 52, 1991, pp. 1639–1681

TL;DR

Contribution

Findings

Abstract

Peer Reviews

Videos

Optimal control of reaction-diffusion systems with hysteresis

Abstract

1 Introduction

2 Preliminaries and assumptions

2.1 Sobolev spaces including homogeneous Dirichlet boundary conditions

Definition 2.1.

Assumption 2.2 (Domain).

Definition 2.3 (Sobolev spaces).

2.2 Operators and their properties

Definition 2.4 (Diffusion operator).

Definition 2.5 (Maximal parabolic regularity).

Remark 2.6 (Properties of $A_{p}$).

Remark 2.7 (Embeddings).

Remark 2.8.

Lemma 2.9 (Stop operator).

Proof.

2.3 Assumptions and notation

Assumption 2.10 (Main assumption).

2.4 Solution operator and optimal control

Definition 2.11.

Theorem 2.12 (Solution operator for the state equation).

Proof.

Theorem 2.13 (Existence of optimal control).

Proof.

3 Regularized control problem

Assumption 3.1 (Regularization).

Remark 3.2.

3.1 Regularization of (8) and uniform-in-$\varepsilon$ estimates

Definition 3.3 (Regularized stop).

Remark 3.4.

Corollary 3.5 (Existence of regularized problem).

Proof.

Lemma 3.6 (Uniform bounds).

Proof.

3.2 Dynamics of the regularized states

Lemma 3.7.

Lemma 3.8.

Proof.

3.3 The regularized optimal control problem

Theorem 3.9 (Convergence of optimal solutions).

Proof.

3.4 Gâteaux differentiability of the solution operator of the regularized state equation

Lemma 3.10 (Gâteaux differentiability of $G_{\varepsilon}$).

Proof.

3.5 Adjoint system for the regularized problem

Lemma 3.11.

Proof.

Lemma 3.12.

Proof.

Theorem 3.13 (Adjoint system regularized problem).

Proof.

3.6 Estimates for the adjoints of the regularized problem

Lemma 3.14 (Uniform bounds).

Proof.

4 Adjoint system and optimality conditions for the optimal control problem

4.1 Adjoint system for distributed or boundary controls

Lemma 4.1 (Adjoint system in the limit).

Proof.

Definition 4.2 (Partition of $J_{T}$).

Lemma 4.3 ($q$ in $I_{0}$).

Proof.

Lemma 4.4 ($q$ in $I_{\partial}$: Relation to $\mathcal{P}(S\overline{y})$).

Proof.

Lemma 4.5 ($q$ in $I_{\partial}$: Relation to $d\mu$).

Proof.

Lemma 4.6 (Discontinuity properties of $q$).

Proof.

Assumption 4.7 (Regularity assumption).

Remark 4.8.

Definition 4.9 (Switching times).

Lemma 4.10 ($q$ at switching times).

Proof.

Remark 4.11.

4.2 Optimality conditions for distributed or boundary controls

Lemma 4.12 (Optimality condition).

Theorem 4.13 (Adjoint system and optimality condition).

Corollary 4.14 (Adjoint system and optimality condition for regular $f$).

Corollary 4.15 (Optimality condition for distributed controls).

Corollary 4.16 (Unique adjoint system for distributed controls).

Assumption 5.1.

Theorem 5.2 (Higher regularity).

Remark 5.3.

Theorem 6.1 (Value function).