State-constrained control-affine parabolic problems I: first and second order necessary optimality conditions

M. Soledad Aronna; J. Frédéric Bonnans; Axel Kröner

arXiv:1906.00237·math.OC·September 16, 2020

TL;DR

This paper establishes first and second order necessary optimality conditions for a control problem governed by a semilinear heat equation with integral state constraints, bilinear control terms, and a tracking cost functional.

Contribution

It introduces novel second order necessary conditions using alternative costates and quasi-radial critical directions for complex PDE control problems.

Findings

1. Derived second order necessary conditions for the control problem.

2. Provided an example demonstrating the applicability of the theoretical results.

3. Extended optimality conditions to problems with integral state constraints.

Abstract

In this paper we consider an optimal control problem governed by a semilinear heat equation with bilinear control-state terms and subject to control and state constraints. The state constraints are of integral type, the integral being with respect to the space variable. The control is multidimensional. The cost functional is of a tracking type and contains a linear term in the control variables. We derive second order necessary conditions relying on the concept of alternative costates and quasi-radial critical directions. The appendix provides an example illustrating the applicability of our results.

Equations

$$Y:=\{y\in H^{2,1}(Q);\ y=0\ \text{a.e. on }\Sigma\}.$$

$$W(0,T):=\{y\in L^{2}(0,T;H^{1}_{0}(\Omega));\ \dot{y}\in L^{2}(0,T;H^{-1}(\Omega))\}.$$

$$\begin{cases}\dot{y}(x,t)-\Delta y(x,t)+\gamma y^{3}(x,t)=f(x,t)+y(x,t)\sum_{i=0}^{m}u_{i}(t)b_{i}(x)&\text{in }Q,\\ y=0&\text{on }\Sigma,\\ y(\cdot,0)=y_{0}&\text{in }\Omega,\end{cases}$$

$$y_{0}\in H^{1}_{0}(\Omega),\quad f\in L^{2}(Q),\quad b\in L^{\infty}(\Omega)^{m+1},$$

$$\mathcal{U}_{\rm ad}$$

$$g_{j}(y(\cdot,t)):=\int_{\Omega}c_{j}(x)y(x,t)\,{\rm d}x+d_{j}\leq 0,\quad\text{for }t\in[0,T],\ j=1,\dots,q,$$

$$\begin{aligned}J(u,y):={}&\frac{1}{2}\int_{Q}(y(x,t)-y_{d}(x))^{2}\,{\rm d}x\,{\rm d}t\\ &+\frac{1}{2}\int_{\Omega}(y(x,T)-y_{dT}(x))^{2}\,{\rm d}x+\sum_{i=1}^{m}\alpha_{i}\int_{0}^{T}u_{i}(t)\,{\rm d}t,\end{aligned}$$

$$y_{d}\in L^{2}(Q),\quad y_{dT}\in H^{1}_{0}(\Omega),$$

$$\mathop{\rm Min}_{u\in\mathcal{U}_{\rm ad}}J(u,y[u]);\quad\text{subject to the state constraints (2.4).}$$

$$\|u_{i}b_{i}y\|_{2}\leq\|u_{i}\|_{2}\|b_{i}\|_{\infty}\|y\|_{L^{\infty}(0,T;L^{2}(\Omega))}.$$

$$\|y\|_{L^{\infty}(0,T;L^{2}(\Omega))}+\|\nabla y\|_{2}\leq C_{1}(\|y_{0}\|_{2},\|f\|_{2},\|u\|_{2}\|b\|_{\infty}),$$

$$\|y\|_{Y}\leq C_{2}(\|y_{0}\|_{H^{1}_{0}(\Omega)},\|f\|_{2},\|u\|_{2}\|b\|_{\infty}).$$

$$H^{1}_{0}(\Omega)\subset L^{6}(\Omega),\quad\text{when }n\leq 3.$$

$$\begin{aligned}&\frac{1}{2}\frac{\rm d}{{\rm d}t}\int_{\Omega}y(x,t)^{2}\,{\rm d}x+\int_{\Omega}|\nabla y(x,t)|^{2}\,{\rm d}x+\gamma\int_{\Omega}y(x,t)^{4}\,{\rm d}x\\ &\qquad\leq\frac{1}{2}\int_{\Omega}f(x,t)^{2}\,{\rm d}x+\Big(\frac{1}{2}+|u(t)|_{1}\|b\|_{\infty}\Big)\int_{\Omega}y(x,t)^{2}\,{\rm d}x.\end{aligned}$$

$$\dot{\eta}(t)\leq\int_{\Omega}f(x,t)^{2}\,{\rm d}x+(1+2|u(t)|_{1}\|b\|_{\infty})\eta(t).$$

$$\|\eta\|_{\infty}\leq(\|y_{0}\|_{2}^{2}+\|f\|_{2}^{2})\,e^{T+2\|u\|_{1}\|b\|_{\infty}}$$

$$\begin{aligned}&\int_{\Omega}\dot{y}(x,t)^{2}\,{\rm d}x+\frac{1}{2}\frac{\rm d}{{\rm d}t}\int_{\Omega}|\nabla y(x,t)|^{2}\,{\rm d}x+\frac{\gamma}{4}\frac{\rm d}{{\rm d}t}\int_{\Omega}y(x,t)^{4}\,{\rm d}x\\ &\qquad\leq\frac{1}{\varepsilon}\int_{\Omega}f(x,t)^{2}\,{\rm d}x+\frac{1}{\varepsilon}|u(t)|^{2}\|b\|^{2}_{\infty}\int_{\Omega}y(x,t)^{2}\,{\rm d}x+\frac{\varepsilon}{2}\int_{\Omega}\dot{y}(x,t)^{2}\,{\rm d}x.\end{aligned}$$

$$\begin{aligned}&\int_{\Omega}\dot{y}(x,t)^{2}\,{\rm d}x+\frac{\rm d}{{\rm d}t}\int_{\Omega}|\nabla y(x,t)|^{2}\,{\rm d}x+\frac{\gamma}{2}\frac{\rm d}{{\rm d}t}\int_{\Omega}y(x,t)^{4}\,{\rm d}x\\ &\qquad\leq 2\int_{\Omega}f(x,t)^{2}\,{\rm d}x+2|u(t)|^{2}\|b\|^{2}_{\infty}\int_{\Omega}y(x,t)^{2}\,{\rm d}x.\end{aligned}$$

$$\|y\|_{H^{1}(0,T;L^{2}(\Omega))}+\|\nabla y\|_{L^{\infty}(0,T;L^{2}(\Omega))}\leq C_{2}(\|y_{0}\|_{H^{1}_{0}(\Omega)},\|f\|_{2},\|u\|_{2}\|b\|_{\infty}).$$

$$F(u,y,y_{0},f):=\Big(\dot{y}-\Delta y+\gamma y^{3}-y\sum_{i=1}^{m}u_{i}b_{i},\ y(0)-y_{0}\Big),$$

$$\dot{z}-\Delta z+z\sum_{i=1}^{m}u_{i}b_{i}+3\gamma\hat{y}^{2}z=\tilde{f};\quad z(0)=\tilde{y}_{0}$$

$$\frac{1}{2}\dot{\nu}-|u(t)|\|b\|_{\infty}\nu(t)\leq\frac{1}{2}\dot{\nu}+\int_{\Omega}z^{2}_{+}\sum_{i=1}^{m}u_{i}b_{i}\leq\int_{\Omega}\tilde{f}z_{+}\leq 0$$

$$(Az)(x,t):=-\Delta z(x,t)+3\gamma\bar{y}(x,t)^{2}z(x,t)-\sum_{i=0}^{m}\bar{u}_{i}(t)b_{i}(x)z(x,t).$$

$$\begin{cases}\dot{z}+Az=\bar{f}&\text{in }Q,\\ z=0&\text{on }\Sigma,\\ z(x,0)=0&\text{in }\Omega,\end{cases}$$

$$\|z\|_{L^{\infty}(0,T;L^{2}(\Omega))}\leq e^{\frac{1}{2}T+\sum_{i=0}^{m}\|\bar{u}_{i}\|_{1}\|b_{i}\|_{\infty}}\|\bar{f}\|_{L^{2}(0,T;L^{2}(\Omega))}.$$

$$\begin{aligned}\frac{1}{2}\frac{\rm d}{{\rm d}t}\|z(\cdot,t)\|_{L^{2}(\Omega)}^{2}&+\|\nabla z(\cdot,t)\|_{L^{2}(\Omega)}^{2}+3\gamma\|\bar{y}(\cdot,t)z(\cdot,t)\|_{L^{2}(\Omega)}^{2}\\ &=\int_{\Omega}z(x,t)\Big(\bar{f}(x,t)+\sum_{i=0}^{m}\bar{u}_{i}(t)\,b_{i}(x)z(x,t)\Big)\,{\rm d}x.\end{aligned}$$

$$\|\bar{f}(\cdot,t)\|^{2}_{L^{2}(\Omega)}+\Big(\frac{1}{2}+\sum_{i=0}^{m}|\bar{u}_{i}|\|b_{i}\|_{\infty}\Big)\|z(\cdot,t)\|^{2}_{L^{2}(\Omega)}.$$

$$\text{For any }p\in[1,10)\text{, the following injection is compact: }\ Y\hookrightarrow L^{p}(0,T;L^{10}(\Omega)),\quad\text{when }n\leq 3.$$

$$\begin{aligned}\frac{1}{\nu^{\sigma}}\|y_{\ell}^{\nu}-\hat{y}^{\nu}\|^{\sigma}_{\sigma}=\int_{Q}\tilde{y}_{\ell}^{\sigma(\nu-1)}(y_{\ell}-\hat{y})^{\sigma}\,{\rm d}x\,{\rm d}t&\leq\|\tilde{y}_{\ell}^{\sigma(\nu-1)}\|_{p}\|(y_{\ell}-\hat{y})^{\sigma}\|_{q}\\ &=\|\tilde{y}_{\ell}\|^{\sigma(\nu-1)}_{6}\|y_{\ell}-\hat{y}\|_{6}^{\sigma}.\end{aligned}$$

$$\begin{cases}\dot{z}+Az=\sum_{i=1}^{m}v_{i}b_{i}\bar{y}&\text{in }Q;\\ z=0&\text{on }\Sigma,\\ z(\cdot,0)=0&\text{on }\Omega,\end{cases}$$

$$\|z\|_{L^{\infty}(0,T;L^{2}(\Omega))}\leq M_{1}\sum_{i=1}^{m}\|b_{i}\|_{\infty}\|v_{i}\|_{1},$$

Full text

State-constrained control-affine parabolic problems I: First and Second order necessary optimality conditions

M. Soledad Aronna

EMAp/FGV, Rio de Janeiro 22250-900, Brazil

Frédéric Bonnans

INRIA-Saclay and Centre de Mathématiques Appliquées, Ecole Polytechnique, 91128 Palaiseau, France

Axel Kröner

Weierstrass Institute for Applied Analysis and Stochastics, 10117 Berlin, Germany

Key words and phrases:

optimal control of partial differential equations, semilinear parabolic equations, state constraints, second order analysis, control-affine problems

The first author was supported by FAPERJ, CNPq and CAPES (Brazil) and by the Alexander von Humboldt Foundation (Germany). The second author thanks the ‘Laboratoire de Finance pour les Marchés de l’Energie’ for its support. The second and third authors were supported by a public grant as part of the Investissement d’avenir project, reference ANR-11-LABX-0056-LMH, LabEx LMH, in a joint call with Gaspard Monge Program for optimization, operations research and their interactions with data sciences.

This is the first part of a work on optimality conditions for a control problem of a semilinear heat equation. More precisely, the full version, available at https://arxiv.org/abs/1906.00237v1, has been divided into two parts, resulting in the current manuscript (which corresponds to Part I) and https://arxiv.org/abs/1909.05056 (which is Part II).

1. Introduction

This is the first part of two papers on necessary and sufficient optimality conditions for an optimal control problem governed by a semilinear heat equation containing bilinear terms coupling the control and the state, and subject to constraints on the control and state. The control may have several components and enters in an affine way in the cost. In this first part we derive necessary optimality conditions of first and second order; in the second part [2], sufficient optimality conditions are shown.

In the context of second order conditions for problems governed by control-affine ordinary differential equations we can mention several works, starting with the early papers [18] by Goh and [19] by Kelley, later [15] by Dmitruk, and recently [1]. In this context, the case dealing with both control and state constraints was treated in e.g. Maurer [25], McDanell and Powers [28], Maurer, Kim and Vossen [27], Schättler [30], and Aronna et al. [3]. For a more detailed description of the contributions in this framework, we refer to [3].

In the infinite dimensional case, the issue of second order conditions for problems governed by elliptic equations and assuming state constraints was treated by several authors, see e.g. Casas, Tröltzsch and Unger [12], Bonnans [6], Casas, Mateos and Tröltzsch [11] and Casas and Tröltzsch [13].

Parabolic optimal control problems with state constraints were discussed in several articles. For a semilinear equation in the presence of pure-state constraints, Raymond and Tröltzsch [29], and Krumbiegel and Rehberg [20] obtained second order sufficient conditions. Casas, de Los Reyes, and Tröltzsch [10] and de Los Reyes, Merino, Rehberg and Tröltzsch [14] proved sufficient second order conditions for semilinear equations, both in the elliptic and parabolic cases. The articles mentioned in this paragraph did not consider bilinear terms as we do in the current work.

Further details regarding the existing results on second order analysis of control-affine state-constrained problems are given in the second part [2] of this research.

The contributions of this paper are first and second order necessary optimality conditions for an optimal control problem for a semilinear parabolic equation with cubic nonlinearity, several controls coupled with the state variable through bilinear terms, pointwise control constraints and state constraints that are integral in space. To incorporate the state constraints we use the concept of alternative costates (see Bonnans and Jaisson [8]) and the concept of quasi-radial directions (see Bonnans and Shapiro [9] and Aronna, Bonnans and Goh [3]).

The paper is organized as follows. In Section 2 the problem is stated and main assumptions are formulated. In Section 3 first order analysis is done. Section 4 is devoted to second order necessary conditions. Finally, in the appendix, we give an example satisfying the hypotheses of our main results.

Notation

Let $\Omega$ be an open and bounded subset of $\mathbb{R}^{n}$, $n\leq 3$, with $C^{\infty}$ boundary $\partial\Omega$. Given $p\in[1,\infty]$ and $k\in\mathbb{N}$, let $W^{k,p}(\Omega)$ be the Sobolev space of functions in $L^{p}(\Omega)$ with derivatives (here and after, derivatives w.r.t. $x\in\Omega$ or w.r.t. time are taken in the sense of distributions) in $L^{p}(\Omega)$ up to order $k$. Let $\mathcal{D}(\Omega)$ be the set of $C^{\infty}$ functions with compact support in $\Omega$. By $W^{k,p}_{0}(\Omega)$ we denote the closure of $\mathcal{D}(\Omega)$ with respect to the $W^{k,p}$-topology. Given a horizon $T>0$, we write $Q:=\Omega\times(0,T)$. We let $\left\lVert\cdot\right\rVert_{p}$ denote the norm in $L^{p}(0,T)$, $L^{p}(\Omega)$ and $L^{p}(Q)$, indistinctly. When a function depends on both space and time, but the norm is computed only with respect to one of these variables, we specify both the space and the domain. For example, if $y\in L^{p}(Q)$ and we fix $t\in(0,T)$, we write $\|y(\cdot,t)\|_{L^{p}(\Omega)}$. For the $p$-norm in $\mathbb{R}^{m}$, for $m\in\mathbb{N}$, we use $|\cdot|_{p}$; for the Euclidean norm we omit the index. We set $H^{k}(\Omega):=W^{k,2}(\Omega)$ and $H^{k}_{0}(\Omega):=W_{0}^{k,2}(\Omega)$, with dual denoted by $H^{-k}(\Omega)$. By $W^{2,1,p}(Q)$ we mean the Sobolev space of $L^{p}(Q)$-functions whose second derivative in space and first derivative in time belong to $L^{p}(Q)$. For $p>n+1$, we denote by $Y_{p}$ the set of elements of $W^{2,1,p}(Q)$ with zero trace on $\Sigma$, and by $Y^{0}_{p}$ its trace at time zero. We write $H^{2,1}(Q)$ for $W^{2,1,2}(Q)$ and, setting $\Sigma:=\partial\Omega\times(0,T)$, we define the state space as

$$Y:=\{y\in H^{2,1}(Q);\ y=0\ \text{a.e. on }\Sigma\}.$$

The latter is continuously embedded in

$$W(0,T):=\{y\in L^{2}(0,T;H^{1}_{0}(\Omega));\ \dot{y}\in L^{2}(0,T;H^{-1}(\Omega))\}.$$

Note that if $y$ is a function over $Q$, we use $\dot{y}$ to denote its time derivative in the sense of distributions. As usual we denote the spatial gradient and the Laplacian by $\nabla$ and $\Delta$. By $\operatorname{dist}(t,I):=\inf\{|t-\bar{t}|;\;\bar{t}\in I\}$ for $I\subset\mathbb{R}$, we denote the distance of $t$ to the set $I$.

2. Statement of the problem and main assumptions

In this section we introduce the optimal control problem we deal with and we show well-posedness of the state equation and existence of solutions of the optimal control problem.

2.1. Setting

Consider the state equation

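$$\begin{cases}\dot{y}(x,t)-\Delta y(x,t)+\gamma y^{3}(x,t)=f(x,t)+y(x,t)\sum_{i=0}^{m}u_{i}(t)b_{i}(x)&\text{in }Q,\\ y=0&\text{on }\Sigma,\\ y(\cdot,0)=y_{0}&\text{in }\Omega,\end{cases}\tag{2.1}$$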

and

$$y_{0}\in H^{1}_{0}(\Omega),\quad f\in L^{2}(Q),\quad b\in L^{\infty}(\Omega)^{m+1},\tag{2.2}$$

$\gamma\geq 0$ , $u_{0}\equiv 1$ is a constant, and $u:=(u_{1},\ldots,u_{m})\in L^{2}(0,T)^{m}$ . Lemma 2.3 below shows that for each control $u\in{L^{2}(0,T)}^{m},$ there is a unique associated solution $y\in Y$ of (2.1), called the associated state. Let $y[u]$ denote this solution. We consider control constraints of the form $u\in\mathcal{U}_{\rm ad}$ , where

[TABLE]

In some statements, we will consider a specific form of $\mathcal{U}_{\rm ad}$ (see (3.26) below). In addition, we have finitely many linear running state constraints of the form

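$$g_{j}(y(\cdot,t)):=\int_{\Omega}c_{j}(x)y(x,t)\,{\rm d}x+d_{j}\leq 0,\quad\text{for }t\in[0,T],\ j=1,\dots,q,\tag{2.4}$$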

where $c_{j}\in H^{2}({\Omega})\cap H^{1}_{0}({\Omega})$ for $j=1,\dots,q$ , and $d\in\mathbb{R}^{q}$ . The $H^{1}_{0}(\Omega)$ regularity of $c$ is used in Lemma 3.2 to derive regularity results for the adjoint state and the $H^{2}(\Omega)$ regularity in Proposition 3.11 for results on the Lagrange multiplier associated with the state constraint.

We call any $(u,y[u])\in L^{2}(0,T)^{m}\times Y$ a trajectory, and if it additionally satisfies the control and state constraints, we say it is an admissible trajectory. The cost function is

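$$\begin{aligned}J(u,y):={}&\frac{1}{2}\int_{Q}(y(x,t)-y_{d}(x))^{2}\,{\rm d}x\,{\rm d}t\\ &+\frac{1}{2}\int_{\Omega}(y(x,T)-y_{dT}(x))^{2}\,{\rm d}x+\sum_{i=1}^{m}\alpha_{i}\int_{0}^{T}u_{i}(t)\,{\rm d}t,\end{aligned}\tag{2.5}$$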

where

$$y_{d}\in L^{2}(Q),\quad y_{dT}\in H^{1}_{0}(\Omega),\tag{2.6}$$

and $\alpha\in\mathbb{R}^{m}$ . We consider the optimal control problem

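$$\mathop{\rm Min}_{u\in\mathcal{U}_{\rm ad}}J(u,y[u]);\quad\text{subject to the state constraints (2.4).}\tag{P}$$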

For problem (P) we consider the two types of solution given next.

Definition 2.1.

Let $\bar{u}\in\mathcal{U}_{\rm ad}$ . We say that $(\bar{u},y[\bar{u}])$ is an $L^{2}$ -local solution (resp., $L^{\infty}$ -local solution) if there exists $\varepsilon>0$ such that $(\bar{u},y[\bar{u}])$ is a minimum among the admissible trajectories $(u,y)$ that satisfy $\|u-\bar{u}\|_{2}<\varepsilon$ (resp., $\|u-\bar{u}\|_{\infty}<\varepsilon$ ).

2.2. Well-posedness of the state equation

Here we study the state equation and analyze, by means of the Implicit Function Theorem, the control-to-state mapping, i.e. the mapping that associates to each control, the corresponding solution of the state equation. We start by the following easily checked technical result.

Lemma 2.2.

For $i=0,\dots,m$ , the mapping defined on $L^{2}(0,T)\times L^{\infty}({\Omega})\times L^{\infty}(0,T;L^{2}({\Omega}))$ , given by $(u_{i},b_{i},y)\mapsto u_{i}b_{i}y,$ has image in $L^{2}(Q)$ , is of class $C^{\infty}$ , and satisfies

$$\|u_{i}b_{i}y\|_{2}\leq\|u_{i}\|_{2}\|b_{i}\|_{\infty}\|y\|_{L^{\infty}(0,T;L^{2}(\Omega))}.\tag{2.7}$$
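The estimate in Lemma 2.2 can be checked in a few lines. The following is a minimal sketch of that computation, added here for illustration and not taken from the original text; it uses only Fubini's theorem and the definitions of the norms involved.

```latex
\[
\begin{aligned}
\|u_i b_i y\|_{2}^{2}
 &= \int_0^T u_i(t)^2 \int_\Omega b_i(x)^2\, y(x,t)^2 \,{\rm d}x\,{\rm d}t \\
 &\le \|b_i\|_\infty^2 \int_0^T u_i(t)^2\, \|y(\cdot,t)\|_{L^2(\Omega)}^2 \,{\rm d}t \\
 &\le \|b_i\|_\infty^2\, \|y\|_{L^\infty(0,T;L^2(\Omega))}^2\, \|u_i\|_2^2 .
\end{aligned}
\]
```

Taking square roots gives the stated bound. Since the mapping is trilinear and, by this bound, continuous, it is of class $C^{\infty}$.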

An existence and uniqueness result and a priori estimates for the state follow.

Lemma 2.3.

The state equation (2.1) has a unique solution $y=y[u,y_{0},f]$ in $Y$. The mapping $(u,y_{0},f)\mapsto y[u,y_{0},f]$ is $C^{\infty}$ from $L^{2}(0,T)^{m}\times H^{1}_{0}(\Omega)\times L^{2}(Q)$ to $Y$, and nondecreasing w.r.t. $y_{0}$ and $f$. In addition, there exist functions $C_{i}$, $i=1,2$, nondecreasing w.r.t. each component, such that

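$$\|y\|_{L^{\infty}(0,T;L^{2}(\Omega))}+\|\nabla y\|_{2}\leq C_{1}(\|y_{0}\|_{2},\|f\|_{2},\|u\|_{2}\|b\|_{\infty}),\tag{2.8}$$

$$\|y\|_{Y}\leq C_{2}(\|y_{0}\|_{H^{1}_{0}(\Omega)},\|f\|_{2},\|u\|_{2}\|b\|_{\infty}).\tag{2.9}$$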

Moreover, the state $y$ also belongs to $C([0,T];H^{1}_{0}(\Omega))$ , since $Y$ is continuously embedded in that space [24, Theorem 3.1, p.23].

In the proof that follows, we use several times the (continuous) Sobolev inclusion

$$H^{1}_{0}(\Omega)\subset L^{6}(\Omega),\quad\text{when }n\leq 3.\tag{2.10}$$

Proof.

(i) Observe first that by the standard Sobolev inclusions and Lemma 2.2, any $y\in Y$ is such that $y^{3}$ and $y\sum_{i=0}^{m}u_{i}b_{i}$ belong to $L^{2}(Q)$ . So, $\dot{y}-\Delta y\in L^{2}(Q)$ and, therefore, the notion of solution of the state equation in $Y$ is clear. We could as well define a solution in $W(0,T)$ but since by (2.10), for $n\leq 3$ , $W(0,T)\subset L^{2}(0,T;L^{6}({\Omega})),$ and the compatibility condition (equality between the trace of the initial condition on $\partial{\Omega}$ and the Dirichlet condition on $\Sigma$ ) holds, it follows then that any solution in $W(0,T)$ is a solution in $Y$ .
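To make the first claim of step (i) concrete, here is a short sketch of why $y^{3}\in L^{2}(Q)$ for $y\in Y$. It relies on the Sobolev inclusion (2.10) and on the continuous embedding of $Y$ in $C([0,T];H^{1}_{0}(\Omega))$ recalled after Lemma 2.3; the symbol $C_{S}$ for the embedding constant of (2.10) is introduced here for illustration only.

```latex
\[
\|y^{3}\|_{L^{2}(Q)}^{2}
 = \int_0^T \|y(\cdot,t)\|_{L^{6}(\Omega)}^{6}\,{\rm d}t
 \le C_S^{6} \int_0^T \|y(\cdot,t)\|_{H^{1}_{0}(\Omega)}^{6}\,{\rm d}t
 \le C_S^{6}\, T\, \|y\|_{C([0,T];H^{1}_{0}(\Omega))}^{6} < \infty .
\]
```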

(ii) We establish the a priori estimates (2.8)-(2.9). Multiplying the state equation by $y$ and integrating over ${\Omega}$ , we get

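$$\begin{aligned}&\frac{1}{2}\frac{\rm d}{{\rm d}t}\int_{\Omega}y(x,t)^{2}\,{\rm d}x+\int_{\Omega}|\nabla y(x,t)|^{2}\,{\rm d}x+\gamma\int_{\Omega}y(x,t)^{4}\,{\rm d}x\\ &\qquad\leq\frac{1}{2}\int_{\Omega}f(x,t)^{2}\,{\rm d}x+\Big(\frac{1}{2}+|u(t)|_{1}\|b\|_{\infty}\Big)\int_{\Omega}y(x,t)^{2}\,{\rm d}x.\end{aligned}\tag{2.11}$$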

In particular, $\eta(t):=\int_{\Omega}y(x,t)^{2}{\rm d}x$ satisfies

$$\dot{\eta}(t)\leq\int_{\Omega}f(x,t)^{2}\,{\rm d}x+(1+2|u(t)|_{1}\|b\|_{\infty})\eta(t).\tag{2.12}$$

By Gronwall’s Lemma:

$$\|\eta\|_{\infty}\leq(\|y_{0}\|_{2}^{2}+\|f\|_{2}^{2})\,e^{T+2\|u\|_{1}\|b\|_{\infty}}\tag{2.13}$$

and then (2.8) easily follows.
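The Gronwall step can be written out explicitly. The sketch below is an illustration and not part of the original text; the abbreviations $a(t)$ and $\phi(t)$ are introduced here, and $\|u\|_{1}$ is assumed to mean $\int_{0}^{T}|u(t)|_{1}\,{\rm d}t$.

```latex
\[
\begin{aligned}
& a(t) := 1 + 2|u(t)|_1\|b\|_\infty, \qquad
  \phi(t) := \int_\Omega f(x,t)^2\,{\rm d}x, \qquad
  \dot\eta \le \phi + a\,\eta, \quad \eta(0)=\|y_0\|_2^2, \\
& \frac{\rm d}{{\rm d}t}\Big( e^{-\int_0^t a}\,\eta(t) \Big)
  \le e^{-\int_0^t a}\,\phi(t) \le \phi(t), \\
& \eta(t) \le e^{\int_0^t a}\Big( \eta(0) + \int_0^t \phi(s)\,{\rm d}s \Big)
  \le e^{\,T + 2\|u\|_1\|b\|_\infty}\big( \|y_0\|_2^2 + \|f\|_2^2 \big).
\end{aligned}
\]
```

This bounds $\|y\|_{L^{\infty}(0,T;L^{2}(\Omega))}$; integrating the energy inequality in time then controls $\|\nabla y\|_{2}$ as well.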

Now multiplying the state equation by $\dot{y}$ we get, for all $\varepsilon>0$ ,

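$$\begin{aligned}&\int_{\Omega}\dot{y}(x,t)^{2}\,{\rm d}x+\frac{1}{2}\frac{\rm d}{{\rm d}t}\int_{\Omega}|\nabla y(x,t)|^{2}\,{\rm d}x+\frac{\gamma}{4}\frac{\rm d}{{\rm d}t}\int_{\Omega}y(x,t)^{4}\,{\rm d}x\\ &\qquad\leq\frac{1}{\varepsilon}\int_{\Omega}f(x,t)^{2}\,{\rm d}x+\frac{1}{\varepsilon}|u(t)|^{2}\|b\|^{2}_{\infty}\int_{\Omega}y(x,t)^{2}\,{\rm d}x+\frac{\varepsilon}{2}\int_{\Omega}\dot{y}(x,t)^{2}\,{\rm d}x.\end{aligned}\tag{2.14}$$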

Choosing $\varepsilon=1$ we get, after cancellation,

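$$\begin{aligned}&\int_{\Omega}\dot{y}(x,t)^{2}\,{\rm d}x+\frac{\rm d}{{\rm d}t}\int_{\Omega}|\nabla y(x,t)|^{2}\,{\rm d}x+\frac{\gamma}{2}\frac{\rm d}{{\rm d}t}\int_{\Omega}y(x,t)^{4}\,{\rm d}x\\ &\qquad\leq 2\int_{\Omega}f(x,t)^{2}\,{\rm d}x+2|u(t)|^{2}\|b\|^{2}_{\infty}\int_{\Omega}y(x,t)^{2}\,{\rm d}x.\end{aligned}\tag{2.15}$$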
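The identities behind the estimate obtained by testing with $\dot{y}$ can be spelled out as follows. This is a sketch, assuming $y$ is smooth enough to justify the formal computations (for instance at the level of Galerkin approximations). The weighted Young inequality $ab\leq a^{2}/\varepsilon+\varepsilon b^{2}/4$ is the choice made here to reproduce the constants; it is not stated in the original text.

```latex
\[
\begin{aligned}
\int_\Omega \dot{y}\,(-\Delta y)\,{\rm d}x
  &= \int_\Omega \nabla\dot{y}\cdot\nabla y\,{\rm d}x
   = \frac{1}{2}\frac{\rm d}{{\rm d}t}\int_\Omega |\nabla y|^2\,{\rm d}x
   \qquad (\dot{y}=0 \text{ on } \Sigma), \\
\gamma\int_\Omega \dot{y}\,y^3\,{\rm d}x
  &= \frac{\gamma}{4}\frac{\rm d}{{\rm d}t}\int_\Omega y^4\,{\rm d}x, \\
\int_\Omega \dot{y}\,g\,{\rm d}x
  &\le \frac{1}{\varepsilon}\int_\Omega g^2\,{\rm d}x
     + \frac{\varepsilon}{4}\int_\Omega \dot{y}^2\,{\rm d}x,
   \qquad g\in\Big\{f,\; y\sum_{i=0}^{m}u_i b_i\Big\}.
\end{aligned}
\]
```

Summing the two instances of the last line produces the term $\frac{\varepsilon}{2}\int_{\Omega}\dot{y}^{2}$. With $\varepsilon=1$ this term is absorbed into the left-hand side, and multiplying by 2 gives the inequality obtained after cancellation.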

For $\tau\in[0,T)$, integrating from $0$ to $\tau$, and using (2.10), we obtain that

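$$\|y\|_{H^{1}(0,T;L^{2}(\Omega))}+\|\nabla y\|_{L^{\infty}(0,T;L^{2}(\Omega))}\leq C_{2}(\|y_{0}\|_{H^{1}_{0}(\Omega)},\|f\|_{2},\|u\|_{2}\|b\|_{\infty}).\tag{2.16}$$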

We easily deduce (2.9) since we can estimate $\|\Delta y\|_{L^{2}(Q)}$ and, therefore, also $\|y\|_{L^{2}(0,T;H^{2}({\Omega}))}$ with the previous relations.

(iii) We construct a sequence $y_{k}$ of Galerkin approximations for which estimates analogous to (2.8) hold. Some subsequence weakly converges in $W(0,T)$ to some $y$ and is such that the sequence $y^{3}_{k}$ , bounded in $L^{2}(Q)$ , weakly converges in this space. By the Aubin-Lions lemma [4], the injection of $W(0,T)$ into $L^{2}(Q)$ is compact. So (extracting again a subsequence if necessary), $y^{3}_{k}$ converges a.e. to $y^{3}$ . By Lions [22, Lem. 1.3, p. 12], the weak limit of $y^{3}_{k}$ is $y^{3}$ , and $y$ is therefore solution of the state equation.

(iv) The $C^{\infty}$ regularity of $y[u,y_{0},f]$ is a consequence of the Implicit Function Theorem. In fact, let $Y^{0}$ denote the trace at time 0 of elements of $Y$ , which with the trace norm is a Banach space containing $H^{1}_{0}({\Omega})$ . Then the mapping $F:L^{2}(0,T)\times Y\times Y^{0}\times L^{2}(Q)\rightarrow L^{2}(Q)\times Y^{0}$ defined by

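$$F(u,y,y_{0},f):=\Big(\dot{y}-\Delta y+\gamma y^{3}-y\sum_{i=1}^{m}u_{i}b_{i},\ y(0)-y_{0}\Big),\tag{2.17}$$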

is of class $C^{\infty}$ . That the linearized mapping $D_{y}F$ is bijective follows from results already shown in this proof.

(v) Uniqueness follows from the monotonicity w.r.t. $(y_{0},f)$ , that we prove as follows. Consider the difference $z:=y_{2}-y_{1}$ of two solutions $y_{1}$ and $y_{2}$ of (2.1), with data $(y_{01},f_{1})\leq(y_{02},f_{2})$ , resp. By the Mean Value Theorem, $z$ is solution of

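$$\dot{z}-\Delta z+z\sum_{i=1}^{m}u_{i}b_{i}+3\gamma\hat{y}^{2}z=\tilde{f};\quad z(0)=\tilde{y}_{0}\tag{2.18}$$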

where $\hat{y}\in[y_{1},y_{2}]$ a.e., $\tilde{y}_{0}:=y_{02}-y_{01}\leq 0$ and $\tilde{f}:=f_{2}-f_{1}\leq 0$. Testing the equation with $z_{+}:=\max(z,0)$ we get that $\nu(t):=\int_{\Omega}z^{2}_{+}$ satisfies

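$$\frac{1}{2}\dot{\nu}-|u(t)|\|b\|_{\infty}\nu(t)\leq\frac{1}{2}\dot{\nu}+\int_{\Omega}z^{2}_{+}\sum_{i=1}^{m}u_{i}b_{i}\leq\int_{\Omega}\tilde{f}z_{+}\leq 0\tag{2.19}$$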

and applying Gronwall’s inequality we obtain that $z_{+}=0$ . ∎
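For completeness, the final Gronwall argument can be sketched as follows. This is an illustration that uses only the differential inequality for $\nu$ and the sign condition $\tilde{y}_{0}\leq 0$ stated above.

```latex
\[
\dot\nu(t) \le 2\,|u(t)|\,\|b\|_\infty\,\nu(t), \qquad
\nu(0) = \int_\Omega (\tilde{y}_0)_+^2\,{\rm d}x = 0
\quad\Longrightarrow\quad
0 \le \nu(t) \le \nu(0)\,\exp\Big( 2\|b\|_\infty \int_0^t |u(s)|\,{\rm d}s \Big) = 0 .
\]
```

Hence $z_{+}=0$ a.e. in $Q$. Applying this with equal data, and then with the roles of the two solutions exchanged, gives uniqueness.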

In the analysis that follows, we fix a trajectory $(\bar{u},\bar{y}=y[\bar{u}]).$

For this trajectory $(\bar{u},\bar{y})$ , let us consider the linear continuous operator $A$ from $L^{2}(0,T;H^{2}({\Omega}))$ to $L^{2}(Q)$ such that, for each $z\in Y$ and $(x,t)\in Q,$

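$$(Az)(x,t):=-\Delta z(x,t)+3\gamma\bar{y}(x,t)^{2}z(x,t)-\sum_{i=0}^{m}\bar{u}_{i}(t)b_{i}(x)z(x,t).\tag{2.20}$$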

Lemma 2.4.

For any $\bar{f}\in L^{2}(Q)$ , the equation

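$$\begin{cases}\dot{z}+Az=\bar{f}&\text{in }Q,\\ z=0&\text{on }\Sigma,\\ z(x,0)=0&\text{in }\Omega,\end{cases}\tag{2.21}$$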

has a unique solution $z\in Y$ that verifies

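$$\|z\|_{L^{\infty}(0,T;L^{2}(\Omega))}\leq e^{\frac{1}{2}T+\sum_{i=0}^{m}\|\bar{u}_{i}\|_{1}\|b_{i}\|_{\infty}}\|\bar{f}\|_{L^{2}(0,T;L^{2}(\Omega))}.\tag{2.22}$$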

Proof.

We follow the same method used in Lemma 2.3. Multiplying (2.21) by $z(x,t)$ and integrating over space we obtain that for a.a. $t\in(0,T)$

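$$\begin{aligned}\frac{1}{2}\frac{\rm d}{{\rm d}t}\|z(\cdot,t)\|_{L^{2}(\Omega)}^{2}&+\|\nabla z(\cdot,t)\|_{L^{2}(\Omega)}^{2}+3\gamma\|\bar{y}(\cdot,t)z(\cdot,t)\|_{L^{2}(\Omega)}^{2}\\ &=\int_{\Omega}z(x,t)\Big(\bar{f}(x,t)+\sum_{i=0}^{m}\bar{u}_{i}(t)\,b_{i}(x)z(x,t)\Big)\,{\rm d}x.\end{aligned}\tag{2.23}$$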

The r.h.s. of (2.23) can be bounded above by

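$$\|\bar{f}(\cdot,t)\|^{2}_{L^{2}(\Omega)}+\Big(\frac{1}{2}+\sum_{i=0}^{m}|\bar{u}_{i}|\|b_{i}\|_{\infty}\Big)\|z(\cdot,t)\|^{2}_{L^{2}(\Omega)}.\tag{2.24}$$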

Then we deduce the estimate (2.22) with Gronwall’s Lemma. ∎
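One way to arrive at the constant in (2.22) is sketched below. This is an illustration under the assumption that the source term is handled with Young's inequality in the form $\int_{\Omega}z\bar{f}\leq\frac{1}{2}\|\bar{f}\|^{2}+\frac{1}{2}\|z\|^{2}$; the abbreviation $a(t)$ is introduced here, and the nonnegative terms $\|\nabla z\|^{2}$ and $3\gamma\|\bar{y}z\|^{2}$ are dropped.

```latex
\[
\begin{aligned}
& a(t) := \sum_{i=0}^{m} |\bar{u}_i(t)|\,\|b_i\|_\infty, \qquad
  \frac{\rm d}{{\rm d}t}\|z(\cdot,t)\|_{L^2(\Omega)}^2
  \le \|\bar{f}(\cdot,t)\|_{L^2(\Omega)}^2 + \big(1 + 2a(t)\big)\|z(\cdot,t)\|_{L^2(\Omega)}^2,
  \qquad z(\cdot,0)=0, \\
& \|z(\cdot,t)\|_{L^2(\Omega)}^2
  \le e^{\int_0^t (1+2a)} \int_0^t \|\bar{f}(\cdot,s)\|_{L^2(\Omega)}^2\,{\rm d}s
  \le e^{\,T + 2\sum_{i=0}^{m}\|\bar{u}_i\|_1\|b_i\|_\infty}\,\|\bar{f}\|_{L^2(Q)}^2 .
\end{aligned}
\]
```

Taking square roots and the supremum over $t\in[0,T]$ yields the factor $e^{\frac{1}{2}T+\sum_{i=0}^{m}\|\bar{u}_{i}\|_{1}\|b_{i}\|_{\infty}}$.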

2.3. Existence of solution of the optimal control problem

In order to study the existence of local solutions, we need to establish the sequential weak continuity of the control-to-state mapping. We use ’ $\rightharpoonup$ ’ to denote the weak convergence of a sequence, the space being indicated in each case. We need the following result (see [23, p. 14]):

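$$\text{For any }p\in[1,10)\text{, the following injection is compact: }\ Y\hookrightarrow L^{p}(0,T;L^{10}(\Omega)),\quad\text{when }n\leq 3.\tag{2.25}$$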

Lemma 2.5.

The mapping $u\mapsto y[u]$ is sequentially weakly continuous from ${L^{2}(0,T)}^{m}$ into $Y$ .

Proof.

Taking $u_{\ell}\rightharpoonup\bar{u}$ in ${L^{2}(0,T)}^{m}$ , we shall prove that $y_{\ell}\rightharpoonup\bar{y}$ in $Y$ , where $y_{\ell}:=y[u_{\ell}]$ , $\bar{y}:=y[\bar{u}].$ We know that it is enough to check that any subsequence of $y_{\ell}$ weakly converges to $\bar{y}$ in $Y$ . To do this, we prove that we can pass to the limit in each term of the state equation.

(a) We know by Lemma 2.3 that $y_{\ell}$ is bounded in $Y$ , so extracting a subsequence if necessary, we may assume that it weakly converges in $Y$ to some $\hat{y}$ . By (2.25), $y_{\ell}\rightarrow\hat{y}$ in $L^{6}(Q)$ and, therefore, maybe for a subsequence, it converges almost everywhere in $Q$ .

Let $\nu\in[2,5]$ be an integer. Set $\sigma:=6/\nu$. By the mean value theorem, $y_{\ell}^{\nu}-\hat{y}^{\nu}=\nu\tilde{y}_{\ell}^{\nu-1}(y_{\ell}-\hat{y})$, with $\tilde{y}_{\ell}(x,t)\in[y_{\ell}(x,t),\hat{y}(x,t)]$ a.e. Obviously $\tilde{y}_{\ell}$ is measurable and bounded in $L^{6}(Q)$. By Hölder’s inequality, with $p=\nu/(\nu-1)$ and $q=6/\sigma=\nu$ (note that $1/p+1/q=1$), we get

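$$\begin{aligned}\frac{1}{\nu^{\sigma}}\|y_{\ell}^{\nu}-\hat{y}^{\nu}\|^{\sigma}_{\sigma}=\int_{Q}\tilde{y}_{\ell}^{\sigma(\nu-1)}(y_{\ell}-\hat{y})^{\sigma}\,{\rm d}x\,{\rm d}t&\leq\|\tilde{y}_{\ell}^{\sigma(\nu-1)}\|_{p}\|(y_{\ell}-\hat{y})^{\sigma}\|_{q}\\ &=\|\tilde{y}_{\ell}\|^{\sigma(\nu-1)}_{6}\|y_{\ell}-\hat{y}\|_{6}^{\sigma}.\end{aligned}\tag{2.26}$$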

Therefore, $y_{\ell}^{\nu}\rightarrow\hat{y}^{\nu}$ in $L^{\sigma}(Q)$ . Taking $\nu=3$ we get the desired result.
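As a worked instance of the argument in step (a), consider the case $\nu=3$ needed for the cubic term. The exponents then become $\sigma=2$, $p=3/2$ and $q=3$, and the Hölder estimate reads as follows (an illustration obtained by substituting these values, not an additional result):

```latex
\[
\frac{1}{9}\,\|y_\ell^{3}-\hat{y}^{3}\|_{2}^{2}
 \le \|\tilde{y}_\ell\|_{6}^{4}\,\|y_\ell-\hat{y}\|_{6}^{2},
 \qquad\text{that is,}\qquad
 \|y_\ell^{3}-\hat{y}^{3}\|_{2}
 \le 3\,\|\tilde{y}_\ell\|_{6}^{2}\,\|y_\ell-\hat{y}\|_{6} .
\]
```

Since $\tilde{y}_{\ell}$ is bounded in $L^{6}(Q)$ and $y_{\ell}\rightarrow\hat{y}$ in $L^{6}(Q)$, the right-hand side tends to zero, so $y_{\ell}^{3}\rightarrow\hat{y}^{3}$ in $L^{2}(Q)$.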

(b) We claim that $u_{\ell}y_{\ell}b$ weakly converges in $L^{2}(Q)$ to $\bar{u}\hat{y}b$. It is enough to get the result when $m=1$. Fix $\varphi$ in $L^{\infty}(Q)$. By Lemma 2.2, $u_{\ell}y_{\ell}$ is bounded in $L^{2}(Q)$ and has therefore (up to a subsequence) a weak limit $w$ in that space. Since $y_{\ell}\rightarrow\hat{y}$ in $L^{6}(Q)$, $\int_{Q}u_{\ell}(y_{\ell}-\hat{y})b\varphi\rightarrow 0$. On the other hand, $\int_{Q}u_{\ell}\hat{y}b\varphi\rightarrow\int_{Q}\bar{u}\hat{y}b\varphi$ since $\hat{y}b\varphi\in L^{2}(Q)$. Therefore $\int_{Q}u_{\ell}y_{\ell}b\varphi\rightarrow\int_{Q}\bar{u}\hat{y}b\varphi$. Since $L^{\infty}(Q)$ is a dense subset of $L^{2}(Q)$, the claim follows.

By steps (a)-(b), we can pass to the limit in the weak formulation, and obtain (due to the uniqueness of solution) that $\hat{y}=\bar{y}$ . The conclusion follows. ∎

Theorem 2.6.

(i) The function $u\mapsto J(u,y[u])$, from $L^{2}(0,T)^{m}$ to $\mathbb{R}$, is weakly sequentially l.s.c.

(ii) The set of solutions of the optimal control problem (P) is weakly sequentially closed in $L^{2}(0,T)^{m}$.

(iii) If (P) has a bounded minimizing sequence, the set of solutions of (P) is nonempty. This is the case in particular if (P) is admissible and $\mathcal{U}_{\rm ad}$ is a nonempty, bounded subset of $L^{2}(0,T)^{m}$.

Proof.

(i) Combine Lemma 2.5 and the fact that the cost function $J$ is continuous and convex on ${L^{2}(0,T)}^{m}\times Y$ , hence it is also weakly lower semicontinuous over this product space.

(ii) Let $(u_{\ell})\subset L^{2}(0,T)^{m}$ be a sequence of solutions weakly converging to $\bar{u}\in L^{2}(0,T)^{m},$ with associated states $y_{\ell}$ . By Lemma 2.5, $(y_{\ell})$ weakly converge in $Y$ to the state $\bar{y}$ associated with $\bar{u}$ and, by point (i), $J(\bar{u},\bar{y})\leq\liminf_{\ell}J(u_{\ell},y_{\ell})$ . This lower limit being nothing but the value of problem (P), the conclusion follows.

(iii) By the previous arguments, a weak limit of a minimizing sequence is a solution of (P). This weak limit exists iff the sequence is bounded. This concludes the proof. ∎

3. First order analysis

In this section we state first order necessary optimality conditions. More precisely, we introduce the adjoint equation, and define and prove existence of associated Lagrange multipliers.

Throughout the section, $(\bar{u},\bar{y})$ is a trajectory of problem (P). We recall the hypotheses (2.2), (2.6) on the data, and the definition of the operator $A$ given in (2.20).

3.1. Linearized state equation and costate equation

The linearized state equation at $(\bar{u},\bar{y})$ is given by

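$$\begin{cases}\dot{z}+Az=\sum_{i=1}^{m}v_{i}b_{i}\bar{y}&\text{in }Q;\\ z=0&\text{on }\Sigma,\\ z(\cdot,0)=0&\text{on }\Omega,\end{cases}\tag{3.1}$$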

For $v\in{L^{2}(0,T)}^{m}$, equation (3.1) above possesses a unique solution $z[v]\in Y$ (as follows from Lemma 2.4), and the mapping $v\mapsto z[v]$ is linear and continuous from ${L^{2}(0,T)}^{m}$ to $Y$. In particular, the following estimate holds.

Proposition 3.1.

One has

$$\|z\|_{L^{\infty}(0,T;L^{2}(\Omega))}\leq M_{1}\sum_{i=1}^{m}\|b_{i}\|_{\infty}\|v_{i}\|_{1},\tag{3.2}$$

where $M_{1}:=e^{\frac{T}{2}+\sum_{i=0}^{m}\|\bar{u}_{i}\|_{1}\|b_{i}\|_{\infty}}\|\bar{y}\|_{L^{\infty}(0,T;L^{2}(\Omega))}.$

Proof.

Immediate consequence of Lemma 2.4. ∎

It is well known that the dual of $C([0,T])$ is the set of (finite) Radon measures, and that the action of a finite Radon measure coincides with the Stieltjes integral associated with a bounded variation function $\mu\in BV(0,T)$. We may assume w.l.o.g. that $\mu(T)=0$, and we let ${\rm d}\mu$ denote the Radon measure associated with $\mu$. Note that if ${\rm d}\mu$ belongs to the set $\mathcal{M}_{+}(0,T)$ of nonnegative finite Radon measures then we may take $\mu$ nondecreasing. Set

[TABLE]

The generalized Lagrangian of problem $(P)$ is, choosing the multiplier of the state equation to be $(p,p_{0})\in L^{2}(Q)\times H^{-1}({\Omega})$ and taking $\beta\in\mathbb{R}_{+}$ , $\mu\in BV(0,T)^{q}_{0,+},$

[TABLE]

The costate equation is the condition of stationarity of the Lagrangian ${\mathcal{L}}$ with respect to the state, that is, for any $z\in Y$:

[TABLE]

To each $(\varphi,\psi)\in L^{2}(Q)\times H^{1}_{0}({\Omega})$ , let us associate $z=z[\varphi,\psi]\in Y$ , the unique solution of

[TABLE]

Since this mapping is onto, the costate equation (3.5) can be rewritten, for $z=z[\varphi,\psi]$ and arbitrary $(\varphi,\psi)\in L^{2}(Q)\times H^{1}_{0}({\Omega})$ , as

[TABLE]

The r.h.s. of (3.7) can be seen as a linear continuous form on the pairs $(\varphi,\psi)$ of the space $L^{2}(Q)\times H^{1}_{0}(\Omega)$. By the Riesz Representation Theorem, there exists a unique $(p,p_{0})\in L^{2}(Q)\times H^{-1}(\Omega)$ satisfying (3.7); that is, there is a unique solution of the costate equation.

Next consider the alternative costates

[TABLE]

Lemma 3.2.

Let $(p,p_{0},\mu)\in L^{2}(Q)\times H^{-1}(\Omega)\times BV(0,T)^{q}_{0,+}$ satisfy (3.7), let $(p^{1},p^{1}_{0})$ be given by (3.8). Then $p^{1}\in Y$ , it satisfies $p^{1}(0)=p^{1}_{0}$ , and it is the unique solution of

[TABLE]

Moreover, $p(x,0)$ and $p(x,T)$ are well-defined as elements of $H^{1}_{0}(\Omega)$ in view of (3.8), and we have

[TABLE]

Proof.

Let $z\in Y$ . Note that, for $1\leq j\leq q$ , the function $t\mapsto\int_{\Omega}c_{j}(x)z(x,t){\rm d}x$ , belongs to $W^{1,1}(0,T)$ and is, therefore, of bounded variation. Using the integration by parts formula for the product of scalar functions with bounded variation, one of them being continuous (see e.g. [8, Lemma 3.6]), and taking into account the fact that $\mu_{j}(T)=0$ , we get that, for $\psi=z(\cdot,0)$ ,

[TABLE]

By the definition (3.8) of the alternative costate, the latter equation can be rewritten as

[TABLE]

Now adding (3.7) and (3.12), as well as the identity

[TABLE]

we obtain, since $\varphi=\dot{z}+Az$ , that (implicitly identifying, as usual, $L^{2}({\Omega})$ with its dual)

[TABLE]

Since $A$ is symmetric, using (2.6), we see that $p^{1}$ is a solution in $Y$ of (3.9), the solution of the latter being clearly unique. Multiplying (3.9) by $z\in Y$ and integrating over $Q$ , with an integration by parts of the term with $\dot{p}^{1}z$ , we recover (using (3.8)) equation (3.14), which implies that $p^{1}(x,0)=p^{1}_{0}(x)$ for a.a. $x$ in ${\Omega}$ . Conversely, it is easy to prove that any solution of (3.14) is a solution of (3.9).

Since $p^{1}$ and $c_{j}\mu_{j}$ belong to $L^{\infty}(0,T;H^{1}_{0}({\Omega}))$ , by (3.8) also $p$ has this regularity. Use (3.8) again, the final condition on $p^{1}$ and the fact that $\mu(T)=0$ to get the second relation of (3.10). Furthermore, we have

[TABLE]

∎

Corollary 3.3.

If $\mu\in H^{1}(0,T)^{q},$ then $p\in Y$ and

[TABLE]

Proof.

This follows immediately from (3.8) and (3.9). ∎

3.2. First order optimality conditions

Let $(\bar{u},\bar{y})$ be an admissible trajectory of problem $(P)$ . We say that $\mu\in BV(0,T)^{q}_{0,+}$ is complementary to the state constraint for $\bar{y}$ if

[TABLE]

Let $(\beta,\mu)\in\mathbb{R}_{+}\times BV(0,T)^{q}_{0,+}.$ We say that $p\in L^{\infty}(0,T;H^{1}_{0}({\Omega}))$ is the costate associated with $(\bar{u},\bar{y},\beta,\mu)$ , or, for short, with $(\beta,\mu),$ if it is the unique solution of (3.5) with $p_{0}=p(\cdot,0)$ .

Definition 3.4.

We say that the triple $(\beta,p,\mu)\in\mathbb{R}_{+}\times L^{\infty}(0,T;H^{1}_{0}({\Omega}))\times BV(0,T)^{q}_{0,+}$ is a generalized Lagrange multiplier if it satisfies the following first-order optimality conditions: $\mu$ is complementary to the state constraint, $p$ is the costate associated with $(\beta,\mu)$ , the non-triviality condition

[TABLE]

holds and, for $i=1$ to $m$ , defining the switching function by

[TABLE]

one has $\Psi^{p}\in L^{\infty}(0,T)^{m}$ and

[TABLE]

We let $\Lambda(\bar{u},\bar{y})$ denote the set of generalized Lagrange multipliers $(\beta,p,\mu)$ associated with $(\bar{u},\bar{y})$ . If $\beta=0$ we say that the corresponding multiplier is singular. Finally, we write $\Lambda_{1}(\bar{u},\bar{y})$ for the set of pairs $(p,\mu)$ with $(1,p,\mu)\in\Lambda(\bar{u},\bar{y})$ . When the nominal solution is fixed and there is no place for confusion, we just write $\Lambda$ and $\Lambda_{1}.$

Note that, in view of (3.10), $p_{0}=p(\cdot,0)$ and hence we do not need to consider $p_{0}$ as a component of the multiplier.

3.2.1. The reduced abstract problem

Set $F(u):=J(u,y[u]),$ and $G:{L^{2}(0,T)}^{m}\rightarrow C([0,T])^{q}$ , $G(u):=g(y[u])$ . The reduced problem is

[TABLE]

where $K:=C([0,T])_{-}^{q}$ is the closed convex cone of continuous functions over $[0,T],$ with values in $\mathbb{R}_{-}^{q}.$ Its interior is the set of functions in $C([0,T])^{q}$ with negative values. We say that the reduced problem (RP) is qualified at $\bar{u}$ if:

[TABLE]

Given a Banach space $X,$ a closed convex subset $S\subseteq X$ and a point $\bar{s}\in S,$ the normal cone to $S$ at $\bar{s}$ is defined as

[TABLE]

We get the following first order conditions for our problem $(P)$ :

Lemma 3.5.

(i) *If $(\bar{u},y[\bar{u}])$ is an $L^{2}$ -local solution of $(P),$ then the associated set $\Lambda$ of multipliers is nonempty.*

(ii) *If in addition the qualification condition (3.21) holds at $\bar{u}$ , then there is no singular multiplier, and $\Lambda_{1}$ is bounded in $L^{\infty}(0,T;H^{1}_{0}({\Omega}))\times BV(0,T)^{q}_{0,+}$ .*

Proof.

(i) Let us consider the generalized Lagrangian associated with the reduced problem (RP):

[TABLE]

Let $\bar{u}$ be a local solution of (RP). By, e.g., [9, Proposition 3.18], since $K$ has nonempty interior, there exists a generalized Lagrange multiplier associated with problem (RP), that is, $(\beta,{\rm d}\mu)\in\mathbb{R}_{+}\times N_{K}(G(\bar{u}))$ for $\mu\in BV(0,T)^{q}_{0,+}$ such that

[TABLE]

Due to the costate equation (3.7), the latter condition is equivalent to (3.20).

(ii) That $\Lambda_{1}$ is nonempty and weakly-* compact follows from [9, Proposition 3.16]. ∎

Observe that the qualification condition for (RP) given in (3.21) holds if and only if the following qualification condition for the original problem (P) is satisfied:

[TABLE]

In view of Lemma 3.5, if (3.25) is satisfied, then $\Lambda_{1}$ is nonempty and weakly-* compact.

In the sequel of this section, we consider $(\bar{u},\bar{y},\beta,p,\mu),$ with $\bar{y}$ the state associated with the admissible control $\bar{u}$ and $(\beta,p,\mu)\in\Lambda.$

3.3. Arcs and junction points

We assume in the remainder of the article that the admissible set of controls has the form

[TABLE]

for some constants $\check{u}_{i}<\hat{u}_{i}$ , for $i=1,\dots,m.$ Consider the contact sets associated to the control bounds defined, up to null measure sets, by

[TABLE]

For $j=1,\dots,q,$ the contact set associated with the $j$ th state constraint is

[TABLE]

Given $0\leq a<b\leq T$ , we say that $(a,b)$ is a maximal state constrained arc for the $j$ th state constraint if $I^{C}_{j}$ contains $(a,b)$ but contains no open interval strictly containing $(a,b)$ . We define in the same way a maximal (lower or upper) control bound constraint arc (keeping in mind that the latter are defined up to a null measure set).

We will assume the following finite arc property:

[TABLE]

In the sequel we identify $\bar{u}$ (defined up to a null measure set) with a function whose $i$ th component is constant over each interval of time that is included, up to a zero-measure set, in either $\check{I}_{i}$ or $\hat{I}_{i}$ . For almost all $t\in[0,T]$ , the set of active constraints at time $t$ is denoted by $(\check{B}(t),\hat{B}(t),C(t))$ where

[TABLE]

These sets are well-defined over open subsets of $(0,T)$ where the set of active constraints is constant, and by (3.29), there exist time points called junction points

[TABLE]

such that the intervals $(\tau_{k},\tau_{k+1})$ are maximal arcs with constant active constraints, for $k=0,\dots,r-1.$ For short, we sometimes call them maximal arcs.
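To make the notion of junction points concrete, the following sketch locates them on a time grid as the points where the set of active constraints changes. It is an illustration only: the function names are hypothetical, and the arc structure coded in `active_set` is the one we infer from the discussion of the example in Appendix B (upper control bound active on $(0,\log 2)$, state constraint active on $(\log 2,2)$, no active constraint on $(2,3)$).

```python
import math


def active_set(t):
    """Active constraints at time t, as inferred from the example of Appendix B."""
    if t < math.log(2.0):
        return ("upper_control_bound",)
    if t < 2.0:
        return ("state_constraint",)
    return ()


def junction_points(label, T, n):
    """Approximate junction points 0 = tau_0 < ... < tau_r = T on a uniform grid.

    The active set is sampled at cell midpoints; a junction point is recorded
    at every cell boundary across which the sampled active set changes.
    """
    h = T / n
    points = [0.0]
    previous = label(0.5 * h)
    for k in range(1, n):
        current = label((k + 0.5) * h)
        if current != previous:
            points.append(k * h)
            previous = current
    points.append(T)
    return points


taus = junction_points(active_set, T=3.0, n=3000)
```

With this grid, `taus` contains four points, approximating $0$, $\log 2$, $2$ and $3$, so that $r=3$ and each interval $(\tau_{k},\tau_{k+1})$ carries a constant active set.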

Definition 3.6.

For $k=0,\dots,r-1,$ let $\check{B}_{k},\hat{B}_{k},C_{k}$ denote the set of indexes of active lower and upper bound constraints, and state constraints, on the maximal arc $(\tau_{k},\tau_{k+1})$ , and set $B_{k}:=\check{B}_{k}\cup\hat{B}_{k}$ .

As a consequence of the above definitions and of hypothesis (3.26) on the admissible set of controls, we get the following characterization of the first order condition.

Corollary 3.7.

The first order optimality condition (3.20) is equivalent to

[TABLE]

for every $(\beta,p,\mu)\in\Lambda.$
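The pointwise content of such a characterization can be sketched in a few lines. The code below assumes that the first order condition is the usual variational inequality $\sum_{i}\int_{0}^{T}\Psi^{p}_{i}(t)(u_{i}(t)-\bar{u}_{i}(t)){\rm d}t\geq 0$ for all admissible $u$, so that $\bar{u}_{i}$ sits at its lower bound where $\Psi^{p}_{i}>0$ and at its upper bound where $\Psi^{p}_{i}<0$. This sign convention is an assumption made for the sketch (the displayed formulas are not reproduced here), although it agrees with Appendix B, where the costate is negative on the arc where the upper bound is active. All function names are hypothetical.

```python
def pointwise_minimizer(psi, u_lower, u_upper, tol=1e-12):
    """Value of u in [u_lower, u_upper] minimizing psi * u.

    Returns None when psi vanishes (up to tol): the condition then gives no
    information on the control (singular or state constrained arc).
    """
    if psi > tol:
        return u_lower
    if psi < -tol:
        return u_upper
    return None


def satisfies_first_order(psi_samples, u_samples, u_lower, u_upper, tol=1e-9):
    """Check the assumed sign condition on sampled values of (Psi, u)."""
    for psi, u in zip(psi_samples, u_samples):
        if not (u_lower - tol <= u <= u_upper + tol):
            return False  # control bound violated
        target = pointwise_minimizer(psi, u_lower, u_upper)
        if target is not None and abs(u - target) > tol:
            return False  # bound not active although Psi has a sign
    return True
```

For instance, with bounds $-1$ and $5$, samples `psi = [-0.3, 0.0, 0.2]` and `u = [5.0, 2.0, -1.0]` pass the check, whereas replacing the first control value by `4.0` fails it.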

3.4. About the jumps of the multiplier at junction points

Given a function $v:[0,T]\rightarrow X$ , where $X$ is a Banach space, we denote (if they exist) its left and right limits at $\tau\in[0,T]$ by $v(\tau\pm)$ , with the convention $v(0-):=v(0)$ , $v(T+):=v(T)$ ; then the jump of $v$ at time $\tau$ is defined as $[v(\tau)]:=v(\tau+)-v(\tau-)$ .

We denote the time derivative of the state constraints by

[TABLE]

Note that $\bar{g}^{(1)}_{j}[t]$ is an element of $L^{1}(0,T),$ for each $j=1,\ldots,q.$

Lemma 3.8.

Let $\bar{u}$ have left and right limits at $\tau\in(0,T)$ . Then

[TABLE]

Proof.

Since $p=p^{1}-\sum_{j=1}^{q}c_{j}\mu_{j}$ , $p^{1}\in Y\subset C([0,T];H^{1}_{0}({\Omega}))$ , $\mu\in BV(0,T)^{q}_{0,+},$ and any function with bounded variation has left and right limits, we have that $p(\cdot,\tau)$ has left and right limits in $H^{1}_{0}({\Omega})$ and satisfies

[TABLE]

Consequently $\Psi^{p}$ has left and right limits over $[0,T]$ , and

[TABLE]

Next, if $\bar{u}$ has left and right limits at some $\tau\in(0,T)$ , then, using the state equation and (3.33), we get

[TABLE]

Thus, by (3.36) and (3.37), we have

[TABLE]

By the first order conditions (3.32) we have $[\Psi^{p}_{i}(\tau)][\bar{u}_{i}(\tau)]\leq 0$ , for $i=1$ to $m$ . Also $[\mu_{j}(\tau)]\geq 0$ , and if $[\mu_{j}(\tau)]\neq 0$ , the corresponding state constraint has a maximum at time $\tau$ . Then $[\bar{g}^{(1)}_{j}[\tau]]\leq 0$ . So, all terms in the sums in (3.38) are nonpositive and therefore are equal to zero. The conclusion follows. ∎

3.5. Regularity of the switching function and multiplier over maximal arcs

In the discussion that follows we fix $k$ in $\{0,\dots,r-1\}$ , and consider a maximal arc $(\tau_{k},\tau_{k+1}),$ where the junction points are given in (3.31). Recall Definition 3.6 for $\check{B}_{k},\hat{B}_{k},B_{k}\subset\{1,\ldots,m\}$ and $C_{k}\subset\{1,\ldots,q\}$ . Set $\bar{B}_{k}:=\{1,\ldots,m\}\setminus B_{k}$ and

[TABLE]

Let $\bar{M}_{k}(t)$ (of size $|\bar{B}_{k}|\times|C_{k}|$ ) denote the submatrix of $M(t)$ having rows with index in $\bar{B}_{k}$ and columns with index in $C_{k}$ . In the sequel we make the following assumption.

Hypothesis 3.9.

We assume that $|C_{k}|\leq|\bar{B}_{k}|,$ for $k=0,\dots,r-1,$ and that the following (uniform) local controllability condition holds:

[TABLE]

Remark 3.10.

This hypothesis was already used in a different setting (i.e. higher-order state constraints in the finite dimensional case) in e.g. [7, 26]. Note that condition (3.40) implies, in particular, that the matrix $\bar{M}_{k}(t)$ has rank $|C_{k}|$ over $(\tau_{k},\tau_{k+1})$ .

The expression of the derivative of the $j$ th state constraint, for $1\leq j\leq q$ , is

[TABLE]

or, in vector form, for the active state constraints (denoting by $\bar{g}^{(1)}_{C_{k}}[t]$ the vector of components $\bar{g}^{(1)}_{j}[t]$ for $j\in C_{k}$ ), we get

[TABLE]

where $\bar{u}_{\bar{B}_{k}}$ is the restriction of $\bar{u}$ to the components in $\bar{B}_{k}$ , and $G_{k}(t)$ takes into account the contributions of the integral in (3.41) and of the components of $\bar{u}$ in $B_{k}$ , that is, for $j\in C_{k}$ :

[TABLE]

By the controllability condition (3.40), $\bar{M}_{k}(t)^{\top}$ is onto from $\mathbb{R}^{|\bar{B}_{k}|}$ to $\mathbb{R}^{|C_{k}|}$ . In view of the state equation, by an integration by parts argument, $M(t)$ has a bounded derivative and is therefore Lipschitz continuous. So there exists a linear change of control variables of the form $u(t)=N_{k}(t)\hat{u}(t),$ for some invertible Lipschitz continuous matrix $N_{k}(t)$ of size $m\times m$ , such that, calling $\bar{N}_{k}(t)$ the upper $|\bar{B}_{k}|\times|\bar{B}_{k}|-$ diagonal block of $N_{k}(t),$ it holds that $\bar{M}_{k}(t)^{\top}\bar{N}_{k}(t)$ has its first $|C_{k}|$ columns being equal to the identity matrix, the other columns having null components. That is, for all $\hat{u}\in\mathbb{R}^{|\bar{B}_{k}|}$ :

[TABLE]

Over a maximal arc $(\tau_{k},\tau_{k+1})$ , we have that $\bar{g}^{(1)}_{j}[t]=0$ for $j\in C_{k}$ is equivalent to

[TABLE]

The following result on the regularity of the state constraint multiplier holds. Recall the definition of the switching function $\Psi^{p}$ given in (3.19).

Proposition 3.11.

There exists $a\in L^{1}(0,T)^{m}$ such that

(i)

[TABLE]

(ii)

We have that $\dot{\mu}_{C_{k}}$ is locally integrable over $(\tau_{k},\tau_{k+1})$ , hence $\mu_{C_{k}}$ is locally absolutely continuous, and the following expression holds

[TABLE]

Proof.

By (3.8) and (3.19), one has, for $i\in\{1,\ldots,m\}$ :

[TABLE]

Let $a\colon(0,T)\to\mathbb{R}^{m}$ be given by

[TABLE]

Note that $\dot{M}_{ij}(t)=\int_{\Omega}b_{i}(x)c_{j}(x)\dot{\bar{y}}(x,t){\rm d}x$ is integrable (this follows integrating by parts the contribution of $\Delta\bar{y}$ and since $Y\subset C([0,T];H^{1}_{0}({\Omega}))$ ), and that

[TABLE]

Integrating by parts the terms in (3.50) containing Laplacians, we get, for the integral term in (3.49),

[TABLE]

It follows that $a\in L^{1}(0,T)^{m}$ and (3.46) holds. Consequently $\Psi^{p}$ has bounded variation.

Over $(\tau_{k},\tau_{k+1})$ , we have ${\rm d}\mu_{j}(t)=0$ whenever $j\not\in C_{k}$ , and so

[TABLE]

Since $\bar{M}_{k}(t)$ is continuous and injective, and $a$ is integrable, this implies the existence of $\dot{\mu}_{j}(t)\in L^{1}(0,T)$ , for $j\in C_{k}$ . This yields (3.47).

And so, $\mu_{C_{k}}(t)$ is locally absolutely continuous. ∎

Corollary 3.12.

Let the finite maximal arc property (3.29) and the uniform controllability condition (3.40) hold.

(i)

If $f,y_{d}\in L^{\infty}(0,T;L^{2}(\Omega)),$ then $a\in L^{\infty}(0,T)^{m}.$

(ii)

If additionally $f,y_{d}\in C([0,T];L^{2}({\Omega})),$ then $\mu$ is $C^{1}$ over each maximal arc $(\tau_{k},\tau_{k+1}).$

Proof.

Indeed, a careful inspection of the previous proof shows that $a$ is a sum of essentially bounded terms, so (i) follows. If the additional regularity hypotheses of item (ii) hold, then $a$ is continuous. The regularity of $\mu$ follows from (3.52) and the local controllability assumption (3.40). This concludes the proof. ∎

4. Second order necessary conditions

In this section we derive second order necessary optimality conditions, based on the concept of radiality of critical directions.

Let us consider an admissible trajectory $(\bar{u},\bar{y})$ .

4.1. Assumptions and additional regularity

For the remainder of the article we make the following set of assumptions.

Hypothesis 4.1.

The following conditions hold:

1.

the control set has the form (3.26),

2.

the finite maximal arc property (3.29),

3.

the qualification hypothesis (3.25),

4.

the local (uniform) controllability condition (3.40) over each maximal arc $(\tau_{k},\tau_{k+1})$ ,

5.

the discontinuity of the derivative of the state constraints at corresponding junction points, i.e.,

[TABLE]

6.

the uniform distance to control bounds whenever they are not active, i.e. there exists $\delta>0$ such that,

[TABLE]

7.

the following regularity for the data (we do not try to take the weakest hypotheses), for some $r>n+1$ :

[TABLE]

8.

the control $\bar{u}$ has left and right limits at the junction points $\tau_{k}\in(0,T)$ (this will allow us to apply Lemma 3.8).

In view of point 3 above, we consider from now on $\beta=1$ and thus we omit the component $\beta$ of the multipliers.

Theorem 4.2.

The following assertions hold.

(i)

For any $u\in L^{\infty}(0,T)^{m},$ the associated state $y[u]$ belongs to $C(\bar{Q}).$ If $u$ remains in a bounded subset of $L^{\infty}(0,T)^{m}$ then the corresponding states form a bounded set in $C(\bar{Q})$ . In addition, if the sequence $(u_{\ell})$ of admissible controls converges to $\bar{u}$ a.e. on $(0,T)$ , then the associated sequence of states $(y_{\ell}:=y[u_{\ell}])$ converges uniformly to $\bar{y}$ in $\bar{Q}$ .

(ii)

For every $(p,\mu)\in\Lambda_{1},$ one has that $\mu\in W^{1,\infty}(0,T)^{q}$ and $p$ is essentially bounded in $Q$ .

Proof.

(i) Let $r\in[2,\infty)$ . That $y\in W^{2,1,r}(Q)$ follows from Theorem A.3 in the Appendix. Taking $r>n+1$ , it follows from the Sobolev Embedding Theorem (see e.g. [17, Theorem 5, p. 269]) that $y$ is continuous (and even Hölder-continuous) on the closure of $Q$ , with uniform bound over the set of admissible controls. If the sequence $(u_{\ell})$ of admissible controls converges a.e. to $\bar{u}$ , by the Dominated Convergence Theorem, $u_{\ell}\rightarrow\bar{u}$ in $L^{q}(0,T)$ for all $q\in[1,\infty)$ . So, by similar arguments it can be proved that the associated sequence of states converges uniformly to $\bar{y}$ .

(ii) By Hypothesis 4.1, $y_{dT}$ is the trace at time $T$ of an element of $W^{2,1,r}(Q)$ vanishing on $\Sigma$ and this obviously holds also for $y(T)$ in view of Theorem A.3 in the Appendix. It then follows from Corollary A.2 that $p^{1}\in W^{2,1,r}(Q)$ . The continuity of $\mu$ at junction points follows from (4.1) in Hypothesis 4.1 and Lemma 3.8. The boundedness on each arc of the derivative of $\mu$ follows from (3.47) for $\dot{\mu}$ , since by Corollary 3.12, $a\in L^{\infty}(0,T)^{m}$ and by (3.40), $\bar{M}(t)$ is ‘uniformly injective’ over each arc. The conclusion follows. ∎

4.2. Second variation

For $(p,\mu)\in\Lambda_{1},$ set

[TABLE]

and consider the quadratic form

[TABLE]

Let $(u,y)$ be a trajectory, and set

[TABLE]

Recall the definition of the operator $A$ given in (2.20). Subtracting the state equation at $(\bar{u},\bar{y})$ from the one at $(u,y)$ , we get that

[TABLE]

Combining with the linearized state equation (3.1), we deduce that $\eta$ given by

[TABLE]

satisfies the equation

[TABLE]

where $r$ and $\tilde{r}$ are defined as

[TABLE]

Proposition 4.3.

Let $(p,\mu)\in\Lambda_{1}$ , and let $(u,y)$ be a trajectory. Then

[TABLE]

Here, we omit the dependence of the Lagrangian on $(\beta,p_{0})$ being equal to $(1,p(\cdot,0))$ .

Proof.

Use $\Delta{\mathcal{L}}$ to denote the l.h.s. of (4.11). We have

[TABLE]

By (3.5) we obtain

[TABLE]

Thus, from (4.12) and (4.13) we get

[TABLE]

which leads to (4.11) in view of the definition of $\Psi^{p}_{i}$ given in (3.19). This concludes the proof. ∎

4.3. Critical directions

Recall the definitions of $\check{I}_{i},\hat{I}_{i}$ and $I^{C}_{j}$ given in (3.27) and (3.28), and remember that we use $z[v]$ to denote the solution of the linearized state equation (3.1) associated to $v.$

Let us define the cone of critical directions at $\bar{u}$ in $L^{2}$ , or in short critical cone, by

[TABLE]

The strict critical cone is defined below, and it is obtained by imposing that the linearization of active constraints is zero,

[TABLE]

Hence, clearly $C_{\rm s}\subseteq C,$ and $C_{\rm s}$ is a closed subspace of $Y\times{L^{2}(0,T)}^{m}.$ Now, note that in the interior of each $I^{C}_{j}$ one has, for every $(z[v],v)\in C_{\rm s},$

[TABLE]

which can be rewritten as

[TABLE]

in view of the definition of $M_{ij}$ given in (3.39). Therefore, over any arc $(a,b)$ we have $g^{\prime}_{j}(\bar{y}(\cdot,t))z[v](\cdot,t)=0$ for $t\in(a,b)$ if and only if $g^{\prime}_{j}(\bar{y}(\cdot,a))z[v](\cdot,a)=0$ and (4.18) holds over $(a,b)$ . We define the entry (resp. exit) point of a time interval $(t^{\prime},t^{\prime\prime})$ as $t^{\prime}$ (resp. $t^{\prime\prime}$ ). This induces the consideration of the following sets

[TABLE]

With these definitions, we can write the strict critical cone as

[TABLE]

and prove the following result.

Lemma 4.4.

*$C_{\rm s}\cap\Big{(}Y\times L^{\infty}(0,T)^{m}\Big{)}$ is dense in $C_{\rm s},$ with respect to the $Y\times L^{2}(0,T)^{m}$ -topology.*

Proof.

In view of Dmitruk’s density lemma (see [16, Lemma 1]), it is enough to prove that $C_{\rm n}\cap\Big{(}Y\times L^{\infty}(0,T)^{m}\Big{)}$ is a dense subset of $C_{\rm n}$ .

Let us then take $(z,v)\in C_{\rm n}.$ Recall the definition of the junction times $\tau_{k}$ given in (3.31). Fix $k\in\{0,\dots,r-1\}.$ Note that we can take a partition of $[0,T]$ , say $0=t_{0}\leq\dots\leq t_{\ell}\leq\dots\leq t_{N}=T$ , such that $(t_{\ell},t_{\ell+1})$ is contained in some $(\tau_{k},\tau_{k+1})$ , and on $(t_{\ell},t_{\ell+1})$ a fixed set of the rows of $M(t)$ is linearly independent with rank equal to the one of $M(t)$ . Now consider the matrix $\bar{M}_{k}$ given after (3.39). Using the same notation as in (3.42), let us write $v_{\bar{B}_{k}}$ to refer to the restriction of $v$ to the components in $\bar{B}_{k}.$ For each $t\in(t_{\ell},t_{\ell+1})$ , we can write

[TABLE]

where $v_{\bar{B}_{k},0}(t)\in\mathop{\rm Ker}\bar{M}_{k}(t)^{\top}$ and $v_{\bar{B}_{k},1}(t)\in\mathop{\rm Im}\bar{M}_{k}(t)$ for almost all $t,$ hence $v_{\bar{B}_{k},1}(t)=\bar{M}_{k}(t)\lambda_{k}(t)$ for some $\lambda_{k}(t)\in\mathbb{R}^{|C_{k}|}.$ Let $E_{C_{k}}(t)$ be the $|C_{k}|$ -dimensional vector with components

[TABLE]

Then (4.18) can be rewritten as

[TABLE]

and, therefore, $\lambda_{k}(t)=\big{(}\bar{M}_{k}(t)^{\top}\bar{M}_{k}(t)\big{)}^{-1}E_{C_{k}}(t),$ so that

[TABLE]

By an integration by parts (in space) argument, it follows that $E_{C_{k}}(t)$ is a continuous function, and so is $\bar{M}_{k}(t)$ . Therefore, $v_{\bar{B}_{k},1}$ is continuous on each maximal arc. We may also view the application $z\mapsto v_{\bar{B}_{k},1}$ as a linear and continuous mapping say

[TABLE]

where $C_{k}$ is the set of active state constraints on $(\tau_{k},\tau_{k+1})$ and, for $t^{\prime}<t^{\prime\prime}$ , $\mathop{\rm Lip}(t^{\prime},t^{\prime\prime})$ is the Banach space of continuous real functions with domain $(t^{\prime},t^{\prime\prime})$ , endowed with the norm

[TABLE]

with the convention “ $0/0=0$ ”.

For any $\varepsilon>0,$ there exists $v_{\bar{B}_{k},0}^{\varepsilon}$ in $L^{\infty}(0,T)^{|B_{k}|}$ such that $\|v_{\bar{B}_{k},0}^{\varepsilon}-v_{\bar{B}_{k},0}\|_{2}<\varepsilon$ , it has zero components for indexes corresponding to active control bound constraints, and $v_{\bar{B}_{k},0}^{\varepsilon}(t)\in\mathop{\rm Ker}\bar{M}_{k}(t)^{\top}$ for a.a. $t.$ In fact, to construct this $v_{\bar{B}_{k},0}^{\varepsilon}$ it suffices to project an approximation of $v_{\bar{B}_{k},0}$ obtained by a truncation argument on the kernel $\mathop{\rm Ker}\bar{M}_{k}(t)^{\top}$ . In what follows we shall abuse notation and use the same symbol to denote a vector and its canonical immersion in $\mathbb{R}^{m}.$ Let $z_{\varepsilon}$ be the unique solution in $Y$ of the linearized equation

[TABLE]

with the usual initial and boundary conditions, and where $v_{B}$ is the restriction of $v$ to the set $B.$ Set $v_{\bar{B},1}^{\varepsilon}:=L_{1}(z_{\varepsilon}),$ $v_{\bar{B}_{k}}^{\varepsilon}:=v_{\bar{B}_{k},1}^{\varepsilon}+v_{\bar{B}_{k},0}^{\varepsilon},$ and define $v_{\varepsilon}$ to have the restriction to $\bar{B}_{k}$ equal to $v_{\bar{B}_{k}}^{\varepsilon}$ and the restriction to $B_{k}$ equal to $v.$ Then $v_{\varepsilon}$ is in $C_{\rm n}\cap(Y\times L^{\infty}(0,T)^{m})$ and $\|v_{\varepsilon}-v\|_{2}=O(\varepsilon).$ Hence, $C_{\rm n}\cap\Big{(}Y\times L^{\infty}(0,T)^{m}\Big{)}$ is a dense subset of $C_{\rm n}$ . The conclusion follows. ∎
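The pointwise splitting used in the proof of Lemma 4.4, namely $v=v_{0}+v_{1}$ with $v_{0}\in\mathop{\rm Ker}\bar{M}^{\top}$ and $v_{1}=\bar{M}(\bar{M}^{\top}\bar{M})^{-1}E$, can be illustrated at a fixed time by a small linear algebra computation. The sketch below is not part of the paper: it takes a hypothetical $3\times 2$ matrix `M` of full column rank (so $|\bar{B}_{k}|=3$, $|C_{k}|=2$) and simply sets $E:=\bar{M}^{\top}v$ for a given vector $v$, instead of using the expression of $E_{C_{k}}(t)$ from the proof. Only the algebra of the decomposition is shown.

```python
def mat_t_vec(M, v):
    """Compute M^T v for a matrix given as a list of rows."""
    rows, cols = len(M), len(M[0])
    return [sum(M[i][j] * v[i] for i in range(rows)) for j in range(cols)]


def mat_vec(M, lam):
    """Compute M lam."""
    return [sum(M[i][j] * lam[j] for j in range(len(lam))) for i in range(len(M))]


def gram(M):
    """Gram matrix M^T M, invertible when M has full column rank."""
    rows, cols = len(M), len(M[0])
    return [[sum(M[i][a] * M[i][b] for i in range(rows)) for b in range(cols)]
            for a in range(cols)]


def solve_2x2(A, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    if abs(det) < 1e-14:
        raise ValueError("matrix is (numerically) singular")
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - A[1][0] * b[0]) / det]


def split(M, v):
    """Return (v0, v1, lam) with v = v0 + v1, M^T v0 = 0 and v1 = M lam."""
    E = mat_t_vec(M, v)                 # plays the role of E_{C_k}(t)
    lam = solve_2x2(gram(M), E)         # lam = (M^T M)^{-1} E
    v1 = mat_vec(M, lam)
    v0 = [a - b for a, b in zip(v, v1)]
    return v0, v1, lam


M = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # hypothetical full-rank example
v0, v1, lam = split(M, [1.0, 0.0, 0.0])
```

Here `lam` equals $(2/3,-1/3)$, `v1` equals $(2/3,-1/3,1/3)$ and `v0` equals $(1/3,1/3,-1/3)$, which is orthogonal to both columns of `M`. The component `v1` is determined by $E$ alone, which is why, in the proof, only the kernel component `v0` needs to be approximated.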

4.3.1. Radiality of critical directions

According to Aronna et al. [3, Definition 6], a critical direction $(z,v)$ is quasi radial if there exists $\tau_{0}>0$ such that, for $\tau\in[0,\tau_{0}],$ the following conditions are satisfied:

[TABLE]

Lemma 4.5.

Every direction in $C_{\rm s}\cap\Big{(}Y\times L^{\infty}(0,T)^{m}\Big{)}$ is quasi radial.

Proof.

Let $(z,v)\in C_{\rm s}\cap\Big{(}Y\times L^{\infty}(0,T)^{m}\Big{)}.$ Then (4.30) follows from (4.2). Let us next prove (4.29). The function $h(t):=g^{\prime}_{j}(\bar{y}(t))z(t)$ has the derivative $\dot{h}(t)=\int_{\Omega}c_{j}(x)\dot{z}(x,t){\rm d}x$ , so that $|\dot{h}(t)|\leq\|c_{j}\|_{L^{2}({\Omega})}\|\dot{z}(\cdot,t)\|_{L^{2}({\Omega})}$ and hence, $\dot{h}\in L^{2}(0,T)$ . Let $0\leq t^{\prime}<t^{\prime\prime}\leq T$ . By the Cauchy-Schwarz inequality, for any $\varepsilon>0$ :

[TABLE]

Let $(a,b)$ be a maximal constrained arc with say $a>0$ . Take $t^{\prime}<a$ , and $t^{\prime\prime}=a$ . When $t^{\prime}\uparrow a$ , by the Dominated Convergence Theorem, $\|\dot{h}\|_{L^{2}(t^{\prime},t^{\prime\prime})}\rightarrow 0$ . Given $\varepsilon>0$ , we deduce with (4.1) that for $\tau>0$ and $t^{\prime}<a$ close enough to $a$ :

[TABLE]

The maximum of the r.h.s. of (4.32) over $t\in[a-\varepsilon,a]$ is attained when

[TABLE]

So the r.h.s. of (4.32) is less than or equal to $\tau^{2}\varepsilon^{2}/(4c)$ . Since we can take $\varepsilon$ arbitrarily small, it is of order $o(\tau^{2})$ . For $t>b$ close to $b$ , we have a similar result. For $t$ far from the boundary, (4.29) is a consequence of hypothesis (4.1). The conclusion follows. ∎

Combining the previous result with Lemma 4.4, we deduce that:

Corollary 4.6.

The set of quasi radial critical directions of $C_{\rm s}$ is dense in $C_{\rm s}.$

4.4. Second order necessary condition

We obtain the following result applying Corollary 4.6 above and the second order condition in an abstract setting proved in [3, Theorem 8].

Theorem 4.7 (Second order necessary condition).

Let the admissible trajectory $(\bar{u},\bar{y})$ be an $L^{\infty}$ -local solution of $(P)$ . Then

[TABLE]

Proof.

Let $(z,v)\in C_{\rm s}.$ By Corollary 4.6, there exists a sequence $(z^{\ell},v^{\ell})$ of quasi radial directions converging to $(z,v)$ in $Y\times L^{2}(0,T)^{m}$ . Doing as in [3, Theorem 8], we get the existence of a multiplier $(p^{\ell},\mu^{\ell})\in\Lambda_{1}$ (with $\Lambda_{1}$ defined in Section 3.2.1), such that

[TABLE]

By Lemma 3.5, $\Lambda_{1}$ is bounded so that ${\rm d}\mu^{\ell}$ is also bounded. Extracting if necessary a subsequence, we may assume that ${\rm d}\mu^{\ell}$ weakly- $*$ converges to some ${\rm d}\mu$ with $\mu\in BV(0,T)^{q}_{0,+}$ , and since $L^{\infty}(0,T;H^{1}_{0}({\Omega}))$ is included in $L^{2}(Q)$ , $p^{\ell}$ weakly converges in $L^{2}(Q)$ to some $p\in L^{2}(Q)$ , such that $(p,\mu)\in\Lambda_{1}$ . Since $(z^{\ell},v^{\ell})\rightarrow(z[v],v)$ in $Y\times L^{2}(0,T)^{m}$ , by Lemma 2.2, $\sum_{i}v^{\ell}_{i}b_{i}z^{\ell}$ strongly converges to $\sum_{i}v_{i}b_{i}z$ , and so we easily deduce that ${\mathcal{Q}}[p^{\ell}](z^{\ell},v^{\ell})\rightarrow{\mathcal{Q}}[p](z[v],v).$ The conclusion follows. ∎

Appendix A Strong solutions of the heat equation

We consider the heat equation with Dirichlet boundary condition:

[TABLE]

We have the following result, see Lieberman [21, Thm 7.32, p. 182]:

Theorem A.1.

Let $r\geq 2$ , $w\in W^{2,1,r}(Q)$ and $f\in L^{r}(Q)$ . Setting $y_{0}:=w(\cdot,0)$ and $h:=\tau_{\Sigma}w$ (trace of $w$ over $\Sigma$ ), equation (A.1) has a unique solution $y\in W^{2,1,r}(Q)$ . In addition there exists $C>0$ such that

[TABLE]

Corollary A.2.

Given $r\geq 2$ , $y_{0}\in W^{1,r}_{0}({\Omega})\cap W^{2,r}({\Omega})$ and $f\in L^{r}(Q)$ , equation (A.1) has, for $h=0$ , a unique solution $y\in W^{2,1,r}(Q)$ that satisfies

[TABLE]

Proof.

Apply Theorem A.1 with $w(x,t):=y_{0}(x)$ . It is clear that $w\in W^{2,1,r}(Q)$ and that $w$ has trace $y_{0}$ at time 0 and zero trace over $\Sigma$ . The conclusion follows. ∎

By the standard Sobolev embeddings, we have the continuous inclusion

[TABLE]

This allows us to prove the following.

Theorem A.3.

Assume that $u\in L^{\infty}(0,T)$ , $y_{0}\in W^{1,r}_{0}({\Omega})\cap W^{2,r}({\Omega})$ and $f\in L^{r}(Q)$ , with $r>n+1$ . Then the state equation (2.1) has a unique solution $y[u,y_{0},f]$ in $W^{2,1,r}(Q)$ , and the mapping $y[u,y_{0},f]$ is of class $C^{\infty}$ from $L^{\infty}(0,T)\times \big(W^{1,r}_{0}({\Omega})\cap W^{2,r}({\Omega})\big)\times L^{r}(Q)$ into $W^{2,1,r}(Q)$ .

Proof.

We have that $g:=-\Delta y_{0}$ belongs to $L^{r}({\Omega})$ . Let $y^{\pm}_{0}$ be the unique solution of $-\Delta y^{\pm}_{0}=g^{\pm}$ in ${\Omega}$ , where $g^{+}:=\max(g,0)$ and $g^{-}:=-\min(g,0)$ , with homogeneous Dirichlet condition on the boundary. Set $f^{+}:=\max(f,0)$ and $f^{-}:=-\min(f,0)$ . Denote by $y^{+}$ (resp., $y^{-}$ ) the solution of the state equation (2.1) when $(y_{0},f)$ is $(y_{0}^{+},f^{+})$ (resp. $(y_{0}^{-},f^{-})$ ). By the monotonicity results in Lemma 2.3, we have that $-y^{-}\leq y\leq y^{+}$ . Now let $y^{++}$ , $y^{--}$ denote the solutions of the state equation (2.1) when $(y_{0},f)$ is $(y_{0}^{+},f^{+})$ , $(y_{0}^{-},f^{-}),$ respectively and, in addition, $\gamma=0$ . We claim that $-y^{--}\leq-y^{-}\leq y\leq y^{+}\leq y^{++}$ . Indeed, for $z\in Y$ , set $H_{u}z:=\dot{z}-\Delta z-z\sum_{i}u_{i}b_{i}$ . Then

[TABLE]

Since $y^{+}$ and $y^{++}$ have the same initial conditions, it follows that $y^{+}\leq y^{++}$ . In an analogous way, it can be proved that $-y^{--}\leq-y^{-}.$

Since $y_{0}^{\pm}\in W^{1,r}_{0}({\Omega})\cap W^{2,r}({\Omega})$ and $f^{\pm}\in L^{r}(Q)$ , by Corollary A.2, $y^{++}$ and $y^{--}$ belong to $W^{2,1,r}(Q)$ and, therefore, since $r>n+1$ , they are also elements of $L^{\infty}(Q)$ . So, $y\in L^{\infty}(Q)$ . Consequently, $H_{u}y=f-\gamma y^{3}\in L^{r}(Q)$ and, by Theorem A.1 again, $y\in W^{2,1,r}(Q)$ .

We recall that, for $r>n+1$ , $Y_{r}$ denotes the set of elements of $W^{2,1,r}(Q)$ with zero trace on $\Sigma$ , and $Y^{0}_{r}$ denotes the trace of $Y_{r}$ at time zero. Endowed with the “trace norm”, $Y^{0}_{r}$ is a Banach space that contains $W^{1,r}_{0}({\Omega})\cap W^{2,r}({\Omega})$ in view of the proof of the above Corollary A.2 (by Lions [23, p. 20], $Y^{0}_{r}$ is a subset of $W^{2-2/r,r}({\Omega})$ ). That $(u,y_{0},f)\mapsto y[u,y_{0},f]$ is of class $C^{\infty}$ is a consequence of the Implicit Function Theorem applied to the mapping $F$ from $Y_{r}\times L^{\infty}(0,T)\times Y_{r}^{0}\times L^{r}(Q)$ into $L^{r}(Q)\times Y^{0}_{r}$ , defined by

[TABLE]

The key step is to prove that the partial derivative $D_{y}F$ is bijective; this can be done easily, taking advantage of the fact that $W^{2,1,r}(Q)\subset L^{\infty}(Q)$ when $r>n+1$ . ∎

Appendix B An example

Since we made a number of hypotheses about the optimal trajectory, especially at junction points, it is useful to give an example where these hypotheses are satisfied. For that purpose we discuss a particular case in which the original optimal control problem can be reduced to the optimal control of a scalar ODE.

Let ${\Omega}=(0,1),$ and denote by $c_{1}(x):=\sqrt{2}\sin\pi x$ the first (normalized) eigenvector of the Laplace operator.

We assume that $\gamma=0$ , the control is scalar ( $m=1$ ), $b_{0}\equiv 0$ and $b_{1}\equiv 1$ in $\Omega,$ and that $f\equiv 0$ in $Q.$ Then the state equation with initial condition $c_{1}$ reads

[TABLE]

It is easily seen that the state satisfies $y(x,t)=y_{1}(t)c_{1}(x)$ , where $y_{1}$ is solution of

[TABLE]

We set $T=3$ and consider the state constraint (3.17) with $q=1$ and $d_{1}:=-2,$ and the cost function (2.5) with $\alpha_{1}=0$ . The state constraint reduces to

[TABLE]

As target functions take $y_{dT}:=c_{1}$ and $y_{d}(x,t):=\hat{y}_{d}(t)c_{1}(x)$ with

[TABLE]

We assume that the lower and upper bounds for the control are $\check{u}:=-1$ and $\hat{u}:=\pi^{2}+1$ . We will check that the optimal control is

[TABLE]

Thus, for the optimal state we have

[TABLE]

The above control is feasible. The trajectory $(\bar{u},\bar{y})$ is optimal since for any $t\in(0,T)$ , the state $\bar{y}_{1}(t)$ has the best possible value (in order to approach $\hat{y}_{d}$ and minimize the cost function) that respects the state constraint.
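The reduced dynamics can be checked with a short simulation. With $\gamma=0$, $b_{0}\equiv 0$, $b_{1}\equiv 1$, $f\equiv 0$ and $-\Delta c_{1}=\pi^{2}c_{1}$, the coefficient $y_{1}$ satisfies $\dot{y}_{1}=(u-\pi^{2})y_{1}$ with $y_{1}(0)=1$, and the state constraint reads $y_{1}\leq 2$. The sketch below is restricted to $[0,2]$ and assumes the control inferred from the later discussion of the costate: $\bar{u}=\hat{u}=\pi^{2}+1$ on $(0,\log 2)$, and $\bar{u}=\pi^{2}$ on $(\log 2,2)$, where the state constraint is active. The explicit formulas for the control, the state and the target on $(2,3)$ are not reproduced, and the function names are hypothetical.

```python
import math

PI2 = math.pi ** 2
U_UPPER = PI2 + 1.0          # upper control bound from the text
T_ENTRY = math.log(2.0)      # entry time of the state constrained arc


def u_bar(t):
    """Control on [0, 2], inferred from the arc structure (an assumption)."""
    return U_UPPER if t < T_ENTRY else PI2


def simulate_y1(t_end=2.0, n=20000):
    """Integrate y1' = (u - pi^2) y1, y1(0) = 1.

    Each step multiplies by exp((u - pi^2) h), which is exact on cells where
    the control is constant; u is sampled at the cell midpoint.
    """
    h = t_end / n
    y = 1.0
    trajectory = [y]
    for k in range(n):
        y *= math.exp((u_bar((k + 0.5) * h) - PI2) * h)
        trajectory.append(y)
    return trajectory


y1 = simulate_y1()
```

The computed trajectory grows like $e^{t}$ until it reaches the value $2$ at $t=\log 2$ and then stays on the constraint, so that $\bar{y}_{1}(t)-2\leq 0$ holds up to discretization error on $[0,2]$.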

Let us check Hypothesis 4.1 for this example. Conditions 1 and 2 are obviously satisfied. For the constraint qualification in Condition 3 consider the linearized state equation, whose unique solution is denoted by $z_{1}[v]$ :

[TABLE]

with $v(t):=\check{u}-\bar{u}(t)<0$ . One easily checks that $z_{1}[v](t)<0$ for all $t>0$ . Hence, we can find $\varepsilon>0$ such that

[TABLE]
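The sign of $z_{1}[v]$ can also be observed numerically. Linearizing the scalar equation $\dot{y}_{1}=(u-\pi^{2})y_{1}$ at $(\bar{u},\bar{y}_{1})$ gives $\dot{z}_{1}=(\bar{u}-\pi^{2})z_{1}+v\bar{y}_{1}$ with $z_{1}(0)=0$. The sketch below integrates this equation on $[0,2]$ with $v=\check{u}-\bar{u}$. As before, it assumes $\bar{u}=\pi^{2}+1$ and $\bar{y}_{1}=e^{t}$ on $(0,\log 2)$, and $\bar{u}=\pi^{2}$ and $\bar{y}_{1}=2$ on $(\log 2,2)$; these are inferred from the arc structure rather than copied from the displayed formulas, and the names are hypothetical.

```python
import math

PI2 = math.pi ** 2
U_LOWER = -1.0               # lower control bound from the text
U_UPPER = PI2 + 1.0          # upper control bound from the text
T_ENTRY = math.log(2.0)


def u_bar(t):
    """Nominal control on [0, 2] (inferred, see the lead-in)."""
    return U_UPPER if t < T_ENTRY else PI2


def y1_bar(t):
    """Nominal state coefficient on [0, 2]: e^t, then constant equal to 2."""
    return math.exp(min(t, T_ENTRY))


def simulate_z1(t_end=2.0, n=20000):
    """Explicit Euler scheme for z1' = (u_bar - pi^2) z1 + v y1_bar, z1(0) = 0."""
    h = t_end / n
    z = 0.0
    trajectory = [z]
    for k in range(n):
        t = k * h
        v = U_LOWER - u_bar(t)          # direction pointing to the lower bound
        z += h * ((u_bar(t) - PI2) * z + v * y1_bar(t))
        trajectory.append(z)
    return trajectory


z1 = simulate_z1()

# Closed form at t = 2 under the same assumptions:
# z1(log 2) = -(pi^2 + 2) * log(2) * 2, then slope -2 (1 + pi^2) on (log 2, 2).
Z1_AT_2 = -(PI2 + 2.0) * T_ENTRY * 2.0 - 2.0 * (1.0 + PI2) * (2.0 - T_ENTRY)
```

All computed values of $z_{1}$ after the initial time are negative, in line with the claim that $z_{1}[v](t)<0$ for $t>0$; since $\bar{y}_{1}\leq 2$, a small multiple of this direction makes the linearized state constraint strictly negative.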

Condition 4 holds, since

[TABLE]

For Condition 5 we have

[TABLE]

and hence,

[TABLE]

Conditions 6 and 8 hold by the choice of the control in (B.5). Condition 7 holds by definition.

We solve this problem numerically using BOCOP [5] and get the optimal control and state given in Figure 1.

We now discuss the second order optimality condition for this example. The costate equation is

[TABLE]

with $A$ as defined in (2.20). Since $\bar{y}$ and $y_{d}$ are collinear with $c_{1}$, it follows that $p(x,t)=p_{1}(t)c_{1}(x)$, and

[TABLE]

Over $(2,3)$, $\dot{\mu}_{1}=0$ (the state constraint is not active) and $\bar{y}_{1}=\hat{y}_{d}$, therefore $p_{1}$ and $p$ vanish identically. Over $(\log 2,2)$, $\bar{u}$ lies strictly between its bounds and therefore

[TABLE]

It follows that $p_{1}$ and $p$ also vanish on $(\log 2,2)$ and that

[TABLE]

Over $(0,\log 2)$ the control attains its upper bound, hence

[TABLE]

with final condition $p_{1}(\log 2)=0$ , so that

[TABLE]

As expected, $p_{1}$ is negative.

Next, the linearized state equation at $(\bar{u},\bar{y})$ reads

[TABLE]

Since $\bar{y}=\bar{y}_{1}(t)c_{1}(x)$ , we deduce that $z=z_{1}(t)c_{1}(x)$ , with $z_{1}$ solution of

[TABLE]

Therefore, if $(v,z)$ satisfies the linearized state equation,

[TABLE]

If in addition $v$ is a critical direction, since $v=0$ and $z_{1}=0$ a.e. on $(0,2)$ , and $p_{1}(t)=0$ on $(2,3),$ we get

[TABLE]

Thus, ${\mathcal{Q}}$ is non-negative for any critical direction $(z[v],v)$, in accordance with the second-order necessary condition of Theorem 4.7.

Bibliography

[1] M. S. Aronna, J. F. Bonnans, A. V. Dmitruk, and P. A. Lotito, Quadratic order conditions for bang-singular extremals, Numerical Algebra, Control and Optimization, AIMS Journal 2 (2012), no. 3, 511–546.
[2] M. S. Aronna, J. F. Bonnans, and A. Kröner, State-constrained control-affine parabolic problems II: Second-order sufficient optimality conditions, (2019).
[3] M. S. Aronna, J. F. Bonnans, and B. S. Goh, Second order analysis of control-affine problems with scalar state constraint, Math. Program. 160 (2016), no. 1-2, Ser. A, 115–147.
[4] J.-P. Aubin, Un théorème de compacité, C. R. Acad. Sci. Paris 256 (1963), 5042–5044.
[5] J. F. Bonnans, D. Giorgi, V. Grélard, B. Heymann, S. Maindrault, P. Martinon, O. Tissot, and J. Liu, Bocop – A collection of examples, Tech. report, INRIA, 2017.
[6] J. F. Bonnans, Second-order analysis for control constrained optimal control problems of semilinear elliptic systems, Appl. Math. Optim. 38 (1998), no. 3, 303–325.
[7] J. F. Bonnans and A. Hermant, Second-order analysis for optimal control problems with pure state constraints and mixed control-state constraints, Ann. Inst. H. Poincaré Anal. Non Linéaire 26 (2009), no. 2, 561–598.
[8] J. F. Bonnans and P. Jaisson, Optimal control of a parabolic equation with time-dependent state constraints, SIAM J. Control Optim. 48 (2010), no. 7, 4550–4571.
