Numerical Analysis of Sparse Initial Data Identification for Parabolic Problems

Dmitriy Leykekhman; Boris Vexler; Daniel Walter

arXiv:1905.01226·math.OC·May 6, 2019


TL;DR

This paper develops a numerical method for identifying sparse initial data in parabolic equations from final observations, formulating it as a convex optimization problem with error estimates and efficient algorithms.

Contribution

It introduces a novel sparse regularization approach for initial data identification in parabolic problems, with rigorous error analysis and practical algorithms.

Findings

01

Control variable can be a finite sum of Dirac measures under structural assumptions.

02

Error estimates for the locations and coefficients of Dirac measures are established.

03

Numerical experiments validate the theoretical error bounds and algorithm efficiency.

Abstract

In this paper we consider a problem of initial data identification from the final time observation for homogeneous parabolic problems. It is well-known that such problems are exponentially ill-posed due to the strong smoothing property of parabolic equations. We are interested in a situation when the initial data we intend to recover is known to be sparse, i.e. its support has Lebesgue measure zero. We formulate the problem as an optimal control problem and incorporate the information on the sparsity of the unknown initial data into the structure of the objective functional. In particular, we are looking for the control variable in the space of regular Borel measures and use the corresponding norm as a regularization term in the objective functional. This leads to a convex but non-smooth optimization problem. For the discretization we use continuous piecewise linear finite elements in…


Equations

$$\begin{aligned}\partial_{t}u-\Delta u&=0&&\text{in }I\times\Omega,\\ u&=0&&\text{on }I\times\partial\Omega,\\ u(0)&=q&&\text{in }\Omega\end{aligned}\tag{1}$$

$$\text{Minimize }J(q,u)=\frac{1}{2}\|u(T)-u_{d}\|_{L^{2}(\Omega)}^{2}+\alpha\|q\|_{\mathcal{M}(\Omega)},\quad q\in\mathcal{M}(\Omega),\quad\text{subject to (1).}\tag{2}$$

$$\text{Minimize }\|q\|_{\mathcal{M}(\Omega)}\quad\text{subject to }\|u(T)-u_{d}\|_{L^{2}(\Omega)}\leq\varepsilon\text{ and (1).}\tag{3}$$

$$\|(\bar{u}-\bar{u}_{kh})(T)\|_{L^{2}(\Omega)}\leq c\,(k^{r+\frac{1}{2}}+\ell_{kh}h),$$

$$\bar{q}=\sum_{i=1}^{K}\bar{\beta}_{i}\delta_{\bar{x}_{i}},$$

$$\bar{q}_{kh}=\sum_{i=1}^{K}\sum_{j=1}^{n_{i}}\bar{\beta}_{kh,ij}\delta_{\bar{x}_{kh,ij}},$$

$$\|(\bar{u}-\bar{u}_{kh})(T)\|_{L^{2}(\Omega)}\leq c\,(k^{2r+1}+\ell_{kh}h).$$

$$\lvert\bar{x}_{i}-\bar{x}_{kh,ij}\rvert\leq c\,(k^{2r+1}+\ell_{kh}^{\frac{1}{2}}h)$$

$$\|q\|_{\mathrm{KR}}=\sup\Big\{\langle q,\varphi\rangle\;:\;\varphi\in C(\Omega),\ \sup_{x_{1},x_{2}\in\Omega,\,x_{1}\neq x_{2}}\frac{\lvert\varphi(x_{1})-\varphi(x_{2})\rvert}{\lvert x_{1}-x_{2}\rvert}\leq 1,\ \lvert\varphi(x)\rvert\leq 1,\ x\in\Omega\Big\}$$

$$\begin{aligned}\partial_{t}v-\Delta v&=0&&\text{in }I\times\Omega,\\ v&=0&&\text{on }I\times\partial\Omega,\\ v(0)&=v_{0}&&\text{in }\Omega\end{aligned}\tag{5}$$

$$\lvert(v-v_{kh})(T,x_{0})\rvert\leq C(T)\,(k^{2r+1}+\ell_{kh}h^{2})\,\|v_{0}\|_{L^{2}(\Omega)},$$

$$W(0,T)=L^{2}(I;H_{0}^{1}(\Omega))\cap H^{1}(I;H^{-1}(\Omega)).$$

$$(\psi,u)_{I\times\Omega}=\langle q,\varphi(0)\rangle$$

$$\begin{aligned}-\partial_{t}\varphi-\Delta\varphi&=\psi&&\text{in }I\times\Omega,\\ \varphi&=0&&\text{on }I\times\partial\Omega,\\ \varphi(T)&=0&&\text{in }\Omega\end{aligned}$$

$$\frac{2}{r}+\frac{N}{p}>N+1$$

$$\|u\|_{L^{r}(I;W_{0}^{1,p}(\Omega))}\leq c\,\|q\|_{\mathcal{M}(\Omega)}$$

$$\|u(T)\|_{L^{2}(\Omega)}\leq c\,\|q\|_{\mathcal{M}(\Omega)}.$$

$$j(q)=\frac{1}{2}\|S(q)-u_{d}\|_{L^{2}(\Omega)}^{2}+\alpha\|q\|_{\mathcal{M}(\Omega)}.$$

$$\text{Minimize }j(q),\quad q\in\mathcal{M}(\Omega).\tag{7}$$

$$\|\bar{u}(T)\|_{L^{2}(\Omega)}\leq 2\|u_{d}\|_{L^{2}(\Omega)}\quad\text{and}\quad\alpha\|\bar{q}\|_{\mathcal{M}(\Omega)}\leq\frac{1}{2}\|u_{d}\|_{L^{2}(\Omega)}^{2},$$

$$\begin{aligned}-\partial_{t}\bar{z}-\Delta\bar{z}&=0&&\text{in }I\times\Omega,\\ \bar{z}&=0&&\text{on }I\times\partial\Omega,\\ \bar{z}(T)&=\bar{u}(T)-u_{d}&&\text{in }\Omega\end{aligned}$$

$$-\langle q-\bar{q},\bar{z}(0)\rangle\leq\alpha\,(\|q\|_{\mathcal{M}(\Omega)}-\|\bar{q}\|_{\mathcal{M}(\Omega)})\quad\text{for all }q\in\mathcal{M}(\Omega).$$

$$\|\bar{z}(0)\|_{H^{4}(\Omega_{0})}\leq c\,\|u_{d}\|_{L^{2}(\Omega)},$$

$$\|\Delta\bar{z}(0)\|_{H^{2}(\Omega)}\leq c\,\|\Delta^{2}\bar{z}(0)\|_{L^{2}(\Omega)}\leq c\,\|\bar{u}(T)-u_{d}\|_{L^{2}(\Omega)},$$

$$\|\bar{z}(0)\|_{H^{4}(\Omega_{0})}\leq c\,\|\Delta\bar{z}(0)\|_{H^{2}(\Omega)}\leq c\,\|\bar{u}(T)-u_{d}\|_{L^{2}(\Omega)}\leq c\,\|u_{d}\|_{L^{2}(\Omega)},$$

$$\lvert\bar{z}(0,x)\rvert\leq\alpha\quad\text{for all }x\in\bar{\Omega},$$


Full text

Numerical Analysis of Sparse Initial Data Identification

for Parabolic Problems

Dmitriy Leykekhman

Department of Mathematics, University of Connecticut, Storrs, CT 06269, USA ([email protected]).

,

Boris Vexler

Chair of Optimal Control, Technical University of Munich, Department of Mathematics, Boltzmannstraße 3, 85748 Garching b. Munich, Germany ([email protected]).

and

Daniel Walter

Johann Radon Institute for Computational and Applied Mathematics, ÖAW, Altenbergerstraße 69, 4040 Linz, Austria ([email protected]). The third author gratefully acknowledges support from the International Research Training Group IGDK, funded by the German Science Foundation (DFG) and the Austrian Science Fund (FWF).

Abstract.

In this paper we consider a problem of initial data identification from the final time observation for homogeneous parabolic problems. It is well-known that such problems are exponentially ill-posed due to the strong smoothing property of parabolic equations. We are interested in a situation when the initial data we intend to recover is known to be sparse, i.e. its support has Lebesgue measure zero. We formulate the problem as an optimal control problem and incorporate the information on the sparsity of the unknown initial data into the structure of the objective functional. In particular, we are looking for the control variable in the space of regular Borel measures and use the corresponding norm as a regularization term in the objective functional. This leads to a convex but non-smooth optimization problem. For the discretization we use continuous piecewise linear finite elements in space and discontinuous Galerkin finite elements of arbitrary degree in time. For the general case we establish error estimates for the state variable. Under a certain structural assumption, we show that the control variable consists of a finite linear combination of Dirac measures. For this case we obtain error estimates for the locations of Dirac measures as well as for the corresponding coefficients. The key to the numerical analysis are the sharp smoothing type pointwise finite element error estimates for homogeneous parabolic problems, which are of independent interest. Moreover, we discuss an efficient algorithmic approach to the problem and show several numerical experiments illustrating our theoretical results.

Key words and phrases:

optimal control, sparse control, initial data identification, smoothing estimates, parabolic problems, finite elements, discontinuous Galerkin, error estimates, pointwise error estimates

1991 Mathematics Subject Classification:

65N30,65N15

1. Introduction

In this paper we consider the problem of identifying unknown initial data $q$ for a homogeneous parabolic equation

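$$\begin{aligned}\partial_{t}u-\Delta u&=0&&\text{in }I\times\Omega,\\ u&=0&&\text{on }I\times\partial\Omega,\\ u(0)&=q&&\text{in }\Omega\end{aligned}\tag{1}$$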

from given (measured) data $u_{d}\approx u(T)$ of the terminal state $u(T)$ for some $T>0$. In general, this problem is known to be exponentially ill-posed, see, e.g., [17]. We are interested in the situation where the initial data we are looking for is known to be sparse, i.e. to have a support of Lebesgue measure zero. The strong smoothing property of the above equation makes it difficult to identify such sparse initial data. The remedy is to incorporate the information that the unknown $q$ should be sparse into the optimal control formulation. Following the idea of measure-valued formulations of sparse control problems, see, e.g., [7, 8, 9, 20, 28], we will look for the initial state $q$ in the space of regular Borel measures $\mathcal{M}(\Omega)$ on the domain $\Omega$, which is known to be isomorphic to $C_{0}(\Omega)^{*}$, the dual space of the continuous functions which are zero on $\partial\Omega$.
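To make the exponential ill-posedness concrete, consider a one-dimensional model that is not part of the analysis in this paper: on the interval $(0,1)$ with homogeneous Dirichlet conditions, the $n$-th sine mode of the initial data is damped by $e^{-n^{2}\pi^{2}T}$ at time $T$, so a naive inversion of the final-time map multiplies an error in that mode by $e^{n^{2}\pi^{2}T}$. The following Python sketch tabulates these factors; the interval, the value $T=0.1$ and the function names are illustrative assumptions.

```python
import math


def damping_factor(n: int, T: float) -> float:
    """Decay of the n-th Dirichlet sine mode on (0, 1) between time 0 and time T."""
    return math.exp(-((n * math.pi) ** 2) * T)


def noise_amplification(n: int, T: float) -> float:
    """Factor by which an error in the n-th mode of u(T) grows under naive inversion."""
    return math.exp(((n * math.pi) ** 2) * T)


if __name__ == "__main__":
    T = 0.1  # illustrative final time (assumption)
    for n in (1, 2, 4, 8):
        print(f"mode {n}: damping {damping_factor(n, T):.3e}, "
              f"amplification {noise_amplification(n, T):.3e}")
```

Doubling the frequency raises the amplification factor to its fourth power, which is why high-frequency content of the initial data, and in particular the location of point sources, is hard to recover without regularization.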

The corresponding optimal control formulation reads as follows

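$$\text{Minimize }J(q,u)=\frac{1}{2}\|u(T)-u_{d}\|_{L^{2}(\Omega)}^{2}+\alpha\|q\|_{\mathcal{M}(\Omega)},\quad q\in\mathcal{M}(\Omega),\quad\text{subject to (1).}\tag{2}$$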

Here and in what follows, $\Omega$ is a convex polygonal/polyhedral domain in $\mathbb{R}^{N}$, $N=2,3$, $I=(0,T]$ is the time interval, $u_{d}\in L^{2}(\Omega)$ is the given (desired/measured) final state, and $\alpha>0$ is the regularization parameter. A very similar problem is considered in [7]. There, the initial state $q$ is also searched for in the space $\mathcal{M}(\Omega)$. For given $\varepsilon>0$ and $u_{d}\in L^{2}(\Omega)$ the optimal control problem in [7] is formulated as follows:

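$$\text{Minimize }\|q\|_{\mathcal{M}(\Omega)}\quad\text{subject to }\|u(T)-u_{d}\|_{L^{2}(\Omega)}\leq\varepsilon\text{ and (1).}\tag{3}$$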

One can show directly that problems (2) and (3) are equivalent for appropriate choices of $\alpha$ and $\varepsilon$.

The optimal control problem (2) possesses a unique solution $(\bar{q},\bar{u})$, see the next section for details. For the numerical solution of the optimal control problem under consideration we will use discontinuous Galerkin methods dG($r$) of order $r$ for the temporal and linear (conforming) finite elements for the spatial discretization of the state equation (1), leading to the discrete optimal solution $(\bar{q}_{kh},\bar{u}_{kh})$. The same type of discretization (with $r=0$) is used in [7], where weak-star convergence $\bar{q}_{kh}\overset{\ast}{\rightharpoonup}\bar{q}$ in $\mathcal{M}(\Omega)$ for the control and strong convergence $\bar{u}_{kh}(T)\to\bar{u}(T)$ in $L^{\infty}(\Omega)$ is shown for the discretization parameters $k$ and $h$ tending to zero. However, no convergence rates with respect to $k$ or $h$ are derived in [7]. The main goal of this paper is to close this gap and obtain precise error estimates. In addition, in the case when the optimal control is a linear combination of Dirac measures, we obtain convergence rates for the source locations and the corresponding coefficients. We illustrate the theoretical results with numerical experiments.

For the general case (i.e. without any further assumptions) we will prove the following error estimate

$$\|(\bar{u}-\bar{u}_{kh})(T)\|_{L^{2}(\Omega)}\leq c\,(k^{r+\frac{1}{2}}+\ell_{kh}h),$$

where $k$ denotes the maximal time step, $h$ is the spatial mesh size, and $\ell_{kh}$ is a logarithmic term, see Theorem 5 for details.

From the optimality system (see the next section) we will deduce that the support of the optimal control (optimal initial state) $\bar{q}$ is contained in the set of maxima and minima of the adjoint state $\bar{z}(0)$, see Corollary 2. Under additional assumptions (Assumption 1) on this set, which imply that the optimal control $\bar{q}$ consists of finitely many Dirac measures, i.e.

$$\bar{q}=\sum_{i=1}^{K}\bar{\beta}_{i}\delta_{\bar{x}_{i}},$$

we will show that the discrete optimal control $\bar{q}_{kh}$ has a similar structure, i.e.

$$\bar{q}_{kh}=\sum_{i=1}^{K}\sum_{j=1}^{n_{i}}\bar{\beta}_{kh,ij}\delta_{\bar{x}_{kh,ij}},$$

where each Dirac measure $\delta_{\bar{x}_{i}}$ on the continuous level is approximated by $n_{i}\geq 1$ Dirac measures $\delta_{\bar{x}_{kh,ij}}$ on the discrete level, see Lemma 6.2 for details. In this setting we will provide (see Theorem 6.1 and Theorem 6.2) an improved error estimate for the optimal states, i.e.

$$\|(\bar{u}-\bar{u}_{kh})(T)\|_{L^{2}(\Omega)}\leq c\,(k^{2r+1}+\ell_{kh}h).$$

Moreover, we will prove an estimate for the error in position of the support points,

$$\lvert\bar{x}_{i}-\bar{x}_{kh,ij}\rvert\leq c\,(k^{2r+1}+\ell_{kh}^{\frac{1}{2}}h)$$

for all $1\leq i\leq K$ and $1\leq j\leq n_{i}$, and corresponding estimates for the coefficients. As a corollary we obtain an error estimate for the discrete optimal solutions in the norm of the topological dual of the Sobolev space $W^{1,\infty}(\Omega)$. This also implies the same rate of convergence for $\bar{q}_{kh}$ with respect to the Kantorovich-Rubinshtein norm, [3, Section 8.3], given by

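$$\|q\|_{\mathrm{KR}}=\sup\Big\{\langle q,\varphi\rangle\;:\;\varphi\in C(\Omega),\ \sup_{x_{1},x_{2}\in\Omega,\,x_{1}\neq x_{2}}\frac{\lvert\varphi(x_{1})-\varphi(x_{2})\rvert}{\lvert x_{1}-x_{2}\rvert}\leq 1,\ \lvert\varphi(x)\rvert\leq 1,\ x\in\Omega\Big\}$$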

for $q\in\mathcal{M}(\Omega)$. In fact, one readily verifies that this norm is equivalent to the $(W^{1,\infty})^{*}$ norm. Roughly speaking, the metric induced by the Kantorovich-Rubinshtein norm can be interpreted as an extension of the well-known Wasserstein-1 distance, [18], which is defined for probability measures, to signed measures with different mean values.
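The difference between the total variation norm and the Kantorovich-Rubinshtein norm is easiest to see for two unit Dirac measures. As a hedged illustration on the real line (an assumption; the paper works in $\mathbb{R}^{N}$ with $N=2,3$, and the restriction to $\Omega$ is ignored here), one has $\|\delta_{x}-\delta_{y}\|_{\mathcal{M}}=2$ whenever $x\neq y$, while $\|\delta_{x}-\delta_{y}\|_{\mathrm{KR}}=\min(\lvert x-y\rvert,2)$, with the supremum attained by a clipped ramp. The helper names below are hypothetical.

```python
def total_variation_diracs(x: float, y: float) -> float:
    """Total variation norm of delta_x - delta_y: insensitive to the distance |x - y|."""
    return 0.0 if x == y else 2.0


def kr_norm_diracs(x: float, y: float) -> float:
    """Kantorovich-Rubinshtein norm of delta_x - delta_y on the real line."""
    return min(abs(x - y), 2.0)


def clipped_ramp(t: float, x: float, y: float) -> float:
    """Test function with Lipschitz constant 1 and values in [-1, 1] attaining the supremum."""
    mid = 0.5 * (x + y)
    sign = 1.0 if x >= y else -1.0
    return max(-1.0, min(1.0, sign * (t - mid)))


def pairing(x: float, y: float) -> float:
    """Duality pairing <delta_x - delta_y, phi> for the clipped ramp phi."""
    return clipped_ramp(x, x, y) - clipped_ramp(y, x, y)


if __name__ == "__main__":
    for x, y in [(0.30, 0.31), (0.30, 0.40), (0.0, 5.0)]:
        print(x, y, total_variation_diracs(x, y), kr_norm_diracs(x, y), pairing(x, y))
```

A small error in the position of a Dirac measure is therefore small in the Kantorovich-Rubinshtein norm but not in the total variation norm, which is why convergence of $\bar{q}_{kh}$ is measured in the former.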

In order to obtain such convergence rates we need to revise fully discrete pointwise smoothing error estimates for a homogeneous parabolic problem

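$$\begin{aligned}\partial_{t}v-\Delta v&=0&&\text{in }I\times\Omega,\\ v&=0&&\text{on }I\times\partial\Omega,\\ v(0)&=v_{0}&&\text{in }\Omega\end{aligned}\tag{5}$$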

with a general initial condition $v_{0}\in L^{2}(\Omega)$. This means that for the fully discrete approximation $v_{kh}$ we need optimal pointwise spatial error estimates for $(v-v_{kh})(T)$ in terms of the $L^{2}(\Omega)$ norm of the initial data. This problem is classical and has been considered in a number of papers; we mention only those most relevant to our presentation. Global $L^{\infty}(\Omega)$ error estimates for smooth domains and uniform time steps were established in [16], while superconvergent results at the time nodes in the $L^{2}(\Omega)$ norm, again on smooth domains, were established in [12]. One of the main contributions of our paper is the derivation of interior error estimates on convex polygonal/polyhedral domains that are superconvergent in time and pointwise in space. More precisely, we establish the following result

$$\lvert(v-v_{kh})(T,x_{0})\rvert\leq C(T)\,(k^{2r+1}+\ell_{kh}h^{2})\,\|v_{0}\|_{L^{2}(\Omega)},$$

where $x_{0}\in\Omega$ is an interior point. The precise form of the constants and the logarithmic terms are given in the statements of the Theorem 3.2 and Theorem 3.2. This result is required for our error analysis for the problem (2) and is also of independent interest.

Throughout the paper we use $\lvert\cdot\rvert$ for the absolute value and also for the Euclidean norm of a vector in ${\mathbb{R}}^{n}$. We employ the usual notation for the Lebesgue and Sobolev spaces. We denote by $(\cdot,\cdot)$ the inner product in $L^{2}(\Omega)$, by $\langle\cdot,\cdot\rangle$ the duality product between $\mathcal{M}(\Omega)$ and $C_{0}(\Omega)$, and by $(\cdot,\cdot)_{J\times\Omega}$ the inner product in $L^{2}(J\times\Omega)$ with a subinterval $J\subset I$. With $W(0,T)$ we denote the usual space

$$W(0,T)=L^{2}(I;H_{0}^{1}(\Omega))\cap H^{1}(I;H^{-1}(\Omega)).$$

The paper is organized as follows. In the next section we introduce the optimal control problem, derive first order optimality conditions and discuss structural properties of the optimal solutions. In section 3, we present a fully discrete scheme for the homogeneous parabolic equation (5) and state key smoothing error estimates, the proofs of which are postponed until sections 7 and 8. In section 4, we look separately at the time semidiscretization and the full discretization of the optimal control problem and derive some preliminary results. In section 5 we first obtain suboptimal error estimates for the general case which under additional assumptions we improve in section 6. Finally, the last two sections are devoted to the description of the algorithm and numerical illustrations of our theoretical results.

2. Optimal control problem

To introduce the precise formulation of the optimal control problem under consideration we first discuss the solution of the state equation (1). For a given $q\in\mathcal{M}(\Omega)$ we call $u=u(q)\in L^{1}(I\times\Omega)$ a (very weak) solution of (1) if the following identity holds

$$(\psi,u)_{I\times\Omega}=\langle q,\varphi(0)\rangle$$

for all $\psi\in L^{\infty}(I\times\Omega)$ , where $\varphi\in W(0,T)$ is the weak solution of

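$$\begin{aligned}-\partial_{t}\varphi-\Delta\varphi&=\psi&&\text{in }I\times\Omega,\\ \varphi&=0&&\text{on }I\times\partial\Omega,\\ \varphi(T)&=0&&\text{in }\Omega\end{aligned}$$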

It is well known that $\varphi\in C(\bar{I}\times\bar{\Omega})$ for $\psi\in L^{\infty}(I\times\Omega)$, see, e.g., [15, Theorem 6.8] on general Lipschitz domains, or [4, Theorem 5.1]. Therefore, $\varphi(0)\in C_{0}(\Omega)$ and the solution $u$ is well defined. The following proposition holds, see [7, Lemma 2.2].

Proposition.

For each $q\in\mathcal{M}(\Omega)$ there exists a unique solution $u$ of (1) in the above sense. Moreover, there holds $u\in L^{r}(I;W_{0}^{1,p}(\Omega))$ for all $r,p\in[1,2)$ with

$$\frac{2}{r}+\frac{N}{p}>N+1$$

and $u(T)\in L^{2}(\Omega)$ with the corresponding estimates

$$\|u\|_{L^{r}(I;W_{0}^{1,p}(\Omega))}\leq c\,\|q\|_{\mathcal{M}(\Omega)}$$

and

$$\|u(T)\|_{L^{2}(\Omega)}\leq c\,\|q\|_{\mathcal{M}(\Omega)}.$$
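A formal calculation, which is not part of the original argument, indicates where the identity defining the very weak solution comes from. It assumes that $u$ is smooth enough to integrate by parts and that the dual problem reads $-\partial_{t}\varphi-\Delta\varphi=\psi$ in $I\times\Omega$ with $\varphi=0$ on $I\times\partial\Omega$ and $\varphi(T)=0$; these right-hand sides are assumptions chosen to be consistent with the identity.

```latex
\begin{aligned}
(\psi,u)_{I\times\Omega}
  &= (-\partial_{t}\varphi-\Delta\varphi,\,u)_{I\times\Omega} \\
  &= (\varphi,\,\partial_{t}u-\Delta u)_{I\times\Omega}
     + (\varphi(0),u(0)) - (\varphi(T),u(T)) \\
  &= 0 + \langle q,\varphi(0)\rangle - 0
   = \langle q,\varphi(0)\rangle .
\end{aligned}
```

The boundary terms in space vanish because both $u$ and $\varphi$ are zero on $\partial\Omega$. For a measure $q$ the pairing $(\varphi(0),u(0))$ only makes sense as the duality product $\langle q,\varphi(0)\rangle$, which is why the continuity of $\varphi(0)$ is needed.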

Remark.

The final state $u(T)$ has more regularity. There holds $(-\Delta)^{k}u(T)\in L^{2}(\Omega)$ for any natural number $k$ . For example by taking $k=1$ , we obtain $u(T)\in H^{2}(\Omega)\cap H^{1}_{0}(\Omega)$ using the convexity of the domain.

The unique solvability of the state equation allows us to introduce the control-to-state mapping $S\colon\mathcal{M}(\Omega)\to L^{2}(\Omega)$ with $S(q)=u(q)(T)$. By the discussion above this operator is linear and continuous, and since $S(q)\in H^{2}(\Omega)$ it maps every weak-star convergent sequence $\{q_{n}\}\subset\mathcal{M}(\Omega)$ to a strongly convergent sequence in $L^{2}(\Omega)$. Based on this operator we define the reduced cost functional $j\colon\mathcal{M}(\Omega)\to\mathbb{R}$ by

$$j(q)=\frac{1}{2}\|S(q)-u_{d}\|_{L^{2}(\Omega)}^{2}+\alpha\|q\|_{\mathcal{M}(\Omega)}.$$

The optimal control problem (2) can then be formulated as

$$\text{Minimize }j(q),\quad q\in\mathcal{M}(\Omega).\tag{7}$$

Theorem.

The problem (7) possesses a unique solution $\bar{q}\in\mathcal{M}(\Omega)$. There hold the estimates

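$$\|\bar{u}(T)\|_{L^{2}(\Omega)}\leq 2\|u_{d}\|_{L^{2}(\Omega)}\quad\text{and}\quad\alpha\|\bar{q}\|_{\mathcal{M}(\Omega)}\leq\frac{1}{2}\|u_{d}\|_{L^{2}(\Omega)}^{2},$$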

where $\bar{u}=u(\bar{q})$ is the corresponding optimal state.

Proof.

The existence follows by standard arguments, cf., e.g., [9, Proposition 2.2]. The uniqueness follows as in [7, Theorem 2.4] using density of the range of the semigroup generated by the heat equation [14], which is equivalent to the backward uniqueness property of the heat equation. The estimates follow from $j(\bar{q})\leq j(0)$. ∎
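For completeness, the last step of the proof can be spelled out as follows. The only ingredients are the optimality of $\bar{q}$, the fact that $S(0)=0$ by linearity, and the triangle inequality; all norms without a subscript for the measure are $L^{2}(\Omega)$ norms.

```latex
\begin{aligned}
\tfrac{1}{2}\|\bar{u}(T)-u_{d}\|^{2} + \alpha\|\bar{q}\|_{\mathcal{M}(\Omega)}
  &= j(\bar{q}) \le j(0) = \tfrac{1}{2}\|u_{d}\|^{2}, \\
\alpha\|\bar{q}\|_{\mathcal{M}(\Omega)}
  &\le \tfrac{1}{2}\|u_{d}\|^{2}, \\
\|\bar{u}(T)\|
  &\le \|\bar{u}(T)-u_{d}\| + \|u_{d}\|
   \le 2\|u_{d}\| .
\end{aligned}
```

The second line drops the nonnegative tracking term, and the third uses $\|\bar{u}(T)-u_{d}\|\le\|u_{d}\|$, which follows from the first line by dropping the nonnegative regularization term.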

The unique solution $\bar{q}$ and the corresponding optimal state $\bar{u}$ can be characterized by the following optimality conditions.

Theorem.

The control $\bar{q}\in\mathcal{M}(\Omega)$ is the solution of (7) if and only if the triple $(\bar{q},\bar{u},\bar{z})$ satisfies the following conditions:

•

state equation, $\bar{u}=u(\bar{q})$ in the sense of Proposition 2.

•

adjoint equation for $\bar{z}\in W(0,T)$ being the weak solution of

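$$\begin{aligned}-\partial_{t}\bar{z}-\Delta\bar{z}&=0&&\text{in }I\times\Omega,\\ \bar{z}&=0&&\text{on }I\times\partial\Omega,\\ \bar{z}(T)&=\bar{u}(T)-u_{d}&&\text{in }\Omega\end{aligned}$$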

•

variational inequality

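$$-\langle q-\bar{q},\bar{z}(0)\rangle\leq\alpha\,(\|q\|_{\mathcal{M}(\Omega)}-\|\bar{q}\|_{\mathcal{M}(\Omega)})\quad\text{for all }q\in\mathcal{M}(\Omega).$$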

Proof.

The proof is similar to [5, Theorem 2.1]. Note that $\bar{z}(0)\in C_{0}(\Omega)$, which makes the duality product in the variational inequality well defined. ∎

The next lemma states additional regularity for $\bar{z}(0)$ .

Lemma.

Let $\bar{q}\in\mathcal{M}(\Omega)$ be the solution of (7), $\bar{u}$ be the corresponding state and $\bar{z}$ the corresponding adjoint state. Let $\Omega_{0}$ be an interior subdomain of $\Omega$ , i.e. $\bar{\Omega}_{0}\subset\Omega$ . Then there holds $\bar{z}(0)\in H^{4}(\Omega_{0})\hookrightarrow C^{2}(\Omega_{0})$ with

$$\|\bar{z}(0)\|_{H^{4}(\Omega_{0})}\leq c\,\|u_{d}\|_{L^{2}(\Omega)},$$

where the constant $c$ depends on $\Omega$ , $T$ and $\Omega_{0}$ .

Proof.

As in Remark 2, one shows directly $-\Delta\bar{z}(0)\in H^{2}(\Omega)\cap H^{1}_{0}(\Omega)$ with

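$$\|\Delta\bar{z}(0)\|_{H^{2}(\Omega)}\leq c\,\|\Delta^{2}\bar{z}(0)\|_{L^{2}(\Omega)}\leq c\,\|\bar{u}(T)-u_{d}\|_{L^{2}(\Omega)},$$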

cf. also (17) below. Then the elliptic interior regularity result from [13, Chapter 6.3,Theorem 2] implies

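$$\|\bar{z}(0)\|_{H^{4}(\Omega_{0})}\leq c\,\|\Delta\bar{z}(0)\|_{H^{2}(\Omega)}\leq c\,\|\bar{u}(T)-u_{d}\|_{L^{2}(\Omega)}\leq c\,\|u_{d}\|_{L^{2}(\Omega)},$$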

where in the last estimate we used Theorem 2. ∎

From the above optimality condition we obtain the following structural properties of the optimal solution $\bar{q}$ and the corresponding optimal adjoint state $\bar{z}$ .

Corollary.

Let $\bar{q}$ be the solution of (7), $\bar{u}$ be the corresponding state and $\bar{z}$ the corresponding adjoint state. Then there hold

(a)

a bound for the adjoint state $\bar{z}(0)$

$$\lvert\bar{z}(0,x)\rvert\leq\alpha\quad\text{for all }x\in\bar{\Omega},$$

(b)

a support condition for the positive and the negative parts in the Jordan decomposition of $\bar{q}=\bar{q}^{+}-\bar{q}^{-}$

[TABLE]

Moreover there is a subdomain $\Omega_{0}$ with $\bar{\Omega}_{0}\subset\Omega$ such that

[TABLE]

Proof.

The proof is similar to [9] or [5]. ∎
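A finite-dimensional analogue can make these conditions tangible. The sketch below is not the algorithm discussed later in this paper; it swaps in a plain proximal gradient (ISTA) iteration for $\min_{q}\frac{1}{2}\lvert Aq-u_{d}\rvert^{2}+\alpha\lvert q\rvert_{1}$, where the matrix $A$ is a free-space Gaussian heat kernel sampled on a grid of $(0,1)$ (an assumption that ignores the Dirichlet boundary), Euclidean norms replace the $L^{2}(\Omega)$ norm, and all names are hypothetical. The vector `z0 = A^T (A q - u_d)` plays the role of $\bar{z}(0)$: at an exact minimizer one has $\lvert z0_{j}\rvert\leq\alpha$, with $z0_{j}=-\alpha\,\mathrm{sign}(q_{j})$ wherever $q_{j}\neq 0$, and ISTA only approaches this in the limit.

```python
import math


def heat_kernel_matrix(n, T):
    """Sample the free-space Gaussian heat kernel at time T on n interior grid points of (0, 1)."""
    h = 1.0 / (n + 1)
    xs = [(i + 1) * h for i in range(n)]
    c = 1.0 / math.sqrt(4.0 * math.pi * T)
    A = [[c * math.exp(-((xi - xj) ** 2) / (4.0 * T)) for xj in xs] for xi in xs]
    return xs, A


def matvec(A, q):
    return [sum(a * qj for a, qj in zip(row, q)) for row in A]


def soft_threshold(v, tau):
    """Proximal map of tau * |.|: shrink v towards zero by tau."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def objective(A, q, ud, alpha):
    r = [ui - di for ui, di in zip(matvec(A, q), ud)]
    return 0.5 * sum(ri * ri for ri in r) + alpha * sum(abs(qj) for qj in q)


def adjoint_at_zero(A, q, ud):
    """Discrete analogue of the adjoint state at t = 0; A is symmetric, so A^T = A."""
    r = [ui - di for ui, di in zip(matvec(A, q), ud)]
    return matvec(A, r)


def ista(A, ud, alpha, iters=500):
    # (max absolute row sum)^2 bounds ||A||_2^2 for symmetric A, so 1/L is a safe step size.
    L = max(sum(abs(a) for a in row) for row in A) ** 2
    q = [0.0] * len(A)
    for _ in range(iters):
        z0 = adjoint_at_zero(A, q, ud)
        q = [soft_threshold(qj - zj / L, alpha / L) for qj, zj in zip(q, z0)]
    return q


if __name__ == "__main__":
    n, alpha = 20, 0.05
    xs, A = heat_kernel_matrix(n, 0.01)
    q_true = [0.0] * n
    q_true[6], q_true[13] = 1.0, -0.5  # two synthetic point sources (assumption)
    ud = matvec(A, q_true)
    q = ista(A, ud, alpha)
    z0 = adjoint_at_zero(A, q, ud)
    print("j(q) =", objective(A, q, ud, alpha), " j(0) =", objective(A, [0.0] * n, ud, alpha))
    print("max |z0| / alpha =", max(abs(z) for z in z0) / alpha)
```

Because ISTA with this step size never increases the objective, every iterate satisfies the discrete counterparts of the a priori bounds $\alpha\lvert q\rvert_{1}\leq\frac{1}{2}\lvert u_{d}\rvert^{2}$ and $\lvert Aq\rvert\leq 2\lvert u_{d}\rvert$. The ratio printed in the last line indicates how far the current iterate is from satisfying the bound on the adjoint state.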

Remark.

The adjoint state $\bar{z}(0)$ is analytic on $\Omega_{0}$, see [19]. By the above corollary, this implies that the Lebesgue measure of $\operatorname{supp}\bar{q}$ is zero.

3. Discretization and smoothing type error estimates

In this section we describe the (fully discrete) finite element discretization of the (auxiliary) homogeneous equation (5) and present smoothing type error estimates. To discretize the problem we use continuous linear Lagrange finite elements in space and discontinuous Galerkin methods of order $r$ in time. To be more precise, we partition $I=(0,T]$ into subintervals $I_{m}=(t_{m-1},t_{m}]$ of length $k_{m}=t_{m}-t_{m-1}$, where $0=t_{0}<t_{1}<\cdots<t_{M-1}<t_{M}=T$. The maximal and minimal time steps are denoted by $k=\max_{m}k_{m}$ and $k_{\min}=\min_{m}k_{m}$, respectively. We impose the following conditions on the time mesh (as in [22] or [24]):

(i) There are constants $c,\beta>0$ independent of $k$ such that

[TABLE]

(ii) There is a constant $\kappa>0$ independent of $k$ such that for all $m=1,2,\dots,M-1$

[TABLE]

(iii) It holds $k\leq\frac{T}{2r+2}$.

The semidiscrete space $X_{k}^{r}$ of piecewise polynomial functions in time is defined by

[TABLE]

where $\mathbb{P}_{r}(I_{m};V)$ is the space of polynomial functions of degree $r$ in time on $I_{m}$ with values in a Banach space $V$. We will employ the following notation for functions with possible discontinuities at the nodes $t_{m}$:

[TABLE]

Next we define the following bilinear form

[TABLE]

where $\langle\cdot,\cdot\rangle_{I_{m}\times\Omega}$ is the duality product between $L^{2}(I_{m};H^{-1}(\Omega))$ and $L^{2}(I_{m};H^{1}_{0}(\Omega))$. We note that the first sum vanishes for $w\in X^{0}_{k}$. The dG($r$) semidiscrete (in time) approximation $v_{k}\in X_{k}^{r}$ of (5) is defined as

[TABLE]

Rearranging the terms in (9), we obtain an equivalent (dual) expression for $B$ :

[TABLE]

In the sequel we require the projection operator $\pi_{k}$, which is defined for $w\in C(I,L^{2}(\Omega))$ by $\pi_{k}w|_{I_{m}}\in\mathbb{P}_{r}(I_{m};L^{2}(\Omega))$ for $m=1,2,\dots,M$ and, on each subinterval $I_{m}$, by

[TABLE]

In the case $r=0$ , $\pi_{k}w$ is defined only by the second condition.

Next we define the fully discrete approximation scheme. For $h\in(0,h_{0}]$ ; $h_{0}>0$ , let $\mathcal{T}$ denote a quasi-uniform triangulation of $\Omega$ with mesh size $h$ , i.e., $\mathcal{T}=\{\tau\}$ is a partition of $\Omega$ into cells (triangles or tetrahedrons) $\tau$ of diameter $h_{\tau}$ such that for $h=\max_{\tau}h_{\tau}$ ,

[TABLE]

hold. Let $V_{h}$ be the set of all functions in $H^{1}_{0}(\Omega)$ that are affine linear on each cell $\tau$ , i.e. $V_{h}$ is the usual space of linear conforming finite elements. We define the following three operators to be used in the sequel: discrete Laplacian $\Delta_{h}\colon V_{h}\to V_{h}$ defined by

[TABLE]

the $L^{2}$ projection $P_{h}\colon L^{2}(\Omega)\to V_{h}$ defined by

[TABLE]

and the Ritz projection $R_{h}\colon H^{1}_{0}(\Omega)\to V_{h}$ defined by

[TABLE]

To obtain the fully discrete approximation of (5) we consider the space-time finite element space

[TABLE]

We define a fully discrete cG( $1$ )dG( $r$ ) approximation $v_{kh}\in X^{r,1}_{k,h}$ of (5) by

[TABLE]

Notice that we have the following orthogonality relations

[TABLE]

In the proofs we will use the following truncation argument. For $w_{k},\varphi_{k}\in X_{k}^{r}$ , we let $\tilde{w}_{k}=\chi_{(t_{\tilde{m}},T]}w_{k}$ and $\tilde{\varphi}_{k}=\chi_{(t_{\tilde{m}},T]}\varphi_{k}$ , where $\chi_{(t_{\tilde{m}},T]}$ is the characteristic function on the interval $(t_{\tilde{m}},T]$ , for some $1\leq\tilde{m}\leq M$ , i.e. $\tilde{w}_{k}=0$ on $I_{1}\cup\cdots\cup I_{\tilde{m}}$ for some $\tilde{m}$ and $\tilde{w}_{k}={w}_{k}$ on the remaining time intervals. Then from (9), we have the identity

[TABLE]

The same identity holds of course for fully discrete functions $w_{kh},\varphi_{kh}\in X^{r,1}_{k,h}$. The following smoothing properties of the continuous, semidiscrete and fully discrete solutions are essential in our arguments.

3.1. Parabolic smoothing

It is well known that the solution $v$ to the homogeneous problem (5) has the following smoothing property.

[TABLE]

To get smoothing estimates in some other norms, we will frequently use the Gagliardo-Nirenberg inequality

[TABLE]

which holds for any subdomain $B\subset\Omega$ fulfilling the cone condition (in particular for $B=\Omega$ ) and for all $g\in H^{2}(B)$ , see [1, Theorem 3]. For $B=\Omega$ it follows with the $H^{2}$ -regularity

[TABLE]

The following smoothing estimates can be obtained from (17).

Lemma.

Let $v_{0}\in L^{2}(\Omega)$ and $v\in W(0,T)$ be the solution of (5). Then $v(T)\in H^{2}(\Omega)\cap H^{1}_{0}(\Omega)$ and the following estimate holds

[TABLE]

Moreover, for each interior subdomain $\Omega_{0}$ with $\bar{\Omega}_{0}\subset\Omega$ , the final state $v(T)$ is (real) analytic on such $\Omega_{0}$ and there hold

[TABLE]

Proof.

The first inequality follows directly from (17) with $l=1$ by $H^{2}$ regularity. The analyticity can be found, e.g., in [19]. To prove the second inequality we first observe that

[TABLE]

Then we use Gagliardo-Nirenberg inequality (18) for $g=\nabla v(T)$ and $B=\Omega_{0}$ resulting in

[TABLE]

where we have used the interior regularity result [13, Chapter 6.3,Theorem 2]. To show the last inequality we use Gagliardo-Nirenberg inequality (18) for $g=\nabla^{2}v(T)$ and $B=\Omega_{0}$ resulting in

[TABLE]

where we again have used the interior regularity result [13, Chapter 6.3,Theorem 2] and convexity of $\Omega$ . ∎

For the discontinuous Galerkin methods similar smoothing type estimates also hold, see Theorems 3, 4, 5, 10 in [23] for general $L^{p}$ norms, cf. also [11, Theorem 5.1] for the case of the $L^{2}$ norm.

Lemma (Smoothing estimate).

Let $v_{k}$ and $v_{kh}$ be the semidiscrete and fully discrete solutions of (10) and (14), respectively. Then, there exists a constant $C$ independent of $k$ and $h$ such that

[TABLE]

for $m=1,2,\dots,M$ and any $1\leq p\leq\infty$ . For $m=1$ the jump term is understood as $[v_{k}]_{0}=v_{k,0}^{+}-v_{0}$ and $[v_{kh}]_{0}=v_{kh,0}^{+}-P_{h}v_{0}$ .
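For the lowest-order case the discrete smoothing effect can be observed directly. The sketch below assumes a one-dimensional finite-difference Laplacian in place of the finite element operator $\Delta_{h}$, uniform time steps, and dG(0) read as the implicit Euler method $v_{m}=(I-k\Delta_{h})^{-1}v_{m-1}$. It reports the quantity $t_{m}\lvert\Delta_{h}v_{m}\rvert/\lvert v_{0}\rvert$ in the Euclidean norm, which a short spectral argument bounds by 1 in this simplified setting. The constant 1 and all function names belong to this illustration, not to the lemma above.

```python
import math


def neg_laplacian(v, h):
    """Apply the finite-difference operator -d^2/dx^2 with zero Dirichlet boundary values."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * v[i] - left - right) / h ** 2)
    return out


def implicit_euler_step(v, k, h):
    """Solve (I + k*A) w = v with the Thomas algorithm, where A = neg_laplacian."""
    n = len(v)
    off = -k / h ** 2
    diag = 1.0 + 2.0 * k / h ** 2
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag
    d[0] = v[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (v[i] - off * d[i - 1]) / denom
    w = [0.0] * n
    w[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        w[i] = d[i] - c[i] * w[i + 1]
    return w


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def smoothing_ratios(n=49, T=0.1, M=20):
    """Return t_m * |A v_m| / |v_0| for m = 1..M, starting from a rough nodal spike."""
    h = 1.0 / (n + 1)
    k = T / M
    v = [0.0] * n
    v[n // 2] = 1.0  # Dirac-like initial data (assumption)
    v0_norm = norm(v)
    ratios = []
    for m in range(1, M + 1):
        v = implicit_euler_step(v, k, h)
        ratios.append(m * k * norm(neg_laplacian(v, h)) / v0_norm)
    return ratios


if __name__ == "__main__":
    for m, ratio in enumerate(smoothing_ratios(), start=1):
        print(m, ratio)
```

The ratios stay bounded although the initial data is as rough as the grid allows, which is the discrete counterpart of the $t^{-1}$ smoothing of the continuous solution.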

In addition, stability with respect to the $L^{p}(\Omega)$ norm holds for the semidiscrete and fully discrete approximations of the heat equation. For the proof we refer to [23, Lemma 5], see also [27].

Lemma.

Let $v_{k}$ and $v_{kh}$ be the semidiscrete and fully discrete solutions of (10) and (14), respectively. Then, there exists a constant $C$ independent of $k$ and $h$ such that

[TABLE]

holds for any $1\leq p\leq\infty$ .

From Lemma 3.1 we immediately obtain the following corollary. Note that the corresponding estimate is not true on the continuous level, which explains the presence of the logarithmic term.

Corollary.

Under the assumptions of Lemma 3.1, for any $1\leq p\leq\infty$ we have

[TABLE]

and

[TABLE]

Proof.

We only provide the proof for the semidiscrete case, the fully discrete case is identical. Using the above smoothing result from Lemma 3.1, we have

[TABLE]

where in the last step we used that

[TABLE]

∎

For sufficiently many time steps, applying Lemma 3.1 iteratively, we immediately obtain the following result.

Lemma.

Let $v_{k}$ and $v_{kh}$ be the semidiscrete and fully discrete solutions of (10) and (14), respectively. For any $m\in\{1,2,\dots,M\}$, any $l\leq m$, and any $1\leq p\leq\infty$ there hold

[TABLE]

and

[TABLE]

provided $k\leq\frac{t_{m}}{l+1}$ .

The next lemma is the semidiscrete analog of Lemma 3.1.

Lemma.

Let $v_{0}\in L^{2}(\Omega)$ and $v_{k}\in X_{k}^{r}$ be the semidiscrete solution of (10). Let $\Omega_{0}$ be an interior subdomain, i.e. $\bar{\Omega}_{0}\subset\Omega$. Then $v_{k}(T)\in W^{1,\infty}(\Omega_{0})\cap C^{2}(\Omega_{0})$ and the following estimates hold

[TABLE]

Proof.

The proof is similar to the proof of Lemma 3.1 and uses Lemma 3.1. ∎

Using the discrete version of the Gagliardo-Nirenberg inequality

[TABLE]

which was established, for example, for smooth domains in [16, Lemma 3.3] and whose proof is valid for convex domains as well, we immediately obtain the following smoothing result.

Corollary.

Under the assumption of Lemma 3.1, for all $m=2,3,\dots,M$ we have

[TABLE]

3.2. Smoothing pointwise error estimates

The pointwise smoothing error estimates, which are of independent interest, are among the main tools for obtaining error estimates for the optimal control problem under consideration. The next theorems show that for the error at a point $(T,x_{0})$ we can obtain nearly optimal convergence rates in space and superconvergent rates in time. For elliptic problems such interior pointwise results are known from [30, 31]. For homogeneous parabolic problems with smoothing such results are new.

The first theorem provides an $L^{\infty}(\Omega)$ error estimate for the semidiscrete error $(v-v_{k})(T)$ .

Theorem.

Let $v_{0}\in L^{2}(\Omega)$ , let $v$ and $v_{k}$ satisfy (5) and (10). Then there holds

[TABLE]

with $C(T)\sim T^{-(2r+1+\frac{N}{4})}$ .

Note that we obtain here a superconvergent estimate of order ${\mathcal{O}}(k^{2r+1})$ for the discretization with polynomials of degree $r$. The proof of this theorem is given in Section 7.
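The order $2r+1$ can be observed on the scalar model $v'+\lambda v=0$, $v(0)=1$, which corresponds to a single eigenmode of the heat equation. The sketch below uses the known fact that the nodal values of dG($r$) for this equation are generated by the subdiagonal Padé approximation of the exponential: $1/(1+z)$ for $r=0$ and $(1-z/3)/(1+2z/3+z^{2}/6)$ for $r=1$, with $z=k\lambda$. The parameter values and function names are illustrative assumptions, and only $r\in\{0,1\}$ is covered.

```python
import math


def dg_nodal_factor(r: int, z: float) -> float:
    """Nodal amplification factor of dG(r) for v' + lam*v = 0, with z = k*lam."""
    if r == 0:
        return 1.0 / (1.0 + z)  # implicit Euler
    if r == 1:
        return (1.0 - z / 3.0) / (1.0 + 2.0 * z / 3.0 + z * z / 6.0)
    raise ValueError("this sketch only covers r = 0 and r = 1")


def final_time_error(r: int, lam: float, T: float, M: int) -> float:
    """Error at t = T after M uniform steps of size k = T / M."""
    k = T / M
    return abs(dg_nodal_factor(r, k * lam) ** M - math.exp(-lam * T))


def observed_order(r: int, lam: float = 1.0, T: float = 1.0, M: int = 20) -> float:
    """Estimate the convergence order in k by halving the time step once."""
    e_coarse = final_time_error(r, lam, T, M)
    e_fine = final_time_error(r, lam, T, 2 * M)
    return math.log2(e_coarse / e_fine)


if __name__ == "__main__":
    for r in (0, 1):
        print(f"dG({r}): observed order {observed_order(r):.2f}, expected {2 * r + 1}")
```

For a fixed eigenvalue the observed orders are close to 1 and 3. The content of the theorem is that such rates survive for the full problem with rough initial data, at the price of a constant that deteriorates as $T\to 0$.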

Remark.

In the sequel we will apply this and the following theorems both to a heat equation formulated forward in time, (5), and to a heat equation formulated backward in time, i.e.

[TABLE]

for some $y_{T}\in L^{2}(\Omega)$ . Its semidiscrete approximation $y_{k}\in X_{k}^{r}$ solves

[TABLE]

For this case the statement of the above theorem reads

[TABLE]

Correspondingly we will apply also Theorem 3.2 and Theorem 3.2 for this setting.

A corresponding result is true also for the $L^{\infty}$ norm of the gradient.

Theorem.

Let $v_{0}\in L^{2}(\Omega)$ , let $v$ and $v_{k}$ satisfy (5) and (10). Let moreover $\Omega_{0}$ with $\bar{\Omega}_{0}\subset\Omega$ be an interior subdomain. Then there holds

[TABLE]

with $C(T)\sim T^{-(2r+\frac{3}{2}+\frac{N}{4})}$ .

The proof of this theorem is given in Section 7.

Remark.

The result of Theorem 3.2 is valid also on the whole domain $\Omega$ instead of $\Omega_{0}$ with a slightly different constant $C(T)$.

For the spatial error $(v_{k}-v_{kh})(T)$ we cannot expect an ${\mathcal{O}}(h^{2})$ estimate with respect to the global $L^{\infty}(\Omega)$ norm. However, for a given point $x_{0}\in\Omega$ we obtain the following result.

Theorem.

Let $v_{0}\in L^{2}(\Omega)$, let $v_{k}$ and $v_{kh}$ satisfy (10) and (14), respectively, and let $x_{0}\in\Omega$ be such that $\operatorname{dist}(x_{0},\partial\Omega)=d$ with $d>4h$. Then there holds

[TABLE]

where $\ell_{kh}=\ln{\frac{T}{k}}+\lvert\ln{h}\rvert$ and $C(T,d)$ is a constant whose explicit dependence on $T$ and $d$ can be tracked from the proof.

The proof of this theorem is given in Section 8. Combining both theorems we immediately obtain an estimate for $(v-v_{kh})(T,x_{0})$ . {crllr} Let $v_{0}\in L^{2}(\Omega)$ , let $v$ and $v_{kh}$ satisfy (5) and (14), respectively and let $x_{0}\in\Omega$ such that $dist(x_{0},\partial\Omega)=d$ with $d>4h$ . Then there holds

[TABLE]

where $\ell_{kh}=\ln{\frac{T}{k}}+\lvert\ln{h}\rvert$ .

4. Discretization of optimal control problem

In this section we describe the temporal and spatial discretizations of the optimal control problem (2).

4.1. Temporal semidiscretization

To introduce the associated semidiscrete state $u_{k}=u_{k}(q)$ for a given control $q\in\mathcal{M}(\Omega)$ we consider slightly modified semidiscrete spaces $\widehat{X}_{k}^{r}\subset X_{k}^{r}\subset\widetilde{X}_{k}^{r}$ defined by

[TABLE]

and

[TABLE]

with some $1<s<\frac{N}{N-1}$ and $s^{\prime}>N$ with $\frac{1}{s}+\frac{1}{s^{\prime}}=1$ . For this setting we have $\varphi_{k,0}^{+}\in C_{0}(\Omega)$ for all $\varphi_{k}\in\widehat{X}_{k}^{r}$ due to the embedding $W^{1,s^{\prime}}_{0}(\Omega)\hookrightarrow C_{0}(\Omega)$ . The bilinear form $B(\cdot,\cdot)$ from (9) can be extended to $\widetilde{X}_{k}^{r}\times\widehat{X}_{k}^{r}$ . This allows us to define the semidiscrete state $u_{k}(q)\in\widetilde{X}_{k}^{r}$ by

[TABLE]

The corresponding semidiscrete control-to-state mapping $S_{k}\colon\mathcal{M}(\Omega)\to L^{2}(\Omega)$ is given by $S_{k}(q)=u_{k}(q)(T)$ and the semidiscrete reduced cost functional $j_{k}\colon\mathcal{M}(\Omega)\to\mathbb{R}$ by

[TABLE]

With this reduced cost functional we formulate the semidiscrete optimal control problems without discretization of the control space as follows:

[TABLE]

As on the continuous level we obtain the existence of a solution to (23). {thrm} Problem (23) possesses at least one solution $\bar{q}_{k}\in\mathcal{M}(\Omega)$ with corresponding state $\bar{u}_{k}=u_{k}(\bar{q}_{k})$ . There hold the estimates

[TABLE]

Proof.

The existence and the estimates follow by standard arguments, as on the continuous level. ∎

The question of uniqueness of $\bar{q}_{k}$ is more involved and is discussed after the statement of the optimality system.

{thrm}

The control $\bar{q}_{k}\in\mathcal{M}(\Omega)$ is a solution of (23) if and only if the triple $(\bar{q}_{k},\bar{u}_{k},\bar{z}_{k})$ fulfills the following conditions:

•

semidiscrete state equation, $\bar{u}_{k}=u_{k}(\bar{q}_{k})\in\widetilde{X}_{k}^{r}$ in the sense of (22).

•

semidiscrete adjoint equation for $\bar{z}_{k}\in\widehat{X}_{k}^{r}$ being the solution of

[TABLE]

•

variational inequality

[TABLE]

Proof.

The proof is the same as for the continuous problem. ∎

{crllr}

Let $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be a solution of (23), $\bar{u}_{k}\in\widetilde{X}_{k}^{r}$ be the corresponding state, and $\bar{z}_{k}\in\widehat{X}_{k}^{r}$ the corresponding adjoint state. Then there hold

(a)

a bound for the adjoint state $\bar{z}_{k,0}^{+}$

[TABLE]

(b)

a support condition for the positive and the negative parts in the Jordan decomposition of $\bar{q}_{k}=\bar{q}^{+}_{k}-\bar{q}^{-}_{k}$

[TABLE]

Moreover there is a subdomain $\Omega_{0}$ with $\bar{\Omega}_{0}\subset\Omega$ such that $\operatorname{supp}\bar{q}_{k}\subset\Omega_{0}.$

Proof.

The proof is the same as for the continuous problem. ∎

The uniqueness of the solution $\bar{q}$ on the continuous level follows (cf. [7, Theorem 2.4]) by the fact that for the solution of the heat equation (5) we have that $v(T)=0$ implies $v_{0}=0$ . This is also true for the $dG(0)$ discretization but is in general wrong for the dG( $r$ ) semidiscretization with $r\geq 1$ . However, the following technical lemma allows us to prove uniqueness of the semidiscrete control $\bar{q}_{k}$ .

{lmm}

Let $q\in\mathcal{M}(\Omega)$ and $u_{k}=u_{k}(q)\in\widetilde{X}_{k}^{r}$ be the corresponding semidiscrete state defined by (22). Let $u_{k}(T)=0$ . Then there holds:

(1)

For $r=0$ we have $q=0$ .

(2)

Let $r>0$ . If there exists an open set $D\subset\Omega$ such that $q|_{D}=0$ , then $q=0$ .

Proof.

It is well known, cf., e.g., [12], that the dG( $r$ ) discretization of a homogeneous problem coincides with the corresponding subdiagonal Padé approximation scheme. Therefore, there is a rational function $f_{r}=a_{r}/b_{r}$ with polynomials $a_{r}\in\mathbb{P}_{r}$ , $b_{r}\in\mathbb{P}_{r+1}$ and $b_{r}(s)\neq 0$ for $s\in\mathbb{R}_{+}$ , such that

[TABLE]

By the assumption of the lemma we have $u_{k,M}^{-}=u_{k}(T)=0$ .

(1)

For $r=0$ we have $f_{0}(s)=\frac{1}{1+s}$ and therefore

[TABLE]

which implies $u_{k,M-1}^{-}=0$ . Similarly, we obtain $u_{k,m}^{-}=0$ for all $m=2,3,\dots M$ and consequently $q=0$ .

(2)

For $r>0$ we argue differently. We consider the eigenvalues $0<\lambda_{1}\leq\lambda_{2}\leq\lambda_{3}\dots$ of $-\Delta$ and the corresponding system of eigenfunctions $w_{1},w_{2},\dots$ with $(w_{i},w_{j})=\delta_{ij}$ . The initial condition $q\in\mathcal{M}(\Omega)\subset H^{-2}(\Omega)$ can be expanded as

[TABLE]

with the convergence understood in $H^{-2}(\Omega)$ . We define the polynomials

[TABLE]

With this notation we have

[TABLE]

and consequently $A_{k}(-\Delta)q=0$ . This results in

[TABLE]

and therefore

[TABLE]

Assume now that $q\neq 0$ . Since $A_{k}\in{\mathbb{P}}_{rM}$ has no more than $rM$ positive zeros, there are only finitely many $q_{n}$ with $q_{n}\neq 0$ . For this reason we have that the expansion (24) is a finite sum and therefore $q\in H^{2}(\Omega)\cap H^{1}_{0}(\Omega)$ since $w_{n}\in H^{2}(\Omega)\cap H^{1}_{0}(\Omega)$ for every $n$ by convexity of $\Omega$ . We have with some $R\in\mathbb{N}$

[TABLE]

From $q\in H^{2}(\Omega)$ and $q|_{D}=0$ we obtain that $(-\Delta)^{l}q$ also vanishes on $D$ for every $l\in\mathbb{N}$ . Therefore, we have

[TABLE]

and dividing by $\lambda_{n_{R}}^{l}$ we have

[TABLE]

For $l\to\infty$ all summands with $\lambda_{n_{i}}<\lambda_{n_{R}}$ converge to zero resulting in

[TABLE]

This $w\neq 0$ is an eigenfunction of $-\Delta$ , which provides a contradiction, since a nontrivial eigenfunction can not vanish on an open set by the unique continuation principle, see, e.g., [21, p. 64]. This completes the proof.

∎
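The distinction between $r=0$ and $r>0$ in the lemma above can be seen on a single eigenmode. The following sketch propagates one modal coefficient through $M$ time steps by multiplying with $f_{r}(k\lambda)$ . The formula $f_{0}(s)=\frac{1}{1+s}$ is the one used in the proof. The explicit $(1,2)$ Padé formula for $r=1$ and the helper names `dg0_step`, `dg1_step`, `final_value` are assumptions introduced only for this illustration.

```python
def dg0_step(s):
    """Step function f_0(s) = 1 / (1 + s) of dG(0); it has no zero for s >= 0."""
    return 1.0 / (1.0 + s)


def dg1_step(s):
    """Assumed step function of dG(1): the (1,2) Pade approximant of exp(-s).

    Its numerator 1 - s / 3 vanishes at s = 3.
    """
    return (1.0 - s / 3.0) / (1.0 + 2.0 * s / 3.0 + s * s / 6.0)


def final_value(step, k_lambda, M, v0=1.0):
    """Modal coefficient at t = T after M steps, starting from v0."""
    v = v0
    for _ in range(M):
        v *= step(k_lambda)
    return v


# A mode with k * lambda = 3 is annihilated by dG(1) but not by dG(0).
print("dG(0):", final_value(dg0_step, 3.0, 5))
print("dG(1):", final_value(dg1_step, 3.0, 5))
```

Under these assumptions a nonzero initial datum consisting of a single eigenfunction with $k\lambda=3$ has vanishing dG(1) state at time $T$ . Such a datum does not vanish on any open set, so this is consistent with the second statement of the lemma, which requires $q|_{D}=0$ in addition to $u_{k}(T)=0$ .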

{thrm}

The solution $\bar{q}_{k}\in\mathcal{M}(\Omega)$ of (23) is unique.

Proof.

We first observe the uniqueness of $\bar{u}_{k}(T)$ by the strict convexity of the tracking term in $j_{k}(q)$ . It remains to show that this implies the uniqueness of $\bar{q}_{k}$ . Assume there are two optimal controls $\bar{q}_{k,1}$ and $\bar{q}_{k,2}$ and consider the difference $q:=\bar{q}_{k,1}-\bar{q}_{k,2}\in\mathcal{M}(\Omega)$ . Let $w_{k}=u_{k}(\bar{q}_{k,1})-u_{k}(\bar{q}_{k,2})$ , i.e. $w_{k}=u_{k}(q)$ . Then there holds $w_{k}(T)=0$ . In the case $r=0$ we immediately obtain $q=0$ by the first statement of Lemma 4.1. For $r>0$ we obtain from Corollary 4.1 that $\operatorname{supp}\bar{q}_{k,i}\subset\Omega_{0}$ with $\bar{\Omega}_{0}\subset\Omega$ and therefore $\operatorname{supp}q\subset\Omega_{0}$ . This implies the existence of an open set $D\subset\Omega\setminus\bar{\Omega}_{0}$ with $q|_{D}=0$ . Then we obtain $q=0$ from the second statement of Lemma 4.1. ∎

4.2. Space-time discretization

For a given control $q\in\mathcal{M}(\Omega)$ we also introduce the associated fully discrete state $u_{kh}=u_{kh}(q)\in X^{r,1}_{k,h}$ by

[TABLE]

the fully discrete control-to-state mapping $S_{kh}\colon\mathcal{M}(\Omega)\to L^{2}(\Omega)$ by $S_{kh}(q)=u_{kh}(q)(T)$ , and the fully discrete reduced cost functional $j_{kh}\colon\mathcal{M}(\Omega)\to\mathbb{R}$ by

[TABLE]

Based on this definition we formulate the corresponding optimal control problem, where we first look for the control variable in the whole space $\mathcal{M}(\Omega)$ . This leads to the following formulation.

[TABLE]

One cannot expect that this problem has a unique solution. For $r=0$ , however, there is a unique solution in the properly defined discrete subspace $\mathcal{M}_{h}$ of $\mathcal{M}(\Omega)$ , see the discussion below. To introduce the space $\mathcal{M}_{h}$ , let ${\mathcal{N}}_{h}$ be the set of all interior nodes of the mesh $\mathcal{T}$ . For $x_{i}\in{\mathcal{N}}_{h}$ let $\delta_{x_{i}}\in\mathcal{M}(\Omega)$ denote the Dirac measure concentrated in $x_{i}$ and $\varphi_{h,i}\in V_{h}$ be the nodal basis function associated to the node $x_{i}$ . Then we define the space $\mathcal{M}_{h}$ as

[TABLE]

and introduce a projection operator $\Lambda_{h}\colon\mathcal{M}(\Omega)\to\mathcal{M}_{h}$ (cf., e.g., [5]) by

[TABLE]

The definition implies that

[TABLE]

where $i_{h}\colon C_{0}(\Omega)\to V_{h}$ is the nodal interpolation operator. The following two properties of $\Lambda_{h}$ can be directly checked. {lmm} There holds

(a)

$\lVert\Lambda_{h}q\rVert_{\mathcal{M}(\Omega)}\leq\lVert q\rVert_{\mathcal{M}(\Omega)}$ for all $q\in\mathcal{M}(\Omega)$ .

(b)

The fully discrete solutions of the state equation associated with $q$ and with $\Lambda_{h}q$ are the same, i.e.

[TABLE]

Proof.

The proof of (a) follows from [6, Thm. 3.1] and the proof of (b) uses the definition (25) of $u_{kh}$ and (27). ∎
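The action of $\Lambda_{h}$ on a finite sum of Diracs can be sketched in one space dimension. The code below assumes that $\Lambda_{h}q=\sum_{i}\langle q,\varphi_{h,i}\rangle\delta_{x_{i}}$ , which is the usual form of this projection in the literature cited above; the uniform mesh of $(0,1)$ with $n$ cells and the helper names `hat`, `project`, `pair_with_fe_function` are hypothetical and serve only as an illustration.

```python
def hat(i, x, h):
    """Nodal basis function of the node x_i = i * h on a uniform mesh of (0, 1)."""
    return max(0.0, 1.0 - abs(x - i * h) / h)


def project(diracs, n):
    """Coefficients of Lambda_h q at the interior nodes x_1, ..., x_{n-1}.

    `diracs` is a list of (position, coefficient) pairs representing
    q = sum_j beta_j * delta_{y_j}; the mesh has n cells of size h = 1 / n.
    Assumption: Lambda_h q = sum_i <q, phi_{h,i}> delta_{x_i}.
    """
    h = 1.0 / n
    return [sum(beta * hat(i, y, h) for y, beta in diracs) for i in range(1, n)]


def pair_with_fe_function(diracs, nodal_values, n):
    """Evaluate <q, phi_h> for phi_h = sum_i nodal_values[i - 1] * phi_{h,i}."""
    h = 1.0 / n
    total = 0.0
    for y, beta in diracs:
        phi_h_at_y = sum(c * hat(i, y, h) for i, c in enumerate(nodal_values, start=1))
        total += beta * phi_h_at_y
    return total


q = [(0.23, 1.5), (0.61, -0.7)]  # two Diracs located between mesh nodes
coefficients = project(q, 10)
print("nodal coefficients of Lambda_h q:", coefficients)
print("norm of q:", sum(abs(beta) for _, beta in q))
print("norm of Lambda_h q:", sum(abs(c) for c in coefficients))
```

In this sketch each Dirac lying strictly between two nodes is replaced by two Diracs at the neighbouring nodes, the total variation does not increase, and the pairing with any finite element function is unchanged, which is the mechanism behind property (b).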

The next theorem provides the existence of a solution to (26). {thrm} There exists a solution of (26). For each solution $\tilde{q}_{kh}\in\mathcal{M}(\Omega)$ the projection $\bar{q}_{kh}=\Lambda_{h}\tilde{q}_{kh}\in\mathcal{M}_{h}$ is also a solution of (26). For $r=0$ the solution $\bar{q}_{kh}\in\mathcal{M}_{h}$ is unique. For any solution $\bar{q}_{kh}\in\mathcal{M}_{h}$ and the corresponding state $\bar{u}_{kh}$ the following estimates hold

[TABLE]

Proof.

The existence and the estimates follow as on the continuous level. The fact that $\bar{q}_{kh}=\Lambda_{h}\tilde{q}_{kh}\in\mathcal{M}_{h}$ is also a solution of (26) follows directly from Lemma 27. The uniqueness in the case of $r=0$ follows from the fully discrete analog of the first statement of Lemma 4.1, cf. also the proof of [7, Theorem 4.8]. ∎

{rmrk}

For $r>0$ it seems that problem (26) may in general have multiple solutions in $\mathcal{M}_{h}$ . The argument we used to prove uniqueness of the semidiscrete solution $\bar{q}_{k}$ is based on the second statement of Lemma 4.1, which does not extend to the fully discrete setting.

In the next theorem we state the optimality system on the fully discrete level. {thrm} The control $\bar{q}_{kh}\in\mathcal{M}_{h}$ is a solution of (26) in $\mathcal{M}_{h}$ if and only if the triple $(\bar{q}_{kh},\bar{u}_{kh},\bar{z}_{kh})$ fulfills the following conditions:

•

fully discrete state equation, $\bar{u}_{kh}=u_{kh}(\bar{q}_{kh})\in X^{r,1}_{k,h}$ in the sense of (25).

•

fully discrete adjoint equation for $\bar{z}_{kh}\in X^{r,1}_{k,h}$ being the solution of

[TABLE]

•

variational inequality

[TABLE]

Proof.

The proof is the same as for the continuous problem. ∎

{rmrk}

Please note that the variational inequality in the above theorem holds for all variations $q\in\mathcal{M}(\Omega)$ and not only for those from $\mathcal{M}_{h}$ . This is due to the fact that the solution $\bar{q}_{kh}\in\mathcal{M}_{h}$ solves the problem (26), where the control is not discretized, see Theorem 4.2.

{crllr}

Let $\bar{q}_{kh}\in\mathcal{M}_{h}$ be a solution of (26), $\bar{u}_{kh}\in X^{r,1}_{k,h}$ be the corresponding state and $\bar{z}_{kh}\in X^{r,1}_{k,h}$ the corresponding adjoint state. Then there hold

(a)

a bound for the adjoint state $\bar{z}_{kh,0}^{+}$

[TABLE]

(b)

a support condition for the positive and the negative parts in the Jordan decomposition of $\bar{q}_{kh}=\bar{q}^{+}_{kh}-\bar{q}^{-}_{kh}$

[TABLE]

Moreover there is a subdomain $\Omega_{0}$ independent of $k$ and $h$ with $\bar{\Omega}_{0}\subset\Omega$ such that $\operatorname{supp}\bar{q}_{kh}\subset\Omega_{0}.$

Proof.

The proof is the same as for the continuous problem. ∎

5. General error estimates for the optimal control problem

In this section we prove an error estimate for the error between the optimal state on the continuous and on the discrete level, which does not require any further assumptions on the structure of the solution.

As the first step we provide an estimate for the error in the state at terminal time for a given control $q\in\mathcal{M}(\Omega)$ .

{lmm}

Let $q\in\mathcal{M}(\Omega)$ be a given control with $\operatorname{supp}q\subset\Omega_{0}$ and $\bar{\Omega}_{0}\subset\Omega$ . Let $u=u(q)$ be the solution of the state equation (1), $u_{k}=u_{k}(q)\in X_{k}^{r}$ be the semidiscrete approximation (22) and $u_{kh}=u_{kh}(q)\in X^{r,1}_{k,h}$ the fully discrete approximation (25). Then there hold

[TABLE]

and

[TABLE]

where $\ell_{kh}=\ln{\frac{T}{k}}+\lvert\ln{h}\rvert$ .

Proof.

To prove the first estimate we consider the solution $y\in W(0,T)$ of the dual problem

[TABLE]

and its semidiscrete approximation $y_{k}\in\widehat{X}_{k}^{r}$ solving

[TABLE]

There holds

[TABLE]

where in the last step we used Theorem 3.2 for the error $y(0)-y_{k,0}^{+}$ in the $L^{\infty}(\Omega)$ norm, see also Remark 3.2.

For the proof of the spatial estimate we consider the dual solution $w_{k}\in\widehat{X}_{k}^{r}$ solving

[TABLE]

and $w_{kh}\in X^{r,1}_{k,h}$ solving

[TABLE]

Then we get

[TABLE]

where we used the fact that $\operatorname{supp}q\subset\Omega_{0}$ and Theorem 3.2 in the last step.

∎

{rmrk}

Please note that the assumption $\operatorname{supp}q\subset\Omega_{0}$ with $\bar{\Omega}_{0}\subset\Omega$ in the above lemma is required only for the spatial estimate.

Based on this theorem we can directly obtain estimates for optimal values of the cost functional.

{thrm}

Let $\bar{q}\in\mathcal{M}(\Omega)$ be the optimal solution of (7) with the corresponding optimal state $\bar{u}$ . Let $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the optimal solution of the semidiscrete problem (23) with the corresponding state $\bar{u}_{k}\in X_{k}^{r}$ and let $\bar{q}_{kh}\in\mathcal{M}_{h}$ be a solution of the fully discrete problem (26) with the corresponding state $\bar{u}_{kh}\in X^{r,1}_{k,h}$ . Then there hold:

[TABLE]

and

[TABLE]

where $\ell_{kh}=\ln{\frac{T}{k}}+\lvert\ln{h}\rvert$ and $C=C(T,u_{d})$ depends on $T$ and $\lVert u_{d}\rVert_{L^{2}(\Omega)}$ .

Proof.

By the optimality of $\bar{q}$ for (7) we have

[TABLE]

Similarly, by the optimality of $\bar{q}_{k}$ for (23) we have

[TABLE]

and therefore

[TABLE]

For both $q=\bar{q}$ and $q=\bar{q}_{k}$ we estimate

[TABLE]

Then using the first estimate from Lemma 5, the estimates

[TABLE]

as well as estimates for $\bar{q}$ and $\bar{q}_{k}$ from Theorem 2 and Theorem 4.1 we complete the proof for the temporal error. The spatial error is estimated similarly by using the second estimate from Lemma 5. ∎

The next theorem is the main result of this section, which provides an estimate for the error between the optimal states.

{thrm}

Let $\bar{q}\in\mathcal{M}(\Omega)$ be the optimal solution of (7) with the corresponding optimal state $\bar{u}$ . Let $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the optimal solution of the semidiscrete problem (23) with the corresponding state $\bar{u}_{k}\in X_{k}^{r}$ and let $\bar{q}_{kh}\in\mathcal{M}_{h}$ be a solution of the fully discrete problem (26) with the corresponding state $\bar{u}_{kh}\in X^{r,1}_{k,h}$ . Then there hold:

[TABLE]

and

[TABLE]

where $\ell_{kh}=\ln{\frac{T}{k}}+\lvert\ln{h}\rvert$ and $C=C(T,\alpha,u_{d})$ depends on $T$ , $\alpha$ and $\lVert u_{d}\rVert_{L^{2}(\Omega)}$ .

Proof.

To prove the first estimate we use the variational inequality from Theorem 2 with $q=\bar{q}_{k}$

[TABLE]

and the corresponding variational inequality from Theorem 4.1 with $q=\bar{q}$

[TABLE]

Adding these two inequalities results in

[TABLE]

To proceed we introduce $\hat{u}_{k}=u_{k}(\bar{q})\in X_{k}^{r}$ as the solution of (22) for $q=\bar{q}$ and $\hat{z}_{k}\in X_{k}^{r}$ fulfilling

[TABLE]

Using the semidiscrete state and adjoint equations we obtain

[TABLE]

This results in

[TABLE]

By the Cauchy-Schwarz inequality in the last term and absorbing $\lVert(\bar{u}-\bar{u}_{k})(T)\rVert_{L^{2}(\Omega)}$ in the left-hand side we obtain

[TABLE]

Using the estimates for $\lVert\bar{q}\rVert_{\mathcal{M}(\Omega)}$ and $\lVert\bar{q}_{k}\rVert_{\mathcal{M}(\Omega)}$ from Theorem 2 and Theorem 4.1 we get

[TABLE]

For the term $\lVert\bar{z}(0)-\hat{z}_{k,0}^{+}\rVert_{L^{\infty}(\Omega)}$ we can directly apply Theorem 3.2 resulting in

[TABLE]

The term $\lVert(\bar{u}-\hat{u}_{k})(T)\rVert_{L^{2}(\Omega)}$ is estimated by the first estimate in Lemma 5 leading to

[TABLE]

Putting these estimates together we obtain

[TABLE]

which is the first desired estimate. The estimate for $(\bar{u}_{k}-\bar{u}_{kh})(T)$ is obtained similarly using Theorem 3.2 and the second estimate from Lemma 5. ∎

For the error in the control we can in general only expect a weak star convergence, see the following lemma. {lmm} Let $\bar{q}\in\mathcal{M}(\Omega)$ be the optimal solution of (7), $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the optimal solution of the semidiscrete problem (23), and $\bar{q}_{kh}\in\mathcal{M}_{h}$ be a solution of the fully discrete problem (26). Then there holds

[TABLE]

and for fixed $k>0$

[TABLE]

Proof.

The proof is similar to the proof of [7, Theorem 4.10]. ∎
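The gap between weak star convergence and convergence in the norm of $\mathcal{M}(\Omega)$ is already visible for a single moving Dirac. The following sketch, with hypothetical helper names `pair`, `total_variation`, `difference` and an arbitrarily chosen test function `phi`, compares $\delta_{x}$ with $\delta_{x_{n}}$ for $x_{n}\to x$ in one space dimension: the duality pairing with a fixed continuous function tends to zero, while the norm of the difference stays equal to $2$ .

```python
import math


def phi(x):
    """A fixed continuous test function vanishing on the boundary of (0, 1)."""
    return math.sin(math.pi * x)


def pair(diracs, test_function):
    """Duality pairing <q, phi> for q = sum_j beta_j * delta_{y_j}."""
    return sum(beta * test_function(y) for y, beta in diracs)


def total_variation(diracs):
    """Norm in M(Omega) of a finite sum of Diracs (equal positions are merged)."""
    merged = {}
    for y, beta in diracs:
        merged[y] = merged.get(y, 0.0) + beta
    return sum(abs(beta) for beta in merged.values())


def difference(x, x_n):
    """The measure delta_x - delta_{x_n} as a list of (position, coefficient)."""
    return [(x, 1.0), (x_n, -1.0)]


for n in (10, 100, 1000):
    q = difference(0.5, 0.5 + 1.0 / n)
    print(n, abs(pair(q, phi)), total_variation(q))
```

This is why error rates for the controls are formulated below in terms of positions and coefficients of the Diracs rather than in the norm of $\mathcal{M}(\Omega)$ .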

Under an additional assumption stronger results are discussed in the next section.

6. Improved error estimates for the optimal state and control

In the previous section we provided error estimates for the error in the cost functional and for the optimal states at the terminal time. In general we can not expect an error estimate for the control, $\bar{q}-\bar{q}_{kh}$ , with respect to the norm in $\mathcal{M}(\Omega)$ , since only weak star convergence of the controls can be expected, cf. the corresponding discussion in [7]. However, if the optimal control consists of finitely many Diracs, error rates for the positions and the coefficients of these Diracs can be shown. To prove such error estimates and to improve the estimate for the state from Theorem 5 we make the following assumption.

Assumption 1.

Let $\bar{q}$ be the solution of problem (7) with the corresponding optimal state $\bar{u}$ and adjoint state $\bar{z}$ . We assume that

[TABLE]

with $K\in\mathbb{N}$ and pairwise distinct points $\bar{x}_{i}\in\Omega$ , $i=1,2,\dots,K$ . Moreover, there holds

[TABLE]

and

[TABLE]

where $\nabla^{2}\bar{z}(0,\bar{x}_{i})$ denotes the Hessian matrix of $\bar{z}$ with respect to the spatial variable.

{rmrk}

•

From Corollary 2 (b) we have that

[TABLE]

Here, we assume equality of these two sets and the finite cardinality of them.

•

Due to the fact $\lvert\bar{z}(0,x)\rvert\leq\alpha$ by Corollary 2 (a), the points $\bar{x}_{i}$ with $\bar{z}(0,\bar{x}_{i})=-\alpha$ are the minimizers and the points $\bar{x}_{i}$ with $\bar{z}(0,\bar{x}_{i})=\alpha$ are the maximizers of $\bar{z}(0)$ . Therefore, we have $\nabla\bar{z}(0,\bar{x}_{i})=0$ and the corresponding Hessian matrices are positive semidefinite in the former and negative semidefinite in the latter case. In addition we assume positive and negative definiteness, respectively. This assumption corresponds to sufficient second order optimality conditions for minimizers and maximizers of $\bar{z}(0)$ .

•

Similar assumptions can be found in the literature, see [25, 32] in the context of semi-infinite programming and the notion of non-degeneracy in super-resolution [10].

Under the above assumption the optimal control $\bar{q}$ consists of finitely many Diracs and has the form

[TABLE]

with $\bar{\beta}=\{\bar{\beta}_{i}\}\in\mathbb{R}^{K}$ , where $\bar{\beta}_{i}>0$ for $\bar{x}_{i}$ with $\bar{z}(0,\bar{x}_{i})=-\alpha$ and $\bar{\beta}_{i}<0$ for $\bar{x}_{i}$ with $\bar{z}(0,\bar{x}_{i})=\alpha$ .

6.1. Error estimates for the temporal error

We first prove that under Assumption 1 the structure of the semidiscrete control $\bar{q}_{k}$ is similar to that of $\bar{q}$ (30). To this end we first show that the Hessian matrix of the discrete adjoint state $\bar{z}_{k}$ has the same definiteness properties as that of the continuous adjoint state $\bar{z}$ in the neighborhoods of the points $\bar{x}_{i}$ . {lmm} Let Assumption 1 be fulfilled. Let $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the optimal solution of the semidiscrete problem (23) with the corresponding state $\bar{u}_{k}\in X_{k}^{r}$ and the adjoint state $\bar{z}_{k}\in X_{k}^{r}$ . Then there exist $\varepsilon>0$ , $k_{0}>0$ , and $\gamma>0$ such that

[TABLE]

for all $x\in B_{\varepsilon}(\bar{x}_{i})$ and all $k\leq k_{0}$ for $\bar{x}_{i}$ with $\bar{z}(0,\bar{x}_{i})=-\alpha$ , where $\lambda_{\min}(\cdot)$ denotes the smallest eigenvalue of the corresponding matrix. Similarly,

[TABLE]

for all $x\in B_{\varepsilon}(\bar{x}_{i})$ and all $k\leq k_{0}$ for $\bar{x}_{i}$ with $\bar{z}(0,\bar{x}_{i})=\alpha$ .

Proof.

We consider $\bar{x}_{i}$ with $\bar{z}(0,\bar{x}_{i})=-\alpha$ . The Hessian matrix $\nabla^{2}\bar{z}(0,\bar{x}_{i})$ is positive definite by Assumption 1. Moreover $\bar{z}(0)\in C^{2}(\Omega_{0})$ by Lemma 2. Therefore, there exists a neighborhood $B_{\varepsilon}(\bar{x}_{i})$ such that $\nabla^{2}\bar{z}(0,x)$ is uniformly positive definite for $x\in B_{\varepsilon}(\bar{x}_{i})$ . It remains to prove that

[TABLE]

There holds

[TABLE]

by the embedding $H^{4}(\Omega_{0})\hookrightarrow C^{2}(\Omega_{0})$ , the interior regularity result [13, Chapter 6.3, Theorem 2] and the convexity of $\Omega$ . To proceed we insert $\hat{z}_{k}\in X_{k}$ defined by (28) leading to

[TABLE]

The first term is directly estimated by Lemma 7 (below) with $j=2$ resulting in

[TABLE]

and for the second term we have by the smoothing estimate Lemma 3.1

[TABLE]

where in the last step we used the first estimate from Theorem 5. This completes the proof. ∎

{lmm}

Let Assumption 1 be fulfilled. Let $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the optimal solution of the semidiscrete problem (23) with the corresponding state $\bar{u}_{k}\in X_{k}^{r}$ and the adjoint state $\bar{z}_{k}\in X_{k}^{r}$ . Then there is an $\varepsilon>0$ and $k_{0}>0$ such that the neighborhoods $B_{\varepsilon}(\bar{x}_{i})$ are pairwise disjoint and for each $i$ and $k\leq k_{0}$ there is a unique $\bar{x}_{k,i}\in B_{\varepsilon}(\bar{x}_{i})$ such that

[TABLE]

and

[TABLE]

Moreover there are no further points $x\in\Omega$ with $\bar{z}_{k,0}^{+}(x)=\pm\alpha$ and the semidiscrete control has the structure

[TABLE]

with $\bar{\beta}_{k}=\{\bar{\beta}_{k,i}\}\in\mathbb{R}^{K}$ , where $\bar{\beta}_{k,i}>0$ for $\bar{x}_{k,i}$ with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=-\alpha$ and $\bar{\beta}_{k,i}<0$ for $\bar{x}_{k,i}$ with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=\alpha$ .

Proof.

First we choose $\varepsilon>0$ such that the statement of Lemma 6.1 are fulfilled for all $i$ and the balls $\bar{B}_{\varepsilon}(\bar{x}_{i})$ are pairwise disjoint. Let $i$ be fixed with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=-\alpha$ . The case of $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=\alpha$ is discussed in the same fashion. From Lemma 5 we have $\bar{q}_{k}\overset{\ast}{\rightharpoonup}\bar{q}$ in $\mathcal{M}(\Omega)$ . We choose a smooth cut-off function $\omega$ with $\omega(x)=1$ for all $x\in B_{\varepsilon/2}(\bar{x}_{i})$ and with $\operatorname{supp}\omega\subset B_{\varepsilon}(\bar{x}_{i})$ . From the weak star convergence we obtain

[TABLE]

Therefore, there exists $k_{0}>0$ such that $\langle\bar{q}_{k},\omega\rangle>0$ for all $k\leq k_{0}$ , which proves that $\operatorname{supp}\bar{q}_{k}\cap B_{\varepsilon}(\bar{x}_{i})$ is not empty. The support condition for $\bar{q}_{k}$ from Corollary 4.1 implies the existence of at least one $\bar{x}_{k,i}\in B_{\varepsilon}(\bar{x}_{i})$ with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=-\alpha$ . By Lemma 6.1 $\bar{z}_{k,0}^{+}$ is strictly convex on $B_{\varepsilon}(\bar{x}_{i})$ . This implies the uniqueness of the minimizer $\bar{x}_{k,i}$ in $B_{\varepsilon}(\bar{x}_{i})$ . In order to show that there are no further points $x$ with $\bar{z}_{k,0}^{+}(x)=-\alpha$ in the complement of the union of all $B_{\varepsilon}(\bar{x}_{i})$ , it is sufficient to show that $\lVert\bar{z}(0)-\bar{z}_{k,0}^{+}\rVert_{L^{\infty}(\Omega)}\to 0$ for $k\to 0$ . We have as in the proof of the previous lemma

[TABLE]

where $\hat{z}_{k}\in X_{k}$ is defined by (28). For the first term we obtain by Theorem 3.2

[TABLE]

and for the second one

[TABLE]

where in the last step we used the first estimate from Theorem 5. This completes the proof. ∎

The main result of this section provides optimal order estimates for the error in the position $\bar{x}_{i}-\bar{x}_{k,i}$ , the coefficients $\bar{\beta}_{i}-\bar{\beta}_{k,i}$ and improves the first estimate from Theorem 5 for the state error $\lVert(\bar{u}-\bar{u}_{k})(T)\rVert_{L^{2}(\Omega)}$ .

{thrm}

Let $\bar{q}$ be the solution of problem (7) with the corresponding optimal state $\bar{u}$ and the adjoint state $\bar{z}$ and let Assumption 1 be fulfilled. Let moreover $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the optimal solution of the semidiscrete problem (23) with the corresponding state $\bar{u}_{k}\in X_{k}^{r}$ and the adjoint state $\bar{z}_{k}\in X_{k}^{r}$ . Then there exists $k_{0}>0$ such that for $k\leq k_{0}$ there hold

(a)

[TABLE]

(b)

[TABLE]

(c)

[TABLE]

(d)

[TABLE]

where $C=C(T,\alpha,u_{d})$ depends on $T$ , $\alpha$ and $\lVert u_{d}\rVert_{L^{2}(\Omega)}$ .

{rmrk} From the equivalence relation in (LABEL:eqofwinfkr) we directly infer

[TABLE]

from statement (d) above.

To prepare the proof of Theorem 6.1 we first estimate the error in the position $\bar{x}_{i}-\bar{x}_{k,i}$ and in the coefficients $\bar{\beta}_{i}-\bar{\beta}_{k,i}$ in terms of the state error $\lVert(\bar{u}-\bar{u}_{k})(T)\rVert_{L^{2}(\Omega)}$ .

{lmm}

Let Assumption 1 be fulfilled. Then there exists $k_{0}>0$ such that for $k\leq k_{0}$ there holds

[TABLE]

for all $i=1,2,\dots,K$ , where $C=C(T,u_{d})$ depends on $T$ and $\lVert u_{d}\rVert_{L^{2}(\Omega)}$ .

Proof.

For a fixed $i\in\{1,2,\dots,K\}$ we assume without restriction that $\bar{z}(\bar{x}_{i})=-\alpha$ (the case $\bar{z}(\bar{x}_{i})=\alpha$ can be treated similarly). Then we have that $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=-\alpha$ by Lemma 6.1. The point $\bar{x}_{i}$ is a minimizer of $\bar{z}(0)$ and the point $\bar{x}_{k,i}$ is a minimizer of $\bar{z}_{k,0}^{+}$ . Therefore, there holds

[TABLE]

Due to the fact that $\nabla^{2}\bar{z}(0)$ is uniformly positive definite on $B_{\varepsilon}(\bar{x}_{i})$ and $\bar{x}_{k,i}\in B_{\varepsilon}(\bar{x}_{i})$ we have

[TABLE]

and therefore

[TABLE]

where $\hat{z}_{k,0}^{+}\in X_{k}^{r}$ is the solution of the intermediate discrete adjoint equation (28) and $\Omega_{0}$ is an interior subdomain with

[TABLE]

By Theorem 3.2 we have

[TABLE]

and by the smoothing property from Lemma 3.1

[TABLE]

This completes the proof. ∎

To proceed we introduce the operators $\widetilde{G},\widetilde{G}^{k},\widetilde{G}_{k}\colon\mathbb{R}^{K}\to L^{2}(\Omega)$ by

[TABLE]

where $S$ and $S_{k}$ are the continuous and the semidiscrete solution operators defined above. Moreover we restrict the codomains of these operators to the corresponding image sets and call the resulting operators $G$ , $G^{k}$ and $G_{k}$ with

[TABLE]

and

[TABLE]

In the next lemma we estimate the errors between these operators. {lmm} Let Assumption 1 be fulfilled. There hold

[TABLE]

and

[TABLE]

Proof.

For given $\beta\in\mathbb{R}^{K}$ we consider $q,q^{k}\in\mathcal{M}(\Omega)$ defined by

[TABLE]

as well as the corresponding states $u=u(q)$ , $u^{k}=u(q^{k})$ in the sense of Proposition 2 and $u_{k}=u_{k}(q^{k})\in\widetilde{X}_{k}^{r}$ in the sense of (22). The second statement is then directly given by Lemma 5, since

[TABLE]

To prove the first statement we consider a dual problem for $y\in W(0,T)$ solving

[TABLE]

and obtain

[TABLE]

where in the last step we used the smoothing estimate from Lemma 3.1 for $y$ . This completes the proof. ∎

{lmm}

Let Assumption 1 be fulfilled. The operators $G$ , $G^{k}$ are bijective and there is a constant $c>0$ such that

[TABLE]

hold for all $\beta\in\mathbb{R}^{K}$ . Moreover, there is $k_{0}>0$ such that $G_{k}$ is bijective and the estimate

[TABLE]

holds for all $\beta\in\mathbb{R}^{K}$ and all $k\leq k_{0}$ .

Proof.

All three operators $G$ , $G^{k}$ and $G_{k}$ are surjective by definition. We first argue the injectivity of $G$ . Let $G(\beta)=0$ for some $\beta\in\mathbb{R}^{K}$ . This means that for the solution $v$ of the heat equation with the initial condition given as the corresponding linear combination of Diracs, i.e. $v=S\left(\sum_{i=1}^{K}\beta_{i}\delta_{\bar{x}_{i}}\right)$ , we have $v(T)=0$ . By a similar argument as in the proof of uniqueness of the optimal control $\bar{q}$ , cf. [7, Theorem 2.4], we obtain

[TABLE]

This results in $\beta=0$ by the fact that the points $\bar{x}_{i}$ are pairwise distinct. This provides the existence of an inverse mapping $G^{-1}\colon\operatorname{Im}(\widetilde{G})\subset L^{2}(\Omega)\to\mathbb{R}^{K}$ and the estimate

[TABLE]

holds. For the operator $G^{k}$ we can argue similarly. It remains to show that $G_{k}$ is bijective and $\lVert G_{k}^{-1}\rVert_{L^{2}(\Omega)\to\mathbb{R}^{K}}$ is bounded independently of $k$ . Let $\beta\in\mathbb{R}^{K}$ be arbitrary. There holds by (32) and Lemma 6.1

[TABLE]

where in the last step we used Lemma 6.1 and Theorem 5. Choosing $k_{0}$ small enough we obtain

[TABLE]

which completes the proof. ∎

{lmm}

Let Assumption 1 be fulfilled. Then there exists $k_{0}>0$ such that for $k\leq k_{0}$ there holds

[TABLE]

where $C=C(T,u_{d})$ depends on $T$ and $\lVert u_{d}\rVert_{L^{2}(\Omega)}$ .

Proof.

There holds by Lemma 6.1

[TABLE]

By the definition we have $G(\bar{\beta})=\bar{u}(T)$ and $G_{k}(\bar{\beta}_{k})=\bar{u}_{k}(T)$ . Using Lemma 6.1 and Lemma 6.1 we obtain

[TABLE]

The fact that

[TABLE]

completes the proof. ∎

The previous lemmas allow us to obtain the corresponding estimate for a negative norm of $\bar{q}-\bar{q}_{k}$ in terms of $\lVert(\bar{u}-\bar{u}_{k})(T)\rVert_{L^{2}(\Omega)}$ .

{lmm}

Let Assumption 1 be fulfilled. Let $\bar{q}\in\mathcal{M}(\Omega)$ be the solution of (7) and $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the solution of the semidiscrete problem (23). Then there holds

[TABLE]

Proof.

Let $\varphi\in W^{1,\infty}(\Omega)$ with $\lVert\varphi\rVert_{W^{1,\infty}(\Omega)}\leq 1$ . We have to estimate

[TABLE]

Using Lemma 6.1 and Lemma 6.1 we obtain

[TABLE]

which completes the proof. ∎
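One natural way to organize such an estimate is to split the pairing into a coefficient part and a position part, namely $\langle\bar{q}-\bar{q}_{k},\varphi\rangle=\sum_{i}(\bar{\beta}_{i}-\bar{\beta}_{k,i})\varphi(\bar{x}_{i})+\sum_{i}\bar{\beta}_{k,i}\bigl(\varphi(\bar{x}_{i})-\varphi(\bar{x}_{k,i})\bigr)$ , and to use $\lvert\varphi\rvert\leq 1$ and $\lvert\nabla\varphi\rvert\leq 1$ . The sketch below checks the resulting elementary bound numerically in one space dimension. The splitting, the helper names `pairing_gap` and `splitting_bound`, and the sample data are assumptions made for illustration; the precise form of the bound in the lemma is not reproduced here.

```python
import math


def pairing_gap(betas, xs, betas_k, xs_k, test_function):
    """Value of <q - q_k, phi> for two sums of K Diracs with matched indices."""
    continuous = sum(b * test_function(x) for b, x in zip(betas, xs))
    semidiscrete = sum(b * test_function(x) for b, x in zip(betas_k, xs_k))
    return continuous - semidiscrete


def splitting_bound(betas, xs, betas_k, xs_k):
    """Bound |beta - beta_k|_1 + |beta_k|_1 * max_i |x_i - x_{k,i}|.

    Valid for test functions with |phi| <= 1 and Lipschitz constant <= 1.
    """
    coefficient_error = sum(abs(b - bk) for b, bk in zip(betas, betas_k))
    position_error = max(abs(x - xk) for x, xk in zip(xs, xs_k))
    return coefficient_error + sum(abs(bk) for bk in betas_k) * position_error


# Sample data (hypothetical): two Diracs and slightly perturbed copies.
betas, xs = [1.0, -2.0], [0.3, 0.7]
betas_k, xs_k = [1.1, -1.9], [0.31, 0.68]
gap = pairing_gap(betas, xs, betas_k, xs_k, math.sin)  # sin has |phi|, |phi'| <= 1
print(abs(gap), "<=", splitting_bound(betas, xs, betas_k, xs_k))
```

Both terms of the bound are then controlled by the position and coefficient estimates of the preceding lemmas.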

We proceed with the proof of Theorem 6.1.

Proof.

We start with the estimate (29) from the proof of Theorem 5, i.e.

[TABLE]

where $\hat{u}_{k}=u_{k}(\bar{q})\in X_{k}^{r}$ and $\hat{z}_{k}\in X_{k}^{r}$ is the solution of (28). The second term can be estimated as in the proof of Theorem 5 by the first estimate in Lemma 5 leading to

[TABLE]

It remains to estimate the duality product from (33). We have by Lemma 6.1

[TABLE]

where we have used the fact that $\operatorname{supp}\bar{q},\operatorname{supp}\bar{q}_{k}\subset\Omega_{0}$ . Using Theorem 3.2 and Theorem 3.2 we have

[TABLE]

Putting all terms in (33) together we get

[TABLE]

Absorbing $\lVert(\bar{u}-\bar{u}_{k})(T)\rVert_{L^{2}(\Omega)}$ in the left-hand side we obtain the estimate (a) in Theorem 6.1. The estimates (b), (c), and (d) are obtained from (a) using Lemma 6.1, Lemma 6.1 as well as Lemma 6.1. ∎

6.2. Error estimates for the spatial error

For the semidiscretization we have shown (see Lemma 6.1) that the number of support points of the semidiscrete control $\bar{q}_{k}\in\mathcal{M}(\Omega)$ is the same as on the continuous level in Assumption 1. For the fully discrete control $\bar{q}_{kh}\in\mathcal{M}_{h}$ the situation is different. We will show (see next lemma) that in the neighborhood of each support point $\bar{x}_{k,i}$ of $\bar{q}_{k}$ there is at least one support point of $\bar{q}_{kh}$ , but there could be more than one such point. This phenomenon is also observed in our numerical experiments.

{lmm}

Let Assumption 1 be fulfilled. Let $\bar{q}_{k}\in\mathcal{M}(\Omega)$ be the optimal solution of the semidiscrete problem (23) with the corresponding state $\bar{u}_{k}\in X_{k}^{r}$ and the adjoint state $\bar{z}_{k}\in X_{k}^{r}$ . Let $\bar{q}_{kh}\in\mathcal{M}_{h}$ be an optimal solution of the fully discrete problem (26) with the corresponding state $\bar{u}_{kh}\in X^{r,1}_{k,h}$ and the adjoint state $\bar{z}_{kh}\in X^{r,1}_{k,h}$ . Then there is $k_{0}>0$ such that for any fixed $k<k_{0}$ the following holds. There is an $\varepsilon>0$ and $h_{0}>0$ such that the neighborhoods $B_{\varepsilon}(\bar{x}_{k,i})$ are pairwise disjoint and for each $i$ and $h\leq h_{0}$ there is at least one $\bar{x}_{kh,ij}\in B_{\varepsilon}(\bar{x}_{k,i})\cap{\mathcal{N}}_{h}$ such that

[TABLE]

and

[TABLE]

Moreover there are no points $x\in\Omega\setminus\bigcup_{i}B_{\varepsilon}(\bar{x}_{k,i})$ with $\bar{z}_{kh,0}^{+}(x)=\pm\alpha$ .

Proof.

The proof is similar to the proof of Lemma 6.1. ∎

Under the conditions of Lemma 6.2 a fully discrete control $\bar{q}_{kh}$ consists of groups of Dirac functionals for each single Dirac $\delta_{\bar{x}_{k,i}}$ on the semidiscrete level. This means that $\bar{q}_{kh}$ is given as

$$\bar{q}_{kh}=\sum_{i=1}^{K}\sum_{j=1}^{n_{i}}\bar{\beta}_{kh,ij}\,\delta_{\bar{x}_{kh,ij}},$$

where $n_{i}\in{\mathbb{N}}$ describes the cardinality of $\operatorname{supp}\bar{q}_{kh}|_{B_{\varepsilon}(\bar{x}_{k,i})}$ . The cardinality of $\operatorname{supp}\bar{q}_{kh}$ is then $K_{h}=\sum_{i=1}^{K}n_{i}\geq K$ . In order to compare the vector of coefficients $\bar{\beta}_{kh}=\{\bar{\beta}_{kh,ij}\}\in\mathbb{R}^{K_{h}}$ with the vector $\bar{\beta}_{k}\in\mathbb{R}^{K}$ on the semidiscrete level, we define $\hat{\beta}_{kh}\in\mathbb{R}^{K}$ by

$$\hat{\beta}_{kh,i}=\sum_{j=1}^{n_{i}}\bar{\beta}_{kh,ij},\qquad 1\leq i\leq K.$$

The next theorem is the main result of this section. {thrm} Under the conditions of Lemma 6.2 there holds

(a)

[TABLE]

(b)

[TABLE]

for all $1\leq i\leq K$ and $1\leq j\leq n_{i}$ .

(c)

[TABLE]

(d)

[TABLE]

where $\ell_{kh}=\ln{\frac{T}{k}}+\lvert\ln{h}\rvert$ .

{rmrk} As in Remark 6.1, we directly conclude the a priori estimate

[TABLE]

from statement (d) in Theorem 6.2.

To prove Theorem 6.2 we start with the lemma providing a sub-optimal estimate for the distance between the support points of $\bar{q}_{k}$ and $\bar{q}_{kh}$ .

{lmm}

Under the conditions of Lemma 6.2 there is a constant $C>0$ such that for each $\bar{x}_{kh,i}\in B_{\varepsilon}(\bar{x}_{k,i})$ with $\bar{z}_{kh,0}^{+}(\bar{x}_{kh,i})=\pm\alpha$ there holds

[TABLE]

where $\ell_{kh}=\ln{\frac{T}{k}}+\lvert\ln{h}\rvert$ .

Proof.

We consider the Taylor expansion with an appropriate $\xi\in(\bar{x}_{k,i},\bar{x}_{kh,i})$

[TABLE]

where we used that $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=\bar{z}_{kh,0}^{+}(\bar{x}_{kh,i})$ by the previous lemma and $\nabla\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=0$ by the optimality of $\bar{x}_{k,i}$ for $\bar{z}_{k,0}^{+}$ , see Corollary 4.1 and Lemma 6.1. Using uniform definiteness of $\nabla^{2}\bar{z}_{k,0}^{+}$ , see Lemma 6.1, we obtain

[TABLE]

where $\Omega_{0}$ is an interior subdomain with

[TABLE]

see Corollary 4.1 and Corollary 4.2. To proceed we introduce an intermediate discrete adjoint state $\hat{z}_{kh}\in X^{r,1}_{k,h}$ fulfilling

[TABLE]

We obtain

[TABLE]

The first term is estimated by Theorem 3.2 leading to

[TABLE]

and the second term by the smoothing property from Corollary 3.1 and by the second estimate from Theorem 5

[TABLE]

This completes the proof. ∎
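The quadratic-growth step of the proof above can be summarized as follows. This is a hedged restatement assembled from the surrounding text, not a verbatim copy of the displayed equations; the constant $c_{0}>0$ is our own notation for the lower bound supplied by the uniform definiteness of $\nabla^{2}\bar{z}_{k,0}^{+}$ .

```latex
\begin{aligned}
\bar{z}_{k,0}^{+}(\bar{x}_{kh,i})-\bar{z}_{k,0}^{+}(\bar{x}_{k,i})
  &=\tfrac{1}{2}\,(\bar{x}_{kh,i}-\bar{x}_{k,i})^{T}\,
    \nabla^{2}\bar{z}_{k,0}^{+}(\xi)\,(\bar{x}_{kh,i}-\bar{x}_{k,i}),\\
c_{0}\,\lvert\bar{x}_{kh,i}-\bar{x}_{k,i}\rvert^{2}
  &\leq\lvert\bar{z}_{k,0}^{+}(\bar{x}_{kh,i})-\bar{z}_{kh,0}^{+}(\bar{x}_{kh,i})\rvert
   \leq\lVert\bar{z}_{k,0}^{+}-\bar{z}_{kh,0}^{+}\rVert_{L^{\infty}(\Omega_{0})}.
\end{aligned}
```

The first line uses $\nabla\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=0$ , the second replaces $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})$ by $\bar{z}_{kh,0}^{+}(\bar{x}_{kh,i})$ since both equal $\pm\alpha$ . The distance between the support points is therefore controlled only by the square root of the adjoint error, which is why this estimate is sub-optimal.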

For the proof of Theorem 6.2 we introduce a further intermediate adjoint state $\tilde{z}_{k}\in X_{k}^{r}$ defined by

[TABLE]

{lmm}

Let the conditions of Lemma 6.2 be fulfilled and let $\tilde{z}_{k}\in X_{k}^{r}$ be defined by (35). Then for each $\bar{x}_{k,i}$ with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=-\alpha$ there is a minimizer $\tilde{x}_{k,i}\in B_{\varepsilon}(\bar{x}_{k,i})$ of $\tilde{z}_{k,0}^{+}$ and for each $\bar{x}_{k,i}$ with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=\alpha$ there is a maximizer $\tilde{x}_{k,i}\in B_{\varepsilon}(\bar{x}_{k,i})$ of $\tilde{z}_{k,0}^{+}$ . Moreover there holds

[TABLE]

Proof.

Without restriction we assume that $\varepsilon>0$ and $k_{0}>0$ from Lemma 6.2 are chosen such that the statement of Lemma 6.1 holds. We fix $i$ with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=-\alpha$ and introduce two functions $F,F_{h}\colon B_{\varepsilon}(\bar{x}_{k,i})\to\mathbb{R}^{N}$ by

[TABLE]

There holds by the optimality of $\bar{x}_{k,i}$ for $\bar{z}_{k,0}^{+}(x)$ that $F(\bar{x}_{k,i})=0$ and by Lemma 6.1 that $F^{\prime}(\bar{x}_{k,i})=\nabla^{2}\bar{z}_{k,0}^{+}(\bar{x}_{k,i})$ is positive definite. Moreover we have

[TABLE]

and

[TABLE]

by the smoothing property from Lemma 3.1 and the estimate from Theorem 5. In addition $F^{\prime}_{h}$ is Lipschitz continuous on $B_{\varepsilon}(\bar{x}_{k,i})$ with the Lipschitz constant

[TABLE]

where we have used an interior estimate as in the proof of Lemma 3.1. In this setting we can apply [29, Theorem 3.1] to get the existence of $\tilde{x}_{k,i}\in B_{\varepsilon}(\bar{x}_{k,i})$ (for $h<h_{0}$ ) with $F_{h}(\tilde{x}_{k,i})=0$ and a positive definite $F^{\prime}_{h}(\tilde{x}_{k,i})$ , such that

[TABLE]

This completes the proof. ∎

In the next lemma we improve the estimate from Lemma 6.2. {lmm} Let the conditions of Lemma 6.2 be fulfilled. Then there holds for all $1\leq i\leq K$ and all $1\leq j\leq n_{i}$

[TABLE]

Proof.

We fix an $i$ with $\bar{z}_{k,0}^{+}(\bar{x}_{k,i})=-\alpha$ and an index $1\leq j\leq n_{i}$ . For $\tilde{z}_{k}\in X_{k}^{r}$ defined in (35) we observe

[TABLE]

by Theorem 3.2. Due to Corollary 4.1 we have $\bar{z}_{k,0}^{+}(x)\geq-\alpha$ for all $x\in\bar{\Omega}$ and therefore

[TABLE]

Using the Taylor expansion and the fact that $\nabla\tilde{z}_{k,0}^{+}(\tilde{x}_{k,i})=0$ we get with some $\xi\in B_{\varepsilon}(\bar{x}_{k,i})$

[TABLE]

where we have used that $\nabla^{2}\tilde{z}_{k,0}$ is uniformly positive definite on $B_{\varepsilon}(\bar{x}_{k,i})$ by the positive definiteness of $\nabla^{2}\bar{z}_{k,0}$ and (36). This results in

[TABLE]

This completes the proof. ∎

We proceed with the proof of Theorem 6.2.

Proof.

The first statement is already shown in Theorem 5. The second statement follows directly from Lemma 6.2 and Lemma 6.2 by the triangle inequality. It remains to prove statements (c) and (d). To this end we will use the operator $G_{k}$ introduced in (31). Similarly we introduce the operator $G_{k}^{h}\colon\mathbb{R}^{K}\to L^{2}(\Omega)$ by

[TABLE]

Without restriction we assume that $k_{0}>0$ from Lemma 6.2 is chosen such that the statement of Lemma 6.1 holds. Then we obtain similarly to the proof of Lemma 6.1 using $G_{k}(\bar{\beta}_{k})=\bar{u}_{k}(T)$

[TABLE]

The first term is estimated by Theorem 5

[TABLE]

the last term is estimated by Lemma 5 leading to

[TABLE]

To estimate the second term in (37) we observe that

[TABLE]

For a given $\psi\in W^{1,\infty}(\Omega)$ we get for the inner difference using $\hat{\beta}_{kh,i}=\sum_{j=1}^{n_{i}}\bar{\beta}_{kh,ij}$

[TABLE]

Then by a duality argument as in the proof of Lemma 6.1 we obtain

[TABLE]

resulting in

[TABLE]

by the statement (b). Inserting this into (37) completes the proof of the statement (c). To prove statement (d) let $\varphi\in W^{1,\infty}(\Omega)$ with $\lVert\varphi\rVert_{W^{1,\infty}(\Omega)}\leq 1$ be given. We estimate

[TABLE]

Using statements $(b)$ and $(c)$ from Theorem 6.2 as well as the boundedness of $\lvert\hat{\beta}_{kh}\rvert$ we obtain

[TABLE]

which completes the proof. ∎

7. Proof of smoothing error estimates in time

In this section we prove Theorem 3.2 and Theorem 3.2. First we establish the following result.

{lmm}

Let $v_{0}\in L^{2}(\Omega)$ , let $v$ and $v_{k}$ satisfy (5) and (10). Then for $l=0,1,\dots,r$ , there exists a constant $C$ independent of $k$ and $T$ such that

[TABLE]

Proof.

For any $l\in\{0,1,\dots,r\}$ , let $y$ be the solution to the following backward problem

[TABLE]

and $y_{k}$ be its dG( $r$ ) approximation, i.e.

[TABLE]

Using the orthogonality conditions (15a) and the dual representation of the bilinear form (11)

[TABLE]

where $\pi_{k}$ is the projection defined in (12). Note, the jump terms vanish by the definition of $\pi_{k}$ . We set $\eta:=y-\pi_{k}y$ and $\xi_{k}=\pi_{k}y-y_{k}$ . Using the approximation and the standard energy estimate we have

[TABLE]

Using the properties of the bilinear form (9), we have

[TABLE]

Canceling and using (38) we obtain

[TABLE]

Combining (38) and (39) we also have

[TABLE]

Next we estimate $\|(-\Delta)^{-l-1}\partial_{t}(y-y_{k})\|_{L^{2}(I_{m}\times\Omega)}$ . By the triangle inequality, the inverse inequality, and (38), (39), and (40) we obtain

[TABLE]

This allows us to estimate $J_{1}$ as follows

[TABLE]

Similarly, using approximation and (40) we obtain for $J_{2}$

[TABLE]

Combining the estimates for $J_{1}$ and $J_{2}$ and canceling $\|(-\Delta)^{-2l-1}(v-v_{k})(T)\|_{L^{2}(\Omega)}$ , we obtain the lemma. ∎

Now we show the next result.

{lmm}

Let $v_{0}\in L^{2}(\Omega)$ , let $v$ and $v_{k}$ satisfy (5) and (10). Then for $j\in\mathbb{N}_{0}$ provided $k\leq\frac{T}{2r+j+2}$ and $M>2r+j+2$ , there exists a constant $C(T)$ independent of $k$ such that

[TABLE]

where $C(T)\sim T^{-2r-j-1}$ .

Proof.

For any $j\in\mathbb{N}_{0}$ , let $y$ and $y_{k}$ be the solutions to the continuous and to the semidiscrete dual problems with $y_{k}(T)=y(T)=(-\Delta)^{j}(v-v_{k})(T)$ , i.e. $y\in H^{1}(I;L^{2}(\Omega))\cap L^{2}(I;H^{1}_{0}(\Omega))$ solving

[TABLE]

and $y_{k}\in X_{k}^{r}$ satisfying

[TABLE]

We choose $\tilde{m}$ such that $\frac{T}{2}\in I_{\tilde{m}}$ and define $\tilde{v}:=\chi_{(t_{\tilde{m}},T]}v$ as well as $\tilde{v}_{k}=\chi_{(t_{\tilde{m}},T]}v_{k}$ , i.e. $\tilde{v}$ and $\tilde{v}_{k}$ are zero on $I_{1}\cup\cdots\cup I_{\tilde{m}}$ and $\tilde{v}=v$ and $\tilde{v}_{k}=v_{k}$ on the remaining time intervals. Then we test (42) with $\varphi=(-\Delta)^{j}\tilde{v}$ and choose $\varphi_{k}=(-\Delta)^{j}\tilde{v}_{k}$ in (43). Using (16), we have

[TABLE]

Note that $B({v},(-\Delta)^{j}\tilde{y})=0$ and $B({v}_{k},(-\Delta)^{j}\tilde{y}_{k})=0$ by construction. Using the Cauchy-Schwarz inequality, Lemma 7 with $l=r$ and $T=t_{\tilde{m}}$ and the smoothing estimate (17)

[TABLE]

Similarly, using the Cauchy-Schwarz inequality, Lemma 7 with $l=r$ and $T=t_{\tilde{m}}$ and the semidiscrete smoothing estimate in Lemma 3.1

[TABLE]

Combining the estimates for $I_{1}$ and $I_{2}$ and canceling $\|(-\Delta)^{j}(v-v_{k})(T)\|_{L^{2}(\Omega)}$ on both sides, we obtain the lemma. ∎

7.1. Proof of Theorem 3.2

We use the Gagliardo-Nirenberg inequality (19) and obtain

[TABLE]

Application of Lemma 7 with $j=0$ and $j=1$ yields the result.

7.2. Proof of Theorem 3.2

We use Gagliardo-Nirenberg inequality (18) with $B=\Omega_{0}$ as in the proof of Lemma 3.1 and obtain

[TABLE]

Application of Lemma 7 with $j=0$ , $j=1$ , and $j=2$ yields the result.

8. Proof of smoothing error estimates in space

In this section we prove Theorem 3.2. Before we provide the proof we show the following results. {lmm} Let $v_{k}\in X_{k}^{r}$ and $v_{kh}\in X^{r,1}_{k,h}$ be the semidiscrete and fully discrete solutions of (10) and (14), respectively. Then there exists a constant $C$ independent of $h$ , $k$ , and $T$ such that

[TABLE]

Proof.

Let $z_{kh}\in X^{r,1}_{k,h}$ be the solution to a dual problem with $z_{kh}(T)=\Delta_{h}^{-1}(P_{h}v_{k}-v_{kh})(T)$ , i.e.

[TABLE]

Then taking $\chi_{kh}=\Delta_{h}^{-1}(P_{h}v_{k}-v_{kh})$ by the Galerkin orthogonality, the stability of the $L^{2}$ projection, the standard elliptic error estimates, Lemma 3.1 and Corollary 3.1, we obtain

[TABLE]

Canceling, we obtain the result. ∎

In order to establish optimal pointwise error estimates for $R_{h}v_{k}-v_{kh}$ , we first show the corresponding estimate with respect to the $L^{2}(\Omega)$ norm and then for $\Delta_{h}(R_{h}v_{k}-v_{kh})$ in the $L^{2}(\Omega)$ norm likewise. {lmm} Let $v_{k}\in X_{k}^{r}$ and $v_{kh}\in X^{r,1}_{k,h}$ be the semidiscrete and fully discrete solutions of (10) and (14), respectively. There exists a constant $C$ independent of $k$ , $h$ , and $T$ such that

[TABLE]

Proof.

Let $y_{kh}\in X^{r,1}_{k,h}$ be the solution to a dual problem with $y_{kh}(T)=(R_{h}v_{k}-v_{kh})(T)$ , i.e. $y_{kh}\in X^{r,1}_{k,h}$ satisfies

[TABLE]

We abbreviate $\psi_{kh}=R_{h}v_{k}-v_{kh}\in X^{r,1}_{k,h}$ and set $\tilde{\psi}_{kh}$ to be zero on $I_{1}\cup\cdots\cup I_{\tilde{m}}$ for $\tilde{m}$ such that $\frac{T}{2}\in I_{\tilde{m}}$ and $\tilde{\psi}_{kh}=\psi_{kh}$ on the remaining time intervals. Similarly we define $\tilde{y}_{kh}$ . Then by (16) and using the Galerkin orthogonality, we have

[TABLE]

Using (11) and the property of the Ritz projection, we have

[TABLE]

By Lemma 3.1 and Lemma 3.1 we obtain

[TABLE]

By the approximation properties of the Ritz projection, $H^{2}$ regularity, and using the fact that $\frac{T}{2}\in I_{\tilde{m}}$ we have

[TABLE]

Canceling, we obtain the result for $J_{1}$ .

To estimate $J_{2}$ we add and subtract $v_{k,m}$ . Thus we obtain

[TABLE]

Similarly to the above, using Lemma 3.1 and Lemma 3.1 we obtain

[TABLE]

To estimate $J_{22}$ we use Lemma 8 with $T=t_{\tilde{m}}$ and the fact that the constant there does not depend on $T$ together with Lemma 3.1. Hence,

[TABLE]

Canceling, we obtain the lemma. ∎

Next we establish the following smoothing result in $L^{2}$ norm with discrete Laplacian. {lmm} Let $v_{k}\in X_{k}^{r}$ and $v_{kh}\in X^{r,1}_{k,h}$ be the semidiscrete and fully discrete solutions of (10) and (14), respectively. There exists a constant $C$ independent of $k$ , $h$ , and $T$ such that

[TABLE]

Proof.

Let $y_{kh}\in X^{r,1}_{k,h}$ be the solution to a dual problem with $y_{kh}(T)=\Delta_{h}(R_{h}v_{k}-v_{kh})(T)$ , i.e. $y_{kh}$ satisfies

[TABLE]

As in the proof of the previous lemma we abbreviate $\psi_{kh}=R_{h}v_{k}-v_{kh}$ and set $\tilde{\psi}_{kh}$ to be zero on $I_{1}\cup\cdots\cup I_{\tilde{m}}$ for some $\tilde{m}$ to be specified later and $\tilde{\psi}_{kh}=\psi_{kh}$ on the remaining time intervals. Similarly we define $\tilde{y}_{kh}$ . Then setting $\varphi_{kh}=\Delta_{h}\tilde{\psi}_{kh}$ and using the Galerkin orthogonality and (16), we have

[TABLE]

Choosing $\tilde{m}$ such that $\frac{T}{4}\in I_{\tilde{m}}$ , using the definition of the bilinear form $B(\cdot,\cdot)$ , the discrete maximal parabolic regularity from Corollary 3.1 and Lemma 3.1 we obtain

[TABLE]

Canceling we obtain the desired estimate for $J_{1}$ .

To estimate $J_{2}$ we proceed as in the proof of the previous lemma,

[TABLE]

Similarly to the above,

[TABLE]

To estimate $J_{22}$ , we proceed as in the proof of the previous lemma, and using Lemma 8, we have

[TABLE]

Canceling we obtain the lemma. ∎

As a consequence of the two lemmas above and the discrete Gagliardo-Nirenberg inequality (21) we immediately obtain the following result. {lmm} Let $v_{k}$ and $v_{kh}$ be the semidiscrete and fully discrete solutions of (10) and (14), respectively. Then, there exists a constant $C$ independent of $k$ and $h$ such that

[TABLE]

{rmrk}

The result in Lemma 8 is of independent interest. It shows that the $L^{\infty}(\Omega)$ error between the semidiscrete solution and its Ritz projection for piecewise linear elements is of optimal second order even if the exact solution at the final time $T$ is not in $W^{2,\infty}(\Omega)$ , for example when the domain $\Omega$ has strong corner singularities. In particular, this reflects the well-known fact that the presence of corner singularities is essentially an elliptic problem.

Similar to the results in [33], it also can be used to show a superconvergent result for the gradient. Thus for $N=2$ using the discrete Sobolev inequality

[TABLE]

we also have the following superconvergent estimate

[TABLE]

8.1. Proof of Theorem 3.2

Adding and subtracting $R_{h}v_{k}$ , we have

[TABLE]

From Lemma 8 we have

[TABLE]

Using the pointwise interior elliptic results from [30] we have

[TABLE]

where we used the embedding $H^{4}(B_{d}(x_{0}))\hookrightarrow C^{2}(B_{d}(x_{0}))$ , the interior regularity result [13, Chapter 6.3, Theorem 2] and convexity of $\Omega$ .

9. Algorithmic treatment

This section is devoted to the algorithmic solution of the sparse initial data identification problem under consideration. Let us first note that by Theorem 4.2 we can look for a minimizer $\bar{q}_{kh}$ in the space $\mathcal{M}_{h}$ consisting of linear combinations of Diracs concentrated in the interior nodes $\mathcal{N}_{h}$ of the mesh, i.e.

$$\bar{q}_{kh}=\sum_{x_{i}\in\mathcal{N}_{h}}\bar{\gamma}_{kh,i}\,\delta_{x_{i}},$$

where $\bar{\gamma}_{kh}\in\mathbb{R}^{\#\mathcal{N}_{h}}$ is a vector of optimal coefficients. Thus, the fully discrete problem (26) can be equivalently reformulated as a finite dimensional problem (of dimension $\#\mathcal{N}_{h}$ ) in the coefficients $\gamma_{kh}$ with an $l_{1}$ regularization term leading to

[TABLE]

This problem can be solved by a variety of efficient solution algorithms, e.g., semi-smooth Newton methods, [26], or FISTA, [2]. However, a direct application of finite dimensional optimization algorithms to this problem may lead to mesh-dependent methods, whose convergence behavior critically depends on the fineness of the discretization. In contrast, we employ an optimization algorithm that can be described on the continuous level as a solution algorithm for the problem (7). This algorithm can then be directly adapted to the discretized problem (26). Since the convergence properties of the presented algorithm can be analyzed on the continuous level, see [34], one expects mesh-independent behavior for the discretized problem, which is also confirmed by our numerical results.
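To make the finite dimensional $l_{1}$ formulation concrete, the following minimal sketch applies FISTA to $\min_{\gamma}\tfrac12\lVert A\gamma-u_{d}\rVert^{2}+\alpha\lVert\gamma\rVert_{l_{1}}$ . It assumes that a dense matrix `A` stands in for the discrete control-to-state map restricted to nodal Diracs (column $i$ holding the discrete final state of $\delta_{x_{i}}$ ) and that the Euclidean norm replaces the $L^{2}(\Omega)$ norm; the names `fista_l1` and `soft_threshold` are illustrative and do not come from the paper.

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal map of tau*||.||_1: shrink every entry toward zero by tau."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def fista_l1(A, u_d, alpha, n_iter=500):
    """Minimize 0.5*||A g - u_d||^2 + alpha*||g||_1 with FISTA.

    Assumption: A is a plain matrix standing in for the discrete
    control-to-state operator; mass-matrix weights are ignored here.
    """
    L = np.linalg.norm(A, 2) ** 2        # Lipschitz constant of the smooth gradient
    g = np.zeros(A.shape[1])
    y, t = g.copy(), 1.0
    for _ in range(n_iter):
        grad = A.T @ (A @ y - u_d)       # gradient of the tracking term at y
        g_new = soft_threshold(y - grad / L, alpha / L)
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        y = g_new + ((t - 1.0) / t_new) * (g_new - g)   # extrapolation step
        g, t = g_new, t_new
    return g
```

The step size $1/L$ depends on the spectral norm of `A`, so the iteration count of such a method is tied to the conditioning of the discretized operator and hence to the mesh. This is the kind of mesh dependence the paragraph above refers to.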

We propose a version of the Primal-Dual-Active-Point (PDAP) method from [34] which iteratively generates a sequence of finite linear combinations of Dirac delta functions. The algorithm on the continuous level is briefly described and its convergence properties are summarized below. Given an ordered set of finitely many points $\mathcal{A}=\{x_{i}\}^{K}_{i=1}$ define the parametrization

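$$Q_{\mathcal{A}}\colon\mathbb{R}^{K}\to\mathcal{M}(\Omega),\qquad Q_{\mathcal{A}}(\beta)=\sum_{i=1}^{K}\beta_{i}\,\delta_{x_{i}},$$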

as well as the finite dimensional subproblem

[TABLE]

We initialize the proposed algorithm with a sparse initial iterate $q_{0}\in\mathcal{M}(\Omega)$ , $\#\operatorname{supp}q_{0}<\infty$ . In the $n$ -th iteration, a new support point $\hat{x}^{n}\in\Omega$ is determined based on the violation of the condition (a) in Corollary 2 by the current adjoint state $z^{n}(0)=S^{*}(Sq_{n}-u_{d})$ . Subsequently, the new iterate is found as $q_{n+1}=Q_{\mathcal{A}_{n}}(\bar{\beta}^{n+1})$ where $\bar{\beta}^{n+1}\in\mathbb{R}^{\#\mathcal{A}_{n}}$ is a solution to (47) for $\mathcal{A}=\mathcal{A}_{n}$ . Thus, the method alternates between updating the active set $\mathcal{A}_{n}$ by adding $\hat{x}^{n}$ to the support of the current iterate $q_{n}$ and computing a minimizer of $j$ over $\mathcal{M}(\mathcal{A}_{n})$ . The procedure is summarized in Algorithm 1.
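The alternation described above can be sketched in a fully discrete matrix setting as follows. This is an illustration under stated assumptions, not the authors' implementation: the matrix `A` is a hypothetical stand-in whose column $i$ holds the discrete final state of a unit Dirac at node $i$ , the candidate points are restricted to the nodes, and the subproblem on the active set is solved by plain proximal gradient steps (ISTA) instead of the semi-smooth Newton method used in the paper.

```python
import numpy as np

def solve_subproblem(A_act, u_d, alpha, beta0, n_iter=2000):
    """ISTA for the small l1 subproblem on the active set (swapped-in solver)."""
    L = np.linalg.norm(A_act, 2) ** 2
    beta = beta0.copy()
    for _ in range(n_iter):
        v = beta - A_act.T @ (A_act @ beta - u_d) / L
        beta = np.sign(v) * np.maximum(np.abs(v) - alpha / L, 0.0)
    return beta

def pdap(A, u_d, alpha, max_iter=50, tol=1e-6):
    """Active-point iteration: add the node where |z| is largest, re-solve, prune."""
    active = np.array([], dtype=int)      # indices of the current support nodes
    beta = np.array([])                   # coefficients on the active set
    for _ in range(max_iter):
        residual = A[:, active] @ beta - u_d
        z = A.T @ residual                # discrete adjoint state at all nodes
        i_hat = int(np.argmax(np.abs(z)))
        if np.abs(z[i_hat]) <= alpha * (1.0 + tol):
            break                         # |z| <= alpha at every node: stop
        if i_hat not in active:
            active = np.append(active, i_hat)
            beta = np.append(beta, 0.0)
        beta = solve_subproblem(A[:, active], u_d, alpha, beta)
        keep = beta != 0.0                # prune Diracs with zero coefficient
        active, beta = active[keep], beta[keep]
    gamma = np.zeros(A.shape[1])
    gamma[active] = beta
    return gamma
```

The subproblem dimension equals the current number of support points, so each inner solve stays small even when the number of nodes (columns of `A`) is large.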

Note that the support of $q_{n}$ is pruned after each iteration, i.e. Dirac delta functions with zero coefficients are removed from the iterate. Additionally, we observe that Algorithm 1 is monotone, i.e. $j(q_{n+1})\leq j(q_{n})$ , and thus also $\lVert q_{n}\rVert_{\mathcal{M}(\Omega)}\leq M_{0}$ for all $n\in\mathbb{N}$ . To monitor the convergence of the algorithm we consider the primal-dual-gap functional $\Phi\colon\mathcal{M}(\Omega)\to\mathbb{R}_{+}$ which is defined as

[TABLE]

and $M_{0}=j(q_{0})/\alpha$ . This is justified by the following lemma, see [34, Lemma 6.12 and Lemma 6.41]. {lmm} There holds $\Phi(q)\geq 0$ for all $q\in\mathcal{M}(\Omega)$ with equality if and only if $q=\bar{q}$ is the optimal solution of (7). Furthermore we have

[TABLE]

Let $\{q_{n}\}$ denote the sequence generated by Algorithm 1. Then we obtain

[TABLE]

for $n\geq 1$ .

We point out that, due to (48), the primal-dual-gap $\Phi(q_{n})$ can be cheaply computed as a byproduct of step $2$ .

The following theorem, see [34, Theorem 6.43], provides two convergence results. For the general case we obtain sub-linear convergence of the cost functional. Under Assumption 1, we obtain linear convergence for the functional, positions of the Diracs and for the corresponding coefficients. {thrm} Let the sequence $\{q_{n}\}\subset\mathcal{M}(\Omega)$ be generated by Algorithm 1 starting from $q_{0}\in\mathcal{M}(\Omega)$ . Then we have

[TABLE]

for all $n\in\mathbb{N}$ and a constant $c_{1}>0$ . If Assumption 1 holds, then $\bar{q}=\sum^{K}_{i=1}\bar{\beta}_{i}\delta_{\bar{x}_{i}}$ and there exist $R,c_{2}>0$ , $\zeta\in(0,1)$ with

[TABLE]

as well as

[TABLE]

for all $n\in\mathbb{N}$ large enough.

We emphasize that the adaptation of Algorithm 1 to the discrete problem (26) is straightforward. In detail, we replace the control-to-state operator $S$ by its fully discrete counterpart $S_{kh}$ and compute $z^{n,+}_{kh,0}=S_{kh}^{*}(S_{kh}q_{n}-u_{d})$ . Moreover, in view of Theorem 4.2, the search for the maximizer $\hat{x}^{n}$ in step $2$ can be restricted to the set of interior nodes $\mathcal{N}_{h}$ . The new coefficient vector $\bar{\beta}^{n+1}$ is then found as solution to the finite-dimensional subproblem

[TABLE]

Note that the support of $q_{n}$ usually consists of only a few points, i.e. the dimension of the subproblem (49) is small, and this subproblem can be solved efficiently by existing finite dimensional algorithms. In our numerical realization we use the semi-smooth Newton method for the solution of (49).

10. Numerical examples

The final section is devoted to the presentation of numerical experiments which serve to underline the practical applicability of the proposed sparse control approach as well as to verify the derived theoretical results. Throughout the section, the spatial domain is given by the unit square $\Omega=(0,1)\times(0,1)$ and the final time is set to $T=0.1$ . All arising discrete optimal control problems are solved by an adaptation of the PDAP method, Algorithm 1, as described at the end of the previous section.

10.1. Identification of point sources

First, we aim to identify a sparse source term $q^{\dagger}=-10\delta_{x_{1}}+25\delta_{x_{2}}$ from noisy observations of $u(T)=S(q^{\dagger})$ . The time interval $(0,T]$ is uniformly partitioned into $M=256$ subintervals, the spatial domain $\Omega$ is divided into triangles, see the description in Section 3. We emphasize that the support points $x_{1}$ and $x_{2}$ , respectively, correspond to nodes of the triangulation. For the discretization of the state equation a cG( $1$ )dG( $0$ ) (i.e. $r=0$ ) approximation is considered. The observations are given by $u_{\text{obs}}=S_{kh}(q^{\dagger})+\delta$ where $\delta\in L^{2}(\Omega)$ is a given noise term. We plot $u_{\text{obs}}$ alongside $q^{\dagger}$ in Figure 1.

To reconstruct $q^{\dagger}$ from the given final time observation we propose to solve the fully discrete problem (26) with $u_{d}=u_{\text{obs}}$ . For the described example we empirically determine $\alpha=0.001$ as a suitable regularization parameter. Applying the Primal-Dual-Active-Point method to (26) yields a reconstruction $\bar{q}_{kh}\in\mathcal{M}_{h}$ with $\#\operatorname{supp}\bar{q}_{kh}=3$ . On closer inspection, two of its support points are located in adjacent nodes of the triangulation. A possible explanation for this clustering of support points is provided by Theorem 4.2. In more detail, a spike appearing in a discrete optimal solution $\bar{q}\not\in\mathcal{M}_{h}$ at an off-grid location will appear as several nodal Dirac delta functions in the projected solution $\Lambda_{h}\bar{q}$ . For a better visualization of the results we replace the Dirac delta functions associated with the clustering support points by a single one of the same combined mass located at the center of gravity of the original positions. The post-processed measure is depicted in Figure 2 together with $\bar{z}_{kh,0}^{+}=S^{*}_{kh}(S_{kh}\bar{q}_{kh}-u_{d})$ .
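The lumping of clustered support points can be sketched as below. The sketch rests on assumptions that the text does not settle: clusters are formed greedily by a user-chosen `radius`, and the center of gravity is taken with weights $\lvert\beta\rvert$ (an unweighted mean would be an equally plausible reading). The function name `lump_clusters` is hypothetical.

```python
import numpy as np

def lump_clusters(points, coeffs, radius):
    """Merge Diracs lying within `radius` of a cluster seed into a single Dirac.

    Assumptions: greedy clustering in input order, nonzero coefficients,
    and a center of gravity weighted by the absolute coefficients.
    """
    points = np.asarray(points, dtype=float)
    coeffs = np.asarray(coeffs, dtype=float)
    unused = np.ones(len(points), dtype=bool)
    new_points, new_coeffs = [], []
    for i in range(len(points)):
        if not unused[i]:
            continue
        # all not-yet-merged points close to the current seed form one cluster
        near = unused & (np.linalg.norm(points - points[i], axis=1) <= radius)
        unused[near] = False
        weights = np.abs(coeffs[near])
        new_points.append(weights @ points[near] / weights.sum())
        new_coeffs.append(coeffs[near].sum())   # combined mass is preserved
    return np.array(new_points), np.array(new_coeffs)
```

For instance, with made-up values, two Diracs with coefficients $-4$ and $-6$ at neighbouring nodes are replaced by one Dirac with coefficient $-10$ located between them, closer to the heavier one.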

As predicted by Corollary 4.2 we have $|\bar{z}_{kh,0}^{+}(x)|\leq\alpha$ for all $x\in\Omega$ and equality holds at the support points of $\bar{q}_{kh}$ . Moreover, we also plot the final state $S_{kh}(q^{\dagger})$ corresponding to the initial source $q^{\dagger}$ as well as the reconstructed final state $S_{kh}(\bar{q}_{kh})$ . We see that the proposed sparse control approach together with the lumping of clustering support points recovers the main structural features of the source $q^{\dagger}$ . In particular, we point out the correct number of point sources as well as quantitatively reasonable estimates of their locations and coefficients. Note that we cannot expect the exact recovery of $q^{\dagger}$ due to the appearance of the noise term $\delta$ as well as the nonzero regularization parameter $\alpha$ . We specifically stress that $\operatorname{supp}\bar{q}_{kh}\cap\operatorname{supp}q^{\dagger}=\emptyset$ .

10.2. Space refinement

Next we practically verify the derived a priori error estimates for the optimal states. Let us first discuss spatial refinement. To this end we consider cG( $1$ )dG( $r$ ) approximations for both $r=0$ and $r=1$ of the state equation on an equidistant grid in time with $M=256$ steps and a sequence $\{\mathcal{T}_{i}\}^{6}_{i=1}$ of spatial triangulations. Here, $\mathcal{T}_{i+1}$ is obtained by one global uniform refinement of $\mathcal{T}_{i}$ , $1\leq i\leq 5$ . The desired state $u_{d}$ and the regularization parameter $\alpha$ are chosen as in Section 10.1. Since no analytic solution for this problem is known we take the optimal state on the finest spatial grid as a reference $\bar{u}$ . On each refinement level, the optimal state $\bar{u}_{kh}$ is computed using the PDAP algorithm. The convergence plots are given in Figure 3. For visual comparison we also plot the corresponding rate of convergence as given in Theorem 6.2 without the logarithmic factor. We clearly see that the computed rates for the optimal states match the predicted order of $\mathcal{O}(h)$ for both temporal approximation schemes.
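When the finest-grid solution serves as the reference, observed rates can be computed as in the following sketch. The numbers are a synthetic model, not data from the paper: we assume values $u_{i}=u^{*}+Ch_{i}$ on six levels with $h_{i}=2^{-i}$ and measure errors against the sixth level, mimicking the setup described above. The helper name `eoc` is illustrative.

```python
import numpy as np

def eoc(errors, mesh_sizes):
    """Estimated order of convergence between consecutive refinement levels."""
    e = np.asarray(errors, dtype=float)
    h = np.asarray(mesh_sizes, dtype=float)
    return np.log(e[:-1] / e[1:]) / np.log(h[:-1] / h[1:])

# Synthetic first-order model (an assumption for illustration only).
h = 2.0 ** -np.arange(1, 7)                # six uniformly refined levels
model_values = 1.0 + 0.5 * h               # u_i = u* + C*h_i with u* = 1, C = 0.5
errors = np.abs(model_values[:-1] - model_values[-1])   # error vs. finest level
rates = eoc(errors, h[:-1])
```

In this model the rate between the two coarsest levels is close to one, while the rate between the last two levels before the reference is inflated, because the reference itself still carries an error of size $Ch_{6}$ . Rates on the finest levels are therefore usually read with some caution.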

10.3. Time refinement

In order to verify the temporal error estimate we discretize the state equation again by the cG( $1$ )dG( $r$ ) scheme for both $r=0$ and $r=1$ , on equidistant time grids with $2^{i}$ steps, $i=4,\dots,8$ , and a fixed triangulation of the spatial domain. The desired state $u_{d}$ is chosen as the discrete final state corresponding to the measure $q^{\dagger}$ on the finest discretization. The regularization parameter is set to $\alpha=0.001$ . Again, the optimal state on the finest discretization is considered as reference $\bar{u}$ . The computed convergence results for the optimal states are plotted in Figure 4 alongside the rates of convergence derived in Theorem 6.1. As predicted by the theory, we observe a linear $\mathcal{O}(k)$ rate for dG( $0$ ) and a cubic $\mathcal{O}(k^{3})$ rate of convergence for dG( $1$ ).
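The orders $k$ and $k^{3}$ can already be seen on a scalar model problem. The sketch below is not the paper's experiment: it considers $u'=-\lambda u$ , $u(0)=1$ , a single eigenmode of the heat equation, and uses the standard fact that the nodal values of dG( $r$ ) for this equation are generated by the subdiagonal Padé approximation of the exponential, i.e. backward Euler for $r=0$ . The value $\lambda=10$ and the function names are assumptions made for illustration.

```python
import numpy as np

def final_time_error(lam, T, M, r):
    """Error at time T for u' = -lam*u, u(0) = 1, with M uniform dG(r) steps."""
    z = -lam * T / M
    if r == 0:
        R = 1.0 / (1.0 - z)                                   # backward Euler factor
    elif r == 1:
        R = (1.0 + z / 3.0) / (1.0 - 2.0 * z / 3.0 + z * z / 6.0)  # (1,2) Pade factor
    else:
        raise ValueError("only r = 0 and r = 1 are covered in this sketch")
    return abs(R ** M - np.exp(-lam * T))

def observed_orders(errors):
    """Orders log2(e_i / e_{i+1}) when the step size is halved on each level."""
    e = np.asarray(errors, dtype=float)
    return np.log2(e[:-1] / e[1:])

steps = [2 ** i for i in range(4, 9)]       # 16, ..., 256 steps as in the text
orders_dg0 = observed_orders([final_time_error(10.0, 0.1, M, 0) for M in steps])
orders_dg1 = observed_orders([final_time_error(10.0, 0.1, M, 1) for M in steps])
```

Here the exact solution is available, so no reference solution on a finest grid is needed; the entries of `orders_dg0` and `orders_dg1` cluster around one and three, respectively.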

Bibliography

[1] R. A. Adams and J. Fournier, Cone conditions and properties of Sobolev spaces, J. Math. Anal. Appl., 61 (1977), pp. 713–734.
[2] A. Beck and M. Teboulle, A fast iterative shrinkage-thresholding algorithm for linear inverse problems, SIAM J. Imaging Sci., 2 (2009), pp. 183–202.
[3] V. I. Bogachev, Measure theory. Vol. I, II, Springer-Verlag, Berlin, 2007.
[4] E. Casas, Pontryagin's principle for state-constrained boundary control problems of semilinear parabolic equations, SIAM J. Control Optim., 35 (1997), pp. 1297–1327.
[5] E. Casas, C. Clason, and K. Kunisch, Approximation of elliptic control problems in measure spaces with sparse solutions, SIAM J. Control Optim., 50 (2012), pp. 1735–1752.
[6] E. Casas, C. Clason, and K. Kunisch, Approximation of elliptic control problems in measure spaces with sparse solutions, SIAM J. Control Optim., 50 (2012), pp. 1735–1752.
[7] E. Casas, B. Vexler, and E. Zuazua, Sparse initial data identification for parabolic PDE and its finite element approximations, Math. Control Relat. Fields, 5 (2015), pp. 377–399.
[8] E. Casas and E. Zuazua, Spike controls for elliptic and parabolic PDEs, Systems Control Lett., 62 (2013), pp. 311–318.
