Finite element error analysis for measure-valued optimal control problems governed by a 1D wave equation with variable coefficients
Philip Trautmann, Boris Vexler, Alexander Zlotnik

TL;DR
This paper analyzes finite element error estimates for measure-valued optimal control problems governed by a 1D wave equation with variable coefficients, including numerical validation.
Contribution
It introduces three-level bilinear finite element discretizations for measure-valued controls in 1D wave equations and derives associated error estimates.
Findings
Error estimates for the optimal state variable are established.
Numerical results confirm the theoretical error bounds.
The approach handles measure-valued controls effectively.
Abstract
This work is concerned with the optimal control problems governed by a 1D wave equation with variable coefficients and the control spaces of either measure-valued functions or vector measures . The cost functional involves the standard quadratic tracking terms and the regularization term with . We construct and study three-level in time bilinear finite element discretizations for this class of problems. The main focus lies on the derivation of error estimates for the optimal state variable and the error measured in the cost functional. The analysis is mainly based on some previous results of the authors. The numerical results are included.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Finite element error analysis for measure-valued optimal control problems governed by a 1D wave equation with variable coefficients
Abstract.
This work is concerned with the optimal control problems governed by a 1D wave equation with variable coefficients and the control spaces of either measure-valued functions or vector measures . The cost functional involves the standard quadratic tracking terms and the regularization term with . We construct and study three-level in time bilinear finite element discretizations for this class of problems. The main focus lies on the derivation of error estimates for the optimal state variable and the error measured in the cost functional. The analysis is mainly based on some previous results of the authors. The numerical results are included.
Key words and phrases:
Wave equation, optimal control, measure-valued control, vector measure control, finite element method, stability, error estimates.
1991 Mathematics Subject Classification:
Primary: 65M60, 49K20, 49M05, 49M25, 49M29; Secondary: 35L05.
The first and the second author were supported by FWF and DFG through the International Research Training Group IGDK 1754 ‘Optimization and Numerical Analysis for Partial Differential Equations with Nonsmooth Structures’. The third has been funded within the framework of the Academic Fund Program at the National Research University Higher School of Economics in 2016-2017 (grant no. 16-01-0054) and by the Russian Academic Excellence Project ‘5-100’. He also thanks the Technical University of Munich for its hospitality in 2014-2015 years.
Philip Trautmann
Department of Mathematics and Scientific Computing
University of Graz
Heinrichstraße 36
8010 Graz, Austria
Boris Vexler
Zentrum Mathematik
Technische Universität München
Boltzmannstraße 3
85748 Garching bei München, Germany
Alexander Zlotnik
Department of Mathematics at Faculty of Economic Sciences
National Research University Higher School of Economics
Myasnitskaya 20
101000 Moscow, Russia
(Communicated by the associate editor name)
1. Introduction
This work is concerned with the discretization and numerical analysis of optimal control problems involving a 1D linear wave equation with variable coefficients and controls taking values in certain measure spaces. The combination of variable coefficients and irregular data leads to significant technical problems.
Motivated by industrial applications as well as applications in the natural sciences, in which one is interested to place actuators in form of point sources in an optimal way, see, e.g., [4, 9] or in the reconstruction of point sources from given measurements, see, e.g., [34, 44], measure valued optimal control problems involving PDEs gained attention in the last years. These problems can be translated into optimization problems in terms of the coordinates and coefficients of the point sources. However, these optimization problem are non-convex since the solution of the state equation (PDE) depends in a non-linear way on the coordinates of the point sources. Thus one has to deal with multiple local minima. Several authors suggested to cast the control problem resp. inverse problem in form of an optimization problem over a suitable measure space involving a convex regularization functional which favors point sources as solutions. In our case we introduce the following problem formulation involving the 1D wave equation
[TABLE]
with additional initial and boundary conditions. The functional is given by a quadratic tracking functional involving , and . The regularization functional and the control space are chosen in a way such that contains point sources of the desired form and promotes controls of such a form, i.e. linear combinations of point sources with time-dependent intensities or more general controls with a small spatial support. Since problem (1.1) is convex, one does not need to deal with several local minima. However, it is not longer guaranteed that the solution consists of a sum of point sources. We enforce such controls via the regularization functional . Problems of the form (1.1) (also involving other PDEs) have been analysed from theoretical, numerical and algorithmic points of view, see [12, 11, 18, 19, 45, 33, 34, 13, 14, 7, 44, 16, 15]. Optimal control problems governed by the linear wave equation were discussed in several different aspects, see [35, 31, 32, 36, 52, 25, 24, 40, 41, 29, 30]. In our particular case we consider the control spaces of measure-valued functions and vector measures with . These two different choices imply different structural properties of the optimal controls. A typical non-regular element from the space is given by
[TABLE]
where are the Dirac delta functions. Point sources of such type with fixed positions and time-dependent intensities are of interest in acoustics or geology, see [34, 44]. If one is interested in controls involving moving point sources of the form
[TABLE]
then the control space rather than is more appropriate. The space and the functional are also related to the term directional sparsity resp. joint sparsity, see [26, 22].
For the discretization of optimal control problem (1.1), we discretize the state equation by space-time finite element method as introduced in [51]. Related methods are also discussed and analyzed in [1, 23], see also [2]. The measure-valued control is not directly discretized, cf. the variational control discretization from [27]. However, there exists optimal controls consisting of Dirac measures in the spatial grid points which can be computed, see also [12, 33]. The numerical analysis of the control problem is based on FEM error estimates for the second order hyperbolic equations from [51] and techniques developed in [12, 33]. It requires to overcome significant technical difficulties caused by non-smoothness of controls and states. To the best of our knowledge, this is the first paper providing such numerical analysis for the studied control problems.
The problem like (1.1) for a parabolic/heat state equation is analyzed for the case in [33] and for the case in [12]. In particular, in both papers the authors prove existence of optimal controls and derive optimality conditions and FEM error estimates. Our analysis is partly based on these results of [33]. In [34] a problem similar to (1.1) involving the linear wave equation with constant coefficients as state equation is analyzed. In particular, existing regularity results for a Dirac right-hand side are extended to sources from . Based on these regularity results existence of optimal controls is proved as well as optimal conditions are derived in the 3D case.
Now we briefly sum up the contents of this work. First of all we collect and partially prove required existence and regularity results for the linear wave equation in the 1D setting. In particular, we check that the notions of a weaker solution defined in [51] and more commonly used very weak solution, e.g. [38], are equivalent. Most importantly we prove that the solution of the linear wave equation with variable coefficients from for any source term is an element of provided that the initial data have relevant regularity. The proof is based on a non-standard energy type bound in space, not only in time, cf. [37, 21]. In [34] the same result is proved for the wave equation with constant coefficients using duality techniques. This proof in [34] provides also corresponding results for multidimensional case but can not be directly extended for treating variable coefficients. This is due to the fact that it uses estimates of the solution of the wave equation in the whole space with a Dirac measure on the right hand-side which are proven using the Fourier-and Laplace-transformation or explicit solution formulas.
The existence of optimal controls and the derivation of optimality conditions are discussed on the basis of results from [34, 33]. In the case we prove that the optimal control belongs to .
Further, the FEM discretization of the state equation is introduced. The state variable belongs to the space of bilinear finite elements and is defined by the regularized Galerkin method. The resulting numerical scheme is a three-level method in time (i.e., its main equation relates the approximate solution values at three consecutive grid time levels). Moreover, we pose and prove the FEM error estimates in for the discrete state equation which we need for the numerical analysis of the control problem. We base this study mainly on the results from [51] concerning error analysis of FEMs for the second order hyperbolic equations in the classes of the data having integer Sobolev or fractional Nikolskii order of smoothness. Note that their sharpness in a strong sense was stated in [50].
Then we consider a semi-discrete optimal control problem in which the continuous state equation is replaced by its discretized version whereas the controls are not discretized. We prove convergence of the discrete optimal controls to the continuous one and derive optimality conditions based on the Lagrange techniques. Most importantly we derive the discrete adjoint state equation. We can conclude that the first-discretize-then-optimize and first-optimize-then-discretize approaches commute. Therefore an analysis of the discrete adjoint state equation including the error estimates in and can also be based on techniques from [51]. Then we use results from [33] to represent the numerical error of state variable and of the cost functional in terms of FEM errors of the state equation and the adjoint state equation. Let and be the optimal control and the corresponding optimal state, and the variables and be their discrete counterparts. As the main result of this paper we prove the error estimates
[TABLE]
where is the step in time, is the maximal step in space and for or for . The latter higher order is due to the above mentioned improved regularity results for the state and optimal control. Such estimates are proved for the measure-valued controls in the hyperbolic case for the first time. Similar estimates are impossible in multidimensional settings due to much less fractional Sobolev regularity of optimal states and controls.
Finally we discuss the numerical computation of the discrete control . Based on a control discretization that given by the sum like (1.2) with at the spatial grid points and in the space of linear finite elements, a solution of the semi-discrete control problem can be calculated similarly to [33]. For the actual numerical computation of the optimal control we add the term , , to (1.1). This regularized problem is solved by a semi-smooth Newton method, see [43]. In a continuation strategy the regularization parameter is made sufficiently small. We complete this work with a numerical example for .
The paper is organized in the following way. In Section 2 we introduce the problem setting and the control spaces resp. the regularization functionals. Section 3 is concerned with regularity properties of the linear wave equation with variable coefficients in the 1D setting. In Section 4 the control problem is analyzed from a theoretical point of view. Section 5 deals with discretization of the state equation. Then we obtain stability bounds and error estimates for the discrete state equation in Section 6. Section 7 is concerned with the analysis of the semi-discrete optimal control problem. The next section discusses stability bounds and error estimates for the discrete adjoint sate equation. In Sections 9 resp. 10 error estimates for the optimal state and cost functional are derived being the main theoretical results of the study. Section 11 deals with the time stepping formulation of the discrete state equation. In Section 12 we discuss the control discretization with Dirac measures at the grid points. Then we introduce the regularized problem and describe its solutions by a semi-smooth Newton method. Finally Section 13 provides a numerical example.
2. Problem setting
We consider optimal control problems of the following form
[TABLE]
with the parameter and the tracking functional
[TABLE]
using , subject to the state equation which is an initial-boundary value problem for a 1D linear wave equation with variable coefficients
[TABLE]
Here, in particular, the initial data , and and . The coefficients satisfy and on .
For brevity we denote , , and equipped with the norms
[TABLE]
Moreover, we utilize the equivalent coefficient-dependent Hilbert norms on , , and
[TABLE]
where is the duality relation on .
For the control space we consider two choices, either the space of vector measures or the space of weak-star measurable, -valued functions . Recall that where for any . Let correspondingly be chosen as or where . The following identifications of dual spaces hold
[TABLE]
with the duality pairings respectively
[TABLE]
for any and . See [12, 17, 20, 33], where more details on the properties of these spaces can be found. In particular, the following embeddings hold
[TABLE]
3. Existence and regularity of the state
3.1. Weak formulations and preliminary existence, uniqueness and regularity results
In this section we introduce our solution concepts for the state equation (2.1). We begin with defining a weak formulation of (2.1).
Definition 3.1**.**
Let with or or . Then is called a weak solution of (2.1) if it satisfies the integral identity
[TABLE]
with the indefinite symmetric bilinear form
[TABLE]
and the initial condition .
The right-hand side in (3.1) is well defined for too due to embeddings (2.2).
Remark 1**.**
It is possible (and more common) to suppose that in (3.1) when the last term on the left disappears (for example, see [51]). This leads to an equivalent formulation. To check this, it is enough to replace there by , where \beta_{\delta}(t)=\min\big{(}1,(T-t)/\delta\big{)}, . Then , where is the characteristic function of . Passing to the limit as with the help of the dominated convergence theorem and the properties of and leads to the result.
Another definition of the weak solution is possible.
Definition 3.2**.**
Let with or or . A function is called a weak solution of (2.1) if it satisfies
[TABLE]
and as well as .
Proposition 1**.**
Definitions 3.1 and 3.2 (up to the property ) are equivalent.
Proof.
The weak solution from Definition 3.1 has according to the integral identity (3.1). Then the equivalence of (3.3) and (3.1) can be proved using integration by parts in time and the density of in , cf. [38, Chapter 1, Theorem 2.1]. ∎
Proposition 2**.**
- (1)
Let with or . Then (2.1) has a unique weak solution satisfying and
[TABLE]
Hereafter , , etc., are independent of and the data.
In the case there even holds as well as
[TABLE] 2. (2)
Let with or . Then the weak solution y satisfies and
[TABLE]
In the case there even holds as well as
[TABLE]
Moreover, satisfies the equation in , i.e. it is the strong solution.
Proof.
For example, see [51, Propositions 1.1 and 1.3]. ∎
Item 2 ensures the regularity of weak solution for more regular data.
For less regular data one can use other weak formulations. To state the first of them, we define the integration operator and its adjoint on .
Definition 3.3**.**
Let . A function with is called a weaker solution of (2.1) if it satisfies
[TABLE]
As in the case of Definition 3.1, it is sufficient to take in (3.6), cf. Remark 1.
Proposition 3**.**
Let . Then there exists a unique weaker solution and it satisfies the bound
[TABLE]
Proof.
See [51, Proposition 1.2]. ∎
We infer that there are other weak formulations of (2.1) for solutions . One can use the concept of very weak solutions.
Definition 3.4**.**
Let . A function satisfying
[TABLE]
for any is called a very weak solution of (2.1).
Actually, these two last solution concepts are equivalent for the considered data spaces.
Theorem 3.5**.**
Definitions 3.3 and 3.4 are equivalent.
Proof.
First of all, we consider the auxiliary integrated in problem (2.1):
[TABLE]
for . Thus, we have . According to Proposition 2 problem (3.8) has a unique weak solution . Moreover, we set . Thus the weak formulation of (3.8) involving coincides with the weaker formulation of (2.1) involving . Furthermore there holds and
[TABLE]
Now we take any and test (3.6) with in the role of :
[TABLE]
Next we rearrange a term on the left integrating by parts in and :
[TABLE]
with . Since , we get
[TABLE]
Since formula (3.9) implies that
[TABLE]
by the density of in we find that is a very weak solution of (2.1).
Now let be a very weak solution of (2.1). Then we take any and test (3.7) with . Thus, we get
[TABLE]
and then
[TABLE]
The last equation yields that . Thus and we can transform a term on the left in (3.11) by replacing by in (3.10):
[TABLE]
Then the density of in shows that is a weaker solution of (2.1). ∎
Moreover, there is the concept of solutions by transposition.
Definition 3.6**.**
Let . A solution by transposition of (2.1) is defined by
[TABLE]
for all where is the weak solution of the adjoint problem
[TABLE]
Proposition 4**.**
Definitions 3.4 and 3.6 are equivalent too.
Proof.
For or , and there holds , see Proposition 2. Due to the density of resp. in as well as in and in a very weak solution is a solution by transposition. Now let and set , and . Thus is the solution of (3.13). Then the density of in implies that a solution by transposition is a very weak solution. ∎
Remark 2**.**
For , the weaker solution coincides with the weak one.
3.2. Existence and regularity of the state
In this section we study the existence, uniqueness and regularity of solution of the state equation for measure valued source terms. We will carry out the analysis for both control spaces. We use the distinct properties of each space in order to show improved regularity of the state.
3.2.1. The control space
The space is not so broad as and contains no moving point sources but contains the standing -sources (1.2). Therefore, we expect that the state has better regularity properties in this case and prove that . The proof will be based on a priori bound and a density argument. First we state the following density result.
Lemma 3.7**.**
Let . Then there exists a sequence such that
[TABLE]
Proof.
We denote by the locally convex space endowed with its weak-star topology and define the absolutely convex set
[TABLE]
Assume that (3.14) is wrong. Then there exists with such that where is the closure of in . Owing to the corollary of a theorem on the separation of convex sets [28, Ch. III, Theorem 6] there exists such that
[TABLE]
On the other hand, is dense in thus
[TABLE]
that contradicts (3.15). Thus . Since is separable, the weak-star topology on is metrizable. Therefore the closure of is equal to its sequential closure, see [5, Theorem 3.28, Corollary 3.30]. ∎
Note that clearly .
Preliminarily we prove the following crucial a priori bound.
Lemma 3.8**.**
Let and be the corresponding strong solution of problem (2.1). Then satisfies the following a priori bound
[TABLE]
Proof.
We first remind the energy equality for problem (2.1)
[TABLE]
After setting
[TABLE]
and c_{0}:=\max\big{(}\|\rho\|_{L^{\infty}(\Omega)},\,\|\kappa\|_{L^{\infty}(\Omega)}\big{)}, the energy equality implies
[TABLE]
We also multiply the equation in (2.1) by and integrate over . Integration by parts in yields the equality
[TABLE]
We define a function on . Since the left-hand side of (3.18) equals \partial_{x}P-\big{(}\partial_{x}(\rho\kappa)\big{)}\|\partial_{t}y\|_{L^{2}(I)}^{2}, taking the modulus and integrating over any we derive
[TABLE]
Let be such that hold and let now . Then the mean value theorem for integrals implies
[TABLE]
By the above definitions we clearly have
[TABLE]
Inserting (3.19) into (3.20) and using (3.21), we obtain
[TABLE]
Owing to (3.17) we can write
[TABLE]
Using this in (3.22) and choosing a small enough such that
[TABLE]
we derive
[TABLE]
Inserting the last bound in (3.23), we also get
[TABLE]
Finally, this yields bound (3.16). ∎
Remark 3**.**
Lemma 3.8 remains valid for . Owing to the absolute continuity of the Lebesgue integral we have , where , thus one can replace by in (3.22) and below in the proof.
Theorem 3.9**.**
Let . Then there exists a unique weak solution and it satisfies the bound
[TABLE]
Proof.
- Let first . According to Proposition 2 were exists a unique weak solution of (2.1) for any and it satisfies
[TABLE]
- Now it suffices to consider the case . Let first since . Then according to Proposition 2 there exists a unique weak solution of (2.1) and it satisfies bound (3.4). Moreover, it is also a weaker solution.
So it remains to prove the bound
[TABLE]
To this end, according to Lemma 3.7 we approximate by functions satisfying (3.14). The strong solution of (2.1) corresponding to satisfies the bound like (3.16) and in particular
[TABLE]
Therefore there exists a subsequence of (not relabeled) and such that converges to in the weak-star sense of . This is sufficient to pass to the limit in the last bound and in (3.6) for , and , see Remark 1. Thus both satisfies the bound
[TABLE]
and is a weaker solution of (2.1). Due to its uniqueness there holds , and bound (3.26) is proved.
- Let now and be the corresponding weaker solution of (2.1), see Proposition 3. The space is dense in , cf. [34, Proposition 2.1]; this also holds since is the projective closure of the tensor product between and , see [46], and is dense in . Thus there exists a sequence such that in as . Let be the above weak solution of (2.1) corresponding to . Since is a Cauchy sequence in , is a Cauchy sequence in too due to bound (3.26) for . Thus in and
[TABLE]
Then we pass to the limit in (3.1) for , and and see that is a weak solution of (2.1). Due to uniqueness of the weaker solution we get , and the proof is complete. ∎
3.2.2. Some function spaces and embeddings
We set
[TABLE]
and introduce the interpolation spaces
[TABLE]
for non-integer using the real -interpolation method of Banach spaces for , see [3]. Recall that the value leads to the broadest intermediate spaces. Their explicit description in terms of the Nikolskii spaces or their subspaces is known, see [42, 48, 49]. In particular,
[TABLE]
where and is the odd extension of with respect to and from to . Hereafter equalities of Banach spaces are understood up to the equivalence of their norms. It is well known that the space contains discontinuous but piecewise -functions.
In addition, let be the distributional derivative and, for a Banach space , denote the subspace of with the mean value
[TABLE]
Define the space of distributions with equipped with the norm . Then , see Item 3 of the proof of Lemma 3.10 below. (Actually a quite similar result is valid for for any .) Note that, in particular, the Dirac delta-function \delta_{a}(x)=D_{x}\big{(}H(x-a)-(1-a/L)\big{)}\in H^{(-1/2)} for any , where for and for is the Heaviside function.
Let and be the forward difference in . Define the spaces and of functions such that respectively and equipped with the norms
[TABLE]
Here is a particular anisotropic Nikolskii space (of the order in only) and is a particular space of functions having the dominating mixed smoothness (of the order in in the Nikolskii sense and in in the Sobolev sense). Note that .
For a Banach space , let be the subspace of such that on . Define the spaces and of distributions with respectively and equipped with the norms
[TABLE]
Note that all the spaces defined above and below in this subsection are Banach ones.
The next technical lemma plays an essential role below.
Lemma 3.10**.**
The following equalities and embeddings hold
[TABLE]
Proof.
- Define the anisotropic Sobolev spaces and equipped with the norms
[TABLE]
The following equalities hold
[TABLE]
for example, see [49, Ch. 1.2]. Recall that the corresponding -embeddings are proved by the classical techniques of approximation by the Steklov averages and the opposite -embeddings are rather simple. Moreover, equality (3.31) involving three spaces implies (3.32) since it concerns the closed subspaces of one and the same type for all of these three spaces. The same is valid for pairs of embeddings (3.33)-(3.34) and (3.35) below.
The elements in and can be uniquely identified as distributions such that respectively with and with , where is an equivalent norm in (in the latter case, of course, ). In particular, for , clearly W(x,t)=\int_{0}^{x}w(\xi,t)\,d\xi-\big{\langle}\int_{0}^{x}w(\xi,t)\,d\xi\big{\rangle}_{\Omega}. Taking into account that one and the same operator establishes the one-to-one correspondence between respectively three spaces involved in equalities (3.32) and (3.27), the latter one is valid too.
- Define the space equipped with the norm . The following equalities
[TABLE]
can be proved similarly to (3.31)-(3.32).
The elements in and can be uniquely identified as the distributions such that respectively , with the equivalent norms and , and , with the equivalent norms and . Thus equality (3.34) implies (3.28).
- The following equalities hold
[TABLE]
which are simpler 1D versions of (3.31)-(3.32), for example, see [3, 48] and [49, Ch. 1.2]. The second equality implies the above mentioned one .
Let be the space of normalized functions of bounded variation on that are continuous from the right at and continuous from the left at any ; we equip it with the norm . Any can be represented as with and , for example, see [10, Ch. 2]. The representation is clearly unique for ; in this subspace serves as an equivalent norm.
Notice that the following inequalities hold
[TABLE]
for any ; the definition of the Riemann integral implies the latter one. Then for any we get the inequalities
[TABLE]
(they remain valid for with replaced by ). Thus
[TABLE]
Let . Then , where the equality is valid due to the classical Pettis theorem [20, Theorem 8.15.2] (since is separable), and with and .
Moreover, we have for a.e. . By applying (3.37) to , omitting on the left, integrating the squared result over and taking back on the left, we obtain
[TABLE]
that completes the proof of embedding (3.29).
- Let . Then and with and
[TABLE]
according to embedding (3.29).
Moreover, define the forward difference quotients in time for . Then for the same and owing to the first inequality (3.36) we get
[TABLE]
Therefore
[TABLE]
Consequently there exists the derivative , and inequality (3.38) implies embedding (3.30). ∎
We also set and define the anisotropic Sobolev subspaces
[TABLE]
for and the anisotropic Nikolskii subspaces
[TABLE]
equipped with the norm for , where . Then the following equality holds
[TABLE]
which is similar to equality (3.31).
4. Analysis of the control problem
According to Theorem 3.9 and Proposition 3 the state equation (2.1) is uniquely solvable for any in either or and the solution depends continuously on the data. Therefore, we can introduce the linear and bounded operator . The control-to-state mapping
[TABLE]
is given by for fixed and and it is an affine and bounded operator. So we can rewrite the original control problem () in its reduced form
[TABLE]
Proposition 5**.**
Problem () has a unique solution .
Proof.
The control-to-state operator is weak-star-to-strong sequential continuous, i.e., if and in , then in . The proof of this continuity property is similar to [34, Lemma 6.1] in the case of solutions by transposition resp. very weak solutions. The strong continuity follows from the compact embeddings and well known Aubin-Lions-Lemma. Then the direct method of calculus of variations combined with the sequential Banach-Alaoglu theorem ( is separable) can be applied to show existence of an optimal control. Additionally the control is unique since the control-to-state operator is injective and the data tracking functional is strictly convex. ∎
Owing to Proposition 3 the optimal control satisfies the inequalities
[TABLE]
and thus
[TABLE]
Hereafter depends on the norms of data.
Next we discuss first order optimality conditions. We introduce the adjoint control-to-solution operator , where is a weak solution of (3.13). This operator is well defined and bounded according to Proposition 2.
We also need the operator , where is the unique solution of
[TABLE]
The next result provides the necessary and sufficient optimality condition for the optimal pair .
Proposition 6**.**
An element is an optimal control of () if and only if
[TABLE]
or equivalently
[TABLE]
where \bar{p}=S^{*}\big{(}\bar{y}-z_{1},-(\bar{y}(T)-z_{2}),A^{-1}(\rho\partial_{t}\bar{y}-z_{3})\big{)} with .
Proof.
For the proof of [34, Theorem 7.1] remains valid; for it is similar to [12, Theorem 3.2]. ∎
To discuss further the properties of the optimal control , we introduce the Jordan decomposition of a signed measure , see [6]. There exists unique elements such that . Moreover, we recall the polar decomposition of a vector measure : , where is the Radon-Nikodym-derivative of with respect to .
The subgradient condition in Proposition 6 implies the following conditions.
Proposition 7**.**
Let be the optimal control of () and be the corresponding adjoint state. Then there holds .
In the cases and there respectively hold
[TABLE]
and
[TABLE]
Proof.
A detailed discussion of the proof of these results can be found in [12, 33]. ∎
The regularity of the adjoint state is now applied to show improved regularity of the optimal control .
Theorem 4.1**.**
Let , , and be the optimal control of (). Then and the following bound holds
[TABLE]
Proof.
There holds according to Theorem 3.9. Thus, the optimal adjoint state has the following regularity by Proposition 2. We have according to (4.6). Moreover, we define the function
[TABLE]
and show that it serves the time derivative of . For any and , we define the difference quotient \bar{u}{(t_{0},t)}=\big{(}\bar{u}(t)-\bar{u}(t_{0})\big{)}/(t-t_{0}). Then we consider
[TABLE]
as since . Next, quite similarly we get
[TABLE]
as . Consequently . Finally, we bound as follows
[TABLE]
owing to Proposition 2 and Theorem 3.9. Utilizing bound (4.2) for , we complete the proof. ∎
5. Discretization of the state equation
We introduce the uniform grid in time with the step and a non-uniform grid in space with the steps , where and . Let also , and . We assume that the space grid is quasi-uniform, i.e., . Hereafter , etc., are grid-independent.
Let and be the spaces of piecewise linear finite elements with respect to the introduced grids on and .
We approximate the state variable by and additionally by . For the discrete state equation has the following form
[TABLE]
involving the indefinite symmetric bilinear form
[TABLE]
with the grid independent parameter , cf. (3.1). This definition follows [51] but notice carefully that normally is uniquely defined by (5.1) with and (5.2). To treat general , we need .
Remark 4**.**
The second term on the right hand-side of (5.3) regularizes the Galerkin (i.e. projection) method with respect to bilinear form (3.2). It is included to ensure unconditional stability for suitable values of . Moreover, the term
[TABLE]
is the error term of the compound trapezoidal rule applied for the calculation of the temporal integral in . So that, in particular, for in (5.3) this temporal integral is calculated using this rule whereas for it is not approximated.
Next we recall the inverse inequality
[TABLE]
where the least constant satisfies for the quasi-uniform grid. For we need to state conditions linking the temporal and spatial grids to ensure stability of the numerical method.
Assumption 1**.**
In what follows, let
[TABLE]
Remark 5**.**
The parameters and can be chosen arbitrarily small but then constants in the stability and error estimates for our FEM can tend to infinity.
Remark 6**.**
As we see below in Section 11, the method is related to well known time-stepping methods, in particular, to the explicit Leap-Frog-method for . Then conditions (5.5) and (5.6) reduce to a CFL-type one . For the method is related to the Crank-Nicolson scheme and is unconditionally stable but in a weaker norm than we need to derive our error estimates so that we impose a very weak CFL-type condition .
Below in proofs we utilize the auxiliary squared norms
[TABLE]
for and . We need to bound them by standard norms.
Lemma 5.1**.**
Under conditions (5.5) and (5.6) the following inequalities hold
[TABLE]
with for and for .
Proof.
For , the first inequality is obvious; for it can be checked by a direct calculation using (5.4). The proof of the second inequality is covered in [51, Corollary 2.1]. ∎
Now we discuss some properties of and that are essential below.
Proposition 8**.**
Let be the solution of (5.1)-(5.2). Then there holds
[TABLE]
Proof.
This is proved by testing (5.1) with time constant functions . ∎
The non-local in time identity (5.8) is convenient for our error analysis but not for the implementation; for the latter issue see Section 11. Identities similar to (5.8) also hold on the continuous level.
Proposition 9**.**
- (1)
Let be the weak solution of (2.1) for . Then there holds
[TABLE] 2. (2)
Let be the weaker (very weak) solution of (2.1) for . Then there holds
[TABLE]
Proof.
For identity (5.9) is proved by testing (3.1) with time constant function . For we test (3.7) with any and get
[TABLE]
According to Proposition 3 we have . Thus there holds
[TABLE]
The density of in implies (5.10). ∎
For our analysis, we need some projection and interpolation operators. We introduce the standard projectors : and : defined by
[TABLE]
Clearly and . Identity (5.2) means that .
Moreover the following property holds
[TABLE]
Following [51], we also introduce the regularized projector : defined by
[TABLE]
with the grid independent parameter . Clearly for .
Let : be the interpolation operator such that for all .
Next we define the operator , where is the unique solution of
[TABLE]
Clearly , see (4.3) with , and the norm in and its discrete counterpart can be written as
[TABLE]
Moreover, we set . First we note that
[TABLE]
Then by the standard FEM error analysis [8] and operator interpolation theory we have
[TABLE]
[TABLE]
6. Stability and error estimates for the discrete state equation
In this section we present error estimates for the state equation. We begin with an auxiliary result.
Lemma 6.1**.**
For , the following estimate holds
[TABLE]
Proof.
We recall the well known estimates
[TABLE]
which are valid using the inverse inequality (5.4). We also remind inequality (5.7) and notice also that for the following additional inequality holds
[TABLE]
Let and . We apply identities (5.11) and (5.14) and get
[TABLE]
Now we set , and from the former equality and estimate (6.2) for as well as the latter equality and estimate (6.3) for we obtain the estimate
[TABLE]
By using the -method, we complete the proof. ∎
Now we get a stability bound and error estimates in for the discrete state equation.
Proposition 10**.**
Let and be the solutions to the state equation (2.1) and the discrete state equation (5.1)-(5.2).
- (1)
For , the following stability bound holds:
[TABLE] 2. (2)
For , the following error estimate holds:
[TABLE] 3. (3)
For , the higher order error estimate holds:
[TABLE]
Proof.
- According to [51, Theorem 2.1 (1)], the bound
[TABLE]
is valid for any . We have . In the case , there clearly holds
[TABLE]
For , we alternatively get using (6.1) for
[TABLE]
for any .
We proceed with the bound for . Identity (5.8) and bound (6.8) together with the generalized Minkowski inequality
imply
[TABLE]
Finally we derive bound (6.5).
- Let be the solution of equation (5.1) for . Bound (6.8) together with (6.1) for , the bound in Proposition 3 and the stability of in imply
[TABLE]
Owing to [51, Theorem 4.1] the following error estimate holds
[TABLE]
Using the -method and equality (3.27) we get the intermediate error estimate
[TABLE]
In the case we can choose , then and . In the case we can use the stability bound (6.8) and estimate (6.1) to get
[TABLE]
Then by subtracting (5.8) from (5.10) and applying identity (5.12) we find
[TABLE]
consequently
[TABLE]
Thus we obtain (6.6).
- Once again we apply [51, Theorem 4.1] and first get the estimate
[TABLE]
Combining it together with (6.11), we derive
[TABLE]
In this proof, we apply this estimate in the case only (but in general case below).
In the remaining case , from [51, Theorem 4.1] we also get the higher order error estimate
[TABLE]
Moreover owing to Proposition 3 and bound (6.5) (both for ) we have
[TABLE]
The last bound and estimate (6.14) imply by the -method and equality (3.28):
[TABLE]
for any . Applying inequality (6.12) we complete the proof. ∎
Remark 7**.**
A priori stability bound (6.5) implies the unique solvability of the discrete state equation (5.1)-(5.2).
Remark 8**.**
According to the given proof, for in place of the norms of in (6.5) and (6.6) can be weakened down to respectively and . For , we have . The same can be shown for also for provided that with any .
7. Discrete control problem
First we introduce the discrete mapping
[TABLE]
and the discrete affine linear control-to-state mapping
[TABLE]
defined by , with . The mapping is a composition of
[TABLE]
where is a basis in , and . The former mapping is bounded due to and the latter one is finite dimensional. Thus is a bounded operator. Then we consider the following semi-discrete optimal control problem
[TABLE]
with the squared semi-norm corresponding to the inner product
[TABLE]
Using the similar argument as in the continuous case it can be shown that () has a solution which is not unique in general, and due to the optimality, the stability bound (6.5) and property (5.16) (for ) one gets
[TABLE]
cf. (4.1), and consequently
[TABLE]
Theorem 7.1**.**
Let , and be the optimal controls of respectively problems () and (). Then there holds
[TABLE]
Proof.
Owing to (7.1) there exists a sequence , , and such that in as . This implies the limit relation
[TABLE]
To prove it, we write the chain of inequalities
[TABLE]
The first term on the right in the last inequality converges to zero according to the error estimate (6.6). The convergence of the second term follows from the weak-star-to-strong continuity of and the stability of in . Finally, property (5.13) for implies the convergence of the last term.
Then (7.2) and the weak-star lower semicontinuity of in implies
[TABLE]
Thus, the uniqueness of means that and in addition implies the convergence of the whole sequence in as . Moreover, we have . This and (7.2) lead to . ∎
For convenience we set . In the following the directional derivative of a functional at in direction is denoted by . In the case , is the Gateaux differentiable in . Moreover, we make use of the convex subdifferential of . Let and . Then there holds if and only if
[TABLE]
An element is an optimal solution of () if and only if . To calculate for , we apply the Lagrange technique and define the Lagrange functional by
[TABLE]
with (where we base on identities (5.1)-(5.2)). We obviously have
[TABLE]
Thus there holds
[TABLE]
provided that is the solution of the discrete problem
[TABLE]
and
[TABLE]
Therefore the discrete optimality system consists of the discrete state equation
[TABLE]
the discrete adjoint state equation
[TABLE]
and the discrete variational inequality
[TABLE]
8. Stability and error estimates for the discrete adjoint state equation
We define the general discrete adjoint state equation
[TABLE]
Here is the solution to the state equation (2.1). Clearly identity (8.2) means simply that with .
Now we get a stability bound and error estimates in and for the discrete adjoint state equation.
Proposition 11**.**
Let p=S^{*}\big{(}y-z_{1},-(y(T)-z_{2}),A^{-1}(\rho\partial_{t}y(T)-z_{3})\big{)} and be the solution of the corresponding general discrete adjoint state equation (8.1)-(8.2).
- (1)
If and , then the following stability bound holds
[TABLE] 2. (2)
If , and , then the following error estimate holds
[TABLE] 3. (3)
If , and , then the following error estimate holds
[TABLE] 4. (4)
If , and , then the following higher order error estimate holds
[TABLE]
Proof.
- According to [51, Theorem 2.1 (2)] the following energy bound hold
[TABLE]
for any . Using (6.2), and (5.16) we get
[TABLE]
By applying also the counterpart of inequalities (6.9) we derive bound (8.3).
- The counterpart of the error estimate (6.13) for the adjoint state equation case and bound (8.7) give
[TABLE]
Owing to inequality (6.12) and Proposition 3 we obtain estimate (8.4).
- Below we need the multiplicative inequalities
[TABLE]
Let be the auxiliary solution to (8.1) for . Owing to inequality (8.8) and the stability bounds [51, Theorem 2.1] we get
[TABLE]
Consequently, for , by (6.2), (5.17) and (5.18) the following chain of inequalities hold
[TABLE]
for . Thus it is enough to prove error estimates (8.5) and (8.6) for instead of .
According to [51, Theorem 5.3 and estimate (5.18)] we have the error estimate
[TABLE]
for . We emphasize that due to [51, Theorem 4.3 (2) (e)] and (6.1) this estimate holds for .
Inequality (8.8), Proposition 2 (applied to the adjoint state problem) and property (5.16) imply the following error estimate for the time interpolation
[TABLE]
for . Owing to estimates (8.10) and (8.11) and Propositions 3 and 2 we get
[TABLE]
for , where and .
Applying the -method together with equalities (3.27) and (3.39) for , we get (8.5) for in the role of .
- First notice that the multiplicative inequality (8.9), Proposition 2 (2) (applied for the adjoint state problem) and property (5.16) imply another error estimate for the time interpolation
[TABLE]
Then Proposition 2 (1) leads to
[TABLE]
Next we derive the error estimate
[TABLE]
According to [51, Theorem 5.3 and estimate (5.18)] and equality (3.39) for together with Propositions 3 and 2 the following three estimates hold
[TABLE]
[TABLE]
[TABLE]
and for (for the same reason as above). Then applying the -method to the two last estimates and using equality (3.28) we get
[TABLE]
By combining this estimate and (8.15) we obtain (8.14).
Estimates (8.13) and (8.14) imply
[TABLE]
that completes the proof of (8.6) for in the role of . ∎
Remark 9**.**
A priori stability bound (6.5) (taken for ) implies the unique solvability of the general discrete adjoint state equation (8.1)-(8.2).
9. Error estimates for the state variable
We introduce the discrete adjoint control-to-state operator , defined by
[TABLE]
with . Similarly to bound (8.3) and Remark 9 it is well defined and satisfies
[TABLE]
Let for brevity be the duality mappings defined by
[TABLE]
for any . With this notation, the function
[TABLE]
solves the general discrete adjoint state equation (8.1)-(8.2).
Proposition 12**.**
Let and . Then the following estimate holds
[TABLE]
Proof.
We recall that and and test the continuous subgradient condition (4.5) with the discrete optimal control and the discrete subgradient condition (7.5) with the continuous optimal control . Then we subtract the first inequality from the second one and get
[TABLE]
We define , insert it between and and obtain
[TABLE]
For convenience we introduce the variables and remark that the state equations for and have the same initial data. With the help of them we rewrite the second term on the right in (9.2) taking first the difference of the discrete state equations (7.3) and (5.1) (taken for ) for , next the difference of the discrete adjoint state equations (7.4) and (8.1)-(8.2) (taken for ) for and and finally using (5.15)
[TABLE]
Further we easily get
[TABLE]
Thus (9.2) implies
[TABLE]
Finally by applying bounds (4.2) and (7.1) we derive (9.1). ∎
This proposition is important since it allows one to derive estimates for with the help of the above error estimates for the discrete state and adjoint state equations.
Theorem 9.1**.**
- (1)
Let , and . Then the following error estimate holds
[TABLE] 2. (2)
Let , and . Then the following higher order error estimate holds
[TABLE]
Proof.
[TABLE]
Second, Proposition 11 (3) leads to
[TABLE]
Now owing to Proposition 12, embedding (3.29) and bound (4.2) for error estimate (9.3) is proved.
- First, Proposition 10 (3) implies
[TABLE]
Second, Proposition 11 (4) leads to
[TABLE]
Now owing to Proposition 12, embedding (3.30) and Theorem 4.1 for error estimate (9.4) is proved too. ∎
Remark 10**.**
Note that our error bounds could be better provided that one would improve the last term on the right in (9.1) by increasing the power . But this seems a complicated problem.
10. Error estimate for the cost functional
In this section we derive error estimate for the cost functional. We first observe the inequalities
[TABLE]
which can be equivalently rewritten in the form
[TABLE]
Therefore, to bound below we apply the following result.
Proposition 13**.**
Let . Then for any
[TABLE]
with and the same and as in Proposition 11.
Proof.
Let . According to the definitions of the continuous and discrete cost functionals and property (5.13) for and we get
[TABLE]
We set .
Owing to the adjoint problem (3.12) with
[TABLE]
we have
[TABLE]
Similarly owing to the general discrete adjoint state equation (8.1)-(8.2) for and the discrete state equation (5.1)-(5.2) for and we get
[TABLE]
In addition owing to the definitions (8.2) of and (5.15) of , we can write
[TABLE]
Consequently we obtain
[TABLE]
In addition using property (5.13) we derive
[TABLE]
Next, for the term in (10.4) we have
[TABLE]
due to the bounds , (5.18) and (6.2). Clearly also . Finally from (10.3)-(10.6) we derive (10.2). ∎
Now we prove for the cost functional a higher order error estimate than (9.3) for the state variable in the case .
Theorem 10.1**.**
Let , and . Then the following error estimate for the cost functional holds
[TABLE]
Proof.
Let us base on Proposition 13 and take any . Owing to Proposition 10 (2) we have
[TABLE]
Proposition 11 (3) leads to
[TABLE]
Owing to Propositions 2(1) (applied to the adjoint state problem) and 3 we have
[TABLE]
(like in estimates (8.11)-(8.12) for ). By using estimate (5.17) for we obtain
[TABLE]
By collecting all these estimates together with embedding (3.29), Proposition 11 (2) to bound and applying Proposition 13, we derive
[TABLE]
Owing to inequalities (10.1) together with bounds (4.2) for and (7.1) for the proof is complete. ∎
Remark 11**.**
In the case we know that (cf. Theorem 4.1). The lack of the corresponding bound at least at the discrete level does not allow us to prove the error estimate . The estimate follows directly from (9.4).
11. Time-stepping formulation
In this section we discuss the time-stepping formulation of the discrete state equation (5.1)-(5.2) and the discrete adjoint state equation (7.4).
We introduce the piecewise-linear “hat” functions such that for any , where is the Kroneker delta. We recall that are “half” hat functions for . There holds . Similarly, we introduce the spatial hat functions such that for any and ; then .
Then the approximate state variable can be represented in the following forms
[TABLE]
for with , and .
We also define the forward and backward difference quotients and the average in time operator
[TABLE]
We define the self-adjoint positive-definite operators and acting in (in other words, the mass and stiffness matrices) such that
[TABLE]
For and we define the vectors and
[TABLE]
We recall the form of the discrete state (11.1).
The forward time-stepping is implemented as follows. The integral identities (5.1)-(5.2) are equivalent to the operator equations
[TABLE]
followed by the counterpart of (11.3) at time for :
[TABLE]
Next the adjoint (backward) time-stepping is implemented in a similar manner. Namely, the integral identities (7.4) are equivalent to the operator equations
[TABLE]
followed by the counterpart of (11.5) for :
[TABLE]
Remark 12**.**
For the three-level time stepping scheme (11.2)-(11.5) is closely related to the well-known two-level Crank-Nicolson method applied to the first order in time system
[TABLE]
see [51, Section 8] for details, as well as to the Petrov-Galerkin method described in [32]. After the mass lumping, for our method becomes explicit and is related to the Leap-Frog method; moreover, for any it becomes close to three-level finite-difference schemes with such weight in time, eg. see [47].
12. Control discretization. Solution process and -regularization
Now we discuss in more detail solving of the semi-discrete optimization problem () in the case .
An important point is that we can seek its solution in the form
[TABLE]
To show that, let be the projector in on . Note that, for , it satisfies
[TABLE]
Then we define : by and . The following identity holds
[TABLE]
with the interpolation operator : such that for all . In particular, if , then
[TABLE]
and consequently (like in [33, Lemma 3.11]) we have as well as . Thus for each solution of problem (), the discrete control satisfies
[TABLE]
Therefore is also a solution of (). This is a justification for solving the fully discrete problem
[TABLE]
in order to get a solution of ().
The direct solution of (12.1) by means of a generalized Newton type method is a challenging problem since a proper globalization strategy is needed, see [39]. Thus we propose a solution strategy based on an additional -regularization of (12.1) with a parameter and a continuation method. For high values of the corresponding Newton type method converges independently of the initial guess in numerical practice. Thus the continuation strategy can be seen as simple globalization strategy.
On the continuous level we consider the following regularized problem
[TABLE]
It is possible to formulate a semi-smooth Newton method for this problem on the continuous level which is based on the following necessary and sufficient optimality condition
[TABLE]
with . Moreover, this semi-smooth Newton method is superlinear convergent. Let and be the unique solutions of (12.2) and (). Then we have in , see [33, 43, 26]. This justifies the use of a continuation strategy in . The control discretization described above can not be used for (12.2). Instead we propose to use discrete controls from , i.e.,
[TABLE]
cf. (11.1). In particular, we solve the following fully discrete regularized problem
[TABLE]
with
[TABLE]
where is the lumped mass matrix. Moreover, the operator is defined by
[TABLE]
The use of allows us to derive the following optimality conditions for (12.4)
[TABLE]
for all and , with , cf. (12.3). Based on (12.5) we can set up a semi-smooth Newton method. Since problem (12.4) is a discretization of (12.2), we can expect that this method behaves mesh independently. Let be the solution of (12.4) and we define
[TABLE]
As the control tends to a solution of (12.1) justifying the use of this control discretization and the continuation strategy. For more details see [43].
13. Numerical results
In this section, we present results of numerical experiments and consider two examples both involving zero initial data , the control space and the tracking functional
[TABLE]
with the time independent desired state which is a Gaussian centered at . We choose and as an irrational parameter.
For sufficiently large (), we expect that the optimal control consists of one point source with a position close to . If the Gaussian would move through the domain, a point source shaped is not able to follow the center of the Gaussian since contains no moving point sources. The optimal control would rather consist of some additional fixed point sources. This would not lower the regularity of the state whereas a moving point source can cause it.
The domain and the time segment are discretized by the uniform grids for and where with . The stability parameter is fixed to its lowest value ensuring unconditional stability of the time-stepping method. The discrete control problem is solved for and the fixed and then vice versa. The solution process has been described above in Section 12. Numerically the desired state is replaced by for simplicity, moreover the corresponding error is negligible. Since the optimal pairs are not known in our examples, we replace them by reference solutions which are taken as the approximate solutions on the finest grid level.
Example 1. We first take the constant coefficients and and set . We depict the reference solution in Figure 1.
As expected, the optimal control consists only of one point source positioned in the vicinity of . Thus, the state has a kink at this position. Due to reflections at the boundary, has also kinks at other positions.
Next, we discuss the convergence results. In Figure 2, we see the convergence rate of (left) and the objective functional (right) as refines. The state error behaves mostly in a linear way and the rate for the functional is close to two; as usual the latter is approximately the doubled rate of the former, and fortunately both are better than the above proved theoretical rates.
In Figure 3, we see the similar results as refines. The error of the functional stagnates at the last refinement that is caused by a too coarse space grid. Nevertheless, we observe reduced rates for much less than two caused by its reduced regularity (kinks).
Example 2. Now we take the variable coefficient
[TABLE]
and set . Our analysis does not cover discontinuous coefficients, but they are of great importance in applications, for example, in seismic tomography. A jump discontinuity in translates to a jump in the wave speed which can be related to two different material characteristics changing at the point of discontinuity. Note that the point of discontinuity is a grid point for all grid levels.
The reference solution is displayed in Figure 4. Once again consists of one Dirac measure with a time-dependent intensity located in the vicinity of and thus has a kink at this position. Moreover, we can clearly see that at the wave speed changes and the wave propagation becomes slower.
In Figure 5 we observe that the error of the state variable converges in a linear way whereas the error measured in the objective functional behaves quadratically.
Finally in Figure 6 we study the error behaviour for -refinement and find the similar rates of convergence. So somewhat surprisingly we find that the convergence behavior of the error is comparable to the previous Example 1 with that stimulates further possible studies.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] L. Bales and I. Lasiecka , Continuous finite elements in space and time for the nonhomogeneous wave equation , Computers & Mathematics with Applications, 27 (1994), pp. 91 – 102.
- 2[2] W. Bangerth, M. Geiger, and R. Rannacher , Adaptive Galerkin finite element methods for the wave equation , Comput. Methods Appl. Math., 10 (2010), pp. 3–48.
- 3[3] J. Bergh and J. Löfström , Interpolation Spaces. An Introduction , Springer, Berlin-New York, 1976.
- 4[4] A. Bermúdez, P. Gamallo, and R. Rodríguez , Finite element methods in local active control of sound , SIAM J. Control Optim., 43 (2004), pp. 437–465.
- 5[5] H. Brezis , Functional analysis, Sobolev spaces and partial differential equations , Springer, New York, 2011.
- 6[6] V. I. Bogachev , Measure theory. Vol. I, II , Springer, Berlin, 2007.
- 7[7] K. Bredies and H. K. Pikkarainen , Inverse problems in spaces of measures , ESAIM Control Optim. Calc. Var., 19 (2013), pp. 190–218.
- 8[8] S. C. Brenner and L. R. Scott , The mathematical theory of finite element methods , vol. 15 of Texts in Applied Mathematics, Springer, New York, 3 ed., 2008.
