Optimal a priori error estimates of parabolic optimal control problems with a moving point control
Dmitriy Leykekhman, Boris Vexler

TL;DR
This paper establishes optimal a priori error estimates for a parabolic optimal control problem involving a moving point source, correcting previous flawed analysis and providing new error bounds with logarithmic factors.
Contribution
It offers the first correct proof of optimal error estimates for this problem, including global and local error analysis on a curve, improving upon prior flawed results.
Findings
Optimal convergence rates achieved in discretization
Error estimates include logarithmic factors
Corrected proof addresses previous flaws in analysis
Abstract
In this paper we consider a parabolic optimal control problem with a Dirac type control with moving point source in two space dimensions. We discretize the problem with piecewise constant functions in time and continuous piecewise linear finite elements in space. For this discretization we show optimal order of convergence with respect to the time and the space discretization parameters modulo some logarithmic terms. Error analysis for the same problem was carried out in the recent paper [17], however, the analysis there contains a serious flaw. One of the main goals of this paper is to provide the correct proof. The main ingredients of our analysis are the global and local error estimates on a curve, that have an independent interest.
Dmitriy Leykekhman: Department of Mathematics, University of Connecticut, Storrs, CT 06269, USA, email: [email protected]
Boris Vexler: Technical University of Munich, Chair of Optimal Control, Center for Mathematical Sciences, Boltzmannstraße 3, 85748 Garching by Munich, Germany, email: [email protected]
1 Introduction
In this paper we provide numerical analysis for the following optimal control problem:
[TABLE]
subject to the second order parabolic equation
[TABLE]
and subject to pointwise control constraints
[TABLE]
Here , is a convex polygonal domain and is the Dirac delta function at point at each . We will assume:
Assumption 1
- and .
Assumption 2
- , for any , with .
The parameter is assumed to be positive and the desired state fulfills . The control bounds fulfill . The precise functional-analytic setting is discussed in the next section.
For the discretization, we consider standard continuous piecewise linear finite elements in space and the piecewise constant discontinuous Galerkin method in time. This is a special case (, ) of the so-called dG()cG() discretization; see, e.g., ErikssonK_JohnsonC_ThomeeV_1985 for the analysis of the method for parabolic problems and MeidnerD_VexlerB_2008a ; MeidnerD_VexlerB_2008b for error estimates in the context of optimal control problems. Throughout, we will denote by the spatial mesh size and by the size of the time steps; see Section 3 for details.
The main result of the paper is the following.
Theorem 1.1
Let be the optimal control for the problem (1)-(2) and be the optimal dG(0)cG(1) solution. Then there exists a constant independent of and such that
[TABLE]
We would also like to point out that in addition to the optimal order estimate, modulo logarithmic terms, our analysis does not require any relationship between the sizes of the space discretization and the time steps .
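As a rough illustration of what an optimal order estimate "modulo logarithmic terms" looks like in a convergence study, the following sketch computes observed orders from a sequence of errors. The error model `C * |ln h|**p * h**2`, the exponent `p`, and the function names are assumptions made for this illustration only; the values are synthetic and are not results from the paper.

```python
import math

def observed_orders(hs, errors):
    """Observed order between consecutive levels: log(e1/e2) / log(h1/h2)."""
    return [
        math.log(errors[i] / errors[i + 1]) / math.log(hs[i] / hs[i + 1])
        for i in range(len(hs) - 1)
    ]

def synthetic_error(h, p=1, C=1.0):
    """Hypothetical error model C * |ln h|^p * h^2 (not data from the paper)."""
    return C * abs(math.log(h)) ** p * h ** 2

hs = [2.0 ** (-j) for j in range(4, 9)]          # h = 1/16, ..., 1/256
orders = observed_orders(hs, [synthetic_error(h) for h in hs])
print([round(o, 2) for o in orders])
```

Under this model the observed orders stay below 2 and approach it only slowly as the mesh is refined, which is the typical signature of a logarithmic factor.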
The problem with a fixed location of the point source (i.e. with for some fixed ) has been investigated in a number of publications, starting with the work of Lions LionsJL_1971 ; see AmourouxM_BabaryJP_1978 ; BanksHT_1992 ; ChryssoverghiI_1981 ; DroniouJ_RaymondJP_2000 ; NguyenPA_RaymondJP_2011 for the continuous problem and GongW_HinzeM_ZhouZ_2014 ; LeykekhmanD_VexlerB_2013 ; LeykekhmanD_VexlerB_2016c for the finite element approximation and error estimates. There is also a closely related problem of measure-valued controls, which has received a lot of attention lately CasasE_ClasonC_KunischK_2013 ; CasasE_KunischK_2016 ; CasasE_VexlerB_ZuazuaE_2015 ; CasasE_ZuazuaE_2013 ; KunischK_PieperK_VexlerB_2014 .
The problem with moving Dirac was considered in CastroC_ZuazuaE_2004a ; NguyenPA_RaymondJP_2001 on a continuous level. The error analysis was carried out in the recent paper GongW_YanN_2016a . However, the analysis there contains a serious flaw. The last inequality in the estimate in GongW_YanN_2016a is not correct. One of the main goals of this paper is to provide the correct proof. The main ingredients of our analysis are the global and local error estimates on a curve, Theorem 3.1 and Theorem 3.2, respectively. These results are new and have an independent interest.
Throughout the paper we use the usual notation for Lebesgue and Sobolev spaces. We denote by the inner product in and by the inner product in for any subinterval .
The rest of the paper is organized as follows. In Section 2 we discuss the functional analytic setting of the problem, state the optimality system and prove regularity results for the state and for the adjoint state. In Section 3 we establish important global and local best approximation results along the curve for the heat equation. Finally in Section 4 we prove our main result.
2 Optimal control problem and regularity
In order to state the functional analytic setting for the optimal control problem, we first introduce the auxiliary problem
[TABLE]
with a right-hand side for some . This equation possesses a unique solution
[TABLE]
Due to the convexity of the polygonal domain the solution possesses an additional regularity for :
[TABLE]
with the corresponding estimate
[TABLE]
see, e.g., EvansLC_2010 . From the Sobolev embedding for any in two space dimensions and the previous lemma we can establish the following result for ,
[TABLE]
The exact form of the constant can be traced, for example, from the proof of (AltHW_2016, , Thm. 10.8). In addition, there holds the following regularity result (see LeykekhmanD_VexlerB_2013 ).
Lemma 1
If for an arbitrary , then and
[TABLE]
where , as .
We will also need the following local regularity result (see LeykekhmanD_VexlerB_2013 ).
Lemma 2
Let and for some . Then and there exists a constant independent of such that
[TABLE]
To introduce a weak solution of the state equation (2) we use the method of transposition, (cf. LionsJL_MagenesE_Vol2 ). For a given control we denote by with a weak solution of (2), if for all with there holds
[TABLE]
where is the weak solution of the adjoint equation
[TABLE]
The existence of this weak solution follows by duality using the embedding for . Using Lemma 1 we can prove additional regularity for the state variable .
Proposition 2.1
Without loss of generality we assume . Let be given and be the solution of the state equation (2). Then for any and the following estimate holds for with a constant independent of ,
[TABLE]
Proof
To establish the result we use a duality argument. There holds
[TABLE]
Let be the solution to (7) for with . From Lemma 1, and the following estimate holds
[TABLE]
Thus,
[TABLE]
Remark 1
We would like to note that the above regularity requires only Assumption 2 on . Higher regularity of is needed for optimal order error estimates only.
A further regularity result for the state equation follows from ElschnerJ_RehbergJ_SchmidtG_2007 .
Proposition 2.2
Let be given and be the solution of the state equation (2). Then for each there holds
[TABLE]
Moreover, the state fulfills the following weak formulation
[TABLE]
where and is the duality product between and .
Proof
For we have and therefore is embedded into . Therefore the right-hand side of the state equation can be identified with an element in . Using the result from (ElschnerJ_RehbergJ_SchmidtG_2007, , Theorem 5.1) on maximal parabolic regularity and exploiting the fact that is an isomorphism, see JerisonD_KenigCE_1995 , we obtain
[TABLE]
Given the above regularity the corresponding weak formulation is fulfilled by a standard density argument.
As the next step we introduce the reduced cost functional on the control space by
[TABLE]
where is the cost function in (1) and is the weak solution of the state equation (2) as defined above. The optimal control problem can then be equivalently reformulated as
[TABLE]
where the set of admissible controls is defined according to (3) by
[TABLE]
By standard arguments this optimization problem possesses a unique solution with the corresponding state for all , see Proposition 2.1 for the regularity of . Since this optimal control problem is convex, the solution is equivalently characterized by the optimality condition
[TABLE]
The (directional) derivative for given can be expressed as
[TABLE]
where is the solution of the adjoint equation
[TABLE]
and on the right-hand side of (11a) is the solution of the state equation (2). The adjoint solution, which corresponds to the optimal control is denoted by .
The optimality condition (10) is a variational inequality, which can be equivalently formulated using the projection
[TABLE]
The resulting condition reads:
[TABLE]
In the next proposition we provide regularity results for the solution of the adjoint equation.
Proposition 2.3
Let be given, let be the corresponding state fulfilling (2) and let be the corresponding adjoint state fulfilling (11). Then,
- (a)
* and the following estimate holds*
[TABLE]
- (b)
If , then for all and the following estimate holds
[TABLE]
Proof
- (a)
The right-hand side of the adjoint equation fulfills for all , see Proposition 2.1. Due to the convexity of the domain we directly obtain and the estimate
[TABLE]
The result from Proposition 2.1 leads directly to the first estimate.
- (b)
From Lemma 2 for we have
[TABLE]
Hence, by the triangle inequality and Proposition 2.1 we obtain
[TABLE]
That completes the proof.
3 Discretization and the best approximation type results
3.1 Space-time discretization and notation
For the discretization of the problem under consideration we introduce a partition of into subintervals of length , where . We assume that
[TABLE]
The maximal time step is denoted by . The semidiscrete space of piecewise constant functions in time is defined by
[TABLE]
where is the space of constant functions in time with values in Banach space . We will employ the following notation for functions in
[TABLE]
Let denote a quasi-uniform triangulation of with a mesh size , i.e., is a partition of into triangles of diameter such that for ,
[TABLE]
hold. Let be the set of all functions in that are linear on each , i.e. is the usual space of continuous piecewise linear finite elements. We will require the modified Clément interpolant and the -projection defined by
[TABLE]
To obtain the fully discrete approximation we consider the space-time finite element space
[TABLE]
We will also need the following semidiscrete projection defined by
[TABLE]
and the fully discrete projection defined by .
To introduce the dG(0)cG(1) discretization we define the following bilinear form
[TABLE]
where is the duality product between and . We note that the first sum vanishes for . Rearranging the terms, we obtain an equivalent (dual) expression for :
[TABLE]
In the two following theorems we establish global and local best approximation type results along the curve for the error between the solution of the auxiliary equation (4) and its dG(0)cG(1) approximation defined as
[TABLE]
Since the dG(0)cG(1) method is a consistent discretization, we have the following Galerkin orthogonality relation:
[TABLE]
3.2 Discretization of the curve and the weight function
To define the fully discrete optimization problem we will also require a discretization of the curve . We define by
[TABLE]
i.e., is a piecewise constant approximation of . Next we introduce a weight function
[TABLE]
and a discrete piecewise constant in time approximation
[TABLE]
Define
[TABLE]
One can easily check that and satisfy the following properties for any ,
[TABLE]
3.3 Global error estimate along the curve
In this section we prove the following global approximation result.
Theorem 3.1 (Global best approximation)
Assume and satisfy (4) and (20) respectively. Then there exists a constant independent of and such that for any ,
[TABLE]
Proof
To establish the result we use a duality argument. First, we introduce a smoothed Delta function, which we will denote by . This function on each is defined as and supported in one cell, which we denote by , i.e.
[TABLE]
In addition we also have (see (SchatzAH_WahlbinLB_1995, , Appendix))
[TABLE]
Thus in particular , , and .
We define to be a solution to the following backward parabolic problem
[TABLE]
There holds
[TABLE]
Let be the dG(0)cG(1) solution defined by
[TABLE]
Then, using that the dG(0)cG(1) method is consistent, we have
[TABLE]
where we have used the dual expression (19) for the bilinear form and the fact that the last term in (19) can be included in the sum by setting and defining consequently . The first sum in (19) vanishes due to . For each , integrating by parts elementwise and using that is linear in the spatial variable, by Hölder's inequality we have
[TABLE]
where denotes the jumps of the normal derivatives across the element faces.
From Lemma 2.4 in RannacherR_1991a we have
[TABLE]
where is the discrete Laplace operator, defined by
[TABLE]
To estimate the term involving the jumps in (29), we first use Hölder's inequality and the inverse estimate to obtain
[TABLE]
Now we use the fact that the equation (28) can be rewritten on each time level as
[TABLE]
or equivalently as
[TABLE]
where is the -projection, see (15). From (32) by the triangle inequality, we obtain
[TABLE]
Using that the -projection is stable in -norm (cf. CrouzeixM_ThomeeV_1987a ), we have
[TABLE]
Inserting the above estimate into (31) and using (25a), we obtain
[TABLE]
Combining (29) and (30) with the above estimates we have
[TABLE]
To complete the proof of the theorem it is sufficient to show
[TABLE]
Then from (33) and (34) it would follow that
[TABLE]
Then, using that the dG(0)cG(1) method is invariant on , by replacing and with and for any , we obtain Theorem 3.1.
The estimate (34) will follow from a series of lemmas. The first lemma treats the term .
Lemma 3
For any there exists such that
[TABLE]
where and are defined in (23) and (24), respectively.
Proof
The equation (28) for each time interval can be rewritten as (32). Multiplying (32) with and integrating over , we have
[TABLE]
We have
[TABLE]
By the Cauchy-Schwarz inequality and using (25b) we get
[TABLE]
On the other hand we have
[TABLE]
Using the identity
[TABLE]
we have
[TABLE]
By the Cauchy-Schwarz inequality, we obtain
[TABLE]
where in the last step we used that from (25d)
[TABLE]
for some . Using Young's inequality for , neglecting , and using the assumption on the time steps and that , we obtain
[TABLE]
To estimate , first by the Cauchy-Schwarz inequality and the approximation theory we have
[TABLE]
Using that is piecewise linear we have
[TABLE]
There holds and . Thus by the properties of (25b) and (25c), we have
[TABLE]
The same estimates hold for . Using these estimates, the fact that and the inverse inequality (in view of (25e) the inverse inequality is valid with inside the norm), we obtain
[TABLE]
To estimate we first notice that
[TABLE]
The proof is identical to the proof of in LeykekhmanD_VexlerB_2013 .
By the Cauchy-Schwarz inequality, (38), and Young's inequality, we obtain
[TABLE]
Using the estimates (36), (37), and (39) we have
[TABLE]
Summing over and using that we obtain the lemma.
The second lemma treats the term involving jumps.
Lemma 4
There exists a constant such that
[TABLE]
Proof
We test (32) with and obtain
[TABLE]
The first term on the right-hand side of (40) can be estimated using Young's inequality as
[TABLE]
The last term on the right-hand side of (40) can easily be estimated using (38) as
[TABLE]
Combining the above two estimates we obtain
[TABLE]
Summing over we obtain the lemma.
Lemma 5
There exists a constant such that
[TABLE]
Proof
Adding the primal (18) and the dual (19) representation of the bilinear form one immediately arrives at
[TABLE]
see e.g., MeidnerD_VexlerB_2008a . Applying this inequality together with the discrete Sobolev inequality, see (BrennerSC_ScottLR_2008, , Lemma 4.9.2), results in
[TABLE]
This gives the desired estimate.
We proceed with the proof of Theorem 3.1. From Lemma 3, Lemma 4, and Lemma 5 it follows that
[TABLE]
Taking sufficiently small we have (34). From (33) we can conclude that
[TABLE]
for some constant independent of and . Using that the dG(0)cG(1) method is invariant on , by replacing and with and for any , we obtain
[TABLE]
By the triangle inequality and the above estimate we deduce
[TABLE]
Taking the infimum over , we obtain Theorem 3.1.
3.4 Interior error estimate
To obtain optimal error estimates we will also require the following interior result.
Theorem 3.2 (Interior approximation)
Let denote a ball of radius centered at . Assume and satisfy (4) and (20) respectively and let . Then there exists a constant independent of , and such that for any
[TABLE]
Proof
To obtain the interior estimate we introduce a smooth cut-off function in space and piecewise constant in time, such that ,
[TABLE]
As in the proof of Theorem 3.1 we obtain by (29) that
[TABLE]
where is the solution of (28). Note that is discontinuous in time. The first term can be estimated using the global result from Theorem 3.1. To this end we introduce the solution defined by
[TABLE]
There holds
[TABLE]
Applying Theorem 3.1 for the second term, we have
[TABLE]
From (43), canceling and using the above estimate, we obtain
[TABLE]
It remains to estimate the term . Using the dual expression (19) of the bilinear form we obtain
[TABLE]
To estimate we define and proceed using the Ritz projection defined by
[TABLE]
There holds
[TABLE]
Using the estimate
[TABLE]
where in the last step we used (25a), we obtain
[TABLE]
By the interior pointwise error estimates from Theorem 5.1 in SchatzAH_WahlbinLB_1977 , we have for each ,
[TABLE]
since the support of is contained in . On there holds and therefore for each ,
[TABLE]
Inserting the last two estimates into (47) we get
[TABLE]
Using a standard elliptic estimate and recalling we have
[TABLE]
where in the last step we used . This results in
[TABLE]
Therefore, we get
[TABLE]
For we obtain
[TABLE]
where we used that and on this set as well as the definition of (17). Inserting the estimate (48) for and the estimate (49) for into (45) we obtain
[TABLE]
Using the estimate (34) and Lemma 4
[TABLE]
Inserting this inequality into (44) we obtain
[TABLE]
Using that the dG(0)cG(1) method is invariant on , by replacing and with and for any , we obtain the estimate in Theorem 3.2.
4 Discretization of the optimal control problem
In this section we describe the discretization of the optimal control problem (1)-(2) and prove our main result, Theorem 1.1. We start with discretization of the state equation. For a given control we define the corresponding discrete state by
[TABLE]
Using the weak formulation for from Proposition 2.2 we obtain the perturbed Galerkin orthogonality,
[TABLE]
Note that the jump terms involving vanish due to the fact that
[TABLE]
and .
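A minimal sketch of the time stepping behind such a discrete state equation, under loud simplifications: one space dimension on (0, 1) instead of the two-dimensional polygonal domain of the paper, a uniform mesh, homogeneous Dirichlet data, zero initial state, and both the curve and the control frozen at the right endpoint t_m of each time interval. For piecewise constant functions in time the dG(0) method amounts to an implicit Euler type step, and the Dirac source enters the load vector through the values of the hat functions at the source position. All function names (`hat_load`, `solve_tridiag`, `dg0cg1_moving_source`) are hypothetical and not taken from the paper.

```python
import math

def hat_load(x_src, n, h):
    """Values of the interior P1 hat functions at x_src (load of a unit Dirac)."""
    load = [0.0] * (n - 1)                 # entries for interior nodes 1..n-1
    j = min(int(x_src / h), n - 1)         # element [x_j, x_{j+1}] containing x_src
    w = (x_src - j * h) / h                # local coordinate in [0, 1]
    if j >= 1:
        load[j - 1] += 1.0 - w             # hat function of node j
    if j + 1 <= n - 1:
        load[j] += w                       # hat function of node j + 1
    return load

def solve_tridiag(off, diag, rhs):
    """Thomas algorithm for a symmetric tridiagonal matrix with constant entries."""
    m = len(rhs)
    c, d = [0.0] * m, [0.0] * m
    c[0], d[0] = off / diag, rhs[0] / diag
    for i in range(1, m):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * m
    x[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def dg0cg1_moving_source(gamma, q, T=1.0, n=32, steps=64):
    """March (M + k K) U_m = M U_{m-1} + k q(t_m) b(gamma(t_m)) with U_0 = 0."""
    h, k = 1.0 / n, T / steps
    m_diag, m_off = 2.0 * h / 3.0, h / 6.0     # P1 mass matrix entries
    k_diag, k_off = 2.0 / h, -1.0 / h          # P1 stiffness matrix entries
    u = [0.0] * (n - 1)
    for m in range(1, steps + 1):
        t_m = m * k
        load = hat_load(gamma(t_m), n, h)      # assumption: curve frozen at t_m on I_m
        rhs = []
        for i in range(n - 1):
            left = u[i - 1] if i > 0 else 0.0
            right = u[i + 1] if i < n - 2 else 0.0
            mass_u = m_off * left + m_diag * u[i] + m_off * right
            rhs.append(mass_u + k * q(t_m) * load[i])
        u = solve_tridiag(m_off + k * k_off, m_diag + k * k_diag, rhs)
    return u

# Usage: constant control, source oscillating around the midpoint of (0, 1).
u_T = dg0cg1_moving_source(lambda t: 0.5 + 0.25 * math.sin(2.0 * math.pi * t),
                           lambda t: 1.0)
print(max(u_T))
```

The only place where the moving point enters is `hat_load`: changing the curve changes which two hat functions receive the source, while the system matrix stays the same in every time step.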
Similarly to the continuous problem, we define the discrete reduced cost functional by
[TABLE]
where is the cost function in (1). The discretized optimal control problem is then given as
[TABLE]
where is the set of admissible controls (9). We note that the control variable is not explicitly discretized, cf. HinzeM_2005a . With standard arguments one proves the existence of a unique solution of (52). Due to the convexity of the problem, the following condition is necessary and sufficient for optimality,
[TABLE]
As on the continuous level, the directional derivative for given can be expressed as
[TABLE]
where is the solution of the discrete adjoint equation
[TABLE]
The discrete adjoint state, which corresponds to the discrete optimal control is denoted by . The variational inequality (53) is equivalent to the following pointwise projection formula, cf. (12),
[TABLE]
or
[TABLE]
on each . Due to the fact that , we have is piecewise constant and therefore by the projection formula also is piecewise constant. As a result no explicit discretization of the control variable is required.
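The pointwise projection can be sketched in a few lines. The sketch assumes the common form of the projection formula, in which the control on each time interval is the clamp of -z/alpha to [q_a, q_b], with z the value of the discrete adjoint state at the (discretized) curve position on that interval. This sign and scaling convention, and the names `project` and `controls_from_adjoint`, are assumptions for illustration rather than the paper's notation.

```python
def project(v, q_a, q_b):
    """Pointwise projection onto the interval [q_a, q_b]."""
    return min(max(v, q_a), q_b)

def controls_from_adjoint(z_on_curve, alpha, q_a, q_b):
    """One control value per time interval from adjoint values on the curve.

    z_on_curve[m] is assumed to hold the discrete adjoint state evaluated at
    the curve position on the m-th time interval (hypothetical input).
    """
    return [project(-z / alpha, q_a, q_b) for z in z_on_curve]

# Usage with made-up adjoint values: the bounds are active on the first and last interval.
print(controls_from_adjoint([-2.0, 0.25, 4.0], alpha=0.5, q_a=-1.0, q_b=1.0))
```

Because the input holds one value per time interval, the output does too, which mirrors the observation that the control inherits the piecewise constant structure without being discretized explicitly.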
To prove Theorem 1.1 we first need estimates for the error in the state and in the adjoint variables for a given (fixed) control . Due to the structure of the optimality conditions, we will have to estimate the error , where and . Note that is not the Galerkin projection of due to the fact that the right-hand side of the adjoint equation (11) involves and the right-hand side of the discrete adjoint equation (54) involves . To obtain an estimate of optimal order, we will first estimate the error with respect to the norm. Note that an estimate would not lead to an optimal result.
Theorem 4.1
Let be given and let be the solution of the state equation (2) and be the solution of the discrete state equation (50). Then there holds the following estimate
[TABLE]
Proof
We denote by the error and consider the following auxiliary dual problem
[TABLE]
where
[TABLE]
and the corresponding discrete solution defined by
[TABLE]
Using (51) for and the Galerkin orthogonality for we obtain,
[TABLE]
Using the local estimate from Theorem 3.2 with for any where , we obtain
[TABLE]
We take , where is the modified Clément interpolant and is the projection defined in (17). Thus, by the triangle inequality, approximation theory, inverse inequality and the stability of the Clément interpolant in norm, we have
[TABLE]
can be estimated similarly since for by the triangle inequality we have
[TABLE]
As a result
[TABLE]
Using Lemma 2, we obtain
[TABLE]
and hence
[TABLE]
For the terms and we obtain using an -estimate from MeidnerD_VexlerB_2008a
[TABLE]
can be estimated similarly since by the triangle inequality
[TABLE]
On the other hand using that for and that for , and using Assumption 1, we have
[TABLE]
where in the last two steps we used (56). Combining the estimate for , , , , and the above estimate and inserting them into (55) we obtain:
[TABLE]
Setting completes the proof.
In the following theorem we provide an estimate of the error in the adjoint state for fixed control .
Theorem 4.2
Let be given and let be the solution of the adjoint equation (11) and be the solution of the discrete adjoint equation (54). Then there holds the following estimate
[TABLE]
Proof
First by the triangle inequality
[TABLE]
Using Proposition 2.3 and the assumptions on , we have similarly to Theorem 4.1
[TABLE]
Setting , we obtain
[TABLE]
Next, we introduce an intermediate adjoint state defined by
[TABLE]
where and therefore is the Galerkin projection of . By the local best approximation result of Theorem 3.2 for any we have
[TABLE]
The terms , , , and can be estimated the same way as in the proof of Theorem 4.1 using the regularity result for the adjoint state from Proposition 2.3. This results in
[TABLE]
Setting and taking the square root, we obtain
[TABLE]
It remains to estimate the corresponding error between and . We denote . Then we have
[TABLE]
As in the proof of Lemma 5 we use the fact that
[TABLE]
holds for all . Applying this inequality together with the discrete Sobolev inequality, see BrennerSC_ScottLR_2008 , results in
[TABLE]
Therefore
[TABLE]
Using Theorem 4.1 we obtain
[TABLE]
Combining this estimate with (59) we complete the proof.
Using the result of Theorem 4.2 we proceed with the proof of Theorem 1.1.
Proof
Due to the quadratic structure of the discrete reduced functional the second derivative is independent of and there holds
[TABLE]
Using optimality conditions (10) for and (53) for and the fact that we obtain
[TABLE]
Using the coercivity (60) we get
[TABLE]
Applying Theorem 4.2 completes the proof.
References
- (1) H. W. Alt, Linear functional analysis, Universitext, Springer-Verlag London, Ltd., London, 2016. An application-oriented introduction, translated from the German edition by Robert Nürnberg.
- (2) M. Amouroux and J.-P. Babary, On the optimal pointwise control and parametric optimization of distributed parameter systems, Internat. J. Control, 28 (1978), pp. 789–807.
- (3) H. T. Banks, ed., Control and estimation in distributed parameter systems, vol. 11 of Frontiers in Applied Mathematics, Society for Industrial and Applied Mathematics (SIAM), Philadelphia, PA, 1992.
- (4) S. C. Brenner and L. R. Scott, The mathematical theory of finite element methods, vol. 15 of Texts in Applied Mathematics, Springer, New York, third ed., 2008.
- (5) E. Casas, C. Clason, and K. Kunisch, Parabolic control problems in measure spaces with sparse solutions, SIAM J. Control Optim., 51 (2013), pp. 28–63.
- (6) E. Casas and K. Kunisch, Parabolic control problems in space-time measure spaces, ESAIM Control Optim. Calc. Var., 22 (2016), pp. 355–370.
- (7) E. Casas, B. Vexler, and E. Zuazua, Sparse initial data identification for parabolic PDE and its finite element approximations, Math. Control Relat. Fields, 5 (2015), pp. 377–399.
- (8) E. Casas and E. Zuazua, Spike controls for elliptic and parabolic PDEs, Systems Control Lett., 62 (2013), pp. 311–318.
