Quasi-best approximation in optimization with PDE constraints
Fernando Gaspoz, Christian Kreuzer, Andreas Veeser, and Winnifried Wollner

TL;DR
This paper establishes quasi-best approximation bounds for finite element solutions to PDE-constrained quadratic optimization problems, linking error bounds to best approximation errors and analyzing parameter dependencies.
Contribution
It introduces a quasi-best approximation result for PDE-constrained optimization, including bounds that are independent of regularization parameters under certain conditions.
Findings
Error in state and adjoint state bounded by best approximation error
Constant depends on inverse square-root of Tikhonov parameter
Independence of approximation constant when operators are compact
Abstract
We consider finite element solutions to quadratic optimization problems, where the state depends on the control via a well-posed linear partial differential equation. Exploiting the structure of a suitably reduced optimality system, we prove that the combined error in the state and adjoint state of the variational discretization is bounded by the best approximation error in the underlying discrete spaces. The constant in this bound depends on the inverse square-root of the Tikhonov regularization parameter. Furthermore, if the operators of control-action and observation are compact, this quasi-best-approximation constant becomes independent of the Tikhonov parameter as the meshsize tends to zero, and we give quantitative relationships between meshsize and Tikhonov parameter ensuring this independence. We also derive generalizations of these results when the control variable is discretized…
Quasi-best approximation in optimization with PDE constraints
Fernando Gaspoz, Technische Universität Dortmund, Fakultät für Mathematik, Vogelpothsweg 87, 44227 Dortmund, Germany.
Christian Kreuzer, Technische Universität Dortmund, Fakultät für Mathematik, Vogelpothsweg 87, 44227 Dortmund, Germany.
Andreas Veeser, Dipartimento di Matematica ’F. Enriques’, Università degli Studi di Milano, Via C. Saldini, 50, 20133 Milano, Italy.
Winnifried Wollner, Technische Universität Darmstadt, Fachbereich Mathematik, Dolivostr. 15, 64293 Darmstadt, Germany.
Abstract.
We consider finite element solutions to quadratic optimization problems, where the state depends on the control via a well-posed linear partial differential equation. Exploiting the structure of a suitably reduced optimality system, we prove that the combined error in the state and adjoint state of the variational discretization is bounded by the best approximation error in the underlying discrete spaces. The constant in this bound depends on the inverse square-root of the Tikhonov regularization parameter. Furthermore, if the operators of control-action and observation are compact, this quasi-best-approximation constant becomes independent of the Tikhonov parameter as the meshsize tends to zero, and we give quantitative relationships between meshsize and Tikhonov parameter ensuring this independence. We also derive generalizations of these results when the control variable is discretized or when it is taken from a convex set.
1. Introduction
Optimization problems with PDE constraints are ubiquitous. A basic, and regularly considered, example is
[TABLE]
where denotes the -norm over some underlying domain, is the desired state and scales the cost of the control. Additionally, constraints on the control and/or the state can be added. The error due to a discretization of the state equation, and possibly the control, has been analyzed. For piecewise constant discretizations of the control, this has been done in [9, 12], including possible box constraints on the control variable; see also the summary of obtainable convergence orders, including Neumann control, in [16]. Elementwise linear functions for the control have been considered in [3, 21] in the presence of control constraints.
In [14] it was observed that the minimization problem can be solved without prescribing a discretization of the control: the control can be recovered from the optimality condition, and thus a discretization of the control is induced by the discretization of the state equation. With this approach, convergence for the control in could be shown even in the presence of box control constraints. It was observed in [18] that the same convergence order can be obtained if a discretized control is used and a post-processing step based upon the optimality conditions is applied.
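The idea of recovering the control from the optimality condition can be sketched in one space dimension. The following is a minimal sketch, not the authors' implementation. It assumes the standard model problem of minimizing 1/2 ||u - u_d||^2 + alpha/2 ||q||^2 subject to -u'' = q on (0, 1) with homogeneous Dirichlet conditions, linear finite elements on a uniform mesh, and the sign convention q = -z/alpha for the adjoint state z. The names `p1_matrices`, `solve_variational_discretization`, and the symbols q, u, z, alpha are illustrative choices made here.

```python
import numpy as np

def p1_matrices(n):
    """Stiffness and mass matrices of linear finite elements on a uniform
    mesh of (0, 1) with n cells, restricted to the n - 1 interior nodes."""
    h = 1.0 / n
    m = n - 1
    K = (2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)) / h
    M = (4.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)) * h / 6.0
    return K, M

def solve_variational_discretization(n, alpha, u_d):
    """Solve the discrete reduced optimality system (control eliminated):
         K u + (1/alpha) M z = 0        state equation with q = -z/alpha
        -M u +           K z = -M u_d   adjoint equation
    The desired state u_d is replaced by its nodal interpolant (assumption)."""
    K, M = p1_matrices(n)
    m = n - 1
    x = np.linspace(0.0, 1.0, n + 1)[1:-1]   # interior nodes
    d = u_d(x)
    system = np.block([[K, M / alpha], [-M, K]])
    rhs = np.concatenate([np.zeros(m), -M @ d])
    sol = np.linalg.solve(system, rhs)
    u, z = sol[:m], sol[m:]
    q = -z / alpha                           # control recovered a posteriori
    return x, u, z, q

# Manufactured solution: u = sin(pi x), q = pi^2 sin(pi x),
# which corresponds to u_d = (1 + alpha pi^4) sin(pi x).
alpha = 1e-2
x, u, z, q = solve_variational_discretization(
    64, alpha, lambda s: (1.0 + alpha * np.pi**4) * np.sin(np.pi * s)
)
print("max nodal state error:  ", np.max(np.abs(u - np.sin(np.pi * x))))
print("max nodal control error:", np.max(np.abs(q - np.pi**2 * np.sin(np.pi * x))))
```

No separate control space appears in the sketch: only the state and adjoint state are discretized, and the control is a by-product of the adjoint state.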
Due to the structure of the objective in (1.1), the above-mentioned estimates make use of the ‘natural norm’
[TABLE]
Although this norm is natural due to the functional, it induces a scaling in all estimates involving the control. Further estimates, for instance of -norms of the state, thereby also contain this scaling. Moreover, the above ‘natural norm’ is not balanced in terms of approximation accuracy, i.e., the error of the state in will typically decay at least as fast as the error of the control.
The latter effect, however, is invisible as long as the approximation accuracy of both terms is limited by the selected discrete spaces, and not by the regularity of the solutions, as is typically the case for the model (1.1). However, in the presence of pointwise constraints on the state, see, e.g., [2, 7, 17, 8, 19], or on the gradient of the state [6, 13, 20, 25], optimal order estimates can only be obtained for the control variable, while numerical experiments show a faster convergence of the error in the state variable in .
As an alternative to the aforementioned works, one may combine the error in the state with the error in the (suitably rescaled) adjoint state, measuring both in the norms that are given by the functional analytic set-up of the PDE constraint. For problem (1.1), this leads to the norm
[TABLE]
where denotes the -norm. For respective counterparts of (1.2), Chrysafinos and Karatzas [5, 4] prove so-called symmetric error estimates or quasi-best approximation results. The growth of the quasi-best-approximation constant is limited by and , respectively.
In this article, we prove abstract quasi-best approximation results, where the discretization error is measured in a counterpart of (1.2). In order to illustrate our results, assume that the underlying domain is convex, let be a sequence of conforming finite-dimensional spaces that approximates , and consider the variational discretization of (1.1). If we denote by the pairs of approximate primal and dual states, our results yield (cf. Theorem 3.2 and Example 3.8)
[TABLE]
with
[TABLE]
Here is the constant in the Friedrichs inequality and is an interpolation constant depending on the shape regularity on the underlying meshes. In contrast to the first, non-asymptotic relationship, the second, asymptotic one exploits the compactness of the observation and control-action operators and elliptic regularity theory. Notably, the latter reveals that Céa’s lemma, which holds for the constraint discretization, is recovered as and, in particular, ensures an approximation quality independent of for .
The rest of the paper proceeds as follows. In Section 2, we state precisely the considered problem class, allowing for any linear, bounded, and inf-sup-stable operator in the constraint. Furthermore, we reduce the optimality system by eliminating the control, and we lay the groundwork for our results by a careful discussion of the continuity and nondegeneracy properties of the associated bilinear form.
Section 3 constitutes the core of this work and establishes quasi-best approximation for the variational discretization. To this end, the variational discretization is viewed as a Petrov-Galerkin method and we employ the formula for the quasi-best-approximation constant in Tantardini and Veeser [23]. For the asymptotic behavior of the quasi-best-approximation constant, we additionally invoke a duality argument, which is similar to, but simpler than, Schatz [22].
The last two sections center on generalizations of these results. In Section 4, we consider approximate control-action operators, covering in particular the discretization of the control variable. Finally, Section 5 deals with nonlinear optimality systems arising from additional convex constraints for the control. The derived results complement those of the linear case, and the simplification of Schatz’ argument proves quite useful.
2. Model optimization problem and reduced optimality system
We introduce our model optimization problem. Assume that the control variable is taken from a real Hilbert space with scalar product and induced norm . Its corresponding state is determined by solving a linear boundary value problem of the form
[TABLE]
with the following setting:
- The state space is a Hilbert space with scalar product and induced norm . Its dual and the corresponding duality pairing are indicated with and , respectively.
- The differential operator is induced by a bilinear form , where is a second Hilbert space with scalar product , induced norm , dual space , and dual pairing . We assume that the bilinear form is bounded and satisfies the following inf-sup conditions:
[TABLE]
Employing well-known inf-sup theory (cf., e.g., Babuška [1]), we see that the operator is linear and boundedly invertible.
- The control-action operator is linear and bounded with constant .
Our goal is then to numerically solve the constrained optimization problem
[TABLE]
where we assume in addition:
- The desired “state” is an element of a Hilbert space with scalar product and induced norm .
- The observation operator is linear and bounded with constant .
- The cost of the control, which can be viewed as a Tikhonov regularization, is scaled with the parameter .
Problem (2.3) is a quadratic minimization problem with a linear constraint. The objective function is convex with respect to
[TABLE]
and strictly convex in . Consequently, standard arguments ensure the existence of a unique solution; see, e.g., Lions [15, Theorem 1.1] or Tröltzsch [24, Chapter 2.5].
If , , is the (weak) Laplacian, and and are the canonical compact immersions L^{2}\to (H^{1}_{0})^{*} and , then (2.3) simplifies to the optimization problem (1.1) in the introduction. Notice that, in this case, the operators and are related by .
To formulate the optimality system for (2.3), it is useful to define the adjoint operators , , of , , by
[TABLE]
for all , , , . Thanks to the convexity of the problem (2.3), a pair is a minimum point if and only if there exists such that
[TABLE]
We may eliminate by inserting the last equation into the first one and multiplying the second equation by . We thus obtain the following reduced optimality system for the pair :
[TABLE]
Notice that the second row of equations, , suggests scaling the adjoint state by the factor , while the first row, , suggests no scaling at all. As a compromise, we propose to use and .
We thus transform the optimality system (2.5) into
[TABLE]
and the reduced optimality system (2.6) into
[TABLE]
This rescaled and reduced optimality system deviates from the usual KKT formulation, but has an interesting structure. Like the KKT formulation, it is symmetric even for non-symmetric . The off-diagonal consists of two interrelated invertible operators, while the diagonal entries are (semi-)definite, symmetric operators. Notice that, upon inverting the rows, the roles of the diagonal and off-diagonal can be exchanged. For the optimization problem (1.1), the operator matrix is then diagonally dominant in that and are compact operators.
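The elimination of the control can be checked in finite dimensions. The sketch below is an illustration under stated assumptions, not the paper's formulation. It uses random matrices `A`, `C`, `I_obs` as hypothetical stand-ins for the constraint, control-action, and observation operators, Euclidean scalar products in all spaces, the unscaled adjoint state, and the sign convention alpha q + C^T p = 0. It compares the control obtained from the reduced two-by-two block system with the minimizer of the reduced quadratic functional computed from its normal equations.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 6, 4, 5                       # state, control, observation dimensions
A = rng.standard_normal((n, n)) + n * np.eye(n)   # invertible, non-symmetric
C = rng.standard_normal((n, m))         # control-action operator (hypothetical)
I_obs = rng.standard_normal((k, n))     # observation operator (hypothetical)
y_d = rng.standard_normal(k)            # desired "state"
alpha = 1e-1                            # Tikhonov parameter

# Reference: minimize 1/2 |I_obs y - y_d|^2 + alpha/2 |q|^2 with A y = C q,
# i.e. solve the normal equations of the reduced functional in q.
S = I_obs @ np.linalg.solve(A, C)       # control-to-observation map
q_ref = np.linalg.solve(alpha * np.eye(m) + S.T @ S, S.T @ y_d)

# Reduced optimality system in (y, p) after inserting q = -C^T p / alpha:
#    A y + (1/alpha) C C^T p = 0
#   -I_obs^T I_obs y + A^T p = -I_obs^T y_d
block = np.block([[A, C @ C.T / alpha],
                  [-I_obs.T @ I_obs, A.T]])
rhs = np.concatenate([np.zeros(n), -I_obs.T @ y_d])
sol = np.linalg.solve(block, rhs)
y, p = sol[:n], sol[n:]
q = -C.T @ p / alpha                    # control recovered from the adjoint

print("difference of controls:", np.linalg.norm(q - q_ref))
```

Swapping the two block rows puts the two copies of the constraint operator on the diagonal, which is the row exchange mentioned in the text.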
Let us give a weak formulation of the rescaled and reduced optimality system. Its rows are equivalently written as
[TABLE]
and so we are led to introduce the Hilbert space
[TABLE]
and the bilinear form given by
[TABLE]
for . Note that we use the same letter for the bilinear form inducing the operator and for the one in (2.10b); this “operator overloading” should not cause confusion when the domain is clear. If not, we shall distinguish the two forms by writing or . In this notation, the variational formulation of the rescaled and reduced optimality system (2.8) simply reads
[TABLE]
A pair is a solution of (2.11) if and only if is a solution of (2.9) if and only if the triple verifies the rescaled optimality system (2.7). Consequently, thanks to the convexity of (2.3), if is a solution of (2.11), then is a solution of the original optimization problem (2.3).
Let us analyze the bilinear form . We readily see that
[TABLE]
but is not coercive in general. Consider, for example, a set-up where there exists such that . Then is not coercive and so, even for coercive, also is not coercive for sufficiently small.
In order to obtain further properties, let us first consider the contributions and separately. The bilinear form is closely related to the original minimization problem (2.3) and its “energy seminorm” (2.4). To see this, observe that, if and , we have the correspondence
[TABLE]
which motivates introducing the seminorm
[TABLE]
on . Thus, denoting by the kernel of and realizing that the bilinear form is well-defined on the quotient space , we see that
[TABLE]
where the second identity relies on
[TABLE]
Since
[TABLE]
with
[TABLE]
the form is also continuous in , with constant .
The bilinear form inherits its continuity and nondegeneracy properties from . More precisely, we have
[TABLE]
with and from (2.2). While the first identity is straightforward, the second one hinges on the inf-sup-duality (cf. Babuška [1])
[TABLE]
for with domain .
Turning to the complete bilinear form , we may sum up the continuity properties as follows: for all , we have
[TABLE]
with
[TABLE]
Here we have equipped as trial space with and as test space with . The former is in accordance with our aims in the error analysis below, and the latter avoids in particular a dependence on of the continuity constant of and in the following bound for the right-hand side in (2.11): for all ,
[TABLE]
The derivation of the nondegeneracy properties of the bilinear form is more subtle. In order to establish the crucial inf-sup condition (2.2c), let be given.
In order to find a suitable , we combine the nondegeneracy properties of and in the ansatz
[TABLE]
thanks to the continuity (2.14) of and . Using the inequality with and
[TABLE]
and recalling (2.22b), we arrive at
[TABLE]
where the norms on the right-hand side coincide with those in the continuity bound (2.19). We therefore have the following basic result.
Theorem 2.1 (Bilinear form of reduced optimality system).
If we equip as trial space with and as test space with , then the inf-sup constant and the continuity constant of the bilinear form (2.10) satisfy
[TABLE]
where is defined by the relations (2.23).
The inequalities of Theorem 2.1 yield for the condition number of the bilinear form (i.e., the ratio of its continuity constant to its inf-sup constant)
[TABLE]
The second factor, the condition number of the bilinear form associated with the constraint, is expected to be a kind of lower bound. In this vein, we may view the first factor as a bound for the possible amplification of the constraint conditioning, resulting from the interplay of constraint and the objective in the constrained optimization problem (2.3). Inspecting (2.23), we see that is a function of the parameters , , , and . The next three remarks discuss asymptotic behaviors of that will play major roles in what follows or are of independent interest.
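For a bilinear form on finite-dimensional spaces, the continuity constant, the inf-sup constant, and hence the condition number in the above sense can be computed from a singular value decomposition. The following sketch assumes that the form is represented by a square matrix `B` via b(v, w) = w^T B v and that the trial and test norms are induced by symmetric positive definite Gram matrices. The function name `continuity_and_infsup` is a choice made here for illustration.

```python
import numpy as np

def continuity_and_infsup(B, G_trial, G_test):
    """Continuity and inf-sup constants of b(v, w) = w^T B v with respect to
    the norms |v|^2 = v^T G_trial v (trial) and |w|^2 = w^T G_test w (test).

    With Cholesky factors G = L L^T, both constants are the extreme singular
    values of the matrix L_test^{-1} B L_trial^{-T}."""
    L1 = np.linalg.cholesky(G_trial)
    L2 = np.linalg.cholesky(G_test)
    T = np.linalg.solve(L2, B)            # L_test^{-1} B
    T = np.linalg.solve(L1, T.T).T        # L_test^{-1} B L_trial^{-T}
    s = np.linalg.svd(T, compute_uv=False)
    return s[0], s[-1]                    # largest, smallest singular value

B = np.diag([3.0, 1.0])
c_b, beta = continuity_and_infsup(B, np.eye(2), np.eye(2))
print("continuity:", c_b, "inf-sup:", beta, "condition number:", c_b / beta)
```

Rescaling both norms by a common factor leaves the ratio unchanged, while changing only one of the norms can alter it, which is why the choice of trial and test norms matters in the discussion above.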
Remark 2.2 (Amplification for pure constraint case).
Consider the special case and . Then the rescaled and reduced optimality system (2.8) is a well-posed ‘double’ boundary value problem. Its condition number with respect to is ; cf. (2.17). As and imply , , and so and , this is reproduced by Theorem 2.1.
It is worth mentioning that this limiting case of “pure constraint” is attained in a continuous manner:
[TABLE]
where is essentially the operator norm of the perturbation.
Remark 2.3 (Amplification for degenerating constraint).
While the continuity constant of the bilinear form does not enter , its inf-sup constant does, in a critical manner. More precisely, we have
[TABLE]
Notice that the fraction involving has only values in the interval .
Remark 2.4 (Amplification for vanishing regularization).
Consider the limit of the Tikhonov regularization parameter (while and are fixed). Then so that
[TABLE]
Let us see with a simple example that the inf-sup constant in Theorem 2.1 can blow up with this rate and so the lower bound therein cannot be improved for small without further assumptions on the structure of .
Consider , where and are the Euclidean norm in ,
[TABLE]
and . The symmetric bilinear form of the optimality system is then given by the matrix
[TABLE]
For , we have and
[TABLE]
so that
[TABLE]
Hence, the asymptotic behavior of in (2.25) is attained.
The chosen norms for as trial and test space are not always the most convenient ones. This is illustrated by the following remark, which considers a special case.
Remark 2.5 (Coercive constraints with ).
Suppose that and with coinciding scalar products and norms and that the bilinear form is coercive with constant and . It is worth noting that, as is not necessarily symmetric, the best coercivity constant may be much smaller than the inf-sup constant . Given , we proceed as in (2.22) taking , , and obtain
[TABLE]
Hence, in this case, the condition number of with respect to the norms in (2.27) is independent of the Tikhonov regularization parameter . Nevertheless, if , also this choice of norms cannot offer in general an asymptotic behavior better than as . In fact, re-computing the example in Remark 2.4 with the norms in (2.27) does not change the behavior of its inf-sup constant.
Let us conclude this section with the following side product of our discussion of the bilinear form .
Corollary 2.6 (Existence and uniqueness).
The rescaled and reduced optimality system (2.11) and thus (2.5) has a unique solution.
Proof.
Inequality (2.24) ensures (2.2c) for the bilinear form and, thanks to the algebraic symmetry of , also (2.2b). ∎
3. Analysis for variational discretization
In this section, we analyze the error of the variational discretization of the optimization problem (2.3) according to Hinze [14]. Our key tool is the rescaled and reduced optimality system (2.8), whose Galerkin solution coincides with the approximate solution of the variational discretization.
3.1. Variational discretization and reduced optimality system
We start by discretizing the PDE constraint (2.1) of the optimization problem (2.3). Recalling its variational formulation
[TABLE]
we choose some conforming finite-dimensional spaces , , such that the restriction of the bilinear form on is nondegenerate. The corresponding Petrov-Galerkin method then reads
[TABLE]
Using this for the constraint in (2.3), we arrive at the (semi-)discrete optimization problem
[TABLE]
where we, in addition, assume that can be exactly evaluated for any function from . As in the continuous case, is the unique solution of (3.1) if and only if there exists such that
[TABLE]
Also here, we may eliminate the approximate control by inserting the third equation into the first one. Setting , the variational formulation of the ensuing discrete rescaled and reduced optimality system is
[TABLE]
Its solution is the Galerkin approximation in to the solution of the variational formulation (2.11) of the rescaled and reduced optimality system. Applying Corollary 2.6 to the discrete spaces therefore yields the following result on existence and uniqueness for the variational discretization of (2.11).
Lemma 3.1 (Discrete well-posedness).
The discrete reduced optimality system (3.3) has a unique variational solution . Consequently, the pair with is the unique solution of the semidiscrete optimization problem (3.1).
Remarkably, the approximate solutions of the variational discretization (3.2) are computable whenever and can be evaluated exactly.
3.2. Non-asymptotic quasi-best approximation
We shall assess the quality of the Galerkin approximation from (3.3), assuming that we are interested particularly in the -error of the approximate state . For this purpose, we compare it with a suitable best error in .
Let us first recall some basic results in Petrov-Galerkin approximation, which we already formulate for the discretization of the constraint. Let be the generalized Ritz projection of given by for all . Since satisfies (2.2) and is nondegenerate on , there exists a constant such that
[TABLE]
see, e.g., Babuška [1]. We refer to the smallest possible choice of as the quasi-best-approximation constant of the constraint discretization. Xu and Zikatanov [26] show the identities
[TABLE]
and Tantardini and Veeser [23, Theorem 2.1] give the formula
[TABLE]
where varies in and varies in and, for the sake of notational simplicity, a tedious is avoided.
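The role of the generalized Ritz projection can be illustrated numerically. The sketch below is a finite-dimensional toy example under assumptions made here: Euclidean norms, a bilinear form b(v, w) = w^T A v with a random matrix `A`, and discrete trial and test spaces spanned by the columns of hypothetical matrices `V` and `W`. It uses the identity, commonly attributed to Xu and Zikatanov, that the quasi-best-approximation constant equals the operator norm of the projection R, which in turn equals the norm of I - R.

```python
import numpy as np

def ritz_projection(A, V, W):
    """Matrix of the Petrov-Galerkin projection R onto range(V), defined by
    w^T A (u - R u) = 0 for all w in range(W)."""
    return V @ np.linalg.solve(W.T @ A @ V, W.T @ A)

rng = np.random.default_rng(1)
n, k = 8, 3
A = rng.standard_normal((n, n)) + n * np.eye(n)   # non-symmetric, invertible
V = rng.standard_normal((n, k))                   # basis of the trial space
W = V + 0.3 * rng.standard_normal((n, k))         # basis of the test space

R = ritz_projection(A, V, W)
C_qopt = np.linalg.norm(R, 2)           # quasi-best-approximation constant

# Compare the Petrov-Galerkin error with the best error in range(V).
Q, _ = np.linalg.qr(V)                  # orthonormal basis of range(V)
u = rng.standard_normal(n)
error = np.linalg.norm(u - R @ u)
best_error = np.linalg.norm(u - Q @ (Q.T @ u))
print("error / best error:", error / best_error, "<= C_qopt =", C_qopt)
```

The matrix R depends on the test space only through range(W), and no test-space norm enters its computation.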
A perhaps striking feature of these formulas is that they are not affected by the choices of the norms in the test spaces and . This comes in quite useful in our context, as the adjoint state is an auxiliary variable and, in the original approximation problem (2.3), the norm is free as long as (2.2) continues to hold with . Exploiting this freedom, we propose to (possibly) redefine the norm on the space by
[TABLE]
and so, in particular, to measure the error of the approximate adjoint state in this norm. This redefinition affects the constants that we associated with the constrained optimization problem (2.3). The new continuity and inf-sup constants of the bilinear forms are
[TABLE]
The constant is not affected, while we have
[TABLE]
where we indicate quantities before the redefinition by an additional index “old”. As in addition
[TABLE]
the results below hold also with the original norm in , but the constants have to be revisited.
The convenience of the choice (3.6) lies in the following consequences of (3.7). The numerator in (3.5) is , which, together with the inf-sup-duality, cf. (2.18), yields
[TABLE]
for the inf-sup constant of . Accordingly, the generalized Ritz projection of given by for all verifies
[TABLE]
Setting , we also have
[TABLE]
After these preparations, we are ready to derive a first result about quasi-best approximation of the variational discretization (3.1).
Theorem 3.2 (Non-asymptotic quasi-best approximation).
Let be any solution of the optimality system (2.11) and choose (3.6) as norm in . The combined error in the corresponding approximate state and its adjoint of the variational discretization is quasi-best in with
[TABLE]
Here
[TABLE]
and is the quasi-best-approximation constant of the constraint discretization.
Proof.
Thanks to Theorem 2.1 and Lemma 3.1, we can use the counterpart of (3.5) for the characterization (3.3) of the variational discretization. Let . The continuity bound (2.19) and (3.7) give for the numerator
[TABLE]
For the denominator, we use (2.22), where is replaced by and, therefore, with in place of in view of (3.9). We thus obtain
[TABLE]
and the proof is finished. ∎
In the special situation of Remark 2.5, we can obtain the following quasi-best approximation result.
Remark 3.3 (Quasi-best approximation for coercive constraints and ).
Suppose that and with coinciding scalar products and norms and that the bilinear form is -coercive with constant and . Exploiting the coercivity and continuity properties of Remark 2.5, we derive for the error of the variational discretization (2.11)
[TABLE]
The quasi-best approximation constant in the preceding Remark 3.3 does not blow up for vanishing regularization. Nonetheless, when measuring the error merely with , it does not exclude an -blow up of the quasi-best approximation constant even in the special case considered in Remark 2.4 and, in the light of the example therein, it does not exclude an -blow up for general operators and . As we shall see, the -dependence in Theorem 3.2 is less severe.
Remark 3.4 (Vanishing regularization and quasi-best approximation).
As in Remark 2.4, we consider the limit for the Tikhonov regularization parameter. Similarly to there, we have
[TABLE]
This blow-up arises from the lower bound of the inf-sup constant in Theorem 2.1, which cannot be improved because of (2.26). Note, however, that the equivalence of the norms and is not uniform in . In the light of (3.5), it is therefore conceivable that (3.12) could be improved by using the latter as test space norm. However, the determination of the discrete inf-sup constant with respect to this abstract norm appears to be much more involved than the approach (2.22), which directly carries over to discrete spaces.
In any case, we shall show below that, under refinement, the -dependence disappears for many instances of the optimality system (2.7).
3.3. Asymptotic quasi-best approximation
In this section, we complement Theorem 3.2. To be more precise, let be the quasi-best-approximation constant of the variational discretization therein and consider a sequence of discrete spaces leading to a uniformly stable constraint discretization in that
[TABLE]
which is equivalent to discrete inf-sup stability in view of (3.9). Theorem 3.2 then ensures the existence of a constant such that
[TABLE]
This upper bound may be pessimistic. To motivate this assessment, represent the bilinear form by the operator matrix
[TABLE]
which is the one in (2.8) with inverted rows. If and are compact, this matrix is diagonally dominant in an operator sense and can be viewed as a compact perturbation of the diagonal matrix with the entries and . Therefore, in order to improve on (3.14), we mimic the argument in Schatz [22], introducing a new twist.
Let us first observe that, in accordance with Remark 2.2, Theorem 3.2 yields whenever . More precisely and generally, we have the following relationship between the two quasi-best-approximation constants.
Lemma 3.5 (Quasi-best-approximation constants).
The quasi-best-approximation constants and are related by
[TABLE]
where is as in Theorem 3.2 and is the generalized Ritz projection in (3.10).
Proof.
As in the proof of Theorem 3.2, we will make use of (3.5) with replaced by . Given and , we can write
[TABLE]
because of . Hence,
[TABLE]
As
[TABLE]
with equality for some , we obtain
[TABLE]
Thanks to (2.14), (2.20), and (3.11) this proves the claimed inequality. ∎
In order to deploy Lemma 3.5, we need additional assumptions for our optimization problem and its discretization. We shall consider two settings: a “qualitative” and a “quantitative” one. The former assumes in addition
[TABLE]
for the constraint discretization. Notice that, owing to (3.8), the condition (3.15a) is independent of our choice to equip with the norm (3.6).
Lemma 3.6 (Qualitative asymptotic quasi-best approximation).
Under the assumptions (3.13) and (3.15), the quasi-best-approximation constant satisfies
[TABLE]
where
[TABLE]
Proof.
In the light of Lemma 3.5 and (3.13), it suffices to verify the uniform convergence
[TABLE]
This follows from a standard argument; we provide details for the sake of completeness. Let be any sequence with and choose such that
[TABLE]
where we write instead whenever the latter is an index. Exploiting (3.13) another time, we see that the sequence given by is bounded in the Hilbert space . Owing to (3.15b), its weak limit satisfies
[TABLE]
for any and . Choosing by means of (3.15b), we derive by . Consequently, (2.17) yields . Thanks to (3.15a), the operator and the adjoint are compact. This turns the weak convergence in into the strong convergence and the proof is finished. ∎
In order to quantify the convergence in Lemma 3.6, we shall use a duality argument. This requires a second, more specific setting of additional assumptions involving the Sobolev spaces , , and their norms over some domain. We use instead of in order to avoid confusion with the norms and of and . For , we denote by the (topological) dual space of and stands for the dual norm of .
We suppose that the spaces and relate to Sobolev spaces in the following way: there are , , and a constant such that
[TABLE]
for some constant , which quantifies the approximation property (3.15b).
Theorem 3.7 (Quantitative asymptotic best approximation).
Under the assumptions (3.13) and (3.17), the quasi-best-approximation constant satisfies
[TABLE]
where is as in Lemma 3.6. For the -dependence of , cf. Remark 3.4.
Proof.
As in the first step of the proof of Lemma 3.6, inserting (3.13) and
[TABLE]
into Lemma 3.5 establishes the claim. To show (3.18), let with and define as the solution of the following “dual” problem associated with the bilinear form :
[TABLE]
where . We thus have
[TABLE]
where is arbitrary. For the first factor, (3.10) and (3.13) imply
[TABLE]
For the second factor, we employ (3.17d) with suitable to obtain
[TABLE]
and it remains to show that the norms on the right-hand side are suitably bounded. Let us consider the first one. Making use of the regularity estimate (3.17c) and the definition of , we deduce
[TABLE]
where is the operator norm of from (3.17b). A similar argument yields
[TABLE]
where is the operator norm of in (3.17b). We insert the previous estimates in the first one and conclude
[TABLE]
with , i.e., (3.18). ∎
Let us exemplify Theorem 3.7 by two applications. The first one considers the optimization problem (1.1) of the introduction, while the second one is more involved in that the constraint does not allow for a coercive set-up.
Example 3.8 (Simple model optimization).
Discretize the optimization problem (1.1) of the introduction with linear finite elements on quasi-uniform meshes with meshsize . We have and, if we choose , we already have and (3.6) does not change the norm in . Further, , where is the constant in the Poincaré-Friedrichs inequality. Moreover, we have and, assuming that the underlying domain is convex, . Taking Sobolev seminorms instead of norms in (3.17a), we then have for the relevant cases and thanks to elliptic regularity as well as . Standard approximation theory shows (3.17d) with depending on the shape regularity of the underlying meshes. Since , we conclude
[TABLE]
for the quasi-best-approximation constant of the variational discretization in this case.
Example 3.9 (Point source control).
We consider the following modification of the optimization problem (1.1), where the distributed control is replaced by a finite number of point sources:
[TABLE]
where the underlying domain is planar, polygonal, Lipschitz, but not necessarily convex, are distinct points, denotes the Dirac functional at the point , and . The bilinear form , , has a continuous and inf-sup-stable extension on with and and allows for a standard discretization with linear finite elements for both trial and test space; see, e.g., [11]. For the verification of the discrete inf-sup condition, denote by and the Ritz projection and the Scott-Zhang interpolation operator, respectively. As
[TABLE]
and
[TABLE]
the continuous inf-sup-condition yields, for any ,
[TABLE]
and so
[TABLE]
where depends only on the continuous inf-sup constant and on the shape regularity of the underlying mesh and we switched to (3.6) for the norm on . To complete the setting, we set , , and let be the canonical embedding and be given by . The continuity constants and are of order and , respectively. Notice that, for , is not continuous because functions in do not have point values in general. Choosing , we have (3.17) with , and therefore
[TABLE]
4. Analysis with approximate control-action operator
In this section, we shall analyze the approximation properties of a variational discretization, where the control-action operator is approximated. This includes the case of a discretized control space.
4.1. Approximate variational discretization
Let , , be the same finite-dimensional conforming spaces introduced in Section 3.1 and assume that the linear operator approximates . Then the (semi-)discrete optimization
[TABLE]
generalizes (3.1). It has the solution if and only if there exists such that
[TABLE]
As before, we may eliminate . If we define
[TABLE]
with
[TABLE]
for , then the reduced version of (4.2) is the following perturbation of the optimality system (3.3):
[TABLE]
where . Before we proceed to analyze its discretization error, let us give an important class of examples.
Example 4.1 (Discretized controls).
We consider a conforming discretization of the control variable. More precisely, replacing in (3.1) with a finite-dimensional subspace leads to the discrete optimality system
[TABLE]
If we denote by the -orthogonal projection onto , then the third equations mean
[TABLE]
and, therefore, the right-hand side of the first equation can be rewritten as follows:
[TABLE]
Hence, the reduced version of (4.4) is a special case of (4.3) with
[TABLE]
As the bilinear form coincides with except for using in place of , the non-asymptotic continuity and nondegeneracy properties of in Sections 2 and 3 immediately carry over by replacing with the operator norm of . In particular, setting and defining
[TABLE]
inequality (2.19) yields
[TABLE]
for all . Furthermore, (3.11) and the inf-sup duality (2.18) for imply
[TABLE]
for all , where
[TABLE]
and is the quasi-best-approximation constant of the constraint discretization.
Since the structures of the discrete problems (4.3) and (3.3) are the same, well-posedness of (4.3) follows from Lemma 3.1.
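The orthogonal projection onto a finite-dimensional control space used in Example 4.1 can be illustrated in matrix form. The following is a minimal sketch under stated assumptions, not the discretization of the paper: the control space is replaced by a Euclidean space whose inner product is given by a symmetric positive definite Gram matrix `M`, and the discrete control space by the column span of a matrix `B`. Both matrices are random and purely hypothetical. The sketch checks idempotency, self-adjointness with respect to the `M`-inner product, orthogonality of the residual, and that the projection does not increase the `M`-norm, so that composing an operator with the projection cannot increase its operator norm.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 8, 3  # hypothetical dimensions of the control space and its subspace

# Hypothetical SPD Gram matrix M (inner product) and basis B of the subspace.
G = rng.normal(size=(n, n))
M = G @ G.T + n * np.eye(n)
B = rng.normal(size=(n, k))

# M-orthogonal projection onto span(B): P = B (B^T M B)^{-1} B^T M.
P = B @ np.linalg.solve(B.T @ M @ B, B.T @ M)


def m_norm(x):
    """Norm induced by the M-inner product."""
    return float(np.sqrt(x @ M @ x))


u = rng.normal(size=n)
residual = u - P @ u

idempotency_defect = np.linalg.norm(P @ P - P)            # P is a projection
symmetry_defect = np.linalg.norm(M @ P - (M @ P).T)       # P is M-self-adjoint
orthogonality_defect = np.linalg.norm(B.T @ M @ residual)  # u - Pu is M-orthogonal to span(B)
norm_ratio = m_norm(P @ u) / m_norm(u)                     # at most 1

print(idempotency_defect, symmetry_defect, orthogonality_defect, norm_ratio)
```

In a finite element setting, `M` would play the role of a mass matrix; that identification is an assumption of this sketch.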
4.2. Approximation
As in the error analysis of Section 3.2, we adopt the convenient choice
[TABLE]
Here we start our analysis by splitting the error into an approximation part and a consistency part.
Lemma 4.2 (Approximation and consistency error).
Let be any solution of the optimality system (2.11) and let be its approximation from (4.3). Then the error satisfies
[TABLE]
Here is defined by (4.8) and is the quasi-best-approximation constant of the constraint discretization from (3.10).
Proof.
Define by
[TABLE]
Then Theorem 3.2 with , , in place of , , gives
[TABLE]
and we have the identities
[TABLE]
for all . In view of (4.6) and (4.7), these identities imply
[TABLE]
The claim follows from the obvious inequalities and . ∎
For the next corollary, it is necessary to consider a sufficiently large class of optimization problems, e.g., the class of optimization problems in which a constraint can be of the form for some and may be surjective.
Corollary 4.3 (Necessary condition for quasi-best approximation).
If the approximate variational discretization (4.3) is quasi-best in the class , then
[TABLE]
Proof.
Let be arbitrary and take some . Then is a possible solution in the class . Since (4.3) is quasi-best in , the discrete solution is exactly . Hence, by Lemma 4.2 we have , which yields . ∎
Although possible, it is difficult to imagine that a practical approximation satisfies the condition in Corollary 4.3 without coinciding with . We therefore consider in what follows only assumptions on that lead to asymptotic quasi-best approximation. In view of Lemma 4.2, this requires that the consistency error vanishes at least as fast as the best approximation error, i.e.,
[TABLE]
Moreover, to capture in the limit the compactness of resulting from assumption (3.15a), we assume that
[TABLE]
This implies that the operator norms are uniformly bounded. Indeed, suppose that as and, for each , let be such that and . Then in as , which, in view of (4.10), yields a contradiction. Consequently,
[TABLE]
is finite.
Lemma 4.4 (Qualitative asymptotic quasi-best approximation with approximate control-action).
Let be a solution to problem (2.11) and let , , be the corresponding approximations given by (4.3). Furthermore, assume uniform stability (3.13), approximability (3.15b), limiting compactness (4.10), and that is compact. If the exact solution satisfies (4.9), we have
[TABLE]
where
[TABLE]
Proof.
As in the proof of Lemma 4.2, define by
[TABLE]
We deduce
[TABLE]
by replacing with and with in Lemma 3.5 and using the limiting compactness (4.10) instead of the compactness of in the proof of Lemma 3.6. Next, proceeding as in the proof of Lemma 4.2, assumption (4.9) on the exact solution gives
[TABLE]
We therefore conclude by inserting the two preceding relationships into the triangle inequality ∎
We now turn to proving a quantitative quasi-best approximation result. To this end, we need to replace the qualitative assumptions (4.9) and (4.10) by quantitative ones. We shall assume that
[TABLE]
and that
[TABLE]
where is suitably chosen. Note that (4.13) reduces for to the part regarding in the quantitative counterpart (3.17b) of the qualitative compactness (3.15a).
Theorem 4.5 (Quantitative asymptotic quasi-best approximation with approximate control-action).
Let , , , and be as in Lemma 4.4. In addition, assume uniform stability (3.13) and that there exists such that we have (3.17), where (4.13) replaces the assumption on in (3.17b). If the exact solution satisfies also (4.12) with the same , we have
[TABLE]
Proof.
We follow the lines of the proof of Lemma 4.4, but replace (4.9) with (4.12) and (4.11) with a quantitative argument in the spirit of Theorem 3.7. To this end, it suffices to use (4.13) instead of (3.17b). ∎
We conclude this section by assessing the key assumptions (4.9) and (4.12) by a remark and an example.
Remark 4.6 (Ensuring dominated consistency error).
As
[TABLE]
for
[TABLE]
we may verify assumptions (4.9) and (4.12) using relationships for .
Example 4.7 (Simple model optimization and piecewise constant controls).
Consider the setting of Example 3.8, but discretize problem (1.1) with linear finite elements for the constraint and piecewise constants for the control variable. In light of Example 4.1, this full discretization can be cast into (4.3) with , where is the -projection onto piecewise constants. By duality, we have
[TABLE]
where depends on the shape regularity of the underlying meshes. Suppose that there is a constant such that
[TABLE]
This holds, for example, if the matrix norm of the Hessian of the exact state or of its adjoint state is bounded away from 0 in a fixed subdomain. We conclude
[TABLE]
i.e., (4.12) with and a constant depending on the exact solution under consideration.
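The approximation behaviour of piecewise constants that enters Example 4.7 can be observed numerically. The following sketch is a one-dimensional stand-in under assumptions that are not from the paper: the domain is the unit interval, the meshes are uniform, and the test function `f` is a hypothetical smooth function. It computes the error of the projection onto piecewise constants (cell averages) by Gauss quadrature and estimates the convergence rate under uniform refinement, which for smooth functions is expected to be close to one.

```python
import numpy as np


def piecewise_constant_projection_error(f, n_cells, n_quad=4):
    """L2(0,1) error between f and its cell averages on a uniform mesh."""
    nodes, weights = np.polynomial.legendre.leggauss(n_quad)
    edges = np.linspace(0.0, 1.0, n_cells + 1)
    h = np.diff(edges)
    midpoints = 0.5 * (edges[:-1] + edges[1:])
    # Quadrature points in every cell, shape (n_cells, n_quad).
    x = midpoints[:, None] + 0.5 * h[:, None] * nodes[None, :]
    fx = f(x)
    # Cell averages; the Gauss weights on [-1, 1] sum to 2.
    cell_means = fx @ weights / 2.0
    squared = (fx - cell_means[:, None]) ** 2
    return float(np.sqrt(np.sum(0.5 * h * (squared @ weights))))


def f(x):
    # Hypothetical smooth test function.
    return np.sin(np.pi * x)


cell_counts = [16, 32, 64, 128]
errors = [piecewise_constant_projection_error(f, n) for n in cell_counts]
# Observed rate between consecutive meshes (meshsize is halved each time).
rates = [float(np.log2(errors[i] / errors[i + 1])) for i in range(len(errors) - 1)]
print(errors, rates)
```

The observed first-order rate is the best one can expect from piecewise constants, which is why the relative size of the consistency error and the best approximation error of the state discretization matters in (4.12).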
5. Analysis with control constraints
This section generalizes our approach to optimization problems that are nonlinear because of constraints on the control.
5.1. Control constraints and discretization
Let be the set of admissible controls. We assume that
[TABLE]
and denote by the projection operator onto which is characterized by or, equivalently, by
[TABLE]
The latter characterization implies
[TABLE]
for all , which in turn shows that the operator is strongly monotone and Lipschitz continuous, in both cases with constant 1.
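The properties of the projection onto a closed convex set can be checked numerically in a simple case. The sketch below assumes a finite-dimensional stand-in that is not taken from the paper: the control space is a Euclidean space, the admissible set is a box, and the projection is componentwise clipping. It tests the standard variational characterization of the projection, the resulting inequality bounding the squared distance of two projections by an inner product, and Lipschitz continuity with constant 1, on random samples.

```python
import numpy as np


def project_box(u, lower, upper):
    """Euclidean projection onto the box [lower, upper]^n (componentwise clipping)."""
    return np.clip(u, lower, upper)


rng = np.random.default_rng(0)
lower, upper = -1.0, 1.0  # hypothetical box bounds
dim, n_samples = 50, 1000

max_variational = -np.inf  # largest value of (u - Pu, w - Pu), expected <= 0
min_firm = np.inf          # smallest value of (Pu - Pv, u - v) - |Pu - Pv|^2, expected >= 0
max_lipschitz = -np.inf    # largest value of |Pu - Pv| - |u - v|, expected <= 0

for _ in range(n_samples):
    u, v = rng.normal(scale=3.0, size=(2, dim))
    w = rng.uniform(lower, upper, size=dim)  # arbitrary admissible point
    Pu, Pv = project_box(u, lower, upper), project_box(v, lower, upper)
    max_variational = max(max_variational, float(np.dot(u - Pu, w - Pu)))
    min_firm = min(min_firm, float(np.dot(Pu - Pv, u - v) - np.dot(Pu - Pv, Pu - Pv)))
    max_lipschitz = max(max_lipschitz, float(np.linalg.norm(Pu - Pv) - np.linalg.norm(u - v)))

print(max_variational, min_firm, max_lipschitz)
```

Random sampling only illustrates the inequalities; it does not prove them.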
The generalization of problem (2.3) incorporating convex control constraints is then the convex optimization problem
[TABLE]
Thanks to (5.1), a solution is characterized by the existence of such that the following counterpart of the rescaled optimality system (2.7) is satisfied:
[TABLE]
As in Section 2, we insert the third equation into the first one and consider the corresponding weak formulation of the rescaled and reduced optimality system:
[TABLE]
where and
[TABLE]
which already incorporates the -scaling. In contrast to the previous sections, and so are in general not linear in the first argument. Nonetheless, if we introduce the pseudometric
[TABLE]
inequality (5.2) leads to the following replacement of the properties (2.14) of the bilinear form : if and $\varphi = \big({-}(v_1 - w_1),\, v_2 - w_2\big)$, then
[TABLE]
In addition, we have, for ,
[TABLE]
The continuity bound (5.6b) leads to
[TABLE]
with the metric
[TABLE]
Notice that the role of the two arguments of and cannot be interchanged. We adapt (2.22) to this new situation in the following way: given , we choose , where is the linear operator given by
[TABLE]
as in (2.23b), and is the Riesz map for , . In view of (2.24), we thus obtain the following counterpart of Theorem 2.1.
Theorem 5.1 (Properties of form ).
If we equip as trial space with and as test space with , then we have, for any ,
[TABLE]
and
[TABLE]
where is defined by (2.23).
Here too, existence and uniqueness follow as a by-product.
Corollary 5.2 (Well-posedness with control constraints).
The optimization problem (5.5) has a unique solution.
Proof.
We shall apply Zarantonello’s theorem on strongly monotone operators [27, Theorem 25.B] in the Hilbert space . To prepare this, we first observe that
[TABLE]
Indeed, it is continuous with constant owing to (2.22b) and boundedly invertible on account of the consequence
[TABLE]
of (2.19) and (2.24) for the bilinear form . Let us consider the nonlinear operator defined by
[TABLE]
where denotes the duality pairing associated with . Making use of Theorem 5.1, (2.19) and (5.7), we see that, for all ,
[TABLE]
and
[TABLE]
Hence, is strongly monotone and Lipschitz continuous and therefore boundedly invertible by [27, Theorem 25.B]. In light of (5.10), we can conclude by noting for all . ∎
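The theorem on strongly monotone and Lipschitz continuous operators used in the proof above has a constructive core: a damped fixed-point iteration that is a contraction. The following sketch illustrates this in a finite-dimensional setting under assumptions that are not from the paper: the Hilbert space is three-dimensional Euclidean space (so the Riesz map is the identity), and the operator `F`, the matrix `A`, and the right-hand side `f` are hypothetical. With monotonicity constant `m` and Lipschitz constant `L`, the step size `t = m / L**2` gives the contraction factor `q = sqrt(1 - (m / L)**2)`.

```python
import numpy as np

# Hypothetical symmetric positive definite matrix and right-hand side.
A = np.array([[2.0, 0.5, 0.0],
              [0.5, 3.0, 0.5],
              [0.0, 0.5, 4.0]])
f = np.array([1.0, -2.0, 5.0])


def F(x):
    """Linear SPD part plus a monotone, 1-Lipschitz nonlinearity (clipping)."""
    return A @ x + np.clip(x, -1.0, 1.0)


eigenvalues = np.linalg.eigvalsh(A)  # ascending order
m = eigenvalues[0]         # strong monotonicity constant of F
L = eigenvalues[-1] + 1.0  # Lipschitz constant of F
t = m / L**2               # step size minimizing the contraction bound
q = np.sqrt(1.0 - (m / L) ** 2)  # guaranteed contraction factor per step

x = np.zeros(3)
for _ in range(1000):
    # Damped iteration x <- x - t (F(x) - f); its fixed point solves F(x) = f.
    x = x - t * (F(x) - f)

residual = float(np.linalg.norm(F(x) - f))
print(x, residual, q)
```

The bound on `q` is a worst-case estimate; the iteration is meant to mirror the existence argument, not to be an efficient solver.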
In order to discretize the optimization problem (5.3) with control constraints, we proceed as in Section 3.1. Introducing the discrete space as therein, the variational discretization can be characterized as follows:
[TABLE]
Here we need that can be evaluated exactly for . This occurs, for example, when we consider (1.1) with box constraints and discretize with linear finite elements. If has to be approximated, the subsequent error analysis involves additional technicalities, similar to those addressed in Section 4.
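Why exact evaluation is feasible for box constraints and linear finite elements can be seen in one space dimension. The sketch below rests on assumptions that are not from the paper: the projection acts pointwise as clipping to hypothetical bounds `lower` and `upper`, and the function name `integrate_clipped_p1` is illustrative. Clipping a continuous piecewise linear function yields a piecewise linear function on a refined partition obtained by adding the points where the function crosses the bounds, so the trapezoidal rule on that partition integrates it without quadrature error.

```python
import numpy as np


def integrate_clipped_p1(nodes, values, lower, upper):
    """Integral of clip(p, lower, upper) for a continuous piecewise linear p."""
    total = 0.0
    for a, b, pa, pb in zip(nodes[:-1], nodes[1:], values[:-1], values[1:]):
        breakpoints = [a, b]
        if pa != pb:
            # Add the points inside the cell where p crosses a bound.
            for level in (lower, upper):
                s = (level - pa) / (pb - pa)
                if 0.0 < s < 1.0:
                    breakpoints.append(a + s * (b - a))
        xs = np.array(sorted(breakpoints))
        ps = pa + (pb - pa) * (xs - a) / (b - a)
        clipped = np.clip(ps, lower, upper)
        # The clipped function is linear on each subinterval, so the
        # trapezoidal rule is exact up to rounding.
        total += float(np.sum(0.5 * (clipped[1:] + clipped[:-1]) * np.diff(xs)))
    return total


# p(x) = 2x on [0, 1], clipped to [-1, 1]: integral is 1/4 + 1/2.
one_cell = integrate_clipped_p1(np.array([0.0, 1.0]), np.array([0.0, 2.0]), -1.0, 1.0)
# p(x) = 4x - 2 on [0, 1], clipped to [-1, 1]: integral vanishes by symmetry.
symmetric = integrate_clipped_p1(np.array([0.0, 1.0]), np.array([-2.0, 2.0]), -1.0, 1.0)
print(one_cell, symmetric)
```

In two or more dimensions the same idea requires cutting simplices along level sets, which is more involved than this sketch.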
Existence and uniqueness of solutions to (5.11) can be established in a similar way as in Corollary 5.2. Using (3.6) as the norm in , the major change is to replace the operator (5.9) by given by
[TABLE]
where , , is the discrete counterpart of , is its inf-sup-constant, is as in (2.23), and is the Riesz map for , .
5.2. Quasi-best approximation
We analyze the quasi-best-approximation properties of the nonlinear variational discretization (5.11), adopting again
[TABLE]
The following non-asymptotic result draws heavily on Theorem 5.1, which needed an -dependent error notion for as trial space.
Theorem 5.3 (Non-asymptotic quasi-best approximation with control constraints).
If is the approximation given by (5.11) to an arbitrary solution of (5.5), then its error is quasi-best in in that
[TABLE]
where and are as in Theorem 3.2.
Proof.
Given any , we first write
[TABLE]
To bound the second term, we employ Theorem 5.1 with, respectively, , , , , and in place of , , , , and . Writing , the definitions of and thus yield,
[TABLE]
and the claimed inequality is established as is invertible. ∎
The “” in the bound for the quasi-best-approximation constant in Theorem 5.3 arises from the triangle inequality (5.13), which is avoided in deriving in (3.5). Yet, the following asymptotic quasi-best approximation results involving the generalized Ritz projection from (3.10) are not affected by such an augmentation.
Lemma 5.4 (Nonlinear variational and generalized Ritz approximations).
Let and be as in Theorem 5.3. The generalized Ritz projection of and are related by
[TABLE]
where and are as in Theorem 3.2.
Proof.
Applying Theorem 5.1 with the setting as in Theorem 5.3, writing , and recalling (5.7), we derive
[TABLE]
and, again thanks to the invertibility of , the proof is finished. ∎
Let us sharpen Lemma 5.4 with the help of the additional assumptions and arguments from Section 3.3 regarding the linear optimality system.
Theorem 5.5 (Supercloseness to the generalized Ritz approximation).
Let , , and be as in Lemma 5.4. Moreover, assume (3.13) and define as in Lemma 3.6. If (3.15) holds, then
[TABLE]
More specifically, if (3.17) holds, then
[TABLE]
For the -dependence of , cf. Remark 2.4.
Proof.
In view of Lemma 5.4, it suffices to show . To this end, we modify the argument in Lemma 3.6 slightly; a similar argument has been used in [10] under weaker assumptions on . Let be any sequence with and, writing whenever is an index, consider
[TABLE]
The sequence is bounded in the Hilbert space by definition. For its weak limit , we have
[TABLE]
for arbitrary and . Consequently, (3.15b), , and (2.17) yield . In view of (3.15a), weakly in then implies .
For the second statement, we just note that the main step of the proof of Theorem 3.7 with leads to . ∎
In view of the inverse triangle inequality
[TABLE]
Theorem 5.5 readily yields the following asymptotic quasi-best approximation result.
Corollary 5.6 (Asymptotic quasi-best approximation with control constraints).
Let be the quasi-best-approximation constant for the nonlinear variational discretization (5.11) with respect to . Moreover, assume (3.13) and define as in Lemma 3.6. If (3.15) holds, then
[TABLE]
More specifically, if (3.17) holds, then
[TABLE]
For the -dependence of , cf. Remark 2.4.
In comparison with Lemma 3.6 and Theorem 3.7, Corollary 5.6 features an additional -factor. This factor stems from the fact that the derivation we went through used an error notion that also incorporates it.
References
- [1] I. Babuška, Error-bounds for finite element method, Numer. Math., 16 (1971), pp. 322–333.
- [2] E. Casas and M. Mateos, Uniform convergence of the FEM. Applications to state constrained control problems, Comput. Appl. Math., 21 (2002), pp. 67–100.
- [3] E. Casas and F. Tröltzsch, Error estimates for linear-quadratic elliptic control problems, in Analysis and Optimization of Differential Systems (Constanta, 2002), Kluwer Acad. Publ., Boston, MA, 2003, pp. 89–100.
- [4] K. Chrysafinos and E. N. Karatzas, Symmetric error estimates for discontinuous Galerkin approximations for an optimal control problem associated to semilinear parabolic PDE’s, Mar. 2012.
- [5] K. Chrysafinos and E. N. Karatzas, Symmetric error estimates for discontinuous Galerkin time-stepping schemes for optimal control problems constrained to evolutionary Stokes equations, Comput. Optim. Appl., 60 (2015), pp. 719–751.
- [6] K. Deckelnick, A. Günther, and M. Hinze, Finite element approximation of elliptic control problems with constraints on the gradient, Numer. Math., 111 (2009), pp. 335–350.
- [7] K. Deckelnick and M. Hinze, Convergence of a finite element approximation to a state-constrained elliptic control problem, SIAM J. Numer. Anal., 45 (2007), pp. 1937–1953.
- [8] K. Deckelnick and M. Hinze, Numerical analysis of a control and state constrained elliptic control problem with piecewise constant control approximations, in Numerical Mathematics and Advanced Applications, 2008, pp. 597–604.
